Skip to main content

Qwen3.6-35B-A3B-FP8

Description

"Qwen3.6-35B-A3B-FP8" is a Mixture-of-Experts (MoE) language model by Alibaba with 35 billion total parameters, of which approximately 3 billion are active per forward pass. It is designed for efficient, high-quality chat and agentic workflows with reasoning and vision capabilities, suitable for long-document analysis and extended multi-turn conversations.

It supports and is suitable for:

  • Text generation within a chat completion (text to text)
  • Tool-calling for agentic workflows
  • Image understanding (vision)
  • Thinking / reasoning for step-by-step problem solving
  • Processing long documents and extended contexts

The following limitations apply:

  • Maximum context length: 262,144 tokens
  • Thinking mode requires at least 128,000 tokens of remaining context to function properly

Thinking mode is enabled by default. To disable it, pass "enable_thinking": false in your API request's extra body parameters.

The model has different recommended settings depending on the use case. Do not use greedy decoding (temperature 0) — it can cause performance degradation and repetitions.

Thinking mode (default)

General tasks:

ParameterValue
temperature1.0
top_p0.95
top_k20
presence_penalty1.5

Precise coding / web development:

ParameterValue
temperature0.6
top_p0.95
top_k20
presence_penalty0.0

Non-thinking mode (enable_thinking: false)

General tasks:

ParameterValue
temperature0.7
top_p0.8
top_k20
presence_penalty1.5

Reasoning / math / complex problem solving:

ParameterValue
temperature1.0
top_p1.0
top_k40
presence_penalty2.0

Output length

Set max_tokens according to task complexity to control cost and latency:

Task typeRecommended max_tokens
Standard queries32,768
Complex problems (math, programming contests)81,920

Tips for specific tasks

Math problems

For best results on mathematical tasks, append the following instruction to your prompt:

Please reason step by step, and put your final answer within \boxed{}.

Multiple-choice questions

To get consistent, parseable output on multiple-choice tasks, add this to your prompt:

Please show your choice in the 'answer' field with only the choice letter, e.g., 'answer': 'C'.

Terms of use and licensing

The general terms of use apply. The model is provided by Alibaba under the Apache 2.0 License, and reuse of the generated content is not subject to any additional restrictions.