Qwen 3 235B Instruct
Alibaba's Qwen 3 flagship MoE: 235B total / 22B active. Strong reasoning and tool use, open-weights.
Qwen 3 235B Instruct is a text & chat AI model from Alibaba / Qwen, priced at €0.900 per 1M input tokens with a 131.1K-token context window.
Pricing
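As a rough illustration of the listed rate, the sketch below estimates the input-side cost of a request. It assumes the price is 0.900 euros per 1M input tokens as listed; the listing gives no output-token price, so output cost is left out. The function and constant names are hypothetical, not part of any SDK.

```javascript
// Listed rate: 0.900 (assumed to be euros) per 1M input tokens.
// No output-token price appears in the listing, so only input cost is estimated.
const INPUT_PRICE_PER_MILLION = 0.9;

// Estimate the input-side cost of a request for a given number of input tokens.
function estimateInputCost(inputTokens) {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative number");
  }
  return (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION;
}

console.log(estimateInputCost(50_000).toFixed(4));  // a 50K-token prompt
console.log(estimateInputCost(131_100).toFixed(4)); // a prompt filling the listed 131.1K window
```

Actual billing may also depend on output tokens and rounding rules that the listing does not describe.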
API Integration
Use our OpenAI-compatible API to integrate Qwen 3 235B Instruct into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple: just pass a string
const reply = await rw.run("qwen-3-235b", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("qwen-3-235b", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("qwen-3-235b", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Alibaba Cloud (Qwen team)'s Qwen 3 235B Instruct
The Qwen team inside Alibaba Cloud has shipped one of the most prolific open-weight model lines in the industry, starting with Qwen-7B (Aug 2023, first Chinese open-weight foundation model from a hyperscaler), through Qwen 1.5 (Feb 2024), Qwen2 (Jun 2024), Qwen2.5 (Sep 2024, sizes 0.5B-72B plus Coder/Math/VL), the closed-weight Qwen2.5-Max flagship (Jan 2025) and the Qwen3 family in April-May 2025. Qwen3 introduced a uniform 'hybrid thinking' approach across the family, letting developers toggle between fast direct answers and long chain-of-thought reasoning in the same checkpoint. The 235B-A22B MoE flagship anchors the family alongside dense variants from 0.6B to 32B. The team is led by Junyang Lin and has published over a dozen technical reports. Models ship under the Apache 2.0 license starting with Qwen3, a substantial liberalisation over the earlier Tongyi Qianwen LICENSE. Alibaba Cloud, founded in 2009, is the largest cloud provider in China and hosts the Qwen models on its Model Studio service while also distributing weights freely on HuggingFace, ModelScope and GitHub.
Qwen3-235B-A22B is the flagship Mixture-of-Experts model of the Qwen3 family, released by Alibaba's Qwen team on 29 April 2025 with weights under Apache 2.0. The architecture is a Sparse MoE Transformer with 235 billion total parameters and 22 billion active per token (128 experts, 8 selected per token), 94 layers, and Grouped Query Attention (64 query heads / 4 KV heads). It supports a 128K native context window extended to 256K via YaRN scaling. The model was pretrained on approximately 36 trillion tokens spanning 119 languages with strong Chinese, English and code coverage, more than doubling the 18T-token corpus used for Qwen2.5. Pretraining was performed in three stages with progressively longer context lengths and improved data filtering. Post-training applied a four-stage pipeline: long-CoT cold start, RL on reasoning tasks, integration of thinking and non-thinking modes via mixed SFT, and a final general-purpose RL stage. The resulting model exposes hybrid thinking, toggled via the chat template, where the same checkpoint can produce either an R1-style chain-of-thought before the answer or a fast direct response. Qwen3-235B leads several open-weight benchmarks including AIME 2025, LiveCodeBench, ArenaHard and BFCL agent-eval as of release.
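To make "128 experts, 8 selected per token" concrete, here is a minimal sketch of generic top-k expert routing. Only the expert count and k come from the description above. The softmax over the selected logits is a common normalisation assumed here for illustration, not a confirmed detail of Qwen3's router, and `routeToken` is a hypothetical name.

```javascript
// Generic top-k MoE gating sketch (not Qwen3's actual router implementation).
const NUM_EXPERTS = 128; // from the description above
const TOP_K = 8;         // from the description above

// Pick the k highest-scoring experts for one token and normalise their weights.
function routeToken(gateLogits, k = TOP_K) {
  if (gateLogits.length < k) {
    throw new RangeError("need at least k gate logits");
  }
  // Pair each logit with its expert index, sort descending, keep the k best.
  const top = gateLogits
    .map((logit, expert) => ({ expert, logit }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  // Softmax over the selected logits only (assumed normalisation).
  const maxLogit = top[0].logit;
  const exps = top.map((t) => Math.exp(t.logit - maxLogit));
  const sum = exps.reduce((a, b) => a + b, 0);
  return top.map((t, i) => ({ expert: t.expert, weight: exps[i] / sum }));
}

// Deterministic stand-in gate logits; a real router computes these from the token's hidden state.
const logits = Array.from({ length: NUM_EXPERTS }, (_, i) => Math.sin(i * 12.9898) * 3);
const routed = routeToken(logits);
console.log(routed.map((r) => r.expert));              // indices of the 8 chosen experts
console.log(routed.reduce((s, r) => s + r.weight, 0)); // weights sum to roughly 1
```

Because only the selected experts' feed-forward blocks run for a given token, per-token compute tracks the 22B active parameters rather than the 235B total.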
- Parameters: 235B total, 22B active per token
- Context: 262.1K tokens
- 235B-parameter MoE with 22B active per token (128 experts, 8 selected)
- Hybrid thinking mode toggle (CoT or direct answer in one checkpoint)
- Pretrained on ~36T tokens across 119 languages
- 256K context window with YaRN scaling
- Apache 2.0 license, fully commercial
- Top open-weight scores on AIME 2025, LiveCodeBench, ArenaHard, BFCL
- Function calling, MCP server support, parallel tool calls
- Specialised siblings: Qwen3-Coder, Qwen3-Math, Qwen3-VL
- Compatible with vLLM, SGLang, llama.cpp, Ollama, MLX, HuggingFace
- Broad multilingual coverage with strong Chinese/English/Japanese/Korean performance
- Best for: open-weight reasoning, agentic workloads, multilingual chat, on-prem enterprise.
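The function-calling and parallel tool-call features listed above can be sketched with the standard OpenAI chat-completions message shapes. This example makes no network call: whether this service forwards a `tools` field is an assumption, the `get_weather` tool and its stub data are hypothetical, and the assistant message is hand-written to show how two parallel calls would be dispatched.

```javascript
// Tool definition in the OpenAI chat-completions "tools" format.
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather", // hypothetical tool, for illustration only
      description: "Look up the current weather for a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];

// Request body an OpenAI-compatible endpoint would receive (assumes `tools` is passed through).
const requestBody = {
  model: "qwen-3-235b",
  messages: [{ role: "user", content: "Compare the weather in Hangzhou and Berlin." }],
  tools,
};

// Local implementations keyed by tool name; this one returns stub data.
const handlers = {
  get_weather: ({ city }) => ({ city, forecast: "sunny" }),
};

// Turn every tool call in an assistant message into a `tool` role reply message.
function runToolCalls(assistantMessage) {
  const calls = assistantMessage.tool_calls ?? [];
  return calls.map((call) => {
    const name = call.function.name;
    if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
      throw new Error(`Unknown tool: ${name}`);
    }
    const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
    return {
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(handlers[name](args)),
    };
  });
}

// Hand-written assistant message containing two parallel tool calls.
const exampleAssistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    { id: "call_1", type: "function", function: { name: "get_weather", arguments: '{"city":"Hangzhou"}' } },
    { id: "call_2", type: "function", function: { name: "get_weather", arguments: '{"city":"Berlin"}' } },
  ],
};

const toolMessages = runToolCalls(exampleAssistantMessage);
console.log(toolMessages);
```

In a real exchange, the `tool` messages would be appended to the conversation after the assistant message and sent back so the model can compose its final answer.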
Pretrained on approximately 36 trillion tokens covering 119 languages with strong Chinese and English emphasis, code repositories and scientific content. Post-training uses a four-stage pipeline: long-CoT cold start, reasoning RL, mixed SFT integrating thinking/non-thinking modes, and a final general-purpose RL stage.
License: Apache 2.0. Open weights, fully commercial use permitted including for >100M MAU products (a notable liberalisation versus the earlier Tongyi Qianwen LICENSE).
Known limitations
- Filters Chinese political topics
- Large memory footprint requires multi-GPU inference for FP16
- Vision requires separate Qwen3-VL checkpoint
- Knowledge cutoff approximately early 2025
- Long context >128K degrades on some recall tasks
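The memory-footprint limitation can be checked with back-of-the-envelope arithmetic. The sketch below uses the parameter count, layer count and KV-head count stated earlier on this page. The 80 GB accelerator size and the head dimension of 128 are assumptions, and the estimate ignores activations, framework overhead and any quantisation.

```javascript
// Rough FP16 memory estimate; figures marked "assumption" are not from this page.
const TOTAL_PARAMS = 235e9;       // 235B total parameters
const BYTES_PER_VALUE_FP16 = 2;   // FP16 stores each value in 2 bytes
const GPU_MEMORY_BYTES = 80e9;    // assumption: 80 GB accelerators
const NUM_LAYERS = 94;            // from the architecture description
const NUM_KV_HEADS = 4;           // from the architecture description
const HEAD_DIM = 128;             // assumption: per-head dimension

// Bytes needed just to hold the weights.
function weightBytes(params, bytesPerValue) {
  return params * bytesPerValue;
}

// Minimum number of devices whose combined memory can hold the weights alone.
function minGpusForWeights(params, bytesPerValue, gpuMemoryBytes) {
  return Math.ceil(weightBytes(params, bytesPerValue) / gpuMemoryBytes);
}

// KV-cache bytes per token: keys and values, for every layer and KV head.
function kvCacheBytesPerToken(layers, kvHeads, headDim, bytesPerValue) {
  return 2 * layers * kvHeads * headDim * bytesPerValue;
}

const fp16WeightsGb = weightBytes(TOTAL_PARAMS, BYTES_PER_VALUE_FP16) / 1e9;
const gpusNeeded = minGpusForWeights(TOTAL_PARAMS, BYTES_PER_VALUE_FP16, GPU_MEMORY_BYTES);
const kvPerToken = kvCacheBytesPerToken(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE_FP16);

console.log(fp16WeightsGb); // 470 (GB of FP16 weights)
console.log(gpusNeeded);    // 6 devices for weights alone under the 80 GB assumption
console.log(kvPerToken);    // 192512 bytes of KV cache per token under the head-dim assumption
```

Although only 22B parameters are active per token, all experts must be resident in memory, which is why the total parameter count drives the footprint.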
Related Models
Claude Opus 4
Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.
Claude Sonnet 4
Anthropic's most capable model. Excellent for complex analysis, coding, math, and creative writing.
DeepSeek V3.1
DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.
DeepSeek V4 Pro
DeepSeek's April 2026 flagship. 1.6T MoE / 49B active params, 1M context, rivals top closed-source models on STEM and coding at a fraction of the price.
Start using Qwen 3 235B Instruct today
Get started with free credits. No credit card required. Access Qwen 3 235B Instruct and 100+ other models through a single API.