How much does Qwen 3 235B Instruct cost via Railwail?

Input: €0.900 per 1M tokens. Output: €0.900 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
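To make the per-token pricing concrete, here is a minimal sketch of a cost estimate. It uses only the rates quoted above; the helper name `estimateCostEur` is hypothetical and not part of the railwail package, and actual billing may round differently.

```typescript
// Rates quoted in the answer above: €0.900 per 1M tokens, input and output.
const INPUT_EUR_PER_MILLION = 0.9;
const OUTPUT_EUR_PER_MILLION = 0.9;

// Hypothetical helper (not part of the railwail package).
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return (
    (inputTokens / 1_000_000) * INPUT_EUR_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_EUR_PER_MILLION
  );
}

// Example: a 2,000-token prompt with a 500-token reply.
const cost = estimateCostEur(2_000, 500);
console.log(cost.toFixed(6)); // "0.002250"
```

At that size, a request costs roughly a quarter of a euro cent.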

What is the context window of Qwen 3 235B Instruct?

Qwen 3 235B Instruct supports a 131,072-token (131.1K) context window, enough for long books, technical manuals, and extended analysis.

How fast is Qwen 3 235B Instruct?

Latency depends on prompt length and server load, typically 200 ms to 2 s for short prompts. We publish live p50/p95 latency measurements at /rankings.
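If you want to compute the same kind of p50/p95 figures from your own request timings, the sketch below uses the nearest-rank percentile method. This is an illustration only: the sample values are made up, and the method used on /rankings is not specified here.

```typescript
// Nearest-rank percentile over latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("need at least one sample");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latency samples for illustration.
const latenciesMs = [210, 250, 300, 320, 400, 450, 600, 900, 1500, 2000];
console.log(percentile(latenciesMs, 50)); // 400
console.log(percentile(latenciesMs, 95)); // 2000
```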

Is Qwen 3 235B Instruct better than Claude Opus 4?

It depends on your use case. Qwen 3 235B Instruct (Alibaba / Qwen) and Claude Opus 4 (Anthropic) are both strong choices in text & chat. Compare them side-by-side at /compare/qwen-3-235b-vs-claude-opus-4.

Qwen 3 235B Instruct

Popular

Alibaba / Qwen

Text & Chat

Alibaba's Qwen 3 flagship MoE: 235B total / 22B active. Strong reasoning and tool use, open-weights.

TL;DR · Last updated May 16, 2026

Qwen 3 235B Instruct is a text & chat AI model from Alibaba / Qwen, priced at €0.900 per 1M input tokens with a 131,072-token context window.

Pricing

Price per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Qwen 3 235B Instruct into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("qwen-3-235b", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("qwen-3-235b", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("qwen-3-235b", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
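Because the API is OpenAI-compatible, the request body follows the familiar chat-completions shape. The sketch below only builds and validates that body; it does not send anything, since the endpoint URL is not given on this page. The helper `buildChatRequest` is hypothetical, the 0 to 2 temperature range is an assumption borrowed from the OpenAI convention, and the defaults mirror the example above.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  max_tokens: number;
}

// "Max output" value listed under Specifications on this page.
const MAX_OUTPUT_TOKENS = 16_384;

// Hypothetical helper: validates options and clamps max_tokens to the model limit.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  options: { temperature?: number; max_tokens?: number } = {},
): ChatRequestBody {
  if (messages.length === 0) throw new Error("messages must not be empty");
  const temperature = options.temperature ?? 0.7;
  // ASSUMPTION: 0..2 is the accepted range, as in the OpenAI API.
  if (temperature < 0 || temperature > 2) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  const requested = options.max_tokens ?? 500;
  if (!Number.isInteger(requested) || requested < 1) {
    throw new RangeError("max_tokens must be a positive integer");
  }
  return {
    model,
    messages,
    temperature,
    max_tokens: Math.min(requested, MAX_OUTPUT_TOKENS),
  };
}

const body = buildChatRequest(
  "qwen-3-235b",
  [{ role: "user", content: "Hello!" }],
  { max_tokens: 50_000 }, // above the limit, so it is clamped
);
console.log(body.max_tokens); // 16384
```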

Specifications

Context window: 131,072 tokens
Max output: 16,384 tokens
Developer: Alibaba / Qwen
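A quick way to use these limits is to check whether a prompt leaves room for the reply. The sketch below assumes the prompt and the generated output share the same context window, which is the usual arrangement but is not stated explicitly on this page. The function names are hypothetical.

```typescript
const CONTEXT_WINDOW = 131_072; // tokens, from the specifications above
const MAX_OUTPUT = 16_384; // tokens, from the specifications above

// Tokens left for the prompt after reserving space for the reply.
// ASSUMPTION: prompt and output share one context window.
function promptBudget(reservedForOutput: number = MAX_OUTPUT): number {
  if (reservedForOutput < 0 || reservedForOutput > MAX_OUTPUT) {
    throw new RangeError(`reservedForOutput must be between 0 and ${MAX_OUTPUT}`);
  }
  return CONTEXT_WINDOW - reservedForOutput;
}

function fitsInContext(
  promptTokens: number,
  reservedForOutput: number = MAX_OUTPUT,
): boolean {
  return promptTokens >= 0 && promptTokens <= promptBudget(reservedForOutput);
}

console.log(promptBudget()); // 114688
console.log(fitsInContext(120_000)); // false
console.log(fitsInContext(120_000, 4_096)); // true
```

Reserving less output space frees up room for a longer prompt, at the cost of a shorter maximum reply.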

Deep dive — Alibaba Cloud (Qwen team)'s Qwen 3 235B Instruct

About Alibaba Cloud (Qwen team)

Founded 2009 · Hangzhou, China

The Qwen team inside Alibaba Cloud has shipped one of the most prolific open-weight model lines in the industry. It started with Qwen-7B (Aug 2023, first Chinese open-weight foundation model from a hyperscaler) and continued through Qwen 1.5 (Feb 2024), Qwen2 (Jun 2024), Qwen2.5 (Sep 2024, sizes 0.5B-72B plus Coder/Math/VL), the closed-weight Qwen2.5-Max flagship (Jan 2025), and the Qwen3 family in April-May 2025. Qwen3 introduced a uniform 'hybrid thinking' approach across the family, letting developers toggle between fast direct answers and long chain-of-thought reasoning in the same checkpoint. The 235B-A22B MoE flagship anchors the family alongside dense variants from 0.6B to 32B.

The team is led by Junyang Lin and has published over a dozen technical reports. Models ship under the Apache 2.0 license starting with Qwen3, a substantial liberalisation over the earlier Tongyi Qianwen LICENSE. Alibaba Cloud, founded in 2009, is the largest cloud provider in China; it hosts the Qwen models on its Model Studio service and also distributes the weights freely on HuggingFace, ModelScope and GitHub.

Architecture

Sparse Mixture-of-Experts Transformer (hybrid thinking)

Qwen3-235B-A22B is the flagship Mixture-of-Experts model of the Qwen3 family, released by Alibaba's Qwen team on 29 April 2025 with weights under Apache 2.0. The architecture is a sparse MoE Transformer with 235 billion total parameters and 22 billion active per token (128 experts, 8 selected per token), 94 layers, and Grouped Query Attention (64 query heads / 4 KV heads). It supports a 128K native context window extended to 256K via YaRN scaling.

The model was pretrained on approximately 36 trillion tokens spanning 119 languages with strong Chinese, English and code coverage, more than doubling the 18T-token corpus used for Qwen2.5. Pretraining ran in three stages with progressively longer context lengths and improved data filtering. Post-training applied a four-stage pipeline: long-CoT cold start, RL on reasoning tasks, integration of thinking and non-thinking modes via mixed SFT, and a final general-purpose RL stage.

The resulting model exposes hybrid thinking, toggled via the chat template: the same checkpoint can produce either an R1-style chain-of-thought before the answer or a fast direct response. As of release, Qwen3-235B led several open-weight benchmarks, including AIME 2025, LiveCodeBench, ArenaHard and the BFCL agent evaluation.
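To illustrate what "128 experts, 8 selected per token" means in practice, here is a generic top-k gating sketch. It is not Qwen's actual router implementation: the softmax-then-renormalise scheme is a common choice assumed here for illustration, and the logits are toy values.

```typescript
// Numerically stable softmax.
function softmax(xs: number[]): number[] {
  const max = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

interface ExpertWeight {
  expert: number; // index of the selected expert
  weight: number; // mixing weight, renormalised over the selected experts
}

// Generic top-k gating: keep the k most probable experts and renormalise.
// ASSUMPTION: illustrative scheme only, not taken from the Qwen3 source code.
function routeTopK(routerLogits: number[], k: number): ExpertWeight[] {
  if (k <= 0 || k > routerLogits.length) {
    throw new RangeError("k must be between 1 and the number of experts");
  }
  const top = softmax(routerLogits)
    .map((weight, expert) => ({ expert, weight }))
    .sort((a, b) => b.weight - a.weight)
    .slice(0, k);
  const total = top.reduce((s, e) => s + e.weight, 0);
  return top.map((e) => ({ expert: e.expert, weight: e.weight / total }));
}

const NUM_EXPERTS = 128;
const TOP_K = 8;
// Deterministic toy logits standing in for one token's router output.
const logits = Array.from({ length: NUM_EXPERTS }, (_, i) => Math.sin(i));
const chosen = routeTopK(logits, TOP_K);
console.log(chosen.length); // 8
```

Only the chosen experts run for that token, which is why the active parameter count (22B) is far below the total (235B).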

Parameters: 235B total, 22B active per token
Context: 262,144 tokens (256K, with YaRN scaling)
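The figures above allow a rough back-of-the-envelope calculation of the active parameter share and of the KV cache size that Grouped Query Attention saves. The layer and head counts come from the architecture description; the head dimension of 128 and the 16-bit cache precision are assumptions not stated on this page.

```typescript
const TOTAL_PARAMS = 235e9;
const ACTIVE_PARAMS = 22e9;
const LAYERS = 94;
const QUERY_HEADS = 64;
const KV_HEADS = 4;
const HEAD_DIM = 128; // ASSUMPTION: not stated on this page
const BYTES_PER_VALUE = 2; // ASSUMPTION: 16-bit (FP16/BF16) cache

// Bytes of KV cache stored per token: keys and values for every layer.
function kvCacheBytesPerToken(kvHeads: number): number {
  return 2 * LAYERS * kvHeads * HEAD_DIM * BYTES_PER_VALUE;
}

const activeShare = ACTIVE_PARAMS / TOTAL_PARAMS;
console.log((activeShare * 100).toFixed(1) + "%"); // "9.4%"

const gqaBytes = kvCacheBytesPerToken(KV_HEADS);
const mhaBytes = kvCacheBytesPerToken(QUERY_HEADS); // if every query head had its own KV head
console.log(gqaBytes); // 192512
console.log(mhaBytes / gqaBytes); // 16

// Cache size for one sequence filling a 131,072-token window.
const gib = (gqaBytes * 131_072) / 2 ** 30;
console.log(gib); // 23.5
```

Under these assumptions, sharing 4 KV heads across 64 query heads cuts the cache to one sixteenth of what full multi-head attention would need.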

What it can do

235B-parameter MoE with 22B active per token (128 experts, 8 selected)
Hybrid thinking mode toggle (CoT or direct answer in one checkpoint)
Pretrained on ~36T tokens across 119 languages
256K context window with YaRN scaling
Apache 2.0 license, fully commercial
Top open-weight scores on AIME 2025, LiveCodeBench, ArenaHard, BFCL
Function calling, MCP server support, parallel tool calls
Specialised siblings: Qwen3-Coder, Qwen3-Math, Qwen3-VL
Compatible with vLLM, SGLang, llama.cpp, Ollama, MLX, HuggingFace
Broad multilingual coverage with strong Chinese/English/Japanese/Korean performance
Best for: open-weight reasoning, agentic workloads, multilingual chat, on-prem enterprise.
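When thinking mode is active, Qwen3 models typically wrap the chain-of-thought in `<think>...</think>` tags ahead of the final answer. The sketch below separates the two parts. It assumes the raw completion text contains those tags inline; whether this API returns the reasoning inline, separately, or not at all is not stated on this page. The helper name `splitThinking` is hypothetical.

```typescript
interface SplitCompletion {
  reasoning: string | null; // chain-of-thought text, if present
  answer: string; // everything outside the <think> block
}

// ASSUMPTION: reasoning, when present, arrives inline inside <think>...</think>.
function splitThinking(raw: string): SplitCompletion {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: null, answer: raw.trim() };
  }
  const start = match.index ?? 0;
  const reasoning = match[1].trim();
  const answer = (raw.slice(0, start) + raw.slice(start + match[0].length)).trim();
  return { reasoning: reasoning.length > 0 ? reasoning : null, answer };
}

const parsed = splitThinking("<think>2 + 2 = 4</think>\nThe answer is 4.");
console.log(parsed.reasoning); // "2 + 2 = 4"
console.log(parsed.answer); // "The answer is 4."
```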

Training & License

Pretrained on approximately 36 trillion tokens covering 119 languages with strong Chinese and English emphasis, code repositories and scientific content. Post-training uses a four-stage pipeline: long-CoT cold start, reasoning RL, mixed SFT integrating thinking/non-thinking modes, and a final general-purpose RL stage.

License: Apache 2.0. Open weights, fully commercial use permitted including for >100M MAU products (a notable liberalisation versus the earlier Tongyi Qianwen LICENSE).

Known limitations

Filters Chinese political topics
Large memory footprint requires multi-GPU inference for FP16
Vision requires separate Qwen3-VL checkpoint
Knowledge cutoff approximately early 2025
Long context >128K degrades on some recall tasks
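The memory limitation follows from simple arithmetic on the parameter count. The sketch below estimates the memory needed for the weights alone; it ignores KV cache, activations and framework overhead, and the 80 GB per-GPU figure is an assumption chosen for illustration.

```typescript
const PARAMS = 235e9; // total parameters, from the architecture section

// Memory for the weights alone, in decimal gigabytes.
function weightMemoryGB(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1e9;
}

// Lower bound on GPU count: weights only, no KV cache or activations.
function minGpusForWeights(
  params: number,
  bytesPerParam: number,
  gpuMemoryGB: number,
): number {
  return Math.ceil(weightMemoryGB(params, bytesPerParam) / gpuMemoryGB);
}

const GPU_MEMORY_GB = 80; // ASSUMPTION: an 80 GB accelerator

console.log(weightMemoryGB(PARAMS, 2)); // 470 (FP16: 2 bytes per parameter)
console.log(minGpusForWeights(PARAMS, 2, GPU_MEMORY_GB)); // 6
console.log(weightMemoryGB(PARAMS, 0.5)); // 117.5 (4-bit quantisation)
console.log(minGpusForWeights(PARAMS, 0.5, GPU_MEMORY_GB)); // 2
```

Although only 22B parameters are active per token, all 235B must be resident in memory, because the router can pick any expert.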

Related Models

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Sonnet 4

Anthropic

Anthropic's most capable model. Excellent for complex analysis, coding, math, and creative writing.

Free

DeepSeek V3.1

DeepSeek

DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.

Free

DeepSeek V4 Pro

DeepSeek

DeepSeek's April 2026 flagship. 1.6T MoE / 49B active params, 1M context, rivals top closed-source models on STEM and coding at a fraction of the price.

Free

Start using Qwen 3 235B Instruct today

Get started with free credits. No credit card required. Access Qwen 3 235B Instruct and 100+ other models through a single API.
