Qwen 2.5-Max

Custom
Text & Chat

Alibaba's flagship pretrained MoE model. Top-tier reasoning and code performance via DashScope API.

TL;DR · Last updated May 16, 2026

Qwen 2.5-Max is a text & chat AI model from Custom, priced at €1.60 per 1M input tokens, with a 32.8K-token context window.

Direct API access coming soon

Pricing

Price per Generation
Per generation: Free
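
The TL;DR on this page quotes €1.60 per 1M input tokens. The arithmetic behind that figure can be sketched as below. This is only an illustration: the helper name `estimateInputCostEur` is made up for this example, and output-token pricing is not stated on this page, so it is left out.

```typescript
// Input price quoted on this page: EUR 1.60 per 1,000,000 input tokens.
const EUR_PER_MILLION_INPUT_TOKENS = 1.6;

// Estimate the input-side cost in euros for a given number of prompt tokens.
// Hypothetical helper; output tokens are not priced here because the page
// does not state an output rate.
function estimateInputCostEur(inputTokens: number): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative number");
  }
  return (inputTokens / 1_000_000) * EUR_PER_MILLION_INPUT_TOKENS;
}

// One request that fills the whole 32,768-token context window:
console.log(estimateInputCostEur(32_768).toFixed(4)); // "0.0524"
```

At that rate, even a prompt that fills the entire context window costs a little over five euro cents on the input side.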

API Integration

Use our OpenAI-compatible API to integrate Qwen 2.5-Max into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("qwen-2-5-max", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("qwen-2-5-max", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("qwen-2-5-max", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Specifications
Context window: 32,768 tokens
Max output: 8,192 tokens
Developer: Custom
Category: Text & Chat
Supported formats: text
Tags: qwen, alibaba, moe, flagship
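
The two limits in the specifications can be combined into a simple prompt budget. The sketch below assumes that prompt and completion tokens share the same 32,768-token window, which this page does not state explicitly. The helper names are hypothetical.

```typescript
const CONTEXT_WINDOW_TOKENS = 32_768; // "Context window" in the specifications
const MAX_OUTPUT_TOKENS = 8_192;      // "Max output" in the specifications

// Largest prompt that still leaves `reservedOutput` tokens for the reply.
// Assumption: input and output tokens count against one shared window.
function maxPromptTokens(reservedOutput: number = MAX_OUTPUT_TOKENS): number {
  if (
    !Number.isInteger(reservedOutput) ||
    reservedOutput < 1 ||
    reservedOutput > MAX_OUTPUT_TOKENS
  ) {
    throw new RangeError(`reservedOutput must be an integer in 1..${MAX_OUTPUT_TOKENS}`);
  }
  return CONTEXT_WINDOW_TOKENS - reservedOutput;
}

// True when a prompt of `promptTokens` fits alongside the reserved output.
function fitsInContext(promptTokens: number, reservedOutput?: number): boolean {
  return promptTokens >= 0 && promptTokens <= maxPromptTokens(reservedOutput);
}

console.log(maxPromptTokens());    // 24576 when the full 8,192 output is reserved
console.log(maxPromptTokens(500)); // 32268 with max_tokens: 500, as in the sample above
```

Reserving fewer output tokens, as the `max_tokens: 500` call in the integration sample does, leaves correspondingly more room for message history.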

Deep dive: Qwen 2.5-Max from Alibaba Cloud (Qwen team)

About Alibaba Cloud (Qwen team)
Founded 2009 · Hangzhou, China

The Qwen team is Alibaba Cloud's large-language-model research unit, building on research at Alibaba's DAMO Academy and Tongyi Lab that dates back to the late 2010s. Alibaba Cloud itself was founded in 2009 and is the largest public cloud provider in China. The Qwen series first appeared in 2023 with the open-weight Qwen-7B and Qwen-72B. It was followed by Qwen 1.5 (Feb 2024), Qwen2 (Jun 2024) and Qwen2.5 (Sep 2024), which spans sizes from 0.5B to 72B plus specialised Coder, Math and VL siblings. Qwen2.5-Max arrived as the closed-weight flagship in January 2025, and the Qwen3 family later in 2025 introduced a built-in thinking mode. The team, led by Junyang Lin, has published over a dozen technical reports and powers the Tongyi Qianwen consumer chat product across Alibaba properties. Qwen open weights have become the largest derivative-model family on Hugging Face by download volume, with thousands of community fine-tunes. Qwen2.5-Max is positioned as the closed-source flagship that benchmarks against GPT-4o, Claude 3.5 Sonnet and DeepSeek V3, and it is offered only as a hosted service from Alibaba Cloud.

Architecture
Sparse Mixture-of-Experts Transformer (closed-weight flagship)

Qwen2.5-Max is the closed-weight flagship of the Qwen2.5 family, announced by Alibaba Cloud on 29 January 2025, about a month after DeepSeek V3 and roughly a week after DeepSeek-R1. It is a sparse Mixture-of-Experts (MoE) Transformer; Alibaba confirms the MoE architecture but has not publicly disclosed total or active parameter counts. The model was pretrained on more than 20 trillion tokens, surpassing Qwen2.5-72B's 18T-token corpus, with continued heavy emphasis on Chinese, English and code. Post-training combines supervised fine-tuning on Alibaba's curated multi-domain instruction set with Reinforcement Learning from Human Feedback (RLHF). Alibaba positions Qwen2.5-Max as competitive with GPT-4o, Claude 3.5 Sonnet and DeepSeek V3 across general benchmarks, and reports leading results on Arena-Hard, LiveBench, LiveCodeBench and GPQA in its own evaluations. The model is available only through the Alibaba Cloud Model Studio API and chat.qwenlm.ai; weights are not released. Function calling and JSON mode are supported, and vision is available through the separate Qwen2.5-VL-Max model. The default context window is 32K tokens; longer contexts are available on request via Alibaba Cloud.
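
In a sparse MoE layer, a router scores a set of expert sub-networks for each token and only the top-scoring few are evaluated, so the parameters active per token are a fraction of the total. The toy sketch below illustrates generic top-k routing only. Qwen2.5-Max's actual router, expert count and k are undisclosed, and every name here (`moeForward`, `Expert`) is hypothetical.

```typescript
// Toy sparse MoE layer: generic top-k routing, NOT Qwen2.5-Max's actual design.
// An expert maps a vector to a vector of the same length.
type Expert = (x: number[]) => number[];

// Numerically stable softmax over a list of logits.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

// Run only the k experts with the highest router logits and mix their
// outputs with softmax weights computed over those k logits.
function moeForward(
  x: number[],
  routerLogits: number[],
  experts: Expert[],
  k: number,
): number[] {
  if (routerLogits.length !== experts.length) {
    throw new Error("need exactly one router logit per expert");
  }
  if (!Number.isInteger(k) || k < 1 || k > experts.length) {
    throw new RangeError("k must be an integer in 1..experts.length");
  }
  const topK = routerLogits
    .map((logit, index) => ({ logit, index }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  const weights = softmax(topK.map((e) => e.logit));

  const out = new Array<number>(x.length).fill(0);
  topK.forEach((entry, i) => {
    const y = experts[entry.index](x); // the unselected experts never run
    for (let d = 0; d < out.length; d++) {
      out[d] += weights[i] * y[d];
    }
  });
  return out;
}

// Three toy experts that scale the input by different factors.
const toyExperts: Expert[] = [1, 10, 100].map(
  (factor) => (x: number[]) => x.map((v) => v * factor),
);
console.log(moeForward([1], [0, 5, 5], toyExperts, 2)); // [ 55 ]
```

With logits `[0, 5, 5]` and `k = 2`, the first expert is skipped entirely and the other two contribute with equal weight, which is the source of the compute savings that MoE models rely on.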

Parameters: Undisclosed (estimated at several hundred billion total MoE parameters)
Context: 32.8K tokens
What it can do
  • Closed-weight MoE flagship of the Qwen2.5 family
  • Pretrained on 20T+ tokens, surpassing Qwen2.5-72B
  • Benchmarks competitive with GPT-4o, Claude 3.5 Sonnet, DeepSeek V3
  • Available exclusively via Alibaba Cloud Model Studio API
  • Function calling and JSON mode
  • Strong bilingual Chinese-English performance
  • Code generation across major programming languages
  • 32K default context window (longer on request)
  • Vision via separate Qwen2.5-VL-Max checkpoint
  • Cost-competitive with US frontier APIs
  • Best for: enterprise China-based deployments, bilingual chat, Alibaba Cloud customers.
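
Since the list above mentions function calling and JSON mode, and the integration section describes the API as OpenAI-compatible, request bodies for both might look like the sketch below. It follows the OpenAI Chat Completions format. Whether this gateway or the `railwail` SDK forwards `response_format`, `tools` and `tool_choice` unchanged is an assumption, and `get_weather` is a hypothetical tool invented for illustration.

```typescript
// Request bodies in the OpenAI Chat Completions format.
// Assumption: the gateway passes `response_format` and `tools` through
// to the model unchanged; this page does not confirm that.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

const MODEL_ID = "qwen-2-5-max"; // model id used in the samples on this page

// JSON mode: ask for a reply that is a single JSON object.
function buildJsonModeRequest(userPrompt: string) {
  const messages: ChatMessage[] = [
    { role: "system", content: "Reply with a single JSON object." },
    { role: "user", content: userPrompt },
  ];
  return {
    model: MODEL_ID,
    messages,
    response_format: { type: "json_object" },
    max_tokens: 500,
  };
}

// Function calling: describe a tool with a JSON Schema for its arguments.
function buildToolCallRequest(userPrompt: string) {
  const messages: ChatMessage[] = [{ role: "user", content: userPrompt }];
  return {
    model: MODEL_ID,
    messages,
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather", // hypothetical tool, for illustration only
          description: "Look up the current weather for a city.",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
    tool_choice: "auto",
  };
}

console.log(JSON.stringify(buildToolCallRequest("Weather in Hangzhou?"), null, 2));
```

Both helpers only build plain objects, so they can be inspected or logged before being sent with whichever HTTP client or SDK is in use.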
Training & License

Pretrained on more than 20 trillion tokens of multilingual web text, code, books and scientific papers, with strong Chinese and English emphasis. The knowledge cutoff is approximately late 2024. Post-training uses supervised fine-tuning and RLHF on curated instruction data.

License: Proprietary closed-weight commercial license via Alibaba Cloud Model Studio. Weights not released. Standard Alibaba Cloud commercial terms apply.

Known limitations
  • Closed weights; no on-prem option
  • Filters Chinese political topics
  • Default 32K context shorter than Western flagships
  • API region availability concentrated in Asia
  • Limited third-party safety audits published

Start using Qwen 2.5-Max today

Get started with free credits. No credit card required. Access Qwen 2.5-Max and 100+ other models through a single API.