MiniMax-01
MiniMax's 456B hybrid lightning-attention model with native 4M-token context. Industry-leading long-context.
MiniMax-01 is a text and chat AI model from MiniMax, priced at €0.200 per 1M input tokens, with a 4.1M-token context window.
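As a rough illustration of the input price quoted above, the sketch below estimates the input-side cost of a prompt. The helper name `estimateInputCostEur` is hypothetical and not part of any SDK, and output-token pricing is not stated here, so it is left out.

```javascript
// Input price taken from the listing above: €0.200 per 1M input tokens.
const INPUT_PRICE_EUR_PER_MILLION = 0.2;

// Hypothetical helper (not part of any SDK): estimates input-side cost only.
function estimateInputCostEur(inputTokens) {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative number");
  }
  return (inputTokens / 1_000_000) * INPUT_PRICE_EUR_PER_MILLION;
}

console.log(estimateInputCostEur(250_000).toFixed(3)); // 0.050
console.log(estimateInputCostEur(1_000_000).toFixed(3)); // 0.200
```

At this rate, even a prompt that fills a 1M-token window costs about €0.20 on the input side.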
API Integration
Use our OpenAI-compatible API to integrate MiniMax-01 into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple — just pass a string
const reply = await rw.run("minimax-01", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("minimax-01", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("minimax-01", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: MiniMax's MiniMax-01
MiniMax (上海稀宇科技, Shanghai Xiyu Technology) is a Shanghai-based AI startup founded in late 2021 by Yan Junjie, a former vice president of SenseTime. The company has built a portfolio of consumer AI products including the Talkie and Hailuo AI chatbots, the abab text models, and the Hailuo Video generative video line. MiniMax has raised over $850M from investors including Alibaba, Tencent, Hillhouse Capital and IDG Capital, with a 2024 valuation above $2.5B. MiniMax-01, released in January 2025, was the company's first open-weight frontier release and introduced lightning attention, a linear-attention variant, at scale. According to the technical paper, the model was trained with a context of up to 1M tokens and extrapolates to 4M tokens at inference.
MiniMax-01 (the MiniMax-Text-01 base model plus the MiniMax-VL-01 vision variant) combines lightning attention, MiniMax's linear-attention design, with periodic softmax attention layers: every 8th layer uses full softmax attention. The architecture has 80 layers with a hidden size of 6,144, and its MoE feed-forward blocks use 32 experts with top-2 routing, yielding 45.9B active parameters out of 456B total. According to the technical paper, the hybrid design supports a training context of up to 1M tokens and extrapolates to 4M tokens at inference, with compute in the lightning-attention layers growing roughly linearly with sequence length. MiniMax describes MiniMax-01 in its technical paper as the first production-scale linear-attention LLM. The VL variant adds 336x336 image patch encoding for vision input. The model was released in January 2025 under the MiniMax Model License, with open weights on Hugging Face alongside hosted API access via the MiniMax Open Platform.
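The layer pattern and parameter figures above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the softmax layer sits at the end of each group of 8, which the description does not spell out, and `buildLayerSchedule` is a hypothetical helper name.

```javascript
// Figures from the description above. The exact position of the softmax
// layer within each group of 8 is an assumption made for illustration.
const NUM_LAYERS = 80;
const SOFTMAX_PERIOD = 8;

// Hypothetical helper: label each layer as "lightning" or "softmax".
function buildLayerSchedule(numLayers, period) {
  const schedule = [];
  for (let i = 1; i <= numLayers; i++) {
    schedule.push(i % period === 0 ? "softmax" : "lightning");
  }
  return schedule;
}

const schedule = buildLayerSchedule(NUM_LAYERS, SOFTMAX_PERIOD);
const softmaxCount = schedule.filter((kind) => kind === "softmax").length;
console.log(softmaxCount, NUM_LAYERS - softmaxCount); // 10 70

// Share of parameters active per token: 45.9B of 456B.
const activeShare = 45.9 / 456;
console.log((activeShare * 100).toFixed(1) + "%"); // 10.1%
```

Under these assumptions, 70 of the 80 layers use lightning attention, and roughly a tenth of the weights participate in any single token's forward pass.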
- Parameters: 456B total, 45.9B active per token (32 experts, top-2 routing)
- Context: 4M tokens
- Industry-leading 1M-4M token context window
- First production-scale lightning (linear) attention LLM
- 456B total / 45.9B active parameters
- Strong needle-in-haystack performance reported in the technical paper
- Competitive with GPT-4o on standard benchmarks (MMLU, GSM8K, HumanEval)
- Open weights released — first frontier-scale lightning-attention model in the open
- Vision variant (MiniMax-VL-01) supports image inputs
- Multilingual with strong Chinese and English performance
- Best for: ultra-long-context analysis, Chinese-language applications, long-form agentic workflows, research on linear attention.
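To give a feel for why linear attention suits very long contexts, here is a toy causal linear attention in plain JavaScript. It is a generic textbook formulation and not MiniMax's lightning-attention kernel: it leaves out decay, normalization, and the tiled GPU implementation, and both function names are hypothetical. The recurrent form carries a fixed-size state from token to token, and the quadratic form is included only as a reference to show the two agree.

```javascript
// Recurrent form: keep a running d x dv state S = sum_j k_j v_j^T.
// The state size does not depend on how many tokens have been seen.
function linearAttentionRecurrent(Q, K, V) {
  const d = K[0].length;
  const dv = V[0].length;
  const S = Array.from({ length: d }, () => new Array(dv).fill(0));
  const out = [];
  for (let t = 0; t < Q.length; t++) {
    // Fold token t into the state: S += k_t v_t^T.
    for (let a = 0; a < d; a++) {
      for (let b = 0; b < dv; b++) {
        S[a][b] += K[t][a] * V[t][b];
      }
    }
    // Read out: o_t = q_t^T S.
    const o = new Array(dv).fill(0);
    for (let a = 0; a < d; a++) {
      for (let b = 0; b < dv; b++) {
        o[b] += Q[t][a] * S[a][b];
      }
    }
    out.push(o);
  }
  return out;
}

// Quadratic reference: o_t = sum over j <= t of (q_t . k_j) v_j.
function linearAttentionQuadratic(Q, K, V) {
  const dv = V[0].length;
  return Q.map((q, t) => {
    const o = new Array(dv).fill(0);
    for (let j = 0; j <= t; j++) {
      const score = q.reduce((acc, qa, a) => acc + qa * K[j][a], 0);
      for (let b = 0; b < dv; b++) o[b] += score * V[j][b];
    }
    return o;
  });
}

const Q = [[1, 0], [0, 1], [1, 1]];
const K = [[1, 2], [0, 1], [2, 0]];
const V = [[1, 0], [0, 1], [1, 1]];
console.log(JSON.stringify(linearAttentionRecurrent(Q, K, V))); // [[1,0],[2,1],[5,3]]
console.log(JSON.stringify(linearAttentionQuadratic(Q, K, V))); // [[1,0],[2,1],[5,3]]
```

The recurrent version does a constant amount of work per token, so total cost grows linearly with sequence length, while the reference version revisits every earlier token and grows quadratically.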
MiniMax-01 was pretrained on trillions of tokens of multilingual web data with heavy Chinese and English representation, plus code and math; the VL variant adds image-text pairs. The data composition is only partially described in the technical paper. The knowledge cutoff is approximately late 2024.
License: MiniMax Model License. Permits commercial use with acceptable-use restrictions; products at scale may require registration with MiniMax. Review the license file on Hugging Face before deployment.
Known limitations
- 456B total parameters — substantial GPU memory required for self-hosting
- Lightning attention has less mature kernel support than softmax
- Behind o1, R1 and Claude 4 on the hardest reasoning tasks
- Vision quality below GPT-4o and Claude 3.5 Sonnet
- Filters politically sensitive topics consistent with Chinese regulations
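For a sense of scale on the self-hosting point, the following back-of-the-envelope sketch estimates memory for the weights alone. It ignores KV cache, activations, and framework overhead, and the 80 GB accelerator size is an assumption chosen for illustration; `weightMemoryGB` is a hypothetical helper.

```javascript
// Total parameter count from the listing above.
const TOTAL_PARAMS = 456e9;

// Hypothetical helper: memory for weights alone, in decimal gigabytes.
function weightMemoryGB(paramCount, bytesPerParam) {
  return (paramCount * bytesPerParam) / 1e9;
}

console.log(weightMemoryGB(TOTAL_PARAMS, 2)); // 912 (16-bit weights)
console.log(weightMemoryGB(TOTAL_PARAMS, 1)); // 456 (8-bit weights)

// Minimum number of assumed 80 GB accelerators to hold 16-bit weights alone.
console.log(Math.ceil(weightMemoryGB(TOTAL_PARAMS, 2) / 80)); // 12
```

Real deployments need headroom beyond this lower bound, particularly when serving long contexts.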
Related Models
Claude Opus 4
Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.
Claude Sonnet 4
Anthropic's most capable model. Excellent for complex analysis, coding, math, and creative writing.
DeepSeek V3.1
DeepSeek's refreshed V3.1 release. 671B MoE / 37B active. Tops open-weights leaderboards on coding and reasoning.
DeepSeek V4 Pro
DeepSeek's April 2026 flagship. 1.6T MoE / 49B active params, 1M context, rivals top closed-source models on STEM and coding at a fraction of the price.
Start using MiniMax-01 today
Get started with free credits. No credit card required. Access MiniMax-01 and 100+ other models through a single API.