MiniMax-01

MiniMax
Text & Chat

MiniMax's 456B-parameter hybrid lightning-attention model, trained at context lengths up to 1M tokens and able to extrapolate to 4M tokens at inference.

TL;DR · Last updated May 16, 2026

MiniMax-01 is a text and chat AI model from MiniMax, priced at €0.200 per 1M input tokens, with a context window of 4,096,000 tokens (about 4.1M).


Direct API access coming soon

Pricing

Price per generation: Free
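
As a rough illustration of the per-token input price quoted in the TL;DR (€0.200 per 1M input tokens), the sketch below estimates input cost for a given prompt size. The helper `estimateInputCostEur` is hypothetical and not part of the railwail package. Output-token pricing is not stated on this page, so it is left out.

```typescript
// Assumption: input price copied from the TL;DR on this page.
// Output pricing is not listed here, so only input cost is estimated.
const INPUT_PRICE_EUR_PER_MILLION = 0.2;

// Hypothetical helper for illustration; not a railwail API.
function estimateInputCostEur(inputTokens: number): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative finite number");
  }
  return (inputTokens / 1_000_000) * INPUT_PRICE_EUR_PER_MILLION;
}

// A 250k-token prompt, and a prompt that fills the whole 4,096,000-token window.
console.log(estimateInputCostEur(250_000).toFixed(4)); // "0.0500"
console.log(estimateInputCostEur(4_096_000).toFixed(4)); // "0.8192"
```

Under these assumptions, even a prompt that fills the full context window costs under one euro in input tokens.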

API Integration

Use our OpenAI-compatible API to integrate MiniMax-01 into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("minimax-01", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("minimax-01", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("minimax-01", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Specifications
Context window: 4,096,000 tokens
Max output: 16,384 tokens
Developer: MiniMax
Category: Text & Chat
Supported formats: text
Tags: minimax, long-context, lightning-attention, open-weights, 4m-context
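
The limits listed above can be checked before sending a request. The sketch below clamps the requested output length to the 16,384-token maximum and tests whether a prompt still fits. It assumes that prompt and output tokens share the 4,096,000-token window, which this page does not confirm, and `planRequest` is a hypothetical helper rather than a railwail function.

```typescript
// Limits copied from the Specifications section.
const CONTEXT_WINDOW_TOKENS = 4_096_000;
const MAX_OUTPUT_TOKENS = 16_384;

interface RequestPlan {
  fits: boolean; // true if prompt plus reserved output fits the window
  maxTokens: number; // output budget after clamping to the model maximum
  remaining: number; // tokens left over (negative if over budget)
}

// Hypothetical helper. Assumption: prompt and output share one context window.
function planRequest(promptTokens: number, desiredOutputTokens: number): RequestPlan {
  const maxTokens = Math.min(desiredOutputTokens, MAX_OUTPUT_TOKENS);
  const remaining = CONTEXT_WINDOW_TOKENS - promptTokens - maxTokens;
  return { fits: remaining >= 0, maxTokens, remaining };
}

console.log(planRequest(4_000_000, 50_000));
// { fits: true, maxTokens: 16384, remaining: 79616 }
console.log(planRequest(4_090_000, 16_384));
// { fits: false, maxTokens: 16384, remaining: -10384 }
```

The resulting `maxTokens` value could then be passed as `max_tokens` in the options object shown in the integration example.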

Deep dive — MiniMax's MiniMax-01

About MiniMax
Founded 2021 · Shanghai, China

MiniMax (上海稀宇科技, Shanghai Xiyu Technology) is a Shanghai-based AI startup founded in late 2021 by Yan Junjie, a former vice president of SenseTime. The company has built a portfolio of consumer AI products including the Talkie and Hailuo AI chatbots, the abab text models, and the Hailuo Video generative video line. MiniMax has raised over $850M from investors including Alibaba, Tencent, Hillhouse Capital and IDG Capital, with a 2024 valuation above $2.5B. MiniMax-01, released in January 2025, was the company's first open-weight frontier release. It introduced lightning attention (a linear-attention variant) at scale, with training at context lengths up to 1M tokens and extrapolation to 4M tokens at inference.

Architecture
Hybrid Lightning-Attention + Softmax-Attention Mixture-of-Experts

MiniMax-01 (the MiniMax-Text-01 base model plus the MiniMax-VL-01 vision variant) combines lightning attention, MiniMax's linear-attention design, with periodic softmax attention layers: every 8th layer uses full softmax attention. The architecture has 80 layers with a hidden size of 6,144, and its MoE feed-forward blocks use 32 experts with top-2 routing, yielding 45.9B active parameters out of 456B total. The hybrid design supports training at context lengths up to 1M tokens and extrapolation to 4M tokens at inference, with near-linear compute per token in the lightning-attention layers. MiniMax describes MiniMax-01 in its technical paper as the first production-scale linear-attention LLM. The VL variant adds 336x336 image patch encoding for vision input. The model was released in January 2025 under the MiniMax Model License, with open weights on Hugging Face alongside hosted API access via the MiniMax Open Platform.
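
To make the layer layout concrete, the sketch below builds the attention schedule implied by the description above: 80 layers, with every 8th one using softmax attention and the rest using lightning attention. The function name and the exact position of the softmax layer within each group of 8 are assumptions for illustration, not taken from MiniMax's code.

```typescript
type LayerKind = "lightning" | "softmax";

// Hypothetical helper: marks every `softmaxEvery`-th layer (1-based) as softmax.
function buildLayerSchedule(numLayers: number, softmaxEvery: number): LayerKind[] {
  return Array.from({ length: numLayers }, (_, i): LayerKind =>
    (i + 1) % softmaxEvery === 0 ? "softmax" : "lightning",
  );
}

const schedule = buildLayerSchedule(80, 8);
const softmaxCount = schedule.filter((kind) => kind === "softmax").length;

console.log(schedule.slice(0, 8).join(", "));
// lightning x7, then softmax
console.log(softmaxCount, schedule.length - softmaxCount); // 10 70
```

Under this layout only 10 of the 80 layers pay the quadratic cost of full softmax attention, which is where the near-linear per-token compute comes from.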

Parameters: 456B total, 45.9B active per token (32 experts, top-2 routing)
Context: up to 4M tokens at inference (trained at up to 1M tokens)
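
The top-2 routing mentioned above can be sketched generically: a router scores all 32 experts for each token, and only the two highest-scoring experts run. This is a minimal sketch of standard top-k MoE gating, not MiniMax's implementation. Renormalizing the selected weights so they sum to 1 is a common convention that is assumed here.

```typescript
// Numerically stable softmax over router logits.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

interface Route {
  expert: number; // index of the selected expert
  weight: number; // mixing weight after renormalization
}

// Generic top-k gating (assumed convention: renormalize the selected weights).
function routeTopK(routerLogits: number[], k: number): Route[] {
  const top = softmax(routerLogits)
    .map((weight, expert) => ({ expert, weight }))
    .sort((a, b) => b.weight - a.weight)
    .slice(0, k);
  const norm = top.reduce((s, r) => s + r.weight, 0);
  return top.map((r) => ({ expert: r.expert, weight: r.weight / norm }));
}

// Made-up logits for one token over 32 experts.
const logits: number[] = new Array<number>(32).fill(0);
logits[5] = 2.0;
logits[17] = 1.0;
const routes = routeTopK(logits, 2);
console.log(routes.map((r) => r.expert)); // [ 5, 17 ]

// Share of parameters active per token, from the figures above.
console.log(((45.9 / 456) * 100).toFixed(1) + "%"); // "10.1%"
```

The active share is higher than 2/32 because attention and embedding parameters run for every token regardless of routing.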
What it can do
  • Long-context support: trained at up to 1M tokens, extrapolating to 4M tokens at inference
  • First production-scale lightning (linear) attention LLM
  • 456B total / 45.9B active parameters
  • Strong needle-in-haystack performance reported in the technical paper
  • Reported as competitive with GPT-4o on standard benchmarks (MMLU, GSM8K, HumanEval)
  • Open weights released — first frontier-scale lightning-attention model in the open
  • Vision variant (MiniMax-VL-01) supports image inputs
  • Multilingual with strong Chinese and English performance
  • Best for: ultra-long-context analysis, Chinese-language applications, long-form agentic workflows, research on linear attention.
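
The linear-attention idea behind the list above can be illustrated with a generic causal recurrence: instead of comparing each query against every earlier key, the model keeps a running key-value state and updates it once per token, so the cost grows linearly with sequence length. This is a minimal sketch of plain, unnormalized linear attention. It omits the decay terms, normalization, and block-tiled kernels that a production design such as lightning attention would use, and it is not MiniMax's code.

```typescript
type Vec = number[];

// Generic causal linear attention: out[t] = q[t] * S_t, where
// S_t is the running sum of outer products k[s] v[s]^T for s <= t.
function causalLinearAttention(q: Vec[], k: Vec[], v: Vec[]): Vec[] {
  if (q.length === 0) return [];
  const dk = k[0].length;
  const dv = v[0].length;
  // state[i][j] accumulates k[s][i] * v[s][j] over all tokens seen so far.
  const state: number[][] = Array.from({ length: dk }, () => new Array<number>(dv).fill(0));
  const out: Vec[] = [];
  for (let t = 0; t < q.length; t++) {
    // Fold the current token into the state: O(dk * dv), independent of t.
    for (let i = 0; i < dk; i++) {
      for (let j = 0; j < dv; j++) state[i][j] += k[t][i] * v[t][j];
    }
    // Read out with the current query.
    const o = new Array<number>(dv).fill(0);
    for (let i = 0; i < dk; i++) {
      for (let j = 0; j < dv; j++) o[j] += q[t][i] * state[i][j];
    }
    out.push(o);
  }
  return out;
}

const keys = [[1, 0], [0, 1]];
const values = [[1, 2], [3, 4]];
console.log(causalLinearAttention([[1, 1], [1, 1]], keys, values));
// [ [ 1, 2 ], [ 4, 6 ] ]
```

Each step touches only the fixed-size state, which is why per-token work stays constant as the context grows.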
Training & License

Pretrained on trillions of tokens of multilingual web data with heavy Chinese and English representation, plus code and math; the VL variant adds image-text pairs. The technical paper describes the data composition only partially. The knowledge cutoff is approximately late 2024.

License: MiniMax Model License. Permits commercial use with acceptable-use restrictions; products at scale may require registration with MiniMax. Review the license file on Hugging Face before deployment.

Known limitations
  • 456B total parameters — substantial GPU memory required for self-hosting
  • Lightning attention has less mature kernel support than softmax
  • Trails o1, R1, and Claude 4 on the hardest reasoning tasks
  • Vision quality below GPT-4o and Claude 3.5 Sonnet
  • Filters politically sensitive topics consistent with Chinese regulations
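
To put the self-hosting limitation in numbers, the sketch below estimates the memory needed just to hold 456B parameters at a few common precisions. It counts weights only, ignoring KV cache, activations, and framework overhead. The 80 GB accelerator size is an assumed example, not a recommendation from MiniMax.

```typescript
const TOTAL_PARAMS = 456e9; // total parameter count from this page

// Weights-only footprint in gigabytes (1 GB = 1e9 bytes).
function weightMemoryGB(bytesPerParam: number): number {
  return (TOTAL_PARAMS * bytesPerParam) / 1e9;
}

// Minimum device count to hold the weights alone (assumed device size).
function minDevices(bytesPerParam: number, deviceMemoryGB: number): number {
  return Math.ceil(weightMemoryGB(bytesPerParam) / deviceMemoryGB);
}

console.log(weightMemoryGB(2), minDevices(2, 80)); // 912 12  (16-bit weights)
console.log(weightMemoryGB(1), minDevices(1, 80)); // 456 6   (8-bit weights)
console.log(weightMemoryGB(0.5), minDevices(0.5, 80)); // 228 3   (4-bit weights)
```

Real deployments need headroom beyond these figures, especially for long-context inference.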
