How much does MiniMax-01 cost via Railwail?

Input: €0.200 per 1M tokens. Output: €1.10 per 1M tokens. No monthly minimum, no subscription. Start with €5 free credits.
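
As a rough illustration of the per-token pricing above, the sketch below estimates the cost of a single request. The helper name `estimateCostEur` and the example token counts are hypothetical; only the two rates come from this page.

```typescript
// Rates quoted above, in EUR per 1M tokens.
const INPUT_EUR_PER_MILLION = 0.2;
const OUTPUT_EUR_PER_MILLION = 1.1;

// Hypothetical helper: estimate the cost of one request in EUR.
function estimateCostEur(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  return (
    (inputTokens / 1_000_000) * INPUT_EUR_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_EUR_PER_MILLION
  );
}

// Example: a 200,000-token prompt with a 2,000-token reply.
const cost = estimateCostEur(200_000, 2_000);
console.log(cost.toFixed(4)); // "0.0422"
```

At these rates, long prompts dominate the bill only when they are much larger than the reply, because output tokens cost 5.5 times as much as input tokens.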

What is the context window of MiniMax-01?

MiniMax-01 supports a 4,096,000-token (roughly 4.1M) context window, enough for entire codebases or research papers in one prompt.

How fast is MiniMax-01?

Latency depends on prompt length and load, typically 200 ms to 2 s for short prompts. We measure p50/p95 latency in real time on /rankings.
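
For readers unfamiliar with p50/p95, here is a minimal sketch of how such percentiles can be computed from raw latency samples using the nearest-rank method. This is an assumption for illustration, not a description of how /rankings is implemented, and the sample values are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Made-up latency samples in milliseconds, for illustration only.
const latenciesMs = [210, 340, 250, 1900, 480, 300, 275, 620, 230, 390];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
console.log({ p50, p95 });
```

The p95 value is dominated by the slowest requests, which is why it is usually reported alongside the median rather than instead of it.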

Is MiniMax-01 better than Bio_ClinicalBERT?

It depends on your use case. MiniMax-01 (MiniMax) is a generative long-context chat model, while Bio_ClinicalBERT (huggingface) is a fill-mask encoder for clinical text; both are listed under text & chat. Compare them side-by-side at /compare/minimax-01-vs-bio-clinicalbert.

MiniMax-01


Popular

Minimax

Text & Chat

MiniMax's 456B-parameter hybrid lightning-attention model with a context window of up to 4M tokens, aimed at long-context workloads.


TL;DR · Last updated June 24, 2026

MiniMax-01 is a text & chat AI model from MiniMax, priced at €0.200 per 1M input tokens and €1.10 per 1M output tokens, with a 4.1M-token context window.


Pricing

Price per Generation

Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate MiniMax-01 into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("minimax-01", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("minimax-01", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("minimax-01", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
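
Because the API is described as OpenAI-compatible, the options passed to `rw.chat` above correspond to fields of a Chat Completions style request body. The sketch below builds such a body by hand. The helper name `buildChatRequest` is hypothetical, the endpoint URL is not given on this page and is left out, and clamping `max_tokens` to the 16,384-token maximum output listed under Specifications is an assumption about sensible client behavior, not documented server behavior.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  max_tokens: number;
}

// Maximum output length listed under Specifications.
const MAX_OUTPUT_TOKENS = 16_384;

// Hypothetical helper: build an OpenAI-style chat request body for minimax-01.
function buildChatRequest(
  messages: ChatMessage[],
  temperature = 0.7,
  maxTokens = 500,
): ChatRequestBody {
  if (messages.length === 0) {
    throw new Error("at least one message is required");
  }
  return {
    model: "minimax-01",
    messages,
    temperature,
    // Assumption: clamp on the client rather than rely on a server error.
    max_tokens: Math.min(maxTokens, MAX_OUTPUT_TOKENS),
  };
}

const body = buildChatRequest([{ role: "user", content: "Hello!" }], 0.7, 50_000);
console.log(JSON.stringify(body, null, 2));
```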

Specifications

Context window: 4,096,000 tokens
Max output: 16,384 tokens
Developer: MiniMax
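
A quick way to use these limits is a pre-flight check before sending a very long prompt. The sketch below is a rough guide only: the 4-characters-per-token ratio is a crude assumption (the real tokenizer will differ, especially for Chinese text), the helper names are hypothetical, and it assumes the prompt and the reply share the same context window.

```typescript
// Limits from the Specifications list above.
const CONTEXT_WINDOW_TOKENS = 4_096_000;
const MAX_OUTPUT_TOKENS = 16_384;

// Crude estimate: assumes about 4 characters per token (an assumption,
// not the model's actual tokenizer).
function estimateTokens(text: string, charsPerToken = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

// Assumes prompt tokens and reserved output tokens share one window.
function fitsInContext(
  promptTokens: number,
  reservedOutputTokens: number = MAX_OUTPUT_TOKENS,
): boolean {
  return promptTokens + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}

const promptTokens = estimateTokens("a".repeat(1_000)); // 250 under the 4-char assumption
console.log(promptTokens, fitsInContext(promptTokens));
console.log(fitsInContext(4_090_000)); // false: no room left for a full-length reply
```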

Deep dive — MiniMax's MiniMax-01

About MiniMax

Founded 2021 · Shanghai, China

MiniMax (上海稀宇科技, Shanghai Xiyu Technology) is a Shanghai-based AI startup founded in late 2021 by Yan Junjie (former vice president of SenseTime). The company has built a portfolio of consumer AI products including the Talkie and Hailuo AI chatbots, abab text models, and the Hailuo Video generative video line. MiniMax has raised over $850M from investors including Alibaba, Tencent, Hillhouse Capital and IDG Capital, with a 2024 valuation above $2.5B. MiniMax-01, released January 2025, was the company's first open-weight frontier release and introduced lightning attention, a linear-attention variant, at scale, with a training context of up to 1M tokens that extrapolates to 4M tokens at inference.


Architecture

Hybrid Lightning-Attention + Softmax-Attention Mixture-of-Experts

MiniMax-01 (the MiniMax-Text-01 base model plus the MiniMax-VL-01 vision variant) combines lightning attention, MiniMax's linear-attention design, with periodic softmax attention layers (every 8th layer is full softmax). The architecture has 80 layers with a hidden size of 6,144, and MoE feed-forward blocks with 32 experts and top-2 routing, yielding 45.9B active parameters out of 456B total. The hybrid design supports a training context of up to 1M tokens that extrapolates to 4M tokens at inference, with compute that grows near-linearly with sequence length in the lightning-attention layers. MiniMax describes MiniMax-01 in its technical paper as the first production-scale linear-attention LLM. The VL variant adds 336x336 image patch encoding for vision input. The model was released in January 2025 under the MiniMax Model License, with open weights on Hugging Face alongside hosted API access via the MiniMax Open Platform.
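
To show why linear attention scales better than softmax attention, here is a minimal sketch of generic causal linear attention. It is not MiniMax's lightning-attention kernel: tiling, normalization and any other details from the technical paper are omitted, and all names are hypothetical. The quadratic form compares every query with every earlier key, while the linear form keeps a running key-value state so the work per token stays constant.

```typescript
type Matrix = number[][];

// Quadratic reference: out[t] = sum over s <= t of (q[t] . k[s]) * v[s].
function causalAttentionQuadratic(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const dv = v[0].length;
  return q.map((qt, t) => {
    const out = new Array<number>(dv).fill(0);
    for (let s = 0; s <= t; s++) {
      const score = qt.reduce((acc, x, i) => acc + x * k[s][i], 0);
      for (let j = 0; j < dv; j++) out[j] += score * v[s][j];
    }
    return out;
  });
}

// Linear form: keep a running state S = sum of outer(k[s], v[s]) and
// compute out[t] = q[t] . S, so the cost per token does not grow with t.
function causalAttentionLinear(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const dk = k[0].length;
  const dv = v[0].length;
  const state: Matrix = Array.from({ length: dk }, () =>
    new Array<number>(dv).fill(0),
  );
  return q.map((qt, t) => {
    for (let i = 0; i < dk; i++) {
      for (let j = 0; j < dv; j++) state[i][j] += k[t][i] * v[t][j];
    }
    const out = new Array<number>(dv).fill(0);
    for (let i = 0; i < dk; i++) {
      for (let j = 0; j < dv; j++) out[j] += qt[i] * state[i][j];
    }
    return out;
  });
}

// Tiny integer example so both forms can be compared exactly.
const q: Matrix = [[1, 0], [0, 1], [1, 1]];
const k: Matrix = [[1, 2], [3, 4], [5, 6]];
const v: Matrix = [[1, 0], [0, 1], [1, 1]];
console.log(causalAttentionQuadratic(q, k, v)); // [[1, 0], [2, 4], [14, 18]]
console.log(causalAttentionLinear(q, k, v));    // same values
```

Both functions return the same result; the difference is that the first does work proportional to the square of the sequence length, while the second does work proportional to the sequence length.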

Parameters: 456B total, 45.9B active per token (32 experts, top-2 routing)
Context: 4M tokens
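
The figures above imply a few derived numbers that are easy to check. The sketch below works them out. It assumes 1-based layer indexing with the softmax layer placed last in each group of 8, which is one reading of "every 8th layer"; the constant names are hypothetical.

```typescript
// Figures from the architecture description above.
const TOTAL_LAYERS = 80;
const SOFTMAX_EVERY = 8;
const TOTAL_PARAMS_B = 456;
const ACTIVE_PARAMS_B = 45.9;

// Assumption: layer i (1-based) uses softmax attention when i is a multiple of 8.
const layerKinds = Array.from({ length: TOTAL_LAYERS }, (_, i) =>
  (i + 1) % SOFTMAX_EVERY === 0 ? "softmax" : "lightning",
);

const softmaxLayers = layerKinds.filter((kind) => kind === "softmax").length;
const lightningLayers = TOTAL_LAYERS - softmaxLayers;
const activeFraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B;

console.log(softmaxLayers, lightningLayers);          // 10 70
console.log((activeFraction * 100).toFixed(1) + "%"); // "10.1%"
```

So roughly one layer in eight pays the quadratic attention cost, and about a tenth of the parameters are used for any given token.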

What it can do

Industry-leading 1M-4M token context window
First production-scale lightning (linear) attention LLM
456B total / 45.9B active parameters
Strong needle-in-haystack performance reported in the technical paper
Competitive with GPT-4o on standard benchmarks (MMLU, GSM8K, HumanEval)
Open weights released — first frontier-scale lightning-attention model in the open
Vision variant (MiniMax-VL-01) supports image inputs
Multilingual with strong Chinese and English performance
Best for: ultra-long-context analysis, Chinese-language applications, long-form agentic workflows, research on linear attention.

Training & License

Pretrained on trillions of tokens of multilingual web data with heavy Chinese and English representation, plus code and math; the VL variant adds image-text pairs. The data composition is only partially described in the technical paper. The knowledge cutoff is approximately late 2024.

License: MiniMax Model License. Permits commercial use with acceptable-use restrictions; products at scale may require registration with MiniMax. Review the license file on Hugging Face before deployment.

Known limitations

456B total parameters — substantial GPU memory required for self-hosting
Lightning attention has less mature kernel support than softmax
Behind o1 / R1 / Claude 4 on hardest reasoning tasks
Vision quality below GPT-4o and Claude 3.5 Sonnet
Filters politically sensitive topics consistent with Chinese regulations
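
To put the self-hosting limitation in numbers, the sketch below estimates the memory needed just to hold the weights at different precisions. It covers weights only (no KV cache, activations or framework overhead), assumes every expert must be resident in memory even though only 45.9B parameters are active per token, and uses a hypothetical 80 GB accelerator as the unit for the device count.

```typescript
const TOTAL_PARAMS = 456e9;

// Weights-only memory in GB (10^9 bytes) for a given precision.
function weightMemoryGB(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1e9;
}

// Hypothetical device size used only to illustrate the scale.
const DEVICE_MEMORY_GB = 80;

const precisions: Array<[string, number]> = [
  ["bf16", 2],
  ["int8", 1],
  ["int4", 0.5],
];

for (const [name, bytes] of precisions) {
  const gb = weightMemoryGB(TOTAL_PARAMS, bytes);
  const devices = Math.ceil(gb / DEVICE_MEMORY_GB);
  console.log(`${name}: ${gb} GB, at least ${devices} x ${DEVICE_MEMORY_GB} GB devices`);
}
```

Under these assumptions, bf16 weights alone come to 912 GB, which is why a multi-GPU node is the practical minimum even before serving long contexts.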


Related Models


Bio_ClinicalBERT

huggingface

The original Bio_ClinicalBERT from Alsentzer et al., a BERT model initialized from BioBERT and further pretrained on all MIMIC-III clinical notes. Served as a fill-mask endpoint it predicts masked tokens in clinical text and produces clinical embeddings. It is the standard encoder backbone behind many downstream clinical NLP fine-tunes.

€1.00

Biomedical NER (all entities)

huggingface

Token-classification model from d4data that tags 84 biomedical entity types in clinical and medical text, including disease, sign, symptom, medication, dosage, lab value, body part and procedure. Trained on the Maccrobat clinical case corpus on a DistilBERT base, so it runs cheaply for high-volume tagging.

€1.00

Claude Opus 4

Anthropic

Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.

Free

Claude Opus 4.8