Minimax Video
MiniMax's video generation model. Fast, high-quality video output with text-to-video capabilities.
Minimax Video is a video generation AI model from MiniMax, priced at €0.000 per 1M input tokens with an unknown context window.
Examples
See what Minimax Video can generate
Food Animation
"Close-up of chocolate being poured over a stack of fresh strawberries in slow motion, the chocolate coating each berry as it cascades down, warm studio lighting"
Social Media Clip
"A person walking confidently through a colorful mural-covered alley, camera tracking alongside them, vibrant graffiti art on both walls, casual street fashion, golden hour"
API Integration
Use our OpenAI-compatible API to integrate Minimax Video into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple — just pass a string
const reply = await rw.run("minimax-video", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("minimax-video", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("minimax-video", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: MiniMax's Minimax Video
MiniMax (Shanghai Xiyu Technology) was founded in December 2021 by Yan Junjie, a former SenseTime general manager, alongside several senior researchers from SenseTime and Microsoft Research Asia. It is one of China's 'four AI tigers' along with Zhipu, Moonshot and Baichuan, and has raised over $1B from investors including Tencent, Alibaba, HongShan and miHoYo, at a valuation above $2.5B in 2024. MiniMax developed the abab LLM family and the Hailuo AI consumer chat product. The 'Minimax Video' product (a.k.a. Hailuo Video on the consumer surface) generates short cinematic clips and was, at launch in September 2024, one of the first Chinese video AIs widely available to international users. This entry covers the developer-facing MiniMax Video API, which exposes the underlying Hailuo Video models to third-party platforms.
MiniMax Video is the API-tier branding for the Hailuo Video model family (currently Video-01, Video-01-Live, Video-01-Subject, Hailuo 02 and Hailuo 2.3). All variants share a closed latent-video-diffusion architecture: a 3D VAE compresses video into a spatio-temporal latent grid, and a transformer-based denoiser is conditioned on bilingual Chinese/English text embeddings (from a MiniMax abab-family LLM encoder) plus optional image embeddings. The variants emphasise different conditioning modalities: text-only (Video-01), Live2D-style anime animation (Video-01-Live) and subject-reference identity locking (Video-01-Subject). Generation produces 6-10 second clips at 720p-1080p and 24-30 fps. MiniMax publishes no model card; observed behaviour and reported throughput suggest a multi-billion-parameter DiT trained on a curated multilingual video corpus with synthetic dense captions and reward-model post-training.
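To make the "spatio-temporal latent grid" idea concrete, here is a minimal sketch of how a clip's pixel dimensions might map to latent dimensions under a 3D VAE. The compression factors (4x in time, 8x in each spatial axis) are assumptions chosen for illustration, since MiniMax has not published the real ones, and `latentGridShape` is a hypothetical helper, not part of any MiniMax or railwail API.

```javascript
// Assumed compression factors, for illustration only.
// MiniMax has not disclosed the actual values for its 3D VAE.
const ASSUMED_TEMPORAL_FACTOR = 4;
const ASSUMED_SPATIAL_FACTOR = 8;

// Hypothetical helper: estimate the latent grid a clip would occupy.
function latentGridShape({ seconds, fps, width, height }) {
  const frames = Math.round(seconds * fps);
  const latentFrames = Math.ceil(frames / ASSUMED_TEMPORAL_FACTOR);
  const latentHeight = Math.ceil(height / ASSUMED_SPATIAL_FACTOR);
  const latentWidth = Math.ceil(width / ASSUMED_SPATIAL_FACTOR);
  return {
    frames,
    latentFrames,
    latentHeight,
    latentWidth,
    // Number of grid positions the denoiser would have to process.
    positions: latentFrames * latentHeight * latentWidth,
  };
}

// A 6-second 720p clip at 24 fps, the low end of the ranges quoted above.
const shape = latentGridShape({ seconds: 6, fps: 24, width: 1280, height: 720 });
console.log(shape);
```

Under these assumed factors, the 144 input frames shrink to 36 latent frames of 90 x 160 positions each, which suggests why diffusion over the latent grid is far cheaper than working on raw pixels.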
- Parameters: Undisclosed
- Context: unknown
- API-tier access to Hailuo Video family (Video-01, Live, Subject, Hailuo 02, 2.3)
- Text-to-video and image-to-video
- Subject-reference mode for locked character identity
- Live2D-style stylised animation mode
- 6-10 second clips at 720p-1080p
- Bilingual Chinese/English prompts
- Director-style camera control vocabulary
- Competitive pricing for high resolution
- Best for: third-party creative platforms, anime/stylised animation, social shorts.
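The clip limits listed above (6-10 seconds, 720p-1080p, 24-30 fps) can be checked on the client before a request is sent. The sketch below is one possible way to do that; `checkClipRequest` and the field names `seconds`, `height` and `fps` are hypothetical and do not correspond to documented MiniMax or railwail parameters.

```javascript
// Ranges taken from the feature list on this page.
// Field names are hypothetical, not documented API parameters.
const CLIP_LIMITS = {
  seconds: [6, 10],
  height: [720, 1080],
  fps: [24, 30],
};

// Returns a list of human-readable problems; an empty list means the
// request fits within the published ranges.
function checkClipRequest(request) {
  const problems = [];
  for (const [field, [min, max]] of Object.entries(CLIP_LIMITS)) {
    const value = request[field];
    if (typeof value !== "number" || Number.isNaN(value)) {
      problems.push(`${field} must be a number`);
    } else if (value < min || value > max) {
      problems.push(`${field} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return problems;
}

console.log(checkClipRequest({ seconds: 8, height: 720, fps: 24 }));
console.log(checkClipRequest({ seconds: 12, height: 720, fps: 24 }));
```

Returning a list rather than throwing on the first failure lets a caller report every out-of-range field at once.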
Training data: Closed corpus of licensed and web video with bilingual dense captions; exact size undisclosed.
License: Proprietary commercial licence via MiniMax API and Hailuo AI platform terms.
Known limitations
- 6-10 second clip limit
- No native audio
- Heavy political and public-figure moderation
- Less English prompt-following precision than Veo / Sora 2
- Closed model, no technical paper
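Because of the 6-10 second clip limit, a longer sequence has to be assembled from several generations. The following sketch shows one way a caller might split a target duration into equal clips that each stay within that range; `planClips` is a hypothetical helper, and stitching the resulting clips together is outside its scope.

```javascript
const MIN_CLIP_SECONDS = 6;
const MAX_CLIP_SECONDS = 10;

// Hypothetical helper: split a total duration into equal clips that
// each fall inside the 6-10 second range. Throws when no such split
// exists (for example 11 seconds, which is too long for one clip and
// too short for two).
function planClips(totalSeconds) {
  if (!(totalSeconds >= MIN_CLIP_SECONDS)) {
    throw new RangeError(`total must be at least ${MIN_CLIP_SECONDS}s`);
  }
  // Use the fewest clips that keep each one at or below the maximum.
  const count = Math.ceil(totalSeconds / MAX_CLIP_SECONDS);
  const each = totalSeconds / count;
  if (each < MIN_CLIP_SECONDS) {
    throw new RangeError(
      `cannot split ${totalSeconds}s into clips of ${MIN_CLIP_SECONDS}-${MAX_CLIP_SECONDS}s`
    );
  }
  return Array.from({ length: count }, () => each);
}

console.log(planClips(24));
```

Using the fewest possible clips keeps the number of generation calls, and the number of visible cuts, as low as the limit allows.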
Related Models
Google Veo 2
Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.
Google Veo 3
Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.
Google Veo 3.1
Latest Veo with image-to-video and context-aware audio
Kling v3
Cinematic video up to 15s with multi-shot and native audio
Start using Minimax Video today
Get started with free credits. No credit card required. Access Minimax Video and 100+ other models through a single API.