What is Wan 2.2 Text-to-Video?

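Wan 2.2 Text-to-Video is a video generation AI model developed by Alibaba's Tongyi Wanxiang Lab and listed on Railwail under the Replicate brand. It is positioned as an ultra-cheap text-to-video (T2V) option. Access it through Railwail's unified, OpenAI-compatible API at €0.10 per generation.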

How much does Wan 2.2 Text-to-Video cost via Railwail?

Per-call: €0.10. No monthly minimum, no subscription. Start with €5 free credits.

What is the context window of Wan 2.2 Text-to-Video?

Railwail does not list a context window for Wan 2.2 Text-to-Video. As a text-to-video model, it takes a text prompt as input rather than a long token context.

How fast is Wan 2.2 Text-to-Video?

Average response latency: 30.0s (p50 across recent Railwail traffic). See live p50/p95 metrics on /rankings.
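For reference, a p50 or p95 figure like the one above can be computed from raw latency samples with the nearest-rank method. The sketch below is illustrative only: the sample values are made up, `percentile` is a hypothetical helper rather than part of the Railwail SDK, and Railwail may use a different estimator for its published metrics.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it. Hypothetical helper.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latencies in seconds, not real Railwail measurements.
const latencies = [22, 25, 28, 29, 30, 31, 35, 48, 70, 115];
const p50 = percentile(latencies, 50);
const p95 = percentile(latencies, 95);
console.log(`p50=${p50}s p95=${p95}s`);
```

A wide gap between p50 and p95, as in this made-up sample, suggests that a minority of jobs take far longer than the typical one, which matters when choosing client timeouts.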

Is Wan 2.2 Text-to-Video better than Google Veo 2?

It depends on your use case. Wan 2.2 Text-to-Video (Replicate) and Google Veo 2 (Google DeepMind) are both strong choices in video generation. Compare them side-by-side at /compare/wan-t2v-vs-google-veo-2.

Wan 2.2 Text-to-Video

Name: Wan 2.2 Text-to-Video
Brand: Replicate
SKU: wan-t2v
Price: 0.1 EUR
Availability: InStock

Queue video with Wan 2.2 Text-to-Video

Video generation runs asynchronously: we queue a job, which typically takes 30 s to 2 min, and you can track it in your history.
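Because generation is asynchronous, a client usually submits a job and then polls until it finishes. The following is a minimal sketch of that pattern, not Railwail's documented job API: `JobStatus`, its state names, and the `getStatus` callback are all hypothetical stand-ins for whatever status call your integration uses. The schedule waits 30 s before the first check, since jobs rarely finish sooner, then checks at a fixed interval within a total time budget.

```typescript
// Hypothetical job states; the real Railwail job API may differ.
type JobStatus = { state: "queued" | "running" | "succeeded" | "failed"; url?: string };

// Build a list of waits (ms): one initial delay, then fixed-interval polls
// for as long as the running total stays within the budget.
function pollDelays(initialMs: number, intervalMs: number, budgetMs: number): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  let next = initialMs;
  while (elapsed + next <= budgetMs) {
    delays.push(next);
    elapsed += next;
    next = intervalMs;
  }
  return delays;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll a caller-supplied status function until the job reaches a final
// state, or throw once every planned wait has been used.
async function waitForJob(
  getStatus: () => Promise<JobStatus>,
  delays: number[],
): Promise<JobStatus> {
  for (const delay of delays) {
    await sleep(delay);
    const current = await getStatus();
    if (current.state === "succeeded" || current.state === "failed") return current;
  }
  throw new Error("job did not finish within the polling budget");
}

// Wait 30 s first, then check every 10 s, giving up after 3 minutes.
const pollPlan = pollDelays(30000, 10000, 180000);
console.log(pollPlan.length, "polls planned");
```

Passing `getStatus` in as a callback keeps the sketch independent of any particular SDK call; the 3-minute budget is an arbitrary choice that leaves headroom above the 2-minute upper end of the typical range.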

TL;DR · Last updated March 25, 2026

Wan 2.2 Text-to-Video is a video generation AI model from Alibaba's Tongyi Wanxiang Lab, listed under the Replicate brand and priced at €0.10 per generation. No context window is listed.

Examples

See what Wan 2.2 Text-to-Video can generate

"Cat playing with yarn on wooden floor" (Quick, 0:05)

Pricing

Price per Generation

Per generation: €0.10
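As a quick worked example of the per-generation pricing, the sketch below estimates how many clips a budget covers. It assumes every generation is billed at the flat €0.10 rate regardless of duration or aspect ratio, which this page does not confirm; the function names are illustrative.

```typescript
// Work in integer cents to avoid floating-point rounding on euro amounts.
const PRICE_PER_GENERATION_CENTS = 10; // €0.10 per generation (from this page)
const FREE_CREDIT_CENTS = 500; // €5 in free credits (from this page)

// Number of whole generations a budget covers at a flat per-call price.
function generationsFor(budgetCents: number, priceCents: number): number {
  return Math.floor(budgetCents / priceCents);
}

// Total cost of a batch, formatted in euros with two decimals.
function costInEuros(generations: number, priceCents: number): string {
  return ((generations * priceCents) / 100).toFixed(2);
}

const freeGenerations = generationsFor(FREE_CREDIT_CENTS, PRICE_PER_GENERATION_CENTS);
console.log(freeGenerations); // 50
console.log(costInEuros(120, PRICE_PER_GENERATION_CENTS)); // "12.00"
```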

API Integration

Use our OpenAI-compatible API to integrate Wan 2.2 Text-to-Video into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple: pass the video prompt as a string
const reply = await rw.run("wan-t2v", "Cat playing with yarn on wooden floor");
console.log(reply);

// With message history
const reply2 = await rw.run("wan-t2v", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("wan-t2v", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);

Specifications

Price: €0.10
Avg. latency: 30.0s
Est. duration: 30s
Developer: Replicate

Deep dive — Alibaba (Tongyi Wanxiang Lab)'s Wan 2.2 Text-to-Video

About Alibaba (Tongyi Wanxiang Lab)

Founded 1999 · Hangzhou, China

Alibaba's Tongyi Lab in Hangzhou runs the Qwen LLM family and the Wanxiang generative-media family. After Wan 2.0 (mid-2024) and Wan 2.1 (early 2025), the team released Wan 2.2 in 2025 as the next-generation open-weight video model. Wan 2.2 ships as purpose-tuned variants for Text-to-Video, Image-to-Video and Audio/A2V. Wan 2.2 Text-to-Video is the flagship pure-text-conditioned variant and replaces Wan 2.1 T2V-14B as the principal open-weight text-to-video reference for the Chinese research community. The Wan team consistently ranks near the top of open-model VBench leaderboards and ships reproducible training code under a permissive Wan-series licence.

Architecture

Diffusion Transformer (DiT) with 3D causal VAE; MoE-style scaling

Wan 2.2 Text-to-Video is a Diffusion Transformer operating on a 3D causal Wan-VAE latent. Wan 2.2 introduces architectural refinements over Wan 2.1: improved 3D Rotary Position Embeddings, larger attention windows, and (in the flagship) a Mixture-of-Experts feed-forward design that routes tokens to specialist experts. Text conditioning uses a Qwen-family multilingual encoder with strong Chinese-English capability. The denoiser is trained with Flow Matching on a curated multi-million-clip multilingual video corpus with synthetic dense bilingual captions. Native generation is 5 seconds at 720p / 24 fps (with 1080p extensions). The training recipe and weights are open-source on Hugging Face and GitHub under the Wan-series permissive licence, designed to enable broad commercial and research use.

Parameters: 14 billion (flagship); smaller variants available
Context: unknown
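To make the "3D causal VAE latent" concrete, the sketch below estimates the latent grid for one native 5-second, 720p, 24 fps clip. The compression factors are assumptions not stated on this page (4x in time and 8x in each spatial dimension, with the first frame encoded on its own, as is common for causal video VAEs); the actual Wan 2.2 factors may differ by variant.

```typescript
// Assumed compression factors (not stated on this page): 4x in time, 8x in
// height and width, with the first frame encoded separately (causal VAE).
const TEMPORAL_STRIDE = 4;
const SPATIAL_STRIDE = 8;

// Latent grid size as [time, height, width] for a given pixel-space clip.
function latentShape(frameCount: number, height: number, width: number): [number, number, number] {
  const t = 1 + Math.ceil((frameCount - 1) / TEMPORAL_STRIDE);
  const h = Math.ceil(height / SPATIAL_STRIDE);
  const w = Math.ceil(width / SPATIAL_STRIDE);
  return [t, h, w];
}

const clipFrames = 5 * 24; // 5 seconds at 24 fps = 120 frames
const [latentT, latentH, latentW] = latentShape(clipFrames, 720, 1280);
console.log([latentT, latentH, latentW], latentT * latentH * latentW);
```

Under these assumptions the transformer denoises a grid of 31 x 90 x 160 latent positions instead of 120 full-resolution frames, which is what makes attention over a whole clip tractable.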

What it can do

Open-weight text-to-video flagship at 14B parameters (smaller variants available)
5-second 720p / 24 fps generation natively, 1080p extensions
Bilingual Chinese/English prompts via Qwen-based text encoder
MoE-style scaling and improved 3D RoPE in flagship variant
Permissive Wan-series licence for research and commercial use
Top-tier results on VBench among open-weight models
Active community ecosystem (LoRAs, fine-tunes, ComfyUI nodes)
Reproducible training recipe and code

Best for: open-source video pipelines, research, on-prem creative tooling, branded fine-tunes.

Training & License

Curated multi-million-clip multilingual video corpus filtered for aesthetics, motion and caption quality, with dense bilingual captions; specifics documented in Wan technical materials.

License: Open weights under an Apache-style permissive licence (Wan-series release).

Known limitations

Native duration 5 seconds
No native audio
High VRAM requirements for the 14B flagship
Closed leaders (Veo 3, Sora 2, Kling v3) still ahead on absolute fidelity
Resolution capped at 720p natively (1080p only in extended modes)
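The VRAM limitation can be put in rough numbers with a back-of-the-envelope estimate of weight memory alone. This sketch assumes a dense 14-billion-parameter model held entirely in memory at a given precision; activations, the text encoder, and the VAE add to the total, so treat the figures as a lower bound.

```typescript
// Weight memory only: parameters x bytes per parameter, in decimal gigabytes.
// Activations, the text encoder, and the VAE need memory on top of this.
function weightGigabytes(parameters: number, bytesPerParameter: number): number {
  return (parameters * bytesPerParameter) / 1e9;
}

const FLAGSHIP_PARAMETERS = 14e9; // 14 billion, as listed on this page
console.log(weightGigabytes(FLAGSHIP_PARAMETERS, 2)); // 16-bit weights: 28 GB
console.log(weightGigabytes(FLAGSHIP_PARAMETERS, 1)); // 8-bit weights: 14 GB
```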

Related Models

Google Veo 2

Google DeepMind

Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.

€5.00

Google Veo 3

Google DeepMind

Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.

€0.75

Google Veo 3 (Replicate)

Google DeepMind

Google's Veo 3 served via Replicate. Text-to-video with native synchronized audio generation. High-fidelity motion and scene coherence in short clips.

€8.00

Google Veo 3.1

Google DeepMind

Latest Veo with image-to-video and context-aware audio

€6.00

Start using Wan 2.2 Text-to-Video today

Get started with free credits. No credit card required. Access Wan 2.2 Text-to-Video and 100+ other models through a single API.
