Video Generation

Generate and edit videos with AI-powered models

Video generation models for marketing, motion, and prototyping

Video models turn a prompt — or a still frame, or a short reference clip — into a moving picture. The category is the youngest and most volatile in the catalog: every quarter brings a new flagship that resets the quality bar. Reach for one when you need motion content faster than a human editor can produce it.

51 models available

Google Veo 2

VideoGoogle DeepMind
Popular

Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.

€5.00120.0s
high-qualitypopular

Google Veo 3

VideoGoogle DeepMind
Popular

Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.

€0.7592.0s
googleveotext-to-video

Google Veo 3.1

VideoGoogle DeepMind
NewPopular

Latest Veo with image-to-video and context-aware audio

€6.0092.0s
popularaudioi2v

Kling v3

VideoReplicate
NewPopular

Cinematic video up to 15s with multi-shot and native audio

€2.00120.0s
popularaudioi2v

Kling v3 Omni

VideoReplicate
NewPopular

Most versatile: multi-reference images, video editing, native audio

€2.50120.0s
popularaudioi2v

OpenAI Sora 2

VideoOpenAI
Popular

OpenAI's second-generation Sora video model. Realistic motion, improved physics, audio support.

€0.50
openaisoratext-to-video

Runway Gen 4.5

VideoReplicate
NewPopular

Top-ranked for motion quality and visual fidelity

€1.0030.0s
populartop-quality

Sora

VideoOpenAI
NewPopular

OpenAI video generation model. Create realistic and imaginative videos from text prompts up to 20 seconds.

€1.00180.0s
popularhigh-qualityopenai

AnimateDiff

VideoCommunity

Plug-and-play motion module that animates personalized Stable Diffusion models without further training. 16-frame clips at 512x512.

€0.04
replicateanimationanimatediff

AnimateDiff Evolved

VideoReplicate

Community fork of AnimateDiff with improved motion modules, beta scheduler control and ControlNet integration for richer animation control.

€0.05
replicateanimationanimatediff

AnimateDiff Lightning

VideoByteDance

ByteDance distillation of AnimateDiff. 4-step sampling for over 10x faster inference at comparable quality to multi-step base model.

€0.02
replicateanimationbytedance

Champ Human Animation

VideoCommunity

Champ controllable human image animation. Uses 3D parametric guidance (SMPL) for realistic full-body motion transfer from a single reference image.

€0.12
replicateanimationhuman-motion

CogVideoX-5B (open)

VideoReplicate

Zhipu/Tsinghua's 5B open text-to-video model. 720x480 @ 8fps, 6s clips, image-to-video variant available.

Free
zhiputsinghuacogvideox

DreamGaussian 4D

VideoReplicate

4D Gaussian-splatting generator extending DreamGaussian to video. Image-conditioned dynamic 3D scenes with view-consistent motion.

€0.18
replicateanimation4d

DynamiCrafter

VideoCommunity

Tencent DynamiCrafter. Animates still images into short videos preserving texture and structure, with strong open-domain coverage.

€0.09
replicateanimationimage-to-video

EchoMimic

VideoReplicate

Ant Group EchoMimic. Lifelike audio-driven portrait animation with editable landmark conditioning for fine-grained motion control.

€0.10
replicatelipsyncant-group

FILM Frame Interpolation

VideoGoogle Research

Google FILM frame interpolation. Synthesizes high-quality intermediate frames between near-duplicate inputs, designed for large motion gaps.

€0.01
replicateupscaleframe-interpolation

Google Veo 3 Fast

VideoGoogle DeepMind
New

Faster cheaper Veo 3 with audio

€3.2059.0s
fastaudio

Google Veo 3.1 Fast

VideoGoogle DeepMind
New

Faster Veo 3.1 with image-to-video and audio

€3.2059.0s
fastaudioi2v

Grok Imagine Video

VideoReplicate
New

xAI video with native audio and lip-sync, up to 15s

€1.5090.0s
audioi2vxai

Hailuo / MiniMax Video-01

VideoCustom

MiniMax's Hailuo video-01. 6s 1280x720 clips with strong cinematic motion and physical realism.

€0.43
minimaxhailuotext-to-video

Hailuo 2.3

VideoMinimax
New

Minimax model for realistic human motion and VFX

€0.5060.0s
i2v1080p

HunyuanVideo

VideoTencent

Tencent's 13B open-weights video diffusion transformer. SOTA among open video models at release.

Free
tencenthunyuantext-to-video

HunyuanVideo

VideoTencent

Tencent's open-source video generation model. Strong visual quality with diverse style support.

€2.00120.0s
open-source

Kling 1.6 Pro

VideoKuaishou

Kuaishou's Kling 1.6 Pro. Premium cinematic motion and physics realism, ~$0.07/sec.

€0.35
kuaishouklingtext-to-video

LivePortrait

VideoCommunity

Kuaishou LivePortrait. Efficient portrait animation driven by reference videos with stitching, retargeting and motion-control parameters.

€0.08
replicatelipsynckuaishou

LTX-Video (Lightricks)

VideoReplicate

Lightricks' 2B DiT video model. Realtime generation on consumer GPUs (~6s @ H100, 24fps).

Free
lightricksltxtext-to-video

Luma Dream Machine v1.6

VideoCustom

Luma's Dream Machine 1.6. 720p text/image-to-video with strong motion and camera control.

€0.40
lumatext-to-videoimage-to-video

Luma Ray Flash 2

VideoLuma AI
New

Fast affordable video with I2V support

€0.5045.0s
fastbudgeti2v

MagicAnimate

VideoCommunity

ByteDance MagicAnimate. Temporally consistent human-image animation driven by a DensePose motion sequence with strong identity preservation.

€0.10
replicateanimationhuman-motion

Minimax Video

VideoMinimax

MiniMax's video generation model. Fast, high-quality video output with text-to-video capabilities.

€2.5090.0s
fastaffordable

Mochi 1

VideoGenmo

Genmo's 10B open-weights text-to-video model. AsymmDiT architecture, 5.4s @ 480p.

Free
genmomochitext-to-video

MOFA-Video

VideoReplicate

Motion-Field-Adapter video generator. Controllable image animation from trajectories, keypoints or audio with a strong identity preservation prior.

€0.10
replicatelipsyncanimation

MuseTalk

VideoCommunity

Tencent MuseTalk real-time lip-sync model. Audio-driven mouth-region editing in latent space at 30+ fps on a single GPU.

€0.06
replicatelipsynctencent

Pika 2.0 (Official)

VideoPika

Pika Labs' 2.0 release. Cinematic text/image-to-video with scene composition controls.

€0.20
pikatext-to-videoimage-to-video

PixVerse v5.6

VideoReplicate
New

Physics-accurate video generation up to 1080p

€0.5060.0s
i2v1080pphysics

Real-CUGAN

VideoCommunity

Real-CUGAN anime-focused upscaler. 2x/3x/4x super-resolution tuned for animation, line-art, and illustrated content.

€0.01
replicateupscaleanime

RIFE Frame Interpolation

VideoReplicate

Real-Time Intermediate Flow Estimation. Doubles or quadruples FPS of an existing video via learned optical-flow-based frame interpolation.

€0.01
replicateupscaleframe-interpolation

Runway Gen-3 Alpha Turbo

VideoCustom

Runway's faster, cheaper Gen-3 variant. Image-to-video at 5 credits/sec (~$0.05/sec).

€0.05
runwayimage-to-videofast

SadTalker

VideoCommunity

Stylized audio-driven talking-head generator. Synthesizes 3D motion coefficients from audio to animate a single portrait image with natural head movements.

€0.07
replicatelipsynctalking-head

Seedance Lite

VideoByteDance
New

Budget ByteDance video, fast and cheap

€0.5070.0s
budgeti2vfast

Seedance Pro

VideoByteDance
New

ByteDance video with T2V and I2V, up to 1080p

€1.0095.0s
i2v1080p

StreamingT2V

VideoReplicate

Picsart StreamingT2V. Generates long, consistent videos by chaining short autoregressive clips with motion and appearance memory.

€0.15
replicateanimationlong-form

SwinIR Video

VideoCommunity

SwinIR transformer-based super-resolution and denoising applied per-frame to video. Handles classic, real-world and lightweight upscaling.

€0.02
replicateupscaletransformer

ToonCrafter

VideoCommunity

Tencent ToonCrafter generative cartoon interpolation model. Synthesizes smooth in-between frames between two cartoon keyframes.

€0.08
replicateanimationtooncrafter

V-Express

VideoTencent

Tencent V-Express. Audio-driven portrait animation with progressive training, weak-condition learning, and expressive lip sync.

€0.09
replicatelipsynctencent

VideoCrafter

VideoCommunity

Tencent VideoCrafter latent video diffusion. Text-to-video and image-to-video generation up to 2s at 1024x576 with strong motion fidelity.

€0.07
replicateupscalevideo-generation

Wan 2.1 (Alibaba)

VideoReplicate

Alibaba's Wan 2.1 open-weights video diffusion model. 14B MoE-based, supports T2V and I2V.

Free
alibabawantext-to-video

Wan 2.2 Image-to-Video

VideoReplicate
New

Ultra-cheap I2V. Upload image and animate it.

€0.1030.0s
budgeti2vfast

Wan 2.2 Text-to-Video

VideoReplicate
New

Ultra-cheap T2V for pennies

€0.1030.0s
budgetfast

Wav2Lip

VideoCommunity

Lip-sync model that re-syncs a target video's lip movement to an arbitrary audio track. Robust to identity and language with a lip-sync discriminator loss.

€0.05
replicatelipsyncvideo-edit

Top video generation picks

Hand-picked across four common criteria — resolved against the live catalog so the picks track price and performance changes.

Best overall
Google Veo 2

Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.

Learn more
Cheapest
CogVideoX-5B (open)

Zhipu/Tsinghua's 5B open text-to-video model. 720x480 @ 8fps, 6s clips, image-to-video variant available.

Learn more
Longest clip
Google Veo 2

Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.

Learn more
Fastest
Runway Gen 4.5

Top-ranked for motion quality and visual fidelity

Learn more

Pricing in video is per-second of output rather than per-token or per-call. A flagship five-second clip costs anywhere from twenty cents (Kling 1.6, Hunyuan Video) to about a euro (Veo 3, Runway Gen-3 Alpha). Sound-on tiers cost more than silent tiers. Resolution multipliers stack on top of duration: 720p is the standard, 1080p costs roughly 2× more, and 4K is rare and expensive.

The trade-off here is duration versus coherence. Most commercial models cap output at five to ten seconds because longer clips drift — characters change clothes, backgrounds morph, and physics breaks down. For longer narratives, generate a sequence of shorter shots and stitch them in post. Image-to-video (start frame + motion prompt) typically produces more stable results than pure text-to-video, especially for characters and product shots.

Watch out for the five-second wall: virtually every model on the market today maxes out around five seconds of continuous output, and quality degrades sharply when you push for more. If your script needs ten seconds, plan for two shots. Also watch out for sound: most models ship silent and you have to overlay audio separately — only Veo 3 and a few research previews ship integrated audio so far.

Top picks above cover the flagship realism leader, the cheapest workhorse, the longest-clip model, and the fastest preview option in the category.

Frequently asked questions

Start Building with AI

Access all models through a single API. Get free credits when you sign up — no credit card required.