PixVerse v5.6

Replicate
Video Generation

Physics-accurate video generation up to 1080p

Video generation runs asynchronously: we queue a job, which typically takes 30 s to 2 min, and you can track it in your history.
TL;DR · Last updated March 25, 2026

PixVerse v5.6 is a video generation AI model developed by AISphere (PixVerse) and served through Replicate, priced at €0.50 per generation. Its context window is listed as unknown.

Examples

Physics (0:05): "Glass of water tipping over in slow motion"

Pricing

Price per generation: €0.50
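
For budgeting, the flat per-generation price can be turned into a small helper. This is only an illustrative sketch: the function names are made up for this example, and it assumes every generation is billed at the listed €0.50 (whether failed or cancelled jobs are billed is not stated on this page).

```typescript
// Flat price per generation, taken from the pricing section of this page.
const PRICE_PER_GENERATION_EUR = 0.5;

// Cost of a batch of generations. Works in integer cents to avoid
// floating-point drift when multiplying prices.
function estimateCostEur(generations: number): number {
  if (!Number.isInteger(generations) || generations < 0) {
    throw new RangeError("generations must be a non-negative integer");
  }
  const priceCents = Math.round(PRICE_PER_GENERATION_EUR * 100);
  return (priceCents * generations) / 100;
}

// Number of whole generations that fit in a budget (rounded down).
function generationsWithinBudget(budgetEur: number): number {
  if (budgetEur < 0) {
    throw new RangeError("budgetEur must not be negative");
  }
  const priceCents = Math.round(PRICE_PER_GENERATION_EUR * 100);
  return Math.floor(Math.round(budgetEur * 100) / priceCents);
}
```

For example, `estimateCostEur(3)` gives 1.5 and `generationsWithinBudget(5)` gives 10.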

API Integration

Use our OpenAI-compatible API to integrate PixVerse v5.6 into your application.

Install

npm install railwail

JavaScript / TypeScript

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("pixverse-v5", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("pixverse-v5", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("pixverse-v5", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
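
Because video generation runs as an asynchronous job (typically 30 s to 2 min), client code usually has to wait for the job to finish. The sketch below shows one generic way to do that: polling with capped exponential backoff. It does not use the railwail client; `JobStatus`, `getStatus`, `pollUntilDone` and the default delays are all assumptions made for illustration, and the real API's status fields may look different.

```typescript
// Hypothetical job status shape; the real API's field names may differ.
type JobStatus =
  | { state: "queued" | "running" }
  | { state: "succeeded"; videoUrl: string }
  | { state: "failed"; error: string };

interface PollOptions {
  initialDelayMs?: number; // wait before the second status check
  maxDelayMs?: number; // upper bound for the backoff delay
  timeoutMs?: number; // give up once total waiting would exceed this
  sleep?: (ms: number) => Promise<void>; // injectable, e.g. for tests
}

// Delay after the n-th check (0-based): doubles each time, then caps.
function backoffDelay(attempt: number, initialMs: number, maxMs: number): number {
  return Math.min(initialMs * Math.pow(2, attempt), maxMs);
}

// Calls getStatus repeatedly until the job succeeds, fails, or times out.
// Resolves with the video URL on success.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  opts: PollOptions = {},
): Promise<string> {
  const initial = opts.initialDelayMs ?? 5000;
  const max = opts.maxDelayMs ?? 20000;
  const timeout = opts.timeoutMs ?? 5 * 60000;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let waited = 0;
  for (let attempt = 0; ; attempt++) {
    const status = await getStatus();
    if (status.state === "succeeded") return status.videoUrl;
    if (status.state === "failed") {
      throw new Error(`Generation failed: ${status.error}`);
    }
    const delay = backoffDelay(attempt, initial, max);
    if (waited + delay > timeout) {
      throw new Error("Timed out waiting for video job");
    }
    await sleep(delay);
    waited += delay;
  }
}
```

In real use, `getStatus` would wrap whatever status call the API exposes for a queued job. The defaults (5 s first wait, 20 s cap, 5 min timeout) are guesses based on the 30 s to 2 min typical duration, not documented values.
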
Specifications
Price: €0.50
Avg. latency: 60.0 s
Est. duration: 1 min
Developer: Replicate
Category: Video Generation
Supported formats: mp4
Tags: i2v, 1080p, physics

Deep dive: PixVerse v5.6 by AISphere (PixVerse)

About AISphere (PixVerse)
Founded 2023 · Singapore / Beijing

PixVerse is the generative-video product of AISphere, a startup founded in 2023 by former senior researchers from ByteDance and Tencent. Based in Singapore and Beijing, AISphere targets international short-form-video creators with a consumer-friendly web and mobile experience. PixVerse v1 shipped in early 2024, followed by v2, v3, v4, v5 (mid-2025) and v5.6 (late 2025). The product emphasises stylised animation (anime, 3D Pixar-style, paper craft, etc.), short-clip generation at 720p-1080p, integrated lip-sync and templates. AISphere is backed by Hillhouse, GGV Capital and IDG Capital, with a reported valuation of around $300M in 2025.

Architecture
Latent video diffusion (DiT-style) with style adapters

PixVerse v5.6 is a closed latent-video-diffusion model with a transformer denoiser operating on a learned spatio-temporal latent space. The model supports text-to-video, image-to-video, character-reference mode and a library of pre-trained style adapters (anime, 3D animation, paper craft, claymation, comic, neon, etc.) that operate as fine-tuned LoRAs or auxiliary cross-attention heads on top of the base DiT. Conditioning uses a bilingual Chinese/English text encoder and image embeddings for first-frame conditioning. v5.6 generates 5-8 second clips at 720p-1080p / 24 fps with a lip-sync module that aligns mouth motion to a user-provided audio track. PixVerse has not released a formal technical paper; behaviour and feature surface suggest a multi-billion-parameter DiT trained on a curated multilingual video corpus with extensive synthetic stylised data.
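
To get a feel for the scale a latent video model works at, the stated output format (5-8 seconds, 720p-1080p, 24 fps) can be turned into frame and pixel counts. The sketch below does that arithmetic. The `latentGrid` helper takes compression factors as parameters because PixVerse has not published its autoencoder design; any factors passed in are hypothetical, and 1080p is assumed to mean 1920x1080.

```typescript
interface ClipSpec {
  seconds: number;
  fps: number;
  width: number;
  height: number;
}

// Number of frames in a clip of the given length and frame rate.
function frameCount(spec: ClipSpec): number {
  return Math.round(spec.seconds * spec.fps);
}

// Total raw pixels across all frames (per colour channel).
function pixelVolume(spec: ClipSpec): number {
  return frameCount(spec) * spec.width * spec.height;
}

// Size of a compressed spatio-temporal latent grid. The factors are
// caller-supplied assumptions, not published PixVerse values.
function latentGrid(
  spec: ClipSpec,
  spatialFactor: number,
  temporalFactor: number,
): { frames: number; width: number; height: number } {
  return {
    frames: Math.ceil(frameCount(spec) / temporalFactor),
    width: Math.ceil(spec.width / spatialFactor),
    height: Math.ceil(spec.height / spatialFactor),
  };
}

// Assumes 1080p means 1920x1080 (16:9).
const fiveSecond1080p: ClipSpec = { seconds: 5, fps: 24, width: 1920, height: 1080 };
console.log(frameCount(fiveSecond1080p)); // 120 frames
console.log(pixelVolume(fiveSecond1080p)); // 248832000 pixels
```

With purely illustrative factors of 8x spatial and 4x temporal, `latentGrid(fiveSecond1080p, 8, 4)` would give a 30 x 240 x 135 grid, which shows why diffusion is run in latent space rather than on raw pixels.
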

Parameters: Undisclosed
Context: Unknown
What it can do
  • Text-to-video, image-to-video and character-reference modes
  • Rich library of pre-trained style adapters (anime, 3D, claymation, neon, ...)
  • Lip-sync module aligned to user audio
  • 5-8 second clips at 720p-1080p
  • Templates for popular meme and effect formats
  • Web and mobile apps with one-click sharing to social platforms
  • Bilingual Chinese/English prompts
  • Active creator community and library of presets
  • Best for: stylised social-media video, anime shorts, meme content, lip-sync.
Training & License

Closed corpus including licensed footage, web video and large amounts of synthetic stylised data for style adapters; exact size undisclosed.

License: Proprietary commercial licence via PixVerse / AISphere terms; commercial use on paid plans.

Known limitations
  • 5-8 second clip limit
  • No native non-lipsync audio generation
  • Stylised model can struggle on photorealistic prompts
  • Closed model without technical disclosure
  • Quality below frontier (Veo 3, Sora 2) on realistic scenes
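
The clip-length and resolution limits listed on this page can be checked on the client before a paid job is queued. The following is a minimal sketch: `ClipRequest` and `validateClipRequest` are hypothetical names, and the field names are not taken from the actual API.

```typescript
// Hypothetical request shape; field names are illustrative only.
interface ClipRequest {
  prompt: string;
  durationSeconds: number;
  resolution: string;
}

// Limits as stated on this page: 5-8 second clips at 720p-1080p.
const MIN_SECONDS = 5;
const MAX_SECONDS = 8;
const SUPPORTED_RESOLUTIONS = ["720p", "1080p"];

// Returns a list of human-readable problems; an empty list means the
// request is within the documented limits.
function validateClipRequest(req: ClipRequest): string[] {
  const errors: string[] = [];
  if (req.prompt.trim().length === 0) {
    errors.push("prompt must not be empty");
  }
  if (req.durationSeconds < MIN_SECONDS || req.durationSeconds > MAX_SECONDS) {
    errors.push(`duration must be between ${MIN_SECONDS} and ${MAX_SECONDS} seconds`);
  }
  if (SUPPORTED_RESOLUTIONS.indexOf(req.resolution) === -1) {
    errors.push(`resolution must be one of: ${SUPPORTED_RESOLUTIONS.join(", ")}`);
  }
  return errors;
}
```

Rejecting a 12-second or 4K request locally avoids spending a generation on a job the model cannot fulfil as asked.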
