Kling v3 Omni
Most versatile: multi-reference images, video editing, native audio
Kling v3 Omni is a video generation AI model from Kuaishou Technology, served via Replicate, priced at €0.000 per 1M input tokens with an unknown context window.
Image References
Examples
See what Kling v3 Omni can generate
Character
"Cartoon character waving and dancing"
Pricing
API Integration
Use our OpenAI-compatible API to integrate Kling v3 Omni into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple: just pass a string
const reply = await rw.run("kling-v3-omni", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("kling-v3-omni", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("kling-v3-omni", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Kuaishou Technology's Kling v3 Omni
Kuaishou Technology (founded 2011 by Su Hua and Cheng Yixiao) operates one of China's two dominant short-video platforms and runs the KLING AI lab. Following the success of Kling 1.x and Kling v3, the team released Kling v3 Omni in 2025 as the multimodal-conditioning variant of v3. Omni accepts not only text and a single image but also audio reference, video reference, character-reference images, sketches, depth maps and pose skeletons, making it the broadest conditioning surface among Chinese video models. Omni is exposed through klingai.com, the Kling Studio editor and the Kling API.
Kling v3 Omni shares the same core DiT denoiser as Kling v3 but adds a suite of conditioning adapters that inject heterogeneous control signals into the diffusion backbone. The model accepts text prompts (bilingual LLM encoder), first frame, end frame, subject-reference images (for locked identity), pose / skeleton videos (for motion transfer), depth maps (for 3D-aware control), sketches (for layout), audio reference (for music-driven motion and lip-sync) and short video references (for style transfer). Each modality is encoded by a dedicated adapter network and merged via cross-attention layers into the 3D spatio-temporal latent stream. Joint audio-video diffusion produces synchronized sound, including lip-sync to a provided dialogue audio track. Training uses paired multimodal-conditioning data including motion-capture libraries, sketch-to-video pairs and music videos.
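The merge step described above can be pictured with a toy example. Kling v3 Omni is a closed model, so the sketch below is not its implementation: the token shapes, the `adapterOutputs` object and the `crossAttend` helper are all assumptions made for illustration, and the learned query/key/value projections of a real attention layer are left out. It only shows the general pattern of concatenating per-modality tokens and letting latent tokens attend over them.

```javascript
// Illustrative only: names, shapes and values are assumptions, not Kling internals.

// Numerically stable softmax over a 1-D array of scores.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((acc, x, i) => acc + x * b[i], 0);
}

// Single-head cross-attention without learned projections: each latent
// token is a query, the conditioning tokens act as both keys and values,
// and the attended result is added back to the latent as a residual.
function crossAttend(latents, conditioningTokens) {
  const scale = 1 / Math.sqrt(latents[0].length);
  return latents.map((q) => {
    const weights = softmax(conditioningTokens.map((k) => dot(q, k) * scale));
    const attended = q.map((_, d) =>
      conditioningTokens.reduce((acc, v, j) => acc + weights[j] * v[d], 0)
    );
    return q.map((x, d) => x + attended[d]);
  });
}

// Hypothetical adapter outputs: a short list of token vectors per modality.
const adapterOutputs = {
  text: [[1, 0, 0, 0]],
  pose: [[0, 1, 0, 0], [0, 0, 1, 0]],
  audio: [[0, 0, 0, 1]],
};

// Merge: concatenate every modality's tokens into one key/value set.
const conditioningTokens = Object.values(adapterOutputs).flat();

// Two toy latent tokens standing in for the spatio-temporal latent stream.
const latents = [[1, 0, 0, 0], [0, 0, 0, 1]];
const updated = crossAttend(latents, conditioningTokens);
console.log(updated);
```

In this toy setup the first latent token resembles the text token, so it draws most of its update from that token, while the second leans on the audio token. A production model would learn these affinities rather than have them set by hand.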
- Parameters: Undisclosed
- Context: unknown
- Text-, image-, pose-, sketch-, depth-, audio- and video-reference conditioning
- Lip-sync to a user-provided dialogue audio track
- Motion transfer from a pose / skeleton video
- Subject-reference for locked character identity
- Sketch- and depth-controlled layout for storyboards
- 1080p / 30 fps, native synchronized audio
- Available via Kling Studio and API
- Broadest conditioning surface among Chinese video models
- Best for: animation studios, lip-sync content, motion-capture-driven workflows.
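Before submitting a job, it can help to check locally which conditioning inputs you plan to combine and how many frames the clip implies. The sketch below is a hypothetical pre-flight helper, not part of the Kling or railwail API: `planJob`, the job object shape and the file names are invented for illustration. Only the modality names and the 30 fps figure come from the list above.

```javascript
// Hypothetical pre-flight check; this is not a real Kling or railwail API call.

// Modality names taken from the capability list above.
const SUPPORTED_MODALITIES = ["text", "image", "pose", "sketch", "depth", "audio", "video"];

// Frame rate taken from the "1080p / 30 fps" entry.
const FPS = 30;

// Validates the requested conditioning inputs and derives the frame count.
function planJob(job) {
  const requested = Object.keys(job.conditioning);
  const unsupported = requested.filter((m) => !SUPPORTED_MODALITIES.includes(m));
  if (unsupported.length > 0) {
    throw new Error(`Unsupported conditioning: ${unsupported.join(", ")}`);
  }
  if (!(job.durationSeconds > 0)) {
    throw new Error("durationSeconds must be a positive number");
  }
  return {
    modalities: requested,
    frameCount: Math.round(job.durationSeconds * FPS),
  };
}

const plan = planJob({
  durationSeconds: 5,
  conditioning: {
    text: "Cartoon character waving and dancing",
    pose: "dance-skeleton.mp4", // hypothetical file name
    audio: "dialogue.wav", // hypothetical file name
  },
});
console.log(plan); // frameCount is 5 * 30 = 150
```

A check like this catches typos in modality names before any credits are spent. The limits the service actually enforces may differ.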
Training data: Closed corpus including platform video, licensed footage, motion-capture libraries, sketch / depth annotation pairs and music videos with dense multimodal captions.
License: Proprietary commercial licence via Kling AI / Kuaishou terms.
Known limitations
- Complex conditioning workflows can be brittle
- Heavy political and public-figure moderation
- Higher per-clip cost than vanilla v3
- Audio sync is limited to short-form clips and favors English and Chinese
- Closed model with limited public documentation
Related Models
Google Veo 2
Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.
Google Veo 3
Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.
Google Veo 3.1
Latest Veo with image-to-video and context-aware audio
Kling v3
Cinematic video up to 15s with multi-shot and native audio
Start using Kling v3 Omni today
Get started with free credits. No credit card required. Access Kling v3 Omni and 100+ other models through a single API.