Kling v3
Cinematic video up to 15s with multi-shot and native audio
Kling v3 is a video generation AI model available through Replicate, priced at €0.000 per 1M input tokens with an unknown context window.
Image References
Examples
See what Kling v3 can generate
Action
"Parkour athlete on rooftops at sunset"
API Integration
Use our OpenAI-compatible API to integrate Kling v3 into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
// Simple — just pass a string
const reply = await rw.run("kling-v3", "Hello! What can you do?");
console.log(reply);
// With message history
const reply2 = await rw.run("kling-v3", [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);
// Full response with usage info
const res = await rw.chat("kling-v3", [
{ role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Kuaishou Technology's Kling v3
Kuaishou Technology, founded in 2011 by Su Hua and Cheng Yixiao, is one of China's two dominant short-video platforms, with ~700M monthly active users in 2025. Its KLING AI team shipped Kling 1.0 in June 2024, Kling 1.5 and 1.6 (late 2024), Kling 2.0 (Q1 2025) and Kling v3 (mid-2025). Kling v3 introduced native synchronized audio (sound effects and dialogue), 1080p+ at 30 fps, much improved motion physics, and the Omni multimodal-conditioning variant. The model is exposed via klingai.com, the Kling mobile app and a commercial API and is widely used by Chinese ad agencies, music labels and social-media creators.
Kling v3 is a closed-source video diffusion model built on Kuaishou's third-generation Diffusion Transformer architecture. The denoiser operates on a 3D spatio-temporal latent produced by a high-compression VAE and uses full 3D spatio-temporal attention rather than factorised attention. v3's headline change is a joint audio-video diffusion module that generates a synchronized soundtrack (ambient sound, footsteps, music swells and short dialogue) aligned to the visual track. Text conditioning uses a bilingual LLM encoder with extended context for very long prompts; image conditioning supports first-frame, end-frame and subject-reference modes. Training uses a curated multi-billion-clip corpus (including platform data) with synthetic dense audio-visual captions; post-training combines reward-model alignment for visual aesthetics, prompt fidelity and audio-video sync.
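To make the distinction between full 3D attention and factorised attention concrete, the sketch below counts pairwise token interactions for each scheme. The latent grid size (16 frames of 32 x 32 tokens) and the function name attentionPairs are assumptions chosen for illustration only; Kling v3's actual latent dimensions are not public.

```javascript
// Hypothetical latent grid sizes, chosen only for illustration;
// Kling v3's real latent dimensions are not public.
function attentionPairs({ frames, height, width }) {
  const spatial = height * width; // tokens per latent frame
  const total = frames * spatial; // tokens in the whole clip

  // Full 3D attention: every token attends to every token in the clip.
  const full = total * total;

  // Factorised attention: spatial attention inside each frame,
  // then temporal attention across frames at each spatial position.
  const factorised = frames * spatial * spatial + spatial * frames * frames;

  return { total, full, factorised, ratio: full / factorised };
}

const cost = attentionPairs({ frames: 16, height: 32, width: 32 });
console.log(cost);
```

Under these assumed sizes, full attention touches roughly an order of magnitude more token pairs than the factorised variant, which is the compute cost a model pays for letting every spatial position interact directly with every frame.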
- Parameters: Undisclosed
- Context: unknown
- Text-to-video, image-to-video and subject-reference modes
- 1080p / 30 fps clips up to 10 seconds natively, extendable beyond 1 minute
- Synchronized native audio (ambient sound, simple dialogue, music swells)
- Director-style camera control (lenses, rigs, moves)
- Strong physics on water, fabric, hair and crowd dynamics
- Bilingual Chinese/English prompts with extended context
- Lip-sync and motion-brush in Kling editor
- Commercial API and Kling editor integration
- Best for: cinematic shorts, ads with audio, music videos, character-driven content.
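The capabilities listed above (subject description, director-style camera control, native audio) can be combined in a single text prompt. The helper below is a minimal sketch of one way to assemble such a prompt; the function name buildVideoPrompt, its field names, and the "Camera:" / "Audio:" labels are hypothetical conventions, not a documented Kling or railwail prompt format.

```javascript
// Hypothetical helper: the field names (subject, camera, audio) and the
// output layout are illustrative, not part of any documented API.
function buildVideoPrompt({ subject, camera = [], audio = [] }) {
  if (!subject || !subject.trim()) {
    throw new Error("subject is required");
  }
  const parts = [subject.trim()];
  // Only append the optional sections when they contain cues.
  if (camera.length > 0) parts.push(`Camera: ${camera.join(", ")}`);
  if (audio.length > 0) parts.push(`Audio: ${audio.join(", ")}`);
  return parts.join(". ") + ".";
}

const prompt = buildVideoPrompt({
  subject: "Parkour athlete on rooftops at sunset",
  camera: ["35mm lens", "handheld tracking shot"],
  audio: ["footsteps on gravel", "distant city ambience"],
});
console.log(prompt);
```

Keeping the camera and audio cues in separate arrays makes it easy to vary one aspect of a shot while holding the others fixed; whether the model responds best to this exact phrasing is untested here.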
Training data: Closed multi-billion-clip corpus combining licensed video, web video and Kuaishou platform data, with dense audio-visual captions; exact size undisclosed.
License: Proprietary commercial licence via Kling AI / Kuaishou terms; commercial use on paid plans.
Known limitations
- 10-second native clip limit
- Audio quality is short-form (limited length and music sophistication)
- Heavy political and public-figure moderation
- Closed model with no public technical paper
- Output queues during peak hours
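Given the 10-second native clip limit noted above, a longer sequence has to be planned as several shorter clips. The sketch below splits a target duration into segments no longer than a configurable maximum; planSegments is a hypothetical helper, and how the resulting clips are generated, extended, or stitched together is outside its scope.

```javascript
// Hypothetical planner: splits a target duration into clips that each
// fit within a per-clip limit (10 seconds by default, per the list above).
function planSegments(totalSeconds, maxClipSeconds = 10) {
  if (!(totalSeconds > 0) || !(maxClipSeconds > 0)) {
    throw new RangeError("durations must be positive numbers");
  }
  const segments = [];
  for (let start = 0; start < totalSeconds; start += maxClipSeconds) {
    // The last segment may be shorter than the limit.
    const duration = Math.min(maxClipSeconds, totalSeconds - start);
    segments.push({ start, duration });
  }
  return segments;
}

const plan = planSegments(25);
console.log(plan);
```

Each entry gives the offset and length of one clip, so a caller could pair it with a per-shot prompt and submit the clips independently, which may also help when output queues are long.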
Related Models
Google Veo 2
Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.
Google Veo 3
Google's Veo 3. High-fidelity text-to-video with native audio generation, up to 8s clips.
Google Veo 3.1
Latest Veo with image-to-video and context-aware audio
Kling v3 Omni
Most versatile: multi-reference images, video editing, native audio
Start using Kling v3 today
Get started with free credits. No credit card required. Access Kling v3 and 100+ other models through a single API.