HunyuanVideo

Tencent
Video Generation

Tencent's 13B open-weights video diffusion transformer. SOTA among open video models at release.

Queue video with HunyuanVideo
Video generation runs asynchronously: we queue a job and you can track it in your history.
Generates as an async job, typically in 30 s to 2 min.
TL;DR · Last updated May 16, 2026

HunyuanVideo is a video generation AI model from Tencent, priced at €0.000 per 1M input tokens, with an unknown context window.


Pricing

Price per Generation
Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate HunyuanVideo into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple: just pass a string
const reply = await rw.run("hunyuanvideo-tencent", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("hunyuanvideo-tencent", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("hunyuanvideo-tencent", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
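
Because video generation runs as a queued, asynchronous job (typically 30 s to 2 min), a client usually has to wait for the result rather than read it from a single response. How the railwail client exposes job status is not shown on this page, so the sketch below is deliberately generic: `fetchStatus` is a hypothetical callback you would implement yourself, and the `JobStatus` fields and state names are assumptions for illustration only.

```typescript
// Hypothetical job shape: these field names and states are assumptions,
// not part of the documented railwail API.
type JobStatus =
  | { state: "queued" | "running" }
  | { state: "succeeded"; videoUrl: string }
  | { state: "failed"; error: string };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls `fetchStatus` until the job succeeds, fails, or the timeout would be exceeded.
async function waitForVideo(
  fetchStatus: () => Promise<JobStatus>,
  { intervalMs = 5_000, timeoutMs = 5 * 60_000 } = {},
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await fetchStatus();
    if (status.state === "succeeded") return status.videoUrl;
    if (status.state === "failed") throw new Error(`Generation failed: ${status.error}`);
    // Stop early if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) throw new Error("Timed out waiting for video");
    await sleep(intervalMs);
  }
}
```

A 5-second polling interval with a 5-minute ceiling is a conservative default given the stated 30 s to 2 min generation time; both values can be overridden per call.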
Specifications
Developer
Tencent
Category
Video Generation
Supported Formats
text
Tags
tencent
hunyuan
text-to-video
open-weights
pricing-tbd

Deep dive: Tencent's HunyuanVideo

About Tencent
Founded 1998 · Shenzhen, China

Tencent Holdings, founded in 1998 by Ma Huateng and four co-founders in Shenzhen, is one of the world's largest internet conglomerates and runs WeChat, Tencent Games and Tencent Cloud. Tencent's Hunyuan team develops the company's foundation-model family: Hunyuan-Large for text, Hunyuan-DiT for image, Hunyuan3D, and HunyuanVideo for video. HunyuanVideo, released in December 2024, was the first open-weight video diffusion model at the 13B-parameter scale and remains one of the largest publicly released video generators. This entry covers both the open release and the Tencent-hosted commercial API surfaced through Hunyuan Cloud, WeChat creator tools and Tencent's own ad platforms.

Architecture
Diffusion Transformer (13B) with dual-stream text/video design and 3D causal VAE

HunyuanVideo combines a 3D causal Variational Autoencoder (VAE) for video compression with a 13B-parameter Diffusion Transformer (DiT) denoiser. The denoiser adopts a FLUX.1-style hybrid design: dual-stream blocks first process text and video tokens separately, then single-stream blocks run joint self-attention over the combined sequence. Text conditioning fuses signals from a CLIP-style vision-language encoder and a large multimodal LLM, which the Tencent team found materially improves prompt fidelity on long captions. Training uses Flow Matching with 3D Rotary Position Embeddings and a progressive curriculum (images, then low-resolution videos, then high-resolution videos) on a heavily filtered multi-billion-clip multilingual corpus with dense bilingual captions. The Tencent-hosted commercial endpoint adds longer-duration generation (via chained extensions), 1080p upscaling, image-to-video conditioning and a 'sound' module that produces aligned audio.
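
The Flow Matching objective mentioned above can be illustrated with a toy example. This is a minimal sketch, not HunyuanVideo's training code: it assumes the common linear path between a noise sample and a data sample, with the model trained to predict the constant velocity along that path. The page does not state which path or timestep schedule the Tencent team actually uses, and the function names here are hypothetical.

```typescript
// Toy flow-matching training pair on plain number arrays.
// Assumption: linear path x_t = (1 - t) * noise + t * data, target velocity = data - noise.
function flowMatchingPair(data: number[], noise: number[], t: number) {
  if (data.length !== noise.length) throw new Error("shape mismatch");
  const xt = data.map((d, i) => (1 - t) * noise[i] + t * d);
  const velocity = data.map((d, i) => d - noise[i]);
  return { xt, velocity };
}

// Mean squared error between the model's predicted velocity and the target velocity.
function flowMatchingLoss(predicted: number[], target: number[]): number {
  const sum = predicted.reduce((acc, p, i) => acc + (p - target[i]) ** 2, 0);
  return sum / predicted.length;
}

// One training pair at t = 0.25: the denoiser would see `xt` and be asked for `velocity`.
const { xt, velocity } = flowMatchingPair([1, 2], [0, 0], 0.25);
console.log(xt, velocity, flowMatchingLoss([0, 0], velocity));
```

In a real model, `data` would be the VAE latent of a video clip and the prediction would come from the DiT conditioned on the text embeddings.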

Parameters
13 billion
Context
unknown
What it can do
  • 13B open-weight base model under the permissive Tencent Hunyuan Community Licence
  • 5-second 720p / 24 fps native clips; extended to 10-15 s on the commercial endpoint
  • Strong bilingual (Chinese/English) prompt understanding via MLLM text encoder
  • Dual-stream DiT with high prompt adherence on dense captions
  • Image-to-video and audio sync available on the commercial Tencent API
  • Massive open-source ecosystem of LoRAs, ControlNets and pipelines
  • Available on Hugging Face, GitHub and Tencent Hunyuan Cloud
  • Strong on cinematic motion, lighting and dynamic camera
  • Best for: research, open pipelines, Chinese-market creative tools, branded campaigns.
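
To make the "5-second 720p / 24 fps" figure concrete, the sketch below estimates how large a native clip is once the 3D causal VAE has compressed it. The compression factors (4x in time, 8x in space), the 2x2 spatial patching and the 129-frame clip length are assumptions based on the public open-weight release; none of them are stated on this page, so treat the numbers as illustrative.

```typescript
// Assumed compression factors (not stated on this page): 4x in time, 8x in space,
// with the first frame encoded on its own by the causal VAE, then 2x2 spatial patches.
const TEMPORAL_FACTOR = 4;
const SPATIAL_FACTOR = 8;
const PATCH = 2;

function latentGrid(frames: number, height: number, width: number) {
  if ((frames - 1) % TEMPORAL_FACTOR !== 0) throw new Error("frames must be 4k + 1");
  const latentFrames = (frames - 1) / TEMPORAL_FACTOR + 1;
  const latentHeight = height / SPATIAL_FACTOR;
  const latentWidth = width / SPATIAL_FACTOR;
  // Number of video tokens the transformer attends over after patching.
  const tokens = latentFrames * (latentHeight / PATCH) * (latentWidth / PATCH);
  return { latentFrames, latentHeight, latentWidth, tokens };
}

// A 129-frame 720x1280 clip, which is a little over 5 seconds at 24 fps.
const clip = latentGrid(129, 720, 1280);
console.log(clip, (129 / 24).toFixed(2), "seconds at 24 fps");
```

Under these assumptions a single native clip becomes a 33 x 90 x 160 latent grid and roughly 119,000 video tokens, which helps explain why longer or higher-resolution output is reserved for the hosted endpoint.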
Training & License

The training data is a curated multi-billion-clip video corpus, filtered hierarchically for aesthetics, motion and caption alignment, with dense bilingual captions from an in-house MLLM captioner. The paper discloses the exact size only in summary form.

License: Tencent Hunyuan Community Licence on weights (free for research and limited commercial use with attribution; thresholds for very large deployers). Commercial API governed by Tencent Cloud terms.

Known limitations
  • Native open weights limited to ~5s / 720p
  • Audio only on commercial Tencent endpoint
  • High VRAM requirements for local inference (~60-80 GB FP16)
  • Commercial API has stricter content moderation on Chinese surfaces
  • Licence has thresholds for very large commercial users
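
The ~60-80 GB VRAM figure is well above what the weights alone need, and a back-of-the-envelope calculation shows the gap. The sketch below counts only the 13B denoiser weights at a given precision. The remainder of the budget would go to activations over the long video token sequence, the text encoders and VAE decoding, none of which this estimate models.

```typescript
// Rough weight-only memory estimate; activations, text encoders and the VAE are not included.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, fp8: 1 } as const;

function weightMemoryGB(params: number, dtype: keyof typeof BYTES_PER_PARAM): number {
  return (params * BYTES_PER_PARAM[dtype]) / 1e9;
}

console.log(weightMemoryGB(13e9, "fp16")); // 26
```

At FP16 the weights account for about 26 GB, so well over half of the quoted requirement comes from everything else in the pipeline.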
