Kuaishou Kolors
Kuaishou's bilingual (CN/EN) latent diffusion text-to-image model with strong text rendering.
Kuaishou Kolors is an image generation AI model offered via Replicate, priced at €0.000 per 1M input tokens with an unknown context window.
API Integration
Use our OpenAI-compatible API to integrate Kuaishou Kolors into your application.
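Because the API is described as OpenAI-compatible, an image request presumably carries the usual OpenAI-style fields (model, prompt, size, n). The sketch below only builds and validates such a request body and makes no network call. The helper name `buildImageRequest` and its validation rules are illustrative assumptions, not part of the railwail SDK; the model ID is the one used in this page's snippets.

```javascript
// Hypothetical helper (not part of the railwail SDK): builds a request body
// in the OpenAI images style. Field names are assumed from "OpenAI-compatible".
const MODEL_ID = "kolors-kuaishou"; // model ID as used in this page's snippets

function buildImageRequest(prompt, { size = "1024x1024", n = 1 } = {}) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new TypeError("prompt must be a non-empty string");
  }
  // Sizes are written as "<width>x<height>", e.g. "1024x1024".
  if (!/^\d+x\d+$/.test(size)) {
    throw new RangeError(`size must look like "1024x1024", got "${size}"`);
  }
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  return { model: MODEL_ID, prompt: prompt.trim(), size, n };
}

const body = buildImageRequest("A cat in space", { size: "1024x1024", n: 1 });
console.log(JSON.stringify(body));
```

Validating on the client before sending keeps malformed sizes or empty prompts from consuming a request; the actual limits accepted by the service may differ.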
npm install railwail

import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");
const images = await rw.run("kolors-kuaishou", "A beautiful sunset over Tokyo");
console.log(images[0].url);

// Or use the image() method for full control
const res = await rw.image("kolors-kuaishou", "A cat in space", {
  size: "1024x1024",
  n: 1,
});
console.log(res.data[0].url);

Deep dive: Kuaishou (Kolors Team)'s Kuaishou Kolors
Kuaishou Technology (快手) is a Chinese short-video platform founded in 2011 in Beijing by Su Hua and Cheng Yixiao, and listed on the Hong Kong Stock Exchange since 2021. Its AI research division, often referred to as the KwaiVGI or Kling AI team, has produced several generative-media foundation models, including Kling (video) and Kolors (image). Kolors was open-sourced in July 2024 with weights released on Hugging Face and GitHub (Kwai-Kolors/Kolors). The model is positioned as a leading open Chinese-English bilingual text-to-image model and is particularly strong on Chinese-language prompts, calligraphy and culturally specific content.
Kolors is a latent diffusion text-to-image model released as open source by Kuaishou in July 2024. The architecture follows the Stable Diffusion XL family (a latent-space U-Net with cross-attention conditioning on text embeddings) but replaces the standard CLIP text encoder with the much larger ChatGLM3-6B, a bilingual Chinese-English LLM developed by Tsinghua KEG and Zhipu AI. This gives Kolors a strong understanding of Chinese-language prompts, idioms, calligraphy and culturally specific content (food, festivals, Chinese architecture, traditional clothing) that Western models like SDXL and FLUX often struggle with. The U-Net has ~2.6B parameters and is trained at 1024x1024 resolution with the standard noise-prediction objective. Kuaishou also released a Kolors-Inpainting variant and a Kolors-ControlNet collection (Canny/Depth/Pose). The model is open source under the Apache 2.0 license, which makes it a commercially usable open bilingual model.
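For reference, the noise-prediction objective mentioned above is usually written as follows. This is the standard latent-diffusion formulation, not a formula quoted from the Kolors report, and the notation is chosen here for illustration: z_0 is the clean image latent, c the text embedding from the text encoder, \epsilon_\theta the U-Net, and \bar\alpha_t the cumulative noise schedule at timestep t.

```latex
% Forward (noising) process applied to the clean latent z_0
z_t = \sqrt{\bar\alpha_t}\, z_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Training loss: the U-Net predicts the noise that was added
\mathcal{L} = \mathbb{E}_{z_0,\, c,\, \epsilon,\, t}
\left[ \left\lVert \epsilon - \epsilon_\theta(z_t, t, c) \right\rVert_2^2 \right]
```

Under this formulation, swapping the text encoder changes only how c is computed; the loss itself has the same form as in other latent diffusion models.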
- Parameters: ~2.6B U-Net + 6B ChatGLM3 text encoder
- Context: 256 tokens
- Best-in-class understanding of Chinese-language prompts
- Strong English performance comparable to SDXL
- 1024x1024 native resolution
- Open weights under Apache 2.0
- Excellent rendering of Chinese characters and calligraphy
- Culturally accurate Chinese-food, festival, architecture and clothing imagery
- Companion models: Kolors-Inpainting, Kolors-ControlNet (Canny/Depth/Pose)
- Best for: Chinese-market apps, bilingual creative tools, culturally specific content, self-hosted Chinese services.
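For self-hosting, the parameter counts listed above give a rough idea of how much memory the weights alone need. The sketch below is back-of-the-envelope arithmetic under stated assumptions: 2 bytes per parameter (fp16), decimal gigabytes, and no allowance for the VAE, activations or framework overhead. The function name `weightGigabytes` is illustrative.

```javascript
// Rough weight-memory estimate: parameter count x bytes per parameter.
// Assumption: weights only; VAE, activations and runtime overhead are ignored.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1 };

function weightGigabytes(paramCount, dtype = "fp16") {
  const bytes = BYTES_PER_PARAM[dtype];
  if (bytes === undefined) throw new RangeError(`unknown dtype: ${dtype}`);
  return (paramCount * bytes) / 1e9; // decimal gigabytes
}

const unetParams = 2.6e9;      // "~2.6B U-Net" from the spec list
const textEncoderParams = 6e9; // "6B ChatGLM3 text encoder" from the spec list

const unetGB = weightGigabytes(unetParams);
const encoderGB = weightGigabytes(textEncoderParams);
const totalGB = unetGB + encoderGB;

console.log(
  `U-Net: ${unetGB.toFixed(1)} GB, text encoder: ${encoderGB.toFixed(1)} GB, ` +
    `total: ${totalGB.toFixed(1)} GB`
);
// prints "U-Net: 5.2 GB, text encoder: 12.0 GB, total: 17.2 GB"
```

On these numbers the text encoder, not the U-Net, accounts for most of the weight memory, so a lower-precision text encoder is the more effective lever when memory is tight.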
Trained on a large mixed Chinese-English image-text corpus including high-quality Chinese-captioned data. Exact composition is not disclosed.
License: Apache 2.0. Open weights, with full commercial use permitted, including redistribution and derivatives.
Known limitations
- Image quality below FLUX 1.1 [pro] on photorealism
- Smaller model than SD3.5 Large or FLUX
- Some Chinese-regulatory filters baked into training
- Open weights have no integrated safety classifier
Research papers
- Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis (Tech Report) (2024)
- High-Resolution Image Synthesis with Latent Diffusion Models (2022)
- SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis (2023)
- ChatGLM3-6B model card (2023)
Related Models
Flux 1.1 Pro Ultra
FLUX 1.1 Pro in ultra mode. Up to 4 megapixel images with raw mode for photorealism.
Flux Dev
Black Forest Labs' development model. Fast, high-quality image generation with LoRA support.
Google Imagen 4
Google's Imagen 4. Text-to-image with strong photorealism and improved typography support.
Google Imagen 4 Ultra
Premium Imagen 4 tier. Highest fidelity, prompt adherence and typography quality from Google.