Janus Pro 7B
DeepSeek's unified multimodal model. Decouples vision encoding for both understanding and generation tasks.
Janus Pro 7B is an image generation AI model available through Replicate, priced at €0.000 per 1M input tokens with a 4.1K-token context window.
API Integration
Use our OpenAI-compatible API to integrate Janus Pro 7B into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const images = await rw.run("janus-pro-7b", "A beautiful sunset over Tokyo");
console.log(images[0].url);
// Or use the image() method for full control
const res = await rw.image("janus-pro-7b", "A cat in space", {
  size: "1024x1024",
  n: 1,
});
console.log(res.data[0].url);
Deep dive: DeepSeek's Janus Pro 7B
DeepSeek (深度求索) is a Chinese AI research lab founded in May 2023 in Hangzhou by Liang Wenfeng, founder of the quantitative hedge fund High-Flyer, which funds the lab. DeepSeek became globally prominent in late 2024 and early 2025 with the release of DeepSeek-V3 (Dec 2024), DeepSeek-R1 (Jan 2025) and the Janus multimodal family. The Janus series is DeepSeek's family of unified understanding-and-generation models: Janus (Oct 2024), JanusFlow (Nov 2024) and Janus Pro (Jan 2025) each use a single Transformer to both interpret and generate images, with a decoupled visual encoder design. All Janus models are open-sourced under a permissive license on Hugging Face and GitHub.
Janus-Pro-7B (January 2025) is the largest member of the Janus family from DeepSeek. The model unifies multimodal understanding and image generation in a single autoregressive Transformer, but uniquely decouples the visual encoders for the two tasks: a SigLIP-style ViT encoder is used for image understanding (so visual features are semantic), while a VQ tokeniser based on LlamaGen is used for image generation (so visual features are reconstruction-oriented). Both feature paths are projected to the same 7B LLM backbone, which generates either text or image tokens depending on the task. For image generation, Janus-Pro produces 384x384 images by autoregressive sampling of VQ tokens, which are then decoded by the LlamaGen decoder. Janus-Pro improves over Janus by scaling data to ~90M image-text pairs and adding a second stage of supervised fine-tuning. DeepSeek reports that Janus-Pro-7B beats DALL-E 3, SD3-Medium and SDXL on GenEval and DPG-Bench, despite being a much smaller and unified model.
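As a rough illustration of the generation path described above, the sketch below counts how many VQ tokens the backbone would have to sample for one 384x384 image, and how much of the listed 4.1K-token context would remain for the prompt. The 16x downsampling factor, the reading of 4.1K as 4096 tokens, and the function names are assumptions made for illustration; none of them are stated on this page.

```javascript
// Assumption: the generation tokenizer downsamples each image side by 16x,
// as LlamaGen-style VQ tokenizers commonly do. This page does not state it.
const ASSUMED_DOWNSAMPLE = 16;

// Assumption: "4.1K tokens" on this page means 4096 tokens.
const ASSUMED_CONTEXT_TOKENS = 4096;

// Number of discrete VQ tokens sampled for one square image.
function imageTokenCount(sidePx, downsample = ASSUMED_DOWNSAMPLE) {
  if (sidePx % downsample !== 0) {
    throw new RangeError(`side ${sidePx} is not a multiple of ${downsample}`);
  }
  const gridSide = sidePx / downsample; // tokens along one side of the grid
  return gridSide * gridSide;
}

// Tokens left for the text prompt once one image's tokens are reserved.
function promptBudget(sidePx, contextTokens = ASSUMED_CONTEXT_TOKENS) {
  return contextTokens - imageTokenCount(sidePx);
}

const tokens = imageTokenCount(384); // 24 * 24 = 576
const budget = promptBudget(384);    // 4096 - 576 = 3520
console.log({ tokens, budget });
```

Under these assumptions a single image costs 576 sequential sampling steps, which is why the token grid, rather than the pixel count, is the useful unit when reasoning about generation latency.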
- Parameters: 7B (Janus-Pro-7B)
- Context: 4.1K tokens
- Unified image understanding + image generation in one 7B model
- Open weights under permissive DeepSeek license
- Outperforms DALL-E 3 and SD3-Medium on GenEval (per DeepSeek paper)
- 384x384 native generation resolution
- Compatible with Hugging Face Transformers and vLLM
- Useful for multimodal agents that both see and draw
- Strong instruction-following thanks to LLM-style backbone
- Best for: research, multimodal agents, prototyping unified pipelines, fine-tuning.
Janus-Pro-7B is pretrained on a mix of ~90M image-text pairs, text-only data and image-only data. Compared with the original 1.3B Janus, it adds extra supervised fine-tuning stages and a larger unified dataset.
License: code under the MIT License, model weights under the DeepSeek Model License. The weights are open and free for research and commercial use, subject to the license's use restrictions.
Known limitations
- Only 384x384 native resolution — needs upscaler for production
- Image quality below dedicated diffusion models like FLUX 1.1 [pro]
- Open weights have no built-in safety filter
- Autoregressive sampling is slower per pixel than diffusion at high res
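To make the resolution limitation concrete, here is a minimal sketch of how one might size an upscaling step for a given target output. The function names are hypothetical, and the choice to round up to a whole-number factor is an assumption (many upscalers work in integer multiples); the 384-pixel native side is the figure given on this page.

```javascript
const NATIVE_SIDE = 384; // native generation resolution stated on this page

// Smallest whole-number upscale factor whose output covers the target size.
function upscaleFactor(targetWidth, targetHeight, nativeSide = NATIVE_SIDE) {
  const longest = Math.max(targetWidth, targetHeight);
  return Math.max(1, Math.ceil(longest / nativeSide));
}

// Output dimensions after upscaling the square native image by that factor.
function upscalePlan(targetWidth, targetHeight) {
  const factor = upscaleFactor(targetWidth, targetHeight);
  const side = NATIVE_SIDE * factor;
  return { factor, width: side, height: side };
}

// A 1024x1024 target needs a 3x upscale (1152x1152), then a crop or resize down.
console.log(upscalePlan(1024, 1024));
```

A target at or below 384 pixels per side yields a factor of 1, meaning no upscaler is needed.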
Research papers
- Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling (2025)
- Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation (2024)
- JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation (2024)
Related Models
Flux 1.1 Pro Ultra
FLUX 1.1 Pro in ultra mode. Up to 4 megapixel images with raw mode for photorealism.
Flux Dev
Black Forest Labs' development model. Fast, high-quality image generation with LoRA support.
Google Imagen 4
Google's Imagen 4. Text-to-image with strong photorealism and improved typography support.
Google Imagen 4 Ultra
Premium Imagen 4 tier. Highest fidelity, prompt adherence and typography quality from Google.
Start using Janus Pro 7B today
Get started with free credits. No credit card required. Access Janus Pro 7B and 100+ other models through a single API.