Stable Diffusion 3.5 Medium
Stability AI's 2.5B-parameter SD3.5 with a strong quality/speed trade-off. Consumer-GPU friendly.
Stable Diffusion 3.5 Medium is an image generation AI model from Custom, priced at €0.000 per 1M input tokens with an unknown context window.
API Integration
Use our OpenAI-compatible API to integrate Stable Diffusion 3.5 Medium into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const images = await rw.run("sd-3-5-medium-stability", "A beautiful sunset over Tokyo");
console.log(images[0].url);
// Or use the image() method for full control
const res = await rw.image("sd-3-5-medium-stability", "A cat in space", {
  size: "1024x1024",
  n: 1,
});
console.log(res.data[0].url);
Deep dive: Stability AI's Stable Diffusion 3.5 Medium
Stability AI was founded in 2019 by Emad Mostaque (CEO until March 2024; succeeded by Prem Akkaraju) and is headquartered in London. The company sponsored and released the original Stable Diffusion (2022, with CompVis/LMU Munich and Runway), SDXL (2023), SD3 (2024) and SD 3.5 (Oct 2024). The October 2024 SD 3.5 release shipped three variants — Large (8.1B), Large Turbo (distilled few-step) and Medium (2.5B) — all under a permissive Community License.
Stable Diffusion 3.5 Medium is the mid-tier model of the October 2024 SD 3.5 release. It uses a refined architecture Stability AI calls MMDiT-X, a 2.5B-parameter Multimodal Diffusion Transformer with improved positional embeddings and joint self-attention designed to fix the anatomical issues that plagued the earlier SD3 Medium 2B release. The model is trained with the rectified-flow / flow-matching objective described in the SD3 paper (Esser et al. 2024) and uses three parallel text encoders (CLIP-L, CLIP-G and T5-XXL) for strong long-prompt understanding. SD 3.5 Medium targets a sweet spot of quality and inference cost: it runs comfortably on a single 12GB consumer GPU, supports 1024x1024 native resolution, and is fast enough for interactive use. It is also intended to be the most fine-tune-friendly member of the SD 3.5 family, with a smaller training footprint than the Large variant.
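To make the rectified-flow idea concrete, here is a toy one-dimensional sketch in JavaScript. It assumes the straight-line interpolation z_t = (1 - t) * x0 + t * noise used in the SD3 paper, whose velocity along the path is noise - x0. The real model learns that velocity with a neural network over image latents; this sketch plugs in the exact value instead. All function names are hypothetical and not part of any SDK.

```javascript
// Toy 1-D illustration of rectified flow (not the real model).
// Assumption: straight-line path z_t = (1 - t) * x0 + t * noise.

// Point on the straight path between data (t = 0) and noise (t = 1).
function interpolate(x0, noise, t) {
  return (1 - t) * x0 + t * noise;
}

// Training target: the constant velocity dz/dt along that path.
function velocityTarget(x0, noise) {
  return noise - x0;
}

// Squared-error flow-matching loss for one sample.
function flowMatchingLoss(predictedVelocity, x0, noise) {
  const diff = predictedVelocity - velocityTarget(x0, noise);
  return diff * diff;
}

// Sampling: start at pure noise (t = 1) and take Euler steps down to t = 0.
// `velocityFn(z, t)` stands in for the trained network.
function eulerSample(noise, velocityFn, steps) {
  let z = noise;
  const dt = 1 / steps;
  for (let i = 0; i < steps; i++) {
    const t = 1 - i * dt;
    z -= velocityFn(z, t) * dt;
  }
  return z;
}

// With the exact velocity, the sampler walks straight back to the data point.
const x0 = 0.75;
const noise = -1.2;
const sample = eulerSample(noise, () => velocityTarget(x0, noise), 4);
console.log(sample.toFixed(4)); // 0.7500
```

Because the path is a straight line, even a handful of Euler steps lands on the data point when the velocity is exact. A trained network only approximates that velocity, so real samplers typically use more steps.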
- Parameters: 2.5B
- Context: 256 tokens
- Runs on consumer GPUs (12GB VRAM with FP16)
- Open weights under Stability AI Community License
- MMDiT-X architecture fixes SD3 anatomy issues
- Three text encoders for strong prompt adherence
- 1024x1024 native resolution
- Fine-tune-friendly thanks to smaller footprint
- Compatible with ControlNet, LoRA, IP-Adapter ecosystem
- Best for: fine-tuning base, consumer apps, on-device generation, education, prototyping.
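The 12GB figure in the list above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below counts only the weights of the 2.5B-parameter diffusion transformer at 2 bytes per parameter (FP16). Text encoders, the VAE, and activations add to this, and their sizes are not given here, so treat the result as a lower bound. The function name is hypothetical.

```javascript
// Rough weight-memory estimate: parameters × bytes per parameter.
// Assumption: FP16 stores each parameter in 2 bytes.
function weightMemoryGB(parameterCount, bytesPerParameter) {
  return (parameterCount * bytesPerParameter) / 1e9; // decimal gigabytes
}

const sd35MediumParams = 2.5e9; // 2.5B, from the list above
const fp16Bytes = 2;

const weightsGB = weightMemoryGB(sd35MediumParams, fp16Bytes);
console.log(`${weightsGB} GB for transformer weights alone`); // 5 GB for transformer weights alone
```

That leaves roughly 7 GB of a 12 GB card for the text encoders, VAE, and activations, which is in line with the consumer-GPU claim but is an estimate, not a measured number.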
Pretrained on a large licensed and publicly available image-text dataset; filtered for safety and quality. Smaller training footprint than SD 3.5 Large.
License: Stability AI Community License — free for non-commercial and commercial use up to $1M annual revenue; enterprise license required above that threshold.
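As an illustration only, the revenue rule summarized above can be written as a small check. The sketch assumes the threshold is inclusive (exactly $1M still qualifies), which the summary does not specify. The function name and return values are hypothetical, and the actual license text governs.

```javascript
// Hypothetical helper mirroring the license summary above.
const FREE_USE_REVENUE_LIMIT_USD = 1_000_000;

function requiredLicense(annualRevenueUsd) {
  if (!Number.isFinite(annualRevenueUsd) || annualRevenueUsd < 0) {
    throw new RangeError("annualRevenueUsd must be a non-negative number");
  }
  // Assumption: revenue exactly at the limit still counts as "up to $1M".
  return annualRevenueUsd <= FREE_USE_REVENUE_LIMIT_USD ? "community" : "enterprise";
}

console.log(requiredLicense(250_000));   // community
console.log(requiredLicense(5_000_000)); // enterprise
```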
Known limitations
- Lower quality than SD 3.5 Large on photorealism
- Smaller capacity limits very complex compositions
- Open weights have no built-in safety filter
- Commercial-revenue threshold ($1M) for free use
Related Models
Flux 1.1 Pro Ultra
FLUX 1.1 Pro in ultra mode. Up to 4 megapixel images with raw mode for photorealism.
Flux Dev
Black Forest Labs' development model. Fast, high-quality image generation with LoRA support.
Google Imagen 4
Google's Imagen 4. Text-to-image with strong photorealism and improved typography support.
Google Imagen 4 Ultra
Premium Imagen 4 tier. Highest fidelity, prompt adherence and typography quality from Google.
Start using Stable Diffusion 3.5 Medium today
Get started with free credits. No credit card required. Access Stable Diffusion 3.5 Medium and 100+ other models through a single API.