Stable Diffusion XL
Stability AI's SDXL model via Replicate. High-quality image generation with extensive customization.
Stable Diffusion XL is an image generation AI model from Stability AI, priced at €0.000 per 1M input tokens with an unknown context window.
Examples
See what Stable Diffusion XL can generate
Anime Character
Prompt: "A fierce warrior princess with flowing silver hair and golden armor, standing atop a cliff overlooking a vast battlefield, anime art style, dramatic wind effects, detailed cel shading"
Cozy Interior
Prompt: "A hygge-inspired reading nook with floor-to-ceiling bookshelves, a plush velvet armchair, warm fairy lights, a sleeping cat, and rain visible through a large arched window, digital painting style"
Abstract Art
Prompt: "Geometric crystalline formations emerging from a pool of liquid gold, refracting rainbow light prisms throughout the composition, ultra-detailed 3D render with volumetric lighting and caustics"
API Integration
Use our OpenAI-compatible API to integrate Stable Diffusion XL into your application.
npm install railwail
import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const images = await rw.run("stable-diffusion-xl", "A beautiful sunset over Tokyo");
console.log(images[0].url);
// Or use the image() method for full control
const res = await rw.image("stable-diffusion-xl", "A cat in space", {
  size: "1024x1024",
  n: 1,
});
console.log(res.data[0].url);
Deep dive: Stability AI's Stable Diffusion XL
Stability AI was founded in 2019 by Emad Mostaque (CEO until March 2024) and is headquartered in London. The company sponsored and released the original Stable Diffusion 1.4/1.5/2.x (2022) in partnership with CompVis (LMU Munich, Robin Rombach and Patrick Esser), Runway and LAION. Stable Diffusion XL (SDXL) was released in July 2023 by Stability AI as a successor to SD 1.5/2.1 and was widely adopted by the open-source community as the de-facto open base model for fine-tuning and downstream pipelines until FLUX.1 and SD 3.5 arrived in 2024.
Stable Diffusion XL (SDXL) is a 2023 latent diffusion model from Stability AI, described in 'SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis' (Podell et al. 2023). The architecture is a refined latent-diffusion U-Net operating in the latent space of a learned autoencoder (VAE) that downsamples images 8x per side. Key changes over SD 1.5 include: (1) a U-Net backbone roughly 3x larger than SD 1.5's (about 2.6B parameters, for a base model of about 3.5B including the text encoders), (2) two text encoders, CLIP ViT-L and OpenCLIP ViT-bigG, whose features are concatenated for stronger prompt adherence, (3) conditioning on the original image resolution and crop coordinates to reduce the resolution and cropping artefacts of SD 1.5, (4) training natively at 1024x1024 instead of 512x512, and (5) an optional refiner model that runs a short final denoising pass to add high-frequency detail; base and refiner together form a roughly 6.6B-parameter ensemble. SDXL became the dominant open base model for fine-tuning between mid-2023 and late 2024, spawning a large ecosystem that includes IP-Adapter, AnimateDiff, ControlNet-XL, T2I-Adapter and tens of thousands of community LoRAs.
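The shape arithmetic implied by this description can be sketched in a few lines of JavaScript. This is only an illustration: the helper names are hypothetical, and the 4-channel latent and the 768 and 1280 embedding widths are assumptions taken from the publicly released SDXL configuration, not from this page.

```javascript
// Illustrative shape arithmetic for SDXL; not part of any SDK.
const VAE_DOWNSAMPLE = 8;      // 8x per-side compression, as described above
const LATENT_CHANNELS = 4;     // assumption: 4-channel latent in the released weights
const CLIP_L_WIDTH = 768;      // assumption: CLIP ViT-L hidden size
const OPENCLIP_G_WIDTH = 1280; // assumption: OpenCLIP ViT-bigG hidden size

// Size of the latent tensor the U-Net denoises for a given pixel resolution.
function latentShape(width, height) {
  if (width % VAE_DOWNSAMPLE !== 0 || height % VAE_DOWNSAMPLE !== 0) {
    throw new RangeError("width and height must be multiples of 8");
  }
  return {
    channels: LATENT_CHANNELS,
    height: height / VAE_DOWNSAMPLE,
    width: width / VAE_DOWNSAMPLE,
  };
}

// Per-token width of the text context after concatenating both encoders.
function textContextWidth() {
  return CLIP_L_WIDTH + OPENCLIP_G_WIDTH;
}

console.log(latentShape(1024, 1024)); // { channels: 4, height: 128, width: 128 }
console.log(textContextWidth());      // 2048
```

Under these assumptions, a native 1024x1024 image corresponds to a 128x128 latent, 64 times fewer spatial positions than the pixel grid, which is why diffusion in latent space is much cheaper than diffusion in pixel space.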
- Parameters: ~3.5B base model (~2.6B U-Net); ~6.6B for the base + refiner ensemble
- Context: 75 tokens
- Native 1024x1024 generation
- Two text encoders (CLIP-L + OpenCLIP-G) for strong prompts
- Optional refiner for high-frequency detail
- Open weights under CreativeML Open RAIL++-M license
- Massive ecosystem: ControlNet-XL, IP-Adapter, AnimateDiff, LoRAs
- Strong fine-tuning base for art styles and brand models
- Runs on consumer GPUs (8-12GB VRAM with quantisation)
- Best for: fine-tuning base, ControlNet pipelines, community LoRAs, on-prem deployments, education.
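As a rough sanity check on the VRAM figure in the list above, weight memory can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation with a hypothetical helper name; it ignores activations, the VAE decode peak and framework overhead, so real usage will be higher.

```javascript
// Rough weight-memory estimate: parameters x bytes per parameter.
// Activations and runtime overhead are deliberately ignored.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1 };

function weightMemoryGB(paramCount, precision) {
  const bytes = BYTES_PER_PARAM[precision];
  if (bytes === undefined) {
    throw new Error(`unknown precision: ${precision}`);
  }
  return (paramCount * bytes) / 1e9; // decimal gigabytes
}

const BASE_PARAMS = 3.5e9; // approximate base model size quoted above

for (const precision of Object.keys(BYTES_PER_PARAM)) {
  console.log(precision, weightMemoryGB(BASE_PARAMS, precision).toFixed(1), "GB");
}
```

With these assumptions the base weights alone take about 14 GB in fp32, 7 GB in fp16 and 3.5 GB in int8, which is consistent with the model fitting on 8-12GB cards only at reduced precision.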
Pretrained on a large subset of LAION-5B and additional Stability AI-curated data. Conditioned on resolution and crop coordinates to mitigate aspect-ratio artefacts.
License: CreativeML Open RAIL++-M license — open weights with usage restrictions (no NCII, CSAM, etc.). Commercial use permitted with these restrictions.
Known limitations
- Older architecture — outperformed by FLUX.1 and SD 3.5 Large
- Hands, text and complex anatomy imperfect
- Default text encoder limited to 75 tokens
- Open weights have no built-in safety filter
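To make the 75-token limit concrete, the sketch below splits an already-tokenized prompt into windows of at most 75 tokens. The helper is hypothetical and the token IDs are stand-ins; a real pipeline would use the CLIP tokenizer, and whether chunks beyond the first are encoded at all depends on the tooling, so treat this as an illustration of the limit and not as a description of how any particular API handles long prompts.

```javascript
// Hypothetical helper: split a token-ID array into windows of at most 75 tokens.
const MAX_PROMPT_TOKENS = 75;

function chunkTokens(tokenIds, size = MAX_PROMPT_TOKENS) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < tokenIds.length; i += size) {
    chunks.push(tokenIds.slice(i, i + size));
  }
  return chunks;
}

// Stand-in token IDs; a real prompt would come from a CLIP tokenizer.
const fakeTokens = Array.from({ length: 180 }, (_, i) => i);
console.log(chunkTokens(fakeTokens).map((c) => c.length)); // [ 75, 75, 30 ]
```

A prompt that produces more than one chunk is longer than the default encoder window, so anything past the first 75 tokens may be ignored unless the tooling encodes the extra chunks separately.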
Related Models
Flux 1.1 Pro Ultra
FLUX 1.1 Pro in ultra mode. Up to 4 megapixel images with raw mode for photorealism.
Flux Dev
Black Forest Labs' development model. Fast, high-quality image generation with LoRA support.
Google Imagen 4
Google's Imagen 4. Text-to-image with strong photorealism and improved typography support.
Google Imagen 4 Ultra
Premium Imagen 4 tier. Highest fidelity, prompt adherence and typography quality from Google.
Start using Stable Diffusion XL today
Get started with free credits. No credit card required. Access Stable Diffusion XL and 100+ other models through a single API.