TL;DR — Switch in Under 15 Minutes
- Railwail mirrors the most popular Replicate models — Flux, SDXL, Stable Diffusion 3, Whisper, Bark, MusicGen, WAN-2.5, Veo-3, AnimateDiff
- Synchronous calls only — no need for the predictions polling pattern of Replicate
- Standard OpenAI-shaped API for images (/v1/images/generations), audio (/v1/audio/*), and video (/v1/videos)
- Per-call pricing, not per-second-of-compute — easier cost modelling
- EU hosting, EUR billing, and 275+ models behind one API key
Why Move Off Replicate?
Replicate's strength is the long tail of open-source models. Its weakness is the developer experience: async predictions that you must poll, per-second-of-GPU pricing that is hard to budget, an LLM catalogue centred on open-weight models such as Llama and Mixtral, and cold-start latency on rarely used models. Railwail mirrors the top-demand Replicate models behind always-warm endpoints with predictable per-call pricing.
Common reasons to migrate: simpler synchronous calls, EUR billing, no cold starts on Flux or SDXL, and the ability to mix open-source image generation with frontier LLMs (Claude, GPT-4o) under one API key.
Step 1 — Get a Railwail API Key
Sign up at railwail.com and generate an API key under Dashboard → API Keys. Free credits are included so you can validate the integration before paying.
Step 2 — Replace the Replicate Async Pattern
Replicate's API requires creating a prediction, polling for completion, then fetching the output URL (the client's replicate.run helper wraps that loop for you). Railwail uses synchronous calls: the response includes the result directly.
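To make the difference concrete, here is a minimal sketch of the create-then-poll loop described above. It is a generic illustration, not Replicate's client code: `getStatus` is a hypothetical callback standing in for a `GET /v1/predictions/{id}` request, and the interval and attempt limits are arbitrary defaults.

```typescript
// Generic polling loop: the pattern a synchronous API lets you delete.
// `getStatus` is a hypothetical stand-in for "fetch the prediction by id".
type PredictionStatus = "starting" | "processing" | "succeeded" | "failed" | "canceled";

interface Prediction {
  status: PredictionStatus;
  output?: unknown;
  error?: string;
}

async function pollUntilDone(
  getStatus: () => Promise<Prediction>,
  intervalMs = 1000,
  maxAttempts = 120,
): Promise<Prediction> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const prediction = await getStatus();
    if (prediction.status === "succeeded") return prediction;
    if (prediction.status === "failed" || prediction.status === "canceled") {
      throw new Error(`prediction ${prediction.status}: ${prediction.error ?? "no details"}`);
    }
    // Still running: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`prediction did not finish after ${maxAttempts} attempts`);
}
```

With a synchronous endpoint, all of this collapses into a single awaited request, which is what the examples that follow show.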
Image Generation — TypeScript
Before (Replicate):
import Replicate from "replicate";
const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });
const output = await replicate.run(
  "black-forest-labs/flux-1.1-pro",
  { input: { prompt: "a cyberpunk city at sunset", aspect_ratio: "16:9" } }
);
// output holds the generated image URL(s); the exact shape depends on the model and client version
After (Railwail):
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.RAILWAIL_API_KEY,
  baseURL: "https://api.railwail.com/v1",
});
const res = await client.images.generate({
  model: "flux-1.1-pro",
  prompt: "a cyberpunk city at sunset",
  size: "1792x1024", // 7:4, roughly the 16:9 aspect ratio used above
  n: 1,
});
console.log(res.data[0].url);
Image Generation — Python
import os
from openai import OpenAI
# Point the OpenAI client at the Railwail endpoint
client = OpenAI(
    api_key=os.environ["RAILWAIL_API_KEY"],
    base_url="https://api.railwail.com/v1",
)
resp = client.images.generate(
    model="flux-1.1-pro",
    prompt="a cyberpunk city at sunset",
    size="1792x1024",
)
print(resp.data[0].url)
Image Generation — cURL
curl https://api.railwail.com/v1/images/generations \
  -H "Authorization: Bearer $RAILWAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-1.1-pro",
    "prompt": "a cyberpunk city at sunset",
    "size": "1792x1024"
  }'
Audio Transcription
import fs from "node:fs";
// Reuses the `client` created in the TypeScript image example above
const res = await client.audio.transcriptions.create({
  model: "whisper-large-v3",
  file: fs.createReadStream("./meeting.mp3"),
});
console.log(res.text);
Step 3 — Video Generation
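Railwail exposes Replicate-mirrored video models (Veo-3, WAN-2.5, AnimateDiff) under POST /v1/videos. Generation is still asynchronous on the server (a clip takes roughly 30 seconds to 2 minutes), but Railwail does the polling for you: the HTTP request stays open and returns a single completed response. No client-side polling is needed, though your HTTP client's timeout has to allow for a wait of that length.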
const res = await fetch("https://api.railwail.com/v1/videos", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.RAILWAIL_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "wan-2.5-i2v",
    prompt: "a cat playing piano, cinematic",
    duration_seconds: 5,
  }),
});
// Fail loudly on HTTP errors instead of reading a missing video_url
if (!res.ok) throw new Error(`video request failed: ${res.status}`);
const json = await res.json();
console.log(json.video_url);
API Endpoint Mapping
Replicate endpoint → Railwail equivalent
| Replicate | Railwail equivalent | Notes |
|---|---|---|
| POST /v1/predictions | POST /v1/images/generations | For image models |
| POST /v1/predictions (audio) | POST /v1/audio/transcriptions or /v1/audio/speech | Whisper, Bark |
| POST /v1/predictions (video) | POST /v1/videos | Synchronous wait, no polling |
| GET /v1/predictions/{id} | Not needed — sync response | Railwail blocks until complete |
| GET /v1/models | GET /v1/models | Filter by provider=replicate-mirror |
| POST /v1/predictions (LLM) | POST /v1/chat/completions | Llama, Mixtral, DeepSeek via standard chat API |
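The last row is the only mapping without an example elsewhere in this guide, so here is a hedged sketch of what a chat request could look like. It assumes the endpoint accepts the standard OpenAI chat body (`model` plus a `messages` array), which the table implies but does not spell out. `buildChatRequest` is a hypothetical helper that only builds the request, so you can inspect it before sending.

```typescript
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a chat completion request.
// Assumption: the body follows the standard OpenAI chat shape.
function buildChatRequest(apiKey: string, model: string, userPrompt: string): ChatRequest {
  return {
    url: "https://api.railwail.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: userPrompt }],
      }),
    },
  };
}
```

Sending it is then a matter of calling `fetch(req.url, req.init)`, or of using `client.chat.completions.create(...)` on the OpenAI SDK client configured earlier.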
Popular Replicate Model Mapping
Replicate model → Railwail model ID
| Replicate identifier | Railwail model | Category |
|---|---|---|
| black-forest-labs/flux-1.1-pro | flux-1.1-pro | Image |
| black-forest-labs/flux-schnell | flux-schnell | Image |
| stability-ai/stable-diffusion-3.5-large | stable-diffusion-3.5-large | Image |
| stability-ai/sdxl | sdxl | Image |
| openai/whisper | whisper-large-v3 | Audio STT |
| suno-ai/bark | bark | Audio TTS |
| meta/musicgen | musicgen-large | Music |
| lucataco/animate-diff | animate-diff | Video |
| meta/llama-3.1-405b-instruct | llama-3.1-405b-instruct | Text |
| mistralai/mixtral-8x7b-instruct-v0.1 | mixtral-8x7b-instruct | Text |
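If your codebase stores Replicate identifiers in config, a small lookup can translate them during the migration. The sketch below copies the table above into a `Map`; `toRailwailModel` is a hypothetical helper name, and stripping a `:<version>` suffix is an assumption about how your existing identifiers are written.

```typescript
// Replicate identifier -> Railwail model ID, copied from the mapping table.
const REPLICATE_TO_RAILWAIL = new Map<string, string>([
  ["black-forest-labs/flux-1.1-pro", "flux-1.1-pro"],
  ["black-forest-labs/flux-schnell", "flux-schnell"],
  ["stability-ai/stable-diffusion-3.5-large", "stable-diffusion-3.5-large"],
  ["stability-ai/sdxl", "sdxl"],
  ["openai/whisper", "whisper-large-v3"],
  ["suno-ai/bark", "bark"],
  ["meta/musicgen", "musicgen-large"],
  ["lucataco/animate-diff", "animate-diff"],
  ["meta/llama-3.1-405b-instruct", "llama-3.1-405b-instruct"],
  ["mistralai/mixtral-8x7b-instruct-v0.1", "mixtral-8x7b-instruct"],
]);

// Returns the Railwail model ID, or undefined when the model is not in the table.
function toRailwailModel(replicateId: string): string | undefined {
  // Identifiers may carry a ":<version>" suffix; drop it before the lookup.
  const name = replicateId.split(":", 1)[0] ?? replicateId;
  return REPLICATE_TO_RAILWAIL.get(name);
}
```

An `undefined` result is a useful signal: that model is not in this table, so the call should stay on Replicate until a mirror exists.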
Pricing Comparison
Replicate (per-second compute) vs Railwail (per-call)
| Model and unit | Replicate (USD) | Railwail (EUR) | Notes |
|---|---|---|---|
| flux-1.1-pro per image (1024x1024) | ~$0.04 (avg) | EUR 0.037 | Predictable per-call |
| flux-schnell per image | ~$0.003 | EUR 0.0028 | Roughly at parity after currency conversion |
| sdxl per image | ~$0.0035 | EUR 0.0032 | Per call |
| whisper-large-v3 per minute audio | $0.0036 | EUR 0.0033 | Roughly at parity after currency conversion |
| bark per minute speech | $0.045 | EUR 0.042 | Roughly at parity after currency conversion |
| wan-2.5 per 5s video | ~$0.30 | EUR 0.28 | Predictable |
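Per-call pricing turns a budget into plain multiplication. The sketch below uses the Railwail column of the table above, stored as integer micro-euros so that totals do not pick up floating-point drift. `estimateEur` and `UsageLine` are hypothetical names, and `units` means whatever the table row bills by (images, minutes, or 5-second clips).

```typescript
// Railwail prices from the table above, in micro-euros (1 EUR = 1,000,000).
const PRICE_MICRO_EUR = new Map<string, number>([
  ["flux-1.1-pro", 37000],    // per image (1024x1024)
  ["flux-schnell", 2800],     // per image
  ["sdxl", 3200],             // per image
  ["whisper-large-v3", 3300], // per minute of audio
  ["bark", 42000],            // per minute of speech
  ["wan-2.5", 280000],        // per 5-second video
]);

interface UsageLine {
  model: string;
  units: number; // whole billing units: images, minutes, or clips
}

// Sums price * units per line and converts the total back to euros.
function estimateEur(usage: UsageLine[]): number {
  let totalMicro = 0;
  for (const { model, units } of usage) {
    const price = PRICE_MICRO_EUR.get(model);
    if (price === undefined) throw new Error(`no price listed for ${model}`);
    totalMicro += price * units;
  }
  return totalMicro / 1_000_000;
}
```

For example, 10,000 flux-schnell images plus 1,000 minutes of Whisper transcription come to EUR 28 + EUR 3.30 = EUR 31.30 at the listed prices.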
Why Railwail Over Replicate
- Synchronous responses — no polling, no webhook plumbing
- Always-warm endpoints — no 30s cold starts on popular models
- Per-call pricing, not per-second compute — clean cost modelling
- EUR billing with VAT receipts
- EU-hosted gateway
- Same key unlocks Claude, GPT-4o, Gemini for the rest of your AI stack
- Built-in playground at railwail.com/models — preview Flux/SDXL prompts in-browser
FAQ
What if Railwail does not have the specific Replicate model I use?
Railwail mirrors the 100 most popular Replicate models by usage. If yours is missing, submit a request at railwail.com/models/submit; we typically add new mirrors within 48 hours.
Do I lose access to community-built Replicate models?
Long-tail community models that are not mirrored stay on Replicate. You can use both APIs side-by-side — Railwail for production high-volume models, Replicate for experiments.
Can I get the raw image bytes instead of a URL?
Yes. Set response_format: "b64_json" on the request to receive base64-encoded image data inline, matching OpenAI's DALL-E response shape.
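Once you have the base64 string, it still needs decoding before you can write it to disk or upload it. A minimal sketch, assuming the string arrives in a `b64_json` field as in OpenAI's image response shape; `decodeBase64Image` is a hypothetical helper name.

```typescript
// Decodes a base64 string (for example the b64_json field) into raw bytes.
function decodeBase64Image(b64: string): Uint8Array {
  const binary = atob(b64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

In Node you could then pass the result to `fs.writeFile`. The image format inside those bytes depends on the model, so check it before picking a file extension.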
Does the Flux LoRA / fine-tuning workflow work?
Pass a public LoRA URL in the lora_url parameter for Flux models. For training new LoRAs, use Replicate's training pipeline — the resulting weights can be hosted on Hugging Face and called via Railwail.
What about Replicate's webhook on completion?
Because Railwail responses are synchronous, you do not need webhooks for short jobs. For long video jobs, you can pass a webhook_url in the request body.
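As a sketch, a video request body with an optional webhook might be assembled like this. The `model`, `prompt`, and `duration_seconds` fields come from the video example earlier and `webhook_url` from the answer above; the helper name `videoRequestBody` and the https-only check are assumptions for illustration, not documented requirements. What gets posted to the webhook is not covered in this guide.

```typescript
interface VideoRequestBody {
  model: string;
  prompt: string;
  duration_seconds: number;
  webhook_url?: string;
}

// Builds the JSON body for POST /v1/videos, adding webhook_url only when given.
function videoRequestBody(
  model: string,
  prompt: string,
  durationSeconds: number,
  webhookUrl?: string,
): VideoRequestBody {
  const body: VideoRequestBody = { model, prompt, duration_seconds: durationSeconds };
  if (webhookUrl !== undefined) {
    // new URL() throws on malformed input, so typos surface before the request.
    const parsed = new URL(webhookUrl);
    // Assumption: require https so callbacks are not sent in clear text.
    if (parsed.protocol !== "https:") throw new Error("webhook_url should use https");
    body.webhook_url = parsed.toString();
  }
  return body;
}
```

The returned object can be passed straight to `JSON.stringify` as the `body` of the `fetch` call shown in Step 3.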
How does the pricing compare for high-volume image gen?
Railwail's per-call pricing is typically 5-15% lower than Replicate's per-second compute pricing at default settings, because we batch GPU utilisation across customers. Bring an existing Replicate cost dashboard and we can show you a side-by-side estimate.
Next Steps
- Create your Railwail account at railwail.com
- Generate an API key in Dashboard → API Keys
- Replace the replicate package with the openai SDK
- Switch from polling to synchronous calls
- Read the full API reference at railwail.com/docs
- Browse all Replicate-mirrored models at railwail.com/models
- Compare per-call pricing at railwail.com/pricing