Migrate from Replicate to Railwail

TL;DR — Switch in Under 15 Minutes

Railwail mirrors the most popular Replicate models — Flux, SDXL, Stable Diffusion 3, Whisper, Bark, MusicGen, WAN-2.5, Veo-3, AnimateDiff
Synchronous calls for images and audio: no Replicate-style create-then-poll predictions loop (video jobs are polled server-side for you)
Standard OpenAI-shaped API for images (/v1/images/generations), audio (/v1/audio/*), and video (/v1/videos)
Per-call pricing, not per-second-of-compute — easier cost modelling
EU hosting, EUR billing, and 275+ models behind one API key

Why Move Off Replicate?

Replicate's strength is the long tail of open-source models. Its weakness is the developer experience: async predictions that you must poll, per-second-of-GPU pricing that is hard to budget, limited LLM coverage (most text generation has migrated to other vendors), and cold-start latency on rarely used models. Railwail mirrors the highest-demand Replicate models behind always-warm endpoints with predictable per-call pricing.

Common reasons to migrate: simpler synchronous calls, EUR billing, no cold starts on Flux or SDXL, and the ability to mix open-source image generation with frontier LLMs (Claude, GPT-4o) under one API key.

Step 1 — Get a Railwail API Key

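Create an account at railwail.com, then generate an API key in Dashboard → API Keys. Store it in the RAILWAIL_API_KEY environment variable, which the examples below read.

The same key reaches every model (GPT-4o, Claude, Gemini, Llama, Flux, DALL-E and more) through a single OpenAI-compatible endpoint, so there is no need to juggle multiple providers.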

Step 2 — Replace the Replicate Async Pattern

Replicate's API requires creating a prediction, polling for completion, then fetching the output URL. Railwail uses synchronous calls — the response includes the result directly.
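
For reference, the create-then-poll loop that this step removes can be sketched as follows. This is a minimal illustration, not Replicate's client code: `wait_for_prediction` and the injected `get_prediction` callable are hypothetical names, and the terminal status values are an assumption based on Replicate's HTTP API documentation that should be checked against the current docs.

```python
import time
from typing import Callable, Dict

# Assumed terminal states of a Replicate prediction (verify against current docs).
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    get_prediction: Callable[[str], Dict],
    prediction_id: str,
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the prediction reaches a terminal status or the timeout elapses.

    `get_prediction` is any callable that fetches the prediction by ID,
    for example a thin wrapper around GET /v1/predictions/{id}.
    """
    waited = 0.0
    while True:
        prediction = get_prediction(prediction_id)
        if prediction.get("status") in TERMINAL_STATUSES:
            return prediction
        if waited >= timeout_s:
            raise TimeoutError(f"prediction {prediction_id} not finished after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

With a synchronous endpoint this whole helper disappears: the single request returns the finished result.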

Image Generation — TypeScript

Before (Replicate):

    import Replicate from "replicate";

    const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

    const output = await replicate.run(
      "black-forest-labs/flux-1.1-pro",
      { input: { prompt: "a cyberpunk city at sunset", aspect_ratio: "16:9" } }
    );
    // output is an array of URL strings

After (Railwail):

    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.RAILWAIL_API_KEY,
      baseURL: "https://api.railwail.com/v1",
    });

    const res = await client.images.generate({
      model: "flux-1.1-pro",
      prompt: "a cyberpunk city at sunset",
      size: "1792x1024",
      n: 1,
    });
    console.log(res.data[0].url);

Image Generation — Python

    import os

    from openai import OpenAI

    # Point the standard OpenAI client at the Railwail gateway.
    client = OpenAI(
        api_key=os.environ["RAILWAIL_API_KEY"],
        base_url="https://api.railwail.com/v1",
    )

    resp = client.images.generate(
        model="flux-1.1-pro",
        prompt="a cyberpunk city at sunset",
        size="1792x1024",
    )
    print(resp.data[0].url)

Image Generation — cURL

    curl https://api.railwail.com/v1/images/generations \
      -H "Authorization: Bearer $RAILWAIL_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "flux-1.1-pro",
        "prompt": "a cyberpunk city at sunset",
        "size": "1792x1024"
      }'

Audio Transcription

    import fs from "node:fs";

    // Reuses the `client` created in the TypeScript image example above.
    const res = await client.audio.transcriptions.create({
      model: "whisper-large-v3",
      file: fs.createReadStream("./meeting.mp3"),
    });
    console.log(res.text);

Step 3 — Video Generation

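Railwail exposes Replicate-mirrored video models (Veo-3, WAN-2.5, AnimateDiff) under POST /v1/videos. Video generation takes roughly 30 seconds to 2 minutes, so these jobs still run asynchronously on the server. Railwail polls on your behalf and returns a single completed response when you await the call, so no client-side polling is needed. Because the request stays open for the whole generation, set your HTTP client timeout comfortably above the expected generation time.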
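
A blocking video call might look like the sketch below, using only the Python standard library. The request body fields (`model`, `prompt`) and the helper names are assumptions made for illustration; the source only specifies the POST /v1/videos path, so check the API reference for the actual schema.

```python
import json
import urllib.request

RAILWAIL_BASE_URL = "https://api.railwail.com/v1"  # base URL used in the examples above


def build_video_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the POST /v1/videos request. Body fields are assumed, not documented here."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{RAILWAIL_BASE_URL}/videos",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def generate_video(api_key: str, model: str, prompt: str, timeout_s: float = 300.0) -> dict:
    """Send the request and block until the completed response arrives.

    The timeout is deliberately generous because generation can take minutes.
    """
    request = build_video_request(api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Splitting request construction from the network call keeps the payload easy to inspect and unit-test without hitting the API.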

API Endpoint Mapping

Replicate endpoint → Railwail equivalent

Replicate	Railwail equivalent	Notes
POST /v1/predictions	POST /v1/images/generations	For image models
POST /v1/predictions (audio)	POST /v1/audio/transcriptions or /v1/audio/speech	Whisper, Bark
POST /v1/predictions (video)	POST /v1/videos	Synchronous wait, no polling
GET /v1/predictions/{id}	Not needed — sync response	Railwail blocks until complete
GET /v1/models	GET /v1/models	Filter by provider=replicate-mirror
POST /v1/predictions (LLM)	POST /v1/chat/completions	Llama, Mixtral, DeepSeek via standard chat API
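
The mapping above can be captured in a small lookup so that calling code picks the endpoint by workload type. This is a sketch: the category labels and function names are hypothetical, while the paths and the `provider=replicate-mirror` filter are copied from the table.

```python
from urllib.parse import urlencode

RAILWAIL_BASE_URL = "https://api.railwail.com"

# Category labels are illustrative; the paths come from the mapping table.
ENDPOINT_BY_CATEGORY = {
    "image": "/v1/images/generations",
    "audio-stt": "/v1/audio/transcriptions",
    "audio-tts": "/v1/audio/speech",
    "video": "/v1/videos",
    "text": "/v1/chat/completions",
}


def endpoint_url(category: str) -> str:
    """Return the full Railwail URL for a workload category."""
    try:
        return RAILWAIL_BASE_URL + ENDPOINT_BY_CATEGORY[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None


def mirrored_models_url() -> str:
    """URL that lists only the Replicate-mirrored models."""
    query = urlencode({"provider": "replicate-mirror"})
    return f"{RAILWAIL_BASE_URL}/v1/models?{query}"
```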

Popular Replicate Model Mapping

Replicate model → Railwail model ID

Replicate identifier	Railwail model	Category
black-forest-labs/flux-1.1-pro	flux-1.1-pro	Image
black-forest-labs/flux-schnell	flux-schnell	Image
stability-ai/stable-diffusion-3.5-large	stable-diffusion-3.5-large	Image
stability-ai/sdxl	sdxl	Image
openai/whisper	whisper-large-v3	Audio STT
suno-ai/bark	bark	Audio TTS
meta/musicgen	musicgen-large	Music
lucataco/animate-diff	animate-diff	Video
meta/llama-3.1-405b-instruct	llama-3.1-405b-instruct	Text
mistralai/mixtral-8x7b-instruct-v0.1	mixtral-8x7b-instruct	Text
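
During a migration it can help to translate model references in one place instead of editing every call site. The sketch below encodes the table above as a dictionary; the function name is hypothetical, and the handling of a pinned `owner/name:version` suffix is an assumption about how existing Replicate references are written in your code.

```python
from typing import Optional

# Replicate identifier -> Railwail model ID, copied from the table above.
REPLICATE_TO_RAILWAIL = {
    "black-forest-labs/flux-1.1-pro": "flux-1.1-pro",
    "black-forest-labs/flux-schnell": "flux-schnell",
    "stability-ai/stable-diffusion-3.5-large": "stable-diffusion-3.5-large",
    "stability-ai/sdxl": "sdxl",
    "openai/whisper": "whisper-large-v3",
    "suno-ai/bark": "bark",
    "meta/musicgen": "musicgen-large",
    "lucataco/animate-diff": "animate-diff",
    "meta/llama-3.1-405b-instruct": "llama-3.1-405b-instruct",
    "mistralai/mixtral-8x7b-instruct-v0.1": "mixtral-8x7b-instruct",
}


def to_railwail_model(replicate_ref: str) -> Optional[str]:
    """Translate a Replicate reference to a Railwail model ID.

    Returns None when the model is not in the table, so the caller can
    keep routing that model to Replicate.
    """
    # Drop an optional pinned version suffix ("owner/name:version").
    name = replicate_ref.split(":", 1)[0].strip().lower()
    return REPLICATE_TO_RAILWAIL.get(name)
```

A `None` result is a natural signal to leave that call on Replicate while the rest of the traffic moves over.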

Pricing Comparison

Replicate (per-second compute) vs Railwail (per-call)

Model	Replicate (USD)	Railwail (EUR)	Notes
flux-1.1-pro per image (1024x1024)	~$0.04 (avg)	EUR 0.037	Predictable per-call
flux-schnell per image	~$0.003	EUR 0.0028	Identical
sdxl per image	~$0.0035	EUR 0.0032	Per call
whisper-large-v3 per minute audio	$0.0036	EUR 0.0033	Identical
bark per minute speech	$0.045	EUR 0.042	Identical
wan-2.5 per 5s video	~$0.30	EUR 0.28	Predictable
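
To turn the table into a monthly budget, multiply the unit price by expected volume and convert to one currency. The sketch below does that arithmetic; the function name is hypothetical, the Replicate figures marked "~" in the table are averages rather than fixed prices, and the exchange rate is an input you must supply because the source does not state one.

```python
# (Replicate USD, Railwail EUR) per unit, copied from the pricing table above.
PRICES = {
    "flux-1.1-pro": (0.04, 0.037),
    "flux-schnell": (0.003, 0.0028),
    "sdxl": (0.0035, 0.0032),
    "whisper-large-v3": (0.0036, 0.0033),
    "bark": (0.045, 0.042),
    "wan-2.5": (0.30, 0.28),
}


def compare_monthly_cost(model: str, units_per_month: int, usd_per_eur: float) -> dict:
    """Estimate monthly spend on both providers, expressed in EUR.

    `units_per_month` counts images, audio minutes, or 5-second clips,
    depending on the model. `usd_per_eur` is the caller's exchange rate.
    """
    if usd_per_eur <= 0:
        raise ValueError("usd_per_eur must be positive")
    replicate_usd, railwail_eur = PRICES[model]
    replicate_total = replicate_usd * units_per_month / usd_per_eur
    railwail_total = railwail_eur * units_per_month
    return {
        "replicate_eur": round(replicate_total, 2),
        "railwail_eur": round(railwail_total, 2),
        "saving_eur": round(replicate_total - railwail_total, 2),
    }
```

For example, `compare_monthly_cost("flux-1.1-pro", 100_000, usd_per_eur=1.0)` uses a placeholder rate of 1.0 and reports about 4000 EUR on Replicate against 3700 EUR on Railwail; substitute the current rate for a real estimate.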

Why Railwail Over Replicate

Synchronous responses — no polling, no webhook plumbing
Always-warm endpoints — no 30s cold starts on popular models
Per-call pricing, not per-second compute — clean cost modelling
EUR billing with VAT receipts
EU-hosted gateway
Same key unlocks Claude, GPT-4o, Gemini for the rest of your AI stack
Built-in playground at railwail.com/models — preview Flux/SDXL prompts in-browser

FAQ

What if Railwail does not have the specific Replicate model I use?

Railwail mirrors the 100 most popular Replicate models by usage. If yours is missing, submit a request at railwail.com/models/submit; new mirrors are typically added within 48 hours.

Do I lose access to community-built Replicate models?

Long-tail community models that are not mirrored stay on Replicate. You can use both APIs side-by-side — Railwail for production high-volume models, Replicate for experiments.

Can I get the raw image bytes instead of a URL?

Yes. Set response_format: "b64_json" on the request to receive base64-encoded image data inline, matching OpenAI's DALL-E response shape.
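
Once the response carries base64 data, decoding it into bytes takes a few lines. This is a minimal sketch with hypothetical helper names; it assumes the base64 string has already been read from the response.

```python
import base64
from pathlib import Path


def decode_b64_image(b64_json: str) -> bytes:
    """Decode the base64 string from a b64_json response into raw image bytes."""
    # validate=True rejects characters outside the base64 alphabet instead of skipping them.
    return base64.b64decode(b64_json, validate=True)


def save_b64_image(b64_json: str, path: str) -> int:
    """Write the decoded image to disk and return the number of bytes written."""
    data = decode_b64_image(b64_json)
    Path(path).write_bytes(data)
    return len(data)
```

With the OpenAI Python SDK the encoded string is available as `resp.data[0].b64_json`, so a call such as `save_b64_image(resp.data[0].b64_json, "city.png")` stores the image locally.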

Does the Flux LoRA / fine-tuning workflow work?

Pass a public LoRA URL in the lora_url parameter for Flux models. For training new LoRAs, use Replicate's training pipeline — the resulting weights can be hosted on Hugging Face and called via Railwail.
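
A request body carrying a LoRA reference could be assembled as in the sketch below. The `lora_url` field name comes from the answer above; the helper name, the URL check, and the other body fields are assumptions for illustration.

```python
from urllib.parse import urlparse


def flux_lora_payload(model: str, prompt: str, lora_url: str, size: str = "1024x1024") -> dict:
    """Build an image-generation body that references a publicly hosted LoRA."""
    parsed = urlparse(lora_url)
    # The weights must be reachable by the gateway, so reject local or malformed URLs early.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("lora_url must be a public http(s) URL")
    return {
        "model": model,
        "prompt": prompt,
        "size": size,
        "lora_url": lora_url,
    }
```

If you use the OpenAI Python SDK rather than raw HTTP, non-standard fields like this one can be passed through its `extra_body` argument, for example `client.images.generate(..., extra_body={"lora_url": url})`.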

What about Replicate's webhook on completion?

Because Railwail responses are synchronous, you do not need webhooks for short jobs. For long video jobs, you can pass a webhook_url in the request body.

How does the pricing compare for high-volume image gen?

Railwail's per-call pricing is typically 5-15% lower than Replicate's per-second compute pricing at default settings, because we batch GPU utilisation across customers. Bring an existing Replicate cost dashboard and we can show you a side-by-side estimate.

Next Steps

Create your Railwail account at railwail.com
Generate an API key in Dashboard → API Keys
Replace the replicate package with the openai SDK
Switch from polling to synchronous calls
Read the full API reference at railwail.com/docs
Browse all Replicate-mirrored models at railwail.com/models
Compare per-call pricing at railwail.com/pricing