TL;DR: Switch in Under 5 Minutes
- DeepInfra is already OpenAI-compatible, so the switch is a baseURL change plus new model IDs
- All major open models mirrored: Llama 3.3, Mixtral, Qwen 2.5, DeepSeek V3 / R1
- Image, audio, embeddings: all supported
- EU-hosted endpoint, EUR billing
- Plus closed-source Claude, GPT-4o, Gemini behind the same key
Why Move Off DeepInfra?
DeepInfra is competitive on price for hosted open-source models. The trade-offs: US-default routing, USD-only billing, no closed-source frontier models, and a smaller image / audio catalog than Railwail. Migrating is essentially a baseURL change plus a switch to Railwail's shorter model IDs.
Step 1: Get a Railwail API Key
Sign up at railwail.com and generate a key.
Step 2: Change Base URL
TypeScript / JavaScript
Before (DeepInfra):
import OpenAI from "openai";

const di = new OpenAI({
  apiKey: process.env.DEEPINFRA_API_KEY,
  baseURL: "https://api.deepinfra.com/v1/openai",
});

const res = await di.chat.completions.create({
  model: "meta-llama/Meta-Llama-3.3-70B-Instruct",
  messages: [{ role: "user", content: "Hello" }],
});
After (Railwail):
import OpenAI from "openai";

// Only the API key, the base URL, and the model ID change.
const client = new OpenAI({
  apiKey: process.env.RAILWAIL_API_KEY,
  baseURL: "https://api.railwail.com/v1",
});

const res = await client.chat.completions.create({
  model: "llama-3.3-70b-instruct", // Railwail short slug
  messages: [{ role: "user", content: "Hello" }],
});
Python
import os

from openai import OpenAI

# Read the key from the environment rather than hard-coding it.
client = OpenAI(
    api_key=os.environ["RAILWAIL_API_KEY"],
    base_url="https://api.railwail.com/v1",
)

resp = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
cURL
curl https://api.railwail.com/v1/chat/completions \
  -H "Authorization: Bearer $RAILWAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
API Endpoint Mapping
DeepInfra endpoint → Railwail equivalent
| DeepInfra | Railwail | Notes |
|---|---|---|
| POST /v1/openai/chat/completions | POST /v1/chat/completions | Identical |
| POST /v1/openai/embeddings | POST /v1/embeddings | Identical |
| POST /v1/inference/{model} | Use OpenAI-shaped endpoints | Legacy DeepInfra path |
| POST /v1/openai/audio/transcriptions | POST /v1/audio/transcriptions | Whisper |
| POST /v1/openai/images/generations | POST /v1/images/generations | Flux, SDXL |
| GET /v1/openai/models | GET /v1/models | 275+ models |
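For the OpenAI-shaped rows, the table above amounts to a prefix rewrite from /v1/openai/ to /v1/. The sketch below shows one way to apply that when request paths are built in code. The function name `to_railwail_path` is hypothetical, and the legacy /v1/inference/{model} path is rejected rather than translated because the table gives no one-to-one equivalent for it.

```python
# Sketch: rewrite a DeepInfra OpenAI-compatible path to the Railwail path
# listed in the endpoint table. Names here are illustrative, not an official API.

DEEPINFRA_OPENAI_PREFIX = "/v1/openai/"
DEEPINFRA_LEGACY_PREFIX = "/v1/inference/"
RAILWAIL_PREFIX = "/v1/"


def to_railwail_path(deepinfra_path: str) -> str:
    """Map a DeepInfra request path to its Railwail equivalent."""
    if deepinfra_path.startswith(DEEPINFRA_LEGACY_PREFIX):
        # The table says to move legacy calls to the OpenAI-shaped endpoints,
        # which needs a manual change to the request, not a path rewrite.
        raise ValueError("legacy inference path: use an OpenAI-shaped endpoint")
    if not deepinfra_path.startswith(DEEPINFRA_OPENAI_PREFIX):
        raise ValueError(f"not a DeepInfra OpenAI-compatible path: {deepinfra_path}")
    # Drop the "openai/" segment and keep the rest of the path unchanged.
    return RAILWAIL_PREFIX + deepinfra_path[len(DEEPINFRA_OPENAI_PREFIX):]
```

Failing loudly on the legacy path is a deliberate choice here: those calls use a different request shape, so silently rewriting the URL would hide work that still has to be done by hand.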
Model Mapping
DeepInfra model → Railwail
| DeepInfra | Railwail | Notes |
|---|---|---|
| meta-llama/Meta-Llama-3.3-70B-Instruct | llama-3.3-70b-instruct | Llama 3.3 |
| meta-llama/Meta-Llama-3.1-405B-Instruct | llama-3.1-405b-instruct | Largest |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | mixtral-8x7b-instruct | MoE |
| Qwen/Qwen2.5-72B-Instruct | qwen2.5-72b-instruct | Alibaba |
| deepseek-ai/DeepSeek-V3 | deepseek-v3 | Frontier MoE |
| deepseek-ai/DeepSeek-R1 | deepseek-r1 | Reasoning |
| black-forest-labs/FLUX-1-schnell | flux-schnell | Image |
| stabilityai/sd3.5-large | stable-diffusion-3.5-large | Image |
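If model IDs are scattered through a codebase, a lookup table can translate them in one place. This is a minimal sketch that copies the eight rows above into a dictionary; the names `MODEL_ID_MAP` and `to_railwail_model` are hypothetical, and any model not in the table raises instead of guessing a slug.

```python
# Sketch: translate DeepInfra model IDs to Railwail slugs using only the
# rows from the model mapping table. Extend the dict as needed.

MODEL_ID_MAP = {
    "meta-llama/Meta-Llama-3.3-70B-Instruct": "llama-3.3-70b-instruct",
    "meta-llama/Meta-Llama-3.1-405B-Instruct": "llama-3.1-405b-instruct",
    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral-8x7b-instruct",
    "Qwen/Qwen2.5-72B-Instruct": "qwen2.5-72b-instruct",
    "deepseek-ai/DeepSeek-V3": "deepseek-v3",
    "deepseek-ai/DeepSeek-R1": "deepseek-r1",
    "black-forest-labs/FLUX-1-schnell": "flux-schnell",
    "stabilityai/sd3.5-large": "stable-diffusion-3.5-large",
}


def to_railwail_model(deepinfra_id: str) -> str:
    """Return the Railwail slug for a DeepInfra model ID, or raise KeyError."""
    try:
        return MODEL_ID_MAP[deepinfra_id]
    except KeyError:
        # Do not invent a slug for unlisted models; surface the gap instead.
        raise KeyError(f"no Railwail slug known for {deepinfra_id!r}") from None
```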
Pricing Comparison (per 1M tokens, May 2026)
Same model, Railwail in EUR
| Model | DeepInfra (USD) | Railwail (EUR) | Notes |
|---|---|---|---|
| llama-3.3-70b-instruct | $0.40 / $0.40 | EUR 0.37 / 0.37 | Input/output |
| llama-3.1-405b-instruct | $2.50 / $3.50 | EUR 2.30 / 3.22 | Input/output |
| mixtral-8x7b-instruct | $0.24 / $0.24 | EUR 0.22 / 0.22 | Input/output (same rate) |
| deepseek-v3 | $0.49 / $0.89 | EUR 0.45 / 0.82 | Input/output |
| flux-schnell | $0.003 | EUR 0.0028 | Per image, not per 1M tokens |
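To turn the per-1M-token rates into a budget figure, multiply each rate by the token count and divide by one million. The sketch below does that for the Railwail column of the table above; the function name and dictionary are illustrative, the rates are copied from the table and may change, and `Decimal` is used so small amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# EUR per 1M tokens as (input, output), copied from the Railwail column of
# the pricing table. flux-schnell is priced per image, so it is left out.
RAILWAIL_EUR_PER_MILLION = {
    "llama-3.3-70b-instruct": (Decimal("0.37"), Decimal("0.37")),
    "llama-3.1-405b-instruct": (Decimal("2.30"), Decimal("3.22")),
    "mixtral-8x7b-instruct": (Decimal("0.22"), Decimal("0.22")),
    "deepseek-v3": (Decimal("0.45"), Decimal("0.82")),
}

MILLION = Decimal(1_000_000)


def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the EUR cost of a workload from the table rates."""
    input_rate, output_rate = RAILWAIL_EUR_PER_MILLION[model]
    return (input_rate * input_tokens + output_rate * output_tokens) / MILLION
```

For example, 2M input tokens and 1M output tokens on deepseek-v3 come to 2 × 0.45 + 1 × 0.82 = EUR 1.72.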
Why Railwail Over DeepInfra
- EU billing in EUR with VAT receipts
- Frankfurt-region hosting for low EU latency
- Same OpenAI-compatible API
- Adds closed-source models (Claude, GPT-4o, Gemini)
- Built-in playground at railwail.com/models
- Per-key rate limits and spend caps
FAQ
What about DeepInfra's deploy-your-own-model feature?
Custom model deployments are not yet supported on Railwail. Keep them on DeepInfra and use Railwail for the standard catalog.
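One way to run both providers side by side is to choose the base URL per model. This is a sketch only: the set `CUSTOM_DEPLOYMENTS` and its entry are hypothetical placeholders for your own deployment names, and it assumes those deployments are reachable through DeepInfra's OpenAI-compatible base URL shown earlier.

```python
# Sketch: send custom deployments to DeepInfra and everything else to Railwail.

RAILWAIL_BASE_URL = "https://api.railwail.com/v1"
DEEPINFRA_BASE_URL = "https://api.deepinfra.com/v1/openai"

# Hypothetical placeholder: replace with the names of your own deployments.
CUSTOM_DEPLOYMENTS = {"my-org/my-custom-model"}


def provider_for(model: str) -> dict:
    """Return the base URL and API key variable to use for a model."""
    if model in CUSTOM_DEPLOYMENTS:
        return {"base_url": DEEPINFRA_BASE_URL, "api_key_env": "DEEPINFRA_API_KEY"}
    # Standard catalog models go to Railwail.
    return {"base_url": RAILWAIL_BASE_URL, "api_key_env": "RAILWAIL_API_KEY"}
```

The returned values can be passed to the OpenAI client constructor as `base_url` and, after an environment lookup, `api_key`.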
Does the embedding API work the same?
Yes. POST /v1/embeddings accepts identical request shapes for BAAI/bge, Sentence Transformers and Cohere Embed models.
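The request and response shapes in question are the standard OpenAI embeddings ones: a body with `model` and `input`, and a response whose `data` list holds one `embedding` per input along with its `index`. The helpers below are an illustrative sketch of both sides; their names are hypothetical, and the model slug is left as a parameter because this guide does not list Railwail's embedding slugs.

```python
import json


def build_embeddings_body(model: str, texts: list[str]) -> bytes:
    """Encode an OpenAI-shaped embeddings request body for POST /v1/embeddings."""
    return json.dumps({"model": model, "input": texts}).encode("utf-8")


def extract_vectors(response: dict) -> list[list[float]]:
    """Return the embedding vectors from a parsed response, in input order."""
    # Sort by the index field so vectors line up with the original inputs.
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```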
What is the throughput per request?
Comparable to DeepInfra for the same model. Railwail uses vLLM with continuous batching on EU hardware.
Are streaming responses supported?
Yes. With stream: true, responses arrive as standard OpenAI server-sent event (SSE) chunks.
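If you consume the stream without the OpenAI SDK, each event is a line of the form `data: {json}`, and the stream ends with `data: [DONE]`. The sketch below parses such lines and yields the text deltas; the function name is hypothetical, and it assumes the lines have already been read from the HTTP response and decoded to strings.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style chat completion SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The first chunk often carries only the role, so content may be absent.
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full assistant message.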
What about quantised model variants?
When available, FP8 and int4 variants are exposed by appending a -fp8 or -int4 suffix to the model slug.
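The suffix rule can be captured in a small helper. This sketch only builds the slug string; the function name is hypothetical, and whether a given variant actually exists is not stated here, so confirm against GET /v1/models before relying on it.

```python
# Sketch: build a quantised model slug by appending the documented suffix.

QUANT_VARIANTS = ("fp8", "int4")


def quantised_slug(base_slug: str, variant: str) -> str:
    """Append -fp8 or -int4 to a base model slug."""
    if variant not in QUANT_VARIANTS:
        raise ValueError(f"unsupported variant {variant!r}; expected one of {QUANT_VARIANTS}")
    return f"{base_slug}-{variant}"
```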
Next Steps
- Sign up at railwail.com
- Generate an API key
- Update baseURL to https://api.railwail.com/v1
- Switch model IDs to Railwail short slugs
- Read the reference at railwail.com/docs
- Compare pricing at railwail.com/pricing