TL;DR: Switch in Under 5 Minutes
- Both APIs are OpenAI-compatible: change only the base URL and API key
- All major Together models mirrored: Llama 3.3, Mixtral, DeepSeek V3, Qwen 2.5
- EU-hosted endpoint, EUR billing: clean for European entities
- Plus access to GPT-4o, Claude, Gemini, Flux through the same key
- Comparable per-token pricing on the open-source models
Why Move Off Together AI?
Together AI is great for hosted open-source LLMs at competitive prices. The trade-offs: US-hosted by default, USD-only billing, and the catalog is limited to open-source models. If you want Claude or GPT-4o in the same codebase, you still need separate accounts. Railwail unifies all of these under one API, one EUR invoice, and EU residency.
Step 1: Get a Railwail API Key
Sign up at railwail.com and create a key in Dashboard → API Keys.
Step 2: Change the Base URL
TypeScript / JavaScript
Before (Together AI):
import OpenAI from "openai";

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,
  baseURL: "https://api.together.xyz/v1",
});

const res = await together.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  messages: [{ role: "user", content: "Hello" }],
});
After (Railwail):
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.RAILWAIL_API_KEY,
  baseURL: "https://api.railwail.com/v1",
});

const res = await client.chat.completions.create({
  model: "llama-3.3-70b-instruct",
  messages: [{ role: "user", content: "Hello" }],
});
Python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RAILWAIL_API_KEY"],
    base_url="https://api.railwail.com/v1",
)

resp = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
cURL
curl https://api.railwail.com/v1/chat/completions \
  -H "Authorization: Bearer $RAILWAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Step 3: Update Model IDs
Together AI uses fully-qualified, Hugging Face-style model IDs such as meta-llama/Llama-3.3-70B-Instruct-Turbo. Railwail normalises these to short slugs (llama-3.3-70b-instruct). The long IDs are also accepted as aliases for compatibility, so your existing Together strings will work, but the short forms are recommended.
API Endpoint Mapping
Together AI endpoint → Railwail equivalent
| Together AI | Railwail | Notes |
|---|---|---|
| POST /v1/chat/completions | POST /v1/chat/completions | Identical |
| POST /v1/completions | POST /v1/completions | Legacy completions supported |
| POST /v1/embeddings | POST /v1/embeddings | Identical |
| POST /v1/images/generations | POST /v1/images/generations | Flux, SDXL |
| POST /v1/audio/transcriptions | POST /v1/audio/transcriptions | Whisper |
| GET /v1/models | GET /v1/models | 275+ models, filter by provider |
| POST /v1/rerank | POST /v1/rerank | Cohere-compatible rerank API |
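Because the paths in the table are the same on both sides, migrating a hard-coded URL comes down to swapping the host. The following is a minimal sketch, assuming the two base URLs shown in Step 2; the helper name `to_railwail_url` is hypothetical and not part of either provider's SDK.

```python
from urllib.parse import urlsplit, urlunsplit

TOGETHER_HOST = "api.together.xyz"
RAILWAIL_HOST = "api.railwail.com"  # host taken from the base URL in Step 2


def to_railwail_url(url: str) -> str:
    """Swap the Together AI host for the Railwail host, keeping path and query."""
    parts = urlsplit(url)
    if parts.netloc != TOGETHER_HOST:
        return url  # not a Together AI URL, so leave it untouched
    return urlunsplit(parts._replace(netloc=RAILWAIL_HOST))


print(to_railwail_url("https://api.together.xyz/v1/embeddings"))
```

URLs that do not point at Together AI are returned unchanged, so the helper can be run over a whole config file without touching unrelated endpoints.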
Model Mapping
Together AI model → Railwail model ID
| Together AI | Railwail | Notes |
|---|---|---|
| meta-llama/Llama-3.3-70B-Instruct-Turbo | llama-3.3-70b-instruct | Frontier open LLM |
| meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | llama-3.1-405b-instruct | Largest open |
| meta-llama/Llama-3.1-8B-Instruct-Turbo | llama-3.1-8b-instruct | Small, fast |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | mixtral-8x7b-instruct | MoE |
| mistralai/Mixtral-8x22B-Instruct-v0.1 | mixtral-8x22b-instruct | Larger MoE |
| deepseek-ai/DeepSeek-V3 | deepseek-v3 | Frontier MoE |
| deepseek-ai/DeepSeek-R1 | deepseek-r1 | Reasoning |
| Qwen/Qwen2.5-72B-Instruct-Turbo | qwen2.5-72b-instruct | Alibaba |
| WhereIsAI/UAE-Large-V1 | uae-large-v1 | Embedding |
| black-forest-labs/FLUX.1-schnell | flux-schnell | Image |
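If you would rather switch to the short slugs in one pass, the table above can be expressed as a lookup. This is a sketch only: the dictionary is copied from the table, and `to_railwail_model` is a hypothetical helper name.

```python
# Together AI model ID -> Railwail short slug, copied from the table above.
TOGETHER_TO_RAILWAIL = {
    "meta-llama/Llama-3.3-70B-Instruct-Turbo": "llama-3.3-70b-instruct",
    "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo": "llama-3.1-405b-instruct",
    "meta-llama/Llama-3.1-8B-Instruct-Turbo": "llama-3.1-8b-instruct",
    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral-8x7b-instruct",
    "mistralai/Mixtral-8x22B-Instruct-v0.1": "mixtral-8x22b-instruct",
    "deepseek-ai/DeepSeek-V3": "deepseek-v3",
    "deepseek-ai/DeepSeek-R1": "deepseek-r1",
    "Qwen/Qwen2.5-72B-Instruct-Turbo": "qwen2.5-72b-instruct",
    "WhereIsAI/UAE-Large-V1": "uae-large-v1",
    "black-forest-labs/FLUX.1-schnell": "flux-schnell",
}


def to_railwail_model(model_id: str) -> str:
    """Return the short slug for a known Together AI ID; pass anything else through."""
    return TOGETHER_TO_RAILWAIL.get(model_id, model_id)


print(to_railwail_model("deepseek-ai/DeepSeek-V3"))
```

Unknown IDs are passed through unchanged rather than rejected, which fits the guide's statement that the long Together IDs are accepted as aliases.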
Pricing Comparison (per 1M tokens, May 2026)
Same open-source models, with Railwail priced in EUR. The EUR figures match the USD prices at roughly 0.92 EUR per USD.
| Model | Together AI (USD) | Railwail (EUR) | Notes |
|---|---|---|---|
| llama-3.3-70b-instruct | $0.88 | EUR 0.81 | Identical |
| llama-3.1-405b-instruct | $3.50 | EUR 3.22 | Identical |
| llama-3.1-8b-instruct | $0.18 | EUR 0.17 | Identical |
| mixtral-8x7b-instruct | $0.60 | EUR 0.55 | Identical |
| deepseek-v3 | $1.25 | EUR 1.15 | Identical |
| deepseek-r1 | $3.00 / $7.00 | EUR 2.76 / 6.44 | Input/output |
| qwen2.5-72b-instruct | $1.20 | EUR 1.10 | Identical |
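The EUR column can be sanity-checked against the USD column. The table does not state an exchange rate; the 0.92 EUR per USD used below is an assumption inferred from the listed figures, and `usd_to_eur` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: rate inferred from the table's own figures, not an official quote.
EUR_PER_USD = Decimal("0.92")

# (USD price, EUR price) per 1M tokens, copied from the table above.
LISTED = {
    "llama-3.3-70b-instruct": ("0.88", "0.81"),
    "llama-3.1-405b-instruct": ("3.50", "3.22"),
    "llama-3.1-8b-instruct": ("0.18", "0.17"),
    "mixtral-8x7b-instruct": ("0.60", "0.55"),
    "deepseek-v3": ("1.25", "1.15"),
    "qwen2.5-72b-instruct": ("1.20", "1.10"),
}


def usd_to_eur(usd: Decimal) -> Decimal:
    """Convert a USD price to EUR, rounded half-up to whole cents."""
    return (usd * EUR_PER_USD).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model, (usd, eur) in LISTED.items():
    converted = usd_to_eur(Decimal(usd))
    status = "matches" if converted == Decimal(eur) else "differs from"
    print(f"{model}: ${usd} -> EUR {converted} ({status} the listed EUR {eur})")
```

`Decimal` is used instead of floats so that rounding to cents is exact. Actual invoices depend on the rate applied at billing time, which this sketch does not model.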
Why Railwail Over Together AI
- EU billing in EUR with VAT receipts
- Frankfurt-region hosting for GDPR-compliant logs
- Same OpenAI-compatible API: a drop-in replacement
- Access to Claude, GPT-4o, Gemini, Mistral La Plateforme through the same key
- Built-in playground at railwail.com/models for A/B testing
- Comparable pricing on Together's lineup, plus per-key spend caps and rate limits
FAQ
Are the Turbo variants (FP8) supported?
Yes. Railwail serves the same FP8-quantised Llama 3.x Turbo variants Together AI offers. Specify -turbo in the model slug to opt into FP8.
What about Together's batch inference?
Use Railwail's POST /v1/batches endpoint, which offers the same 50% discount on async workloads.
Does the rerank API work the same?
Yes. POST /v1/rerank accepts the same query/documents/top_n payload.
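As an illustration of that payload, the sketch below builds (but does not send) a rerank request with the standard library. The `query`, `documents` and `top_n` fields come from the answer above; the `model` field and the model name `example-rerank-model` are assumptions, so check railwail.com/docs for the actual values.

```python
import json
import urllib.request

BASE_URL = "https://api.railwail.com/v1"


def build_rerank_request(api_key, model, query, documents, top_n):
    """Build a POST /v1/rerank request without sending it."""
    payload = {
        "model": model,  # assumption: a model field is expected, as in Cohere-style rerank APIs
        "query": query,
        "documents": documents,
        "top_n": top_n,
    }
    return urllib.request.Request(
        f"{BASE_URL}/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_rerank_request(
    api_key="sk-placeholder",          # placeholder, not a real key
    model="example-rerank-model",      # hypothetical model name
    query="What is FP8 quantisation?",
    documents=["FP8 is an 8-bit float format.", "Frankfurt is in Germany."],
    top_n=1,
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs offline.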
Can I bring my own fine-tuned Llama?
Custom fine-tunes are not yet hosted on Railwail. You can keep them on Together AI and use Railwail for everything else.
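One way to run both providers side by side is to pick the endpoint per model. The sketch below assumes you keep a set of your own fine-tune IDs; `FINE_TUNED_MODELS`, `Provider` and `provider_for` are hypothetical names, and the fine-tune ID is a made-up example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    base_url: str
    api_key_env: str  # name of the environment variable holding the key


TOGETHER = Provider("https://api.together.xyz/v1", "TOGETHER_API_KEY")
RAILWAIL = Provider("https://api.railwail.com/v1", "RAILWAIL_API_KEY")

# Hypothetical: IDs of your own fine-tunes that still live on Together AI.
FINE_TUNED_MODELS = frozenset({"my-org/my-llama-finetune"})


def provider_for(model: str) -> Provider:
    """Send custom fine-tunes to Together AI and every other model to Railwail."""
    return TOGETHER if model in FINE_TUNED_MODELS else RAILWAIL


print(provider_for("my-org/my-llama-finetune").base_url)
print(provider_for("llama-3.3-70b-instruct").base_url)
```

The returned `base_url` and the key read from `api_key_env` can then be passed to the OpenAI client exactly as in Step 2.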
How is latency vs Together AI?
First-token latency from EU origins is typically 30 to 80 ms lower via Railwail's Frankfurt edge than via Together's US-default routing.
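Latency depends on where your requests originate, so it is worth measuring from your own servers. The sketch below times how long a stream takes to deliver its first chunk; `time_to_first_chunk` is a hypothetical helper, and `fake_stream` stands in for a real streaming call so the example runs offline.

```python
import time
from typing import Callable, Iterable


def time_to_first_chunk(start_stream: Callable[[], Iterable[object]]) -> float:
    """Return the seconds between calling start_stream() and its first chunk."""
    started = time.perf_counter()
    for _ in start_stream():
        return time.perf_counter() - started
    raise RuntimeError("stream ended without yielding any chunk")


def fake_stream():
    """Stand-in for a streaming response: waits 50 ms, then yields one chunk."""
    time.sleep(0.05)
    yield "Hello"


elapsed = time_to_first_chunk(fake_stream)
print(f"first chunk after {elapsed * 1000:.0f} ms")
```

With the OpenAI SDK, `start_stream` could be a lambda that calls `client.chat.completions.create(..., stream=True)`. Run it several times against each provider and compare medians, since single measurements are noisy.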
Next Steps
- Create your Railwail account at railwail.com
- Generate an API key in Dashboard → API Keys
- Change baseURL to https://api.railwail.com/v1
- Update model IDs to Railwail short slugs (or keep long Together IDs as aliases)
- Read the full reference at railwail.com/docs
- Compare per-token pricing at railwail.com/pricing