Migrate from Together AI to Railwail

TL;DR — Switch in Under 5 Minutes

Both APIs are OpenAI-compatible — change only the base URL and API key
All major Together models mirrored: Llama 3.3, Mixtral, DeepSeek V3, Qwen 2.5
EU-hosted endpoint, EUR billing — clean for European entities
Plus access to GPT-4o, Claude, Gemini, Flux through the same key
Comparable per-token pricing on the open-source models

Why Move Off Together AI?

Together AI is a strong choice for hosted open-source LLMs at competitive prices. The trade-offs are US hosting by default, USD-only billing, and a catalog limited to open-source models. If you want Claude or GPT-4o in the same codebase, you still need separate accounts. Railwail puts all of these behind one API and one EUR invoice, with EU data residency.

Step 1 — Get a Railwail API Key

Create an account at railwail.com, then generate an API key in Dashboard → API Keys. Export it as RAILWAIL_API_KEY so the examples below can read it from the environment.

Step 2 — Change the Base URL

TypeScript / JavaScript

Before (Together AI):

    import OpenAI from "openai";

    const together = new OpenAI({
      apiKey: process.env.TOGETHER_API_KEY,
      baseURL: "https://api.together.xyz/v1",
    });

    const res = await together.chat.completions.create({
      model: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
      messages: [{ role: "user", content: "Hello" }],
    });

After (Railwail):

    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: process.env.RAILWAIL_API_KEY,
      baseURL: "https://api.railwail.com/v1",
    });

    const res = await client.chat.completions.create({
      model: "llama-3.3-70b-instruct",
      messages: [{ role: "user", content: "Hello" }],
    });

Python

    import os

    from openai import OpenAI

    # Only the API key, base URL, and model ID differ from a Together AI setup.
    client = OpenAI(
        api_key=os.environ["RAILWAIL_API_KEY"],
        base_url="https://api.railwail.com/v1",
    )

    resp = client.chat.completions.create(
        model="llama-3.3-70b-instruct",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)

cURL

    curl https://api.railwail.com/v1/chat/completions \
      -H "Authorization: Bearer $RAILWAIL_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
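
Because only the base URL and API key differ, the switch can live in configuration rather than in code. The following is a minimal sketch, assuming a hypothetical LLM_PROVIDER environment variable and a hypothetical provider_settings helper (neither belongs to any SDK); the base URLs and key variable names are the ones used in the examples above.

```python
import os

# Base URLs and key variable names taken from the examples above.
PROVIDERS = {
    "together": {
        "base_url": "https://api.together.xyz/v1",
        "api_key_env": "TOGETHER_API_KEY",
    },
    "railwail": {
        "base_url": "https://api.railwail.com/v1",
        "api_key_env": "RAILWAIL_API_KEY",
    },
}


def provider_settings(name=None, environ=os.environ):
    """Return keyword arguments suitable for OpenAI(**settings).

    LLM_PROVIDER is a hypothetical variable name used for illustration.
    """
    name = (name or environ.get("LLM_PROVIDER", "railwail")).lower()
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider: {name!r}")
    cfg = PROVIDERS[name]
    return {
        "base_url": cfg["base_url"],
        # An empty string is returned when the key variable is unset.
        "api_key": environ.get(cfg["api_key_env"], ""),
    }
```

With the openai package installed, OpenAI(**provider_settings()) builds the client, and rolling back to Together AI during a trial period is a matter of setting LLM_PROVIDER=together.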

Step 3 — Update Model IDs

Together AI uses fully qualified, Hugging Face-style model IDs such as meta-llama/Llama-3.3-70B-Instruct-Turbo. Railwail normalises these to short slugs (llama-3.3-70b-instruct). The long IDs are also accepted as aliases for compatibility, so your existing Together strings will keep working, but the short forms are recommended.

API Endpoint Mapping

Together AI endpoint → Railwail equivalent

Together AI	Railwail	Notes
POST /v1/chat/completions	POST /v1/chat/completions	Identical
POST /v1/completions	POST /v1/completions	Legacy completions supported
POST /v1/embeddings	POST /v1/embeddings	Identical
POST /v1/images/generations	POST /v1/images/generations	Flux, SDXL
POST /v1/audio/transcriptions	POST /v1/audio/transcriptions	Whisper
GET /v1/models	GET /v1/models	275+ models, filter by provider
POST /v1/rerank	POST /v1/rerank	Cohere-compatible rerank API

Model Mapping

Together AI model → Railwail model ID

Together AI	Railwail	Notes
meta-llama/Llama-3.3-70B-Instruct-Turbo	llama-3.3-70b-instruct	Frontier open LLM
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo	llama-3.1-405b-instruct	Largest open
meta-llama/Llama-3.1-8B-Instruct-Turbo	llama-3.1-8b-instruct	Small, fast
mistralai/Mixtral-8x7B-Instruct-v0.1	mixtral-8x7b-instruct	MoE
mistralai/Mixtral-8x22B-Instruct-v0.1	mixtral-8x22b-instruct	Larger MoE
deepseek-ai/DeepSeek-V3	deepseek-v3	Frontier MoE
deepseek-ai/DeepSeek-R1	deepseek-r1	Reasoning
Qwen/Qwen2.5-72B-Instruct-Turbo	qwen2.5-72b-instruct	Alibaba
WhereIsAI/UAE-Large-V1	uae-large-v1	Embedding
black-forest-labs/FLUX.1-schnell	flux-schnell	Image
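
If model IDs are scattered through a codebase, the table above can be applied in one place. This is a sketch only: TOGETHER_TO_RAILWAIL and to_railwail_slug are hypothetical names, and the entries are copied from the table rather than fetched from either API.

```python
# Together AI model ID -> Railwail short slug, copied from the table above.
TOGETHER_TO_RAILWAIL = {
    "meta-llama/Llama-3.3-70B-Instruct-Turbo": "llama-3.3-70b-instruct",
    "meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo": "llama-3.1-405b-instruct",
    "meta-llama/Llama-3.1-8B-Instruct-Turbo": "llama-3.1-8b-instruct",
    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral-8x7b-instruct",
    "mistralai/Mixtral-8x22B-Instruct-v0.1": "mixtral-8x22b-instruct",
    "deepseek-ai/DeepSeek-V3": "deepseek-v3",
    "deepseek-ai/DeepSeek-R1": "deepseek-r1",
    "Qwen/Qwen2.5-72B-Instruct-Turbo": "qwen2.5-72b-instruct",
    "WhereIsAI/UAE-Large-V1": "uae-large-v1",
    "black-forest-labs/FLUX.1-schnell": "flux-schnell",
}


def to_railwail_slug(model_id: str) -> str:
    """Translate a Together AI model ID to its Railwail short slug.

    IDs that are not in the table (including IDs that are already short
    slugs) are returned unchanged.
    """
    return TOGETHER_TO_RAILWAIL.get(model_id, model_id)
```

Passing unknown IDs through unchanged keeps the helper safe to call on strings that are already short slugs.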

Pricing Comparison (per 1M tokens, May 2026)

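Same open-source models, with Railwail priced in EUR. The Railwail figures match Together AI's USD prices converted at roughly 0.92 EUR per USD, which is what "Identical" means in the Notes column.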

Model	Together AI (USD)	Railwail (EUR)	Notes
llama-3.3-70b-instruct	$0.88	EUR 0.81	Identical
llama-3.1-405b-instruct	$3.50	EUR 3.22	Identical
llama-3.1-8b-instruct	$0.18	EUR 0.17	Identical
mixtral-8x7b-instruct	$0.60	EUR 0.55	Identical
deepseek-v3	$1.25	EUR 1.15	Identical
deepseek-r1	$3.00 / $7.00	EUR 2.76 / 6.44	Input/output
qwen2.5-72b-instruct	$1.20	EUR 1.10	Identical
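
The relationship between the two price columns can be checked with a few lines of arithmetic. In this sketch the 0.92 EUR per USD rate is an assumption inferred from the table itself, not a published exchange rate, and usd_to_eur is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: rate inferred from the table above, not a published figure.
EUR_PER_USD = Decimal("0.92")

# (model, Together AI price in USD, Railwail price in EUR) from the table.
PRICES = [
    ("llama-3.3-70b-instruct", "0.88", "0.81"),
    ("llama-3.1-405b-instruct", "3.50", "3.22"),
    ("llama-3.1-8b-instruct", "0.18", "0.17"),
    ("mixtral-8x7b-instruct", "0.60", "0.55"),
    ("deepseek-v3", "1.25", "1.15"),
    ("deepseek-r1 (input)", "3.00", "2.76"),
    ("deepseek-r1 (output)", "7.00", "6.44"),
    ("qwen2.5-72b-instruct", "1.20", "1.10"),
]


def usd_to_eur(usd: Decimal) -> Decimal:
    """Convert a USD price to EUR, rounded to the nearest cent."""
    return (usd * EUR_PER_USD).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Collect any rows whose EUR price is not the converted USD price.
mismatches = [
    model
    for model, usd, eur in PRICES
    if usd_to_eur(Decimal(usd)) != Decimal(eur)
]
print(mismatches)  # prints []
```

Every row converts exactly at that rate, so under this assumption the EUR prices move with the exchange rate and carry no separate markup or discount.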

Why Railwail Over Together AI

EU billing in EUR with VAT receipts
Frankfurt-region hosting for GDPR-compliant logs
Same OpenAI-compatible API — drop-in replacement
Access to Claude, GPT-4o, Gemini, Mistral La Plateforme through the same key
Built-in playground at railwail.com/models for A/B testing
Comparable pricing on Together's lineup, plus per-key spend caps and rate limits

FAQ

Are the Turbo variants (FP8) supported?

Yes. Railwail serves the same FP8-quantised Llama 3.x Turbo variants Together AI offers. Specify -turbo in the model slug to opt into FP8.

What about Together's batch inference?

Use Railwail's POST /v1/batches — same 50% discount on async workloads.

Does the rerank API work the same?

Yes. POST /v1/rerank accepts the same query/documents/top_n payload.
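
As an illustration of that payload, the sketch below builds a rerank request with the standard library but does not send it. build_rerank_request is a hypothetical helper, and the model field is an assumption: pick a rerank model ID from GET /v1/models, since none is named in this guide.

```python
import json
import urllib.request

RAILWAIL_BASE_URL = "https://api.railwail.com/v1"


def build_rerank_request(api_key, model, query, documents, top_n):
    """Build (but do not send) a POST /v1/rerank request."""
    payload = {
        "model": model,  # assumption: a rerank model ID from GET /v1/models
        "query": query,
        "documents": documents,
        "top_n": top_n,
    }
    return urllib.request.Request(
        url=f"{RAILWAIL_BASE_URL}/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is a call to urllib.request.urlopen(request). The response shape is not described in this guide, so inspect the returned JSON before relying on specific fields.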

Can I bring my own fine-tuned Llama?

Custom fine-tunes are not yet hosted on Railwail. You can keep them on Together AI and use Railwail for everything else.

How is latency vs Together AI?

From EU origins, first-token latency is typically 30 to 80 ms lower via Railwail's Frankfurt edge than via Together's US-default routing.
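
Latency depends heavily on where your servers run, so it is worth measuring from your own hosts. The sketch below times how long a streaming call takes to yield its first chunk; time_to_first_chunk is a hypothetical helper, and start_stream is any zero-argument callable that starts a streaming request and returns an iterable of chunks.

```python
import time


def time_to_first_chunk(start_stream, clock=time.perf_counter):
    """Return (seconds until the first chunk arrives, that chunk).

    The timer starts before start_stream() is called, so connection setup
    and request time are included in the measurement.
    """
    start = clock()
    stream = start_stream()
    first = next(iter(stream))  # blocks until the first chunk arrives
    return clock() - start, first
```

With the OpenAI SDK, start_stream could be a lambda wrapping client.chat.completions.create(..., stream=True). The first chunk may carry only role metadata rather than a token, so treat the result as an approximation and compare medians over several runs against each provider.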

Next Steps

Create your Railwail account at railwail.com
Generate an API key in Dashboard → API Keys
Change baseURL to https://api.railwail.com/v1
Update model IDs to Railwail short slugs (or keep long Together IDs as aliases)
Read the full reference at railwail.com/docs
Compare per-token pricing at railwail.com/pricing