Migrate from Anyscale Endpoints to Railwail

TL;DR — Switch in Under 5 Minutes

Anyscale Endpoints was shut down in 2024. Railwail offers a drop-in replacement with the same OpenAI-compatible API
Every major model family that was on Anyscale (Llama 2/3, Mistral, Mixtral, CodeLlama, Zephyr) has a direct or modernised equivalent
Change the API key, baseURL and model slug. That is the entire migration
EU-hosted, EUR billing, 275+ models on one key

Why Anyscale Endpoints Closed and What Comes Next

Anyscale shifted its focus to enterprise Ray Serve and sunset the public Endpoints product. Teams that had built on Anyscale's Llama and Mixtral APIs needed a drop-in replacement. Railwail fills that role: the same OpenAI-compatible schema, current Llama 3.x and Mixtral models, transparent per-token pricing, and EU hosting with EUR billing.

Step 1 — Get a Railwail API Key

Sign up at railwail.com and generate an API key. A single key covers every model behind the OpenAI-compatible endpoint, including GPT-4o, Claude, Gemini, Llama, Flux and DALL-E. The examples below read the key from the RAILWAIL_API_KEY environment variable.

Step 2 — Change Base URL

TypeScript / JavaScript

Before (Anyscale):
    import OpenAI from "openai";

    const anyscale = new OpenAI({
      apiKey: process.env.ANYSCALE_API_KEY,
      baseURL: "https://api.endpoints.anyscale.com/v1",
    });

    const res = await anyscale.chat.completions.create({
      model: "meta-llama/Llama-2-70b-chat-hf",
      messages: [{ role: "user", content: "Hello" }],
    });

After (Railwail):

    import OpenAI from "openai";

    // Only the API key, baseURL and model slug change.
    const client = new OpenAI({
      apiKey: process.env.RAILWAIL_API_KEY,
      baseURL: "https://api.railwail.com/v1",
    });

    const res = await client.chat.completions.create({
      model: "llama-3.3-70b-instruct",
      messages: [{ role: "user", content: "Hello" }],
    });

Python

    import os

    from openai import OpenAI

    # Point the standard OpenAI client at Railwail.
    client = OpenAI(
        api_key=os.environ["RAILWAIL_API_KEY"],
        base_url="https://api.railwail.com/v1",
    )

    resp = client.chat.completions.create(
        model="llama-3.3-70b-instruct",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)

cURL

    curl https://api.railwail.com/v1/chat/completions \
      -H "Authorization: Bearer $RAILWAIL_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'

API Endpoint Mapping

Anyscale endpoint → Railwail equivalent

Anyscale Endpoints	Railwail	Notes
POST /v1/chat/completions	POST /v1/chat/completions	Identical
POST /v1/completions	POST /v1/completions	Legacy
POST /v1/embeddings	POST /v1/embeddings	Identical
POST /v1/fine_tunes	Not yet supported	Custom fine-tuning is on the roadmap
GET /v1/models	GET /v1/models	275+ models
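
Because GET /v1/models is available, one way to sanity-check a migration is to confirm that every slug you plan to call is actually listed. The sketch below is illustrative only: fetch_model_ids and missing_models are hypothetical helper names, and the fake client stands in for a real OpenAI SDK client so the example runs offline.

```python
from types import SimpleNamespace


def fetch_model_ids(client):
    """Return the set of model ids reported by GET /v1/models."""
    # client.models.list() is the OpenAI SDK wrapper around GET /v1/models.
    return {model.id for model in client.models.list()}


def missing_models(available_ids, wanted):
    """Return the wanted slugs that the endpoint does not list, in input order."""
    return [slug for slug in wanted if slug not in available_ids]


# Offline stand-in for an OpenAI client (assumption: for illustration only).
fake_client = SimpleNamespace(
    models=SimpleNamespace(
        list=lambda: [
            SimpleNamespace(id="llama-3.3-70b-instruct"),
            SimpleNamespace(id="mixtral-8x7b-instruct"),
        ]
    )
)

available = fetch_model_ids(fake_client)
print(missing_models(available, ["llama-3.3-70b-instruct", "codestral-2501"]))
```

With a real key, passing an OpenAI client configured with base_url="https://api.railwail.com/v1" in place of fake_client should work the same way, since the helper only relies on client.models.list().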

Model Mapping (Modernised)

Anyscale's catalogue centred on Llama 2-era models. Railwail recommends mapping them to the current Llama 3.x equivalents for better quality at similar or lower cost.

Anyscale model → Railwail modernised equivalent

Anyscale	Railwail (recommended)	Notes
meta-llama/Llama-2-7b-chat-hf	llama-3.1-8b-instruct	Better quality at same size
meta-llama/Llama-2-13b-chat-hf	llama-3.1-8b-instruct or qwen2.5-14b	Smaller (8B) or similar size (14B), better quality
meta-llama/Llama-2-70b-chat-hf	llama-3.3-70b-instruct	Same size, much better
meta-llama/CodeLlama-34b-Instruct-hf	codestral-2501	Better code model
mistralai/Mistral-7B-Instruct-v0.1	mistral-small	Modernised
mistralai/Mixtral-8x7B-Instruct-v0.1	mixtral-8x7b-instruct	Direct
HuggingFaceH4/zephyr-7b-beta	llama-3.1-8b-instruct	Modernised general-purpose
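
If model slugs are scattered across a codebase, the table above can be encoded once as a lookup. This is a minimal sketch: the dictionary copies the table, to_railwail_model is a hypothetical helper name, and for Llama 2 13B it picks llama-3.1-8b-instruct, one of the two options listed.

```python
# Anyscale slug -> recommended Railwail slug, copied from the mapping table.
ANYSCALE_TO_RAILWAIL = {
    "meta-llama/Llama-2-7b-chat-hf": "llama-3.1-8b-instruct",
    # The table also lists qwen2.5-14b as an alternative for this model.
    "meta-llama/Llama-2-13b-chat-hf": "llama-3.1-8b-instruct",
    "meta-llama/Llama-2-70b-chat-hf": "llama-3.3-70b-instruct",
    "meta-llama/CodeLlama-34b-Instruct-hf": "codestral-2501",
    "mistralai/Mistral-7B-Instruct-v0.1": "mistral-small",
    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral-8x7b-instruct",
    "HuggingFaceH4/zephyr-7b-beta": "llama-3.1-8b-instruct",
}


def to_railwail_model(anyscale_slug: str) -> str:
    """Translate an Anyscale model slug, failing loudly on unknown slugs."""
    try:
        return ANYSCALE_TO_RAILWAIL[anyscale_slug]
    except KeyError:
        raise ValueError(
            f"No Railwail mapping for Anyscale model {anyscale_slug!r}"
        ) from None


print(to_railwail_model("meta-llama/Llama-2-70b-chat-hf"))
```

Raising on unknown slugs, rather than passing them through, makes unmapped models surface during testing instead of as API errors in production.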

Pricing Comparison (per 1M tokens, May 2026)

Modernised model on Railwail in EUR

Modernised model	Railwail (EUR)	Anyscale historic (USD)	Notes
llama-3.3-70b-instruct	EUR 0.54 / 0.74	Llama 2 70B was ~$1.00	Cheaper and better
llama-3.1-8b-instruct	EUR 0.046 / 0.074	Llama 2 7B was ~$0.15	Cheaper and better
mixtral-8x7b-instruct	EUR 0.22 / 0.22	~$0.50	Comparable
codestral-2501	EUR 0.30 / 0.90	CodeLlama ~$1.00	Better code
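
To turn the table into a budget figure, a rough estimate can be computed from token counts. This sketch assumes the two Railwail figures in each row are input / output prices per 1M tokens, which the table does not state explicitly; estimate_cost_eur is a hypothetical helper, and the token counts are made-up example values.

```python
# (input, output) EUR per 1M tokens, copied from the pricing table.
# Assumption: the "a / b" pairs mean input price / output price.
PRICES_EUR_PER_1M = {
    "llama-3.3-70b-instruct": (0.54, 0.74),
    "llama-3.1-8b-instruct": (0.046, 0.074),
    "mixtral-8x7b-instruct": (0.22, 0.22),
    "codestral-2501": (0.30, 0.90),
}


def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in EUR for a given number of input and output tokens."""
    input_price, output_price = PRICES_EUR_PER_1M[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    # Round to micro-euros to hide floating-point noise.
    return round(cost, 6)


# Example workload: 2M input tokens and 0.5M output tokens.
cost = estimate_cost_eur("llama-3.3-70b-instruct", 2_000_000, 500_000)
print(f"EUR {cost:.2f}")
```

Check railwail.com/pricing for current figures before relying on an estimate like this.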

Why Railwail Is the Right Successor

Same OpenAI-compatible API: your Anyscale code works after changing the API key, baseURL and model slug
Modernised model lineup: Llama 3.3, Mistral Large and Mixtral 8x22B are available
EU-hosted with EUR billing
Adds Claude, GPT-4o and Gemini, whereas Anyscale offered open-source models only
Built-in playground at railwail.com/models
Per-key rate limits and spend caps

FAQ

Can I still use Llama 2 models if I have to?

Yes. Llama 2 70B chat is available as llama-2-70b-chat for legacy compatibility. Llama 3.3 70B is still the recommended choice: it offers higher quality at a lower price.

What about Anyscale's fine-tuning?

Anyscale offered custom LoRA fine-tuning. Railwail does not currently host custom fine-tunes. Train on Together or Hugging Face TRL and self-host, or use prompt engineering / RAG to achieve similar specialisation.

Are embeddings supported?

Yes. POST /v1/embeddings supports BGE, e5, Cohere Embed, Voyage and OpenAI embeddings.
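
A call through the OpenAI SDK might look like the sketch below. embed_texts is a hypothetical helper, "example-embedding-model" is a placeholder rather than a confirmed Railwail slug (pick a real one from GET /v1/models), and the fake client exists only so the example runs without a key.

```python
from types import SimpleNamespace


def embed_texts(client, texts, model):
    """Return one embedding vector per input text via POST /v1/embeddings."""
    response = client.embeddings.create(model=model, input=list(texts))
    # Each returned item carries an index; sort so vectors follow input order.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]


def _fake_create(model, input):
    # Offline stand-in: a 1-dimensional "embedding" holding the text length,
    # returned in reverse order to show that sorting by index restores it.
    items = [
        SimpleNamespace(index=i, embedding=[float(len(text))])
        for i, text in enumerate(input)
    ]
    return SimpleNamespace(data=list(reversed(items)))


fake_client = SimpleNamespace(embeddings=SimpleNamespace(create=_fake_create))

vectors = embed_texts(fake_client, ["hello", "migration"], model="example-embedding-model")
print(vectors)  # [[5.0], [9.0]]
```

With a real key, pass an OpenAI client created with base_url="https://api.railwail.com/v1" instead of fake_client.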

Does Anyscale's Ray Serve self-hosted setup migrate too?

Self-hosted Ray Serve deployments are not in scope for Railwail (we are a managed inference API, not a deployment platform). If you need self-hosting, look at vLLM Endpoints or BentoML.

Will my existing prompts work?

Mostly, for the mapped models. Llama 3.x behaves differently from Llama 2 by default: it is slightly less refusal-prone, follows instructions better and supports a longer context. Retest prompt quality on your key prompts before switching production traffic.
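
One lightweight way to retest is to run a handful of representative prompts through the new model and apply a simple check to each output. The sketch below is illustrative: retest_prompts is a hypothetical helper, and fake_complete returns canned text in place of a real chat completion call.

```python
def retest_prompts(complete, cases):
    """Run each prompt through `complete` and collect the cases whose check fails.

    complete: callable taking a prompt string and returning the model's text.
    cases: iterable of (prompt, check) pairs, where check(output) returns a bool.
    """
    failures = []
    for prompt, check in cases:
        output = complete(prompt)
        if not check(output):
            failures.append((prompt, output))
    return failures


def fake_complete(prompt):
    # Canned responses standing in for a real model (assumption: demo only).
    canned = {
        "Reply with the single word OK.": "OK",
        "Answer in JSON.": "Sure! Here is the answer: 42",
    }
    return canned[prompt]


cases = [
    ("Reply with the single word OK.", lambda out: out.strip() == "OK"),
    ("Answer in JSON.", lambda out: out.strip().startswith("{")),
]

failures = retest_prompts(fake_complete, cases)
for prompt, output in failures:
    print(f"FAILED: {prompt!r} -> {output!r}")
```

Against Railwail, complete would wrap client.chat.completions.create(...) and return choices[0].message.content for the mapped model.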

Next Steps

Sign up at railwail.com
Generate an API key
Update baseURL to https://api.railwail.com/v1
Map Llama 2 → Llama 3.3 for better quality at lower cost
Read the reference at railwail.com/docs
Compare pricing at railwail.com/pricing