TL;DR: Switch in Under 5 Minutes
- Anyscale Endpoints was shut down in 2024; Railwail offers a direct successor with the same OpenAI-compatible API
- All major models that were on Anyscale (Llama 2/3, Mistral, Mixtral, CodeLlama, Zephyr) are mirrored or mapped to a modern equivalent
- Change baseURL and model slug: that is the entire migration
- EU-hosted, EUR billing, 275+ models on one key
Why Anyscale Endpoints Closed and What Comes Next
Anyscale shifted its focus to enterprise Ray Serve and sunset the public Endpoints product. Many teams that had built on Anyscale's Llama and Mixtral APIs suddenly needed a drop-in replacement. Railwail is the cleanest path: an identical OpenAI-compatible schema, modern Llama 3.x and Mixtral models, transparent pricing, and EU hosting with EUR billing.
Step 1: Get a Railwail API Key
Sign up at railwail.com and generate a key.
Step 2: Change Base URL
TypeScript / JavaScript
Before (Anyscale):
import OpenAI from "openai";

const anyscale = new OpenAI({
  apiKey: process.env.ANYSCALE_API_KEY,
  baseURL: "https://api.endpoints.anyscale.com/v1",
});

const res = await anyscale.chat.completions.create({
  model: "meta-llama/Llama-2-70b-chat-hf",
  messages: [{ role: "user", content: "Hello" }],
});

After (Railwail):
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.RAILWAIL_API_KEY,
  baseURL: "https://api.railwail.com/v1",
});

const res = await client.chat.completions.create({
  model: "llama-3.3-70b-instruct",
  messages: [{ role: "user", content: "Hello" }],
});

Python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RAILWAIL_API_KEY"],
    base_url="https://api.railwail.com/v1",
)
resp = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)

cURL
curl https://api.railwail.com/v1/chat/completions \
  -H "Authorization: Bearer $RAILWAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

API Endpoint Mapping
Anyscale endpoint → Railwail equivalent
| Anyscale Endpoints | Railwail | Notes |
|---|---|---|
| POST /v1/chat/completions | POST /v1/chat/completions | Identical |
| POST /v1/completions | POST /v1/completions | Legacy |
| POST /v1/embeddings | POST /v1/embeddings | Identical |
| POST /v1/fine_tunes | Not yet supported | Custom fine-tuning roadmap |
| GET /v1/models | GET /v1/models | 275+ models |
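As a rough illustration of the table above, the sketch below rewrites an Anyscale Endpoints URL to its Railwail counterpart and rejects the one route that has no equivalent. The helper name `to_railwail_url` and the choice of exceptions are hypothetical; only the hosts and paths come from this guide.

```python
from urllib.parse import urlsplit

ANYSCALE_HOST = "api.endpoints.anyscale.com"
RAILWAIL_BASE = "https://api.railwail.com"

# Paths from the mapping table that carry over unchanged.
SUPPORTED_PATHS = {
    "/v1/chat/completions",
    "/v1/completions",
    "/v1/embeddings",
    "/v1/models",
}
# Paths the table lists as not yet supported on Railwail.
UNSUPPORTED_PATHS = {"/v1/fine_tunes"}


def to_railwail_url(anyscale_url: str) -> str:
    """Return the Railwail URL for an Anyscale Endpoints URL (hypothetical helper)."""
    parts = urlsplit(anyscale_url)
    if parts.hostname != ANYSCALE_HOST:
        raise ValueError(f"not an Anyscale Endpoints URL: {anyscale_url}")
    path = parts.path.rstrip("/")
    if path in UNSUPPORTED_PATHS:
        raise NotImplementedError(f"{path} has no Railwail equivalent yet")
    if path not in SUPPORTED_PATHS:
        raise ValueError(f"path not covered by the mapping table: {path}")
    return RAILWAIL_BASE + path
```

Running a helper like this over the URLs in a codebase is one way to surface any calls to `/v1/fine_tunes` before switching traffic.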
Model Mapping (Modernised)
Anyscale's catalogue centred on Llama 2-era models. Railwail recommends mapping them to modern equivalents, mostly Llama 3.x, for better quality at similar or lower cost.
Anyscale model → Railwail modernised equivalent
| Anyscale | Railwail (recommended) | Notes |
|---|---|---|
| meta-llama/Llama-2-7b-chat-hf | llama-3.1-8b-instruct | Better quality at same size |
| meta-llama/Llama-2-13b-chat-hf | llama-3.1-8b-instruct or qwen2.5-14b | Smaller, better |
| meta-llama/Llama-2-70b-chat-hf | llama-3.3-70b-instruct | Same size, much better |
| meta-llama/CodeLlama-34b-Instruct-hf | codestral-2501 | Better code model |
| mistralai/Mistral-7B-Instruct-v0.1 | mistral-small | Modernised |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | mixtral-8x7b-instruct | Direct |
| HuggingFaceH4/zephyr-7b-beta | llama-3.1-8b-instruct | Modernised general-purpose |
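One way to apply this table without touching every call site is a small lookup in front of the client. The sketch below is a minimal version, assuming the slugs in the table are exact; `MODEL_MAP` and `resolve_model` are hypothetical names, and where the table offers two options the first is used.

```python
# Anyscale slug -> recommended Railwail slug, copied from the table above.
# Where the table offers two options, the first one is used.
MODEL_MAP = {
    "meta-llama/Llama-2-7b-chat-hf": "llama-3.1-8b-instruct",
    "meta-llama/Llama-2-13b-chat-hf": "llama-3.1-8b-instruct",
    "meta-llama/Llama-2-70b-chat-hf": "llama-3.3-70b-instruct",
    "meta-llama/CodeLlama-34b-Instruct-hf": "codestral-2501",
    "mistralai/Mistral-7B-Instruct-v0.1": "mistral-small",
    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mixtral-8x7b-instruct",
    "HuggingFaceH4/zephyr-7b-beta": "llama-3.1-8b-instruct",
}


def resolve_model(slug: str) -> str:
    """Translate an Anyscale slug; pass through slugs that are not in the map."""
    return MODEL_MAP.get(slug, slug)
```

Passing unknown slugs through unchanged means code that already uses Railwail slugs keeps working while legacy call sites are migrated.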
Pricing Comparison (per 1M tokens, May 2026)
Modernised model on Railwail in EUR
| Modernised model | Railwail (EUR) | Anyscale historic (USD) | Notes |
|---|---|---|---|
| llama-3.3-70b-instruct | EUR 0.54 / 0.74 | Llama 2 70B was ~$1.00 | Cheaper and better |
| llama-3.1-8b-instruct | EUR 0.046 / 0.074 | Llama 2 7B was ~$0.15 | Cheaper and better |
| mixtral-8x7b-instruct | EUR 0.22 / 0.22 | ~$0.50 | Comparable |
| codestral-2501 | EUR 0.30 / 0.90 | CodeLlama ~$1.00 | Better code |
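To turn the table into a budget figure, a workload's cost can be estimated from its token counts. The sketch below assumes the two numbers in each Railwail cell are the input and output prices per 1M tokens, which the table does not state explicitly; `estimate_cost_eur` is a hypothetical helper.

```python
from decimal import Decimal

# EUR per 1M tokens, from the table above.
# ASSUMPTION: the two figures in each cell are (input, output) prices.
PRICES_EUR = {
    "llama-3.3-70b-instruct": (Decimal("0.54"), Decimal("0.74")),
    "llama-3.1-8b-instruct": (Decimal("0.046"), Decimal("0.074")),
    "mixtral-8x7b-instruct": (Decimal("0.22"), Decimal("0.22")),
    "codestral-2501": (Decimal("0.30"), Decimal("0.90")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in EUR of a workload on one model."""
    price_in, price_out = PRICES_EUR[model]
    return (price_in * input_tokens + price_out * output_tokens) / TOKENS_PER_UNIT
```

Under that assumption, 2M input tokens and 500k output tokens on llama-3.3-70b-instruct come to 2 × 0.54 + 0.5 × 0.74 = EUR 1.45.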
Why Railwail Is the Right Successor
- Same OpenAI-compatible API: your Anyscale code works after a baseURL and model slug change
- Modernised model lineup: Llama 3.3, Mistral Large and Mixtral 8x22B are available
- EU-hosted with EUR billing
- Adds Claude, GPT-4o and Gemini; Anyscale offered open-source models only
- Built-in playground at railwail.com/models
- Per-key rate limits and spend caps
FAQ
Can I still use Llama 2 models if I have to?
Llama 2 70B chat is available as llama-2-70b-chat for legacy compatibility. Llama 3.3 70B is recommended: it is strictly better and cheaper.
What about Anyscale's fine-tuning?
Anyscale offered custom LoRA fine-tuning. Railwail does not currently host custom fine-tunes. Train on Together or Hugging Face TRL and self-host, or use prompt engineering / RAG to achieve similar specialisation.
Are embeddings supported?
Yes. POST /v1/embeddings supports BGE, e5, Cohere Embed, Voyage and OpenAI embeddings.
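A minimal sketch of an embeddings call using only the standard library is shown below. It assumes the request body follows the OpenAI embeddings schema (`model` and `input`). This guide does not list embedding model slugs, so the model is a required argument; look up the exact slug via GET /v1/models. `build_embeddings_request` is a hypothetical helper that builds the request without sending it.

```python
import json
import os
import urllib.request


def build_embeddings_request(model: str, texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/embeddings request."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        "https://api.railwail.com/v1/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('RAILWAIL_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send it: pass the request to urllib.request.urlopen and json.load the
# response. In the OpenAI schema, vectors are found at data[i]["embedding"].
```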
Does Anyscale's Ray Serve self-hosted setup migrate too?
Self-hosted Ray Serve deployments are not in scope for Railwail (we are a managed inference API, not a deployment platform). If you need self-hosting, look at vLLM Endpoints or BentoML.
Will my existing prompts work?
Yes, for the mapped models. Note that Llama 3.x behaves differently from Llama 2 by default: it is slightly less refusal-prone, better at following instructions, and has a longer context window. You may want to retest prompt quality.
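Retesting can be as simple as replaying known prompts against the new model and checking each reply for an expected phrase. The sketch below is one possible shape for such a check, not a Railwail feature; `retest_prompts` and the `Complete` type are hypothetical, and the completion function is injected so the check can run against a stub or a real client.

```python
from typing import Callable

# A completion function takes (model, prompt) and returns the reply text.
Complete = Callable[[str, str], str]


def retest_prompts(complete: Complete, model: str, cases: dict[str, str]) -> list[str]:
    """Return the prompts whose reply lacks the expected substring.

    `cases` maps each prompt to a substring its reply should contain
    (compared case-insensitively).
    """
    failures = []
    for prompt, expected in cases.items():
        reply = complete(model, prompt)
        if expected.lower() not in reply.lower():
            failures.append(prompt)
    return failures
```

In practice, `complete` would wrap the `client.chat.completions.create` call from Step 2 and return `resp.choices[0].message.content`.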
Next Steps
- Sign up at railwail.com
- Generate an API key
- Update baseURL to https://api.railwail.com/v1
- Map Llama 2 → Llama 3.3 for better quality at lower cost
- Read the reference at railwail.com/docs
- Compare pricing at railwail.com/pricing