Migrate from Fireworks AI to Railwail

Move from Fireworks AI to Railwail. Same OpenAI-compatible API, same low-latency Llama and Mixtral, EU hosting, EUR billing, 275+ models on one key.

Railwail Team · Developer Relations · 7 min read · May 16, 2026

TL;DR: Switch in Under 5 Minutes

  • Both APIs are OpenAI-compatible: change the base URL, API key, and model slug
  • All Fireworks-hosted models mirrored: Llama 3.3, Mixtral, DeepSeek, Qwen, FireFunction
  • EU-hosted endpoint, EUR billing
  • Same low-latency vLLM-based serving for open models
  • Plus access to closed-source frontier models (Claude, GPT-4o, Gemini) on the same key

Why Move Off Fireworks AI?

Fireworks AI is well known for fast inference on open-source LLMs. The trade-offs are US-only hosting, USD billing, and the lack of closed-source frontier models (Claude, GPT-4o, Gemini) in the catalog. Railwail keeps the speed and adds EU hosting, EUR billing, and the closed-source models, all behind one API key.

Step 1: Get a Railwail API Key

Sign up at railwail.com and generate a key. Free credits included.

Step 2: Change Base URL and Model Slug

TypeScript / JavaScript

Before (Fireworks):

import OpenAI from "openai";

const fireworks = new OpenAI({
  apiKey: process.env.FIREWORKS_API_KEY,
  baseURL: "https://api.fireworks.ai/inference/v1",
});

const res = await fireworks.chat.completions.create({
  model: "accounts/fireworks/models/llama-v3p3-70b-instruct",
  messages: [{ role: "user", content: "Hello" }],
});

After (Railwail):

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.RAILWAIL_API_KEY,
  baseURL: "https://api.railwail.com/v1",
});

const res = await client.chat.completions.create({
  model: "llama-3.3-70b-instruct",
  messages: [{ role: "user", content: "Hello" }],
});

Python

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RAILWAIL_API_KEY"],
    base_url="https://api.railwail.com/v1",
)

resp = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)

cURL

curl https://api.railwail.com/v1/chat/completions \
  -H "Authorization: Bearer $RAILWAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

API Endpoint Mapping

Fireworks endpoint → Railwail equivalent

Fireworks                                   | Railwail                      | Notes
POST /inference/v1/chat/completions         | POST /v1/chat/completions     | Identical
POST /inference/v1/completions              | POST /v1/completions          | Legacy supported
POST /inference/v1/embeddings               | POST /v1/embeddings           | Identical
POST /inference/v1/image_generation/{model} | POST /v1/images/generations   | OpenAI-style
POST /inference/v1/audio/transcriptions     | POST /v1/audio/transcriptions | Whisper
GET /inference/v1/models                    | GET /v1/models                | 275+ models
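
For code that builds request paths by hand, the mapping above is mostly a prefix swap. The helper below is a minimal sketch of that idea. The name `to_railwail_path` is hypothetical, and returning the image model separately (so it can be sent in the request body) is an assumption based on the "OpenAI-style" note, so check the Railwail reference before relying on it.

```python
from typing import Optional, Tuple

FIREWORKS_PREFIX = "/inference/v1"
RAILWAIL_PREFIX = "/v1"
IMAGE_PREFIX = FIREWORKS_PREFIX + "/image_generation/"


def to_railwail_path(fireworks_path: str) -> Tuple[str, Optional[str]]:
    """Return (railwail_path, model_from_path) for a Fireworks API path.

    model_from_path is only set for image generation, where the Fireworks
    URL carries the model; it is None for every other endpoint.
    """
    if fireworks_path.startswith(IMAGE_PREFIX):
        # Assumption: the model moves out of the URL and into the request body.
        model = fireworks_path[len(IMAGE_PREFIX):]
        return RAILWAIL_PREFIX + "/images/generations", model or None
    if fireworks_path.startswith(FIREWORKS_PREFIX + "/"):
        # Every other row in the table only drops the "/inference" segment.
        return RAILWAIL_PREFIX + fireworks_path[len(FIREWORKS_PREFIX):], None
    raise ValueError(f"not a Fireworks inference path: {fireworks_path!r}")
```

Paths outside the table raise an error rather than being guessed.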

Model Mapping

Fireworks model → Railwail

Fireworks                                          | Railwail                | Notes
accounts/fireworks/models/llama-v3p3-70b-instruct  | llama-3.3-70b-instruct  | Llama 3.3
accounts/fireworks/models/llama-v3p1-405b-instruct | llama-3.1-405b-instruct | Largest
accounts/fireworks/models/mixtral-8x7b-instruct    | mixtral-8x7b-instruct   | MoE
accounts/fireworks/models/mixtral-8x22b-instruct   | mixtral-8x22b-instruct  | Bigger MoE
accounts/fireworks/models/deepseek-v3              | deepseek-v3             | Frontier MoE
accounts/fireworks/models/deepseek-r1              | deepseek-r1             | Reasoning
accounts/fireworks/models/qwen2p5-72b-instruct     | qwen2.5-72b-instruct    | Alibaba
accounts/fireworks/models/firefunction-v2          | firefunction-v2         | Function calling
accounts/fireworks/models/firellava-13b            | firellava-13b           | Vision
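
If model names are scattered through a codebase, a lookup table can do the renaming in one place. The sketch below copies only the rows listed above. `to_railwail_slug` is a hypothetical helper, and it deliberately refuses to guess a slug for any model that is not in the table.

```python
FIREWORKS_MODEL_PREFIX = "accounts/fireworks/models/"

# Fireworks model name (without the account prefix) -> Railwail slug,
# copied from the mapping table above.
FIREWORKS_TO_RAILWAIL = {
    "llama-v3p3-70b-instruct": "llama-3.3-70b-instruct",
    "llama-v3p1-405b-instruct": "llama-3.1-405b-instruct",
    "mixtral-8x7b-instruct": "mixtral-8x7b-instruct",
    "mixtral-8x22b-instruct": "mixtral-8x22b-instruct",
    "deepseek-v3": "deepseek-v3",
    "deepseek-r1": "deepseek-r1",
    "qwen2p5-72b-instruct": "qwen2.5-72b-instruct",
    "firefunction-v2": "firefunction-v2",
    "firellava-13b": "firellava-13b",
}


def to_railwail_slug(fireworks_model: str) -> str:
    """Translate a full Fireworks model path into the short Railwail slug."""
    if not fireworks_model.startswith(FIREWORKS_MODEL_PREFIX):
        raise ValueError(f"unexpected model path: {fireworks_model!r}")
    name = fireworks_model[len(FIREWORKS_MODEL_PREFIX):]
    if name not in FIREWORKS_TO_RAILWAIL:
        # Do not invent a slug; look it up via GET /v1/models instead.
        raise KeyError(f"no Railwail slug listed for {fireworks_model!r}")
    return FIREWORKS_TO_RAILWAIL[name]
```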

Pricing Comparison (per 1M tokens, May 2026)

Same model, Railwail in EUR

Model                   | Fireworks (USD) | Railwail (EUR)  | Notes
llama-3.3-70b-instruct  | $0.90           | EUR 0.83        | Identical
llama-3.1-405b-instruct | $3.00           | EUR 2.76        | Identical
mixtral-8x7b-instruct   | $0.50           | EUR 0.46        | Identical
deepseek-v3             | $0.90           | EUR 0.83        | Identical
deepseek-r1             | $3.00 / $8.00   | EUR 2.76 / 7.36 | Input/output
firefunction-v2         | $0.90           | EUR 0.83        | Identical
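
To project a monthly bill, the table can be turned into a small estimator. This is a rough sketch: the prices are copied from the table above, and treating single-price rows as the rate for both input and output tokens is an assumption. `estimate_eur` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# EUR per 1M tokens as (input, output), copied from the table above.
# Assumption: rows with a single price charge that rate in both directions.
RAILWAIL_EUR_PER_MILLION = {
    "llama-3.3-70b-instruct": (Decimal("0.83"), Decimal("0.83")),
    "llama-3.1-405b-instruct": (Decimal("2.76"), Decimal("2.76")),
    "mixtral-8x7b-instruct": (Decimal("0.46"), Decimal("0.46")),
    "deepseek-v3": (Decimal("0.83"), Decimal("0.83")),
    "deepseek-r1": (Decimal("2.76"), Decimal("7.36")),
    "firefunction-v2": (Decimal("0.83"), Decimal("0.83")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_eur(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the EUR cost of a workload from the listed per-1M prices."""
    price_in, price_out = RAILWAIL_EUR_PER_MILLION[model]
    total = price_in * input_tokens + price_out * output_tokens
    return total / TOKENS_PER_MILLION
```

Decimal is used instead of float so that small per-token amounts do not pick up rounding noise when summed over many requests.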

Why Railwail Over Fireworks

  • EU billing in EUR with VAT receipts
  • Frankfurt-region hosting for low-latency EU customers
  • Same OpenAI-compatible API
  • Adds Claude, GPT-4o, and Gemini to the catalog; Fireworks offers no closed-source models
  • Built-in playground for cross-model comparison
  • Comparable pricing on Fireworks-style open models

FAQ

What about FireFunction and the function-calling models?

FireFunction v2 is mirrored as firefunction-v2 with identical tool/function calling behaviour. If you would rather use a general-purpose model, llama-3.3-70b-instruct now matches FireFunction v2 on most benchmarks.
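
Since the tool-calling behaviour is described as identical, an existing OpenAI-style `tools` payload should carry over with only the model slug changed. The sketch below builds the request arguments without sending them. The `get_weather` tool and the `build_tool_request` helper are hypothetical and exist only for illustration.

```python
def build_tool_request(model: str, user_message: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }
```

With the Railwail client from Step 2, this would be sent as `client.chat.completions.create(**build_tool_request("firefunction-v2", "Weather in Berlin?"))`. Swapping the first argument for `llama-3.3-70b-instruct` gives a quick side-by-side comparison.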

Can I bring my own LoRA?

Custom LoRAs (Fireworks' multi-LoRA feature) are not currently hosted on Railwail. You can keep them on Fireworks and use Railwail for everything else.
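
One way to run both providers side by side is to route on the model identifier. The sketch below assumes custom LoRA deployments keep a long `accounts/...` path on Fireworks while everything else uses a short Railwail slug. `pick_provider` is a hypothetical helper, and that routing rule is an assumption to adapt to your own naming.

```python
from typing import Dict

FIREWORKS_BASE_URL = "https://api.fireworks.ai/inference/v1"
RAILWAIL_BASE_URL = "https://api.railwail.com/v1"


def pick_provider(model: str) -> Dict[str, str]:
    """Return the base URL and API-key environment variable for a model."""
    if model.startswith("accounts/"):
        # Assumption: only models left on Fireworks (such as custom LoRAs)
        # still use the long account-scoped path.
        return {"base_url": FIREWORKS_BASE_URL, "api_key_env": "FIREWORKS_API_KEY"}
    return {"base_url": RAILWAIL_BASE_URL, "api_key_env": "RAILWAIL_API_KEY"}
```

The returned values plug straight into the client constructor, for example `OpenAI(base_url=cfg["base_url"], api_key=os.environ[cfg["api_key_env"]])`.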

What about Fireworks' speculative decoding speedups?

Railwail uses vLLM with speculative decoding on all Llama models, giving throughput comparable to Fireworks.

How does streaming compare?

Both APIs stream OpenAI-format SSE chunks. First-token latency from EU origins is typically lower via Railwail's Frankfurt edge.
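
For code that reads the stream without an SDK, the chunk handling can be sketched as below. It assumes the standard OpenAI wire format, where each event is a `data:` line carrying a JSON chunk and the stream ends with `data: [DONE]`. `collect_stream_text` is a hypothetical helper, and real code would feed it lines from the HTTP response body.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate the content deltas from OpenAI-format SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Because the parser depends only on the chunk format, the same function should work unchanged against either provider.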

What about embeddings?

POST /v1/embeddings accepts identical request shapes. Railwail adds Voyage and Cohere embedding models alongside the open-source options.

Next Steps

  • Sign up at railwail.com
  • Generate an API key
  • Update baseURL to https://api.railwail.com/v1
  • Replace long Fireworks model paths with short Railwail slugs
  • Read the reference at railwail.com/docs
  • Compare per-token pricing at railwail.com/pricing
