TL;DR โ Switch in Under 10 Minutes
- Drop the cohere-ai SDK โ use the OpenAI SDK with Cohere models
- Command R+, Command R 08-2024 mapped to /v1/chat/completions
- Embed v4 mapped to /v1/embeddings
- Rerank v3.5 mapped to /v1/rerank (Cohere-compatible body)
- EU hosting, EUR billing, plus 274 other models
Why Move Off Cohere?
Cohere has strong enterprise-grade embeddings, reranking, and Command R+ for RAG workloads. The trade-offs are limited frontier-model variety, US-default hosting (EU is enterprise-tier only), and a proprietary SDK that does not play well with OpenAI-shaped codebases. Railwail mirrors Cohere's flagship models behind the OpenAI Chat Completions / Embeddings interface, adds Cohere-compatible /v1/rerank, and unifies billing in EUR.
Step 1 โ Get a Railwail API Key
Sign up at railwail.com and generate a key.
Sponsored
Access 100+ AI Models with One API Key
GPT-4o, Claude, Gemini, Llama, Flux, DALL-E and more โ all through a single, OpenAI-compatible endpoint. No more juggling multiple providers.
Step 2 โ Replace the Cohere SDK
Chat โ TypeScript
Before (Cohere):
import { CohereClient } from "cohere-ai";
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
const res = await cohere.chat({
model: "command-r-plus-08-2024",
message: "Hello",
});After (Railwail):import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.RAILWAIL_API_KEY,
baseURL: "https://api.railwail.com/v1",
});
const res = await client.chat.completions.create({
model: "command-r-plus-08-2024",
messages: [{ role: "user", content: "Hello" }],
});Embed โ Python
from openai import OpenAI
client = OpenAI(
api_key=os.environ["RAILWAIL_API_KEY"],
base_url="https://api.railwail.com/v1",
)
resp = client.embeddings.create(
model="embed-english-v4",
input=["hello world", "another doc"],
)
print(resp.data[0].embedding[:5])Rerank โ cURL
curl https://api.railwail.com/v1/rerank \
-H "Authorization: Bearer $RAILWAIL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "rerank-v3.5",
"query": "What is the capital of France?",
"documents": ["Paris is the capital of France.", "Berlin is in Germany."],
"top_n": 1
}'API Endpoint Mapping
Cohere endpoint โ Railwail equivalent
| Cohere | Railwail | Notes |
|---|---|---|
| POST /v2/chat | POST /v1/chat/completions | OpenAI message format |
| POST /v2/embed | POST /v1/embeddings | Cohere Embed v4 |
| POST /v2/rerank | POST /v1/rerank | Same request shape |
| POST /v2/classify | POST /v1/classify | Cohere-compatible classify |
| POST /v1/tokenize | Use tiktoken / cohere-tokenizer client-side | Local tokenization |
| GET /v1/models | GET /v1/models | 275+ models |
Model Mapping
Cohere model โ Railwail
| Cohere | Railwail | Notes |
|---|---|---|
| command-r-plus-08-2024 | command-r-plus-08-2024 | Frontier RAG |
| command-r-08-2024 | command-r-08-2024 | Smaller RAG |
| command-r7b-12-2024 | command-r-7b | Edge size |
| c4ai-aya-23-35b | aya-23-35b | Multilingual |
| embed-english-v4 | embed-english-v4 | English embeddings |
| embed-multilingual-v4 | embed-multilingual-v4 | 100+ languages |
| rerank-v3.5 | rerank-v3.5 | Cross-encoder reranker |
| rerank-multilingual-v3 | rerank-multilingual-v3 | Multilingual rerank |
Sponsored
Test Any AI Model Instantly
Our built-in playground lets you compare models side by side. Find the perfect model for your use case in minutes, not days.
Pricing Comparison (per 1M tokens, May 2026)
Same Cohere model, Railwail in EUR
| Model | Cohere (USD) | Railwail (EUR) | Notes |
|---|---|---|---|
| command-r-plus input | $2.50 | EUR 2.30 | Identical |
| command-r-plus output | $10.00 | EUR 9.20 | Identical |
| command-r input | $0.15 | EUR 0.14 | Identical |
| command-r output | $0.60 | EUR 0.55 | Identical |
| embed-english-v4 | $0.10 | EUR 0.09 | Identical |
| embed-multilingual-v4 | $0.10 | EUR 0.09 | Identical |
| rerank-v3.5 per 1k searches | $2.00 | EUR 1.84 | Identical |
Why Railwail Over Cohere
- EU billing in EUR with VAT receipts
- OpenAI-compatible API โ drop the cohere-ai SDK
- Same Cohere models at same price
- Plus Claude, GPT-4o, Gemini, Voyage embeddings for cross-provider RAG experiments
- Built-in /v1/rerank endpoint that accepts Cohere's request body
- Built-in playground for embed-quality A/B testing
FAQ
Is the rerank API exactly Cohere-compatible?
Yes. Same query, documents, top_n, return_documents fields. Response includes results array with index and relevance_score.
What about Cohere's RAG with citations?
Cohere's stateful chat (conversation_id, connectors, documents in chat call) is proprietary. Railwail supports the stateless OpenAI Chat Completions schema โ pass retrieved documents in the system or user message yourself. Citations metadata is preserved in the response when using Cohere models.
Are embedding input types (search_query vs search_document) preserved?
Yes. Pass input_type: 'search_query' or 'search_document' as an extension parameter.
Can I do binary / int8 embeddings?
Yes. Pass embedding_types: ['binary'] or ['int8'] to get quantised embeddings for vector-store storage efficiency.
Will my Cohere tool-use prompts work?
Cohere's tools format is converted to OpenAI tools schema by Railwail. Tool calls in the response come back as tool_calls per OpenAI convention.
Sponsored
Pay Only for What You Use
Transparent per-token pricing with no monthly minimums. Start with free credits and scale as you grow.
Next Steps
- Sign up at railwail.com
- Generate an API key
- Drop cohere-ai and use the OpenAI SDK
- Use /v1/rerank for reranking workloads
- Read the reference at railwail.com/docs
- Browse Cohere models at railwail.com/models?provider=cohere
- Compare pricing at railwail.com/pricing