Custom Providers
Specialized and emerging providers routed through Railwail's custom integration layer — including xAI Grok, niche image and video models, and partner labs.
55 models from Custom Providers on Railwail
Access every Custom Providers model through Railwail's OpenAI-compatible API.
55 models available
Grok 4
xAI's flagship reasoning model with vision and tool use. 256k context, strong at complex reasoning and STEM tasks.
Grok 4.3
xAI's May 2026 flagship. 1M context, vision, always-on reasoning, real-time X/web retrieval via DeepSearch.
Ideogram 3.0
Ideogram's flagship text-to-image model with industry-leading text rendering and prompt adherence.
Kimi K2 (Moonshot)
Moonshot AI's 1T-parameter MoE model. Industry-leading agentic coding and tool-use benchmarks.
MiniMax-01
MiniMax's 456B hybrid lightning-attention model with native 4M-token context. Industry-leading long-context.
Perplexity Sonar Pro
Perplexity's premium web-grounded search model with multi-step reasoning over live sources.
Voyage AI voyage-3
Voyage's general-purpose embedding model. 1024 dims, 32k context, strong retrieval performance.
AI21 Jamba 1.5 Large
AI21's flagship hybrid Mamba-Transformer model with a 256k context window for long-document tasks.
AI21 Jamba 1.5 Mini
Cost-efficient hybrid Mamba-Transformer model with 256k context. Tuned for high-throughput RAG.
Cartesia Sonic
Cartesia's ultra-low-latency TTS (~90ms TTFB). State-space model with voice cloning support.
Cohere Aya 23 35B
Open-weights multilingual research model from Cohere covering 23 languages. 35B parameters.
Cohere Command Light (legacy)
Cohere's fast lightweight chat model (deprecated Sep 2025). Kept as comparison tombstone.
Cohere Command R (08-2024)
Cohere's mid-tier RAG/tool model. Cost-efficient sibling of Command R+ with 128k context.
Cohere Command R+ (08-2024)
Cohere's flagship RAG- and tool-optimized chat model. 128k context, refreshed August 2024.
Cohere embed-multilingual-v3
Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.
Deepgram Nova-3
Deepgram's flagship STT. First to offer realtime multilingual transcription with self-serve customization.
Edge TTS
Microsoft Edge neural voices accessed via the open-source edge-tts wrapper. 400+ voices across 100+ locales, suitable for batch generation.
Get3D (NVIDIA)
NVIDIA GET3D generative model for textured 3D shapes. Trained on category-specific datasets producing meshes with high-quality textures.
Grok 2 Vision
xAI's vision-capable Grok 2 snapshot. Image-in, text-out with strong multilingual instruction following.
Grok 3
xAI's flagship model. Strong at reasoning, coding, and real-time knowledge with web search capabilities.
Grok 4.1 Fast
xAI's cost-efficient high-throughput model. 2M context, optional reasoning, optimized for agentic loops and real-time apps.
Hailuo / MiniMax Video-01
MiniMax's Hailuo video-01. 6s 1280x720 clips with strong cinematic motion and physical realism.
Ideogram 2.0 Turbo
Ideogram's fast text-to-image variant. Strong typography and logo rendering at low latency.
Jina Embeddings v3 (Multilingual)
Jina's frontier multilingual embedding model. 570M params, 8192 ctx, 89 languages, Matryoshka dims 128-1024.
Kling 1.6 Pro
Kuaishou's Kling 1.6 Pro. Premium cinematic motion and physics realism, ~$0.07/sec.
LeRobot SmolVLA
HuggingFace's 450M VLA pretrained on 487 community LeRobot datasets. Runs on consumer GPUs.
Luma Dream Machine v1.6
Luma's Dream Machine 1.6. 720p text/image-to-video with strong motion and camera control.
mxbai-embed-large-v1
Mixedbread's open-source 335M embedding model. Top MTEB benchmark for English retrieval at release.
NVIDIA Cosmos-Predict-1
NVIDIA's world foundation model for physical AI. Diffusion-based video prediction for robotics simulation.
Octo Base
Berkeley/Stanford 93M transformer diffusion policy. Pretrained on 800k Open-X-Embodiment episodes.
Octo Small
Compact 27M variant of Octo. Faster inference on consumer GPUs, designed for low-latency control.
OpenVLA-7B
Stanford/Berkeley open VLA trained on 970k Open-X-Embodiment episodes. Supports LoRA fine-tuning.
Perplexity Sonar
Perplexity's fastest and cheapest web-grounded chat model. Live-source citations included.
Perplexity Sonar Reasoning
Perplexity's reasoning model with chain-of-thought and integrated web search.
Physical Intelligence Pi-0-FAST
Autoregressive π-0 variant using FAST action tokenizer. Faster inference at competitive task success.
Physical Intelligence π-0
Physical Intelligence's flagship VLA flow-matching policy. Generalist robot control, pretrained on 10k+ hrs robot data.
Physical Intelligence π-0.5
Upgraded π-0 with open-world generalization via knowledge insulation. Weights and fine-tuning open-sourced.
Pika 2.0 (Official)
Pika Labs' 2.0 release. Cinematic text/image-to-video with scene composition controls.
Playground v3 (Design)
Playground's text-to-image model focused on graphic design aesthetics and embedded typography.
PlayHT 2.0
PlayHT's 2.0 generative voice model. Multi-lingual expressive speech synthesis with sub-second latency and high-fidelity voice cloning.
Qwen 2.5-Max
Alibaba's flagship pretrained MoE model. Top-tier reasoning and code performance via DashScope API.
RDT-1B
Tsinghua's 1B diffusion-transformer bimanual manipulation policy. Predicts next 64 actions per inference.
Recraft V3 Realistic
Recraft's high-prompt-adherence raster image model. Strong layout control and brand-style consistency.
Recraft V3 SVG
Recraft's vector/SVG generation model. Editable illustrations and icons from text.
Reka Core
Reka's frontier multimodal model supporting text, image, video and audio inputs.
Reka Edge
Reka's small on-device-friendly multimodal model. ~7B parameters, 16k context.
Reka Flash
Reka's 21B dense multimodal model balancing speed and quality. Up to 128k context.
Runway Gen-3 Alpha Turbo
Runway's faster, cheaper Gen-3 variant. Image-to-video at 5 credits/sec (~$0.05/sec).
Snowflake Arctic Instruct
Snowflake's open MoE model: 480B total / 17B active params with dense+MoE hybrid architecture.
Stable Audio 2
Stability AI's Stable Audio 2.0. Text-to-music up to 3 minutes of full-length, structured tracks at 44.1 kHz.
Stable Diffusion 3.5 Large (Stability)
Stability AI's 8B-parameter flagship SD3.5 model. Strong prompt adherence and aesthetic quality.
Stable Diffusion 3.5 Large Turbo
Distilled 4-step variant of SD3.5 Large. 8B params, ~4x faster inference at competitive quality.
Stable Diffusion 3.5 Medium
Stability AI's 2.5B-parameter SD3.5 with strong quality/speed trade-off. Consumer-GPU friendly.
Voyage AI voyage-code-3
Voyage's code-specialized embedding model. Up to 32k context, Matryoshka 256-2048 dims, int8/binary support.
Yi Large
01.AI's larger general-purpose chat model with 32k context window and strong bilingual performance.
Frequently asked questions
How is Custom Providers pricing handled on Railwail?
Are there rate limits when using Custom Providers via Railwail?
Which regions does Custom Providers support through Railwail?
Is there a sandbox or free tier to test Custom Providers models?
Start building with Custom Providers today
Free credits on sign-up. No credit card required. Access Custom Providers and 27+ other providers through a single API.