Custom Providers

xAI's flagship reasoning model with vision and tool use. 256k context, strong at complex reasoning and STEM tasks.

xai, flagship, reasoning

Grok 4.20 Reasoning

xAI's Grok 4.20 reasoning snapshot. Runs an extended thinking pass before answering for multi-step analysis, math and STEM, with a 1M token context window and strong agentic tool calling.

xai, grok, reasoning

Grok 4.3

Multimodal · xAI

New · Popular

xAI's May 2026 flagship. 1M context, vision, always-on reasoning, real-time X/web retrieval via DeepSearch.

xai, flagship, reasoning

Ideogram 3.0

Image · Ideogram

ideogram, text-to-image, typography

Ideogram's flagship text-to-image model with industry-leading text rendering and prompt adherence.

€0.09 · 15.0s

Kimi K2 (Moonshot)

Moonshot AI's 1T-parameter MoE model. Strong results on agentic coding and tool-use benchmarks.

moonshot, kimi, moe

MiniMax-01

Text & Chat · Minimax

MiniMax's 456B hybrid lightning-attention model with a native 4M-token context window, built for long-context workloads.

minimax, long-context, lightning-attention

Perplexity Sonar Pro

Perplexity's premium web-grounded search model with multi-step reasoning over live sources.

perplexity, web-search, citations

Voyage AI voyage-3

Voyage's general-purpose embedding model. 1024 dims, 32k context, strong retrieval performance.

voyage, embedding, retrieval

AI21 Jamba 1.5 Large

AI21's flagship hybrid Mamba-Transformer model with a 256k context window for long-document tasks.

ai21, long-context, mamba

AI21 Jamba 1.5 Mini

Cost-efficient hybrid Mamba-Transformer model with 256k context. Tuned for high-throughput RAG.

ai21, long-context, mamba

Cartesia Sonic

TTS · Custom

Cartesia's ultra-low-latency TTS (~90ms TTFB). State-space model with voice cloning support.

cartesia, tts, low-latency

Cohere Aya 23 35B

Open-weights multilingual research model from Cohere covering 23 languages. 35B parameters.

cohere, multilingual, open-weights

Cohere Command Light (legacy)

Text & Chat · Cohere

Cohere's fast, lightweight chat model (deprecated Sep 2025). Kept as a comparison tombstone.

cohere, legacy, deprecated

Cohere Command R (08-2024)

Text & Chat · Cohere

Cohere's mid-tier RAG/tool model. Cost-efficient sibling of Command R+ with 128k context.

cohere, rag, tools

Cohere Command R+ (08-2024)

Text & Chat · Cohere

Cohere's flagship RAG- and tool-optimized chat model. 128k context, refreshed August 2024.

cohere, rag, tools

Cohere embed-multilingual-v3

Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.

cohere, embedding, multilingual

Deepgram Nova-3

STT · Custom

Deepgram's flagship STT. First to offer realtime multilingual transcription with self-serve customization.

€0.004

deepgram, stt, transcription

Edge TTS

TTS · Custom

Microsoft Edge neural voices accessed via the open-source edge-tts wrapper. 400+ voices across 100+ locales, suitable for batch generation.

microsoft, tts, multilingual

Get3D (NVIDIA)

NVIDIA's GET3D generative model for textured 3D shapes. Trained on category-specific datasets, it produces meshes with high-quality textures.

nvidia, 3d-generation, open-weights

Grok 2 Vision

Multimodal · xAI

xAI's vision-capable Grok 2 snapshot. Image-in, text-out with strong multilingual instruction following.

xai, vision, legacy

Grok 3

New

xAI's flagship model. Strong at reasoning, coding, and real-time knowledge with web search capabilities.

Free · 3.0s

reasoning, real-time

Grok 4.1 Fast

Multimodal · xAI

New

xAI's cost-efficient high-throughput model. 2M context, optional reasoning, optimized for agentic loops and real-time apps.

xai, cost-efficient, vision

Grok 4.20 (Non-Reasoning)

xAI's Grok 4.20 standard snapshot. Skips the extended thinking pass for lower-latency answers on tasks that do not need deep deliberation. 1M token context window.

xai, grok, long-context

Grok 4.20 Multi-Agent

xAI's Grok 4.20 multi-agent snapshot. Coordinates several specialized agents under one call to handle multi-step workflows that mix research, tool use and synthesis. 1M token context window.

xai, grok, multi-agent

Grok Build 0.1

Code · xAI

xAI's coding-focused Grok model. Tuned for code generation and software development tasks, with a 256k token context window for working over large codebases.

xai, grok, code

Hailuo / MiniMax Video-01

Video · Custom

MiniMax's Hailuo video-01. 6s 1280x720 clips with strong cinematic motion and physical realism.

€0.43

minimax, hailuo, text-to-video

Ideogram 2.0 Turbo

Image · Ideogram

Ideogram's fast text-to-image variant. Strong typography and logo rendering at low latency.

€0.05

ideogram, text-to-image, typography

Jina Embeddings v3 (Multilingual)

Jina's frontier multilingual embedding model. 570M params, 8192-token context, 89 languages, Matryoshka dims 128-1024.

jina, embedding, multilingual
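The "Matryoshka dims 128-1024" note means a full-size vector can be shortened by keeping its leading components. A minimal sketch of that idea in plain Python follows; the helper name `truncate_embedding` is hypothetical and this is not Jina's client API, only an illustration of truncating and re-normalizing a vector you already have.

```python
import math


def truncate_embedding(vec, dims):
    """Keep the first `dims` components of a Matryoshka-style embedding
    and L2-normalize the result so cosine similarity stays meaningful.

    Hypothetical helper for illustration; not part of any vendor SDK.
    """
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]


# Toy 3-dim vector standing in for a 1024-dim model output.
short = truncate_embedding([3.0, 4.0, 12.0], 2)
```

Shorter vectors cut storage and search cost, usually at some loss in retrieval quality, so the target size is worth checking against your own evaluation set.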

Kling 1.6 Pro

Video · Kuaishou

Kuaishou's Kling 1.6 Pro. Premium cinematic motion and physics realism, ~$0.07/sec.

€0.35

kuaishou, kling, text-to-video

LeRobot SmolVLA

Robotics · Custom

HuggingFace's 450M VLA pretrained on 487 community LeRobot datasets. Runs on consumer GPUs.

huggingface, lerobot, vla

Luma Dream Machine v1.6

Video · Custom

Luma's Dream Machine 1.6. 720p text/image-to-video with strong motion and camera control.

€0.40

luma, text-to-video, image-to-video

mxbai-embed-large-v1

Mixedbread's open-source 335M embedding model. Top MTEB score for English retrieval at release.

mixedbread, embedding, open-weights

NVIDIA Cosmos-Predict-1

Robotics · Custom

NVIDIA's world foundation model for physical AI. Diffusion-based video prediction for robotics simulation.

nvidia, cosmos, vla

Octo Base

Robotics · UC Berkeley

Berkeley/Stanford 93M transformer diffusion policy. Pretrained on 800k Open-X-Embodiment episodes.

berkeley, stanford, vla

Octo Small

Robotics · UC Berkeley

Compact 27M variant of Octo. Faster inference on consumer GPUs, designed for low-latency control.

berkeley, vla, robotics

OpenVLA-7B

Robotics · OpenVLA

Stanford/Berkeley open VLA trained on 970k Open-X-Embodiment episodes. Supports LoRA fine-tuning.

stanford, berkeley, vla

Perplexity Sonar

Perplexity's fastest and cheapest web-grounded chat model. Live-source citations included.

perplexity, web-search, citations

Perplexity Sonar Reasoning

Perplexity's reasoning model with chain-of-thought and integrated web search.

perplexity, web-search, reasoning

Physical Intelligence Pi-0-FAST

Robotics · Physical Intelligence

Autoregressive π-0 variant using the FAST action tokenizer. Faster inference at competitive task success.

physical-intelligence, vla, robotics

Physical Intelligence π-0

Robotics · Physical Intelligence

Physical Intelligence's flagship VLA flow-matching policy. Generalist robot control, pretrained on 10k+ hours of robot data.

physical-intelligence, vla, robotics

Physical Intelligence π-0.5

Robotics · Physical Intelligence

Upgraded π-0 with open-world generalization via knowledge insulation. Weights and fine-tuning open-sourced.

physical-intelligence, vla, robotics

Pika 2.0 (Official)

Video · Pika

Pika Labs' 2.0 release. Cinematic text/image-to-video with scene composition controls.

€0.20

pika, text-to-video, image-to-video

Playground v3 (Design)

Image · Playground AI

Playground's text-to-image model focused on graphic design aesthetics and embedded typography.

playground, text-to-image, design

PlayHT 2.0

TTS · Custom

PlayHT's 2.0 generative voice model. Multilingual expressive speech synthesis with sub-second latency and high-fidelity voice cloning.

playht, tts, voice-cloning

Qwen 2.5-Max

Alibaba's flagship pretrained MoE model. Top-tier reasoning and code performance via DashScope API.

qwen, alibaba, moe

RDT-1B

Robotics · Custom

Tsinghua's 1B diffusion-transformer bimanual manipulation policy. Predicts next 64 actions per inference.

tsinghua, vla, robotics

Recraft V3 Realistic

Image · Recraft

Recraft's high-prompt-adherence raster image model. Strong layout control and brand-style consistency.

€0.04

recraft, text-to-image, design

Recraft V3 SVG

Image · Recraft

Recraft's vector/SVG generation model. Editable illustrations and icons from text.

€0.08

recraft, text-to-svg, vector

Reka Core

Multimodal · Custom

Reka's frontier multimodal model supporting text, image, video and audio inputs.

reka, multimodal, video-understanding

Reka Edge

Multimodal · Custom

Reka's small on-device-friendly multimodal model. ~7B parameters, 16k context.

reka, multimodal, edge

Reka Flash

Multimodal · Custom

Reka's 21B dense multimodal model balancing speed and quality. Up to 128k context.

reka, multimodal, cost-efficient

Runway Gen-3 Alpha Turbo

Video · Custom

Runway's faster, cheaper Gen-3 variant. Image-to-video at 5 credits/sec (~$0.05/sec).

€0.05

runway, image-to-video, fast
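The credit pricing quoted for this model (5 credits/sec, roughly $0.05/sec) implies about $0.01 per credit. A minimal sketch of the per-clip arithmetic under that assumption follows; the function name `clip_cost_usd` and the default rates are illustrative values taken from the description above, not from a billing API.

```python
def clip_cost_usd(seconds, credits_per_sec=5, usd_per_credit=0.01):
    """Estimate the cost of one generated clip in USD.

    Defaults mirror the figures quoted in the listing (5 credits/sec,
    ~$0.05/sec); the $0.01-per-credit rate is inferred, so treat it
    as an assumption and check current pricing before relying on it.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    credits = seconds * credits_per_sec
    return round(credits * usd_per_credit, 4)


# A 10-second clip uses 50 credits under these assumptions.
ten_second_clip = clip_cost_usd(10)
```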

Snowflake Arctic Instruct

Snowflake's open MoE model: 480B total / 17B active params with a dense+MoE hybrid architecture.

snowflake, moe, open-weights

Stable Audio 2

TTS · Udio

Stability AI's Stable Audio 2.0. Text-to-music generation of full-length, structured tracks up to 3 minutes long at 44.1 kHz.

stability, music-generation, pricing-tbd

Stable Diffusion 3.5 Large (Stability)

stability, text-to-image, open-weights

Stability AI's 8B-parameter flagship SD3.5 model. Strong prompt adherence and aesthetic quality.

€0.07

Stable Diffusion 3.5 Large Turbo

stability, text-to-image, open-weights

Distilled 4-step variant of SD3.5 Large. 8B params, ~4x faster inference at competitive quality.

€0.04

Stable Diffusion 3.5 Medium

stability, text-to-image, open-weights

Stability AI's 2.5B-parameter SD3.5 with a strong quality/speed trade-off. Consumer-GPU friendly.

€0.04

Voyage AI voyage-code-3

Voyage's code-specialized embedding model. Up to 32k context, Matryoshka 256-2048 dims, int8/binary support.

voyage, embedding, code
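The "int8/binary support" note refers to quantized embeddings. As a rough illustration of what a binary embedding is, the sketch below uses one common scheme (sign-thresholding each component and packing the bits); it is an assumption for illustration and not necessarily how Voyage produces its binary output. The helper names `to_binary` and `hamming` are hypothetical.

```python
def to_binary(vec):
    """Quantize a float vector to packed bits: 1 where the component
    is positive, 0 otherwise, eight components per byte (MSB first).

    Illustrative scheme only; a vendor's own quantizer may differ.
    """
    out = bytearray()
    for i in range(0, len(vec), 8):
        chunk = vec[i:i + 8]
        byte = 0
        for x in chunk:
            byte = (byte << 1) | (1 if x > 0 else 0)
        byte <<= 8 - len(chunk)  # zero-pad a short final chunk
        out.append(byte)
    return bytes(out)


def hamming(a, b):
    """Count differing bits between two equal-length packed vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


packed = to_binary([0.5, -0.1, 0.2, -0.3, 0.9, 0.1, -0.7, 0.4])
```

Packed vectors take one bit per dimension instead of 32, and Hamming distance can stand in for cosine distance as a fast first-pass ranking before re-scoring with full-precision vectors.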

Yi Large

01.AI's larger general-purpose chat model with a 32k context window and strong bilingual performance.