AI Models
Browse our full catalog of AI models. From text generation to image creation, video synthesis to audio processing — find the perfect model for your project.
113 models available
Claude Code
Anthropic's specialized coding agent. Autonomous code writing, debugging, and refactoring with deep codebase understanding.
Claude Opus 4
Anthropic's most powerful model. Exceptional at complex analysis, agentic tasks, and extended reasoning.
Claude Sonnet 4
Anthropic's balanced model. Excellent for complex analysis, coding, math, and creative writing.
Codestral
Mistral's code-specialized model. Optimized for code generation, completion, and understanding across 80+ languages.
Cursor (GPT-4o)
AI-powered code editor backed by GPT-4o. Inline code completion, chat-based editing, and codebase-aware suggestions.
ElevenLabs Multilingual V2
ElevenLabs' most natural-sounding TTS model. Supports 29 languages with emotional range.
Flux 1.1 Pro
Black Forest Labs' most capable image model. Photorealistic outputs with exceptional text rendering and prompt following.
Flux 1.1 Pro Ultra
Flux 1.1 Pro in ultra mode. Generates images up to 4 megapixels, with a raw mode for photorealism.
Flux Dev
Black Forest Labs' development model. Fast, high-quality image generation with LoRA support.
Gemini 2.0 Flash
Google's fastest multimodal model. Supports text, images, audio, and video input.
Gemini 2.0 Flash (Multimodal)
Google's multimodal model accepting text, images, audio, and video. Native multimodal understanding across input types.
Gemini 2.5 Pro
Google's latest thinking model. Excels at reasoning, coding, math, and science with a massive context window.
GitHub Copilot
GitHub's AI pair programmer. Real-time code suggestions, chat assistance, and PR reviews powered by OpenAI models.
Google Veo 2
Google's state-of-the-art video generation model. Simulates real-world physics with various visual styles.
GPT-4.1
OpenAI's newest flagship model. Improved reasoning, instruction following, and coding over GPT-4o.
GPT-4.5 Preview
OpenAI's latest frontier model with improved reasoning, creativity, and instruction following. Significant improvements over GPT-4o.
GPT-4o
OpenAI's most capable multimodal model. Excellent for complex reasoning, coding, and creative tasks.
GPT-4o (Vision)
GPT-4o's vision capabilities. Analyze images, charts, documents, and screenshots with detailed understanding and reasoning.
GR00T N1
NVIDIA's foundation model for humanoid robots. World-model-based VLA enabling whole-body control and human-like manipulation.
Llama 4 Maverick
Meta's Llama 4 Maverick model. A larger, more capable variant than Llama 4 Scout, with strong reasoning, creative writing, and multilingual abilities.
Midjourney v6.1
Midjourney's v6.1 model, known for stunning artistic quality. Excels at creative, aesthetic images with a distinctive style.
Midjourney V7
The latest Midjourney model. Industry-leading aesthetic quality and prompt adherence for image generation.
MusicGen
Meta's music generation model. Generate up to 1 minute of music from text descriptions.
o1
OpenAI's reasoning model that thinks before answering. Uses chain-of-thought to solve complex math, science, and coding problems.
o3-mini
OpenAI's reasoning model optimized for STEM tasks, coding, and math. Uses chain-of-thought reasoning.
OpenVLA
Open-source 7B Vision-Language-Action model built on Prismatic VLM and Llama 2. Converts visual observations and language goals into robot actions.
Pi0
Physical Intelligence's foundation model for robot control. Combines vision-language understanding with dexterous manipulation across diverse tasks.
RT-2
Google DeepMind's Robotic Transformer 2. Vision-Language-Action model that translates visual observations and language instructions directly into robot actions.
Runway Gen-3 Alpha
Runway's previous-generation video model. Professional-quality video creation with fine-grained control over motion and style.
Runway Gen-4
Runway's latest video generation model. Cinematic quality with precise camera and motion control.
Sora
OpenAI's video generation model. Creates realistic and imaginative videos from text prompts with impressive temporal coherence.
Suno v4
Suno's latest music AI. Generates complete songs with lyrics, vocals, and instrumentation. Supports many genres and custom lyrics.
Text Embedding 3 Large
OpenAI's most powerful embedding model. 3072 dimensions for maximum accuracy.
Udio
AI music generation platform creating full songs with vocals and lyrics from text descriptions. Wide range of genres and styles.
Veo 2
Google DeepMind's video generation model. Creates high-fidelity, 1080p videos with strong understanding of physics and motion.
Whisper Large V3
OpenAI's Whisper model. State-of-the-art speech recognition supporting 99+ languages.
AssemblyAI Universal-2
AssemblyAI's latest speech model. Excellent accuracy across accents and noisy environments with built-in speaker diarization.
Bark
Suno's text-to-audio model. Generates realistic speech, music, and sound effects.
BGE-M3
BAAI's versatile embedding model supporting dense, sparse, and multi-vector retrieval. Open-source and highly effective.
Claude 3.5 Haiku
Anthropic's fastest and most affordable model. Ideal for high-volume tasks, customer support, and quick responses.
Claude 3.5 Sonnet
Previous generation balanced model from Anthropic. Still excellent for many tasks including coding, analysis, and creative writing.
Claude 3.5 Sonnet (Vision)
Claude's vision capabilities. Excellent at analyzing images, documents, and code screenshots with detailed, accurate descriptions.
Claude Haiku 3.5
Anthropic's fast and affordable model. Great for quick tasks, summarization, and simple coding.
CodeLlama 70B
Meta's largest code-specialized Llama model. Trained on code-heavy data with strong performance on code generation and infilling.
CogVideoX
Open-source video generation model from Tsinghua University. Generates coherent videos from text with strong temporal consistency.
Cohere Embed v3
Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.
Command R
Cohere's efficient model optimized for RAG and tool use. Great balance of quality and cost for production deployments.
Command R+
Cohere's flagship model for enterprise RAG applications. Excellent at retrieval-augmented generation, summarization, and multi-step tasks.
DALL-E 3
OpenAI's latest image generation model. Excellent at following complex prompts with high fidelity.
DBRX Instruct
Databricks' open-source MoE model with 132B total parameters. Strong at enterprise tasks, SQL, and data-related queries.
Deepgram Nova 2
Deepgram's most accurate ASR model. Optimized for real-time transcription with industry-leading word error rates.
DeepSeek Coder V2
DeepSeek's specialized coding model. Excellent at code generation, debugging, and explanation.
DeepSeek R1
DeepSeek's reasoning model with chain-of-thought capabilities. Excellent for complex problem-solving.
DeepSeek V3
Powerful open-weight model from DeepSeek. Strong at coding, math, and Chinese/English tasks.
ElevenLabs Turbo v2.5
Low-latency TTS model from ElevenLabs. Optimized for real-time applications with natural-sounding output.
Flux Schnell
The fastest Flux model. Generate images in under 2 seconds. Great for prototyping.
Gemini 2.0 Flash
Google's fast, versatile multimodal model. Supports text, images, audio, and video inputs. Great balance of speed and capability.
Gemini 2.0 Flash Lite
Google's most cost-efficient model. Optimized for high-volume, lower-complexity tasks with excellent throughput.
Gemini Robotics
Google DeepMind's Gemini model adapted for robotics. Leverages Gemini's multimodal understanding for zero-shot robot task planning and execution.
Gemma 2 27B
Google's open-source 27B model. Strong performance in reasoning and text generation, built with Google's research expertise.
Gemma 2 9B
Compact open-source model from Google. Excellent for on-device deployment and resource-constrained environments.
GPT-4o Mini
Small, fast, and affordable model for lightweight tasks. Great balance of speed and capability.
Grok 2
xAI's previous flagship model. Known for its witty personality, strong reasoning, and ability to handle nuanced questions.
Grok 3
xAI's flagship model. Strong at reasoning, coding, and real-time knowledge with web search capabilities.
Grok 3 Mini
Smaller, faster version of Grok 3. Excellent for quick responses and lower-cost applications while maintaining strong capabilities.
Helix
Figure AI's VLA model powering their humanoid robots. Combines language understanding with full-body motion planning for household and industrial tasks.
HunyuanVideo
Tencent's open-source video generation model. Strong visual quality with diverse style support.
Ideogram 2.0
Ideogram's 2.0 model, excelling at typography. Best-in-class text rendering in generated images.
Ideogram 3.0
Exceptional at rendering text in images. Great for logos, posters, and designs with typography.
Jina Embeddings v3
Jina AI's latest embedding model with task-specific adapters. Supports flexible dimensions and multiple retrieval tasks.
Kling 1.5
Kuaishou's video generation model. Creates high-quality videos with good motion consistency and diverse styles.
Kling 1.6
Kuaishou's video generation model. Professional quality with strong motion coherence.
Llama 3.1 405B
Meta's largest open-source model. 405 billion parameters delivering frontier-class performance on reasoning, coding, and multilingual tasks.
Llama 3.1 70B
Meta's highly capable 70B open-source model. Great balance of performance and efficiency for a wide range of tasks.
Llama 3.1 8B
Meta's compact 8B model. Surprisingly capable for its size, perfect for fast inference, edge deployment, and cost-sensitive applications.
Llama 3.3 70B
Meta's open-source 70B parameter model. Strong all-around performance with multilingual support.
Llama 4 Scout
Meta's next-generation Llama 4 model optimized for efficiency. Built on a new architecture with improved reasoning and instruction following.
LLARVA
Vision-Language-Action model using LLM backbones for structured robot action prediction. Bridges language models and low-level robot control.
LLaVA 1.6 34B
Open-source multimodal model combining language and vision. Strong visual understanding with conversational capabilities.
Luma Dream Machine
Luma AI's video generation model. Fast, high-quality video creation with strong physics simulation.
Minimax Video
MiniMax's video generation model. Fast, high-quality video output with text-to-video capabilities.
Minimax Video-01
MiniMax's video generation model supporting up to 720p resolution. Good for short-form video content creation.
Mistral Large
Mistral's flagship model. Strong reasoning, multilingual, and coding capabilities.
Mistral Large 2
Mistral's most capable model. 123B parameters with strong reasoning, multilingual support, and function calling. Great for complex enterprise tasks.
Mistral Medium
Mid-range model from Mistral AI offering a good balance of performance and cost for most business applications.
Mistral Nemo
12B open-weight model by Mistral and NVIDIA. Compact but capable, ideal for on-device or self-hosted deployments.
Mistral Small
Mistral's efficient small model. Fast and cost-effective for straightforward tasks like classification, text generation, and RAG.
MusicGen Large
Meta's open-source music generation model. Creates high-quality music from text descriptions with control over style, tempo, and instruments.
o1 Mini
Smaller, faster version of OpenAI's o1 reasoning model. Optimized for STEM tasks with lower latency and cost.
Octo
Open-source generalist robot policy from UC Berkeley. Supports multiple robot embodiments and can be fine-tuned for new tasks with minimal data.
OpenAI TTS-1
OpenAI's text-to-speech model. Six built-in voices with natural intonation.
OpenAI TTS-1 HD
OpenAI's high-definition TTS model. Better quality for production use cases.
Phi-4
Microsoft's small but mighty 14B model. Punches well above its weight class on reasoning, math, and coding benchmarks.
Pi0.5
Physical Intelligence's latest VLA model with improved generalization. Handles complex multi-step manipulation tasks with fewer demonstrations.
Pika 2.0
Pika's latest video model with improved motion quality and generation speed. User-friendly interface for video creation.
Pixtral Large
Mistral's vision-language model. 124B parameters with native image understanding, document analysis, and visual reasoning.
Playground v3
Playground AI's latest model focused on photorealistic image generation with strong aesthetic quality and prompt adherence.
Qwen 2.5 72B
Alibaba's powerful open-source model. Excellent at coding, math, and multilingual tasks.
Qwen 2.5 7B
Compact 7B model from Alibaba's Qwen series. Fast and efficient while maintaining strong multilingual and coding capabilities.
Qwen2.5-Coder 32B
Alibaba's specialized coding model. Strong performance on code benchmarks with support for many programming languages.
QwQ 32B
Alibaba's reasoning model. Uses chain-of-thought to solve complex math, logic, and coding problems. Open-weight alternative to o1.
Recraft V3
State-of-the-art image generation optimized for design and branding. SVG vector output support.
RoboFlamingo
Robotics adaptation of the Flamingo vision-language model. Few-shot learning for robot tasks using language-conditioned visuomotor policies.
RT-X
Cross-embodiment robotic foundation model from the Open X-Embodiment collaboration. Trained on data from 22 robot types for generalized manipulation.
SpatialVLA
VLA model with explicit 3D spatial reasoning. Uses depth perception and spatial understanding for more precise robotic manipulation.
Stable Audio
Stability AI's audio generation model. Creates music and sound effects from text prompts with customizable duration.
Stable Diffusion 3.5 Large
Stability AI's latest open-source image model. 8B parameter model with improved prompt adherence, typography, and photorealism.
Stable Diffusion XL
Stability AI's popular SDXL model. Widely adopted, extensive community support, and thousands of fine-tuned variants available.
Stable Diffusion XL
Stability AI's SDXL model via Replicate. High-quality image generation with extensive customization.
StarCoder2 15B
BigCode's open-source code model trained on The Stack v2. Supports 600+ programming languages with strong completion quality.
Text Embedding 3 Small
OpenAI's compact embedding model. 1536 dimensions, great for semantic search and RAG.
Udio V1.5
AI music generation with studio-quality output. Generate full songs with vocals, instruments, and production.
Yi Lightning
01.AI's fast inference model. Optimized for speed with competitive quality, ideal for real-time applications.
Start Building with AI
Access all models through a single API. Get free credits when you sign up — no credit card required.