Code Models
AI-powered coding assistants for development
Code generation models for autocomplete, review, and refactors
Code-generation models are large language models trained or fine-tuned specifically on source code. They power IDE autocomplete, PR review, automated refactoring, test generation, and cross-language translation. Reach for a code model over a general text model when you want stronger correctness on programming tasks and structured outputs (diffs, JSON) that play well with developer tooling.
13 models available
Codestral
Mistral's code-specialized model. Optimized for code generation, completion, and understanding across 80+ languages.
DeepSeek Coder V2
DeepSeek's specialized coding model. Excellent at code generation, debugging, and explanation.
Granite Code 20B
IBM Granite 20B Code Instruct. Larger Granite code model balancing quality and inference cost for enterprise CI/CD code-review automation.
Granite Code 34B
IBM Granite 34B Code Instruct. Largest Granite code-instruction model. Top-tier among Apache-2.0 code LLMs on HumanEval, MBPP and MultiPL-E.
Granite Code 3B
IBM Granite 3B Code Instruct. Apache-2.0 small code-instruction model. Strong on Python, Java, JavaScript and Go for enterprise IDE integrations.
Granite Code 8B
IBM Granite 8B Code Instruct. Trained on permissively-licensed code, strong on multi-language code completion and instruction-following.
Magicoder S CL 7B
UIUC Magicoder S CL 7B. CodeLlama-7B fine-tuned with OSS-Instruct synthetic data. Strong HumanEval Plus and MBPP Plus performance per parameter.
Phind CodeLlama 34B v2
Phind CodeLlama 34B v2. Highly tuned CodeLlama variant focused on retrieval-augmented developer assistant workflows.
StarCoder2 15B
BigCode StarCoder2 15B code-generation flagship. Trained on 4T tokens of Stack v2 data with grouped-query attention and 16k context.
StarCoder2 3B
BigCode StarCoder2 3B code-generation model. Trained on 17 programming languages from The Stack v2. Released under the BigCode OpenRAIL-M license, which permits commercial use subject to its use restrictions.
StarCoder2 7B
BigCode StarCoder2 7B code-generation model. 16k context, trained on 17 programming languages from The Stack v2, strong fill-in-the-middle (FIM) performance.
WizardCoder 33B
WizardLM WizardCoder 33B v1.1. Evol-Instruct fine-tune of DeepSeek-Coder-33B with strong code-generation benchmark performance.
Yi-Coder 9B
01.AI Yi-Coder 9B chat model. Strong multilingual code completion and chat, 128k context, competitive with code-specialized models 2x its size.
Top code model picks
Hand-picked across four common criteria — resolved against the live catalog so the picks track price and performance changes.
Codestral (picked for three of the four criteria)
Mistral's code-specialized model. Optimized for code generation, completion, and understanding across 80+ languages.
DeepSeek Coder V2
DeepSeek's specialized coding model. Excellent at code generation, debugging, and explanation.
Pricing in code generation follows the same per-token model as general text. Flagship code models (GPT-5 Codex, Claude 4.6 Sonnet, Codestral) cost €1-€10 per million input tokens; budget tiers (Codestral Mamba, DeepSeek Coder, Qwen Coder) cost €0.05-€0.50 per million. A single IDE autocomplete request rarely runs more than a few thousand input tokens, so the per-call cost is a fraction of a cent on budget tiers and at most a few cents at the top of the flagship range. Bills grow when you ship agents that re-prompt themselves dozens of times per task.
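The arithmetic behind those per-call figures can be sketched as follows. This is a rough illustration, not a billing formula: the price points are the ends of the ranges quoted above, while the request size (3,000 input tokens) and the number of agent re-prompts (40) are assumptions chosen for the example. Output-token charges are ignored, and an agent's prompt is treated as constant in size even though it usually grows with each step.

```python
def cost_eur(input_tokens: int, price_per_million_eur: float, calls: int = 1) -> float:
    """Input-token cost in euros for `calls` requests of `input_tokens` each."""
    return input_tokens * calls * price_per_million_eur / 1_000_000


# Assumed workload sizes (not from any provider's documentation).
AUTOCOMPLETE_TOKENS = 3_000   # "a few thousand" input tokens per request
AGENT_CALLS = 40              # "dozens" of re-prompts per agent task

# Price points: the low and high ends of the ranges quoted in the text.
PRICE_POINTS = [
    ("budget, low end", 0.05),
    ("budget, high end", 0.50),
    ("flagship, low end", 1.00),
    ("flagship, high end", 10.00),
]

for label, price in PRICE_POINTS:
    single = cost_eur(AUTOCOMPLETE_TOKENS, price)
    agent = cost_eur(AUTOCOMPLETE_TOKENS, price, calls=AGENT_CALLS)
    print(f"{label:20s} single call: EUR {single:.5f}   agent task: EUR {agent:.4f}")
```

Under these assumptions a single request stays between a small fraction of a cent and a few cents, while the same request repeated 40 times by an agent costs 40 times as much, which is where the larger bills come from.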
The trade-off triangle is correctness, speed, and context. Flagships solve harder problems and follow project conventions more reliably, but they respond at 30-80 tokens/second, which feels slow inside a tight autocomplete loop. Fast budget models (Codestral Mamba, GPT-5 Mini) stream at 200+ tokens/second and feel native in the editor. For batch tasks (refactoring a whole repo, generating tests for fifty files), flagship correctness wins. For tight autocomplete loops, the fast tier wins.
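To see why throughput matters in the editor, the streaming time for one suggestion can be estimated from the tokens/second figures above. The sketch below assumes a 40-token inline suggestion and ignores network latency and time to first token; the task names passed to `pick_tier` are hypothetical labels, not identifiers from any API.

```python
def stream_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a completion, ignoring network and time-to-first-token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def pick_tier(task: str) -> str:
    """Routing rule from the text: latency-bound autocomplete goes to the
    fast tier, everything else (batch work) goes to a flagship."""
    return "fast" if task == "autocomplete" else "flagship"


COMPLETION_TOKENS = 40  # assumed size of one inline suggestion

for label, tps in [("flagship, slow end", 30),
                   ("flagship, fast end", 80),
                   ("fast tier", 200)]:
    seconds = stream_seconds(COMPLETION_TOKENS, tps)
    print(f"{label:20s} {seconds:.2f} s per suggestion")

print(pick_tier("autocomplete"), pick_tier("repo_refactor"))
```

At these rates a flagship takes roughly half a second to over a second per suggestion, while the fast tier finishes in about a fifth of a second, which is the difference a user feels while typing.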
Watch out for cross-file context: most autocomplete loops only send the current file. For real codebase-aware refactoring, you need a retrieval layer that pulls related files into the prompt. Tools like Cursor and Continue do this automatically; if you're rolling your own, embed the codebase first and retrieve the top 5-10 most relevant files per request.
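If you are rolling your own retrieval layer, the embed-then-retrieve loop might look something like the sketch below. It is a minimal illustration under stated assumptions: `embed` is a toy bag-of-identifiers stand-in for a real embedding model, the file contents in `FILES` are invented, and all function names are hypothetical. A production setup would swap in a real embedding model and a vector store.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: counts of identifier-like tokens."""
    return Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files: dict) -> dict:
    """Embed every file once, up front."""
    return {path: embed(source) for path, source in files.items()}


def retrieve(index: dict, query: str, k: int = 5) -> list:
    """Return the k file paths most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda path: cosine(q, index[path]), reverse=True)
    return ranked[:k]


def build_prompt(files: dict, index: dict, request: str, k: int = 5) -> str:
    """Prepend the top-k related files to the request."""
    sections = [f"# File: {path}\n{files[path]}" for path in retrieve(index, request, k)]
    return "\n\n".join(sections + [f"# Request\n{request}"])


# Invented three-file codebase for illustration only.
FILES = {
    "billing/invoice.py": "def total_invoice(items): return sum(i.price for i in items)",
    "billing/tax.py": "def apply_tax(amount, rate): return amount * (1 + rate)",
    "auth/login.py": "def login(user, password): return check(user, password)",
}

index = build_index(FILES)
print(retrieve(index, "rename total_invoice to invoice_total", k=1))
```

The index is built once and reused per request; only the query is embedded at request time. Setting `k` in the 5-10 range matches the guidance above, subject to the model's context window.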
Watch out for license contamination: a few open-weights code models were trained on permissively licensed code only; others swept in GPL code with unclear redistribution terms. If you're shipping generated code in a closed-source product, prefer commercial models with explicit code-license guarantees.
Top picks above cover the most correct flagship, the cheapest workhorse, the longest-context model, and the fastest autocomplete option.