Codestral
Mistral's code-specialized model. Optimized for code generation, completion, and understanding across 80+ languages.
Codestral is a code AI model from Mistral AI, priced at €3.00 per 1M input tokens, with a 256K-token context window.
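As a rough illustration of what that input price means in practice, the sketch below converts a token count into an estimated input cost. The helper name `estimateInputCostEur` is hypothetical, and only the input-side rate quoted above is modelled, since no output rate is given here.

```typescript
// Input price quoted on this page: EUR 3.00 per 1M input tokens.
const INPUT_PRICE_EUR_PER_MILLION = 3.0;

// Hypothetical helper: estimates the input-side cost only.
// Output tokens are not covered because no output rate is stated here.
function estimateInputCostEur(inputTokens: number): number {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative finite number");
  }
  return (inputTokens / 1000000) * INPUT_PRICE_EUR_PER_MILLION;
}

// A prompt that fills a 256,000-token window once.
console.log(estimateInputCostEur(256000).toFixed(3)); // prints "0.768"
```

Under these assumptions, a single full-window prompt costs well under one euro on the input side.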
Examples
See what Codestral can generate
Write a type-safe Express middleware for JWT authentication in TypeScript
import { Request, Response, NextFunction } from 'express';
import jwt from 'jsonwebtoken';

interface JwtPayload {
  userId: string;
  email: string;
  role: 'admin' | 'user';
}

declare global {
  namespace Express {
    interface Request {
      user?: JwtPayload;
    }
  }
}

export const authenticate = (
  req: Request,
  res: Response,
  next: NextFunction
): void => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) {
    res.status(401).json({ error: 'No token provided' });
    return;
  }
  try {
    const decoded = jwt.verify(
      token,
      process.env.JWT_SECRET!
    ) as JwtPayload;
    req.user = decoded;
    next();
  } catch (err) {
    res.status(403).json({ error: 'Invalid or expired token' });
  }
};
Write a Rust function that reads a JSON config file and returns a parsed struct with proper error handling
use serde::Deserialize;
use std::fs;
use std::path::Path;
use thiserror::Error;

#[derive(Debug, Deserialize)]
pub struct AppConfig {
    pub host: String,
    pub port: u16,
    pub database_url: String,
    pub max_connections: Option<u32>,
}

#[derive(Error, Debug)]
pub enum ConfigError {
    #[error("Failed to read config file: {0}")]
    IoError(#[from] std::io::Error),
    #[error("Failed to parse config JSON: {0}")]
    ParseError(#[from] serde_json::Error),
    #[error("Config file not found: {0}")]
    NotFound(String),
}

pub fn load_config(path: &str) -> Result<AppConfig, ConfigError> {
    if !Path::new(path).exists() {
        return Err(ConfigError::NotFound(path.to_string()));
    }
    let contents = fs::read_to_string(path)?;
    let config: AppConfig = serde_json::from_str(&contents)?;
    Ok(config)
}

// Usage:
// let config = load_config("config.json")?;
Pricing
API Integration
Use our OpenAI-compatible API to integrate Codestral into your application.
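Because the API is described as OpenAI-compatible, a request presumably follows the familiar chat-completions shape (`model`, `messages`, plus optional `temperature` and `max_tokens`). The sketch below only builds that JSON body. The helper `buildChatRequest` and its types are hypothetical, and the endpoint URL and authentication header are left out because they are not stated on this page.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatOptions {
  temperature?: number;
  max_tokens?: number;
}

interface ChatRequest extends ChatOptions {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: accepts either a bare prompt or a message history
// and returns a chat-completions style request body for the "codestral" model.
function buildChatRequest(
  prompt: string | ChatMessage[],
  options: ChatOptions = {}
): ChatRequest {
  const messages: ChatMessage[] =
    typeof prompt === "string" ? [{ role: "user", content: prompt }] : prompt;
  return { model: "codestral", messages, ...options };
}

// The resulting object can be serialised and sent by any HTTP client.
const body = buildChatRequest("Reverse a linked list in Go.", {
  temperature: 0.2,
  max_tokens: 500,
});
console.log(JSON.stringify(body, null, 2));
```

Keeping the body construction separate from transport makes it easy to swap between a raw HTTP call and the SDK shown here.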
npm install railwail
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple: just pass a string
const reply = await rw.run("codestral", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("codestral", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("codestral", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
Deep dive: Mistral AI's Codestral
Mistral AI was founded in April 2023 in Paris by Arthur Mensch (CEO, formerly of DeepMind), Guillaume Lample and Timothée Lacroix (both formerly of Meta FAIR and co-authors of the LLaMA papers). Mistral has built a frontier-scale European LLM lab with a portfolio that mixes fully open-weight releases (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Mistral Small) and commercial closed-API models (Mistral Large, Mistral Embed, Mistral Saba). The company has raised over €1B from investors including Andreessen Horowitz, General Catalyst, Lightspeed, Salesforce, Nvidia and Microsoft, with a 2024 valuation of around €6B. Codestral was released in May 2024 as Mistral's first dedicated code model: a 22B dense transformer trained on 80+ programming languages, with native fill-in-the-middle support for IDE integrations. A successor, Codestral 25.01, followed in January 2025.
Codestral-22B is a 22B dense decoder-only transformer using Mistral's standard architecture: 56 layers, a hidden size of 6,144, 48-head grouped-query attention with 8 KV heads, sliding-window attention (4,096-token window), RoPE positional embeddings with theta=1M (for long-context support), SwiGLU activations and RMSNorm. The tokeniser is the Mistral BPE with 32,768 entries plus added fill-in-the-middle special tokens (`[PREFIX]`, `[SUFFIX]`, `[MIDDLE]`). Training combined a standard left-to-right autoregressive objective with FIM training for IDE-grade code completion. The training corpus covers 80+ programming languages drawn from public code repositories, plus natural-language code-documentation pairs and instruction data for the chat and explain capabilities. It was released in May 2024 under the Mistral AI Non-Production License (MNPL), which permits open-weights research and evaluation but not commercial production use without a separate license or hosted API access.
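To make the grouped-query attention figures concrete, the following back-of-the-envelope sketch estimates key/value cache size from the numbers in the paragraph above. It rests on three assumptions that are not stated there: the head dimension equals hidden size divided by attention heads, the cache stores 2 bytes per element (fp16 or bf16), and "32K" means 32 × 1,024 tokens.

```typescript
// Figures quoted in the architecture paragraph above.
const layers = 56;
const hiddenSize = 6144;
const attentionHeads = 48;
const kvHeads = 8;
const slidingWindow = 4096;

// Assumption: "32K" context is read as 32 * 1024 tokens.
const contextTokens = 32 * 1024;

// Assumption: head dimension = hidden size / attention heads.
const headDim = hiddenSize / attentionHeads; // 128

// Assumption: 2 bytes per cached element (fp16 or bf16).
const bytesPerElement = 2;

// Keys and values are both cached, hence the leading factor of 2.
function kvCacheBytes(tokens: number, heads: number = kvHeads): number {
  return 2 * layers * heads * headDim * bytesPerElement * tokens;
}

const GiB = 1024 * 1024 * 1024;

// Cache for one sequence at the full 32K context with 8 KV heads.
const fullContextGiB = kvCacheBytes(contextTokens) / GiB; // 7

// Cache if only the 4,096-token sliding window is retained.
const windowedGiB = kvCacheBytes(slidingWindow) / GiB; // 0.875

// Same full context without grouping (one KV head per attention head).
const ungroupedGiB = kvCacheBytes(contextTokens, attentionHeads) / GiB; // 42

console.log({ headDim, fullContextGiB, windowedGiB, ungroupedGiB });
```

Under these assumptions, sharing 8 KV heads across 48 query heads cuts the per-sequence cache by a factor of six, which is the main practical payoff of grouped-query attention at inference time.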
- Parameters: 22B (dense)
- Context: 32K tokens
- 22B dense transformer purpose-built for code
- Native fill-in-the-middle (FIM) for IDE autocomplete
- Trained on 80+ programming languages
- 32K context window
- Competitive with CodeLlama-34B and DeepSeek-Coder-33B at smaller size
- Strong on mainstream languages: Python, JS/TS, Java, C/C++, Go, Rust, Bash, SQL, PHP, Swift
- Open weights available under MNPL (research only)
- Best for: IDE code completion, code generation, explanation and refactoring across mainstream languages.
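The fill-in-the-middle feature listed above can be pictured as splitting a file at the cursor and handing the model both halves. The sketch below is illustrative only: the helpers `splitAtCursor` and `buildFimPrompt` are hypothetical, and the suffix-first ordering of the special tokens is an assumption rather than a confirmed template. A real tokenizer would also insert these markers as special token IDs, not as literal text, so the exact format should be taken from Mistral's tokenizer documentation.

```typescript
interface FimParts {
  prefix: string; // code before the cursor
  suffix: string; // code after the cursor
}

// Hypothetical helper: splits source text at a cursor offset.
function splitAtCursor(source: string, cursor: number): FimParts {
  if (!Number.isInteger(cursor) || cursor < 0 || cursor > source.length) {
    throw new RangeError("cursor must be an integer offset within the source");
  }
  return { prefix: source.slice(0, cursor), suffix: source.slice(cursor) };
}

// Hypothetical helper: lays out the two halves around the FIM markers.
// Assumption: suffix first, then prefix, with the model generating the middle.
// [MIDDLE] is omitted because its role in the template is not described here.
function buildFimPrompt(parts: FimParts): string {
  return `[SUFFIX]${parts.suffix}[PREFIX]${parts.prefix}`;
}

// Cursor sits right after "return ", where the editor wants a completion.
const source = "function add(a: number, b: number) {\n  return ;\n}";
const cursor = source.indexOf("return ") + "return ".length;
const parts = splitAtCursor(source, cursor);

console.log(JSON.stringify(buildFimPrompt(parts)));
```

The point of the layout is that the model sees the code after the cursor before it starts generating, which is what distinguishes FIM completion from plain left-to-right continuation.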
Trained on trillions of tokens (exact figure not disclosed) of public code repositories across 80+ programming languages, natural-language code-documentation pairs, and instruction-style code data for chat and explain capabilities. Knowledge cutoff approximately early 2024. Training combined autoregressive and fill-in-the-middle objectives.
License: Mistral AI Non-Production License (MNPL). Open weights are available for research, evaluation and personal use; commercial production requires a Mistral commercial license or hosted API access (La Plateforme, AWS Bedrock, Azure AI Studio).
Known limitations
- MNPL blocks commercial production use of open weights
- Smaller than successor Codestral 25.01 and below DeepSeek-Coder V2 on top benchmarks
- 32K context shorter than newer code models (DeepSeek-Coder V2: 128K)
- Weaker on rare languages and DSLs
- Not as strong on agentic / repo-level reasoning as longer-context code models
- No multimodal input (no image-of-code understanding)
Frequently asked questions
Related Models
DeepSeek Coder V2
DeepSeek's specialized coding model. Excellent at code generation, debugging, and explanation.
Granite Code 20B
IBM Granite 20B Code Instruct. Larger Granite code model balancing quality and inference cost for enterprise CI/CD code-review automation.
Granite Code 34B
IBM Granite 34B Code Instruct. Largest Granite code-instruction model. Top-tier among Apache-2.0 code LLMs on HumanEval, MBPP and MultiPL-E.
Granite Code 3B
IBM Granite 3B Code Instruct. Apache-2.0 small code-instruction model. Strong on Python, Java, JavaScript and Go for enterprise IDE integrations.
Start using Codestral today
Get started with free credits. No credit card required. Access Codestral and 100+ other models through a single API.