Voyage AI voyage-code-3
Voyage's code-specialized embedding model. Up to 32k context, Matryoshka 256-2048 dims, int8/binary support.
Voyage AI voyage-code-3 is an embedding model from Voyage AI, priced at €0.180 per 1M input tokens, with a 32K-token context window.
Pricing
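Because pricing is per input token, the cost of an embedding job scales linearly with corpus size. The helper below is a minimal sketch of that arithmetic; the function name and the 50-million-token corpus are hypothetical, and the rate is the €0.180 per 1M input tokens listed on this page.

```javascript
// Listed price from this page: EUR 0.180 per 1,000,000 input tokens.
const PRICE_EUR_PER_MILLION_TOKENS = 0.18;

// Hypothetical helper: estimate the cost of embedding `inputTokens` tokens.
function estimateEmbeddingCostEur(inputTokens, pricePerMillion = PRICE_EUR_PER_MILLION_TOKENS) {
  if (!Number.isFinite(inputTokens) || inputTokens < 0) {
    throw new RangeError("inputTokens must be a non-negative number");
  }
  return (inputTokens / 1_000_000) * pricePerMillion;
}

// Example: a hypothetical repository of 50 million tokens.
console.log(estimateEmbeddingCostEur(50_000_000).toFixed(2)); // prints 9.00
```

Re-embedding after every change multiplies this figure, so incremental indexing of changed files only is usually the cheaper design.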
API Integration
Use our OpenAI-compatible API to integrate Voyage AI voyage-code-3 into your application.
npm install railwail

import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const vectors = await rw.run("voyage-code-3", "Hello world", { type: "embed" });
console.log(vectors[0].length); // embedding dimensions
// Or use the embed() method for full control
const res = await rw.embed("voyage-code-3", ["Hello", "World"]);
for (const item of res.data) {
  console.log(item.embedding.length);
}

Deep dive: Voyage AI's voyage-code-3
Voyage AI was founded in 2023 by Stanford CS professor Tengyu Ma and team, focused on best-in-class retrieval and reranking models for RAG, with a particular emphasis on domain-specific variants. The company has shipped specialised embeddings for finance (voyage-finance-2), law (voyage-law-2), multilingual (voyage-multilingual-2) and code (voyage-code-2 then voyage-code-3). voyage-code-3 launched in December 2024 as the successor to voyage-code-2 and quickly became the top-scoring code embedding model on the CoIR and CodeSearchNet benchmarks. In February 2025 Voyage AI was acquired by MongoDB for $220M, with voyage-code-3 now integrated into MongoDB Atlas Vector Search and recommended for code-aware AI agents and IDE-style retrieval workloads.
Voyage AI voyage-code-3 is a hosted embedding model specialised for source code, technical documentation, commit messages, issues, pull-request reviews and code-mixed natural language. It has the same 32,000-token context window as voyage-3 and supports Matryoshka-style heads at 256 / 512 / 1,024 / 2,048 dimensions, plus int8 / binary quantisation. Training used a contrastive retrieval objective on curated pairs covering more than 30 programming languages (Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, C#, SQL, Bash, etc.) together with technical natural-language text, with deliberate emphasis on code-to-text and text-to-code retrieval as well as code-clone detection. Voyage reports that voyage-code-3 outperforms OpenAI text-embedding-3-large by 13.80% on average across a suite of 32 code retrieval datasets. The model is offered through the Voyage API and, following the February 2025 acquisition, natively in MongoDB Atlas Vector Search.
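Matryoshka-style embeddings are trained so that a leading slice of the vector is itself a usable embedding. The sketch below shows the client-side version of that idea: keep the first N components of a full vector and rescale to unit length so that dot products remain cosine similarities. The helper names are hypothetical and the input vectors are synthetic stand-ins, not real model output; a hosted API may also return the shorter vector directly.

```javascript
// Dimensions this page lists for voyage-code-3's Matryoshka heads.
const SUPPORTED_DIMS = [256, 512, 1024, 2048];

// Keep the first `dims` components and rescale the result to unit length.
function truncateEmbedding(embedding, dims) {
  if (!SUPPORTED_DIMS.includes(dims) || dims > embedding.length) {
    throw new RangeError(`unsupported target dimension: ${dims}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.hypot(...head);
  if (norm === 0) throw new RangeError("cannot normalise a zero vector");
  return head.map((x) => x / norm);
}

// For unit-length vectors the dot product equals the cosine similarity.
function dot(a, b) {
  if (a.length !== b.length) throw new RangeError("length mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Synthetic stand-in vectors, only used to exercise the helpers.
const fakeQuery = Array.from({ length: 2048 }, (_, i) => Math.sin(i + 1));
const fakeDoc = Array.from({ length: 2048 }, (_, i) => Math.sin(i + 1.1));

const q256 = truncateEmbedding(fakeQuery, 256);
const d256 = truncateEmbedding(fakeDoc, 256);
console.log(q256.length, dot(q256, d256).toFixed(3));
```

Query and document vectors must be truncated to the same dimension before they are compared; mixing a 256-dimension query with a 1,024-dimension index is a length mismatch, which `dot` rejects.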
- Parameters: Undisclosed
- Context: 32K tokens
- Top-tier code embedding model on CoIR and CodeSearchNet
- 30+ programming languages including Python, JavaScript, TypeScript, Go, Rust, Java
- Text-to-code and code-to-text retrieval (e.g. find function from docstring)
- 32,000-token context window for full-file embedding
- Matryoshka heads at 256 / 512 / 1024 / 2048 dimensions
- int8 / binary quantisation for cheap storage
- MongoDB Atlas Vector Search integration
- Best for: code-aware AI agents, IDE retrieval, repository search, ticket-to-code linking
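The storage savings from lower dimensions and quantisation can be worked out directly. The sketch below computes bytes per vector and shows one common binarisation scheme: one sign bit per component, compared by Hamming distance. The helper names are hypothetical, and sign-based packing is an assumption used for illustration; this page does not describe the exact encoding the hosted API returns.

```javascript
// Bits used per component for each storage precision.
const BITS_PER_COMPONENT = new Map([["float32", 32], ["int8", 8], ["binary", 1]]);

// Bytes needed to store one vector at a given dimension and precision.
function bytesPerVector(dims, dtype) {
  const bits = BITS_PER_COMPONENT.get(dtype);
  if (bits === undefined) throw new RangeError(`unknown dtype: ${dtype}`);
  return Math.ceil((dims * bits) / 8);
}

// Sign-based binarisation: 1 bit per component, packed 8 per byte, MSB first.
function binarize(embedding) {
  const packed = new Uint8Array(Math.ceil(embedding.length / 8));
  embedding.forEach((x, i) => {
    if (x > 0) packed[i >> 3] |= 0x80 >> (i & 7);
  });
  return packed;
}

// Hamming distance between two packed bit vectors (lower means more similar).
function hammingDistance(a, b) {
  if (a.length !== b.length) throw new RangeError("length mismatch");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = a[i] ^ b[i];
    while (diff) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

console.log(bytesPerVector(2048, "float32")); // prints 8192
console.log(bytesPerVector(1024, "int8")); // prints 1024
console.log(bytesPerVector(256, "binary")); // prints 32
```

Going from 2,048 float32 components to 256 binary components cuts storage from 8,192 bytes to 32 bytes per vector, a 256-fold reduction, at some cost in retrieval quality that should be measured on your own data.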
Training data: Not disclosed. Voyage describes 'curated pairs covering 30+ programming languages and technical natural-language text' with contrastive negatives.
License: Proprietary commercial API. Available standalone and bundled with MongoDB Atlas Vector Search.
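For the standalone route, a direct request to Voyage's hosted API might look like the sketch below. The endpoint URL and the `input_type`, `output_dimension` and `output_dtype` fields are assumptions based on Voyage's public embeddings API and should be checked against the current API reference before use. The helper names and the API key value are hypothetical.

```javascript
// Assumed endpoint of Voyage's hosted embeddings API; verify before use.
const VOYAGE_EMBEDDINGS_URL = "https://api.voyageai.com/v1/embeddings";

// Build the fetch() arguments without sending anything.
function buildEmbeddingRequest(apiKey, texts, { inputType = "document", dims = 1024, dtype = "float" } = {}) {
  return {
    url: VOYAGE_EMBEDDINGS_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "voyage-code-3",
        input: texts,
        input_type: inputType, // assumed values: "query" or "document"
        output_dimension: dims, // assumed values: 256, 512, 1024 or 2048
        output_dtype: dtype, // assumed values include "float", "int8", "binary"
      }),
    },
  };
}

// Send the request (needs a runtime with a global fetch, such as Node 18+).
async function embed(apiKey, texts, options) {
  const { url, init } = buildEmbeddingRequest(apiKey, texts, options);
  const response = await fetch(url, init);
  if (!response.ok) throw new Error(`embedding request failed: ${response.status}`);
  const payload = await response.json();
  return payload.data.map((item) => item.embedding);
}
```

Keeping request construction separate from the network call makes the payload easy to inspect and unit-test without spending tokens.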
Known limitations
- Closed weights, hosted only
- Cannot fine-tune externally
- Optimised for code; general-domain retrieval slightly below voyage-3
- Higher latency than locally run code embedding models, since every request is a network call
- Coverage of niche / DSL languages weaker than mainstream ones
Frequently asked questions
Related Models
Text Embedding 3 Large
OpenAI's most powerful embedding model. 3072 dimensions for maximum accuracy.
Voyage AI voyage-3
Voyage's general-purpose embedding model. 1024 dims, 32k context, strong retrieval performance.
Cohere embed-multilingual-v3
Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.
Jina Embeddings v3 (Multilingual)
Jina's frontier multilingual embedding model. 570M params, 8192 ctx, 89 languages, Matryoshka dims 128-1024.