Cohere embed-multilingual-v3

Custom
Embeddings

Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.

TL;DR · Last updated May 16, 2026

Cohere embed-multilingual-v3 is an embeddings AI model from Custom, priced at €0.100 per 1M input tokens with a 512-token context window.

Try Cohere embed-multilingual-v3
Direct API access coming soon

Pricing

Price per Generation
Per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Cohere embed-multilingual-v3 into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

const vectors = await rw.run("cohere-embed-multilingual-v3", "Hello world", { type: "embed" });
console.log(vectors[0].length); // embedding dimensions

// Or use the embed() method for full control
const res = await rw.embed("cohere-embed-multilingual-v3", ["Hello", "World"]);
for (const item of res.data) {
  console.log(item.embedding.length);
}
Specifications
Context window: 512 tokens
Developer: Custom
Category: Embeddings
Supported formats: text
Tags: cohere, embedding, multilingual, search

Deep dive: Cohere's embed-multilingual-v3

About Cohere
Founded 2019 · Toronto, Canada

Cohere was founded in 2019 in Toronto by Aidan Gomez (CEO), Nick Frosst and Ivan Zhang. Gomez co-authored the original Transformer paper, 'Attention Is All You Need' (2017), while at Google Brain; Frosst is a former mentee of Geoffrey Hinton. The company focuses on enterprise-grade large language models, with particular emphasis on retrieval, RAG, multilingual coverage and data sovereignty. Cohere has raised over $970M from investors including Inovia Capital, NVIDIA, Oracle, Salesforce Ventures, PSP Investments and the Canadian government's Strategic Innovation Fund, at a 2024 valuation of $5.5B. The Embed v3 family launched in November 2023 and remains one of the top-ranked commercial embedding models on the MTEB and BEIR retrieval leaderboards, especially for multilingual workloads.

Visit Cohere →
Architecture
Bi-encoder Transformer trained with a contrastive retrieval objective

Cohere embed-multilingual-v3 is a bi-encoder Transformer that encodes text into a 1024-dimensional dense vector for retrieval. It is the multilingual sibling of embed-english-v3 and supports more than 100 languages with cross-lingual semantic alignment, so a German query can retrieve a relevant English document. Maximum input length is 512 tokens (~2,000 characters); longer documents must be chunked. The model was trained with a contrastive InfoNCE objective on a curated mix of multilingual question-answer pairs, web-search query-document pairs and licensed corpora, with low-quality web data deliberately down-weighted. A signature feature is the input_type parameter, which lets the caller mark the input as 'search_document', 'search_query', 'classification' or 'clustering' to route it through different projection heads tuned for each use case. The 1024-dim vectors are L2-normalised, so they can be compared with cosine similarity directly. Cohere also offers quantised int8 and binary output for cheaper vector storage.
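Because each input is capped at 512 tokens (~2,000 characters), longer documents have to be split before they are embedded. The following is a minimal sketch of one way to do that, assuming a character budget is an acceptable rough proxy for the token limit. The chunkText helper and its 2,000-character default are hypothetical and not part of any SDK; a real tokenizer would be needed for exact token counts.

```typescript
// Hypothetical helper: split text into chunks of at most maxChars characters,
// breaking on whitespace where possible. Character count is only a rough
// stand-in for the 512-token limit; actual token counts depend on the tokenizer.
function chunkText(text: string, maxChars: number = 2000): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  let current = "";

  for (const word of words) {
    // A single word longer than the budget is hard-split into pieces.
    if (word.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      for (let i = 0; i < word.length; i += maxChars) {
        chunks.push(word.slice(i, i + maxChars));
      }
      continue;
    }

    const candidate = current ? current + " " + word : word;
    if (candidate.length > maxChars) {
      // Adding this word would exceed the budget: close the current chunk.
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Each returned chunk can then be passed as one element of the input array in an embed call. Overlapping chunks or sentence-aware splitting may retrieve better, but that is a design choice outside this sketch.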

Parameters: Undisclosed
Context: 512 tokens
What it can do
  • 100+ languages with strong cross-lingual retrieval (DE query, EN doc)
  • input_type parameter to specialise the embedding for query, document, classification or clustering
  • 1024-dim L2-normalised vectors, cosine similarity
  • int8 and binary quantised outputs for cheap vector storage
  • Top-tier MTEB and BEIR retrieval scores for multilingual workloads
  • Available on Cohere API, Amazon Bedrock, Oracle Cloud, Azure AI Studio
  • Best for: multilingual RAG, cross-lingual search, enterprise knowledge bases
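Retrieval with these embeddings comes down to ranking document vectors by cosine similarity to a query vector. The sketch below illustrates the idea with toy 3-dimensional vectors standing in for real 1024-dimensional embeddings. The cosineSimilarity and rankBySimilarity helpers are hypothetical names introduced for illustration. For unit-length vectors the cosine equals the plain dot product, but the full formula is used here so the code also works on unnormalised input.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query, highest score first.
function rankBySimilarity(
  queryVec: number[],
  docVecs: number[][]
): { index: number; score: number }[] {
  return docVecs
    .map((doc, index) => ({ index, score: cosineSimilarity(queryVec, doc) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors (assumed values, not real model output).
const toyQuery = [1, 0, 0];
const toyDocs = [
  [0, 1, 0],     // orthogonal to the query
  [0.9, 0.1, 0], // close to the query
  [-1, 0, 0],    // opposite direction
];
const ranked = rankBySimilarity(toyQuery, toyDocs);
console.log(ranked.map((r) => r.index)); // [ 1, 0, 2 ]
```

In a real pipeline the query would be embedded with input_type 'search_query' and the documents with 'search_document', and a vector database would usually replace the linear scan.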
Training & License

Contrastive training on a curated mix of multilingual QA pairs, search query-document pairs and licensed corpora. Exact token count not disclosed.

License: Proprietary commercial API. Also available on Amazon Bedrock and Oracle Cloud under separate licensing.

Known limitations
  • Hard cap of 512 tokens per input
  • 1024-dim vectors more expensive to store than 384-dim alternatives
  • Closed weights; no on-premise deployment outside of Bedrock / Oracle
  • Cross-lingual retrieval still weaker for very low-resource languages
  • input_type parameter required for best quality
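The storage trade-off in the list above can be made concrete with some quick arithmetic. This is a rough sketch, assuming uncompressed vectors are stored as 4-byte float32 values, int8 uses one byte per dimension and binary packs eight dimensions into one byte. The bytesPerVector and storageMiB helpers are hypothetical, and the figures ignore any index or metadata overhead a vector database adds.

```typescript
type VectorFormat = "float32" | "int8" | "binary";

// Assumed bytes per dimension for each storage format.
const BYTES_PER_DIM: Record<VectorFormat, number> = {
  float32: 4,
  int8: 1,
  binary: 1 / 8,
};

// Raw bytes needed to store one vector of `dims` dimensions.
function bytesPerVector(dims: number, format: VectorFormat): number {
  return Math.ceil(dims * BYTES_PER_DIM[format]);
}

// Raw storage in MiB for a collection of vectors (no index overhead).
function storageMiB(numVectors: number, dims: number, format: VectorFormat): number {
  return (numVectors * bytesPerVector(dims, format)) / (1024 * 1024);
}

console.log(bytesPerVector(1024, "float32")); // 4096
console.log(bytesPerVector(384, "float32"));  // 1536
console.log(bytesPerVector(1024, "int8"));    // 1024
console.log(bytesPerVector(1024, "binary"));  // 128
console.log(storageMiB(1000000, 1024, "float32")); // 3906.25
```

Under these assumptions a 1024-dim float32 vector takes roughly 2.7 times the space of a 384-dim one, while the binary form of the same 1024-dim vector is 32 times smaller than float32, at some cost in retrieval quality.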
