Text Embedding 3 Large
OpenAI's most powerful embedding model. 3072 dimensions for maximum accuracy.
Text Embedding 3 Large is an embeddings AI model from OpenAI, priced at €1.30 per 1M input tokens, with an 8,191-token context window.
Examples
See what Text Embedding 3 Large can generate
FAQ Matching
Input:
"How do I reset my password?"
Similar matches:
Steps to change your account password (96%)
I forgot my login credentials (91%)
Account recovery and password reset guide (89%)
Code Search
Input:
"React useEffect cleanup function memory leak"
Similar matches:
Preventing memory leaks in React component lifecycle (93%)
useEffect return function for subscription cleanup (90%)
React hooks best practices for side effects (84%)
Pricing
API Integration
Use our OpenAI-compatible API to integrate Text Embedding 3 Large into your application.
npm install railwail

import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const vectors = await rw.run("text-embedding-3-large", "Hello world", { type: "embed" });
console.log(vectors[0].length); // embedding dimensions
// Or use the embed() method for full control
const res = await rw.embed("text-embedding-3-large", ["Hello", "World"]);
for (const item of res.data) {
  console.log(item.embedding.length);
}
Deep dive: OpenAI's Text Embedding 3 Large
OpenAI was founded in December 2015 by a group that included Sam Altman, Elon Musk, Greg Brockman, Ilya Sutskever, Wojciech Zaremba and John Schulman, and restructured around the capped-profit OpenAI LP in 2019. Its embedding line started with the first-generation text-similarity, text-search and code-search models in January 2022, was unified into text-embedding-ada-002 in December 2022 (for a time among the most widely used embedding models) and was succeeded in January 2024 by text-embedding-3-small and text-embedding-3-large. The v3 release was OpenAI's first to support Matryoshka-style dimension reduction (shortened vectors of a caller-chosen size, 256 to 3,072 dimensions for the large model, produced by the same model). text-embedding-3-large beat ada-002 by more than 20 percentage points on the MIRACL multilingual retrieval benchmark (54.9% versus 31.4%) at a similar price.
OpenAI text-embedding-3-large is the flagship embedding model in the v3 generation. It produces 3,072-dimensional vectors by default with full Matryoshka representation support, so callers can request any dimension between 256 and 3,072 in the API and receive a truncated, renormalised vector without needing a separate model. The model accepts up to 8,191 tokens per input and was trained with a contrastive retrieval objective on a curated multilingual web corpus, including search-query/document pairs and large-scale instruction-tuned pairs. It scores 64.6% on MTEB and 54.9% on MIRACL multilingual retrieval, up from 61.0% and 31.4% respectively for ada-002. OpenAI's list price is $0.00013 per 1K tokens ($0.13 per 1M). Output vectors are L2-normalised. The model is closed-source and hosted only. OpenAI has not published a technical paper for the v3 family beyond a launch blog post.
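The truncate-and-renormalise step described above can also be sketched client-side. This is a minimal illustration, assuming the vector comes from a Matryoshka-trained model; `truncateEmbedding` is a hypothetical helper, not part of any SDK, and its output is not guaranteed to match bit-for-bit what the API returns for the same requested dimension.

```javascript
// Sketch only: shorten an embedding by keeping its leading components,
// then rescale it back to unit length.
// truncateEmbedding is a hypothetical helper name, not an SDK function.
function truncateEmbedding(vector, dims) {
  if (!Number.isInteger(dims) || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer between 1 and ${vector.length}`);
  }
  // Keep only the first `dims` components.
  const head = vector.slice(0, dims);
  // Rescale to unit L2 norm so cosine similarity stays a plain dot product.
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) {
    throw new RangeError("cannot normalise a zero vector");
  }
  return head.map((x) => x / norm);
}

// Toy 4-dimensional unit vector standing in for a real 3,072-dim embedding.
const fullVector = [0.5, 0.5, 0.5, 0.5];
const shortVector = truncateEmbedding(fullVector, 2);
console.log(shortVector.length); // 2
```

Truncating without renormalising would leave vectors shorter than unit length, which skews dot-product scores, so the two steps belong together.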
- Parameters: Undisclosed
- Context: 8.2K tokens
- 3,072-dim vectors with Matryoshka truncation to any 256-3072 size
- 8,191-token context window for long-document embedding
- Multilingual coverage across 100+ languages
- Strong MIRACL multilingual retrieval (54.9%, up from 31.4% for ada-002)
- L2-normalised vectors with cosine similarity
- Drop-in upgrade from ada-002 via the same /embeddings endpoint
- Best for: production RAG, multilingual search, retrieval-heavy SaaS
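Because the vectors are L2-normalised, cosine similarity reduces to a dot product. The sketch below shows how similarity rankings like those in the Examples section could be computed once embeddings are in hand. The helper names and the toy 2-dimensional vectors are illustrative assumptions, not real model output.

```javascript
// Sketch only: rank candidates against a query by dot product.
// Assumes every vector is already unit length (as the model's output is),
// so the dot product equals the cosine similarity.
function dotProduct(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// rankBySimilarity is a hypothetical helper; candidates are
// { text, vector } objects. Returns a new array, highest score first.
function rankBySimilarity(queryVector, candidates) {
  return candidates
    .map((c) => ({ text: c.text, score: dotProduct(queryVector, c.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy unit vectors standing in for real embeddings.
const queryVector = [1, 0];
const candidateDocs = [
  { text: "orthogonal", vector: [0, 1] },
  { text: "identical", vector: [1, 0] },
  { text: "close", vector: [0.6, 0.8] },
];
const rankedDocs = rankBySimilarity(queryVector, candidateDocs);
console.log(rankedDocs.map((r) => r.text)); // [ 'identical', 'close', 'orthogonal' ]
```

For large collections a vector database or approximate nearest-neighbour index would normally replace this linear scan.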
Training data: Not disclosed. OpenAI describes a 'curated multilingual web corpus' with search-query/document pairs and instruction-tuned pairs.
License: Proprietary commercial API. Generated embeddings may be stored and used commercially under the OpenAI Usage Policy.
Known limitations
- Closed weights, hosted only
- Cannot fine-tune the model
- 3,072-dim full vectors are storage-heavy without truncation
- Weaker than Cohere's v3 embedding models on some low-resource languages
- 8,191-token cap may force chunking for very long documents
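To put the storage limitation in numbers, here is a rough back-of-the-envelope calculation. It assumes each component is stored as a 4-byte float32 and ignores index and metadata overhead, both of which vary by vector store. `storageBytes` is a hypothetical helper.

```javascript
// Sketch only: raw storage for a collection of embeddings.
// Assumption: 4-byte float32 components, no index or metadata overhead.
const BYTES_PER_FLOAT32 = 4;

function storageBytes(vectorCount, dims) {
  return vectorCount * dims * BYTES_PER_FLOAT32;
}

// Compare the full size with two truncated sizes for one million vectors.
for (const dims of [3072, 1024, 256]) {
  const gigabytes = storageBytes(1e6, dims) / 1e9;
  console.log(`${dims} dims: ${gigabytes.toFixed(2)} GB per million vectors`);
}
```

A single full-size vector takes 12,288 bytes under these assumptions, so truncating to 256 dimensions cuts raw storage by a factor of 12.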
Frequently asked questions
Related Models
Voyage AI voyage-3
Voyage's general-purpose embedding model. 1024 dims, 32k context, strong retrieval performance.
Cohere embed-multilingual-v3
Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.
Jina Embeddings v3 (Multilingual)
Jina's frontier multilingual embedding model. 570M params, 8192 ctx, 89 languages, Matryoshka dims 128-1024.
Text Embedding 3 Small
OpenAI's compact embedding model. 1536 dimensions, great for semantic search and RAG.
Start using Text Embedding 3 Large today
Get started with free credits. No credit card required. Access Text Embedding 3 Large and 100+ other models through a single API.