Text Embedding 3 Small
OpenAI's compact embedding model. 1536 dimensions, great for semantic search and RAG.
Text Embedding 3 Small is an embedding model from OpenAI, priced at €0.200 per 1M input tokens, with an 8,191-token context window.
Examples
See what Text Embedding 3 Small can generate
Semantic Search
Input:
"How to deploy a Node.js app to production"
Similar matches:
Deploying Node applications to cloud servers (94%)
Setting up a Node.js production environment (89%)
Node.js deployment best practices and CI/CD (85%)
Document Clustering
Input:
"Machine learning model training techniques"
Similar matches:
Deep learning optimization and hyperparameter tuning (91%)
Neural network training strategies for beginners (87%)
Supervised learning algorithms and model evaluation (83%)
Pricing
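As a rough illustration of the listed rate (€0.200 per 1M input tokens), the sketch below estimates the cost of embedding a corpus. The helper name `estimateCostEur` and the example corpus size are hypothetical and not part of any SDK; actual billing may differ.

```javascript
// Listed rate from this page: EUR 0.200 per 1M input tokens.
const EUR_PER_MILLION_TOKENS = 0.2;

// Hypothetical helper (not part of any SDK): linear cost estimate.
function estimateCostEur(tokenCount) {
  if (!Number.isFinite(tokenCount) || tokenCount < 0) {
    throw new RangeError("tokenCount must be a non-negative number");
  }
  return (tokenCount / 1e6) * EUR_PER_MILLION_TOKENS;
}

// Assumed example corpus: 10,000 documents averaging 500 tokens each.
const totalTokens = 10000 * 500; // 5M tokens
console.log(`Estimated cost: EUR ${estimateCostEur(totalTokens).toFixed(2)}`);
```

Only input tokens are counted here, since an embedding call has no generated output to bill.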
API Integration
Use our OpenAI-compatible API to integrate Text Embedding 3 Small into your application.
npm install railwail

import railwail from "railwail";
const rw = railwail("YOUR_API_KEY");
const vectors = await rw.run("text-embedding-3-small", "Hello world", { type: "embed" });
console.log(vectors[0].length); // embedding dimensions
// Or use the embed() method for full control
const res = await rw.embed("text-embedding-3-small", ["Hello", "World"]);
for (const item of res.data) {
  console.log(item.embedding.length);
}
Deep dive: OpenAI's Text Embedding 3 Small
OpenAI was founded in December 2015 by a group including Sam Altman, Elon Musk, Greg Brockman, Ilya Sutskever, Wojciech Zaremba and John Schulman, and restructured into a capped-profit company in 2019. Its embedding line started with the first-generation -001 models (for example text-similarity-ada-001, announced in January 2022), was unified into text-embedding-ada-002 (December 2022) and was replaced in January 2024 by the v3 generation. text-embedding-3-small is the smaller, cheaper sibling of text-embedding-3-large: it beats ada-002 on MTEB (62.3% vs 61.0%) and by a wider margin on MIRACL (44.0% vs 31.4%), at one fifth the price ($0.00002 per 1k tokens). Together with the large variant it was the first OpenAI embedding model to support Matryoshka-style dimension shortening, allowing callers to choose a vector size between 256 and 1,536 without retraining and with only a modest loss in quality.
OpenAI's text-embedding-3-small is the entry-level embedding model in the v3 generation. It produces 1,536-dimensional vectors by default and supports the same Matryoshka shortening as the large variant, so callers can request any dimension between 256 and 1,536 in the API and OpenAI will truncate and renormalise on the fly. Maximum input length is 8,191 tokens. Training used a contrastive retrieval objective on a curated multilingual web corpus including search-query/document pairs. The model scores around 62.3% on MTEB and 44.0% on MIRACL multilingual retrieval, a clear step up from 61.0% and 31.4% for ada-002, while being roughly five times cheaper and faster. It is positioned as the default embedding model for cost-sensitive production RAG and large-scale search. Like OpenAI's other embedding models, it is closed-source and hosted only.
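The truncate-and-renormalise step described above can also be sketched client-side, for example when full-size vectors are already stored. This is a minimal illustration under assumptions: `shortenEmbedding` is a hypothetical helper, the 4-dimensional vector is toy data, and nothing here claims to reproduce OpenAI's server-side implementation. OpenAI's own embeddings endpoint accepts a `dimensions` parameter for the v3 models; whether the railwail client forwards it is not stated on this page.

```javascript
// Euclidean (L2) length of a vector.
function l2Norm(vec) {
  return Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
}

// Hypothetical helper: keep the first `dims` components, then rescale
// the result back to unit length so cosine similarity stays meaningful.
function shortenEmbedding(embedding, dims) {
  if (!Number.isInteger(dims) || dims < 1 || dims > embedding.length) {
    throw new RangeError("dims must be an integer between 1 and the vector length");
  }
  const head = embedding.slice(0, dims);
  const norm = l2Norm(head);
  if (norm === 0) return head; // all-zero prefix: nothing to rescale
  return head.map((x) => x / norm);
}

// Toy 4-dim unit vector standing in for a 1,536-dim embedding.
const full = [0.6, 0.0, 0.8, 0.0];
const short = shortenEmbedding(full, 2);
console.log(short, l2Norm(short)); // shortened vector and its (unit) length
```

With real embeddings the call would look like `shortenEmbedding(vector, 512)`. Vectors shortened to different sizes are not comparable with each other, so an index should use one size throughout.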
- Parameters: Undisclosed (smaller than text-embedding-3-large)
- Context: 8.2K tokens
- 1,536-dim vectors with Matryoshka truncation to any 256-1,536 size
- 8,191-token context window for long-document embedding
- Multilingual coverage across 100+ languages
- $0.00002 per 1k tokens (~5x cheaper than ada-002 and ~6.5x cheaper than text-embedding-3-large)
- Drop-in upgrade from ada-002 via the same /embeddings endpoint
- L2-normalised vectors with cosine similarity
- Best for: cost-sensitive RAG, large-scale search, embeddings at high QPS
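The cosine-similarity scoring mentioned in the list above can be sketched as follows. The 3-dimensional vectors and document ids are toy placeholders rather than real model output, and `rankBySimilarity` is a hypothetical helper, not part of any SDK.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity; for L2-normalised vectors this equals the dot product.
function cosineSimilarity(a, b) {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Hypothetical helper: score every document against the query,
// then sort from most to least similar.
function rankBySimilarity(queryVec, docs) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVec, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy data: in practice these vectors would come from the embedding API.
const query = [1, 0, 0];
const docs = [
  { id: "a", embedding: [0, 1, 0] },
  { id: "b", embedding: [1, 0, 0] },
  { id: "c", embedding: [1, 1, 0] },
];
const ranked = rankBySimilarity(query, docs);
console.log(ranked.map((r) => `${r.id}: ${(r.score * 100).toFixed(0)}%`));
```

A brute-force scan like this is fine for small collections; larger ones typically use a vector index instead.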
Training data: Not disclosed. OpenAI describes a 'curated multilingual web corpus' with search-query/document pairs.
License: Proprietary commercial API. Generated embeddings may be stored and used commercially under the OpenAI Usage Policy.
Known limitations
- Closed weights, hosted only
- Cannot fine-tune the model
- Lower MTEB / MIRACL scores than text-embedding-3-large
- Multilingual quality below Cohere v3 for some low-resource languages
- 8,191-token cap may force chunking for long documents
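One way to work within the 8,191-token cap is to split a long document into overlapping windows and embed each window separately. The sketch below assumes tokenisation has already happened elsewhere and simply operates on an array of token ids; `chunkTokens` and the overlap of 200 are illustrative choices, not recommendations from OpenAI.

```javascript
// Input limit stated on this page for text-embedding-3-small.
const MAX_TOKENS = 8191;

// Hypothetical helper: split a token array into windows of at most
// `maxTokens`, where each window repeats the last `overlap` tokens
// of the previous one.
function chunkTokens(tokens, maxTokens = MAX_TOKENS, overlap = 0) {
  if (!Number.isInteger(maxTokens) || maxTokens < 1) {
    throw new RangeError("maxTokens must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= maxTokens) {
    throw new RangeError("overlap must be an integer in [0, maxTokens)");
  }
  const chunks = [];
  const step = maxTokens - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    // Stop once this window has reached the end of the input.
    if (start + maxTokens >= tokens.length) break;
  }
  return chunks;
}

// Fake token ids standing in for a tokenised 20,000-token document.
const fakeTokens = Array.from({ length: 20000 }, (_, i) => i);
const chunks = chunkTokens(fakeTokens, MAX_TOKENS, 200);
console.log(chunks.map((c) => c.length));
```

Splitting on token boundaries ignores sentence and paragraph structure; a production chunker would usually prefer natural break points.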
Related Models
Text Embedding 3 Large
OpenAI's most powerful embedding model. 3072 dimensions for maximum accuracy.
Voyage AI voyage-3
Voyage's general-purpose embedding model. 1024 dims, 32k context, strong retrieval performance.
Cohere embed-multilingual-v3
Cohere's multilingual embedding model. Supports 100+ languages with separate search and classification modes.
Jina Embeddings v3 (Multilingual)
Jina's frontier multilingual embedding model. 570M params, 8192 ctx, 89 languages, Matryoshka dims 128-1024.