OpenAI text-embedding-3-small
text-embedding-3-small is OpenAI's throughput-tuned embedding model: 1,536 default dimensions at roughly 6.5× lower cost per token than text-embedding-3-large ($0.02 vs $0.13 per million), while still beating the legacy text-embedding-ada-002 on MTEB. It is the right default when you are embedding tens of millions of chunks or running cost-sensitive semantic search at scale.
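A minimal sketch of calling the model, assuming the official `openai` Python SDK (v1+) and an `OPENAI_API_KEY` environment variable:

```python
# Sketch: embedding a batch of texts with text-embedding-3-small.
# Assumes the official `openai` SDK (v1+) and OPENAI_API_KEY in the environment.
MODEL = "text-embedding-3-small"

def embed(texts: list[str]) -> list[list[float]]:
    """Return one 1536-dim vector per input string (default dimensions)."""
    from openai import OpenAI  # local import keeps the sketch importable offline
    client = OpenAI()          # picks up OPENAI_API_KEY automatically
    resp = client.embeddings.create(model=MODEL, input=texts)
    # Vectors come back in the same order as the inputs.
    return [item.embedding for item in resp.data]

# Usage (requires a live API key):
# vectors = embed(["semantic search", "vector databases"])
# len(vectors[0])  -> 1536
```

Passing a list rather than one string per request is the main throughput lever: the API accepts batched inputs in a single call.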
Model specs
- Vendor: OpenAI
- Family: text-embedding-3
- Released: 2024-01
- Context window: 8,191 tokens
- Modalities: text
- Input price: $0.02/M tokens
- Output price: n/a
- Pricing as of: 2026-04-20
Strengths
- ≈6.5× cheaper per million input tokens than text-embedding-3-large ($0.02 vs $0.13)
- Strong quality — still beats the legacy text-embedding-ada-002
- Matryoshka truncation lets you trade a few MTEB points for far smaller vectors
- Fast enough to embed millions of documents per hour
Limitations
- Lower MTEB score than text-embedding-3-large — use the large model for high-stakes retrieval
- Closed weights — cannot self-host
- Multilingual quality is decent but lags Cohere embed-v3 and Voyage-3 on some locales
Use cases
- Large-scale corpus ingestion for RAG
- Semantic search with tight cost ceilings
- Deduplication and near-duplicate detection
- Recommendation features in consumer apps
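For the deduplication use case, the model's side of the work is producing vectors; flagging near-duplicates is then a cosine-similarity comparison. A minimal sketch, using toy 3-dim vectors in place of real 1,536-dim embeddings; the 0.95 threshold is an illustrative assumption, not an OpenAI recommendation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_near_duplicate(a: list[float], b: list[float], threshold: float = 0.95) -> bool:
    # Threshold is corpus-dependent; tune it on labeled duplicate pairs.
    return cosine_similarity(a, b) >= threshold

# Toy vectors standing in for real embeddings:
print(is_near_duplicate([1.0, 0.0, 0.0], [0.99, 0.01, 0.0]))  # True
print(is_near_duplicate([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))    # False
```

At corpus scale you would not compare all pairs; an approximate-nearest-neighbour index does the candidate generation, and cosine checks like this confirm matches.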
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MTEB (English, avg) | ≈62.3 | 2024-01 |
| MIRACL (multilingual) | ≈44.0 | 2024-01 |
Frequently asked questions
What is text-embedding-3-small?
text-embedding-3-small is OpenAI's cost-optimised embedding model, producing 1536-dimensional vectors. It replaces text-embedding-ada-002 for most workloads and is the recommended default for bulk ingestion and cost-sensitive RAG.
How does text-embedding-3-small compare with ada-002?
It scores higher on MTEB (≈62.3 vs ≈61.0 English average) and costs a fifth as much per million tokens ($0.02 vs $0.10) compared with ada-002. OpenAI recommends new projects use text-embedding-3-small or text-embedding-3-large rather than the legacy model.
How much does text-embedding-3-small cost?
As of April 2026, text-embedding-3-small costs roughly USD 0.02 per million input tokens on the OpenAI API, making it viable for embedding very large corpora.
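The cost arithmetic is simple enough to sanity-check in a few lines; the corpus size and chunk length below are illustrative assumptions:

```python
PRICE_PER_M_TOKENS = 0.02  # USD per million input tokens, as of 2026-04-20

def embedding_cost_usd(total_tokens: int) -> float:
    """Estimated embedding cost at the listed rate."""
    return total_tokens / 1_000_000 * PRICE_PER_M_TOKENS

# Example: 10M chunks averaging 500 tokens each = 5B tokens.
tokens = 10_000_000 * 500
print(f"${embedding_cost_usd(tokens):,.2f}")  # $100.00
```

At this rate, even a ten-million-chunk corpus embeds for about a hundred dollars, which is what makes the model viable for bulk ingestion.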
Can I shrink text-embedding-3-small vectors?
Yes — the model supports a dimensions parameter via Matryoshka representation learning, letting you request shorter vectors (for example 512 or 256 dims) to save storage with modest quality impact.
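With the API, shrinking is one parameter on the request; OpenAI's embeddings guide also notes you can truncate a full-length vector client-side and re-normalize it to unit length. A sketch of the client-side path:

```python
import math

def truncate_and_renormalize(vec: list[float], dims: int) -> list[float]:
    """Shorten a Matryoshka embedding and restore unit length."""
    short = vec[:dims]
    norm = math.sqrt(sum(x * x for x in short))
    return [x / norm for x in short]

# Server-side, the same effect is a single parameter (sketch):
# client.embeddings.create(model="text-embedding-3-small",
#                          input=text, dimensions=512)
```

Re-normalizing matters because cosine similarity assumes comparable vector norms; a truncated-but-unnormalized vector skews similarity scores.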
Sources
- OpenAI — New embedding models — accessed 2026-04-20
- OpenAI — Embeddings guide — accessed 2026-04-20