OpenAI text-embedding-3-small

text-embedding-3-small is OpenAI's cost- and throughput-optimised embedding model: 1,536-dimensional vectors by default, roughly five times cheaper than text-embedding-3-large, and still ahead of the legacy text-embedding-ada-002 on quality. It is the right default when you are embedding tens of millions of chunks or running cost-sensitive semantic search at scale.
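As a sketch of what calling the model looks like, the helper below only assembles the request parameters for the official `openai` Python client's `client.embeddings.create()` endpoint; the actual API call (shown in comments) needs an API key and network access, and the helper name is illustrative, not part of the SDK.

```python
# Sketch: assembling a request for the OpenAI embeddings endpoint.
# Only the parameter-building step runs here; the client call in the
# comments below is what you would execute in a real application.

def build_embedding_request(texts, dimensions=None):
    """Return kwargs for client.embeddings.create()."""
    kwargs = {"model": "text-embedding-3-small", "input": texts}
    if dimensions is not None:
        # Matryoshka shortening: ask the API for smaller vectors directly.
        kwargs["dimensions"] = dimensions
    return kwargs

# Usage with the official client (not executed in this sketch):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.embeddings.create(**build_embedding_request(["hello world"], 512))
#   vector = resp.data[0].embedding  # list[float] of length 512
```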

Model specs

Vendor: OpenAI
Family: text-embedding-3
Released: 2024-01
Context window: 8,191 tokens
Modalities: text
Input price: $0.02 per 1M tokens
Output price: n/a
Pricing as of: 2026-04-20

Strengths

  • ≈5× cheaper per million tokens than text-embedding-3-large
  • Strong quality — still beats the legacy text-embedding-ada-002
  • Matryoshka truncation lets you trade a few MTEB points for far smaller vectors
  • Fast enough to embed millions of documents per hour

Limitations

  • Lower MTEB score than text-embedding-3-large — use the large model for high-stakes retrieval
  • Closed weights — cannot self-host
  • Multilingual quality is decent but lags Cohere embed-v3 and Voyage-3 on some locales

Use cases

  • Large-scale corpus ingestion for RAG
  • Semantic search with tight cost ceilings
  • Deduplication and near-duplicate detection
  • Recommendation features in consumer apps
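The deduplication use case above reduces to pairwise cosine similarity over the embedding vectors. A minimal sketch, assuming vectors already returned by the model (OpenAI embeddings come back unit-normalised, so cosine equals the dot product, but the full formula below works for any vectors) and a hypothetical 0.95 threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def near_duplicates(vectors, threshold=0.95):
    """Return index pairs whose similarity meets the threshold."""
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

The brute-force double loop is O(n²); at corpus scale you would swap it for an approximate nearest-neighbour index, but the similarity test itself is unchanged.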

Benchmarks

Benchmark | Score | As of
MTEB (English, avg) | ≈62.3 | 2024-01
MIRACL (multilingual) | ≈44.0 | 2024-01

Frequently asked questions

What is text-embedding-3-small?

text-embedding-3-small is OpenAI's cost-optimised embedding model, producing 1536-dimensional vectors. It replaces text-embedding-ada-002 for most workloads and is the recommended default for bulk ingestion and cost-sensitive RAG.

How does text-embedding-3-small compare with ada-002?

It is both higher-quality on MTEB and cheaper per million tokens than ada-002. OpenAI recommends new projects use text-embedding-3-small or text-embedding-3-large rather than the legacy ada-002 model.

How much does text-embedding-3-small cost?

As of April 2026, text-embedding-3-small costs roughly USD 0.02 per million input tokens on the OpenAI API, making it viable for embedding very large corpora.

Can I shrink text-embedding-3-small vectors?

Yes — the model supports a dimensions parameter via Matryoshka representation learning, letting you request shorter vectors (for example 512 or 256 dims) to save storage with modest quality impact.
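If you already hold full-length 1,536-dimensional vectors, OpenAI's guidance is that you can also shorten them client-side: keep the leading coordinates and re-normalise to unit length. A minimal sketch:

```python
import math

def truncate_embedding(vec, dims):
    """Matryoshka-style client-side shortening: keep the first `dims`
    coordinates, then re-normalise so cosine similarity stays meaningful."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

Requesting the shorter size via the `dimensions` parameter at embed time is equivalent and saves transfer and storage from the start.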

Sources

  1. OpenAI — New embedding models — accessed 2026-04-20
  2. OpenAI — Embeddings guide — accessed 2026-04-20