Cohere Rerank 3
Cohere Rerank 3 is a dedicated reranking model that scores (query, document) pairs to produce high-precision relevance rankings. It sits after a fast first-stage retriever (BM25 or an embedding model) and reorders the top candidates, typically lifting NDCG@10 by several points while adding little latency.
Model specs
- Vendor: Cohere
- Family: Rerank 3
- Released: 2024-04
- Context window: 4,096 tokens
- Modalities: text
- Input price: $2/M tok
- Output price: n/a
- Pricing as of: 2026-04-20
Strengths
- Meaningful precision gains over embedding-only retrieval
- Multilingual — 100+ languages supported
- Simple API — pass query, docs, top_n and get scores
- Cheap relative to its impact on end-user relevance
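The "pass query, docs, top_n and get scores" pattern can be sketched as below. This is a minimal illustration, not the official client code: it assumes a client object with a `rerank(model=..., query=..., documents=..., top_n=...)` method whose response has a `results` list of items carrying `index` and `relevance_score`, and it assumes the model id `rerank-multilingual-v3.0`. `StubClient` is a hypothetical stand-in so the sketch runs without an API key.

```python
from types import SimpleNamespace

DEFAULT_MODEL = "rerank-multilingual-v3.0"  # assumed model id; confirm in the Cohere docs


def rerank(client, query, documents, top_n=5, model=DEFAULT_MODEL):
    """Return (document, relevance_score) pairs, highest score first."""
    response = client.rerank(model=model, query=query, documents=documents, top_n=top_n)
    # Each result is assumed to expose the input index and a relevance score.
    return [(documents[r.index], r.relevance_score) for r in response.results]


class StubClient:
    """Hypothetical offline stand-in that mimics the assumed response shape."""

    def rerank(self, model, query, documents, top_n):
        terms = set(query.lower().split())
        results = [
            SimpleNamespace(
                index=i,
                relevance_score=len(terms & set(doc.lower().split())) / max(len(terms), 1),
            )
            for i, doc in enumerate(documents)
        ]
        results.sort(key=lambda r: r.relevance_score, reverse=True)
        return SimpleNamespace(results=results[:top_n])
```

With the Cohere Python SDK, `client` would be something like `cohere.Client(api_key)` in place of `StubClient()`; the stub's word-overlap score is only a placeholder for the model's relevance score.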
Limitations
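- Per-pair scoring is O(N) per query, so cost and latency grow linearly with the candidate count; narrow with a first-stage retriever instead of reranking 10 000 candidates
- Priced per million tokens, which adds up on long documents; chunk strategically
- Closed weights; for self-hosting, consider an open-weight reranker such as bge-reranker-v2-m3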
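One way to "chunk strategically" is to split long documents into overlapping windows that stay well under the context window, score each chunk, and keep the best chunk score per document. The sketch below is an assumption-laden illustration: it counts whitespace-separated words as a rough proxy for tokens, and max-aggregation is a common heuristic, not something the source attributes to Cohere. The function names are hypothetical.

```python
def chunk_words(text, max_words=300, overlap=50):
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, max(len(words), 1), step):
        window = words[start:start + max_words]
        if window:
            chunks.append(" ".join(window))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def best_chunk_score(chunk_scores):
    """Collapse per-chunk relevance scores into one document score (max)."""
    return max(chunk_scores) if chunk_scores else 0.0
```

Smaller chunks reduce the tokens billed per pair but increase the number of pairs, so the window size is worth tuning against your own corpus.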
Use cases
- Second-stage reranking in RAG pipelines
- Search result reordering for enterprise portals
- Multilingual reranking over 100+ languages
- Agent tool retrieval — picking the right tool from a large toolbox
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| BEIR NDCG@10 uplift vs BM25 | +8 to +15 points | 2024 |
| Latency per 100 docs | <200 ms | 2024 |
Frequently asked questions
What is Cohere Rerank 3?
Cohere Rerank 3 is a cross-encoder model that takes a query and a list of candidate documents and returns relevance scores, letting you reorder retrieval results for higher top-k precision in a RAG pipeline.
How much does Rerank 3 improve retrieval?
On BEIR-style benchmarks, a cross-encoder reranker typically adds 8–15 NDCG@10 points over BM25-only retrieval and 3–8 points over a strong dense retriever. Exact gains depend on your corpus and baseline.
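To measure the gain on your own corpus, NDCG@10 can be computed before and after reranking. The sketch below uses the linear-gain form of DCG and, as a simplifying assumption, treats the ranked list as containing every judged document, so the ideal ordering is just the same labels sorted. A "point" here means 0.01 of NDCG.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels in ranked order: 1 = relevant, 0 = not relevant.
before = [0, 1, 0, 1, 0]   # first-stage order
after = [1, 1, 0, 0, 0]    # order after reranking
uplift_points = 100 * (ndcg_at_k(after) - ndcg_at_k(before))
```

Averaging this uplift over a held-out set of labelled queries gives a figure comparable to the ranges quoted above.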
How many candidates should I rerank?
Typical patterns retrieve the top 50–200 candidates from a first-stage retriever and rerank them with Rerank 3 to pick the final top 5–10. Reranking 10 000 candidates is expensive and rarely improves quality versus narrowing first.
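The retrieve-then-rerank pattern can be sketched as a two-stage function. This is an illustration only: `overlap_score` is a toy lexical scorer standing in for BM25 or an embedding retriever, and `rerank_fn` is a hypothetical callable that takes a query and a list of documents and returns one score per document (in practice, a wrapper around the reranker API).

```python
def overlap_score(query, doc):
    """Toy first-stage score: fraction of query terms present in the document."""
    terms = set(query.lower().split())
    return len(terms & set(doc.lower().split())) / max(len(terms), 1)


def retrieve_then_rerank(query, corpus, rerank_fn, first_stage_k=100, final_n=5):
    """Shortlist with a cheap scorer, then reorder the shortlist with rerank_fn."""
    # Stage 1: cheap scoring over the whole corpus, keep the top first_stage_k.
    candidates = sorted(
        corpus, key=lambda doc: overlap_score(query, doc), reverse=True
    )[:first_stage_k]
    # Stage 2: expensive scoring over the shortlist only.
    scores = rerank_fn(query, candidates)
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:final_n]
```

The reranker only ever sees `first_stage_k` documents per query, which is what keeps cost and latency bounded regardless of corpus size.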
How much does Cohere Rerank 3 cost?
As of April 2026, Rerank 3 is priced per 1000 searches or per million tokens, depending on plan. On the Cohere API, the rate is roughly USD 2 per million tokens of reranked text.
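A back-of-the-envelope estimate under per-token pricing can be sketched as follows. It uses the USD 2 per million tokens figure quoted above and assumes, as a simplification not confirmed by the source, that each (query, document) pair is billed for the query tokens plus the document tokens. The function and parameter names are hypothetical.

```python
PRICE_PER_MILLION_TOKENS_USD = 2.0  # figure quoted above; check current pricing


def estimate_rerank_cost(queries, candidates_per_query, avg_doc_tokens,
                         avg_query_tokens=20,
                         price_per_million=PRICE_PER_MILLION_TOKENS_USD):
    """Estimate total rerank spend in USD under per-token pricing."""
    # Assumption: every (query, document) pair is billed for query + document tokens.
    tokens_per_query = candidates_per_query * (avg_doc_tokens + avg_query_tokens)
    total_tokens = queries * tokens_per_query
    return total_tokens / 1_000_000 * price_per_million


# 10,000 queries, 100 candidates each, about 500 tokens per pair:
monthly_cost = estimate_rerank_cost(10_000, 100, avg_doc_tokens=480)
```

Under these assumptions the example comes to 500 million tokens, or USD 1,000; halving either the candidate count or the chunk length halves the bill.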
Sources
- Cohere — Rerank 3 announcement — accessed 2026-04-20
- Cohere — Rerank docs — accessed 2026-04-20