Curiosity · AI Model
NV-Embed v2
NV-Embed v2, released by NVIDIA Research in late 2024, is a 7B open-weights text-embedding model built on Mistral 7B with a latent-attention pooling layer and a two-stage contrastive + instruction-tuning recipe. It reached the top of the MTEB English leaderboard at launch and remains a common choice for high-accuracy retrieval where an embedding model larger than a few hundred million parameters is acceptable.
Model specs
- Vendor
- NVIDIA
- Family
- NV-Embed
- Released
- 2024-10
- Context window
- 32,768 tokens
- Modalities
- text
Strengths
- Leading MTEB English retrieval scores at release
- Long 32K-token context support for document-level embeddings
- Open weights under research-friendly license
Limitations
- 7B size is heavier than MiniLM/E5 embeddings for CPU deployments
- English-centric — weaker multilingual coverage than multilingual E5
- Latency higher than small-transformer encoders
Use cases
- High-accuracy RAG over large document corpora
- Semantic search on GPU hosts where an encoder LLM is acceptable
- Classification and clustering pipelines needing strong embeddings
- Research baselines for instruction-tuned embeddings
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MTEB English average | ≈72 (leader at launch) | 2024-10 |
| BEIR | state-of-the-art open-weights | 2024-10 |
Frequently asked questions
What is NV-Embed v2?
NV-Embed v2 is NVIDIA Research's open-weights English embedding model, built on Mistral 7B and trained with a two-stage contrastive plus instruction-tuning recipe that topped the MTEB leaderboard in late 2024.
When should I pick NV-Embed v2 over smaller embeddings?
Pick NV-Embed v2 when you can afford GPU inference and need the highest retrieval quality. For CPU pipelines or extreme scale, smaller encoders like all-mpnet-base-v2 or E5 are cheaper.
What context window does NV-Embed v2 support?
NV-Embed v2 inherits Mistral 7B's long-context support and can embed passages up to roughly 32K tokens, useful for long-document retrieval.
Sources
- arXiv — NV-Embed paper — accessed 2026-04-20
- Hugging Face — nvidia/NV-Embed-v2 — accessed 2026-04-20