NV-Embed v2

NV-Embed v2, released by NVIDIA Research in late 2024, is a 7B-parameter open-weights text-embedding model built on Mistral 7B. It adds a latent-attention pooling layer on top of the base model and is trained with a two-stage contrastive instruction-tuning recipe. It reached the top of the MTEB English leaderboard at launch and remains a common choice for high-accuracy retrieval when a model much larger than a few hundred million parameters is acceptable.
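
To make the pooling step more concrete, here is a minimal single-head sketch of latent-attention pooling in plain Python. It assumes the token hidden states act as queries over a small latent array that serves as both keys and values, and that the attended outputs are then mean-pooled into one vector. The scaling by the square root of the dimension, the toy two-dimensional inputs, and the function names are illustrative assumptions; the real model's learned projections, multiple heads, and MLP are omitted.

```python
import math


def softmax(row):
    # Numerically stable softmax over a list of floats.
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def latent_attention_pool(hidden_states, latents):
    """Pool token vectors into one embedding via attention over a latent array.

    hidden_states: list of token vectors (used as queries).
    latents: list of latent vectors (used as both keys and values).
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)  # assumed scaling, as in standard attention
    attended = []
    for token in hidden_states:
        # Attention scores of this token against every latent vector.
        scores = [scale * sum(t * l for t, l in zip(token, latent))
                  for latent in latents]
        weights = softmax(scores)
        # Weighted sum of latent vectors (the "values").
        attended.append([sum(w * latent[j] for w, latent in zip(weights, latents))
                         for j in range(dim)])
    # Mean-pool the attended token outputs into a single vector.
    count = len(attended)
    return [sum(vec[j] for vec in attended) / count for j in range(dim)]


# Toy inputs: three "tokens" and two "latents", all hypothetical values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
latents = [[1.0, 0.0], [0.0, 1.0]]
pooled = latent_attention_pool(tokens, latents)
print(pooled)
```

The point of the design is that every token contributes to the final vector through the latent array, rather than the embedding depending on a single end-of-sequence token or an unweighted average.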

Model specs

Vendor: NVIDIA
Family: NV-Embed
Released: 2024-10
Context window: 32,768 tokens
Modalities: text

Strengths

Leading MTEB English retrieval scores at release
Long 32K-token context support for document-level embeddings
Open weights under a non-commercial license (CC BY-NC 4.0), suitable for research use

Limitations

7B size is heavier than MiniLM/E5 embeddings for CPU deployments
English-centric, with weaker multilingual coverage than multilingual E5
Latency higher than small-transformer encoders
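
The size limitation can be put in rough numbers with simple arithmetic. The sketch below estimates the memory needed just to hold the weights, assuming a round 7 billion parameters (the "7B" figure above) and the usual bytes-per-parameter for each precision. It ignores activations and framework overhead, so real usage will be higher; `weight_memory_gb` is a hypothetical helper name.

```python
# Bytes used to store one parameter at each numeric precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Return the memory (in GB, 1 GB = 1e9 bytes) needed for the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 7_000_000_000  # assumed round figure for a "7B" model

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.0f} GB")
# float32: 28 GB
# float16: 14 GB
# int8: 7 GB
```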

Use cases

High-accuracy RAG over large document corpora
Semantic search on GPU hosts where a 7B LLM-based encoder is acceptable
Classification and clustering pipelines needing strong embeddings
Research baselines for instruction-tuned embeddings
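
The retrieval and semantic-search use cases above come down to ranking passages by the similarity between a query embedding and each passage embedding. A minimal sketch in plain Python follows. The three-dimensional vectors are made-up stand-ins for real model output, and `rank_passages` is a hypothetical helper, not part of any NV-Embed API.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs, top_k=2):
    """Return (passage_index, score) pairs for the top_k most similar passages."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(passage_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [(idx, score) for score, idx in scored[:top_k]]


# Hypothetical embeddings; a real pipeline would get these from the model.
query = [0.9, 0.1, 0.0]
passages = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0],
]

top = rank_passages(query, passages)
print(top)  # passage 1 ranks first, then passage 2
```

For large corpora the linear scan would normally be replaced by an approximate nearest-neighbor index, but the scoring idea stays the same.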

Benchmarks

Benchmark	Score	As of
MTEB English average	≈72 (leader at launch)	2024-10
BEIR	state-of-the-art open-weights	2024-10

Frequently asked questions

What is NV-Embed v2?

NV-Embed v2 is NVIDIA Research's open-weights English embedding model, built on Mistral 7B and trained with a two-stage contrastive instruction-tuning recipe. It topped the MTEB English leaderboard in late 2024.

When should I pick NV-Embed v2 over smaller embeddings?

Pick NV-Embed v2 when you can afford GPU inference and need the highest retrieval quality. For CPU-only pipelines or very large-scale indexing, smaller encoders such as all-mpnet-base-v2 or E5 are cheaper to run.

What context window does NV-Embed v2 support?

NV-Embed v2 inherits Mistral 7B's long-context support and can embed passages up to roughly 32K tokens, useful for long-document retrieval.
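
When a document is longer than the window, one common workaround is to split its tokens into overlapping chunks and embed each chunk separately. The sketch below assumes the 32,768-token limit from the specs and works on an already-tokenized list of IDs. `split_into_windows` and the overlap of 256 tokens are illustrative choices, not part of the model's tooling.

```python
def split_into_windows(token_ids, max_tokens=32_768, overlap=256):
    """Split a token list into chunks of at most max_tokens, overlapping by `overlap`."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be non-negative and smaller than max_tokens")
    step = max_tokens - overlap
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # this window already reaches the end of the document
        start += step
    return windows


# A stand-in "document" of 70,000 token IDs.
doc = list(range(70_000))
chunks = split_into_windows(doc)
print([len(chunk) for chunk in chunks])  # [32768, 32768, 4976]
```

Each chunk can then be embedded on its own, with the chunk vectors either indexed separately or averaged, depending on the retrieval setup.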

Sources

arXiv — NV-Embed paper — accessed 2026-04-20
Hugging Face — nvidia/NV-Embed-v2 — accessed 2026-04-20