NV-Embed v2

NV-Embed v2, released by NVIDIA Research in late 2024, is a 7B-parameter open-weights text-embedding model built on Mistral 7B. It adds a latent-attention pooling layer and is trained with a two-stage contrastive instruction-tuning recipe. It reached the top of the MTEB English leaderboard at launch and remains a common choice for high-accuracy retrieval when the cost of running a 7B-parameter embedding model is acceptable.
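To make the idea of latent-attention pooling concrete, here is a minimal, simplified sketch in plain Python. It assumes the general shape of the technique: token hidden states act as queries, a small latent array acts as both keys and values, and the attended outputs are mean-pooled into one vector. The function names and toy numbers are hypothetical, and the sketch leaves out learned projections, multiple heads, and the MLP that a real implementation would include; it is not NVIDIA's code.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def latent_attention_pool(hidden_states, latents):
    """Pool token states into one vector via attention over a latent array.

    hidden_states: list of per-token vectors (used as queries).
    latents: list of latent vectors (used as both keys and values).
    Simplification: no learned projections, no multi-head split, no MLP.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for token in hidden_states:
        # Scaled dot-product score of this token against every latent vector.
        scores = [scale * sum(t * k for t, k in zip(token, latent))
                  for latent in latents]
        weights = softmax(scores)
        # Weighted sum of the latent vectors (values) for this token.
        attended.append([
            sum(w * latent[i] for w, latent in zip(weights, latents))
            for i in range(dim)
        ])
    # Mean-pool the attended token vectors into a single embedding.
    n = len(attended)
    return [sum(vec[i] for vec in attended) / n for i in range(dim)]

# Toy example: 3 tokens, 2 latent vectors, dimension 4 (all values made up).
tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.3, 0.3]]
latents = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
embedding = latent_attention_pool(tokens, latents)
print(len(embedding))
```

The output has the dimensionality of the latent vectors regardless of how many tokens went in, which is what lets a variable-length passage map to a fixed-size embedding.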

Model specs

Vendor: NVIDIA
Family: NV-Embed
Released: 2024-10
Context window: 32,768 tokens
Modalities: text

Strengths

  • Leading MTEB English retrieval scores at release
  • Long 32K-token context support for document-level embeddings
  • Open weights under a research-friendly, non-commercial license (CC BY-NC 4.0)

Limitations

  • At 7B parameters, much heavier than MiniLM or E5 encoders, especially for CPU deployments
  • English-centric, with weaker multilingual coverage than multilingual E5
  • Higher latency than small transformer encoders
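A rough back-of-the-envelope calculation shows why the 7B size matters for deployment. The sketch below estimates memory for the weights alone, taking the nominal 7B figure from the text and comparing it with a hypothetical 110M-parameter encoder (an illustrative number, not a figure from this page). Activations, batching, and framework overhead are ignored.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

PARAMS_7B = 7_000_000_000    # nominal "7B" from the text
PARAMS_SMALL = 110_000_000   # hypothetical small encoder, for comparison only

for name, params in [("7B model", PARAMS_7B), ("small encoder", PARAMS_SMALL)]:
    fp32 = weight_memory_gib(params, 4)  # 4 bytes per parameter
    fp16 = weight_memory_gib(params, 2)  # 2 bytes per parameter
    print(f"{name}: {fp32:.1f} GiB in fp32, {fp16:.1f} GiB in fp16")
```

Even in half precision the larger model needs on the order of 13 GiB for weights, which is why GPU hosts are the usual target.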

Use cases

  • High-accuracy RAG over large document corpora
  • Semantic search on GPU hosts where an LLM-based encoder is acceptable
  • Classification and clustering pipelines needing strong embeddings
  • Research baselines for instruction-tuned embeddings
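The retrieval and semantic-search use cases above come down to ranking stored document vectors by similarity to a query vector. Here is a minimal sketch of that step using cosine similarity in plain Python. The vectors are tiny made-up placeholders standing in for real embeddings, and the function names are hypothetical; a production system would typically use a vector index rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Placeholder 3-dimensional "embeddings"; real ones would come from the model.
doc_vecs = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.0],
    "doc-c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.0, 0.0]
results = top_k(query_vec, doc_vecs, k=2)
print(results)
```

In a RAG pipeline, the text of the returned documents would then be passed to the generator as context.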

Benchmarks

Benchmark               Score                            As of
MTEB English average    ≈72 (leader at launch)           2024-10
BEIR                    state-of-the-art open-weights    2024-10

Frequently asked questions

What is NV-Embed v2?

NV-Embed v2 is NVIDIA Research's open-weights English embedding model. It is built on Mistral 7B and trained with a two-stage contrastive instruction-tuning recipe, and it topped the MTEB leaderboard in late 2024.
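Instruction-tuned embedding models are usually queried with a short task instruction prepended to the query, while passages are embedded as plain text. The sketch below shows that pattern. The "Instruct: ... Query: ..." template and both helper names are assumptions used for illustration; check the model card for the exact format before relying on it.

```python
def format_query(task_instruction, query):
    """Prefix a query with a task instruction (template is an assumption)."""
    return f"Instruct: {task_instruction}\nQuery: {query}"

def format_passage(passage):
    """Passages are assumed to be embedded without any instruction."""
    return passage

# Hypothetical task description and inputs.
task = "Given a question, retrieve passages that answer the question"
print(format_query(task, "What pooling layer does NV-Embed v2 use?"))
print(format_passage("NV-Embed v2 adds a latent-attention pooling layer."))
```

Keeping the instruction on the query side only means the document index does not need to be rebuilt when the task wording changes.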

When should I pick NV-Embed v2 over smaller embeddings?

Pick NV-Embed v2 when you can afford GPU inference and need the highest retrieval quality. For CPU-only pipelines or very large corpora, smaller encoders such as all-mpnet-base-v2 or E5 are cheaper to run.

What context window does NV-Embed v2 support?

NV-Embed v2 inherits Mistral 7B's long-context support and can embed passages of up to 32,768 tokens (32K), which is useful for long-document retrieval.
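Documents longer than the context window still need to be split before embedding. A minimal sketch of one common approach, fixed-size windows with a small overlap, is shown below. It works on a list of token IDs and assumes tokenization has already happened elsewhere; the function name and the overlap of 256 tokens are hypothetical choices, and a real pipeline may need to leave headroom for special tokens and any instruction prefix.

```python
def chunk_token_ids(token_ids, max_tokens=32768, overlap=256):
    """Split token IDs into windows of at most max_tokens with overlap."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_tokens])
        # Stop once this window has reached the end of the document.
        if start + max_tokens >= len(token_ids):
            break
    return chunks

# Stand-in for a tokenized 70,000-token document.
fake_ids = list(range(70000))
chunks = chunk_token_ids(fake_ids)
print([len(c) for c in chunks])
```

Each chunk is then embedded separately, and retrieval scores can be aggregated per document, for example by taking the best-scoring chunk.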

Sources

  1. arXiv — NV-Embed paper — accessed 2026-04-20
  2. Hugging Face — nvidia/NV-Embed-v2 — accessed 2026-04-20