Curiosity · AI Model
Nemotron Ultra 253B
Nemotron Ultra 253B, released by NVIDIA in 2025, is the heaviest open-weights model in the Llama Nemotron series. Built from Llama 3.1 405B via efficient fine-tuning and architecture slicing, it targets enterprise-grade reasoning on code, math, and RAG workloads, and is optimised for high throughput on NVIDIA GPUs with TensorRT-LLM.
Model specs
- Vendor
- NVIDIA
- Family
- Llama Nemotron
- Released
- 2025-03
- Context window
- 128,000 tokens
- Modalities
- text
Strengths
- Top-tier open-weights reasoning at launch
- Highly optimised for NVIDIA hardware and TensorRT-LLM
- Includes curated datasets and open training recipes
Limitations
- Enormous footprint — multi-node serving required
- Focused on NVIDIA infrastructure; portability outside the stack is uneven
- Llama community license imposes some commercial restrictions
Use cases
- Enterprise reasoning workloads served via NVIDIA NIM
- Research on frontier-scale open-weights post-training
- High-throughput inference on H100 and B200 fleets
- Customisation platform for domain-specific agents
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU-Pro | ≈78% | 2025-03 |
| MATH | ≈82% | 2025-03 |
| HumanEval | ≈89% | 2025-03 |
Frequently asked questions
What is Nemotron Ultra 253B?
Nemotron Ultra 253B is NVIDIA's top-tier open-weights reasoning model, derived from Llama 3.1 405B via extensive post-training and architectural adjustments for efficient inference on NVIDIA GPUs.
How is Nemotron Ultra different from Llama 3.1 405B?
NVIDIA applies pruning, architecture slicing, and reasoning-focused fine-tuning, producing a 253B model that matches or beats the base 405B on many evals while being substantially cheaper to serve.
Where can I run Nemotron Ultra?
Weights are available on Hugging Face under the 'nvidia' organisation, and NVIDIA offers hosted access through NIM microservices in their AI Foundry.
Sources
- NVIDIA — Llama Nemotron Ultra — accessed 2026-04-20
- Hugging Face — nvidia/Llama-3_1-Nemotron-Ultra-253B — accessed 2026-04-20