Llama 4 Scout

Llama 4 Scout is the efficiency-tier open-weights model in Meta's Llama 4 family, released in April 2025. It is a Mixture-of-Experts design that fits on a single H100 with 4-bit quantization while offering a 10M-token context window. Pick Scout when you want Llama 4's native multimodality and MoE economics without Maverick's compute footprint.

Model specs

Vendor: Meta
Family: Llama 4
Released: 2025-04
Context window: 10,000,000 tokens
Modalities: text, vision
Input price: $0.10 per million tokens
Output price: $0.30 per million tokens
Pricing as of: 2026-04-20
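
To make the listed prices concrete, here is a minimal sketch that turns them into a per-request cost estimate. The function name and the example workload are hypothetical, and the prices are simply the figures quoted above (as of 2026-04-20), so treat the result as illustrative rather than a quote from any provider.

```python
# Prices copied from the spec list above (USD per million tokens).
INPUT_USD_PER_MTOK = 0.10
OUTPUT_USD_PER_MTOK = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed prices."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Hypothetical workload: a 1M-token document plus a 2,000-token answer.
cost = estimate_cost_usd(1_000_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

With these numbers the input side dominates: the 1M-token prompt costs about ten cents, while the 2,000-token answer adds a fraction of a cent.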

Strengths

  • Open weights under the Llama 4 community license
  • Industry-leading 10M token context for document-heavy workloads
  • Single-GPU inference with 4-bit quantization
  • Native multimodal (text + vision) from pretraining

Limitations

  • Scores below Maverick on reasoning and code benchmarks
  • Long-context quality degrades beyond ~1M tokens in practice, well short of the advertised 10M window
  • Behind Claude / GPT-5 on agentic multi-step tasks
  • Self-host ops burden — you own the GPUs and observability

Use cases

  • Single-GPU on-prem deployments and lab work
  • Long-document RAG pipelines that use the 10M-token window
  • Fine-tunes on domain data where Maverick cost is prohibitive
  • Edge servers and sovereign-cloud inference
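
For the long-document use case, it can help to check a corpus against both the advertised 10M-token window and the ~1M-token practical limit noted under Limitations before deciding how to prompt. The sketch below is only a planning heuristic: the 4-characters-per-token ratio is a rough assumption for English text, and the function names and return labels are hypothetical. A real pipeline would count tokens with the model's own tokenizer.

```python
CHARS_PER_TOKEN = 4                   # assumption: rough ratio for English text
PRACTICAL_LIMIT_TOKENS = 1_000_000    # the ~1M figure from Limitations
ADVERTISED_LIMIT_TOKENS = 10_000_000  # the context window from the spec list


def estimate_tokens(total_chars: int) -> int:
    """Rough token estimate from a character count (ceiling division)."""
    return -(-total_chars // CHARS_PER_TOKEN)


def plan_context(total_chars: int) -> str:
    """Suggest how to feed a corpus of the given size to the model."""
    tokens = estimate_tokens(total_chars)
    if tokens <= PRACTICAL_LIMIT_TOKENS:
        return "single_prompt"                # comfortably inside the practical range
    if tokens <= ADVERTISED_LIMIT_TOKENS:
        return "fits_but_consider_retrieval"  # fits the window, quality may degrade
    return "retrieval_required"               # exceeds the window entirely


for chars in (2_000_000, 20_000_000, 80_000_000):
    print(chars, plan_context(chars))
```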

Benchmarks

Benchmark        Score   As of
MMLU-Pro         ≈74%    2026-04
LiveCodeBench    ≈32%    2026-04

Frequently asked questions

What is Llama 4 Scout?

Llama 4 Scout is the smaller of Meta's two Llama 4 open-weights models released in April 2025. It is a Mixture-of-Experts LLM with 17B active and 109B total parameters and a 10M-token context window, released under the Llama 4 community license.

Can Llama 4 Scout run on one GPU?

Yes. Scout is designed to run on a single Nvidia H100 with 4-bit quantization, which makes it the practical on-prem choice compared with Maverick, which needs multiple GPUs.
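
The arithmetic behind the single-GPU claim can be sketched as follows. It uses the 109B total parameter count, because every expert has to be resident in memory even though only 17B parameters are active per token. The 80 GB figure assumes the 80 GB H100 variant, and the estimate covers weights only: KV cache, activations, and quantization metadata are ignored, so real headroom is smaller than shown.

```python
H100_MEMORY_GB = 80   # assumption: the 80 GB H100 variant
TOTAL_PARAMS = 109e9  # total parameters; all experts must be loaded


def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_param / 8 / 1e9


for label, bits in [("bf16", 16), ("int8", 8), ("int4", 4)]:
    gb = weight_memory_gb(TOTAL_PARAMS, bits)
    verdict = "fits" if gb < H100_MEMORY_GB else "does not fit"
    print(f"{label}: {gb:.1f} GB of weights, {verdict} in {H100_MEMORY_GB} GB")
```

Only the 4-bit row comes in under 80 GB (about 54.5 GB), which is why the single-GPU claim depends on quantization.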

When should I pick Scout over Maverick?

Pick Scout when single-GPU economics, long-context RAG, or sovereign deployment matter more than absolute reasoning quality. Pick Maverick when you need the stronger reasoning and code performance and have a multi-GPU budget.

Sources

  1. Meta — The Llama 4 Herd — accessed 2026-04-20
  2. Hugging Face — meta-llama/Llama-4-Scout — accessed 2026-04-20