Mistral Small 3
Mistral Small 3 is Mistral AI's January 2025 efficiency-tier open-weights model: 24B parameters under a pure Apache 2.0 license with no community-license caveats. It targets low-latency applications that need strong quality per dollar and full commercial freedom, without the usage thresholds attached to Meta's community license.
Model specs
| Spec | Value |
|---|---|
| Vendor | Mistral AI |
| Family | Mistral Small |
| Released | 2025-01 |
| Context window | 32,768 tokens |
| Modalities | text |
| Input price | $0.10 / M tokens |
| Output price | $0.30 / M tokens |
| Pricing as of | 2026-04-20 |
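At these rates, per-request cost is simple arithmetic. A minimal sketch (the token counts in the example are made-up illustrative values, not benchmarks):

```python
# List prices as of 2026-04-20, in dollars per million tokens.
INPUT_PER_M = 0.10
OUTPUT_PER_M = 0.30

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at Mistral Small 3 list prices."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply.
cost = request_cost(2_000, 500)
print(f"${cost:.6f}")  # prints $0.000350
```

At these prices a million such requests would run about $350, which is why the model shows up in high-throughput batch pipelines.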
Strengths
- Apache 2.0 license with no usage caps or community-license caveats
- 24B dense — fits on a single H100 at BF16, laptop-class at 4-bit
- Competitive with Llama 3.3 70B on several benchmarks at a third the size
- Fast inference — 150+ tokens/sec on a single H100
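The single-GPU claim above follows from basic weight-memory arithmetic. A rough back-of-envelope sketch, counting weights only (KV cache and activation memory add overhead on top of these figures):

```python
PARAMS = 24e9  # 24B dense parameters, from the spec table above

def weight_gb(bytes_per_param: float) -> float:
    """Approximate weight memory in GB for a dense model at a given precision."""
    return PARAMS * bytes_per_param / 1e9

print(f"BF16:  {weight_gb(2):.0f} GB")   # prints BF16:  48 GB -- fits an 80 GB H100
print(f"4-bit: {weight_gb(0.5):.0f} GB") # prints 4-bit: 12 GB -- laptop-class
```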
Limitations
- 32K context window — short compared to Llama 3.1/3.3 at 128K
- Text-only — see Pixtral for vision and Mistral Large for flagship
- Trails DeepSeek V3 and Llama 4 Maverick on top-end reasoning
- Smaller community than Llama — fewer off-the-shelf fine-tunes
Use cases
- Low-latency chat and voice-assistant backends
- High-throughput batch pipelines where cost dominates
- On-prem deployments needing a permissive license
- Fine-tuning on proprietary data without license friction
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈81% | 2025-01 |
| HumanEval | ≈84% | 2025-01 |
Frequently asked questions
What makes Mistral Small 3 different from Llama 3.3 70B?
Size and license. Mistral Small 3 is 24B under Apache 2.0 with no community-license caveats, so it is fully permissive. Llama 3.3 70B is larger and stronger on top-end tasks, but ships under Meta's community license, which among other conditions requires a separate license from Meta above 700M monthly active users.
Is Mistral Small 3 good for production?
Yes — it was explicitly designed as a production model. Latency is its selling point: under 100ms first-token on a single GPU, with competitive instruction-following quality.
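The two figures quoted here and in the strengths list combine into a rough end-to-end latency estimate. A sketch under the assumption of ~100 ms first-token latency and ~150 tokens/sec decode throughput (real numbers vary with batch size, prompt length, and serving stack):

```python
TTFT_S = 0.100      # first-token latency quoted above, in seconds
TOKS_PER_S = 150.0  # single-H100 decode throughput quoted above

def response_time(output_tokens: int) -> float:
    """Rough end-to-end seconds for one reply: first token, then steady decode."""
    return TTFT_S + output_tokens / TOKS_PER_S

print(f"{response_time(300):.2f} s")  # prints 2.10 s for a 300-token reply
```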
Where do I get Mistral Small 3?
Weights are on Hugging Face under mistralai/Mistral-Small-24B-Instruct-2501. Hosted inference is available via Mistral's own API, Together, Fireworks, and OpenRouter.
Sources
- Mistral AI — Mistral Small 3 announcement — accessed 2026-04-20
- Hugging Face — mistralai/Mistral-Small-24B-Instruct-2501 — accessed 2026-04-20