Mixtral 8x22B

Mixtral 8x22B is Mistral AI's April 2024 flagship open-weights mixture-of-experts (MoE) model: 141B total parameters with ~39B active per token, released under the permissive Apache 2.0 license as a production-grade open model. Alongside the smaller Mixtral 8x7B, it helped define the open MoE category before DeepSeek V3 and Llama 4 took it further.

Model specs

Vendor: Mistral AI
Family: Mixtral
Released: 2024-04
Context window: 65,536 tokens
Modalities: text
Input price: $1.20/M tokens
Output price: $1.20/M tokens
Pricing as of: 2026-04-20
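With a flat $1.20 per million tokens for both input and output, per-request cost is simple arithmetic. A minimal sketch using the rates from the table above (the helper name is ours):

```python
# Hosted-API cost estimate for Mixtral 8x22B at the listed rates
# ($1.20 per million tokens for both input and output, as of 2026-04).
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 1.20

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the flat per-token rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
print(round(estimate_cost(2_000, 500), 4))  # → 0.003
```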

Strengths

  • Apache 2.0 — fully permissive open license for commercial use
  • MoE architecture — 141B total but only 39B active per token
  • Strong multilingual performance — native European language coverage
  • Established in many inference stacks — vLLM, TensorRT-LLM, Together, Fireworks
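The 141B-total / 39B-active split follows from top-2 routing over 8 expert FFN blocks per layer, with attention and embedding weights shared by every token. A back-of-envelope sketch; the ~5B shared / ~17B-per-expert breakdown is our inference from the published totals, not an official figure:

```python
# Back-of-envelope MoE parameter accounting for Mixtral 8x22B.
# Assumed split (inferred from the published 141B total / 39B active):
# shared weights (attention, embeddings, router) ~5B; each expert FFN ~17B.
SHARED_B = 5        # billions of parameters used by every token
EXPERT_B = 17       # billions of parameters per expert FFN
NUM_EXPERTS = 8
TOP_K = 2           # experts routed per token

total_params = SHARED_B + NUM_EXPERTS * EXPERT_B   # stored on disk/GPU
active_params = SHARED_B + TOP_K * EXPERT_B        # computed per token

print(total_params, active_params)  # → 141 39
```

This is the core MoE trade-off: you pay inference FLOPs for 39B parameters but must store all 141B.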

Limitations

  • Now trails DeepSeek V3, Llama 4, and Qwen 2.5 on most benchmarks
  • 65K context — modest compared to 128K–1M windows in 2026
  • MoE memory footprint is large even with efficient runtime
  • Newer Mistral Large 2 and Mistral Large 3 are proprietary, not open
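The memory-footprint limitation is easy to quantify: all 141B parameters must be resident regardless of how few are active per token. A rough sketch that covers weights only (KV cache, activations, and runtime overhead are extra):

```python
# Rough weight-memory footprint for Mixtral 8x22B at common precisions.
# Ignores KV cache, activations, and runtime overhead.
TOTAL_PARAMS = 141e9

def weight_gb(bytes_per_param: float) -> float:
    """GB of memory needed just to hold the weights."""
    return TOTAL_PARAMS * bytes_per_param / 1e9

for label, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{label}: ~{weight_gb(bpp):.1f} GB")
```

Even 4-bit quantization leaves a footprint that exceeds a single 48 GB or 64 GB accelerator, so multi-GPU serving is the norm.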

Use cases

  • High-throughput production chat on self-hosted infra
  • Multilingual workloads across French, German, Spanish, Italian
  • Fine-tuning base for domain models needing MoE efficiency
  • Coding pipelines where Apache 2.0 license matters
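For self-hosted chat serving, stacks like vLLM typically expose the model behind an OpenAI-compatible chat-completions endpoint. A hedged sketch that only constructs the request payload, no request is sent; the helper name and parameter choices are ours, and the model ID is the Hugging Face repo name:

```python
import json

# Chat-completions payload for an OpenAI-compatible server (e.g. a
# self-hosted vLLM instance). Only the payload is built here.
MODEL_ID = "mistralai/Mixtral-8x22B-Instruct-v0.1"

def build_chat_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a standard chat-completions request body."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }

payload = build_chat_payload("Summarize the Apache 2.0 license in one line.")
print(json.dumps(payload, indent=2))
```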

Benchmarks

Benchmark    Score   As of
MMLU         ≈77%    2024-04
HumanEval    ≈76%    2024-04
MATH         ≈41%    2024-04

Frequently asked questions

What is Mixtral 8x22B?

A Mixture-of-Experts open-weights LLM from Mistral AI. Each layer has 8 expert feed-forward blocks of which 2 are routed per token; because the experts share attention and embedding weights, the model totals 141B parameters (not 8 × 22B) while activating only ~39B per token. Released April 2024 under Apache 2.0.

Is Mixtral 8x22B still worth deploying?

For Apache 2.0 requirements and multilingual European workloads, yes. But for general quality-per-dollar, DeepSeek V3 or Llama 4 Maverick are now stronger open choices.

What's the difference between Mixtral 8x7B and 8x22B?

Both are Mistral MoE models, but 8x7B (46B total) targets smaller deployments while 8x22B (141B total) is the flagship. 8x22B is meaningfully stronger on reasoning, coding, and math.

Sources

  1. Mistral AI — Cheaper, Better, Faster, Stronger (Mixtral 8x22B) — accessed 2026-04-20
  2. Hugging Face — mistralai/Mixtral-8x22B-Instruct-v0.1 — accessed 2026-04-20