DeepSeek V3

DeepSeek V3 is DeepSeek's December 2024 flagship open-weights Mixture-of-Experts (MoE) model: 671B total parameters with 37B active per token, trained for a reported ~$5.6M in compute for its final training run. At release it credibly matched GPT-4o on most benchmarks, shocking the market and triggering the wider 'DeepSeek moment' that forced a revaluation of frontier training economics.

Model specs

Vendor
DeepSeek
Family
DeepSeek V3
Released
2024-12
Context window
128,000 tokens
Modalities
text
Input price
$0.27/M tok
Output price
$1.10/M tok
Pricing as of
2026-04-20
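
At the listed rates, the cost of a request is a simple linear function of token counts. A minimal sketch using the prices from the table above (cache-hit discounts and any later price changes are ignored):

```python
# Estimate DeepSeek V3 API cost from the listed per-million-token rates.
INPUT_PRICE_PER_M = 0.27   # USD per 1M input tokens (from the table above)
OUTPUT_PRICE_PER_M = 1.10  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A 4,000-token prompt with a 1,000-token completion costs well under a cent:
cost = request_cost(4_000, 1_000)
```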

Strengths

  • Open weights — MIT-licensed for the model; full commercial use
  • GPT-4o class quality on reasoning, coding, and math benchmarks
  • Multi-head Latent Attention (MLA) — compresses the KV cache for drastically cheaper inference
  • Massive community adoption — Together, Fireworks, SambaNova serve it

Limitations

  • 671B footprint — multi-node inference is required for self-hosting
  • Some evaluations report weaker multilingual coverage and safety behavior than Western frontier models
  • Export-control and geopolitical concerns for some enterprise buyers
  • Training data provenance less transparent than Llama or Gemma

Use cases

  • Frontier-quality self-hosted deployments
  • Synthetic data generation for distillation
  • Research baselines for alignment and efficiency work
  • Fine-tuning platforms where weights access is required

Benchmarks

Benchmark    Score   As of
MMLU         ≈88%    2024-12
HumanEval    ≈91%    2024-12
MATH-500     ≈90%    2024-12

Frequently asked questions

What is DeepSeek V3?

A 671B parameter Mixture-of-Experts open-weights LLM from Chinese AI lab DeepSeek, released December 2024. It activates 37B parameters per token and reached GPT-4o class benchmark performance at a reported training cost of ~$5.6M.

Why was DeepSeek V3 such a big deal?

Its combination of frontier quality, open weights, and dramatically lower training cost forced a reassessment of how much compute is actually required for SOTA models. It became the reference point for efficient frontier training.
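
The headline figure is simple to reconstruct: the technical report quotes roughly 2.788M H800 GPU-hours for the full training run, priced at an assumed $2 per GPU-hour rental rate:

```python
# Reported final-run training compute for DeepSeek V3 (per the technical report).
gpu_hours = 2.788e6            # H800 GPU-hours across the full training run
rate_usd_per_hour = 2.0        # rental rate assumed in the report
total_usd = gpu_hours * rate_usd_per_hour   # ≈ $5.58M
```

Note this covers GPU rental for the final run only, not research, ablations, or staff costs.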

Is DeepSeek V3 free to use commercially?

Yes — model weights are released under MIT license, which permits commercial self-hosting. You still pay compute costs, and hosted endpoints via the DeepSeek API or Together/Fireworks are usage-priced.
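
The hosted DeepSeek API follows the OpenAI chat-completions request format, with `deepseek-chat` serving V3. A sketch of the request body (no request is sent here; endpoint URL and headers are noted in comments per DeepSeek's API docs):

```python
import json

# Sketch of a chat-completions request to the hosted DeepSeek API, which
# exposes an OpenAI-compatible endpoint. "deepseek-chat" maps to DeepSeek V3.
payload = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "user", "content": "Summarize MoE routing in one sentence."}
    ],
    "max_tokens": 256,
}
body = json.dumps(payload)
# POST this body to https://api.deepseek.com/chat/completions with an
# "Authorization: Bearer <your API key>" header to receive a completion.
```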

Sources

  1. DeepSeek — DeepSeek V3 Technical Report — accessed 2026-04-20
  2. Hugging Face — deepseek-ai/DeepSeek-V3 — accessed 2026-04-20