DeepSeek R1

DeepSeek R1 is DeepSeek's January 2025 open-weights reasoning model — the first non-frontier-lab release to credibly match OpenAI o1 on math and reasoning benchmarks. Trained with a largely RL-based recipe from the V3 base, it ships MIT-licensed along with distilled variants from 1.5B to 70B built on Qwen and Llama bases.

Model specs

Vendor
DeepSeek
Family
DeepSeek R1
Released
2025-01
Context window
128,000 tokens
Modalities
text
Input price
$0.55/M tok
Output price
$2.19/M tok
Pricing as of
2026-04-20

Strengths

  • Open weights under the MIT license — free for commercial use
  • Frontier-class reasoning at open-source cost
  • Distilled variants from 1.5B to 70B for different hardware budgets
  • Transparent RL post-training recipe detailed in the paper

Limitations

  • Reasoning traces can be verbose and increase output token cost
  • Weaker than V3 on non-reasoning general chat tasks
  • Safety and alignment behavior less thoroughly red-teamed than Western frontier models
  • Distilled variants trade quality for size — large-model outputs remain strongest

Use cases

  • Self-hosted reasoning assistants for math, logic, science
  • Research on RL-based post-training and reasoning distillation
  • Local inference with distilled 7B / 14B / 32B Qwen-base variants
  • Agentic workflows where tool-use and planning matter more than latency

Benchmarks

Benchmark     Score      As of
MATH-500      ≈97%       2025-01
AIME 2024     ≈79%       2025-01
Codeforces    ≈96%ile    2025-01

Frequently asked questions

What is DeepSeek R1?

DeepSeek R1 is an open-weights reasoning LLM that uses inference-time thinking traces to match OpenAI o1 on math and logic benchmarks. It was released in January 2025 under the MIT license by the Chinese AI lab DeepSeek.
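Because the model emits its thinking trace inline with the answer, callers typically separate the two before display. A minimal sketch, assuming the trace is wrapped in <think>…</think> tags as in the open-weights chat format; `split_reasoning` is a hypothetical helper name:

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning_trace, final_answer).

    Assumes the trace is delimited by <think>...</think>; if the tags
    are absent, the whole text is treated as the answer.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1:
        return "", text.strip()
    trace = text[start + len(open_tag):end].strip()
    answer = text[end + len(close_tag):].strip()
    return trace, answer

sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
trace, answer = split_reasoning(sample)
```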

What are the DeepSeek R1 distilled models?

R1 ships with six distilled variants — R1-Distill-Qwen-1.5B, 7B, 14B, 32B, and R1-Distill-Llama-8B, 70B — which transfer reasoning traces into smaller Qwen and Llama bases for local inference.
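The six variants above map onto Hugging Face repository ids in a predictable way. A small lookup sketch, with repo naming following the deepseek-ai model cards; `hub_repo` is a hypothetical helper name:

```python
# The six distilled checkpoints from the R1 release, keyed by
# parameter count, with the base family for each.
DISTILLED_VARIANTS = {
    "1.5B": "Qwen",
    "7B": "Qwen",
    "8B": "Llama",
    "14B": "Qwen",
    "32B": "Qwen",
    "70B": "Llama",
}

def hub_repo(size: str) -> str:
    """Hugging Face repo id for a distilled size, e.g. '7B'."""
    base = DISTILLED_VARIANTS[size]
    return f"deepseek-ai/DeepSeek-R1-Distill-{base}-{size}"
```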

Is DeepSeek R1 better than OpenAI o1?

They're comparable on public reasoning benchmarks. R1 wins on openness and price; o1 and o3 often win on structured tool use and broader agentic behavior. Choose based on whether you need to self-host.

Sources

  1. DeepSeek — R1 paper and repository — accessed 2026-04-20
  2. Hugging Face — deepseek-ai/DeepSeek-R1 — accessed 2026-04-20