DeepSeek R1

DeepSeek R1 is DeepSeek's January 2025 open-weights reasoning model — the first non-frontier-lab release to credibly match OpenAI o1 on math and reasoning benchmarks. Trained with a largely RL-based recipe from the V3 base, it ships MIT-licensed along with distilled variants from 1.5B to 70B built on Qwen and Llama bases.

Model specs

Vendor
DeepSeek
Family
DeepSeek R1
Released
2025-01
Context window
128,000 tokens
Modalities
text
Input price
$0.55/M tok
Output price
$2.19/M tok
Pricing as of
2026-04-20

Strengths

  • Open weights under the MIT license — free for commercial use
  • Frontier-class reasoning at open-source cost
  • Distilled variants from 1.5B to 70B for different hardware budgets
  • Transparent RL post-training recipe detailed in the paper

Limitations

  • Reasoning traces can be verbose and increase output token cost
  • Weaker than V3 on non-reasoning general chat tasks
  • Safety and alignment behavior less thoroughly red-teamed than Western frontier models
  • Distilled variants trade quality for size — large-model outputs remain strongest

Use cases

  • Self-hosted reasoning assistants for math, logic, science
  • Research on RL-based post-training and reasoning distillation
  • Local inference with distilled 7B / 14B / 32B Qwen-base variants
  • Agentic workflows where tool-use and planning matter more than latency

Benchmarks

Benchmark     Score      As of
MATH-500      ≈97%       2025-01
AIME 2024     ≈79%       2025-01
Codeforces    ≈96%ile    2025-01

Frequently asked questions

What is DeepSeek R1?

DeepSeek R1 is an open-weights reasoning LLM that uses inference-time thinking traces to match OpenAI o1 on math and logic benchmarks. It was released in January 2025 under the MIT license by the Chinese AI lab DeepSeek.
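Because the model emits its thinking trace inline with the answer, callers typically separate the two before display. A minimal sketch, assuming the trace is wrapped in <think>…</think> tags as in the open-weights chat format; `split_reasoning` is a hypothetical helper name:

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning_trace, final_answer).

    Assumes the trace is delimited by <think>...</think>; if the tags
    are absent, the whole text is treated as the answer.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = text.find(open_tag)
    end = text.find(close_tag)
    if start == -1 or end == -1:
        return "", text.strip()
    trace = text[start + len(open_tag):end].strip()
    answer = text[end + len(close_tag):].strip()
    return trace, answer

sample = "<think>2 + 2 is 4.</think>\nThe answer is 4."
trace, answer = split_reasoning(sample)
```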

What are the DeepSeek R1 distilled models?

R1 ships with six distilled variants — R1-Distill-Qwen-1.5B, 7B, 14B, 32B, and R1-Distill-Llama-8B, 70B — which transfer reasoning traces into smaller Qwen and Llama bases for local inference.
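The six variants above map onto Hugging Face repository ids in a predictable way. A small lookup sketch, with repo naming following the deepseek-ai model cards; `hub_repo` is a hypothetical helper name:

```python
# The six distilled checkpoints from the R1 release, keyed by
# parameter count, with the base family for each.
DISTILLED_VARIANTS = {
    "1.5B": "Qwen",
    "7B": "Qwen",
    "8B": "Llama",
    "14B": "Qwen",
    "32B": "Qwen",
    "70B": "Llama",
}

def hub_repo(size: str) -> str:
    """Hugging Face repo id for a distilled size, e.g. '7B'."""
    base = DISTILLED_VARIANTS[size]
    return f"deepseek-ai/DeepSeek-R1-Distill-{base}-{size}"
```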

Is DeepSeek R1 better than OpenAI o1?

They're comparable on public reasoning benchmarks. R1 wins on openness and price; o1 and o3 often win on structured tool use and broader agentic behavior. Choose based on whether you need to self-host.

Sources

  1. DeepSeek — R1 paper and repository — accessed 2026-04-20
  2. Hugging Face — deepseek-ai/DeepSeek-R1 — accessed 2026-04-20