Curiosity · AI Model
DeepSeek LLM 67B
DeepSeek LLM 67B, announced in late 2023, was DeepSeek AI's first general-purpose frontier-scale open-weights model. Trained on two trillion Chinese and English tokens, it matched or exceeded Llama 2 70B on many benchmarks at launch, and it is the direct ancestor of the DeepSeek V2 and V3 families that later dominated 2024-25 open-weights leaderboards.
Model specs
- Vendor
- DeepSeek
- Family
- DeepSeek LLM
- Released
- 2023-11
- Context window
- 4,096 tokens
- Modalities
- text
Strengths
- Strong bilingual coverage at launch
- Permissive DeepSeek license for commercial use
- Published detailed technical report for reproducibility
Limitations
- Dense architecture keeps all 67B parameters active per token, making it expensive to serve vs. later MoE designs
- Superseded by DeepSeek V2 / V3 MoE models on all major benchmarks
- Only 4K context window in the base release
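The 4K context window means chat history has to be budgeted explicitly before each request. A minimal sketch of one way to do that, under stated assumptions: the whitespace word count stands in for a real tokenizer (a production system would count tokens with the model's own tokenizer, e.g. the Hugging Face tokenizer for `deepseek-ai/deepseek-llm-67b-chat`), and the reply budget of 512 tokens is illustrative.

```python
# Sketch: keep only the most recent chat turns that fit inside the
# 4,096-token context window of the base DeepSeek LLM 67B release.
# Token counts are approximated as whitespace-separated words here;
# swap in the model's real tokenizer for accurate budgeting.

CONTEXT_WINDOW = 4096      # tokens in the base release
RESERVED_FOR_REPLY = 512   # illustrative budget left for the model's answer


def approx_tokens(text: str) -> int:
    """Crude token estimate: one word ~= one token. Replace in production."""
    return len(text.split())


def trim_history(turns: list[str],
                 budget: int = CONTEXT_WINDOW - RESERVED_FOR_REPLY) -> list[str]:
    """Drop the oldest turns until the remaining ones fit the token budget."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):          # walk newest turn first
        cost = approx_tokens(turn)
        if used + cost > budget:
            break                         # everything older is discarded
        kept.append(turn)
        used += cost
    return list(reversed(kept))           # restore chronological order


history = ["old turn " * 2000, "recent question about DeepSeek LLM 67B"]
print(trim_history(history))  # → ['recent question about DeepSeek LLM 67B']
```

Dropping whole turns from the oldest end, as above, keeps each surviving message intact; a RAG system would apply the same budget to retrieved passages before they reach the prompt.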
Use cases
- Bilingual Chinese/English chat and RAG systems
- Legacy open-weights frontier baselines
- Fine-tuning targets for domain models
- Historical benchmark comparisons in research papers
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈78% | 2023-11 |
| HumanEval | ≈73% | 2023-11 |
| C-Eval | ≈73% | 2023-11 |
Frequently asked questions
What is DeepSeek LLM 67B?
DeepSeek LLM 67B is a 67-billion-parameter dense open-weights large language model released by DeepSeek AI in late 2023, trained on two trillion bilingual Chinese and English tokens.
Is DeepSeek LLM 67B still the best DeepSeek model?
No — DeepSeek V2 and V3 MoE models, along with DeepSeek-R1 reasoning variants, significantly outperform the original 67B. It remains a historical reference and fine-tuning base.
What license covers DeepSeek LLM 67B?
DeepSeek publishes its own permissive license that allows most commercial use, with some conditions on large-scale public deployment. Check the license file before shipping.
Sources
- arXiv — DeepSeek LLM paper — accessed 2026-04-20
- Hugging Face — deepseek-ai/deepseek-llm-67b-chat — accessed 2026-04-20