DeepSeek LLM 67B

DeepSeek LLM 67B, announced in late 2023, was DeepSeek AI's first general-purpose frontier-scale open-weights model. Trained on two trillion bilingual tokens, it was competitive with Llama 2 70B while being more data-efficient, and is the direct ancestor of the DeepSeek V2 and V3 families that later dominated 2024-25 open-weights leaderboards.

Model specs

Vendor: DeepSeek
Family: DeepSeek LLM
Released: 2023-11
Context window: 4,096 tokens
Modalities: text

Strengths

  • Strong bilingual coverage at launch
  • Permissive DeepSeek license for commercial use
  • Published detailed technical report for reproducibility

Limitations

  • Dense 67B is expensive to serve vs. later MoE designs
  • Superseded by DeepSeek V2 / V3 MoE models on all major benchmarks
  • Only 4K context window in the base release
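The 4K context limit above matters in practice when building prompts for RAG or chat. As a minimal sketch, the helper below estimates whether a prompt fits within the window; it assumes a crude ~4 characters-per-token heuristic for English text (the function name, the heuristic, and the reserved-output budget are illustrative, not part of DeepSeek's tooling — use the model's actual tokenizer from Hugging Face for real counts).

```python
# Rough pre-check that a prompt fits DeepSeek LLM 67B's 4,096-token context.
# Assumes ~4 characters per token, a crude heuristic for English text;
# the model's real tokenizer should be used for accurate counts.
CONTEXT_WINDOW = 4096
CHARS_PER_TOKEN = 4  # heuristic assumption, not the actual tokenizer

def fits_context(prompt: str, reserved_for_output: int = 512) -> bool:
    """Return True if the prompt likely fits, leaving room for the reply."""
    estimated_tokens = len(prompt) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_context("Summarize the DeepSeek LLM technical report."))
```

With a 512-token output budget, this leaves roughly 3,584 tokens (about 14K characters under the heuristic) for the prompt itself, which is why long-document RAG pipelines typically chunk their context before sending it to the base release.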

Use cases

  • Bilingual Chinese/English chat and RAG systems
  • Legacy open-weights frontier baselines
  • Fine-tuning targets for domain models
  • Historical benchmark comparisons in research papers

Benchmarks

Benchmark    Score    As of
MMLU         ≈78%     2023-11
HumanEval    ≈73%     2023-11
C-Eval       ≈73%     2023-11

Frequently asked questions

What is DeepSeek LLM 67B?

DeepSeek LLM 67B is a 67-billion-parameter dense open-weights large language model released by DeepSeek AI in late 2023, trained on two trillion bilingual Chinese and English tokens.

Is DeepSeek LLM 67B still the best DeepSeek model?

No. DeepSeek V2 and V3 MoE models, along with DeepSeek-R1 reasoning variants, significantly outperform the original 67B. It remains useful as a historical reference and fine-tuning base.

What license covers DeepSeek LLM 67B?

DeepSeek publishes its own permissive license that allows most commercial use, with some conditions on large-scale public deployment. Check the license file before shipping.

Sources

  1. arXiv — DeepSeek LLM paper — accessed 2026-04-20
  2. Hugging Face — deepseek-ai/deepseek-llm-67b-chat — accessed 2026-04-20