DeepSeek LLM 67B

DeepSeek LLM 67B, announced in late 2023, was DeepSeek AI's first general-purpose frontier-scale open-weights model. Trained on two trillion bilingual tokens, it was competitive with Llama 2 70B while being more data-efficient, and is the direct ancestor of the DeepSeek V2 and V3 families that later dominated 2024-25 open-weights leaderboards.

Model specs

Vendor: DeepSeek
Family: DeepSeek LLM
Released: 2023-11
Context window: 4,096 tokens
Modalities: text

Strengths

  • Strong bilingual coverage at launch
  • Permissive DeepSeek license for commercial use
  • Published detailed technical report for reproducibility

Limitations

  • Dense 67B is expensive to serve vs. later MoE designs
  • Superseded by DeepSeek V2 / V3 MoE models on all major benchmarks
  • Only 4K context window in the base release
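The 4K context limit above matters in practice when building prompts for RAG or chat. As a minimal sketch, the helper below estimates whether a prompt fits within the window; it assumes a crude ~4 characters-per-token heuristic for English text (the function name, the heuristic, and the reserved-output budget are illustrative, not part of DeepSeek's tooling — use the model's actual tokenizer from Hugging Face for real counts).

```python
# Rough pre-check that a prompt fits DeepSeek LLM 67B's 4,096-token context.
# Assumes ~4 characters per token, a crude heuristic for English text;
# the model's real tokenizer should be used for accurate counts.
CONTEXT_WINDOW = 4096
CHARS_PER_TOKEN = 4  # heuristic assumption, not the actual tokenizer

def fits_context(prompt: str, reserved_for_output: int = 512) -> bool:
    """Return True if the prompt likely fits, leaving room for the reply."""
    estimated_tokens = len(prompt) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_context("Summarize the DeepSeek LLM technical report."))
```

With a 512-token output budget, this leaves roughly 3,584 tokens (about 14K characters under the heuristic) for the prompt itself, which is why long-document RAG pipelines typically chunk their context before sending it to the base release.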

Use cases

  • Bilingual Chinese/English chat and RAG systems
  • Legacy open-weights frontier baselines
  • Fine-tuning targets for domain models
  • Historical benchmark comparisons in research papers

Benchmarks

Benchmark    Score    As of
MMLU         ≈78%     2023-11
HumanEval    ≈73%     2023-11
C-Eval       ≈73%     2023-11

Frequently asked questions

What is DeepSeek LLM 67B?

DeepSeek LLM 67B is a 67-billion-parameter dense open-weights large language model released by DeepSeek AI in late 2023, trained on two trillion bilingual Chinese and English tokens.

Is DeepSeek LLM 67B still the best DeepSeek model?

No. DeepSeek V2 and V3 MoE models, along with DeepSeek-R1 reasoning variants, significantly outperform the original 67B. It remains useful as a historical reference and fine-tuning base.

What license covers DeepSeek LLM 67B?

DeepSeek publishes its own permissive license that allows most commercial use, with some conditions on large-scale public deployment. Check the license file before shipping.

Sources

  1. arXiv — DeepSeek LLM paper — accessed 2026-04-20
  2. Hugging Face — deepseek-ai/deepseek-llm-67b-chat — accessed 2026-04-20