Phi-3-mini 128k
Phi-3-mini-128k-instruct is a 3.8-billion-parameter small language model from Microsoft Research, released in April 2024. It extends the standard 4k-context Phi-3-mini with LongRope to a 128k context window while keeping a footprint small enough to run on a modern laptop or phone. Trained on a curated 'textbook-quality' data mixture, Phi-3-mini reaches roughly GPT-3.5-level quality on MMLU and HumanEval at a fraction of the size, and is released under the MIT licence, making it a popular choice for edge and offline deployments.
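To give a feel for the context extension: LongRope rescales the rotary position embeddings (RoPE) so that positions far beyond the 4k training range map back into angles the model has already seen. The sketch below shows only the simplest uniform variant of that idea (linear position interpolation); LongRope itself searches for per-dimension scale factors, which this deliberately omits. The function name and `dim`/`base` defaults are illustrative, not taken from the model.

```python
import math

def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one position.

    scale > 1 compresses position indices (linear position
    interpolation), the simplest form of the rescaling idea LongRope
    builds on. LongRope searches for per-dimension scale factors;
    this uniform scale is a deliberate simplification.
    """
    return [
        (pos / scale) / (base ** (2 * i / dim))
        for i in range(dim // 2)
    ]

# Extending 4k-trained RoPE to 128k is a 32x ratio: with scale=32,
# position 128_000 yields the same angles the model saw at 4_000.
assert rope_angles(128_000, scale=32.0) == rope_angles(4_000)
```

The point of the rescaling is that attention never encounters rotation angles outside the range it was trained on, which is what lets a 4k-trained model stay coherent at 128k after light fine-tuning.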
Model specs
- Vendor: Microsoft Research
- Family: Phi
- Released: 2024-04
- Context window: 128,000 tokens
- Modalities: text
Strengths
- Tiny footprint — runs in 4-bit on 8 GB GPUs or modern phones
- 128k context extension via LongRope
- MIT licence — permissive commercial use
- Strong quality per parameter
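The "tiny footprint" claim is easy to sanity-check with back-of-envelope arithmetic: 3.8B parameters at 4 bits each is under 2 GB of raw weights. The overhead factor below is an assumed fudge for quantization scales, higher-precision embeddings, and the KV cache, not a measured number.

```python
def quantized_size_gb(n_params, bits_per_weight=4, overhead=1.2):
    """Back-of-envelope weight-memory estimate for a quantized model.

    overhead (assumed 1.2x) loosely covers quantization scale factors,
    tensors kept at higher precision, and the KV cache.
    """
    return n_params * bits_per_weight / 8 / 1e9 * overhead

# 3.8e9 params * 4 bits = 1.9 GB raw, ~2.3 GB with overhead,
# which is why 8 GB GPUs and recent phones have headroom to spare.
print(f"{quantized_size_gb(3.8e9):.1f} GB")
```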
Limitations
- Weak on multilingual tasks (English-heavy training)
- Cannot match 70B models on complex reasoning
- Occasional hallucinations under long context
- No native multimodality (vision is a separate Phi-3.5-vision model)
Use cases
- On-device chat assistants (laptop, mobile)
- Offline RAG over long documents
- Cheap agents for background automations
- Teaching and research on small LLMs
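The offline-RAG use case above follows a simple retrieve-then-prompt shape, sketched below with a toy word-overlap retriever so it stays dependency-free; a real pipeline would use an embedding model for retrieval, and the resulting prompt would be sent to a local Phi-3-mini runtime (the generation call is omitted here). All names and the sample documents are illustrative.

```python
def retrieve(query, chunks, k=2):
    """Rank chunks by naive word overlap with the query.

    Stands in for embedding-based retrieval purely to keep the
    sketch self-contained.
    """
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, chunks):
    """Stuff the top-k chunks into a grounded-answer prompt."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Phi-3-mini has a 128k context window.",
    "The MIT licence permits commercial use.",
    "Bananas are a good source of potassium.",
]
print(build_prompt("What licence does Phi-3-mini use?", docs))
```

The 128k window matters here: long retrieved passages can be stuffed into the prompt wholesale instead of being aggressively truncated.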
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈68% | 2024-04 |
| HumanEval | ≈59% | 2024-04 |
| MT-Bench | ≈8.4 | 2024-04 |
Frequently asked questions
What is Phi-3-mini 128k?
A 3.8B-parameter small LLM from Microsoft with a 128k context window, released under MIT in April 2024.
How does Phi-3-mini achieve such strong quality at 3.8B?
Via curated 'textbook-quality' training data (a mix of filtered web and synthetic tutorials), heavy distillation, and instruction tuning.
Can I run Phi-3-mini on my laptop?
Yes — 4-bit GGUF variants run comfortably on CPUs or modest GPUs, and there are mobile-optimised builds.
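When running the instruct variant locally, prompts must follow Phi-3's chat template with its `<|user|>`/`<|assistant|>`/`<|end|>` special tokens. The helper below reconstructs that template as shown on the Hugging Face model card; treat it as a sketch and prefer the tokenizer's built-in `apply_chat_template`, which is the authoritative version.

```python
def phi3_prompt(messages):
    """Format chat turns with Phi-3's special tokens.

    Follows the <|role|> ... <|end|> template from the model's
    Hugging Face card; the tokenizer's apply_chat_template should be
    used in real code.
    """
    out = []
    for m in messages:
        out.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    out.append("<|assistant|>\n")  # generation continues from here
    return "".join(out)

print(phi3_prompt([{"role": "user", "content": "Summarise RoPE in one line."}]))
```

A runtime such as llama.cpp applies this template automatically when the GGUF file carries the chat-template metadata; the manual version is mainly useful with raw completion APIs.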
Sources
- Phi-3 technical report (arXiv) — accessed 2026-04-20
- Phi-3-mini-128k-instruct on Hugging Face — accessed 2026-04-20