Phi-3-mini 128k
Phi-3-mini-128k-instruct is a 3.8-billion-parameter small language model from Microsoft Research, released in April 2024. It extends the standard 4k-context Phi-3-mini with LongRope to a 128k context window while keeping a footprint small enough to run on a modern laptop or phone. Trained on a curated 'textbook-quality' data mixture, Phi-3-mini reaches roughly GPT-3.5-level quality on MMLU and HumanEval at a fraction of the size, and is released under the MIT licence, making it a popular choice for edge and offline deployments.
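To give a feel for the context extension: LongRope rescales the rotary position embeddings (RoPE) so that positions far beyond the 4k training range map back into angles the model has already seen. The sketch below shows only the simplest uniform variant of that idea (linear position interpolation); LongRope itself searches for per-dimension scale factors, which this deliberately omits. The function name and `dim`/`base` defaults are illustrative, not taken from the model.

```python
import math

def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one position.

    scale > 1 compresses position indices (linear position
    interpolation), the simplest form of the rescaling idea LongRope
    builds on. LongRope searches for per-dimension scale factors;
    this uniform scale is a deliberate simplification.
    """
    return [
        (pos / scale) / (base ** (2 * i / dim))
        for i in range(dim // 2)
    ]

# Extending 4k-trained RoPE to 128k is a 32x ratio: with scale=32,
# position 128_000 yields the same angles the model saw at 4_000.
assert rope_angles(128_000, scale=32.0) == rope_angles(4_000)
```

The point of the rescaling is that attention never encounters rotation angles outside the range it was trained on, which is what lets a 4k-trained model stay coherent at 128k after light fine-tuning.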
Model specs
- Vendor: Microsoft Research
- Family: Phi
- Released: 2024-04
- Context window: 128,000 tokens
- Modalities: text
Strengths
- Tiny footprint — runs in 4-bit on 8 GB GPUs or modern phones
- 128k context extension via LongRope
- MIT licence — permissive commercial use
- Strong quality per parameter
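The "tiny footprint" claim is easy to sanity-check with back-of-envelope arithmetic: 3.8B parameters at 4 bits each is under 2 GB of raw weights. The overhead factor below is an assumed fudge for quantization scales, higher-precision embeddings, and the KV cache, not a measured number.

```python
def quantized_size_gb(n_params, bits_per_weight=4, overhead=1.2):
    """Back-of-envelope weight-memory estimate for a quantized model.

    overhead (assumed 1.2x) loosely covers quantization scale factors,
    tensors kept at higher precision, and the KV cache.
    """
    return n_params * bits_per_weight / 8 / 1e9 * overhead

# 3.8e9 params * 4 bits = 1.9 GB raw, ~2.3 GB with overhead,
# which is why 8 GB GPUs and recent phones have headroom to spare.
print(f"{quantized_size_gb(3.8e9):.1f} GB")
```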
Limitations
- Weak on multilingual tasks (English-heavy training)
- Cannot match 70B models on complex reasoning
- Occasional hallucinations under long context
- No native multimodality (vision is a separate Phi-3.5-vision model)
Use cases
- On-device chat assistants (laptop, mobile)
- Offline RAG over long documents
- Cheap agents for background automations
- Teaching and research on small LLMs
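The offline-RAG use case above follows a simple retrieve-then-prompt shape, sketched below with a toy word-overlap retriever so it stays dependency-free; a real pipeline would use an embedding model for retrieval, and the resulting prompt would be sent to a local Phi-3-mini runtime (the generation call is omitted here). All names and the sample documents are illustrative.

```python
def retrieve(query, chunks, k=2):
    """Rank chunks by naive word overlap with the query.

    Stands in for embedding-based retrieval purely to keep the
    sketch self-contained.
    """
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, chunks):
    """Stuff the top-k chunks into a grounded-answer prompt."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Phi-3-mini has a 128k context window.",
    "The MIT licence permits commercial use.",
    "Bananas are a good source of potassium.",
]
print(build_prompt("What licence does Phi-3-mini use?", docs))
```

The 128k window matters here: long retrieved passages can be stuffed into the prompt wholesale instead of being aggressively truncated.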
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈68% | 2024-04 |
| HumanEval | ≈59% | 2024-04 |
| MT-Bench | ≈8.4 | 2024-04 |
Frequently asked questions
What is Phi-3-mini 128k?
A 3.8B-parameter small LLM from Microsoft with a 128k context window, released under MIT in April 2024.
How does Phi-3-mini achieve such strong quality at 3.8B?
Via curated 'textbook-quality' training data (a mix of filtered web and synthetic tutorials), heavy distillation, and instruction tuning.
Can I run Phi-3-mini on my laptop?
Yes — 4-bit GGUF variants run comfortably on CPUs or modest GPUs, and there are mobile-optimised builds.
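When running the instruct variant locally, prompts must follow Phi-3's chat template with its `<|user|>`/`<|assistant|>`/`<|end|>` special tokens. The helper below reconstructs that template as shown on the Hugging Face model card; treat it as a sketch and prefer the tokenizer's built-in `apply_chat_template`, which is the authoritative version.

```python
def phi3_prompt(messages):
    """Format chat turns with Phi-3's special tokens.

    Follows the <|role|> ... <|end|> template from the model's
    Hugging Face card; the tokenizer's apply_chat_template should be
    used in real code.
    """
    out = []
    for m in messages:
        out.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    out.append("<|assistant|>\n")  # generation continues from here
    return "".join(out)

print(phi3_prompt([{"role": "user", "content": "Summarise RoPE in one line."}]))
```

A runtime such as llama.cpp applies this template automatically when the GGUF file carries the chat-template metadata; the manual version is mainly useful with raw completion APIs.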
Sources
- Phi-3 technical report (arXiv) — accessed 2026-04-20
- Phi-3-mini-128k-instruct on Hugging Face — accessed 2026-04-20