Meta MobileLLM 1.5B
MobileLLM is a Meta AI Research project focused on compact LLMs, originally sub-billion parameters, that run entirely on phones and edge devices. The 1.5B variant (released alongside the 125M/350M/600M series in July 2024, updated in 2025) uses a deep-and-thin transformer, embedding sharing between the input and output layers, grouped-query attention (GQA), and SwiGLU activations, design choices that consistently beat other sub-2B models at matched compute. MobileLLM aims to enable on-device chat, assistants, and code completion without cloud round-trips.
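Two of these design choices can be sketched in a few lines of NumPy (toy sizes and random weights for illustration only, not Meta's actual configuration or code):

```python
import numpy as np

vocab, d_model = 1000, 64          # toy vocabulary and hidden size
rng = np.random.default_rng(0)

# Embedding sharing: one matrix serves as both the input embedding table
# and the output (logit) projection, halving embedding parameters.
E = rng.standard_normal((vocab, d_model))
token_ids = np.array([1, 5, 9])
h = E[token_ids]                   # input lookup: (3, d_model)
logits = h @ E.T                   # output projection reuses E: (3, vocab)

# Grouped-query attention: many query heads share a smaller set of
# key/value heads, shrinking the KV cache that must live in phone RAM.
n_q_heads, n_kv_heads, d_head = 8, 2, 8
q = rng.standard_normal((n_q_heads, d_head))
kv = rng.standard_normal((n_kv_heads, d_head))
# Each group of n_q_heads // n_kv_heads query heads reads the same KV head.
k = np.repeat(kv, n_q_heads // n_kv_heads, axis=0)   # (8, d_head)
scores = (q * k).sum(axis=-1) / np.sqrt(d_head)      # one score per head
```

The point of both tricks is the same: at this scale, embedding tables and KV caches are a large share of the memory budget, so sharing them buys quality-per-byte.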
Model specs
- Vendor: Meta AI Research
- Family: MobileLLM
- Released: 2024-07
- Context window: 2,048 tokens
- Modalities: text
Strengths
- Architecture co-designed for phones
- Strong quality-per-parameter in the sub-2B regime
- Permissive release for research
- Family covers 125M–1.5B for right-sizing
Limitations
- Far smaller than 3B–8B class models on hard reasoning
- Context window limited (2k by default)
- No native multimodality
- Research-grade tooling — fewer turnkey deployments than Phi / Gemma
Use cases
- On-device mobile assistants
- Offline code and text autocomplete
- Low-power IoT agents
- Research on sub-2B LLM design
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| Zero-shot common-sense (avg, 1.5B) | ≈2–4 percentage points better than comparable sub-2B baselines | 2024-07 |
| On-device decoding (phone CPU, 4-bit) | tens of tokens/sec | 2024-07 |
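The 4-bit figure matters because weight memory is the first constraint on a phone. A back-of-envelope estimate for a 1.5B-parameter model (ignoring activations, the KV cache, and quantization overhead):

```python
# Rough weight-memory footprint of a 1.5B-parameter model at two precisions.
params = 1.5e9

def weight_gib(bits_per_param: float) -> float:
    """Weight memory in GiB for the given precision."""
    return params * bits_per_param / 8 / 2**30

fp16 = weight_gib(16)   # ≈ 2.79 GiB
int4 = weight_gib(4)    # ≈ 0.70 GiB, comfortable within phone RAM
```

At 4 bits the weights shrink to roughly a quarter of the fp16 footprint, which is what makes CPU decoding at interactive speeds feasible on a handset.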
Frequently asked questions
What is MobileLLM 1.5B?
A ~1.5B-parameter transformer from Meta AI Research designed specifically for on-device inference, using embedding sharing, GQA, and a deep-and-thin layout.
Why a new architecture for phones?
Meta found that parameter efficiency under ~1B behaves differently from the scaling-law regime of 7B+ models, and that deeper, thinner networks with shared embeddings dominate at that scale.
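The depth-versus-width trade-off can be made concrete with a toy parameter budget, using the common estimate of roughly 12·d² parameters per transformer layer (attention plus a 4x MLP; SwiGLU and GQA change the constant but not the comparison). The configurations below are illustrative, not MobileLLM's actual ones:

```python
# Toy per-layer parameter estimate: ~12 * d_model^2 (attention + 4x MLP).
def stack_params(d_model: int, n_layers: int) -> int:
    return 12 * d_model**2 * n_layers

deep_thin    = stack_params(d_model=512,  n_layers=48)
wide_shallow = stack_params(d_model=1024, n_layers=12)
assert deep_thin == wide_shallow   # same budget, very different shapes
```

Both stacks spend an identical parameter budget, yet Meta's finding is that below ~1B parameters the deeper, thinner shape wins on quality, the opposite of the intuition carried over from 7B+ models.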
Is MobileLLM open?
The paper and model cards have been released by Meta AI Research; weights are available for research use under a permissive Meta licence.
Sources
- MobileLLM paper (arXiv) — accessed 2026-04-20
- MobileLLM on Hugging Face — accessed 2026-04-20