Curiosity · AI Model

OpenELM 3B

OpenELM 3B is Apple's open-weight 3-billion-parameter language model, released in April 2024 alongside its smaller 270M, 450M, and 1.1B siblings. Apple used layer-wise scaling (varying the parameter budget per transformer layer) to improve efficiency, and published the full training recipe. OpenELM is also notable for its documented Core ML export path, which made it a common reference point for on-device LLM work on Apple hardware.
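The layer-wise scaling idea can be sketched in a few lines: instead of giving every transformer layer the same feed-forward width, the width is interpolated linearly across the depth of the network. The function and the alpha values below are illustrative, not OpenELM 3B's actual hyperparameters.

```python
def layerwise_ffn_dims(d_model: int, n_layers: int,
                       alpha_min: float = 0.5, alpha_max: float = 4.0) -> list[int]:
    """Per-layer FFN widths, interpolated from alpha_min*d_model (first
    layer) to alpha_max*d_model (last layer). Illustrative sketch of
    layer-wise scaling; values are not OpenELM's real config."""
    dims = []
    for i in range(n_layers):
        alpha = alpha_min + (alpha_max - alpha_min) * i / (n_layers - 1)
        dims.append(int(round(alpha * d_model)))
    return dims

# Early layers are narrow, later layers wide, for the same total budget
# a uniform-width model would spend evenly.
print(layerwise_ffn_dims(100, 3))  # → [50, 225, 400]
```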

Model specs

Vendor
Apple
Family
OpenELM
Released
2024-04
Context window
2,048 tokens
Modalities
text

Strengths

  • Fully open training recipe, data mixture, and logs
  • Layer-wise scaling illustrates efficient-transformer ideas
  • CoreML conversion path documented

Limitations

  • Benchmark scores trail Gemma 2 2B and Phi-2
  • Short 2048-token context
  • Released under the Apple Sample Code License, which restricts use of Apple trademarks

Use cases

  • On-device research on Apple Silicon (M-series)
  • Comparisons with Apple Foundation Models (announced at WWDC 2024)
  • Fine-tuning for privacy-preserving iOS apps
  • Teaching efficient-transformer architectures

Benchmarks

Benchmark        Score  As of
MMLU (5-shot)    ~27%   2026-04
ARC-Challenge    ~42%   2026-04
HellaSwag        ~73%   2026-04

Frequently asked questions

What is OpenELM 3B?

OpenELM 3B is Apple's open-weight 3-billion-parameter language model, released in April 2024 with a full public training recipe. It uses layer-wise parameter scaling to improve efficiency.

Is OpenELM the model inside Apple Intelligence?

No. Apple's shipped on-device model for iOS and macOS is part of the closed Apple Foundation Models family. OpenELM is a research release that predates and informs that work.

Sources

  1. OpenELM on HuggingFace — accessed 2026-04-20
  2. OpenELM paper (arXiv) — accessed 2026-04-20