
Jamba 1.5 Large

Jamba 1.5 Large is AI21 Labs' August 2024 open-weights flagship: a 398B-total / 94B-active-parameter hybrid mixture-of-experts (MoE) model that interleaves Mamba state-space layers with Transformer attention. The hybrid design keeps long-context inference near linear in sequence length while preserving quality, and the 256K-token window was industry-leading at release.
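
As a concrete picture of what "interleaves" means, here is a minimal sketch of a Jamba-style layer plan. The 1:7 attention-to-Mamba ratio and the MoE placement on every other layer follow the figures reported in AI21's Jamba paper; the function name and its parameters are hypothetical illustration, not AI21's code.

```python
# Illustrative layer plan for a Jamba-style hybrid decoder. The 1:7
# attention-to-Mamba ratio and MoE-every-other-layer placement follow the
# Jamba paper's reported design; this function is a hypothetical sketch,
# not AI21's implementation.

def hybrid_layer_plan(num_blocks: int = 1, layers_per_block: int = 8,
                      attn_index: int = 4, moe_every: int = 2) -> list[str]:
    """Return an ordered list of layer types for the hybrid stack."""
    plan = []
    for _ in range(num_blocks):
        for i in range(layers_per_block):
            kind = "attention" if i == attn_index else "mamba"
            if i % moe_every == 1:   # swap the dense MLP for a MoE layer
                kind += "+moe"       # on every other layer
            plan.append(kind)
    return plan

print(hybrid_layer_plan(num_blocks=1))
# ['mamba', 'mamba+moe', 'mamba', 'mamba+moe',
#  'attention', 'mamba+moe', 'mamba', 'mamba+moe']
```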

Model specs

Vendor: AI21 Labs
Family: Jamba
Released: 2024-08
Context window: 256,000 tokens
Modalities: text
Input price: $2 per million tokens
Output price: $8 per million tokens (worked cost example below)
Pricing as of: 2026-04-20
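
At the listed rates, request cost is straightforward arithmetic. The sketch below prices a request that fills the full window; request_cost is a hypothetical helper, not part of any SDK.

```python
# Cost arithmetic at the listed rates: $2 per million input tokens and
# $8 per million output tokens. request_cost is a hypothetical helper.
INPUT_RATE = 2.00 / 1_000_000    # USD per input token
OUTPUT_RATE = 8.00 / 1_000_000   # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-token rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Filling the full 256K window and generating a 2K-token answer:
print(f"${request_cost(256_000, 2_000):.3f}")   # $0.528
```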

Strengths

  • Open weights under the Jamba Open Model License
  • Hybrid architecture: Mamba layers keep long-context inference near linear time
  • Industry-leading 256K context at 2024 release
  • Strong retrieval performance on RULER long-context benchmark

Limitations

  • Hybrid inference stack is less mature than pure Transformers; fewer runtimes support it
  • Custom Jamba license is permissive but not OSI-standard
  • Trails Llama 3.1 405B and DeepSeek V3 on general reasoning benchmarks
  • Community adoption smaller than Llama / Qwen / Mistral families

Use cases

  • Long-document Q&A and summarization at 256K context (see the loading sketch after this list)
  • RAG pipelines benefiting from low-latency long-prompt processing
  • Research on hybrid SSM+Transformer architectures
  • Enterprise workloads with mixed structured and unstructured inputs
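
For local experimentation with the use cases above, the checkpoint cited in the sources loads through Hugging Face transformers. This is a minimal sketch, assuming a recent transformers release with Jamba support, the Mamba kernels installed, and enough GPU memory to shard the 94B-active-parameter MoE; it is not a production serving setup, and the prompt is a placeholder.

```python
# Minimal sketch: load the open checkpoint from Hugging Face and answer a
# question over a long document. Assumes a recent transformers release with
# Jamba support, mamba-ssm kernels installed, and multiple GPUs with enough
# memory for the 94B-active-parameter MoE.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # shard layers across available GPUs
    torch_dtype="bfloat16",
)

# Placeholder prompt; in practice the document can run to ~256K tokens.
prompt = "Summarize the key obligations in the contract below.\n\n<document>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```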

Benchmarks

Benchmark      Score   As of
MMLU           ≈81%    2024-08
RULER 256K     ≈92%    2024-08
HumanEval      ≈71%    2024-08

Frequently asked questions

What is Jamba 1.5 Large?

AI21 Labs' 398B-total / 94B-active open-weights hybrid MoE, released in August 2024. It interleaves Mamba state-space layers with Transformer attention to deliver near-linear long-context inference and a 256K-token window.

Why hybrid SSM + Transformer?

Attention has quadratic cost in context length, while Mamba-style state-space models run in linear time. Jamba's hybrid approach keeps attention for quality while using Mamba layers to make 256K context economically viable.
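
To make that scaling difference concrete, the toy comparison below counts token interactions per layer at a few context lengths; it ignores constant factors and real kernel behavior, so the ratios are illustrative only.

```python
# Toy scaling comparison: one full-attention layer touches ~n^2 token pairs,
# while a Mamba-style SSM layer makes a single linear pass over the sequence.
# Constant factors and hardware effects are ignored; ratios are illustrative.
for n in (8_000, 64_000, 256_000):
    attn_ops = n * n    # pairwise score computations in one attention layer
    ssm_ops = n         # one scan step per token in one SSM layer
    print(f"n={n:>7,}  attention~{attn_ops:.1e}  ssm~{ssm_ops:.1e}  "
          f"ratio={attn_ops // ssm_ops:,}x")
```

At the full 256K window the toy per-layer gap reaches six orders of magnitude, which is why shifting most layers to linear-time Mamba matters at this scale.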

Should I deploy Jamba instead of a pure Transformer?

Consider Jamba when very long context with fast inference is a primary requirement. For general chat and coding, mainstream Transformer MoEs like DeepSeek V3 or Llama 4 Maverick have richer tooling and slightly better benchmark quality.

Sources

  1. AI21 Labs — Jamba 1.5 announcement — accessed 2026-04-20
  2. Hugging Face — ai21labs/AI21-Jamba-1.5-Large — accessed 2026-04-20