Jamba 1.5 Large
Jamba 1.5 Large is AI21 Labs' August 2024 open-weights flagship: a 398B-total / 94B-active hybrid MoE that interleaves Mamba state-space layers with Transformer attention. The hybrid design keeps long-context inference near-linear in cost while preserving attention quality, and the 256K window was the longest of any open-weights model at release.
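As a rough illustration of the interleaving: the original Jamba paper describes eight-layer blocks with one attention layer per seven Mamba layers and an MoE feed-forward on every other layer. The sketch below uses those ratios, which are assumptions carried over from that paper rather than confirmed Jamba 1.5 Large hyperparameters.

```python
# Illustrative layer schedule for a Jamba-style hybrid stack.
# The ratios (1 attention : 7 Mamba per 8-layer block, MoE every 2nd layer)
# follow the original Jamba paper and are ASSUMPTIONS here, not confirmed
# hyperparameters for Jamba 1.5 Large.

ATTN_EVERY = 8  # one attention layer per 8-layer block (assumption)
MOE_EVERY = 2   # MoE feed-forward on every other layer (assumption)

def layer_schedule(n_layers: int) -> list[str]:
    """Label each layer with its token mixer and feed-forward type."""
    labels = []
    for i in range(n_layers):
        mixer = "attention" if i % ATTN_EVERY == ATTN_EVERY - 1 else "mamba"
        ffn = "moe" if i % MOE_EVERY == MOE_EVERY - 1 else "dense"
        labels.append(f"{mixer}+{ffn}")
    return labels

for i, layer in enumerate(layer_schedule(8)):
    print(f"layer {i}: {layer}")
```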
Model specs
- Vendor: AI21 Labs
- Family: Jamba
- Released: 2024-08
- Context window: 256,000 tokens
- Modalities: text
- Input price: $2 / M tokens
- Output price: $8 / M tokens (see the cost sketch below)
- Pricing as of: 2026-04-20
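At the listed rates, a quick back-of-the-envelope cost sketch (the token counts are hypothetical):

```python
# Cost estimate at the listed rates: $2 per million input tokens,
# $8 per million output tokens. Token counts below are hypothetical.
IN_RATE_USD = 2.00
OUT_RATE_USD = 8.00

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * IN_RATE_USD + output_tokens / 1e6 * OUT_RATE_USD

# A full 256K-token prompt plus a 2K-token answer:
print(f"${request_cost_usd(256_000, 2_000):.3f}")  # $0.528
```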
Strengths
- Open weights under the Jamba Open Model License
- Hybrid architecture: Mamba layers keep long-context inference near-linear in cost
- Longest open-weights context window (256K) at its August 2024 release
- Strong retrieval scores on the RULER long-context benchmark
Limitations
- Hybrid inference stack is less mature than pure Transformers; fewer runtimes support it
- The custom Jamba Open Model License is permissive but not OSI-approved
- Trails Llama 3.1 405B and DeepSeek V3 on general reasoning benchmarks
- Community adoption is smaller than for the Llama / Qwen / Mistral families
Use cases
- Long-document Q&A and summarization at 256K context
- RAG pipelines benefiting from low-latency long-prompt processing
- Research on hybrid SSM+Transformer architectures
- Enterprise workloads with mixed structured and unstructured inputs
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈81% | 2024-08 |
| RULER 256K | ≈92% | 2024-08 |
| HumanEval | ≈71% | 2024-08 |
Frequently asked questions
What is Jamba 1.5 Large?
AI21 Labs' 398B-total / 94B-active open-weights hybrid MoE, released August 2024. It interleaves Mamba state-space layers with Transformer attention to keep long-context inference near-linear in cost within a 256K window.
Why hybrid SSM + Transformer?
Attention has quadratic cost in context length, while Mamba-style state-space models run in linear time. Jamba's hybrid approach keeps attention for quality while using Mamba layers to make 256K context economically viable.
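A toy comparison of how the two mixers scale with context length n (pure operation counts, not measured throughput):

```python
# Toy scaling comparison: attention forms an n x n score matrix (O(n^2)
# pairwise interactions), while a Mamba-style SSM makes one constant-size
# state update per token (O(n)). Illustrative counts only, not benchmarks.

def attention_ops(n: int) -> int:
    return n * n  # query-key pairs

def ssm_ops(n: int) -> int:
    return n      # sequential state updates

for n in (4_096, 32_768, 262_144):
    ratio = attention_ops(n) / ssm_ops(n)
    print(f"n={n:>7}: attention/ssm op ratio = {ratio:,.0f}x")
```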
Should I deploy Jamba instead of a pure Transformer?
Consider Jamba when very long context with fast inference is a primary requirement. For general chat and coding, mainstream Transformer MoEs like DeepSeek V3 or Llama 4 Maverick have richer tooling and slightly better benchmark quality.
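For experimentation, a minimal loading sketch, assuming a recent transformers release with Jamba support and hardware with enough memory for the checkpoint; the repo id matches the Hugging Face source below.

```python
# Minimal sketch: load Jamba 1.5 Large from Hugging Face and generate.
# Assumes a recent `transformers` with Jamba support and multi-GPU hardware;
# optional fused Mamba kernels can speed up inference but are not shown here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard across available GPUs
    torch_dtype="auto",  # keep the checkpoint's native precision
)

# Hypothetical prompt for illustration:
prompt = "Summarize the key obligations in the following contract:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```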
Sources
- AI21 Labs — Jamba 1.5 announcement — accessed 2026-04-20
- Hugging Face — ai21labs/AI21-Jamba-1.5-Large — accessed 2026-04-20