Agent Map-Reduce Pattern
When an input is too large for a single context window, or when per-chunk work is embarrassingly parallel, the map-reduce pattern splits the input into chunks, runs N map agents in parallel (one per chunk), then hands their outputs to a reduce agent that synthesizes the final answer. It is the standard approach for summarizing 1000-page PDFs, analyzing large codebases, or fanning a question out across many documents.
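The fan-out/fan-in flow above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `call_agent` is a hypothetical stand-in for whatever LLM client you use, and the fixed-size splitter ignores sentence and section boundaries that a real splitter would respect.

```python
from concurrent.futures import ThreadPoolExecutor

def call_agent(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call; replace with your client.
    return f"summary({prompt[:20]}...)"

def chunk(text: str, size: int) -> list[str]:
    # Naive fixed-size split; real splitters respect semantic boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def map_reduce(text: str, chunk_size: int = 4000) -> str:
    chunks = chunk(text, chunk_size)
    # Map: run one agent per chunk in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        partials = list(pool.map(
            lambda c: call_agent(f"Summarize this chunk:\n{c}"), chunks))
    # Reduce: a single agent synthesizes the partial outputs.
    combined = "\n\n".join(partials)
    return call_agent(f"Synthesize these partial summaries:\n{combined}")
```

Threads (rather than processes) are enough here because the map calls are I/O-bound API requests.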
Protocol facts
- Sponsor: Community pattern
- Status: stable
- Interop with: LangChain MapReduce chains, LlamaIndex tree summary, Ray
Frequently asked questions
When should I use map-reduce vs. long-context?
Long-context is simpler when the input fits (roughly 200K–1M tokens, depending on the model). Map-reduce wins when the input exceeds the context window, when chunks are independent, or when you want to parallelize for latency.
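The decision reduces to a token-budget check. A minimal heuristic sketch, assuming you already have a token count for the input and know your model's context limit (both numbers below are illustrative):

```python
def choose_strategy(input_tokens: int, context_limit: int = 200_000,
                    overhead: int = 4_000) -> str:
    # Prefer the simpler long-context path when the input fits,
    # leaving headroom (overhead) for instructions and the response.
    if input_tokens + overhead <= context_limit:
        return "long-context"
    return "map-reduce"
```

In practice you would also weigh latency: even when the input fits, fanning out can be faster if chunks are independent.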
Can the reduce step overflow too?
Yes, with very large inputs. The fix is recursive reduce: reduce in groups of K until a single output remains. LlamaIndex calls this 'tree summarization'.
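The recursive reduce can be sketched as a loop that repeatedly collapses groups of K partial outputs until one remains. `reduce_agent` is a hypothetical placeholder; here it just joins its inputs so the tree structure is visible:

```python
def reduce_agent(group: list[str]) -> str:
    # Hypothetical reduce call; a real agent would synthesize the group.
    return " | ".join(group)

def recursive_reduce(parts: list[str], k: int = 4) -> str:
    # Collapse groups of k until a single output remains
    # ("tree summarization"). Each round shrinks the list by ~k×,
    # so no single reduce call sees more than k inputs.
    while len(parts) > 1:
        parts = [reduce_agent(parts[i:i + k])
                 for i in range(0, len(parts), k)]
    return parts[0]
```

With k=4 and 1000 chunks, the tree needs only five rounds of reduction, and every reduce prompt stays bounded.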
How does this interact with consistency?
Each map call sees only its chunk, so cross-chunk facts can conflict. The reduce step must reconcile them explicitly, and for strict consistency you may need a second pass that re-checks the final against all chunks.
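One way to structure that reconciliation is a second pass that revises the draft answer against each original chunk. A sketch under the same hypothetical `call_agent` interface as above (the prompts are illustrative, not a prescribed format):

```python
def reduce_with_verification(chunks, partials, call_agent):
    # First pass: synthesize partial outputs, asking the agent to
    # flag any cross-chunk conflicts it notices.
    draft = call_agent("Synthesize, flagging any conflicting facts:\n\n"
                       + "\n\n".join(partials))
    # Second pass: re-check the draft against each source chunk and
    # revise it wherever it contradicts the source.
    for c in chunks:
        draft = call_agent(
            "Revise this answer so it is consistent with the source.\n"
            f"Answer:\n{draft}\n\nSource chunk:\n{c}")
    return draft
```

The second pass costs one extra call per chunk, so it is usually reserved for outputs where strict factual consistency matters.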
Sources
- LangChain — map-reduce chains — accessed 2026-04-20
- LlamaIndex — tree summary — accessed 2026-04-20