Agent Map-Reduce Pattern

When an input is too large for a single context window, or when per-chunk work is embarrassingly parallel, the map-reduce pattern splits the input, runs N map agents in parallel on the chunks, then hands their outputs to a reduce agent that synthesizes the final answer. It is the standard approach for summarizing 1000-page PDFs, analyzing large codebases, or fanning out a question across many documents.
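The fan-out/fan-in shape above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `llm` is a hypothetical stand-in for any model call, and the chunking is naive fixed-size splitting.

```python
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str) -> str:
    # Hypothetical model call; swap in any real LLM client here.
    return f"summary({prompt[:20]}...)"

def chunk(text: str, size: int) -> list[str]:
    # Naive fixed-size chunking; real systems split on semantic boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def map_reduce(text: str, size: int = 1000) -> str:
    chunks = chunk(text, size)
    # Map: run one agent per chunk in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda c: llm(f"Summarize:\n{c}"), chunks))
    # Reduce: a single agent synthesizes the partial outputs.
    return llm("Combine these summaries:\n" + "\n".join(partials))
```

Because the map calls are independent, total latency is roughly one map call plus one reduce call, regardless of chunk count (up to the pool's parallelism).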

Protocol facts

Sponsor: Community pattern
Status: stable
Interop with: LangChain MapReduce chains, LlamaIndex tree summary, Ray

Frequently asked questions

When should I use map-reduce vs. long-context?

Long-context is simpler when the input fits (up to ~200K–1M tokens). Map-reduce wins when input exceeds context, when chunks are independent, or when you want to parallelize for latency.

Can the reduce step overflow too?

Yes, when there are enough map outputs that their concatenation itself exceeds the context window. The fix is recursive reduction: reduce in pairs or groups of K until a single output remains. LlamaIndex calls this "tree summarization".
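Recursive reduction can be sketched as a simple loop over levels of a K-ary tree. The `llm` stub below is a placeholder for a real model call that merges several partial summaries.

```python
def llm(prompt: str) -> str:
    # Stub standing in for a real merge call; here it just
    # reports how many inputs were joined into the prompt.
    return "merged:" + str(prompt.count("\n") + 1)

def tree_reduce(parts: list[str], k: int = 4) -> str:
    # Merge in groups of k until a single output remains,
    # so no single reduce call sees more than k inputs.
    while len(parts) > 1:
        parts = [
            llm("\n".join(parts[i:i + k]))
            for i in range(0, len(parts), k)
        ]
    return parts[0]
```

With N map outputs this takes ceil(log_k N) reduce levels, and each reduce prompt stays bounded at k partial summaries.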

How does this interact with consistency?

Each map call sees only its chunk, so cross-chunk facts can conflict. The reduce step must reconcile them explicitly, and for strict consistency you may need a second pass that re-checks the final against all chunks.
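The second-pass re-check can be framed as one consistency query per chunk. A minimal sketch, assuming a hypothetical `llm` judge that answers YES or NO (stubbed here):

```python
def llm(prompt: str) -> str:
    # Stub; a real model would judge consistency per chunk.
    return "YES"

def recheck(final: str, chunks: list[str]) -> bool:
    # Second pass: confirm the reduced answer against every chunk.
    return all(
        llm(
            "Is this answer consistent with the source? Reply YES or NO.\n"
            f"Answer: {final}\nSource: {c}"
        ).strip().upper() == "YES"
        for c in chunks
    )
```

This costs one extra call per chunk, so in practice it is reserved for outputs where cross-chunk contradictions are likely or costly.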

Sources

  1. LangChain — map-reduce chains — accessed 2026-04-20
  2. LlamaIndex — tree summary — accessed 2026-04-20