Creativity · Agent Protocol

Agent Voting / Ensemble Pattern

For high-stakes or ambiguous tasks, relying on a single agent run is a gamble. The voting-ensemble pattern runs N agents (the same model with different seeds, or a mix of models) on the same prompt, then selects the final answer either by majority vote or by having a judge LLM pick the best response. You pay roughly N times the cost, plus the judge call if you use one, in exchange for fewer tail-risk errors.
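The majority-vote variant can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `Agent` type and the stand-in lambdas are hypothetical, and a real agent would call a model instead of returning a fixed string.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical agent type: any callable that maps a prompt to an answer string.
Agent = Callable[[str], str]


def majority_vote(agents: Sequence[Agent], prompt: str) -> tuple[str, float]:
    """Run every agent on the same prompt and return (winner, agreement ratio)."""
    if not agents:
        raise ValueError("need at least one agent")
    answers = [agent(prompt) for agent in agents]
    # most_common(1) yields the most frequent answer; on a tie, the answer
    # that was seen first wins.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


# Stand-in agents for illustration only.
agents = [lambda p: "42", lambda p: "42", lambda p: "41"]
winner, agreement = majority_vote(agents, "What is 6 * 7?")
print(winner, round(agreement, 2))  # 42 0.67
```

The agreement ratio is worth returning alongside the winner: a low value suggests the task is ambiguous and may deserve escalation rather than a silent pick.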

Protocol facts

Sponsor: Research community
Status: stable
Interop with: LangGraph, AutoGen, any parallel agent framework

Frequently asked questions

What's the difference between voting and judge-based aggregation?

Voting counts identical answers, so it works for multiple-choice questions or extractable facts where responses can be compared directly. Judge-based aggregation uses an LLM to read all N responses and pick or synthesize the best one, so it suits open-ended work such as essays or code.
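To make the contrast concrete, here is a rough sketch of both aggregators side by side. The `normalize` rule, the prompt wording, and the `judge_llm` callable are all assumptions made for illustration; a real judge would be a model call, and its reply format would need more careful parsing.

```python
import re
from collections import Counter
from typing import Callable, Sequence


def normalize(answer: str) -> str:
    """Crude canonical form so trivially different strings count as one vote."""
    return re.sub(r"[^a-z0-9]+", " ", answer.lower()).strip()


def vote(responses: Sequence[str]) -> str:
    """Voting: count normalized answers and return the most frequent one."""
    return Counter(normalize(r) for r in responses).most_common(1)[0][0]


def judge(responses: Sequence[str], task: str, judge_llm: Callable[[str], str]) -> str:
    """Judge-based: ask a (hypothetical) judge model which response is best."""
    numbered = "\n\n".join(f"[{i}] {r}" for i, r in enumerate(responses))
    prompt = (
        f"Task: {task}\n\nCandidate responses:\n{numbered}\n\n"
        "Reply with only the number of the best response."
    )
    match = re.search(r"\d+", judge_llm(prompt))
    index = int(match.group()) if match else 0
    # Fall back to the first response if the judge reply is unusable.
    if not 0 <= index < len(responses):
        index = 0
    return responses[index]


print(vote(["Paris", "paris.", "Lyon"]))  # paris
# Stub judge that always picks candidate 1, for illustration only.
print(judge(["draft A", "draft B"], "Write a haiku", lambda p: "1"))  # draft B
```

Voting needs answers that collapse to a comparable form, which is why the normalization step matters; the judge path needs no such step but adds one more model call.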

Should the N agents be identical?

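Not necessarily. Diverse ensembles (different models, different prompts, different temperatures) often outperform N copies of the same agent because their errors are less correlated: agents that fail in different ways rarely agree on the same wrong answer, so wrong answers tend to get outvoted.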
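A small calculation shows why uncorrelated errors matter. It assumes an idealized case that real ensembles only approximate: a right-or-wrong task, and agents whose errors are fully independent, each correct with the same probability `p`. Under those assumptions, the chance that a strict majority is correct follows the binomial distribution.

```python
from math import comb


def majority_accuracy(p: float, n: int) -> float:
    """P(strict majority of n independent agents is correct), each correct with prob. p."""
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


for n in (1, 3, 5):
    print(n, round(majority_accuracy(0.8, n), 3))
# 1 0.8
# 3 0.896
# 5 0.942
```

At the other extreme, N copies that always make the same mistakes vote as a block, so the ensemble stays at the single-agent accuracy of 0.8 no matter how large N is. Real ensembles fall somewhere between the two cases, and diversity pushes them toward the independent one.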

When is it not worth it?

For simple, well-bounded tasks where a single strong agent is already more than 99% accurate. The extra cost buys little, and latency grows linearly with N unless the agents run in parallel.
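The latency point can be illustrated with `asyncio.gather`. In this sketch `fake_agent` is a hypothetical stand-in that sleeps to simulate a model call; with concurrent execution the wall-clock time tracks the slowest agent rather than the sum of all of them. Cost still scales with N either way.

```python
import asyncio
import time


async def fake_agent(prompt: str, delay: float) -> str:
    """Hypothetical agent: sleeps to simulate model latency, then answers."""
    await asyncio.sleep(delay)
    return "answer"


async def run_parallel(prompt: str, delays: list[float]) -> tuple[list[str], float]:
    """Run one simulated agent per delay concurrently and time the whole batch."""
    start = time.perf_counter()
    answers = await asyncio.gather(*(fake_agent(prompt, d) for d in delays))
    return list(answers), time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
answers, elapsed = asyncio.run(run_parallel("same prompt", delays))
# Sequential execution would take about sum(delays); concurrent execution
# should finish in roughly max(delays).
print(len(answers), f"{elapsed:.2f}s vs {sum(delays):.2f}s sequential")
```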

Sources

  1. Self-Consistency paper (Wang et al. 2022) — accessed 2026-04-20
  2. LLM-as-a-Judge paper — accessed 2026-04-20