ReAct vs Reflexion

ReAct and Reflexion are two classic agent patterns. ReAct (Reason + Act) interleaves chain-of-thought reasoning with tool calls: the model thinks, acts via a tool, observes the result, and continues. Reflexion wraps an agent in a self-critique loop: after an attempt, the agent reflects on what went wrong, stores a lesson in memory, and tries again. In 2026, ReAct is implicit in most agent frameworks; Reflexion is useful when the task allows multiple attempts and you have a verifier to judge each one.
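
The think, act, observe cycle can be sketched in a few lines of Python. This is a minimal illustration, not any framework's API: `scripted_model` is a hypothetical stand-in for an LLM call, `TOOLS` is a made-up tool registry, and the step format (a dict with `thought`, `action`, `input`, or `final`) is an assumption chosen for readability.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical tool registry: tool name -> function from string to string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

def scripted_model(transcript: List[str]) -> dict:
    """Stand-in for an LLM: calls a tool once, then answers from the observation."""
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        return {"thought": "I should compute the sum with a tool.",
                "action": "add", "input": "2+3"}
    result = observations[-1].split(": ", 1)[1]
    return {"thought": "I have the result.", "final": result}

def react(task: str, model=scripted_model, max_steps: int = 5) -> Optional[str]:
    """Run Think -> Act -> Observe until the model answers or max_steps is hit."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        step = model(transcript)                      # Think
        transcript.append(f"Thought: {step['thought']}")
        if "final" in step:                           # the agent decides to stop
            return step["final"]
        observation = TOOLS[step["action"]](step["input"])  # Act
        transcript.append(f"Action: {step['action']}[{step['input']}]")
        transcript.append(f"Observation: {observation}")    # Observe
    return None  # max-steps guard: no answer within the step budget

print(react("What is 2+3?"))
```

In a real agent the model call would replace `scripted_model`, and the transcript would be sent as the prompt on every step.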

Side-by-side

| Criterion | ReAct | Reflexion |
|---|---|---|
| Loop structure | Think -> Act -> Observe | Try -> Self-critique -> Remember -> Retry |
| Typical attempts per task | One (with many internal steps) | Multiple (2-10) |
| Requires a verifier | No | Yes (needs a signal to know whether an attempt was good or bad) |
| Compute cost per task | 1x (scaled by number of tool calls) | 2-10x (repeat attempts) |
| Best on | Tool-use tasks with a clear end state (agents, coding, search) | Tasks with testable outcomes (unit tests, benchmark scores) |
| Memory across attempts | Short-term, in-context only | Persistent 'lessons' across attempts |
| Popular frameworks | LangGraph, Pydantic AI, CrewAI, OpenAI Agents SDK | Custom; LangGraph supports it via a reflection node |
| When to stop | Agent decides, or max-steps guard | Attempt passes verifier, or max-attempts exhausted |

Verdict

ReAct is the default modern agent pattern: nearly every agent framework in 2026 implements it out of the box. It is simple, effective, and composable with tools. Reflexion is a layer on top: wrap a ReAct agent in a self-critique and retry loop when you have a reliable verifier (test suite, eval harness, human approval) and repeated attempts are affordable. Reflexion gives a large boost to coding agents with test runners, where the verifier is simply 'do the tests pass?'. For most conversational agents, vanilla ReAct is enough. Combining the two is common on hard coding benchmarks.

When to choose each

Choose ReAct if…

  • You're building a general-purpose agent with tool use.
  • You have a single-attempt budget per task.
  • You want the simplest, most-supported pattern.
  • Your verifier signal is weak or missing.

Choose Reflexion if…

  • You have a reliable verifier (tests, evals, scoring).
  • Multiple attempts per task are affordable.
  • The task benefits from cumulative learning across attempts.
  • You're building a coding agent running against a test suite.

Frequently asked questions

Can I combine ReAct and Reflexion?

Yes — that's the typical setup for high-performing coding agents. Each attempt is a full ReAct loop (think, act with tools, observe). Reflexion wraps the attempts: after a failed attempt, the agent reflects, stores lessons in memory, and starts a new ReAct loop.
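
The outer retry loop described above might look like the following sketch. All names here (`reflexion`, `toy_attempt`, `toy_verify`, `toy_reflect`) are hypothetical. In a real system, `attempt` would run a full ReAct loop and `reflect` would be an LLM call; in this toy, a candidate is just a Python function and the verifier is a pair of test cases.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[list], object]  # toy assumption: a candidate is a function

def reflexion(
    task: str,
    attempt: Callable[[str, List[str]], Candidate],
    verify: Callable[[Candidate], Tuple[bool, str]],
    reflect: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[Optional[Candidate], List[str]]:
    """Try -> verify -> reflect -> remember -> retry, up to max_attempts."""
    lessons: List[str] = []                  # memory that persists across attempts
    for _ in range(max_attempts):
        candidate = attempt(task, lessons)   # one full inner attempt
        passed, feedback = verify(candidate)
        if passed:
            return candidate, lessons
        lessons.append(reflect(feedback))    # store a lesson for the next attempt
    return None, lessons                     # max-attempts exhausted

def toy_attempt(task: str, lessons: List[str]) -> Candidate:
    # Pretend agent: only handles empty input once a lesson mentions it.
    if any("empty" in lesson for lesson in lessons):
        return lambda xs: xs[0] if xs else None
    return lambda xs: xs[0]

def toy_verify(candidate: Candidate) -> Tuple[bool, str]:
    # Stand-in for a test suite: each case is (input, expected output).
    for xs, expected in [([1, 2], 1), ([], None)]:
        try:
            if candidate(xs) != expected:
                return False, f"wrong result for input {xs!r}"
        except Exception as exc:
            return False, f"{type(exc).__name__} for input {xs!r}"
    return True, "all tests passed"

def toy_reflect(feedback: str) -> str:
    return f"Handle empty input explicitly; last failure: {feedback}."

solution, lessons = reflexion("write head(xs)", toy_attempt, toy_verify, toy_reflect)
print(len(lessons), lessons)
```

Here the first attempt fails on the empty list, one lesson is stored, and the second attempt passes. The loop stops on the first verified success or when the attempt cap is reached.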

How is Reflexion different from RLHF?

RLHF updates the model weights using human preference data. Reflexion is a purely inference-time pattern: there are no weight updates, only in-context learning from the agent's own reflections. It is therefore much cheaper and can be applied immediately, without retraining.
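
To make "in-context learning from reflections" concrete: the stored lessons are simply text that gets added to the next prompt. The sketch below assumes a hypothetical `build_prompt` helper and an arbitrary prompt layout; neither comes from the sources.

```python
from typing import List

def build_prompt(task: str, lessons: List[str]) -> str:
    """Render the task plus any stored lessons as plain prompt text."""
    parts = [f"Task: {task}"]
    if lessons:
        parts.append("Lessons from earlier attempts:")
        parts.extend(f"- {lesson}" for lesson in lessons)
    parts.append("Respond with your next attempt.")
    return "\n".join(parts)

# The model itself is unchanged between attempts; only the prompt differs.
print(build_prompt("Fix the failing parser test", []))
print(build_prompt("Fix the failing parser test",
                   ["The parser must accept trailing commas."]))
```

Because the only thing that changes is the prompt, the same model can be reused as-is, which is why the pattern needs no training run.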

What's a good max-attempts cap for Reflexion?

Most papers and production setups use 3-5 attempts. Returns diminish quickly past that, especially if the verifier signal is noisy. Always pair the cap with an explicit budget (time, tokens, or cost) to avoid runaway loops.
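
A rough way to see the diminishing returns is to assume, purely for illustration, that every attempt passes the verifier independently with the same probability `p`. This is a simplification: Reflexion is meant to raise the pass rate across attempts, and the value `p = 0.5` below is invented, not taken from any benchmark. Under that assumption, the chance of success within `n` attempts is `1 - (1 - p)**n`, and the expected number of attempts under a cap follows from the truncated geometric series.

```python
from typing import List

def cumulative_success(p: float, max_attempts: int) -> List[float]:
    """P(at least one pass within n attempts) for n = 1..max_attempts."""
    return [1 - (1 - p) ** n for n in range(1, max_attempts + 1)]

def expected_attempts(p: float, cap: int) -> float:
    """Expected attempts used when stopping at the first pass or at the cap (p > 0)."""
    return (1 - (1 - p) ** cap) / p

curve = cumulative_success(0.5, 5)
print(curve)                      # [0.5, 0.75, 0.875, 0.9375, 0.96875]
print(expected_attempts(0.5, 5))  # 1.9375
```

With these made-up numbers, the fifth attempt adds only about three percentage points of success, while the first adds fifty. A noisy verifier makes the later attempts even less valuable.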

Sources

  1. Yao et al. — ReAct — accessed 2026-04-20
  2. Shinn et al. — Reflexion — accessed 2026-04-20