Agent Memory Patterns vs RAG

People often conflate these two. RAG retrieves relevant chunks from a static or slowly-changing corpus at every call. Agent memory maintains evolving state: what the agent has learned, what the user prefers, what happened last session. A real assistant usually needs both, and mixing them up is a common architectural bug.

Side-by-side

| Criterion | Agent Memory Patterns | Retrieval-Augmented Generation (RAG) |
|---|---|---|
| Goal | Maintain evolving per-agent or per-user state | Ground answers in external documents |
| State persistence | Persistent across sessions | Typically stateless per call |
| Primary storage | Summarised episodes, KV store, vector store | Document corpus + vector index |
| Update pattern | Write after each task / session | Rebuild or incrementally index corpus |
| Typical libraries | Letta, MemGPT, Mem0, LangGraph checkpoints | LangChain, LlamaIndex, Haystack, R2R |
| Evaluation challenge | Does the agent remember useful things correctly? | Are retrieved chunks relevant and grounded? |
| Risk of staleness | High: memories drift if never updated | High: corpus must be re-indexed on change |
| Best fit | Personal assistants, long-running agents | Knowledge bots over a document corpus |
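
To make the "incrementally index corpus" update pattern concrete, here is a minimal sketch. It assumes documents arrive as a mapping of id to text and that a content hash is an acceptable change signal. `IncrementalIndex`, `sync`, and `doc_ids` are hypothetical names, and the blank-line chunking stands in for whatever chunker and embedder a real pipeline would use.

```python
import hashlib


class IncrementalIndex:
    """Toy index that re-chunks only documents whose content changed."""

    def __init__(self) -> None:
        self._hashes: dict[str, str] = {}
        self._chunks: dict[str, list[str]] = {}

    def sync(self, docs: dict[str, str]) -> list[str]:
        """Bring the index in line with `docs`; return ids that were (re)indexed."""
        changed = []
        for doc_id, text in docs.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self._hashes.get(doc_id) != digest:
                self._hashes[doc_id] = digest
                # Placeholder chunking: split on blank lines.
                self._chunks[doc_id] = [p.strip() for p in text.split("\n\n") if p.strip()]
                changed.append(doc_id)
        # Drop documents that no longer exist in the corpus.
        for doc_id in list(self._hashes):
            if doc_id not in docs:
                del self._hashes[doc_id]
                del self._chunks[doc_id]
        return sorted(changed)

    def doc_ids(self) -> list[str]:
        return sorted(self._chunks)
```

Calling `sync` on a schedule or from a change hook keeps the staleness risk on the RAG side bounded without a full rebuild.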

Verdict

Treat these as complementary. RAG gives an agent access to authoritative external knowledge — docs, tickets, manuals — without bloating the prompt. Agent memory gives an agent a sense of continuity — 'we tried that last week and it failed', 'the user prefers concise answers'. A mature assistant typically uses RAG for facts and a memory subsystem for state, with clear boundaries between them so neither gets abused as a substitute for the other.
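
One way to picture that boundary is a sketch in which facts come from a read-only corpus and state comes from a writable per-user store, and the two only meet when the prompt is assembled. Everything here is illustrative: `Corpus`, `UserMemory`, and `build_prompt` are hypothetical names, and the keyword-overlap scoring is a stand-in for real embedding search.

```python
from dataclasses import dataclass, field


@dataclass
class Corpus:
    """Read-only document store (the RAG side). Toy keyword scoring."""
    docs: dict[str, str]

    def retrieve(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        terms = set(query.lower().split())
        scored = []
        for doc_id, text in self.docs.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, doc_id, text))
        # Highest overlap first; ties broken by document id.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


@dataclass
class UserMemory:
    """Mutable per-user state (the memory side)."""
    notes: dict[str, list[str]] = field(default_factory=dict)

    def remember(self, user_id: str, note: str) -> None:
        self.notes.setdefault(user_id, []).append(note)

    def recall(self, user_id: str) -> list[str]:
        return list(self.notes.get(user_id, []))


def build_prompt(corpus: Corpus, memory: UserMemory, user_id: str, question: str) -> str:
    """Keep facts and state in separate, labelled prompt sections."""
    lines = ["# Facts (from corpus)"]
    lines += [f"[{doc_id}] {text}" for doc_id, text in corpus.retrieve(question)]
    lines.append("# State (from memory)")
    lines += [f"- {note}" for note in memory.recall(user_id)]
    lines.append("# Question")
    lines.append(question)
    return "\n".join(lines)
```

Note that only `UserMemory` exposes a write method. Keeping the corpus read-only from the agent's point of view is one simple way to stop memory from leaking into the knowledge base.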

When to choose each

Choose Agent Memory Patterns if…

  • You're building a long-running personal or coding assistant.
  • Users expect continuity across sessions.
  • You need per-user preferences and behavioural history.
  • Your agent learns from feedback over time.

Choose Retrieval-Augmented Generation (RAG) if…

  • You need to ground answers in a document corpus (manuals, tickets, papers).
  • Your source documents get revised and answers must track the authoritative version.
  • You want to cite sources in responses.
  • Your task is primarily Q&A, not long-term collaboration.

Frequently asked questions

Isn't RAG a form of memory?

RAG is retrieval over a corpus: external knowledge, not agent state. Agent memory is usually about what the agent or user did, preferred, and decided. Implementations overlap (both often sit on a vector store), but the design intent differs.

Can I use the same vector store for both?

Technically yes, but keep namespaces or collections separate. Mixing user memories with corpus chunks is a fast path to privacy leaks and confused retrieval.
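
As a rough sketch of that separation, the toy store below makes the namespace a required argument on every read and write, so a search can never cross from one user's memories into the corpus or into another user's data. `NamespacedVectorStore` and the `memory:<user>` naming scheme are assumptions for illustration, and the hand-written vectors stand in for real embeddings; production vector databases expose their own collection or namespace mechanisms.

```python
import math
from dataclasses import dataclass, field


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class NamespacedVectorStore:
    # namespace -> list of (item_id, vector, text)
    _items: dict[str, list[tuple[str, list[float], str]]] = field(default_factory=dict)

    def add(self, namespace: str, item_id: str, vector: list[float], text: str) -> None:
        self._items.setdefault(namespace, []).append((item_id, vector, text))

    def search(self, namespace: str, query: list[float], k: int = 3) -> list[tuple[str, str]]:
        """Rank items in exactly one namespace; other namespaces are never read."""
        candidates = self._items.get(namespace, [])
        ranked = sorted(candidates, key=lambda item: cosine(query, item[1]), reverse=True)
        return [(item_id, text) for item_id, _, text in ranked[:k]]
```

A caller would query `"corpus"` for document chunks and `f"memory:{user_id}"` for that user's memories, then merge the two result lists explicitly if both are needed.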

Which should VSET students learn first?

Start with RAG, since it's the foundation everyone needs. Add memory patterns once your agent genuinely needs continuity; a lot of projects fake memory by just appending the last turn to the prompt, which is short-lived context rather than persistent state.
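
The difference between faked and real memory can be shown in a few lines. This is a minimal sketch, assuming a single user and a local JSON file as the store; `fake_memory_prompt` and `PersistentMemory` are hypothetical names, not any library's API.

```python
import json
from pathlib import Path


def fake_memory_prompt(last_turn: str, question: str) -> str:
    """'Memory' that is only the previous turn; nothing survives a restart."""
    return f"Previous turn: {last_turn}\nUser: {question}"


class PersistentMemory:
    """Key-value facts written to disk so a later session can reload them."""

    def __init__(self, path: Path) -> None:
        self.path = path
        if path.exists():
            self.facts = json.loads(path.read_text(encoding="utf-8"))
        else:
            self.facts = {}

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts, indent=2), encoding="utf-8")

    def as_prompt_prefix(self) -> str:
        return "\n".join(f"{key}: {value}" for key, value in sorted(self.facts.items()))
```

Constructing a second `PersistentMemory` on the same path in a new process restores what the first one stored, which is the continuity the appended-turn approach cannot provide.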

Sources

  1. Letta / MemGPT — documentation — accessed 2026-04-20
  2. LlamaIndex — RAG overview — accessed 2026-04-20