Agent Memory Patterns vs RAG

RAG and agent memory are often conflated. RAG retrieves relevant chunks at every call from a corpus that is static or changes slowly relative to the conversation. Agent memory maintains evolving state: what the agent has learned, what the user prefers, what happened last session. A real assistant usually needs both, and using one as a substitute for the other is a common architectural bug.

Side-by-side

Criterion	Agent Memory Patterns	Retrieval-Augmented Generation (RAG)
Goal	Maintain evolving per-agent or per-user state	Ground answers in external documents
State persistence	Persistent across sessions	Typically stateless per call
Primary storage	Summarised episodes, KV store, vector store	Document corpus + vector index
Update pattern	Write after each task / session	Rebuild or incrementally index corpus
Typical libraries	Letta (formerly MemGPT), Mem0, LangGraph checkpoints	LangChain, LlamaIndex, Haystack, R2R
Evaluation challenge	Does the agent remember useful things correctly?	Are retrieved chunks relevant and grounded?
Risk of staleness	High — memories drift if never updated	High — corpus must be re-indexed on change
Best fit	Personal assistants, long-running agents	Knowledge bots over a document corpus
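
The "incrementally index corpus" update pattern from the table can be sketched roughly as below. This is an illustration only, not any library's API: content_hash and plan_reindex are hypothetical helper names, and the sketch assumes the index keeps a content hash per document ID so that only changed documents need to be embedded again.

```python
import hashlib


def content_hash(text: str) -> str:
    """Fingerprint a document so changes can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(indexed_hashes: dict[str, str], documents: dict[str, str]):
    """Compare the current corpus with what the index last saw.

    indexed_hashes: doc_id -> hash stored at the previous indexing run (assumed).
    documents:      doc_id -> current text of every document in the corpus.
    Returns (ids to embed again, ids to delete from the index).
    """
    to_upsert = [
        doc_id
        for doc_id, text in documents.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    ]
    to_delete = [doc_id for doc_id in indexed_hashes if doc_id not in documents]
    return to_upsert, to_delete
```

Unchanged documents are skipped, which is what keeps incremental indexing cheaper than a full rebuild; a stale index is what you get when this step is never run.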

Verdict

Treat these as complementary. RAG gives an agent access to authoritative external knowledge — docs, tickets, manuals — without bloating the prompt. Agent memory gives an agent a sense of continuity — 'we tried that last week and it failed', 'the user prefers concise answers'. A mature assistant typically uses RAG for facts and a memory subsystem for state, with clear boundaries between them so neither gets abused as a substitute for the other.
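
One way to picture the boundary described above is a prompt builder that reads facts from a corpus index and state from a separate memory object. This is a minimal sketch under stated assumptions: CorpusIndex, UserMemory, and build_prompt are hypothetical names, and the word-overlap scoring is only a stand-in for a real vector search.

```python
from dataclasses import dataclass, field


@dataclass
class CorpusIndex:
    """Document store the agent only reads from; stands in for a vector index."""
    chunks: list[str]

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Toy relevance score: number of query words that also appear in the chunk.
        words = set(query.lower().split())
        scored = [(len(words & set(c.lower().split())), c) for c in self.chunks]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [chunk for score, chunk in ranked[:k] if score > 0]


@dataclass
class UserMemory:
    """Mutable per-user state, written after each task or session."""
    notes: dict[str, list[str]] = field(default_factory=dict)

    def remember(self, user_id: str, note: str) -> None:
        self.notes.setdefault(user_id, []).append(note)

    def recall(self, user_id: str) -> list[str]:
        return list(self.notes.get(user_id, []))


def build_prompt(question: str, user_id: str,
                 corpus: CorpusIndex, memory: UserMemory) -> str:
    """Keep facts and state in separate, labelled prompt sections."""
    facts = corpus.retrieve(question)
    state = memory.recall(user_id)
    return "\n".join(
        ["[Retrieved documents]", *facts, "[User memory]", *state,
         "[Question]", question]
    )
```

The two stores never write into each other: the corpus has no method for adding user notes, and memory is always keyed by user.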

When to choose each

Choose Agent Memory Patterns if…

You're building a long-running personal or coding assistant.
Users expect continuity across sessions.
You need per-user preferences and behavioural history.
Your agent learns from feedback over time.

Choose Retrieval-Augmented Generation (RAG) if…

You need to ground answers in a document corpus (manuals, tickets, papers).
Your knowledge base is updated regularly and answers must reflect the authoritative version.
You want to cite sources in responses.
Your task is primarily Q&A, not long-term collaboration.

Frequently asked questions

Isn't RAG a form of memory?

RAG is retrieval over a corpus: it supplies external knowledge, not agent state. Agent memory usually records what the agent or user did, preferred, and decided. Implementations overlap, since both can be backed by a vector store, but the design intent differs.

Can I use the same vector store for both?

Technically yes, but keep namespaces or collections separate. Mixing user memories with corpus chunks is a fast path to privacy leaks and confused retrieval.
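
The separation can be sketched with a toy in-memory store in which every search is scoped to exactly one namespace. NamespacedStore and the namespace names ("corpus", "memory:alice", "memory:bob") are hypothetical, and the two-dimensional vectors are hand-written placeholders for real embeddings; real vector databases expose namespaces or collections through their own APIs.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedStore:
    """Toy vector store: items live in a namespace and never leave it."""

    def __init__(self):
        self._items = defaultdict(list)  # namespace -> [(vector, text)]

    def add(self, namespace, vector, text):
        self._items[namespace].append((vector, text))

    def search(self, namespace, query_vector, k=3):
        # Only the requested namespace is scanned, so one user's memories
        # cannot surface in a corpus query or in another user's query.
        candidates = self._items.get(namespace, [])
        ranked = sorted(
            candidates,
            key=lambda item: cosine(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = NamespacedStore()
store.add("corpus", [1.0, 0.0], "Refund policy: 30 days.")
store.add("memory:alice", [0.9, 0.1], "Alice asked for a refund last week.")
store.add("memory:bob", [0.0, 1.0], "Bob prefers concise answers.")
```

Because the namespace is a required argument rather than an optional filter, forgetting it is an error instead of a silent cross-user leak.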

Which should VSET students learn first?

Start with RAG; it is the foundation everyone needs. Add memory patterns once your agent genuinely needs continuity. Until then, note that a lot of projects get by with faking memory: they simply append the last turn to the prompt.
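
To make the difference concrete, here is a rough sketch contrasting the last-turn shortcut with memory that survives a restart. All names (fake_memory_prompt, PersistentMemory, simulate_two_sessions) are hypothetical, and a JSON file is only an assumed stand-in for whatever store a real memory subsystem would use.

```python
import json
import tempfile
from pathlib import Path


def fake_memory_prompt(last_turn: str, message: str) -> str:
    """The shortcut: only the previous turn is carried forward."""
    return f"Previous turn: {last_turn}\nUser: {message}"


class PersistentMemory:
    """Per-user notes saved to a JSON file so they outlive the process."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier notes if a previous session left any behind.
        self.notes = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, user_id: str, note: str) -> None:
        self.notes.setdefault(user_id, []).append(note)
        self.path.write_text(json.dumps(self.notes))

    def recall(self, user_id: str) -> list[str]:
        return list(self.notes.get(user_id, []))


def simulate_two_sessions() -> list[str]:
    """Write a note in one 'session', then read it from a fresh object."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "memory.json"
        first = PersistentMemory(path)
        first.remember("u1", "prefers concise answers")
        second = PersistentMemory(path)  # new object, as after a restart
        return second.recall("u1")
```

The shortcut needs no storage at all, which is why it is popular, but nothing in it can answer a question about last week's session.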

Sources

Letta / MemGPT — documentation — accessed 2026-04-20
LlamaIndex — RAG overview — accessed 2026-04-20