Agent Memory Patterns vs RAG
People often conflate these two. RAG is about retrieving relevant chunks from a static or slowly-changing corpus at every call. Agent memory is about maintaining evolving state — what the agent has learned, what the user prefers, what happened last session. A real assistant usually needs both, and mixing them up is a common architectural bug.
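A minimal sketch can make the distinction concrete. Everything in it is hypothetical and for illustration only: `retrieve` ranks chunks by naive word overlap where a real system would use embedding search, and `AgentMemory` is a plain dictionary, not the API of any particular memory library.

```python
from dataclasses import dataclass, field
from typing import Optional


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """RAG side: a stateless lookup over a corpus (toy word-overlap scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda chunk: len(query_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


@dataclass
class AgentMemory:
    """Memory side: state that the agent itself writes and later reads back."""
    facts: dict[str, str] = field(default_factory=dict)

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value  # newer observations overwrite older ones

    def recall(self, key: str) -> Optional[str]:
        return self.facts.get(key)


corpus = [
    "To reset the router hold the button for ten seconds",
    "The warranty covers two years of normal use",
]
memory = AgentMemory()

# Retrieval depends only on the query and the corpus.
first = retrieve("how do I reset the router", corpus)
second = retrieve("how do I reset the router", corpus)

# Memory depends on what happened earlier in the agent's life.
before = memory.recall("answer_style")
memory.remember("answer_style", "concise")
after = memory.recall("answer_style")
```

The same query against the same corpus returns the same chunk every time, while the memory lookup returns a different answer once the agent has written to it.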
Side-by-side
| Criterion | Agent Memory Patterns | Retrieval-Augmented Generation (RAG) |
|---|---|---|
| Goal | Maintain evolving per-agent or per-user state | Ground answers in external documents |
| State persistence | Persistent across sessions | Typically stateless per call |
| Primary storage | Summarised episodes, KV store, vector store | Document corpus + vector index |
| Update pattern | Write after each task / session | Rebuild or incrementally index corpus |
| Typical libraries | Letta (formerly MemGPT), Mem0, LangGraph checkpoints | LangChain, LlamaIndex, Haystack, R2R |
| Evaluation challenge | Does the agent remember useful things correctly? | Are retrieved chunks relevant and grounded? |
| Risk of staleness | High — memories drift if never updated | High — corpus must be re-indexed on change |
| Best fit | Personal assistants, long-running agents | Knowledge bots over a document corpus |
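The "Update pattern" and "Risk of staleness" rows imply two different maintenance loops. The following is a rough sketch of both, assuming a hypothetical in-memory index that maps document IDs to content hashes and a hypothetical list of episode summaries; a real pipeline would re-embed the changed documents and persist memories to a database.

```python
import hashlib


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def reindex_changed(docs: dict[str, str], index: dict[str, str]) -> list[str]:
    """RAG side: refresh only documents whose content changed; drop deleted ones.

    `index` maps doc ID -> hash of the version that was last indexed.
    Returns the IDs that would need re-embedding in a real pipeline.
    """
    changed = [doc_id for doc_id, text in docs.items()
               if index.get(doc_id) != content_hash(text)]
    for doc_id in changed:
        index[doc_id] = content_hash(docs[doc_id])
    for doc_id in list(index):
        if doc_id not in docs:
            del index[doc_id]  # stale entry: the document no longer exists
    return sorted(changed)


def write_episode(episodes: list[dict], session_id: str, summary: str) -> None:
    """Memory side: append one summarised episode when a session ends."""
    episodes.append({"session": session_id, "summary": summary})


docs = {"manual": "Hold reset for 10 seconds.", "faq": "Warranty lasts 2 years."}
index: dict[str, str] = {}
first_pass = reindex_changed(docs, index)    # everything is new

docs["faq"] = "Warranty lasts 3 years."      # the corpus changes
del docs["manual"]
second_pass = reindex_changed(docs, index)   # only the edited document

episodes: list[dict] = []
write_episode(episodes, "s1", "User asked about warranty; prefers short answers.")
```

The corpus index is driven by changes to the documents, whereas the memory log is driven by what the agent does, which is why the two tend to go stale in different ways.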
Verdict
Treat these as complementary. RAG gives an agent access to authoritative external knowledge — docs, tickets, manuals — without bloating the prompt. Agent memory gives an agent a sense of continuity — 'we tried that last week and it failed', 'the user prefers concise answers'. A mature assistant typically uses RAG for facts and a memory subsystem for state, with clear boundaries between them so neither gets abused as a substitute for the other.
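One way to keep that boundary visible is at prompt-assembly time. The sketch below is an assumption about how this might look, not a prescribed format: `build_prompt` and its section labels are hypothetical, and the retrieved chunks and memories are passed in as plain strings from whatever retriever and memory subsystem you use.

```python
def build_prompt(question: str, retrieved_chunks: list[str], memories: list[str]) -> str:
    """Keep corpus facts and agent state in separate, labelled prompt sections."""
    facts = "\n".join(f"- {chunk}" for chunk in retrieved_chunks) or "- (none)"
    state = "\n".join(f"- {item}" for item in memories) or "- (none)"
    return (
        "Reference material (from the document corpus; cite when used):\n"
        f"{facts}\n\n"
        "What we know about this user and past sessions:\n"
        f"{state}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    question="How do I roll back the deployment?",
    retrieved_chunks=["Runbook 4.2: run the rollback job from the release pipeline."],
    memories=[
        "The user prefers concise answers.",
        "Last week's rollback attempt failed on a locked database.",
    ],
)
print(prompt)
```

Keeping the two sections apart makes it harder for a stored memory to be treated as an authoritative fact, or for a corpus chunk to be mistaken for a user preference.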
When to choose each
Choose Agent Memory Patterns if…
- You're building a long-running personal or coding assistant.
- Users expect continuity across sessions.
- You need per-user preferences and behavioural history.
- Your agent learns from feedback over time.
Choose Retrieval-Augmented Generation (RAG) if…
- You need to ground answers in a document corpus (manuals, tickets, papers).
- Your knowledge changes faster than you could retrain or fine-tune a model, and answers must come from an authoritative source.
- You want to cite sources in responses.
- Your task is primarily Q&A, not long-term collaboration.
Frequently asked questions
Isn't RAG a form of memory?
RAG is retrieval over a corpus: it supplies external knowledge, not agent state. Agent memory is usually about what the agent or user did, preferred, and decided. Implementations overlap (both often use vector stores), but the design intent differs.
Can I use the same vector store for both?
Technically yes, but keep namespaces or collections separate. Mixing user memories with corpus chunks is a fast path to privacy leaks and confused retrieval.
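A toy illustration of namespace separation, under stated assumptions: `SharedStore` is a hypothetical stand-in for a vector store, matching is done by word overlap instead of embeddings, and the `memory:<user id>` naming scheme is invented for this example. Real vector stores expose namespaces, collections, or metadata filters under their own names.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    namespace: str  # e.g. "corpus" or "memory:<user id>"
    text: str


@dataclass
class SharedStore:
    """Toy stand-in for a vector store with namespace-scoped search."""
    records: list[Record] = field(default_factory=list)

    def add(self, namespace: str, text: str) -> None:
        self.records.append(Record(namespace, text))

    def search(self, namespace: str, query: str) -> list[str]:
        # The namespace check comes first, so a query can never return
        # records from a namespace it did not ask for.
        words = set(query.lower().split())
        return [r.text for r in self.records
                if r.namespace == namespace and words & set(r.text.lower().split())]


store = SharedStore()
store.add("corpus", "Refunds are processed within five days")
store.add("memory:alice", "Alice asked about refunds for order 1234")
store.add("memory:bob", "Bob prefers answers in bullet points")

corpus_hits = store.search("corpus", "refunds policy")
alice_hits = store.search("memory:alice", "refunds policy")
bob_hits = store.search("memory:bob", "refunds policy")
```

Because every search names its namespace explicitly, a corpus query cannot surface Alice's memory, and Bob's session cannot see it either, even though all three records live in the same store.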
Which should VSET students learn first?
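Start with RAG, since grounding answers in documents is the foundation most projects need. Add memory patterns once your agent genuinely needs continuity across sessions; a lot of projects fake memory by just appending the last turn to the prompt, which only carries context within the current conversation.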
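The difference between the last-turn shortcut and persisted memory shows up as soon as a new session starts. This is a minimal sketch only: the JSON file, the helper names, and the `answer_style` key are all hypothetical, and a real memory subsystem would add summarisation, per-user scoping, and conflict handling.

```python
import json
import tempfile
from pathlib import Path


def prompt_with_last_turn(last_turn: str, message: str) -> str:
    """The shortcut: context is whatever the previous turn happened to contain."""
    return f"Previous turn: {last_turn}\nUser: {message}"


def load_memory(path: Path) -> dict:
    return json.loads(path.read_text()) if path.exists() else {}


def save_memory(path: Path, memory: dict) -> None:
    path.write_text(json.dumps(memory))


memory_file = Path(tempfile.mkdtemp()) / "user_memory.json"

# Session 1: the agent learns a preference and writes it down.
memory = load_memory(memory_file)
memory["answer_style"] = "concise"
save_memory(memory_file, memory)

# Session 2: a fresh session has no previous turn, but the file is still there.
restored = load_memory(memory_file)
shortcut = prompt_with_last_turn(last_turn="", message="Summarise the report")
```

In the second session the shortcut prompt carries no trace of the user's preference, while the persisted memory still does.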
Sources
- Letta / MemGPT — documentation — accessed 2026-04-20
- LlamaIndex — RAG overview — accessed 2026-04-20