Agent Memory
What is agent memory?
Agent memory in Letta is about managing what information is in the agent’s context window.
The context window is a scarce resource - you can’t fit everything into it. Effective memory management is about deciding what stays in context (immediately visible) and what moves to external storage (retrieved when needed).
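The trade-off above can be sketched in a few lines. This is a minimal illustration, not Letta's implementation: the `ContextManager` class, the word-count token estimate, and the oldest-first eviction rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


@dataclass
class ContextManager:
    budget: int  # hypothetical token budget for in-context messages
    in_context: list[str] = field(default_factory=list)
    external: list[str] = field(default_factory=list)

    def used(self) -> int:
        """Estimated tokens currently held in context."""
        return sum(estimate_tokens(m) for m in self.in_context)

    def add(self, message: str) -> None:
        """Append a message, then evict the oldest ones until the budget is met."""
        self.in_context.append(message)
        while self.used() > self.budget and len(self.in_context) > 1:
            # Evicted messages are not lost; they move to external storage.
            self.external.append(self.in_context.pop(0))


ctx = ContextManager(budget=8)
for msg in ["I live in Berlin", "I prefer short answers", "What is the weather today"]:
    ctx.add(msg)

print(ctx.in_context)  # what the model sees immediately
print(ctx.external)    # what must be retrieved when needed
```

A real system would use an actual tokenizer and a smarter policy (for example, summarizing before evicting), but the split between "visible now" and "retrievable later" is the same.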
Agent memory enables AI agents to maintain persistent state, learn from interactions, and develop long-term relationships with users. Unlike traditional chatbots that treat each conversation as isolated, agents with sophisticated memory systems can build understanding over time.
Types of Memory in Letta
Letta agents have access to multiple memory systems:
Core Memory (In-Context)
Memory blocks are structured sections of the agent’s context window that persist across all interactions. They are always visible - no retrieval needed.
Memory blocks are Letta’s core abstraction. You can create blocks with any descriptive label - the agent learns how to use them autonomously. This enables everything from simple user preferences to sophisticated multi-agent coordination.
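To make the idea concrete, here is a hedged sketch of a memory block as a plain data structure. The `MemoryBlock` class, its `limit` field, and the tag-style rendering in `compile_context` are hypothetical names and formats chosen for illustration; they are not taken from the Letta SDK.

```python
from dataclasses import dataclass


@dataclass
class MemoryBlock:
    label: str         # descriptive name, e.g. "human" or "persona"
    value: str         # the text the agent sees in its context window
    limit: int = 2000  # hypothetical character cap per block


def compile_context(blocks: list[MemoryBlock]) -> str:
    """Render every block into the text placed in the context window."""
    sections = []
    for block in blocks:
        if len(block.value) > block.limit:
            raise ValueError(f"block '{block.label}' exceeds its limit")
        sections.append(f"<{block.label}>\n{block.value}\n</{block.label}>")
    return "\n".join(sections)


blocks = [
    MemoryBlock(label="persona", value="I am a concise coding assistant."),
    MemoryBlock(label="human", value="Name: Sam. Prefers Python."),
]
print(compile_context(blocks))
```

Because every block is rendered on every turn, the agent never has to search for this information; the cost is that each block consumes part of the context budget.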
External Memory (Out-of-Context)
External memory provides storage, not bounded by the context window, for information that doesn’t always need to be visible. Agents retrieve from external memory on demand using search tools.
Letta provides several built-in external memory systems:
- Conversation search - Search past messages using full-text and semantic search
- Archival memory - Agent-managed semantically searchable database for facts and knowledge
- Letta Filesystem - File management system for documents and data
Agents can also access any external data source through MCP servers or custom tools - databases, APIs, vector stores, or third-party services.
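The retrieve-on-demand pattern can be sketched with a toy store. Everything here is an assumption for illustration: `ArchivalStore` is a hypothetical class, and ranking by word overlap is a deliberately simple stand-in for the embedding-based semantic search a real archival memory would use.

```python
import re
from dataclasses import dataclass, field

# Very small stopword list so common words do not dominate the ranking.
STOPWORDS = {"a", "in", "is", "on", "s", "the", "their", "what"}


def tokenize(text: str) -> set[str]:
    """Lowercase content words; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


@dataclass
class ArchivalStore:
    passages: list[str] = field(default_factory=list)

    def insert(self, text: str) -> None:
        """Store a passage outside the context window."""
        self.passages.append(text)

    def search(self, query: str, top_k: int = 2) -> list[str]:
        """Return up to top_k passages ranked by word overlap with the query."""
        q = tokenize(query)
        scored = [(len(q & tokenize(p)), p) for p in self.passages]
        matches = [s for s in scored if s[0] > 0]
        ranked = sorted(matches, key=lambda s: s[0], reverse=True)
        return [p for _, p in ranked[:top_k]]


store = ArchivalStore()
store.insert("The user's dog is named Biscuit.")
store.insert("The quarterly report is due on Friday.")
store.insert("The user adopted their dog in 2021.")

results = store.search("what is the dog named")
print(results)
```

Only the passages returned by `search` enter the context window, so the store can grow without affecting what the agent sees on each turn.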
How Agents Manage Their Memory
What makes Letta unique is that agents don’t just read from memory - they actively manage it. Unlike traditional RAG systems that passively retrieve information, Letta agents use built-in tools to decide what to remember, update, and search for.
When a user mentions they’ve switched from Python to TypeScript, the agent may choose to update the relevant memory block so that the change persists into future conversations.
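As a rough illustration of that scenario, the sketch below models the agent's decision as a tool call and applies it to an in-memory block. The argument names (`label`, `old_str`, `new_str`) and the `apply_tool_call` helper are assumptions made for this example, not a description of Letta's internals.

```python
# In-context memory: block label -> block text.
memory = {"human": "Name: Sam. Primary language: Python."}

# A tool call the agent might emit after the user's message.
# The argument names here are illustrative assumptions.
tool_call = {
    "name": "memory_replace",
    "arguments": {
        "label": "human",
        "old_str": "Primary language: Python",
        "new_str": "Primary language: TypeScript (switched from Python)",
    },
}


def apply_tool_call(memory: dict[str, str], call: dict) -> None:
    """Apply a memory_replace call to the named block in place."""
    if call["name"] != "memory_replace":
        raise ValueError(f"unsupported tool: {call['name']}")
    args = call["arguments"]
    block = memory[args["label"]]
    if args["old_str"] not in block:
        raise ValueError("old_str not found in block")
    memory[args["label"]] = block.replace(args["old_str"], args["new_str"], 1)


apply_tool_call(memory, tool_call)
print(memory["human"])
```

The key point is that the model, not the application code, chooses when to issue the call and what text to write.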
Agents have three primary tools for editing memory blocks:
- memory_replace - Search and replace for precise edits
- memory_insert - Insert a line into a block
- memory_rethink - Rewrite an entire block
These tools can be attached or detached based on your use case. Not all agents need all tools (for example, some agents may not need memory_rethink), and memory tools can be removed entirely from an agent if needed.
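The three editing operations, and the idea of detaching one, can be sketched as plain functions over a block's text. These are simplified stand-ins that reuse the tool names for readability; the signatures, the "must match exactly once" rule, and the dictionary used to model attached tools are assumptions, not Letta's actual behavior.

```python
def memory_replace(block: str, old: str, new: str) -> str:
    """Swap `old` for `new`; fail if `old` is absent or ambiguous."""
    if block.count(old) != 1:
        raise ValueError("old text must appear exactly once")
    return block.replace(old, new)


def memory_insert(block: str, line: str, index: int = -1) -> str:
    """Insert `line` at position `index` (-1 appends to the end)."""
    lines = block.split("\n") if block else []
    if index == -1:
        lines.append(line)
    else:
        lines.insert(index, line)
    return "\n".join(lines)


def memory_rethink(block: str, new_value: str) -> str:
    """Discard the old contents and return the rewritten block."""
    return new_value


# Attaching or detaching a tool is modelled as adding or removing a dict entry.
tools = {
    "memory_replace": memory_replace,
    "memory_insert": memory_insert,
    "memory_rethink": memory_rethink,
}
del tools["memory_rethink"]  # this agent may only make incremental edits

block = "Project: billing service\nStatus: planning"
block = tools["memory_replace"](block, "Status: planning", "Status: in review")
block = tools["memory_insert"](block, "Deadline: end of quarter")
print(block)
```

Detaching the rewrite tool, as done here, is one way to limit an agent to small, auditable edits.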
The agent decides what information is important enough to persist in its memory blocks, actively maintaining this information over time. This enables agents to build understanding through conversation rather than just retrieving relevant documents.
Memory Blocks vs RAG
Traditional RAG retrieves semantically similar chunks on demand. Letta’s memory blocks, by contrast, are persistent, structured context that agents actively maintain.
Use memory blocks for:
- Information that should always be visible (user preferences, agent persona)
- Knowledge that evolves over time (project status, learned preferences)
Use external memory (RAG-style) for:
- Large document collections
- Historical conversation logs
- Static reference material
Best practice: Use both together. Memory blocks hold the “executive summary” while external storage holds the full details.
Research Background
Letta is built by the creators of MemGPT, a research paper that introduced the concept of an “LLM Operating System” for memory management. The base agent design in Letta is a MemGPT-style agent, which inherits core principles of self-editing memory, memory hierarchy, and intelligent context window management.