Customize your agent memory

Letta agents have programmable in-context memory: a section of the context window is reserved for memory that can be changed with memory editing tools. Like a standard system prompt, this memory can define the agent's behavior and store personalization data. The key distinction is that its contents can be modified over time.

Memory

The in-context (i.e. core) memory of agents is represented by a Memory object. This memory object contains:

  • A set of Block objects, each representing a segment of memory with an associated character limit, label, and value
  • A set of memory editing tools, which allow the agent to edit its own memory

Agent Core Memory

The core memory of an agent is a set of blocks, each with an associated character limit, label, and value. The contents of core memory are always stored in the agent’s context window, so the information is always provided to the LLM during inference.

Archival memory and recall memory are not stored in-context unless explicitly retrieved.

Blocks

Blocks are the basic unit of core memory. A set of blocks makes up the core memory.

Each block has:

  • A limit, corresponding to the character limit of the block (i.e. how many characters in the context window can be used up by this block)
  • A value, corresponding to the data represented in the context window for this block
  • A label, corresponding to the type of data represented in the block (e.g. human, persona)
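The three fields above can be pictured with a small stand-in class. This is a minimal sketch, not Letta's own Block implementation: the class name, the `append` helper, and the choice to raise `ValueError` when an edit would exceed the limit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a core memory block (not the Letta class)."""

    label: str         # type of data held, e.g. "human" or "persona"
    value: str         # text that appears in the context window
    limit: int = 5000  # maximum number of characters allowed in value

    def __post_init__(self) -> None:
        self._check(self.value)

    def _check(self, text: str) -> None:
        # Assumed behavior: reject any value longer than the character limit.
        if len(text) > self.limit:
            raise ValueError(
                f"block '{self.label}' would hold {len(text)} characters, "
                f"limit is {self.limit}"
            )

    def append(self, text: str) -> None:
        """Add a new line to the block, refusing edits that exceed the limit."""
        candidate = f"{self.value}\n{text}" if self.value else text
        self._check(candidate)
        self.value = candidate


human = Block(label="human", value="The human's name is Bob the Builder.", limit=60)
human.append("Bob likes bridges.")
```

Because the limit is counted in characters of the rendered value, a small limit such as the 60 used here fills up after one or two short facts; the 5000-character limit used elsewhere on this page leaves far more room.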

Creating agents with memory

When you create an agent in Letta, you initialize the agent's core memory using the memory_blocks field. Each block has a value (the memory contents) and a label that tells the agent what the memory is about (e.g. “human” refers to memories about the human user).

# install letta_client with `pip install letta-client`
from letta_client import Letta

# create a client to connect to your local Letta Server
client = Letta(
    base_url="http://localhost:8283"
)

# create an agent with two basic self-editing memory blocks
agent_state = client.agents.create(
    memory_blocks=[
        {
            "label": "human",
            "value": "The human's name is Bob the Builder.",
            "limit": 5000
        },
        {
            "label": "persona",
            "value": "My name is Sam, the all-knowing sentient AI.",
            "limit": 5000
        }
    ],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small"
)

Shared Memory

You can create blocks independently of agents, which allows a single block to be attached to multiple agents. Every agent attached to the block sees the same contents in its context window, so the block acts as shared memory that stays synchronized across agents.

# create a persisted block, which can be attached to agents
block = client.blocks.create(
    label="organization",
    value="Organization: Letta",
    limit=4000,
)

# create an agent with both a shared block and its own blocks
shared_block_agent1 = client.agents.create(
    name="shared_block_agent1",
    memory_blocks=[
        {
            "label": "persona",
            "value": "I am agent 1"
        },
    ],
    block_ids=[block.id],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small"
)

# create another agent sharing the block
shared_block_agent2 = client.agents.create(
    name="shared_block_agent2",
    memory_blocks=[
        {
            "label": "persona",
            "value": "I am agent 2"
        },
    ],
    block_ids=[block.id],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small"
)
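The effect of attaching one block to two agents can be illustrated without a server. The following is an in-memory analogy only: the `Block` and `Agent` classes and the tag-style rendering in `context()` are hypothetical, and the real server persists blocks and attaches them by ID rather than by Python object reference.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical block: just a label and a value."""

    label: str
    value: str


@dataclass
class Agent:
    """Hypothetical agent that holds references to its attached blocks."""

    name: str
    blocks: list = field(default_factory=list)

    def context(self) -> str:
        # Assumed rendering: one tagged line per attached block.
        return "\n".join(f"<{b.label}>{b.value}</{b.label}>" for b in self.blocks)


# One shared block, plus a private persona block per agent.
organization = Block(label="organization", value="Organization: Letta")
agent1 = Agent("agent1", [Block("persona", "I am agent 1"), organization])
agent2 = Agent("agent2", [Block("persona", "I am agent 2"), organization])

# An edit made through agent 1's reference changes the single shared block,
# so agent 2 sees it the next time its context is rendered.
agent1.blocks[1].value = "Organization: Letta (updated)"
```

The persona blocks stay private to each agent, while the organization block has a single copy that both agents read from.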

Stateful Workflows (advanced)

In some advanced use cases, you may want your agent to have persistent memory without retaining conversation history. For example, if you are using a Letta agent as a “workflow” that is run many times across many different users, you may not want to keep the conversation or event history in the message buffer.

You can create a stateful agent that does not retain conversation (event) history (i.e. a “stateful workflow”) by setting the message_buffer_autoclear flag to true during agent creation. If set to true (default false), the message history will not be persisted in-context between requests (though the agent will still have access to core, archival, and recall memory).
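As a rough sketch, the creation arguments for such a workflow agent might be assembled as below. It assumes that message_buffer_autoclear is passed as a keyword argument to client.agents.create, alongside the fields used in the earlier examples; the helper name `stateful_workflow_kwargs` and the persona text are hypothetical.

```python
def stateful_workflow_kwargs(persona: str) -> dict:
    """Build keyword arguments for creating a stateful workflow agent.

    Hypothetical helper: it only assembles a dict and does not contact a server.
    """
    return {
        "memory_blocks": [{"label": "persona", "value": persona}],
        "model": "openai/gpt-4o-mini",
        "embedding": "openai/text-embedding-3-small",
        # Assumed keyword: clear the message buffer between requests.
        "message_buffer_autoclear": True,
    }


kwargs = stateful_workflow_kwargs("I summarize support tickets.")
# With a running Letta server and a client as created earlier on this page:
# agent_state = client.agents.create(**kwargs)
```

Leaving the flag out keeps the default of false, in which case the message history stays in context between requests.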
