Experimental

Conversations

Use conversations to run parallel sessions with an agent while sharing memory and searchable message history.

A conversation is a message thread within an agent. A single agent can have multiple conversations running in parallel—each with its own context window, but all sharing the same memory blocks and searchable message history.

This is useful when you want to run several sessions with the same agent simultaneously without them interfering with each other. For example, you might have one conversation where you’re refactoring an API while another is writing tests—both sessions share the agent’s learned context about your codebase.

Create a conversation by specifying the agent ID:

import Letta from "@letta-ai/letta-client";

const client = new Letta({ apiKey: process.env.LETTA_API_KEY });

const conversation = await client.conversations.create({
  agent_id: "agent-xxx",
});

console.log(`Created conversation: ${conversation.id}`);

Send messages to a conversation. The response is always a stream:

const stream = await client.conversations.messages.create(conversation.id, {
  messages: [{ role: "user", content: "Explain this codebase" }],
  stream_tokens: true,
});

for await (const chunk of stream) {
  if (chunk.message_type === "assistant_message") {
    process.stdout.write(chunk.content);
  }
}

List all conversations for an agent:

const conversations = await client.conversations.list({
  agent_id: "agent-xxx",
});

for (const conv of conversations) {
  console.log(conv.id);
}

Retrieve the message history for a specific conversation:

const messages = await client.conversations.messages.list(conversation.id);

for (const msg of messages) {
  console.log(`[${msg.message_type}] ${msg.content || msg.reasoning || ""}`);
}

All conversations within an agent share:

  • Memory blocks: The agent’s core memory (persona, human, project blocks, etc.) is shared across all conversations. When the agent updates a memory block in one conversation, that change is visible in all other conversations.

  • Searchable message history: Messages from all conversations are pooled together in a searchable database. The agent can use conversation_search to recall context from any past conversation, not just the current one.

Each conversation has its own:

  • Context window: The active messages being processed. Long conversations get compacted independently.

  • Message history: The sequence of messages in that specific thread.

This design lets you run parallel sessions that build on shared knowledge while keeping their immediate context separate.
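The shared-versus-separate split above can be illustrated with a small sketch. These are illustrative types, not SDK types: one agent holds the shared memory blocks, while each conversation keeps its own context window.

```typescript
// Illustrative data model (not the SDK's types): memory blocks are agent-level
// and shared; context windows are conversation-level and isolated.
interface AgentState {
  memoryBlocks: Map<string, string>;    // shared: persona, human, project, ...
  conversations: Map<string, string[]>; // per-conversation context windows
}

// An update made from any conversation lands in the shared blocks.
function updateBlock(agent: AgentState, label: string, value: string): void {
  agent.memoryBlocks.set(label, value);
}

// Each conversation reads only its own context window.
function contextOf(agent: AgentState, convId: string): string[] {
  return agent.conversations.get(convId) ?? [];
}
```

Updating a block "from" one conversation changes what every conversation sees, while the context windows stay independent.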

Concurrency: The agents.messages.create endpoint is not thread-safe—concurrent requests to the same agent can cause race conditions. If you need to send messages to an agent from multiple threads or processes simultaneously, use separate conversations. Each conversation has its own message stream that can be written to independently.
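The fan-out pattern looks like this. In this sketch, `send` is a stand-in for `client.conversations.messages.create(convId, { messages })` so the shape of the pattern is clear without a live client; the conversation IDs are hypothetical.

```typescript
// Stub standing in for client.conversations.messages.create(convId, { messages }).
// Real code would await and consume the response stream.
async function send(convId: string, content: string): Promise<string> {
  return `${convId}: handled "${content}"`;
}

// Each request targets a different conversation, so the concurrent sends
// never race on a single message stream.
async function fanOut(): Promise<string[]> {
  return Promise.all([
    send("conv-refactor", "Refactor the auth module"),
    send("conv-tests", "Write tests for the auth module"),
  ]);
}
```

Because each conversation owns its message stream, `Promise.all` over distinct conversations is safe, whereas the same pattern against a single agent endpoint is not.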

Separating context: When your application has clearly distinct interaction sessions (e.g., different user sessions, different tasks), conversations let you keep their context windows separate while still sharing the agent’s learned memory. This prevents unrelated messages from one session from polluting another’s context.
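One way to apply this is to route each application session to its own conversation. The map and helper below are illustrative application code, not part of the SDK; in practice the factory callback would call `client.conversations.create(...)`.

```typescript
// Illustrative session routing: one conversation per user session, created
// lazily on first use. The Map and helper are application code, not SDK API.
const sessionConversations = new Map<string, string>();

function conversationFor(sessionId: string, createConv: () => string): string {
  let convId = sessionConversations.get(sessionId);
  if (convId === undefined) {
    convId = createConv(); // in practice: await client.conversations.create(...)
    sessionConversations.set(sessionId, convId);
  }
  return convId;
}
```

Messages from different sessions then land in different context windows, while the agent's memory blocks still accumulate knowledge across all of them.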

Letta Code uses conversations under the hood. Each time you start letta, it creates a new conversation with your last-used agent. Use letta --resume to continue your exact last session, or /resume to browse and switch between past conversations.