RAG with Letta

Overview of Retrieval-Augmented Generation patterns with Letta agents.

If you have an existing retrieval-augmented generation (RAG) pipeline, you can connect it to your Letta agents. While Letta provides built-in features like archival memory, you can integrate your own RAG pipeline just as you would with any LLM API. This gives you full control over your data and retrieval methods.

What is RAG?

RAG improves LLM responses by retrieving relevant information from external data sources before generating an answer. Instead of relying solely on the model’s training data, a RAG system:

1. Takes a user query
2. Searches a vector database for relevant documents
3. Includes those documents in the LLM’s context
4. Generates an informed response based on the retrieved information
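
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Letta's implementation. The in-memory `DOCUMENTS` list stands in for a vector database, `embed` is a hypothetical bag-of-words embedding, and `generate` is a placeholder for a real LLM call. All names here are assumptions made for the example.

```python
import math
from collections import Counter

# Toy corpus standing in for documents stored in a vector database (sample data).
DOCUMENTS = [
    "Letta agents can store long-term facts in archival memory.",
    "ChromaDB is an open-source vector database.",
    "Cosine similarity measures the angle between two vectors.",
]

def embed(text: str) -> Counter:
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Step 2: rank documents by similarity to the query and keep the top k."""
    query_vec = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Step 3: place the retrieved documents in the model's context."""
    context = "\n".join(f"- {doc}" for doc in docs)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Step 4 placeholder: a real system would call an LLM here."""
    return f"[model response to a prompt of {len(prompt)} characters]"

def answer(query: str) -> str:
    """Steps 1 to 4: take a query, retrieve, build the context, generate."""
    return generate(build_prompt(query, retrieve(query)))
```

Calling `answer("What is ChromaDB?")` retrieves the ChromaDB sentence, places it in the prompt, and passes that prompt to the placeholder model. In a real pipeline, `embed` and `retrieve` would be replaced by an embedding model and a vector database query.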

Choosing your RAG approach

Letta supports two approaches for integrating RAG, depending on how much control you want over the retrieval process.

Aspect	Simple RAG	Agentic RAG
Who controls retrieval	Your application controls when retrieval happens and what the retrieval query is	The agent decides when to retrieve and what query to use
Context inclusion	You can always include retrieval results in the context	Retrieval happens only when the agent determines it’s needed
Latency	Lower – typically a single LLM call, as the agent doesn’t need tool calling	Higher – requires tool calls for retrieval
Client code	More complex code, as the client handles retrieval logic	Simpler code, as the client just sends the user query
Customization	You have full control via your retrieval function	You have full control via your custom tool definition

Both approaches work with any vector database. Our tutorials include examples for ChromaDB, MongoDB Atlas, and Qdrant.
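
The difference in control flow between the two approaches can be sketched as follows. This is a simplified illustration that does not call the Letta API. `search_documents` is a hypothetical retrieval function over made-up sample data, and `ToyAgent` is a stand-in whose hard-coded rule only mimics the decision that a real agent would make with its LLM.

```python
from typing import Callable, Dict, List, Tuple

# Made-up sample data standing in for a vector database.
KNOWLEDGE = {
    "refund": "Refunds are accepted within 30 days.",
    "shipping": "Orders ship within 2 business days.",
}

def search_documents(query: str) -> str:
    """Hypothetical retrieval function; swap in a real vector database query."""
    hits = [text for key, text in KNOWLEDGE.items() if key in query.lower()]
    return "\n".join(hits) or "No matching documents."

def simple_rag_message(user_query: str, retrieve: Callable[[str], str]) -> str:
    """Simple RAG: the client always retrieves, then injects the results
    into the message it sends to the agent."""
    context = retrieve(user_query)
    return f"Context:\n{context}\n\nUser question: {user_query}"

class ToyAgent:
    """Agentic RAG stand-in: the agent holds the search tool and decides
    whether to call it. A real agent makes this decision with its LLM;
    the question-mark rule below is only a placeholder."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools
        self.tool_calls: List[Tuple[str, str]] = []

    def handle(self, user_query: str) -> str:
        if user_query.strip().endswith("?"):  # placeholder decision rule
            self.tool_calls.append(("search_documents", user_query))
            result = self.tools["search_documents"](user_query)
            return f"(answer grounded in: {result})"
        return "(answer without retrieval)"

# Simple RAG: retrieval logic lives in the client.
message = simple_rag_message("What is the refund policy?", search_documents)

# Agentic RAG: the client sends only the raw query.
agent = ToyAgent({"search_documents": search_documents})
reply = agent.handle("What is the refund policy?")
```

In the simple case the retrieval call sits in client code and runs on every request. In the agentic case the client only forwards the user query, and retrieval happens inside the agent's tool call, which is where the extra latency in the table comes from.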

Next steps

Ready to integrate RAG with your Letta agents?

Simple RAG tutorial

Learn how to manage retrieval on the client side and inject context directly into your agent’s messages.

Agentic RAG tutorial

Learn how to empower your agent with custom search tools for autonomous retrieval.

Additional resources

Custom tools: Learn more about creating custom tools for your agents.
Memory management: Discover how Letta’s built-in memory works.
Agent Development Environment: Configure and test your agents in the web interface.