If you have an existing Retrieval-Augmented Generation (RAG) pipeline, you can connect it to your Letta agents. While Letta provides built-in features like archival memory, you can integrate your own RAG pipeline just as you would with any LLM API. This gives you full control over your data and retrieval methods.
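Retrieval-Augmented Generation (RAG) enhances LLM responses by retrieving relevant information from external data sources before generating an answer. Instead of relying only on the model’s training data, a RAG system retrieves documents relevant to the user’s query, adds them to the model’s context, and then generates a response grounded in that material.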
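The retrieve-then-generate flow can be sketched in a few lines. This is a minimal illustration, not a production retriever: the `DOCUMENTS` corpus, the `retrieve` and `build_prompt` helpers, and the token-overlap scoring are all assumptions made for this example, standing in for a real vector database and embedding model.

```python
import re
from collections import Counter

# Hypothetical in-memory corpus standing in for an external data source.
DOCUMENTS = [
    "Letta agents can store long-term facts in archival memory.",
    "Qdrant is a vector database that supports similarity search.",
    "ChromaDB can run in-process for local prototyping.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents ranked by tokens shared with the query.

    Token overlap is a toy substitute for embedding similarity.
    """
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = sum((query_tokens & tokenize(doc)).values())
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; ties keep their original corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, context_docs: list[str]) -> str:
    """Place the retrieved passages ahead of the question in one prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"


question = "Which vector database supports similarity search?"
hits = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, hits)
```

The resulting `prompt` string is what would be sent to the model. Swapping `retrieve` for a call to a real vector store leaves the rest of the flow unchanged.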
Letta supports two approaches for integrating RAG, depending on how much control you want over the retrieval process.
| Aspect | Simple RAG | Agentic RAG |
|---|---|---|
| Who Controls Retrieval | Your application controls when retrieval happens and what the retrieval query is. | The agent decides when to retrieve and what query to use. |
| Context Inclusion | You can always include retrieval results in the context. | Retrieval happens only when the agent determines it’s needed. |
| Latency | Lower: typically a single round trip, because the agent doesn’t need to make a tool call. | Higher: the agent must make tool calls to retrieve. |
| Client Code | More complex, as it handles retrieval logic. | Simpler, as it just sends the user query. |
| Customization | You have full control via your retrieval function. | You have full control via your custom tool definition. |
Both approaches work with any vector database. Our tutorials include examples for ChromaDB, MongoDB Atlas, and Qdrant.
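To make the contrast in the table concrete, here is a rough sketch of where the retrieval call lives in each approach. Everything in it is hypothetical: `toy_retriever` and its two-entry knowledge base stand in for a real vector database query, and `simple_rag_message` and `search_knowledge_base` are illustrative names, not part of any Letta API. Sending the message to an agent and registering the tool are deliberately left out.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]


def toy_retriever(query: str) -> list[str]:
    """Hypothetical stand-in for a vector database query."""
    knowledge = {
        "refund": "Refunds are processed within 5 business days.",
        "shipping": "Standard shipping takes 3 to 7 days.",
    }
    return [text for key, text in knowledge.items() if key in query.lower()]


# Simple RAG: the client runs retrieval on every turn and injects the
# results into the message before it reaches the agent.
def simple_rag_message(user_query: str, retriever: Retriever) -> str:
    passages = retriever(user_query)
    if not passages:
        return user_query
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nUser question: {user_query}"


# Agentic RAG: the client sends only the user query. Retrieval is exposed
# as a tool, and the agent chooses when to call it and with what query.
def search_knowledge_base(query: str) -> str:
    """Search the knowledge base and return matching passages.

    Args:
        query: The search query chosen by the agent.

    Returns:
        Matching passages joined by newlines, or a notice if none match.
    """
    passages = toy_retriever(query)
    return "\n".join(passages) if passages else "No matching passages found."


augmented = simple_rag_message("How long does a refund take?", toy_retriever)
tool_result = search_knowledge_base("shipping time")
```

In the first function the retrieval logic sits in client code, which is why the client is more complex but avoids a tool call. In the second, the same retriever is wrapped in a documented function that an agent could invoke on its own.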
Ready to integrate RAG with your Letta agents?
Simple RAG Tutorial
Learn how to manage retrieval on the client side and inject context directly into your agent’s messages.
Agentic RAG Tutorial
Learn how to empower your agent with custom search tools for autonomous retrieval.