Agentic RAG with Letta
In the Agentic RAG approach, we delegate the retrieval process to the agent itself. Instead of your application deciding what to search for, we provide the agent with a custom tool that allows it to query your vector database directly. This makes the agent more autonomous and your client-side code much simpler.
By the end of this tutorial, you’ll have a research assistant that autonomously decides when to search your vector database and what queries to use.
Prerequisites
To follow along, you need free accounts for:
- Letta - To access the agent development platform
- Hugging Face - For generating embeddings (MongoDB and Qdrant users only)
- One of the following vector databases:
- ChromaDB Cloud for a hosted vector database
- MongoDB Atlas for vector search with MongoDB
- Qdrant Cloud for a high-performance vector database
 
You will also need Python 3.8+ or Node.js v18+ and a code editor.
MongoDB and Qdrant users: This guide uses Hugging Face’s Inference API for generating embeddings. This approach keeps the tool code lightweight enough to run in Letta’s sandbox environment.
Getting Your API Keys
We’ll need API keys for Letta and your chosen vector database.
Get your Letta API Key
Get your Vector Database credentials
ChromaDB
MongoDB Atlas
Qdrant
Get your Hugging Face API Token (MongoDB & Qdrant users)
Create Access Token
Click the profile icon in the top right. Navigate to Settings > Access Tokens (or go directly to huggingface.co/settings/tokens).
The free tier includes 30,000 API requests per month, which is more than enough for development and testing.
Once you have these credentials, create a .env file in your project directory. Add the credentials for your chosen database:
ChromaDB
MongoDB Atlas
Qdrant
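For example, a ChromaDB Cloud setup might look like the sketch below. The variable names are only suggestions, so use whatever names your scripts read; MongoDB Atlas and Qdrant users would store their connection string or cluster URL and API key instead, plus the Hugging Face token.

```
# .env — example for the ChromaDB path (variable names are illustrative)
LETTA_API_KEY=your-letta-api-key
CHROMA_API_KEY=your-chroma-cloud-api-key
CHROMA_TENANT=your-chroma-tenant-id
CHROMA_DATABASE=your-chroma-database-name
```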
Step 1: Set Up the Vector Database
First, we need to populate your chosen vector database with the content of the research papers. We’ll use two papers for this demo: “Attention Is All You Need” and “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”.
Before we begin, let’s set up our development environment:
TypeScript users must update package.json to use ES modules:
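For example, the only change needed is adding the "type" field (the package name here is just a placeholder):

```json
{
  "name": "agentic-rag-letta",
  "type": "module"
}
```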
Download the research papers using curl with the -L flag to follow redirects:
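For example, using the papers' arXiv PDF links (the local filenames attention.pdf and bert.pdf are just the names reused in the rest of this guide's sketches):

```bash
curl -L -o attention.pdf https://arxiv.org/pdf/1706.03762
curl -L -o bert.pdf https://arxiv.org/pdf/1810.04805
```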
Verify the PDFs downloaded correctly:
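One quick way is the file utility:

```bash
file attention.pdf bert.pdf
```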
You should see output indicating these are PDF documents, not HTML files.
Install the necessary packages for your chosen database:
ChromaDB
MongoDB Atlas
Qdrant
For Python, install with:
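For the ChromaDB path, one workable set of packages is the sketch below; MongoDB Atlas or Qdrant users would install pymongo or qdrant-client plus requests instead of chromadb:

```bash
pip install letta-client chromadb pypdf python-dotenv
```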
Now create a setup.py or setup.ts file to load the PDFs, split them into chunks, and ingest them into your database:
ChromaDB
MongoDB Atlas
Qdrant
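Below is a minimal sketch of the ChromaDB version. It assumes the attention.pdf and bert.pdf files from earlier, a collection named research_papers, the CHROMA_* variables from your .env file, and chromadb's CloudClient for Chroma Cloud; Chroma's default embedding function embeds the chunks client-side, whereas the MongoDB Atlas and Qdrant versions would call the Hugging Face Inference API for embeddings:

```python
# setup.py — minimal sketch for the ChromaDB path (adapt for MongoDB Atlas or Qdrant)
import os

import chromadb
from dotenv import load_dotenv
from pypdf import PdfReader

load_dotenv()

CHUNK_SIZE = 1000      # characters per chunk
CHUNK_OVERLAP = 200    # characters shared between consecutive chunks
PDF_FILES = ["attention.pdf", "bert.pdf"]


def extract_text(path):
    """Concatenate the text of every page in a PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text):
    """Split text into overlapping fixed-size chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + CHUNK_SIZE])
        start += CHUNK_SIZE - CHUNK_OVERLAP
    return chunks


def main():
    # Connect to Chroma Cloud using the credentials from .env
    client = chromadb.CloudClient(
        api_key=os.environ["CHROMA_API_KEY"],
        tenant=os.environ["CHROMA_TENANT"],
        database=os.environ["CHROMA_DATABASE"],
    )
    collection = client.get_or_create_collection(name="research_papers")

    for pdf in PDF_FILES:
        chunks = chunk_text(extract_text(pdf))
        collection.add(
            ids=[f"{pdf}-{i}" for i in range(len(chunks))],
            documents=chunks,
            metadatas=[{"source": pdf} for _ in chunks],
        )
        print(f"Ingested {len(chunks)} chunks from {pdf}")


if __name__ == "__main__":
    main()
```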
Run the script from your terminal:
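For the Python version:

```bash
python setup.py
```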
If you are using MongoDB Atlas, you must manually create a vector search index by following the steps below.
Create the Vector Search Index (MongoDB Atlas Only)
MongoDB Atlas users: The setup script ingests your data, but MongoDB Atlas requires you to manually create a vector search index before queries will work. Follow these steps carefully.
Select Database and Collection
- Database: Select rag_demo (or whatever you set as MONGODB_DB_NAME)
- Collection: Select rag_collection
Your vector database is now populated with research paper content and ready to query.
Step 2: Create a Custom Search Tool
A Letta tool is a Python function that your agent can call. We’ll create a function that searches your vector database and returns the results. Letta handles the complexities of exposing this function to the agent securely.
TypeScript users: Letta tools execute in Python, even when called from TypeScript. Create a tools.ts file that exports the Python code as a string constant, which you’ll use in Step 3 to create the tool.
Create a new file named tools.py (Python) or tools.ts (TypeScript) with the appropriate implementation for your database:
ChromaDB
MongoDB Atlas
Qdrant
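Below is a minimal sketch of the ChromaDB version. The collection name and environment variable names follow the earlier steps and are assumptions; the imports sit inside the function so the tool stays self-contained when Letta runs it in its sandbox, and Letta uses the docstring to describe the tool to the agent:

```python
# tools.py — custom Letta tool, ChromaDB version (the MongoDB Atlas and Qdrant
# versions query their own clients and embed the query via the Hugging Face
# Inference API instead)
def search_research_papers(query: str) -> str:
    """Search the research paper collection for passages relevant to a query.

    Args:
        query (str): A natural-language search query.

    Returns:
        str: The most relevant passages, joined into a single string.
    """
    # Imports live inside the function so the tool is self-contained
    # when it executes in Letta's sandbox.
    import os
    import chromadb

    client = chromadb.CloudClient(
        api_key=os.environ["CHROMA_API_KEY"],
        tenant=os.environ["CHROMA_TENANT"],
        database=os.environ["CHROMA_DATABASE"],
    )
    collection = client.get_collection(name="research_papers")

    # Retrieve the top matches and return them as one string.
    results = collection.query(query_texts=[query], n_results=5)
    documents = (results.get("documents") or [[]])[0]
    return "\n\n---\n\n".join(documents)
```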
This function takes a query, connects to your database, retrieves the most relevant documents, and returns them as a single string.
Step 3: Configure an Agentic Research Assistant
Next, we’ll create a new agent. This agent will have a specific persona that instructs it on how to behave and, most importantly, it will be equipped with our new search tool.
Create a file named create_agentic_agent.py (Python) or create_agentic_agent.ts (TypeScript):
TypeScript users: Notice how the TypeScript version imports searchResearchPapersToolCode from tools.ts (the file you created in Step 2). This keeps the code organized, just like the Python version imports from tools.py.
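Below is a minimal sketch of the Python version, which imports the function from the tools.py file you created in Step 2. It assumes the letta-client SDK's tools.upsert_from_function and agents.create calls, and the model and embedding handles are just examples, so substitute the ones available in your Letta project (exact parameter names can vary between SDK versions):

```python
# create_agentic_agent.py — minimal sketch using the letta-client SDK
import os

from dotenv import load_dotenv
from letta_client import Letta

from tools import search_research_papers

load_dotenv()

client = Letta(token=os.environ["LETTA_API_KEY"])

# Register (or update) the custom tool from its source code.
tool = client.tools.upsert_from_function(func=search_research_papers)

# Create the agent with a persona that tells it to search before answering.
agent = client.agents.create(
    name="Agentic RAG Assistant",
    memory_blocks=[
        {
            "label": "persona",
            "value": (
                "I am a research assistant. When the user asks about machine "
                "learning papers, I first call the search_research_papers tool "
                "and base my answer only on the passages it returns."
            ),
        },
        {"label": "human", "value": "A developer exploring research papers."},
    ],
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",
    tool_ids=[tool.id],
)

print(f"Created agent with ID: {agent.id}")
```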
Run this script once to create the agent in your Letta project:
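For the Python version (note the agent ID it prints; you'll need it in Step 4):

```bash
python create_agentic_agent.py
```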
Configure Tool Dependencies and Environment Variables
For the tool to work within Letta’s environment, we need to configure its dependencies and environment variables through the Letta dashboard.
Find your agent
Navigate to your Letta dashboard and find the “Agentic RAG Assistant” agent you just created.
Configure Dependencies
In the ADE (Agent Development Environment), select Tools from the sidebar, find and click the search_research_papers tool, then open the Dependencies tab.
Add the following dependencies based on your database:
ChromaDB
MongoDB Atlas
Qdrant

Configure Environment Variables
In the same tool configuration, navigate to Simulator > Environment.
Add the following environment variables with their corresponding values from your .env file:
ChromaDB
MongoDB Atlas
Qdrant
Make sure to click the upload button next to each environment variable so the agent is updated with the value.

Now, when the agent calls this tool, Letta’s execution environment will install the required dependencies and have the credentials it needs to connect to your database.
Step 4: Let the Agent Lead the Conversation
With the agentic setup, our client-side code becomes incredibly simple. We no longer need to worry about retrieving context; we just send the user’s raw question to the agent and let it handle the rest.
Create the agentic_rag.py or agentic_rag.ts script:
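Below is a minimal sketch of the Python version, assuming the letta-client SDK's agents.messages.create call and the message fields it currently returns; the question itself is just an example:

```python
# agentic_rag.py — send the raw question and let the agent handle retrieval
import os

from dotenv import load_dotenv
from letta_client import Letta

load_dotenv()

AGENT_ID = "your-agentic-agent-id"  # replace with the ID printed in Step 3

client = Letta(token=os.environ["LETTA_API_KEY"])

# Send the user's question as-is; the agent decides whether and how to search.
response = client.agents.messages.create(
    agent_id=AGENT_ID,
    messages=[
        {
            "role": "user",
            "content": "How does multi-head attention differ from single-head attention?",
        }
    ],
)

# Print everything the agent did: reasoning, tool calls, and the final answer.
for message in response.messages:
    print(f"[{message.message_type}]")
    if hasattr(message, "content") and message.content:
        print(message.content)
    print()
```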
Replace your-agentic-agent-id with the ID of the new agent you just created.
When you run this script, the agent receives the question, understands from its persona that it needs to search for information, calls the search_research_papers tool, gets the context, and then formulates an answer. All the RAG logic is handled by the agent, not your application.
Next Steps
Now that you’ve integrated Agentic RAG with Letta, you can expand on this foundation.



