Agentic RAG with Letta

In the Agentic RAG approach, we delegate the retrieval process to the agent itself. Instead of your application deciding what to search for, we provide the agent with a custom tool that allows it to query your vector database directly. This makes the agent more autonomous and your client-side code much simpler.

By the end of this tutorial, you’ll have a research assistant that autonomously decides when to search your vector database and what queries to use.

To follow along, you need free accounts for:

  • Letta - To access the agent development platform
  • Hugging Face - For generating embeddings (MongoDB and Qdrant users only)
  • One of the following vector databases: ChromaDB Cloud, MongoDB Atlas, or Qdrant

You will also need Python 3.8+ or Node.js v18+ and a code editor.

We’ll need API keys for Letta and your chosen vector database.

Get your Letta API Key
  1. Create a Letta Account

    If you don’t have one, sign up for a free account at letta.com.

  2. Navigate to API Keys

    Once logged in, click on API keys in the sidebar.

  3. Create and Copy Your Key

    Click + Create API key, give it a descriptive name, and click Confirm. Copy the key and save it somewhere safe.

Get your Vector Database credentials
  1. Create a ChromaDB Cloud Account

    Sign up for a free account on the ChromaDB Cloud website.

  2. Create a New Database

    From your dashboard, create a new database.

  3. Get Your API Key and Host

    In your project settings, you’ll find your API Key, Tenant, Database, and Host URL. We’ll need all of these for our scripts.

Get your Hugging Face API Token (MongoDB & Qdrant users)
  1. Create a Hugging Face Account

    Sign up for a free account at huggingface.co.

  2. Create Access Token

    Click the profile icon in the top right. Navigate to Settings > Access Tokens (or go directly to huggingface.co/settings/tokens).

  3. Generate New Token

    Click New token, give it a name (e.g., “Letta RAG Demo”), select Read role, and click Create token. Copy the token and save it securely.

Once you have these credentials, create a .env file in your project directory. Add the credentials for your chosen database:

Terminal window
LETTA_API_KEY="..."
CHROMA_API_KEY="..."
CHROMA_TENANT="..."
CHROMA_DATABASE="..."
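Before running any of the scripts, it can help to confirm that the values in .env are actually visible to your process. The sketch below is one way to do that. The helper name `missing_env_vars` is hypothetical (it is not part of Letta or ChromaDB), and the sketch assumes the variables have already been loaded, for example with `load_dotenv()`.

```python
import os

# Variable names taken from the .env file above.
REQUIRED_VARS = ["LETTA_API_KEY", "CHROMA_API_KEY", "CHROMA_TENANT", "CHROMA_DATABASE"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Catching an empty or misspelled variable here is usually easier than debugging an authentication error from the database client later.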

First, we need to populate your chosen vector database with the content of the research papers. We’ll use two papers for this demo: “Attention Is All You Need” and “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”.

Before we begin, let’s set up our development environment:

Python
# Create a Python virtual environment to keep dependencies isolated
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
TypeScript
# Create a new Node.js project
npm init -y
# Create tsconfig.json for TypeScript configuration
cat > tsconfig.json << 'EOF'
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "skipLibCheck": true,
    "strict": true
  }
}
EOF

TypeScript users must also add the following field to package.json to use ES modules:

"type": "module"

Download the research papers using curl with the -L flag to follow redirects:

curl -L -o 1706.03762.pdf https://arxiv.org/pdf/1706.03762.pdf
curl -L -o 1810.04805.pdf https://arxiv.org/pdf/1810.04805.pdf

Verify the PDFs downloaded correctly:

file 1706.03762.pdf 1810.04805.pdf

You should see output indicating these are PDF documents, not HTML files.
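If the `file` command is not available (for example on Windows), a rough equivalent is to check that each file starts with the PDF signature `%PDF-`. This is a minimal sketch; the helper name `looks_like_pdf` is hypothetical, and it only inspects the first five bytes rather than validating the whole document.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


if __name__ == "__main__":
    for name in ["1706.03762.pdf", "1810.04805.pdf"]:
        if not Path(name).exists():
            print(f"{name}: not found")
        elif looks_like_pdf(name):
            print(f"{name}: looks like a PDF")
        else:
            print(f"{name}: does not start with %PDF- (possibly an HTML error page)")
```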

Install the necessary packages for your chosen database:

Python
# requirements.txt
letta-client
chromadb
pypdf
python-dotenv
TypeScript
npm install @letta-ai/letta-client chromadb @chroma-core/default-embed pdf-ts dotenv
npm install --save-dev typescript @types/node tsx

For Python, install with:

Terminal window
pip install -r requirements.txt

Now create a setup.py or setup.ts file to load the PDFs, split them into page-level chunks, and ingest them into your database:

Python
import os
import chromadb
import pypdf
from dotenv import load_dotenv
load_dotenv()
def main():
    # Connect to ChromaDB Cloud
    client = chromadb.CloudClient(
        tenant=os.getenv("CHROMA_TENANT"),
        database=os.getenv("CHROMA_DATABASE"),
        api_key=os.getenv("CHROMA_API_KEY")
    )
    # Create or get the collection
    collection = client.get_or_create_collection("rag_collection")
    # Ingest PDFs, storing one document per page
    pdf_files = ["1706.03762.pdf", "1810.04805.pdf"]
    for pdf_file in pdf_files:
        print(f"Ingesting {pdf_file}...")
        reader = pypdf.PdfReader(pdf_file)
        for i, page in enumerate(reader.pages):
            collection.add(
                ids=[f"{pdf_file}-{i}"],
                documents=[page.extract_text()]
            )
    print("\nIngestion complete!")
    print(f"Total documents in collection: {collection.count()}")
if __name__ == "__main__":
    main()
TypeScript
import { CloudClient } from 'chromadb';
import { DefaultEmbeddingFunction } from '@chroma-core/default-embed';
import * as dotenv from 'dotenv';
import * as path from 'path';
import * as fs from 'fs';
import { pdfToPages } from 'pdf-ts';
dotenv.config();
async function main() {
  // Connect to ChromaDB Cloud
  const client = new CloudClient({
    apiKey: process.env.CHROMA_API_KEY || '',
    tenant: process.env.CHROMA_TENANT || '',
    database: process.env.CHROMA_DATABASE || ''
  });
  // Create embedding function
  const embedder = new DefaultEmbeddingFunction();
  // Create or get the collection
  const collection = await client.getOrCreateCollection({
    name: 'rag_collection',
    embeddingFunction: embedder
  });
  // Ingest PDFs, storing one document per non-empty page
  const pdfFiles = ['1706.03762.pdf', '1810.04805.pdf'];
  for (const pdfFile of pdfFiles) {
    console.log(`Ingesting ${pdfFile}...`);
    const pdfPath = path.join(__dirname, pdfFile);
    const dataBuffer = fs.readFileSync(pdfPath);
    const pages = await pdfToPages(dataBuffer);
    for (let i = 0; i < pages.length; i++) {
      const text = pages[i].text.trim();
      if (text) {
        await collection.add({
          ids: [`${pdfFile}-${i}`],
          documents: [text]
        });
      }
    }
  }
  console.log('\nIngestion complete!');
  const count = await collection.count();
  console.log(`Total documents in collection: ${count}`);
}
main().catch(console.error);

Run the script from your terminal:

Python
python setup.py
TypeScript
npx tsx setup.ts
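The setup script stores one document per PDF page. If whole pages turn out to be too coarse for retrieval, one common alternative is to split each page into smaller, overlapping character windows before adding them. The following is a minimal sketch of that idea, not part of the tutorial’s scripts: the function name `chunk_text` and the default sizes (1000 characters with 200 of overlap) are illustrative assumptions, not tuned values.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip whitespace-only windows
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

In setup.py, you could call `chunk_text(page.extract_text())` for each page and give every chunk its own ID, for example by appending the chunk index to the existing `{pdf_file}-{i}` pattern.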

If you are using MongoDB Atlas, you must manually create a vector search index by following the steps below.

Create the Vector Search Index (MongoDB Atlas Only)
  1. Navigate to Atlas Search

    Log in to your MongoDB Atlas dashboard, navigate to your cluster, and click on the “Atlas Search” tab.

  2. Create Search Index

    Click “Create Search Index”, then choose “JSON Editor” (not “Visual Editor”).

  3. Select Database and Collection

    • Database: Select rag_demo (or whatever you set as MONGODB_DB_NAME)
    • Collection: Select rag_collection
  4. Name and Configure Index

    • Index Name: Enter vector_index (this exact name is required by the code)
    • Paste this JSON definition:
    {
      "fields": [
        {
          "type": "vector",
          "path": "embedding",
          "numDimensions": 384,
          "similarity": "cosine"
        }
      ]
    }

    Note: 384 dimensions is for Hugging Face’s BAAI/bge-small-en-v1.5 model.

  5. Create and Wait

    Click “Create Search Index”. The index will take a few minutes to build. Wait until the status shows as “Active” before proceeding.
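Because the index is fixed at 384 dimensions, vectors of any other length will not match it. A small guard like the one below can catch a mismatched embedding model before documents are written. This is a sketch only; the helper name `check_embedding` is hypothetical, and it is meant to be called on each embedding vector before insertion.

```python
EXPECTED_DIMENSIONS = 384  # must match "numDimensions" in the index definition


def check_embedding(vector, expected=EXPECTED_DIMENSIONS):
    """Return the vector unchanged, or raise ValueError if its length is wrong."""
    if len(vector) != expected:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, expected {expected}"
        )
    return vector


if __name__ == "__main__":
    check_embedding([0.0] * 384)
    print("Embedding length matches the index definition.")
```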

Your vector database is now populated with research paper content and ready to query.

A Letta tool is a Python function that your agent can call. We’ll create a function that searches your vector database and returns the results. Letta handles the complexities of exposing this function to the agent securely.

Create a new file named tools.py (Python) or tools.ts (TypeScript) with the appropriate implementation for your database:

Python
def search_research_papers(query_text: str, n_results: int = 1) -> str:
    """
    Searches the research paper collection for a given query.
    Args:
        query_text (str): The text to search for.
        n_results (int): The number of results to return.
    Returns:
        str: The most relevant document found.
    """
    import chromadb
    import os
    # ChromaDB Cloud Client
    # This tool code is executed on the Letta server. It expects the ChromaDB
    # credentials to be passed as environment variables.
    api_key = os.getenv("CHROMA_API_KEY")
    tenant = os.getenv("CHROMA_TENANT")
    database = os.getenv("CHROMA_DATABASE")
    if not all([api_key, tenant, database]):
        raise ValueError("CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE must be set as environment variables.")
    client = chromadb.CloudClient(
        tenant=tenant,
        database=database,
        api_key=api_key
    )
    collection = client.get_or_create_collection("rag_collection")
    try:
        results = collection.query(
            query_texts=[query_text],
            n_results=n_results
        )
        document = results['documents'][0][0]
        return document
    except Exception as e:
        return f"Tool failed with error: {e}"
TypeScript
/**
* This file contains the Python tool code as a string.
* Letta tools execute in Python, so we define the Python source code here.
*/
export const searchResearchPapersToolCode = `def search_research_papers(query_text: str, n_results: int = 1) -> str:
    """
    Searches the research paper collection for a given query.
    Args:
        query_text (str): The text to search for.
        n_results (int): The number of results to return.
    Returns:
        str: The most relevant document found.
    """
    import chromadb
    import os
    # ChromaDB Cloud Client
    # This tool code is executed on the Letta server. It expects the ChromaDB
    # credentials to be passed as environment variables.
    api_key = os.getenv("CHROMA_API_KEY")
    tenant = os.getenv("CHROMA_TENANT")
    database = os.getenv("CHROMA_DATABASE")
    if not all([api_key, tenant, database]):
        raise ValueError("CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE must be set as environment variables.")
    client = chromadb.CloudClient(
        tenant=tenant,
        database=database,
        api_key=api_key
    )
    collection = client.get_or_create_collection("rag_collection")
    try:
        results = collection.query(
            query_texts=[query_text],
            n_results=n_results
        )
        document = results['documents'][0][0]
        return document
    except Exception as e:
        return f"Tool failed with error: {e}"
`;

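This function takes a query, connects to your database, and returns the text of the most relevant document as a single string. If the query itself fails, it returns an error message instead of raising, so the agent can see what went wrong.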
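As written, the tool returns only the first hit (`results['documents'][0][0]`), even when `n_results` is greater than 1. If you want the agent to see several passages at once, one option is to join all documents returned for the query. The sketch below assumes the ChromaDB query result is a dictionary whose `documents` entry holds one list of strings per query text; the helper name `format_results` and the separator are illustrative choices.

```python
def format_results(results, separator="\n\n---\n\n"):
    """Join every document returned for the first query into one string."""
    # "documents" holds one inner list per query text; we only sent one query.
    documents = (results.get("documents") or [[]])[0]
    if not documents:
        return "No matching documents found."
    return separator.join(documents)


if __name__ == "__main__":
    # A stand-in for a query result, so the sketch runs without a database.
    fake_results = {"documents": [["First passage.", "Second passage."]]}
    print(format_results(fake_results))
```

Inside the tool, the logic would have to be inlined or defined within the function body, because the tool source is shipped to Letta as a single function.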

Step 3: Configure an Agentic Research Assistant

Next, we’ll create a new agent. This agent will have a specific persona that instructs it on how to behave and, most importantly, it will be equipped with our new search tool.

Create a file named create_agentic_agent.py (Python) or create_agentic_agent.ts (TypeScript):

Python
import os
from letta_client import Letta
from dotenv import load_dotenv
from tools import search_research_papers
load_dotenv()
# Initialize the Letta client
client = Letta(token=os.getenv("LETTA_API_KEY"))
# Create a tool from our Python function
search_tool = client.tools.create_from_function(func=search_research_papers)
# Define the agent's persona
persona = """You are a world-class research assistant. Your goal is to answer questions accurately by searching through a database of research papers. When a user asks a question, first use the `search_research_papers` tool to find relevant information. Then, answer the user's question based on the information returned by the tool."""
# Create the agent with the tool attached
agent = client.agents.create(
    name="Agentic RAG Assistant",
    description="A smart agent that can search a vector database to answer questions.",
    memory_blocks=[
        {
            "label": "persona",
            "value": persona
        }
    ],
    tools=[search_tool.name]
)
print(f"Agent '{agent.name}' created with ID: {agent.id}")
TypeScript
import { LettaClient } from '@letta-ai/letta-client';
import * as dotenv from 'dotenv';
import { searchResearchPapersToolCode } from './tools.js';
dotenv.config();
async function main() {
  // Initialize the Letta client
  const client = new LettaClient({
    token: process.env.LETTA_API_KEY || ''
  });
  // Create the tool from the Python code imported from tools.ts
  const searchTool = await client.tools.create({
    sourceCode: searchResearchPapersToolCode,
    sourceType: 'python'
  });
  console.log(`Tool '${searchTool.name}' created with ID: ${searchTool.id}`);
  // Define the agent's persona
  const persona = `You are a world-class research assistant. Your goal is to answer questions accurately by searching through a database of research papers. When a user asks a question, first use the \`search_research_papers\` tool to find relevant information. Then, answer the user's question based on the information returned by the tool.`;
  // Create the agent with the tool attached
  const agent = await client.agents.create({
    name: 'Agentic RAG Assistant',
    description: 'A smart agent that can search a vector database to answer questions.',
    memoryBlocks: [
      {
        label: 'persona',
        value: persona
      }
    ],
    toolIds: [searchTool.id]
  });
  console.log(`Agent '${agent.name}' created with ID: ${agent.id}`);
}
main().catch(console.error);

Run this script once to create the agent in your Letta project, and note the agent ID it prints, since the script in Step 4 needs it:

Python
python create_agentic_agent.py
TypeScript
npx tsx create_agentic_agent.ts

Configure Tool Dependencies and Environment Variables

For the tool to work within Letta’s environment, we need to configure its dependencies and environment variables through the Letta dashboard.

  1. Find your agent

    Navigate to your Letta dashboard and find the “Agentic RAG Assistant” agent you just created.

  2. Access the ADE

    Click on your agent to open the Agent Development Environment (ADE).

  3. Configure Dependencies

    In the ADE, select Tools from the sidebar, find and click on the search_research_papers tool, then click on the Dependencies tab.

    Add the following dependencies based on your database:

    chromadb

  4. Configure Environment Variables

    In the same tool configuration, navigate to Simulator > Environment.

    Add the following environment variables with their corresponding values from your .env file:

    CHROMA_API_KEY
    CHROMA_TENANT
    CHROMA_DATABASE

    Make sure to click the upload button next to each environment variable so that the agent is updated with it.

Now, when the agent calls this tool, Letta’s execution environment will know to install the necessary dependencies and will have access to the necessary credentials to connect to your database.

Step 4: Let the Agent Lead the Conversation

With the agentic setup, our client-side code becomes much simpler. We no longer need to retrieve context ourselves; we just send the user’s raw question to the agent and let it handle the rest.

Create the agentic_rag.py or agentic_rag.ts script:

Python
import os
from letta_client import Letta
from dotenv import load_dotenv
load_dotenv()
# Initialize client
letta_client = Letta(token=os.getenv("LETTA_API_KEY"))
AGENT_ID = "your-agentic-agent-id"  # Replace with your new agent ID
def main():
    while True:
        user_query = input("\nAsk a question about the research papers (or type 'exit' to quit): ")
        if user_query.lower() in ['exit', 'quit']:
            break
        response = letta_client.agents.messages.create(
            agent_id=AGENT_ID,
            messages=[{"role": "user", "content": user_query}]
        )
        # Print only the agent's final answers, skipping tool calls and reasoning
        for message in response.messages:
            if message.message_type == 'assistant_message':
                print(f"\nAgent: {message.content}")
if __name__ == "__main__":
    main()
TypeScript
import { LettaClient } from '@letta-ai/letta-client';
import * as dotenv from 'dotenv';
import * as readline from 'readline';
dotenv.config();
const AGENT_ID = 'your-agentic-agent-id'; // Replace with your new agent ID
async function main() {
  // Initialize client
  const lettaClient = new LettaClient({
    token: process.env.LETTA_API_KEY || ''
  });
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
  });
  const askQuestion = (query: string): Promise<string> => {
    return new Promise((resolve) => {
      rl.question(query, resolve);
    });
  };
  while (true) {
    const userQuery = await askQuestion('\nAsk a question about the research papers (or type "exit" to quit): ');
    if (userQuery.toLowerCase() === 'exit' || userQuery.toLowerCase() === 'quit') {
      rl.close();
      break;
    }
    const response = await lettaClient.agents.messages.create(AGENT_ID, {
      messages: [{ role: 'user', content: userQuery }]
    });
    for (const message of response.messages) {
      if (message.messageType === 'assistant_message') {
        console.log(`\nAgent: ${(message as any).content}`);
      }
    }
  }
}
main().catch(console.error);

When you run this script, the agent receives the question, understands from its persona that it needs to search for information, calls the search_research_papers tool, gets the context, and then formulates an answer. All the RAG logic is handled by the agent, not your application.
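To see this flow for yourself, it can be useful to print every message in the response rather than only the final answer. The sketch below formats each message by type. The `assistant_message` type and its `content` field come from the script above; the `tool_call_message` and `tool_return_message` types and the `tool_call.name`, `tool_call.arguments`, and `tool_return` fields are assumptions about the SDK’s message objects that you should verify against the Letta client version you have installed. The helper name `describe_message` is hypothetical.

```python
from types import SimpleNamespace


def describe_message(message):
    """Return a one-line description of a response message, based on its type."""
    kind = getattr(message, "message_type", "unknown")
    if kind == "tool_call_message":
        # Assumed shape: message.tool_call has .name and .arguments
        call = getattr(message, "tool_call", None)
        name = getattr(call, "name", "?")
        arguments = getattr(call, "arguments", "")
        return f"[tool call] {name}({arguments})"
    if kind == "tool_return_message":
        # Assumed shape: message.tool_return holds the tool's string output
        text = str(getattr(message, "tool_return", ""))
        return f"[tool return] {text[:80]}"
    if kind == "assistant_message":
        return f"[assistant] {getattr(message, 'content', '')}"
    return f"[{kind}]"


if __name__ == "__main__":
    # Stand-in objects shaped like response messages, so the sketch runs offline.
    demo = [
        SimpleNamespace(
            message_type="tool_call_message",
            tool_call=SimpleNamespace(
                name="search_research_papers",
                arguments='{"query_text": "self-attention"}',
            ),
        ),
        SimpleNamespace(message_type="tool_return_message", tool_return="...page text..."),
        SimpleNamespace(message_type="assistant_message", content="Here is what I found."),
    ]
    for item in demo:
        print(describe_message(item))
```

In agentic_rag.py, calling `describe_message` on each entry of `response.messages` would show when the agent chose to search and which query it used.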

Now that you’ve integrated Agentic RAG with Letta, you can expand on this foundation:

Simple RAG

Learn how to manage retrieval on the client-side for complete control.

Custom Tools

Explore creating more advanced custom tools for your agents.