Data Sources & Files

Connecting agents to external data and files

Data sources let you connect your agents to external files such as research papers, reports, medical records, or any other data in common text formats (.pdf, .txt, .md, .json, etc.). A data source can contain many files, which you can upload via the ADE or the API.

Once a file has been uploaded to a data source, the agent can access it using a set of file tools. The file is automatically chunked and embedded to allow the agent to use semantic search to find relevant information in the file (in addition to standard text-based search).
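To make the chunk-and-embed step concrete, here is a minimal, self-contained sketch. It is not Letta's implementation: the sentence-based chunker, the bag-of-words stand-in for an embedding model, and the names chunk_text, embed, cosine, and semantic_search are all assumptions made for illustration. A real deployment would use a learned embedding model.

```python
import math
import re
from collections import Counter


def chunk_text(text: str) -> list[str]:
    """Split text into sentence-sized chunks (a deliberately simple chunker)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks ranked by similarity to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


document = (
    "The patient reported mild headaches in March. "
    "Blood pressure readings were normal throughout the visit. "
    "A follow-up appointment was scheduled for June."
)
chunks = chunk_text(document)
best = semantic_search("blood pressure readings", chunks)
```

The same shape (chunk, embed each chunk, rank by similarity to the embedded query) applies whatever embedding model is plugged in; only the embed function changes.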

If you’ve used Claude Projects before, you can think of a data source in Letta as a “project”, except in Letta you can connect a single agent to multiple projects (in Claude Projects, a chat session can only be associated with a single project).

File tools

When a data source is attached to an agent, Letta automatically attaches a set of file tools to the agent:

  • open_file: Open a file to a specific location
  • grep_file: Search a file using a regular expression
  • search_file: Search a file using semantic (embedding-based) search

To detach these tools from your agent, detach all of its data sources. The file tools will then be removed automatically.
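As a rough mental model of regular-expression search over a file, the sketch below emulates a grep-style lookup on an in-memory string. The function name mirrors the grep_file tool, but the signature and the (line number, line) return format are assumptions for illustration, not Letta's actual tool schema.

```python
import re


def grep_file(content: str, pattern: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines matching the regex."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(content.splitlines(), start=1)
        if regex.search(line)
    ]


# Hypothetical file contents used only for this demo.
report = "Q1 revenue: 10M\nQ2 revenue: 12M\nHeadcount: 40\nQ3 revenue: 15M"

matches = grep_file(report, r"^Q\d revenue")
```

Regex search finds exact textual patterns, which complements semantic search when the agent already knows the wording it is looking for.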

Creating a data source

ADE

To create a data source, click the “data sources” tab in the bottom-left of the ADE, then click the “create data source” button. A data source created inside the ADE is automatically attached to your agent.

API / SDK

To create a data source, you will need to specify a unique name as well as an EmbeddingConfig:

# get an available embedding_config
# (assumes client is an already-initialized Letta client)
embedding_configs = client.models.list_embedding_models()
embedding_config = embedding_configs[0]

# create the source
source = client.sources.create(
    name="my_source",
    embedding_config=embedding_config
)

Now that you’ve created the source, you can start loading data into it.

Uploading a file into a data source

ADE

Click the “data sources” tab in the bottom-left of the ADE to view your attached data sources. To upload a file, simply drag and drop the file into the data sources tab, or click the upload (+) button.

API / SDK

Uploading a file to a source will create an async job for processing the file, which will split the file into chunks and embed them.

import time

# upload a file into the source
job = client.sources.files.upload(
    source_id=source.id,
    file=open("my_file.txt", "rb")
)

# wait until the job is completed
while True:
    job = client.jobs.retrieve(job.id)
    if job.status == "completed":
        break
    elif job.status == "failed":
        raise ValueError(f"Job failed: {job.metadata}")
    print(f"Job status: {job.status}")
    time.sleep(1)
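The loop above keeps polling indefinitely if the job never reaches a terminal state. A possible variant adds a timeout. This is a sketch under assumptions: wait_for_job and its parameters are hypothetical helper names, and the retrieval call is passed in as an argument (for example client.jobs.retrieve) so the helper does not depend on a particular client. The demo uses a fake job object in place of a real server, and the "running" status it reports is a placeholder.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable


def wait_for_job(
    retrieve: Callable[[str], Any],
    job_id: str,
    timeout: float = 60.0,
    poll_interval: float = 1.0,
) -> Any:
    """Poll retrieve(job_id) until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        job = retrieve(job_id)
        if job.status == "completed":
            return job
        if job.status == "failed":
            raise ValueError(f"Job failed: {job.metadata}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {job.status!r} after {timeout}s")
        time.sleep(poll_interval)


# Stand-in for a real client: reports "running" twice, then "completed".
_statuses = iter(["running", "running", "completed"])


def fake_retrieve(job_id: str) -> SimpleNamespace:
    return SimpleNamespace(id=job_id, status=next(_statuses), metadata={})


finished = wait_for_job(fake_retrieve, "job-123", timeout=5.0, poll_interval=0.01)
```

Using time.monotonic for the deadline keeps the timeout correct even if the system clock is adjusted while the job is processing.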

Once the job is completed, you can list the files and the generated passages in the source:

# list files in the source
files = client.sources.files.list(source_id=source.id)
print(f"Files in source: {files}")

# list passages in the source
passages = client.sources.passages.list(source_id=source.id)
print(f"Passages in source: {passages}")

Listing available data sources

You can view available data sources by listing them:

# list sources
sources = client.sources.list()
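Because each source has a unique name, a listing can be filtered to look one up by name. The helper below is a hypothetical convenience, not part of the SDK, and the demo uses stand-in objects that only assume each listed source exposes id and name attributes.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_source_by_name(sources: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first source whose name matches, or None if absent."""
    return next((s for s in sources if s.name == name), None)


# Stand-in objects shaped like the entries a listing call might return.
listed_sources = [
    SimpleNamespace(id="source-1", name="research_papers"),
    SimpleNamespace(id="source-2", name="my_source"),
]

found = find_source_by_name(listed_sources, "my_source")
missing = find_source_by_name(listed_sources, "does_not_exist")
```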

Connecting a data source to an agent

When you attach a data source to an agent, the files inside the data source become visible inside the agent’s context window. By default, only a limited “window” of each file is visible, which prevents context window overflow. The agent can use the file tools to browse through the files and search for information.
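The idea of a limited window can be pictured with a small sketch. This is an illustration only: visible_window, its line-based windowing, and the header format are assumptions, and Letta's actual window size and rendering may differ.

```python
def visible_window(content: str, start_line: int = 1, max_lines: int = 3) -> str:
    """Render at most max_lines lines of content, starting at start_line (1-indexed)."""
    lines = content.splitlines()
    first = max(start_line, 1)
    shown = lines[first - 1:first - 1 + max_lines]
    last = first + len(shown) - 1
    header = f"[showing lines {first}-{last} of {len(lines)}]"
    return "\n".join([header, *shown])


# A ten-line stand-in file; only three lines are placed "in context".
file_text = "\n".join(f"line {n}" for n in range(1, 11))
view = visible_window(file_text, start_line=4, max_lines=3)
```

Moving the window (here, by changing start_line) is the kind of operation the agent performs when it opens a file at a specific location.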

Attaching the data source

ADE

When you create a data source inside the ADE, it will be automatically attached to your agent. You can also attach existing data sources by clicking the “attach existing” button in the data sources tab.

API / SDK

You can attach a source to an agent by specifying both the source and agent IDs:

client.agents.sources.attach(agent_id=agent.id, source_id=source.id)

Note that your agent and source must be configured with the same embedding model, so that the agent can search across a common embedding space for archival memory.
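One way to catch a mismatch early is to compare the two configurations before attaching. The sketch below uses a stand-in dataclass: the field names embedding_model and embedding_dim and the helper same_embedding_space are assumptions, so check them against the EmbeddingConfig objects your client actually returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingConfigStub:
    """Hypothetical stand-in for an embedding configuration."""
    embedding_model: str
    embedding_dim: int


def same_embedding_space(agent_cfg, source_cfg) -> bool:
    """True when both configs name the same model with the same dimension."""
    return (
        agent_cfg.embedding_model == source_cfg.embedding_model
        and agent_cfg.embedding_dim == source_cfg.embedding_dim
    )


# Placeholder model names and dimensions, chosen only for the demo.
agent_cfg = EmbeddingConfigStub("model-a", 1536)
matching_source = EmbeddingConfigStub("model-a", 1536)
other_source = EmbeddingConfigStub("model-b", 768)

compatible = same_embedding_space(agent_cfg, matching_source)
incompatible = same_embedding_space(agent_cfg, other_source)
```

Vectors produced by different models are not comparable, which is why a check like this is worth running before the attach call rather than after a search returns poor results.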

Detaching the data source

ADE

To detach a data source from an agent, click the “detach” button in the data sources tab.

API / SDK

Detaching a data source will remove the files from the agent’s context window:

client.agents.sources.detach(agent_id=agent.id, source_id=source.id)