Self-hosting Letta

Learn how to run your own Letta server

The recommended way to use Letta locally is with Docker. To install Docker, see Docker’s installation guide. For issues with installing Docker, see Docker’s troubleshooting guide.

Running the Letta Server

To run the server with Docker, use the following command:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  letta/letta:latest

This will run the Letta server with the OpenAI provider enabled, and store all data in the folder ~/.letta/.persist/pgdata.

If you have many different LLM API keys, you can also set up a .env file instead and pass that to docker run:

# using a .env file instead of passing environment variables
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  letta/letta:latest
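
The .env file itself is just KEY=value lines. For example, using the same variables shown elsewhere on this page (values are placeholders):

# .env
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OLLAMA_BASE_URL=http://host.docker.internal:11434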

Once the Letta server is running, you can access it via port 8283 (e.g. sending REST API requests to http://localhost:8283/v1). You can also connect your server to the Letta ADE to access and manage your agents in a web interface.
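
For a quick end-to-end check that the server is reachable, you can list agents with the Python SDK (a minimal sketch; install the SDK with pip install letta-client, and note that a fresh server will simply return an empty list):

from letta_client import Letta

# connect to the self-hosted server
client = Letta(base_url="http://localhost:8283")

# a simple round-trip check: list the agents on the server
agents = client.agents.list()
print(f"Server is up; found {len(agents)} agent(s)")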

Enabling model providers

Self-hosted servers require explicit embedding model configuration. Unlike Letta Cloud, which manages embeddings automatically, self-hosted deployments must specify both an LLM and an embedding model when creating agents.

The Letta server can be connected to various LLM API backends (OpenAI, Anthropic, vLLM, Ollama, etc.). To enable access to these LLM API providers, set the appropriate environment variables when you use docker run:

# replace `~/.letta/.persist/pgdata` with wherever you want to store your agent data
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_openai_api_key" \
  -e ANTHROPIC_API_KEY="your_anthropic_api_key" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434" \
  letta/letta:latest

Linux users: Use --network host and localhost instead of host.docker.internal:

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  --network host \
  -e OPENAI_API_KEY="your_openai_api_key" \
  -e ANTHROPIC_API_KEY="your_anthropic_api_key" \
  -e OLLAMA_BASE_URL="http://localhost:11434" \
  letta/letta:latest

The example above will make all compatible models running on OpenAI, Anthropic, and Ollama available to your Letta server.
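
To confirm which models the server actually picked up from these providers, you can query the REST API directly. This is a sketch: the /v1/models/ route and the handle field on each entry are assumptions that may vary across Letta versions:

import requests

# list the LLM models discovered from the configured providers
resp = requests.get("http://localhost:8283/v1/models/")
resp.raise_for_status()
for model in resp.json():
    # `handle` (e.g. "openai/gpt-4o-mini") is assumed; fall back to the raw entry
    print(model.get("handle", model))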

Configuring embedding models

When self-hosting, you must specify an embedding model for each agent you create. Letta uses embeddings for archival memory search and retrieval.

Supported embedding providers

When creating agents on your self-hosted server, specify the embedding parameter:

from letta_client import Letta

# Connect to your self-hosted server
client = Letta(base_url="http://localhost:8283")

# Create agent with explicit embedding configuration
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",  # Required for self-hosted
    memory_blocks=[
        {"label": "persona", "value": "I am a helpful assistant."}
    ]
)
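
Continuing from the snippet above, you can sanity-check the new agent by sending it a message (a sketch based on the letta_client SDK; the exact messages.create signature may differ across SDK versions):

# send a message to the new agent and print the response messages
response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello! Introduce yourself."}]
)
for message in response.messages:
    print(message)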

Available embedding models

The embedding models you can use depend on which providers you’ve configured:

OpenAI (requires OPENAI_API_KEY):

  • openai/text-embedding-3-small (recommended)
  • openai/text-embedding-3-large
  • openai/text-embedding-ada-002

Azure OpenAI (requires Azure configuration):

  • azure/text-embedding-3-small
  • azure/text-embedding-ada-002

Ollama (requires OLLAMA_BASE_URL):

  • ollama/mxbai-embed-large
  • ollama/nomic-embed-text
  • Any embedding model available in your Ollama instance

Letta Cloud difference: When using Letta Cloud, the embedding parameter is optional and managed automatically. Self-hosted servers require explicit embedding configuration.
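
For a fully local setup where only OLLAMA_BASE_URL is configured, the same creation call can pair an Ollama LLM with an Ollama embedding model. A sketch, assuming both models have been pulled into your Ollama instance (the LLM handle below is illustrative):

# both models are served by the local Ollama instance;
# "ollama/llama3.1:8b" is an illustrative handle for a model you've pulled
agent = client.agents.create(
    model="ollama/llama3.1:8b",
    embedding="ollama/mxbai-embed-large",
    memory_blocks=[
        {"label": "persona", "value": "I am a helpful assistant."}
    ]
)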

Optional: Telemetry with ClickHouse

Letta supports optional telemetry using ClickHouse. Telemetry provides observability features like traces, LLM request logging, and performance metrics. See the telemetry guide for setup instructions.

Password protection

When running a self-hosted Letta server in a production environment (i.e. with untrusted users), make sure to enable both password protection (to prevent unauthorized access to your server over the network) and tool sandboxing (to prevent malicious tools from executing in a privileged environment).

To password protect your server, include SECURE=true and LETTA_SERVER_PASSWORD=yourpassword in your docker run command:

# If LETTA_SERVER_PASSWORD isn't set, the server will autogenerate a password
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  -e SECURE=true \
  -e LETTA_SERVER_PASSWORD=yourpassword \
  letta/letta:latest

With password protection enabled, you will need to pass your password as the bearer token in your API requests:

// install letta-client with `npm install @letta-ai/letta-client`
import { LettaClient } from '@letta-ai/letta-client'

// create the client with the token set to your password
const client = new LettaClient({
  baseUrl: "http://localhost:8283",
  token: "yourpassword"
});
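
The Python client accepts the password the same way (a sketch mirroring the TypeScript example above, assuming the token parameter of letta_client):

# install letta-client with `pip install letta-client`
from letta_client import Letta

# pass the server password as the bearer token
client = Letta(
    base_url="http://localhost:8283",
    token="yourpassword"
)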

Tool sandboxing

To enable tool sandboxing, set the E2B_API_KEY and E2B_SANDBOX_TEMPLATE_ID environment variables (both obtained from E2B) when you use docker run, as in the example below. When sandboxing is enabled, all custom tools (created by users from source code) will be executed in a sandboxed environment.
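
For example, extending the earlier docker run command (the E2B values are placeholders for the API key and sandbox template ID from your E2B account):

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  --env-file .env \
  -e E2B_API_KEY="your_e2b_api_key" \
  -e E2B_SANDBOX_TEMPLATE_ID="your_e2b_template_id" \
  letta/letta:latest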

This does not include MCP tools, which are executed outside of the Letta server (on the MCP server itself), or built-in tools (like memory_insert), whose code cannot be modified after server startup.