Self-hosting Letta
The recommended way to use Letta locally is with Docker. To install Docker, see Docker’s installation guide. For issues with installing Docker, see Docker’s troubleshooting guide.
Running the Letta Server
To run the server with Docker, run the command:
This will run the Letta server with the OpenAI provider enabled, and store all data in the folder ~/.letta/.persist/pgdata.
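A minimal sketch of such a command, assembled in Python so that each piece is explicit. The image name `letta/letta:latest` and the container-side data path `/var/lib/postgresql/data` are assumptions that do not appear on this page, and `build_letta_run_command` is a hypothetical helper. Check both values against the Letta release you are running.

```python
import os
import shlex

# Assumptions (not stated on this page): image name and container data path.
LETTA_IMAGE = "letta/letta:latest"
CONTAINER_PGDATA = "/var/lib/postgresql/data"


def build_letta_run_command(openai_api_key: str,
                            data_dir: str = "~/.letta/.persist/pgdata") -> list[str]:
    """Return the argv list for a `docker run` that starts the Letta server."""
    host_dir = os.path.expanduser(data_dir)
    return [
        "docker", "run",
        "-v", f"{host_dir}:{CONTAINER_PGDATA}",    # persist data on the host
        "-p", "8283:8283",                          # expose the server port
        "-e", f"OPENAI_API_KEY={openai_api_key}",   # enable the OpenAI provider
        LETTA_IMAGE,
    ]


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY", "your_openai_api_key")
    # Print a shell-ready command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_letta_run_command(key)))
```

Printing the command rather than running it keeps the sketch free of side effects. The printed line can be pasted into a shell.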
If you have many LLM API keys, you can instead put them in a .env file and pass that file to docker run:
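As a rough illustration, the sketch below writes a .env file and builds a `docker run` command that loads it through Docker's `--env-file` flag. The image name, the container data path, and both helper names are assumptions made for this example, not details taken from this page.

```python
import os
import shlex
from pathlib import Path

# Assumptions (not stated on this page): image name and container data path.
LETTA_IMAGE = "letta/letta:latest"
CONTAINER_PGDATA = "/var/lib/postgresql/data"


def write_env_file(path, values: dict[str, str]) -> Path:
    """Write one KEY=value pair per line, the format `docker run --env-file` reads."""
    env_path = Path(path)
    # Docker takes each value literally, so quotes would become part of the value.
    env_path.write_text("".join(f"{key}={value}\n" for key, value in values.items()))
    return env_path


def build_run_command_with_env_file(env_file) -> list[str]:
    """Return the argv list for a `docker run` that reads keys from env_file."""
    host_dir = os.path.expanduser("~/.letta/.persist/pgdata")
    return [
        "docker", "run",
        "-v", f"{host_dir}:{CONTAINER_PGDATA}",
        "-p", "8283:8283",
        "--env-file", str(env_file),
        LETTA_IMAGE,
    ]


if __name__ == "__main__":
    # Print the command only; creating the .env file is left to the caller.
    print(shlex.join(build_run_command_with_env_file(".env")))
```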
Once the Letta server is running, you can access it via port 8283 (e.g. sending REST API requests to http://localhost:8283/v1). You can also connect your server to the Letta ADE to access and manage your agents in a web interface.
Enabling model providers
Self-hosted servers require embedding model configuration. Unlike Letta Cloud, which manages embeddings automatically, self-hosted deployments must configure both the LLM and the embedding model explicitly when creating agents.
The Letta server can be connected to various LLM API backends (OpenAI, Anthropic, vLLM, Ollama, etc.). To enable access to these LLM API providers, set the appropriate environment variables when you use docker run:
Linux users: Use --network host and localhost instead of host.docker.internal:
The example above will make all compatible models running on OpenAI, Anthropic, and Ollama available to your Letta server.
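A sketch of how such a command might be put together for the three providers, including the Linux variant. `OPENAI_API_KEY` and `OLLAMA_BASE_URL` are named elsewhere on this page. `ANTHROPIC_API_KEY`, the Ollama port 11434 (Ollama's default), the image name, and the container data path are assumptions, and `build_provider_run_command` is a hypothetical helper.

```python
import os
import platform
import shlex

# Assumptions (not stated on this page): image name and container data path.
LETTA_IMAGE = "letta/letta:latest"
CONTAINER_PGDATA = "/var/lib/postgresql/data"


def build_provider_run_command(api_keys: dict[str, str],
                               ollama_port: int = 11434,
                               use_host_network: bool = False) -> list[str]:
    """Return a `docker run` argv list with one -e flag per provider variable."""
    host_dir = os.path.expanduser("~/.letta/.persist/pgdata")
    cmd = ["docker", "run", "-v", f"{host_dir}:{CONTAINER_PGDATA}"]
    if use_host_network:
        # Linux: share the host network, so Ollama is reachable on localhost.
        cmd += ["--network", "host"]
        ollama_host = "localhost"
    else:
        # Elsewhere: publish the port and reach the host via host.docker.internal.
        cmd += ["-p", "8283:8283"]
        ollama_host = "host.docker.internal"
    env = dict(api_keys)
    env["OLLAMA_BASE_URL"] = f"http://{ollama_host}:{ollama_port}"
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd.append(LETTA_IMAGE)
    return cmd


if __name__ == "__main__":
    keys = {
        "OPENAI_API_KEY": "your_openai_api_key",
        "ANTHROPIC_API_KEY": "your_anthropic_api_key",  # assumed variable name
    }
    on_linux = platform.system() == "Linux"
    print(shlex.join(build_provider_run_command(keys, use_host_network=on_linux)))
```

With host networking the container shares the host's ports directly, so the sketch omits the `-p` mapping in that branch.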
Configuring embedding models
When self-hosting, you must specify an embedding model when creating agents. Letta uses embeddings for archival memory search and retrieval.
Supported embedding providers
When creating agents on your self-hosted server, specify the embedding parameter:
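One way this might look over the REST API, using only the Python standard library. The endpoint path `/agents/`, the JSON field names `model` and `embedding`, and the model handle `openai/gpt-4o-mini` are assumptions for illustration. Consult the Letta API reference for the exact request schema. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8283/v1"


def build_create_agent_request(model: str, embedding: str,
                               base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates an agent."""
    # Assumed payload shape: provider-prefixed handles for both models.
    payload = {"model": model, "embedding": embedding}
    return urllib.request.Request(
        f"{base_url}/agents/",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_agent(model: str, embedding: str) -> dict:
    """Send the request to a running server and return the decoded JSON reply."""
    request = build_create_agent_request(model, embedding)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    req = build_create_agent_request(
        model="openai/gpt-4o-mini",                 # assumed example handle
        embedding="openai/text-embedding-3-small",
    )
    # Show what would be sent; call create_agent(...) against a live server.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```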
Available embedding models
The embedding models you can use depend on which provider you’ve configured:
OpenAI (requires OPENAI_API_KEY):
- openai/text-embedding-3-small (recommended)
- openai/text-embedding-3-large
- openai/text-embedding-ada-002
Azure OpenAI (requires Azure configuration):
- azure/text-embedding-3-small
- azure/text-embedding-ada-002
Ollama (requires OLLAMA_BASE_URL):
- ollama/mxbai-embed-large
- ollama/nomic-embed-text
- Any embedding model available in your Ollama instance
Letta Cloud difference: When using Letta Cloud, the embedding parameter is optional and managed automatically. Self-hosted servers require explicit embedding configuration.
Optional: Telemetry with ClickHouse
Letta supports optional telemetry using ClickHouse. Telemetry provides observability features like traces, LLM request logging, and performance metrics. See the telemetry guide for setup instructions.
Password protection
When running a self-hosted Letta server in a production environment (i.e. with untrusted users), make sure to enable both password protection (to prevent unauthorized access to your server over the network) and tool sandboxing (to prevent malicious tools from executing in a privileged environment).
To password protect your server, include SECURE=true and LETTA_SERVER_PASSWORD=yourpassword in your docker run command:
With password protection enabled, you must send your password as a bearer token in the Authorization header of your API requests:
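A minimal sketch covering both halves: the extra `docker run` flags and an authenticated request. `SECURE` and `LETTA_SERVER_PASSWORD` come from the text above. The `/agents/` path used in the demo is an assumed example endpoint, and all helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8283/v1"


def password_env_flags(password: str) -> list[str]:
    """Return the extra `docker run` flags that turn on password protection."""
    return ["-e", "SECURE=true", "-e", f"LETTA_SERVER_PASSWORD={password}"]


def authorized_request(path: str, password: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the bearer token."""
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Bearer {password}"},
    )


def fetch_json(request: urllib.request.Request):
    """Send the request to a running server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    print(" ".join(password_env_flags("yourpassword")))
    req = authorized_request("/agents/", "yourpassword")  # assumed example path
    # Show what would be sent; call fetch_json(req) against a live server.
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```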
Tool sandboxing
To enable tool sandboxing, set the E2B_API_KEY and E2B_SANDBOX_TEMPLATE_ID environment variables (with values obtained from E2B) when you use docker run.
When sandboxing is enabled, all custom tools (created by users from source code) will be executed in a sandboxed environment.
This does not include MCP tools, which are executed outside of the Letta server (on the MCP server itself), or built-in tools (like memory_insert), whose code cannot be modified after server startup.