Long-Running Executions
How to handle long-running agent executions
When agents need to execute multiple tool calls or perform complex operations (like deep research, data analysis, or multi-step workflows), processing time can vary significantly.
Letta supports various ways to handle long-running agents, so you can choose the approach that best fits your use case:
Option 1: Background Mode with Resumable Streaming
Best for: Operations exceeding 10 minutes, unreliable network connections, or critical workflows that must complete regardless of client connectivity.
Trade-off: Slightly higher latency to first token due to background task initialization.
Background mode decouples agent execution from your client connection. The agent processes your request on the server while streaming results to a persistent store, so you can reconnect and resume from any point, even if your application crashes or the network fails.
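The reconnect-and-resume pattern can be sketched independently of the SDK. The example below is a minimal illustration, not Letta's actual client code: it assumes each streamed chunk carries a monotonically increasing `seq_id` and that the server can replay everything after a given `seq_id`. `consume_with_resume`, `open_stream`, and `FakeBackgroundRun` are hypothetical names; in a real application `open_stream` would wrap whichever SDK call streams a background run.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional


def consume_with_resume(
    open_stream: Callable[[Optional[int]], Iterable[Dict]],
    max_reconnects: int = 3,
) -> Iterator[Dict]:
    """Yield chunks, reconnecting after the last seen seq_id on a dropped connection."""
    last_seq_id: Optional[int] = None  # cursor: last chunk delivered to the caller
    reconnects = 0
    while True:
        try:
            for chunk in open_stream(last_seq_id):
                last_seq_id = chunk["seq_id"]  # assumed chunk field
                yield chunk
            return  # stream ended normally
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise


class FakeBackgroundRun:
    """In-memory stand-in for a server-side run whose output is persisted."""

    def __init__(self, n_chunks: int, drop_after: int) -> None:
        self.chunks = [
            {"seq_id": i, "content": f"chunk-{i}"} for i in range(1, n_chunks + 1)
        ]
        self.drop_after = drop_after
        self.connections = 0

    def stream(self, starting_after: Optional[int] = None) -> Iterator[Dict]:
        self.connections += 1
        start = starting_after or 0
        pending = (c for c in self.chunks if c["seq_id"] > start)
        for sent, chunk in enumerate(pending):
            # Simulate a network failure on the first connection only.
            if self.connections == 1 and sent == self.drop_after:
                raise ConnectionError("simulated network drop")
            yield chunk


if __name__ == "__main__":
    run = FakeBackgroundRun(n_chunks=5, drop_after=2)
    for chunk in consume_with_resume(run.stream):
        print(chunk["seq_id"], chunk["content"])
```

Persisting `last_seq_id` outside the process (for example in a database) would let the same loop pick up after an application crash rather than only after a dropped connection.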
Discovering and Resuming Active Streams
When your application starts or recovers from a crash, you can check for any active background streams and resume them. This is particularly useful for:
- Application restarts: Resume processing after deployments or crashes
- Load balancing: Pick up streams started by other instances
- Monitoring: Check progress of long-running operations from different clients
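A startup recovery routine for the scenarios above might look like the following sketch. It is written against two hypothetical callables rather than the real SDK: `list_active_run_ids` stands in for whatever call lists active background runs, and `open_stream(run_id, starting_after)` for the call that streams a run from a given position. The `seq_id` field and the `checkpoints` mapping (last processed `seq_id` per run, saved by your application) are likewise assumptions.

```python
from typing import Callable, Dict, Iterable, List, Optional


def resume_active_runs(
    list_active_run_ids: Callable[[], List[str]],
    open_stream: Callable[[str, Optional[int]], Iterable[Dict]],
    checkpoints: Dict[str, int],
    on_chunk: Callable[[str, Dict], None],
) -> Dict[str, int]:
    """Resume every active run from its saved checkpoint (or from the start)."""
    for run_id in list_active_run_ids():
        last_seq_id = checkpoints.get(run_id)  # None means "never seen this run"
        for chunk in open_stream(run_id, last_seq_id):
            on_chunk(run_id, chunk)
            checkpoints[run_id] = chunk["seq_id"]  # advance the cursor
    return checkpoints


# In-memory stand-in for the server's persisted stream store (hypothetical data).
FAKE_STORE: Dict[str, List[Dict]] = {
    "run-a": [{"seq_id": i, "content": f"a{i}"} for i in range(1, 5)],
    "run-b": [{"seq_id": i, "content": f"b{i}"} for i in range(1, 3)],
}


def fake_list_active() -> List[str]:
    return list(FAKE_STORE)


def fake_open_stream(run_id: str, starting_after: Optional[int]) -> Iterable[Dict]:
    start = starting_after or 0
    return [c for c in FAKE_STORE[run_id] if c["seq_id"] > start]


if __name__ == "__main__":
    saved = {"run-a": 2}  # this client already processed run-a up to seq_id 2
    resume_active_runs(
        fake_list_active,
        fake_open_stream,
        saved,
        on_chunk=lambda run_id, chunk: print(run_id, chunk["content"]),
    )
```

A run with no checkpoint, such as one started by another instance, is simply replayed from its first chunk.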
Option 2: Async Operations with Polling
Best for: Use cases where you don’t need real-time token streaming.
This approach suits batch processing and scheduled jobs. The async SDK method queues your request and returns immediately, letting you check results later:
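The "check results later" step usually takes the form of a polling loop. The sketch below shows one way to write it, under stated assumptions: `get_status` is a hypothetical callable that would wrap the SDK call retrieving a run's status, and the status names in `TERMINAL_STATUSES` are illustrative rather than confirmed values.

```python
import time
from typing import Callable

# Assumed status names; check the values your SDK version actually returns.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_done(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it reports a terminal status or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence standing in for repeated status lookups.
    statuses = iter(["created", "running", "running", "completed"])
    final = poll_until_done(lambda: next(statuses), interval_s=0.01)
    print("final status:", final)
```

`sleep` and `clock` are injectable so the loop can be tested without waiting. A fixed interval keeps the example short; a production job might back off between polls instead.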
Option 3: Configure Streaming with Keepalive Pings and Longer Timeouts
Best for: Use cases where you already use the standard streaming code but are experiencing timeouts or disconnects (e.g., due to network interruptions or hanging tool executions).
Trade-off: Not as reliable as background mode, and does not support resuming a disconnected stream/request.
This approach assumes a persistent HTTP connection. We highly recommend using background mode (or async polling) for long-running jobs, especially when:
- Your infrastructure uses aggressive proxy timeouts
- You need to handle network interruptions gracefully
- Operations might exceed 10 minutes
For operations under 10 minutes that need real-time updates without the complexity of background processing, configure keepalive pings and timeouts to maintain stable connections:
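On the client side, keepalive pings are useful as a liveness signal: any message, including a ping, shows that the connection is still healthy, while a long silence suggests a stalled stream. The sketch below illustrates that idea only. It does not show the request options that enable pings or set SDK timeouts, and it assumes ping chunks are dictionaries identifiable by `message_type == "ping"`. `iter_with_idle_timeout` is a hypothetical helper.

```python
import queue
import threading
from typing import Dict, Iterable, Iterator

_DONE = object()  # sentinel marking the end of the underlying stream


def iter_with_idle_timeout(
    chunks: Iterable[Dict], idle_timeout_s: float
) -> Iterator[Dict]:
    """Yield non-ping chunks; raise TimeoutError if the stream goes silent."""
    q: "queue.Queue" = queue.Queue()

    def pump() -> None:
        # Read the blocking stream on a worker thread so the consumer can time out.
        try:
            for chunk in chunks:
                q.put(chunk)
            q.put(_DONE)
        except Exception as exc:  # forward stream errors to the consumer
            q.put(exc)

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            item = q.get(timeout=idle_timeout_s)
        except queue.Empty:
            raise TimeoutError(
                f"no message or ping received for {idle_timeout_s} seconds"
            ) from None
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        if item.get("message_type") == "ping":  # assumed ping shape
            continue  # a ping resets the idle timer but carries no content
        yield item


if __name__ == "__main__":
    demo = [
        {"message_type": "ping"},
        {"message_type": "assistant_message", "content": "working..."},
        {"message_type": "ping"},
        {"message_type": "assistant_message", "content": "done"},
    ]
    for msg in iter_with_idle_timeout(demo, idle_timeout_s=1.0):
        print(msg["content"])
```

The idle timeout should be comfortably longer than the server's ping interval; otherwise a healthy but quiet stream would be reported as stalled.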