Long-Running Executions

How to handle long-running agent executions

When agents need to execute multiple tool calls or perform complex operations (like deep research, data analysis, or multi-step workflows), processing time can vary significantly.

Letta supports various ways to handle long-running agents, so you can choose the approach that best fits your use case:

| Use Case | Duration | Recommendation | Key Benefits |
| --- | --- | --- | --- |
| Few-step invocations | < 1 minute | Standard streaming | Simplest approach |
| Variable-length runs | 1-10 minutes | Background mode (keepalive + timeout as a second choice) | Easy way to reduce timeouts |
| Deep research | 10+ minutes | Background mode, or async polling | Survives disconnects, resumable streams |
| Batch jobs | Any | Async polling | Fire-and-forget, check results later |
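The decision guide above can be encoded as a small helper for picking an approach. A sketch (the thresholds and option names come straight from the table; the function itself is illustrative, not part of any Letta SDK):

```python
def recommend_approach(duration_minutes: float, batch: bool = False) -> str:
    """Pick a long-running-execution strategy per the table above."""
    if batch:
        return "async polling"                   # fire-and-forget, check results later
    if duration_minutes < 1:
        return "standard streaming"              # simplest approach
    if duration_minutes <= 10:
        return "background mode"                 # keepalive + timeout as a second choice
    return "background mode or async polling"    # survives disconnects, resumable
```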

Option 1: Background Mode with Resumable Streaming

Best for: Operations exceeding 10 minutes, unreliable network connections, or critical workflows that must complete regardless of client connectivity.

Trade-off: Slightly higher latency to first token due to background task initialization.

Background mode decouples agent execution from your client connection. The agent processes your request on the server while streaming results to a persistent store, allowing you to reconnect and resume from any point — even if your application crashes or network fails.

```bash
curl --request POST \
  --url http://localhost:8283/v1/agents/$AGENT_ID/messages/stream \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Run comprehensive analysis on this dataset"
      }
    ],
    "stream_tokens": true,
    "background": true
  }'

# Response stream includes run_id and seq_id for each chunk:
data: {"run_id":"run-123","seq_id":0,"message_type":"reasoning_message","reasoning":"Analyzing"}
data: {"run_id":"run-123","seq_id":1,"message_type":"reasoning_message","reasoning":" the dataset"}
data: {"run_id":"run-123","seq_id":2,"message_type":"tool_call","tool_call":{...}}
# ... stream continues

# If disconnected, resume from the last received seq_id
curl --request GET \
  --url "http://localhost:8283/v1/runs/$RUN_ID/stream?starting_after=57" \
  --header 'Accept: text/event-stream'
```
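Client-side, resuming boils down to remembering the highest seq_id you have processed. A minimal sketch of that bookkeeping, assuming the SSE `data:` framing and `seq_id` field shown in the chunks above (the class and helper names are illustrative, not SDK APIs):

```python
import json

def parse_sse_line(line: str):
    """Return the chunk dict from an SSE 'data:' line, or None for other lines."""
    if not line.startswith("data: "):
        return None
    return json.loads(line[len("data: "):])

class ResumableStream:
    """Track the highest seq_id seen so a dropped stream can be resumed."""

    def __init__(self) -> None:
        self.last_seq = -1

    def feed(self, line: str):
        """Process one raw SSE line, updating the resume position."""
        chunk = parse_sse_line(line)
        if chunk is not None and "seq_id" in chunk:
            self.last_seq = max(self.last_seq, chunk["seq_id"])
        return chunk

    def resume_query(self) -> str:
        """Query string to pass when reconnecting to the run's stream."""
        return f"starting_after={self.last_seq}"
```

On reconnect, append `resume_query()` to the `/v1/runs/$RUN_ID/stream` request so delivery picks up where it left off.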

Discovering and Resuming Active Streams

When your application starts or recovers from a crash, you can check for any active background streams and resume them. This is particularly useful for:

  • Application restarts: Resume processing after deployments or crashes
  • Load balancing: Pick up streams started by other instances
  • Monitoring: Check progress of long-running operations from different clients
```bash
# Step 1: Find active background streams for your agents
curl --request GET \
  --url "http://localhost:8283/v1/runs/active?agent_ids=agent-123&agent_ids=agent-456&background=true" \
  --header 'Accept: application/json'
# Returns: [{"run_id": "run-abc", "agent_id": "agent-123", "status": "processing", ...}]

# Step 2: Resume streaming from the beginning (starting_after=0, or any seq_id),
# fetching historical chunks in larger batches via batch_size
curl --request GET \
  --url "http://localhost:8283/v1/runs/$RUN_ID/stream?starting_after=0&batch_size=1000" \
  --header 'Accept: text/event-stream'
```
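On startup, recovery logic only needs to build these two requests. A sketch of the URL construction, assuming the server accepts these filters as query parameters on GET (the endpoint paths come from the calls above; the function names are illustrative):

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8283/v1"

def active_runs_url(agent_ids: list, background: bool = True) -> str:
    """URL for GET /v1/runs/active, filtered to the given agents.

    Agent IDs are repeated as separate query parameters, per the usual
    query-string convention for list values.
    """
    params = [("agent_ids", a) for a in agent_ids]
    params.append(("background", "true" if background else "false"))
    return f"{BASE_URL}/runs/active?{urlencode(params)}"

def resume_stream_url(run_id: str, starting_after: int = 0,
                      batch_size: int = 1000) -> str:
    """URL for resuming GET /v1/runs/{run_id}/stream from a given seq_id."""
    query = urlencode({"starting_after": starting_after, "batch_size": batch_size})
    return f"{BASE_URL}/runs/{run_id}/stream?{query}"
```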

Option 2: Async Operations with Polling

Best for: Batch processing, scheduled jobs, or any workflow that doesn’t need real-time token streaming.

The async endpoint queues your request and returns immediately, letting you check results later:

```bash
# Start async operation (returns immediately with run ID)
curl --request POST \
  --url http://localhost:8283/v1/agents/$AGENT_ID/messages/async \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Run comprehensive analysis on this dataset"
      }
    ]
  }'

# Poll for results using the returned run ID
curl --request GET \
  --url http://localhost:8283/v1/runs/$RUN_ID
```
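The polling step is a loop that re-reads the run until it reaches a terminal status. A sketch of that loop; the `fetch` callable stands in for the `GET /v1/runs/$RUN_ID` request, and the terminal status names are assumptions, not confirmed API values:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}  # assumed values

def wait_for_run(run_id, fetch, interval=2.0, timeout=600.0):
    """Poll until the run reaches a terminal status or the timeout expires.

    fetch: callable taking a run_id and returning the run as a dict,
    e.g. a thin wrapper around GET /v1/runs/{run_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        run = fetch(run_id)
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"run {run_id} still {run.get('status')!r} after {timeout}s")
        time.sleep(interval)
```

Injecting `fetch` keeps the loop independent of any HTTP client, so the same logic works with `urllib`, `requests`, or an SDK call.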

Option 3: Configure Streaming with Keepalive Pings and Longer Timeouts

Best for: Use cases where you are already using the standard streaming code but are experiencing timeouts or disconnects (e.g. due to network interruptions or hanging tool executions).

Trade-off: Not as reliable as background mode, and does not support resuming a disconnected stream/request.

This approach assumes a persistent HTTP connection. We highly recommend using background mode (or async polling) for long-running jobs, especially when:

  • Your infrastructure uses aggressive proxy timeouts
  • You need to handle network interruptions gracefully
  • Operations might exceed 10 minutes

For operations under 10 minutes that need real-time updates without the complexity of background processing, configure keepalive pings and timeouts to maintain stable connections:

```bash
curl --request POST \
  --url http://localhost:8283/v1/agents/$AGENT_ID/messages/stream \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "Execute this long-running analysis"
      }
    ],
    "include_pings": true
  }'
```
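With include_pings enabled, your consumer should treat ping chunks as keepalives rather than content. A sketch of that filtering; the exact `message_type` value for pings is an assumption, so check the shape of the chunks your server actually emits:

```python
import json

def content_chunks(sse_lines):
    """Yield substantive chunks, silently consuming keepalive pings."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # SSE comments and blank lines also act as keepalives
        chunk = json.loads(line[len("data: "):])
        if chunk.get("message_type") == "ping":  # assumed ping chunk shape
            continue
        yield chunk
```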

Configuration Guidelines

| Parameter | Purpose | When to Use |
| --- | --- | --- |
| Timeout in seconds | Extends the request timeout beyond the 60s default | Set to 1.5x your expected max duration |
| Include pings | Sends keepalive messages every ~30s | Enable for operations with long gaps between outputs |