Skip to content
Letta Platform Letta Platform Letta Docs
Sign up
Core concepts
Messages

Compaction (summarization)

Configuring compaction settings in the Letta API

When an agent’s conversation history grows too long to fit in its context window, Letta automatically compacts (summarizes) older messages to make room for new ones. The compaction_settings field lets you customize how this compaction works.

If you don’t specify compaction_settings, Letta uses sensible defaults:

  • Mode: sliding_window (keeps recent messages, summarizes older ones)
  • Model: Provider-specific default (claude-haiku-4-5, gpt-5-mini, or gemini-2.5-flash), falling back to the agent’s model
  • Sliding window: sliding_window_percentage=0.3 (targets keeping ~70% of the most recent history; increases the summarized portion in ~10% steps if needed to fit)
  • Summary limit: 50,000 characters

For most use cases, the defaults work well and you don’t need to configure compaction.

Customize compaction_settings when you want to:

  • Use a cheaper/faster model for summarization
  • Preserve more or less recent context
  • Maximize prefix caching by using self-compaction
  • Customize the summarization prompt

All fields are optional. If you don’t specify a model, Letta uses a provider-specific default: claude-haiku-4-5 for Anthropic, gpt-5-mini for OpenAI, gemini-2.5-flash for Google AI. If the provider isn’t recognized, the agent’s own model is used as fallback.

FieldTypeRequiredDescription
modelstringNoSummarizer model handle (format: provider/model-name). If not set, uses a provider-specific default (see above).
model_settingsobjectNoOptional overrides for the summarizer model defaults
promptstringNoCustom system prompt for the summarizer
prompt_acknowledgementbooleanNoWhether to include an acknowledgement post-prompt
clip_charsint | nullNoMax summary length in characters (default: 50,000)
modestringNoCompaction strategy (default: "sliding_window"). See Compaction modes.
sliding_window_percentagefloatNoFraction of messages to summarize (default: 0.3, meaning summarize ~30% and keep ~70%)

There are four compaction modes:

sliding_window (default): Preserves recent messages and summarizes older ones using a separate summarizer call.

Before compaction (10 messages):
[msg1, msg2, msg3, msg4, msg5, msg6, msg7, msg8, msg9, msg10]
|---- oldest ~30% summarized ----|
After compaction:
[summary of msg1-3, msg4, msg5, msg6, msg7, msg8, msg9, msg10]

The sliding_window_percentage controls what fraction of messages get summarized:

  • 0.2 = summarize 20% of messages (keep 80%)
  • 0.5 = summarize 50% of messages (keep 50%)

all: The entire conversation history is summarized in a separate summarizer call. Use when you need maximum space reduction.

self_compact_sliding_window: Same sliding window strategy, but the summarization request includes the agent’s system prompt and tool definitions. The summarization instruction is appended as a user message within the agent’s existing context. This keeps the prompt prefix identical to normal agent requests, improving cache hit rates.

self_compact_all: Same as all, but with the agent’s system prompt and tools included in the request for cache compatibility.

Self-compaction uses the same provider-specific default models as other modes (see above). You can set an explicit model or prompt to override the defaults.

Example: Custom compaction with a separate summarizer

Section titled “Example: Custom compaction with a separate summarizer”
from letta_client import Letta
import os
client = Letta(api_key=os.getenv("LETTA_API_KEY"))
agent = client.agents.create(
name="my_agent",
model="anthropic/claude-sonnet-4-6",
compaction_settings={
"model": "anthropic/claude-haiku-4-5", # Cheaper model for summarization
"mode": "sliding_window",
"sliding_window_percentage": 0.2, # Preserve more context
}
)

Example: Self-compaction for prompt caching

Section titled “Example: Self-compaction for prompt caching”
# Enable self-compaction to maximize prefix cache hits
client.agents.update(
agent_id="agent-xxx",
compaction_settings={
"mode": "self_compact_sliding_window",
}
)