Compaction (summarization)

Configuring compaction settings in the Letta API

When an agent’s conversation history grows too long to fit in its context window, Letta automatically compacts (summarizes) older messages to make room for new ones. The compaction_settings field lets you customize how this compaction works.

If you don’t specify compaction_settings, Letta uses sensible defaults:

  • Mode: sliding_window (keeps recent messages, summarizes older ones)
  • Model: Same as the agent’s main model
  • Sliding window: sliding_window_percentage=0.3 (summarizes roughly the oldest 30% of the history and keeps the most recent ~70%; if that still does not fit, the summarized portion grows in ~10% steps)
  • Summary limit: 2000 characters

For most use cases, the defaults work well and you don’t need to configure compaction.

Customize compaction_settings when you want to:

  • Use a cheaper/faster model for summarization
  • Preserve more or less recent context
  • Change the summarization strategy (e.g. to maximize prefix caching)
  • Customize the summarization prompt

If you specify compaction_settings, the only required field is:

  • model (string): the summarizer model handle (e.g. "openai/gpt-4o-mini")

All other fields are optional.
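To make the "only model is required" rule concrete, here is a minimal sketch of how omitted fields resolve to the documented defaults. The helper with_documented_defaults and the DOCUMENTED_DEFAULTS dict are hypothetical names used for illustration; Letta applies its defaults server-side, so you do not need code like this in practice.

```python
# Default values as stated in this page; the names below are illustrative only.
DOCUMENTED_DEFAULTS = {
    "mode": "sliding_window",
    "sliding_window_percentage": 0.3,
    "clip_chars": 2000,
}


def with_documented_defaults(settings):
    """Return a copy of settings with the documented defaults filled in.

    Hypothetical helper: it only mirrors the rules described above and is
    not part of the Letta SDK.
    """
    if not settings.get("model"):
        raise ValueError("compaction_settings requires a 'model' handle")
    # Values supplied by the caller take precedence over the defaults.
    return {**DOCUMENTED_DEFAULTS, **settings}


# Only the required field is supplied; everything else falls back to defaults.
resolved = with_documented_defaults({"model": "openai/gpt-4o-mini"})
print(resolved)
```

Passing the one-field dict {"model": "openai/gpt-4o-mini"} as compaction_settings should therefore behave like the fully spelled-out version, assuming the defaults listed above.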

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Summarizer model handle (format: provider/model-name) |
| model_settings | object | No | Optional overrides for the summarizer model defaults |
| prompt | string | No | Custom system prompt for the summarizer |
| prompt_acknowledgement | boolean | No | Whether to include an acknowledgement post-prompt |
| clip_chars | int \| null | No | Max summary length in characters (default: 2000) |
| mode | string | No | "sliding_window" or "all" (default: "sliding_window") |
| sliding_window_percentage | float | No | How aggressively older history is summarized (default: 0.3) |

Sliding window (default): Preserves recent messages and only summarizes older ones.

Before compaction (10 messages):
[msg1, msg2, msg3, msg4, msg5, msg6, msg7, msg8, msg9, msg10]
|---- oldest ~30% summarized ----|
After compaction:
[summary of msg1-3, msg4, msg5, msg6, msg7, msg8, msg9, msg10]

The sliding_window_percentage controls how aggressively older history is summarized:

  • 0.2 = summarize less (keep more recent context)
  • 0.5 = summarize more (keep less recent context)
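The effect of sliding_window_percentage can be sketched with a small, count-based model. This is a simplification and not Letta's implementation: the real compactor may work on tokens rather than message counts and may respect message boundaries differently. The function names, the token counts, and the budget are all assumptions made for illustration.

```python
def split_for_compaction(messages, sliding_window_percentage=0.3):
    """Split messages into (to_summarize, kept) using a simple message count.

    Simplified illustration: the oldest fraction is summarized and the rest
    is kept verbatim.
    """
    if not 0.0 <= sliding_window_percentage <= 1.0:
        raise ValueError("sliding_window_percentage must be between 0 and 1")
    cutoff = round(len(messages) * sliding_window_percentage)
    return messages[:cutoff], messages[cutoff:]


def choose_percentage(token_counts, budget, start=0.3, step=0.1, summary_tokens=0):
    """Raise the summarized fraction in fixed steps until the kept part fits.

    Assumed model of the "~10% steps" behaviour: token_counts holds one
    (hypothetical) token count per message, oldest first.
    """
    pct = start
    while pct < 1.0:
        cutoff = round(len(token_counts) * pct)
        if summary_tokens + sum(token_counts[cutoff:]) <= budget:
            return pct
        # Round to avoid accumulating floating-point drift across steps.
        pct = round(pct + step, 10)
    return 1.0  # nothing short of summarizing everything fits


messages = [f"msg{i}" for i in range(1, 11)]
to_summarize, kept = split_for_compaction(messages, 0.3)
print(to_summarize)  # ['msg1', 'msg2', 'msg3']
print(len(kept))     # 7
```

With ten messages, 0.2 would summarize two messages and 0.5 would summarize five, which matches the "summarize less" and "summarize more" descriptions above.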

All mode: The entire conversation history is summarized. Use when you need maximum space reduction.

import os

from letta_client import Letta

client = Letta(api_key=os.getenv("LETTA_API_KEY"))

agent = client.agents.create(
    name="my_agent",
    model="openai/gpt-4o",
    compaction_settings={
        "model": "openai/gpt-4o-mini",  # Cheaper model for summarization
        "mode": "sliding_window",
        "sliding_window_percentage": 0.2,  # Preserve more context
    },
)
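For comparison, here is a sketch of a configuration that uses "all" mode together with a custom prompt and a shorter summary limit. It uses only fields from the table above and the same client.agents.create call as the previous example; the agent name, prompt text, and clip_chars value are illustrative choices, and the create_agent wrapper is a hypothetical helper.

```python
import os

# Illustrative values; only the field names come from the settings table.
full_history_settings = {
    "model": "openai/gpt-4o-mini",  # Cheaper model for summarization
    "mode": "all",                  # Summarize the entire history
    "clip_chars": 1000,             # Shorter summaries than the 2000-char default
    "prompt": (
        "Summarize the conversation so far. "
        "Keep names, decisions, and open tasks."
    ),
}


def create_agent(settings):
    """Create an agent with the given compaction settings (hypothetical wrapper)."""
    # Imported lazily so the settings dict can be inspected without the SDK.
    from letta_client import Letta

    client = Letta(api_key=os.getenv("LETTA_API_KEY"))
    return client.agents.create(
        name="my_compact_agent",
        model="openai/gpt-4o",
        compaction_settings=settings,
    )
```

Calling create_agent(full_history_settings) requires a valid LETTA_API_KEY. Because "all" mode leaves no recent messages verbatim, a prompt that names what must survive the summary is likely more important here than in sliding-window mode.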