Summarize Messages
Summarize an agent's conversation history.
Parameters
agent_id: str
The ID of the agent in the format 'agent-<UUID>' (see the example request below).
Configuration for conversation compaction (summarization). model is the only required user-facing field; it specifies the summarizer model handle (e.g. "openai/gpt-4o-mini"). Per-model settings (temperature, max tokens, etc.) are derived from the default configuration for that handle.
model: str
Model handle to use for summarization (format: provider/model-name).
clip_chars: Optional[int]
The maximum length of the summary in characters. If None, no clipping is performed.
mode: Optional[Literal["all", "sliding_window"]]
The type of summarization technique to use.
model_settings: Optional[CompactionSettingsModelSettings]
Optional model settings used to override defaults for the summarizer model.
class OpenAIModelSettings: …
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["openai"]]
The type of the provider.
reasoning: Optional[Reasoning]
The reasoning configuration for the model.
reasoning_effort: Optional[Literal["none", "minimal", "low", …]]
The reasoning effort to use when generating text with reasoning models.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
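For illustration, a sketch of an OpenAI-flavored override for model_settings. The field names come from the listing above; passing a plain dict (rather than a typed settings object) is an assumption about the SDK.

# Assumption: model_settings accepts a plain dict shaped like OpenAIModelSettings.
openai_settings = {
    "provider_type": "openai",            # discriminator selecting the OpenAI settings shape
    "temperature": 0.2,                   # keep summaries near-deterministic
    "max_output_tokens": 512,             # cap the summary length in tokens
    "response_format": {"type": "text"},  # plain-text summary output
}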
class AnthropicModelSettings: …
effort: Optional[Literal["low", "medium", "high"]]
Effort level for the Opus 4.5 model (controls token conservation). Leaving this unset gives performance similar to 'high'.
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["anthropic"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
thinking: Optional[Thinking]
The thinking configuration for the model.
budget_tokens: Optional[int]
The maximum number of tokens the model can use for extended thinking.
type: Optional[Literal["enabled", "disabled"]]
The type of thinking to use.
verbosity: Optional[Literal["low", "medium", "high"]]
Soft control for how verbose model output should be, used for GPT-5 models.
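An Anthropic-flavored override follows the same pattern and can enable extended thinking with a token budget (dict form assumed, as above):

# Assumption: model_settings accepts a plain dict shaped like AnthropicModelSettings.
anthropic_settings = {
    "provider_type": "anthropic",  # discriminator selecting the Anthropic settings shape
    "max_output_tokens": 512,
    "thinking": {
        "type": "enabled",         # turn on extended thinking for the summarizer
        "budget_tokens": 1024,     # cap tokens spent on thinking
    },
}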
class GoogleAIModelSettings: …
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["google_ai"]]
The type of the provider.
response_schema: Optional[ResponseSchema]
The response schema for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
thinking_config: Optional[ThinkingConfig]
The thinking configuration for the model.
include_thoughts: Optional[bool]
Whether to include thoughts in the model's response.
thinking_budget: Optional[int]
The thinking budget for the model.
class GoogleVertexModelSettings: …
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["google_vertex"]]
The type of the provider.
response_schema: Optional[ResponseSchema]
The response schema for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
thinking_config: Optional[ThinkingConfig]
The thinking configuration for the model.
include_thoughts: Optional[bool]
Whether to include thoughts in the model's response.
thinking_budget: Optional[int]
The thinking budget for the model.
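The two Google variants share the same shape and differ only in provider_type; a sketch with a thinking budget (dict form assumed, as above):

# Assumption: model_settings accepts a plain dict shaped like GoogleAIModelSettings.
google_settings = {
    "provider_type": "google_ai",   # use "google_vertex" for the Vertex AI variant
    "temperature": 0.2,
    "thinking_config": {
        "include_thoughts": False,  # omit intermediate thoughts from the response
        "thinking_budget": 1024,    # token budget for thinking
    },
}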
class AzureModelSettings: …
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["azure"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class XaiModelSettings: …
xAI model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["xai"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class ZaiModelSettings: …
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["zai"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class GroqModelSettings: …
Groq model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["groq"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class DeepseekModelSettings: …
Deepseek model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["deepseek"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class TogetherModelSettings: …
Together AI model configuration (OpenAI-compatible).
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["together"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
class BedrockModelSettings: …
AWS Bedrock model configuration.
max_output_tokens: Optional[int]
The maximum number of tokens the model can generate.
parallel_tool_calls: Optional[bool]
Whether to enable parallel tool calling.
provider_type: Optional[Literal["bedrock"]]
The type of the provider.
response_format: Optional[ResponseFormat]
The response format for the model.
class TextResponseFormat: …
Response format for plain text responses.
type: Optional[Literal["text"]]
The type of the response format.
class JsonSchemaResponseFormat: …
Response format for JSON schema-based responses.
json_schema: Dict[str, object]
The JSON schema of the response.
type: Optional[Literal["json_schema"]]
The type of the response format.
class JsonObjectResponseFormat: …
Response format for JSON object responses.
type: Optional[Literal["json_object"]]
The type of the response format.
temperature: Optional[float]
The temperature of the model.
prompt: Optional[str]
The prompt to use for summarization.
prompt_acknowledgement: Optional[bool]
Whether to include an acknowledgement post-prompt (helps prevent non-summary outputs).
sliding_window_percentage: Optional[float]
The percentage of the context window to keep post-summarization (only used in sliding window mode).
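Putting the fields together, a hedged end-to-end sketch. The names are the documented fields above; passing them as top-level keyword arguments on compact() is an assumption about how the SDK surfaces them.

import os
from letta_client import Letta

client = Letta(api_key=os.environ.get("LETTA_API_KEY"))

# Assumption: compaction settings are accepted as keyword arguments on compact().
response = client.agents.messages.compact(
    agent_id="agent-123e4567-e89b-42d3-8456-426614174000",  # 'agent-' + UUID
    model="openai/gpt-4o-mini",     # summarizer model handle (provider/model-name)
    mode="sliding_window",          # summarize older messages, keep a recent window
    sliding_window_percentage=0.3,  # keep ~30% of the context window post-summarization
    model_settings={"provider_type": "openai", "temperature": 0.2},
)
print(response.summary)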
Returns
class MessageCompactResponse: …
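Fields shown in the example response below:
num_messages_after: int
The number of messages in the conversation after compaction.
num_messages_before: int
The number of messages in the conversation before compaction.
summary: str
The generated summary text.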
Summarize Messages
import os
from letta_client import Letta

client = Letta(
    api_key=os.environ.get("LETTA_API_KEY"),  # This is the default and can be omitted
)
response = client.agents.messages.compact(
    agent_id="agent-123e4567-e89b-42d3-8456-426614174000",
)
print(response.num_messages_after)
{
  "num_messages_after": 0,
  "num_messages_before": 0,
  "summary": "summary"
}