Summarize Messages
Summarize an agent's conversation history.
Parameters
agentID: string
The ID of the agent, in the format 'agent-<UUID>' (e.g. 'agent-123e4567-e89b-42d3-8456-426614174000').
body: MessageCompactParams { compaction_settings }
Configuration for conversation compaction / summarization.
model is the only required user-facing field; it specifies the summarizer
model handle (e.g. "openai/gpt-4o-mini"). Per-model settings (temperature,
max tokens, etc.) are derived from the default configuration for that handle.
A usage sketch follows the parameter list below.
model: string
Model handle to use for summarization (format: provider/model-name).
clip_chars?: number | null
The maximum length of the summary in characters. If null, no clipping is performed.
mode?: "all" | "sliding_window"
The type of summarization technique to use.
model_settings?: OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more } | AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 5 more } | GoogleAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more } | 8 more | null
Optional model settings used to override defaults for the summarizer model (an override sketch follows the request example at the end of this page).
OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "openai"
The type of the provider.
reasoning?: Reasoning { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort?: "none" | "minimal" | "low" | 3 more
The reasoning effort to use when generating text with reasoning models.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 5 more }
effort?: "low" | "medium" | "high" | null
Effort level for the Opus 4.5 model (controls token conservation). Leaving this unset gives performance similar to 'high'.
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "anthropic"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
thinking?: Thinking { budget_tokens, type }
The thinking configuration for the model.
budget_tokens?: number
The maximum number of tokens the model can use for extended thinking.
type?: "enabled" | "disabled"
The type of thinking to use.
verbosity?: "low" | "medium" | "high" | null
Soft control for how verbose model output should be, used for GPT-5 models.
GoogleAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "google_ai"
The type of the provider.
response_schema?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response schema for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
thinking_config?: ThinkingConfig { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts?: boolean
Whether to include thoughts in the model's response.
thinking_budget?: number
The thinking budget for the model.
GoogleVertexModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "google_vertex"
The type of the provider.
response_schema?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response schema for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
thinking_config?: ThinkingConfig { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts?: boolean
Whether to include thoughts in the model's response.
thinking_budget?: number
The thinking budget for the model.
AzureModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "azure"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
XaiModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
xAI model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "xai"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
ZaiModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "zai"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
GroqModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Groq model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "groq"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
DeepseekModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Deepseek model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "deepseek"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
TogetherModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Together AI model configuration (OpenAI-compatible).
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "together"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
BedrockModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
AWS Bedrock model configuration.
max_output_tokens?: number
The maximum number of tokens the model can generate.
parallel_tool_calls?: boolean
Whether to enable parallel tool calling.
provider_type?: "bedrock"
The type of the provider.
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null
The response format for the model.
TextResponseFormat { type }
Response format for plain text responses.
type?: "text"
The type of the response format.
JsonSchemaResponseFormat { json_schema, type }
Response format for JSON schema-based responses.
json_schema: Record<string, unknown>
The JSON schema of the response.
type?: "json_schema"
The type of the response format.
JsonObjectResponseFormat { type }
Response format for JSON object responses.
type?: "json_object"
The type of the response format.
temperature?: number
The temperature of the model.
prompt?: string
The prompt to use for summarization.
prompt_acknowledgement?: boolean
Whether to include an acknowledgement post-prompt (helps prevent non-summary outputs).
sliding_window_percentage?: number
The percentage of the context window to keep post-summarization (only used in sliding window mode).
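
For orientation, here is a sketch of a request that exercises the compaction settings above. It assumes the body is passed as the second argument to compact in the MessageCompactParams shape; the model handle and the 0-1 scale for sliding_window_percentage are illustrative assumptions, not confirmed values.

import Letta from '@letta-ai/letta-client';

const client = new Letta({
  apiKey: process.env['LETTA_API_KEY'],
});

// Sliding-window compaction: summarize the older portion of the history
// and keep part of the context window intact. Field names follow the
// parameter list above.
const response = await client.agents.messages.compact(
  'agent-123e4567-e89b-42d3-8456-426614174000',
  {
    compaction_settings: {
      model: 'openai/gpt-4o-mini', // required: summarizer model handle
      mode: 'sliding_window', // summarize older messages only
      sliding_window_percentage: 0.5, // share of the context window to keep (scale assumed)
      clip_chars: 2000, // clip the summary to at most 2,000 characters
      prompt_acknowledgement: true, // guard against non-summary outputs
    },
  },
);
console.log(response.summary);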
Returns
MessageCompactResponse { num_messages_after, num_messages_before, summary }
Summarize Messages
import Letta from '@letta-ai/letta-client';

const client = new Letta({
  apiKey: process.env['LETTA_API_KEY'], // This is the default and can be omitted
});

const response = await client.agents.messages.compact('agent-123e4567-e89b-42d3-8456-426614174000');
console.log(response.num_messages_after);
{
"num_messages_after": 0,
"num_messages_before": 0,
"summary": "summary"
}
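
To override provider defaults for the summarizer, model_settings takes one of the provider-specific variants documented above, discriminated by provider_type. Below is a minimal sketch using the AnthropicModelSettings variant; the model handle is illustrative, and the field names come from the parameter list.

import Letta from '@letta-ai/letta-client';

const client = new Letta({
  apiKey: process.env['LETTA_API_KEY'],
});

// Override Anthropic defaults for the summarizer model. The handle
// 'anthropic/claude-opus-4-5' is illustrative only.
const compacted = await client.agents.messages.compact(
  'agent-123e4567-e89b-42d3-8456-426614174000',
  {
    compaction_settings: {
      model: 'anthropic/claude-opus-4-5',
      model_settings: {
        provider_type: 'anthropic',
        max_output_tokens: 1024, // cap the summary length in tokens
        temperature: 0.2, // keep the summary close to deterministic
        thinking: { type: 'disabled' }, // skip extended thinking for cheap summaries
      },
    },
  },
);
console.log(`${compacted.num_messages_before} -> ${compacted.num_messages_after} messages`);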