
Conversations

Create Conversation
client.conversations.create(ConversationCreateParams { agent_id, isolated_block_labels, summary } params, RequestOptions options?): Conversation { id, agent_id, created_at, 6 more }
POST /v1/conversations/
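
A minimal usage sketch. The package name and client construction are assumptions (check the SDK README); the agent ID is a placeholder. Passing isolated_block_labels gives the conversation its own copies of the named agent blocks:

import Letta from '@letta-ai/letta-client'; // assumed package name

const client = new Letta(); // assumed to read the API key from the environment

const conversation = await client.conversations.create({
  agent_id: 'agent-123',            // placeholder agent ID
  isolated_block_labels: ['human'], // copy the agent's "human" block into this conversation
  summary: 'Support thread',
});
console.log(conversation.id);
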
List Conversations
client.conversations.list(ConversationListParams { agent_id, after, limit, summary_search } query, RequestOptions options?): ConversationListResponse { id, agent_id, created_at, 6 more }
GET /v1/conversations/
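
A sketch of listing an agent's conversations, reusing the client from the sketch above; that the response is directly iterable is an assumption:

const conversations = await client.conversations.list({
  agent_id: 'agent-123',     // placeholder agent ID
  limit: 10,
  summary_search: 'support', // filter conversations by summary text
});
for (const conversation of conversations) { // assumed iterable of Conversation objects
  console.log(conversation.id, conversation.summary);
}
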
Retrieve Conversation
client.conversations.retrieve(string conversationID, RequestOptions options?): Conversation { id, agent_id, created_at, 6 more }
GET /v1/conversations/{conversation_id}
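
Fetching a single conversation by ID (placeholder ID, client as above):

const conversation = await client.conversations.retrieve('conv-123');
console.log(conversation.summary, conversation.in_context_message_ids);
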
Update Conversation
client.conversations.update(string conversationID, ConversationUpdateParams { summary } body, RequestOptions options?): Conversation { id, agent_id, created_at, 6 more }
PATCH /v1/conversations/{conversation_id}
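
Updating a conversation; summary is the only patchable field listed below:

const updated = await client.conversations.update('conv-123', {
  summary: 'Resolved: shipping question',
});
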
Cancel Conversation
client.conversations.cancel(string conversationID, RequestOptions options?): ConversationCancelResponse
POST /v1/conversations/{conversation_id}/cancel
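
Cancelling in-flight processing on a conversation (placeholder ID, client as above):

const result = await client.conversations.cancel('conv-123');
console.log(result);
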
Models
Conversation { id, agent_id, created_at, 6 more }

Represents a conversation on an agent for concurrent messaging.

id: string

The unique identifier of the conversation.

agent_id: string

The ID of the agent this conversation belongs to.

created_at?: string | null

The timestamp when the object was created.

format: date-time
created_by_id?: string | null

The ID of the user that created this object.

in_context_message_ids?: Array<string>

The IDs of in-context messages for the conversation.

isolated_block_ids?: Array<string>

IDs of blocks that are isolated (specific to this conversation, overriding agent defaults).

last_updated_by_id?: string | null

The ID of the user that last updated this object.

summary?: string | null

A summary of the conversation.

updated_at?: string | null

The timestamp when the object was last updated.

format: date-time
CreateConversation { isolated_block_labels, summary }

Request model for creating a new conversation.

isolated_block_labels?: Array<string> | null

List of block labels that should be isolated (conversation-specific) rather than shared across conversations. New blocks will be created as copies of the agent's blocks with these labels.

summary?: string | null

A summary of the conversation.

UpdateConversation { summary }

Request model for updating a conversation.

summary?: string | null

A summary of the conversation.

ConversationsMessages

List Conversation Messages
client.conversations.messages.list(string conversationID, MessageListParams { after, before, group_id, 4 more } query?, RequestOptions options?): ArrayPage<Message>
GET /v1/conversations/{conversation_id}/messages
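
A sketch of paging through a conversation's messages; for-await iteration over ArrayPage is an assumption carried over from similar generated SDKs:

for await (const message of client.conversations.messages.list('conv-123')) {
  console.log(message);
}
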
Send Conversation Message
client.conversations.messages.create(string conversationID, MessageCreateParams { assistant_message_tool_kwarg, assistant_message_tool_name, background, 12 more } body, RequestOptions options?): LettaResponse { messages, stop_reason, usage } | Stream<LettaStreamingResponse>
POST /v1/conversations/{conversation_id}/messages
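
A sketch of sending a message. The payload shape (a messages array of role/content objects, presumably one of the "12 more" fields) is an assumption, as is the non-streaming default:

const response = await client.conversations.messages.create('conv-123', {
  messages: [{ role: 'user', content: 'Where is my order?' }], // assumed field shape
});
if ('messages' in response) {
  // non-streaming LettaResponse
  console.log(response.stop_reason, response.usage);
  for (const message of response.messages) console.log(message);
}
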
Retrieve Conversation Stream
client.conversations.messages.stream(string conversationID, MessageStreamParams { batch_size, include_pings, poll_interval, starting_after } body?, RequestOptions options?): MessageStreamResponse | Stream<LettaStreamingResponse>
POST /v1/conversations/{conversation_id}/stream
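
A sketch of attaching to a conversation's stream; that the streaming variant is returned and supports async iteration is an assumption:

const stream = await client.conversations.messages.stream('conv-123', {
  include_pings: true, // emit keepalive ping events
});
for await (const event of stream) {
  console.log(event);
}
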
Compact Conversation
client.conversations.messages.compact(string conversationID, MessageCompactParams { compaction_settings } body?, RequestOptions options?): CompactionResponse { num_messages_after, num_messages_before, summary }
POST /v1/conversations/{conversation_id}/compact
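
A sketch of compacting a conversation with explicit compaction settings (see the CompactionRequest model below). The 0-1 scale for sliding_window_percentage and the model_settings override shape are assumptions:

const result = await client.conversations.messages.compact('conv-123', {
  compaction_settings: {
    model: 'openai/gpt-4o-mini',     // summarizer handle: provider/model-name
    mode: 'sliding_window',
    sliding_window_percentage: 0.25, // assumed 0-1 scale: keep ~25% of the context window
    model_settings: {
      provider_type: 'openai',
      temperature: 0.2,              // override the default summarizer temperature
    },
  },
});
console.log(`${result.num_messages_before} -> ${result.num_messages_after}`);
console.log(result.summary);
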
Models
CompactionRequest { compaction_settings }
compaction_settings?: CompactionSettings | null

Configuration for conversation compaction / summarization.

model is the only required user-facing field – it specifies the summarizer model handle (e.g. "openai/gpt-4o-mini"). Per-model settings (temperature, max tokens, etc.) are derived from the default configuration for that handle.

model: string

Model handle to use for summarization (format: provider/model-name).

clip_chars?: number | null

The maximum length of the summary in characters. If none, no clipping is performed.

mode?: "all" | "sliding_window"

The type of summarization technique to use.

Accepts one of the following:
"all"
"sliding_window"
model_settings?: OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more } | AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more } | GoogleAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more } | 10 more | null

Optional model settings used to override defaults for the summarizer model.

Accepts one of the following:
OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more }
max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "openai"

The type of the provider.

reasoning?: Reasoning { reasoning_effort }

The reasoning configuration for the model.

reasoning_effort?: "none" | "minimal" | "low" | 3 more

The reasoning effort to use when generating text with reasoning models.

Accepts one of the following:
"none"
"minimal"
"low"
"medium"
"high"
"xhigh"
response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

strict?: boolean

Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.

temperature?: number

The temperature of the model.

AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more }
effort?: "low" | "medium" | "high" | null

Effort level for Opus 4.5 model (controls token conservation). Not setting this gives similar performance to 'high'.

Accepts one of the following:
"low"
"medium"
"high"
max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "anthropic"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

strict?: boolean

Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.

temperature?: number

The temperature of the model.

thinking?: Thinking { budget_tokens, type }

The thinking configuration for the model.

budget_tokens?: number

The maximum number of tokens the model can use for extended thinking.

type?: "enabled" | "disabled"

The type of thinking to use.

Accepts one of the following:
"enabled"
"disabled"
verbosity?: "low" | "medium" | "high" | null

Soft control for how verbose model output should be, used for GPT-5 models.

Accepts one of the following:
"low"
"medium"
"high"
GoogleAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "google_ai"

The type of the provider.

response_schema?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response schema for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

thinking_config?: ThinkingConfig { include_thoughts, thinking_budget }

The thinking configuration for the model.

include_thoughts?: boolean

Whether to include thoughts in the model's response.

thinking_budget?: number

The thinking budget for the model.

GoogleVertexModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "google_vertex"

The type of the provider.

response_schema?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response schema for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

thinking_config?: ThinkingConfig { include_thoughts, thinking_budget }

The thinking configuration for the model.

include_thoughts?: boolean

Whether to include thoughts in the model's response.

thinking_budget?: number

The thinking budget for the model.

AzureModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

Azure OpenAI model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "azure"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

XaiModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

xAI model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "xai"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

ZaiModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 3 more }

Z.ai (ZhipuAI) model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "zai"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

thinking?: Thinking { clear_thinking, type }

The thinking configuration for GLM-4.5+ models.

clear_thinking?: boolean

If false, preserved thinking is used (recommended for agents).

type?: "enabled" | "disabled"

Whether thinking is enabled or disabled.

Accepts one of the following:
"enabled"
"disabled"
GroqModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

Groq model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "groq"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

DeepseekModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

Deepseek model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "deepseek"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

TogetherModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

Together AI model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "together"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

BedrockModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

AWS Bedrock model configuration.

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "bedrock"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

OpenRouterModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

OpenRouter model configuration (OpenAI-compatible).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "openrouter"

The type of the provider.

response_format?: TextResponseFormat { type } | JsonSchemaResponseFormat { json_schema, type } | JsonObjectResponseFormat { type } | null

The response format for the model.

Accepts one of the following:
TextResponseFormat { type }

Response format for plain text responses.

type?: "text"

The type of the response format.

JsonSchemaResponseFormat { json_schema, type }

Response format for JSON schema-based responses.

json_schema: Record<string, unknown>

The JSON schema of the response.

type?: "json_schema"

The type of the response format.

JsonObjectResponseFormat { type }

Response format for JSON object responses.

type?: "json_object"

The type of the response format.

temperature?: number

The temperature of the model.

ChatGptoAuthModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 2 more }

ChatGPT OAuth model configuration (uses ChatGPT backend API).

max_output_tokens?: number

The maximum number of tokens the model can generate.

parallel_tool_calls?: boolean

Whether to enable parallel tool calling.

provider_type?: "chatgpt_oauth"

The type of the provider.

reasoning?: Reasoning { reasoning_effort }

The reasoning configuration for the model.

reasoning_effort?: "none" | "low" | "medium" | 2 more

The reasoning effort level for GPT-5.x and o-series models.

Accepts one of the following:
"none"
"low"
"medium"
"high"
"xhigh"
temperature?: number

The temperature of the model.

prompt?: string

The prompt to use for summarization.

prompt_acknowledgement?: boolean

Whether to include an acknowledgement post-prompt (helps prevent non-summary outputs).

sliding_window_percentage?: number

The percentage of the context window to keep post-summarization (only used in sliding window mode).

CompactionResponse { num_messages_after, num_messages_before, summary }
num_messages_after: number
num_messages_before: number
summary: string