Create Message
Process a user message and return the agent’s response. This endpoint accepts a message from a user and processes it through the agent.
Note: Sending multiple concurrent requests to the same agent can lead to undefined behavior. Each agent processes messages sequentially, and concurrent requests may interleave in unexpected ways. Wait for each request to complete before sending the next one. Use separate agents or conversations for parallel processing.
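One way to follow the note above from a multi-threaded client is to hold a per-agent lock around each request. The following is a minimal sketch, not part of the SDK: `send_serialized` and the `send` callable are hypothetical names, and the lock only serializes requests issued from a single process.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict

# One lock per agent ID; the registry itself is guarded by its own lock.
_registry_lock = threading.Lock()
_agent_locks: Dict[str, threading.Lock] = defaultdict(threading.Lock)


def send_serialized(agent_id: str, send: Callable[[], object]) -> object:
    """Run `send` while holding the lock that belongs to `agent_id`.

    `send` is a hypothetical zero-argument callable that performs the
    actual request, for example a lambda wrapping the SDK call.
    """
    with _registry_lock:
        lock = _agent_locks[agent_id]
    with lock:
        # Only one request per agent is in flight at a time in this process.
        return send()
```

Requests for different agent IDs use different locks, so they can still run in parallel.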
The response format is controlled by the streaming field in the request body:
- If streaming=false (default): Returns a complete LettaResponse with all messages
- If streaming=true: Returns a Server-Sent Events (SSE) stream
Additional streaming options (only used when streaming=true):
- stream_tokens: Stream individual tokens instead of complete steps
- include_pings: Include keepalive pings to prevent connection timeouts
- background: Process the request in the background
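Because the extra options only apply when streaming=true, a client can leave them out of non-streaming requests. The helper below is a sketch that builds a plain request body as a dictionary; `build_request_body` is a hypothetical name, and the field names are the ones listed on this page.

```python
from typing import Any, Dict


def build_request_body(
    text: str,
    streaming: bool = False,
    stream_tokens: bool = False,
    include_pings: bool = False,
    background: bool = False,
) -> Dict[str, Any]:
    """Build a request body, adding streaming options only when they apply."""
    body: Dict[str, Any] = {"input": text, "streaming": streaming}
    if streaming:
        # These fields are documented as "only used when streaming=true".
        body["stream_tokens"] = stream_tokens
        body["include_pings"] = include_pings
        body["background"] = background
    return body
```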
Parameters
assistant_message_tool_kwarg: Optional[str] (deprecated)
The name of the message argument in the designated message tool. Still supported for legacy agent types, but deprecated for letta_v1_agent onward.
assistant_message_tool_name: Optional[str] (deprecated)
The name of the designated message tool. Still supported for legacy agent types, but deprecated for letta_v1_agent onward.
client_skills: Optional[Iterable[ClientSkill]]
Client-side skills available in the environment. These are rendered in the system prompt’s available skills section alongside agent-scoped skills from MemFS.
client_tools: Optional[Iterable[ClientTool]]
Client-side tools that the agent can call. When the agent calls a client-side tool, execution pauses and returns control to the client to execute the tool and provide the result via a ToolReturn.
enable_thinking: Optional[str] (deprecated)
If set to True, enables reasoning before responses or tool calls from the agent.
include_compaction_messages: Optional[bool]
If True, compaction events emit structured SummaryMessage and EventMessage types. If False (default), compaction messages are not included in the response.
include_pings: Optional[bool]
Whether to include periodic keepalive ping messages in the stream to prevent connection timeouts (only used when streaming=true).
input: Optional[Union[str, Iterable[InputUnionMember1], null]]
Syntactic sugar for a single user message. Equivalent to messages=[{'role': 'user', 'content': input}].
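The equivalence between input and messages can be sketched as a small normalization step. `normalize_messages` is a hypothetical helper, and rejecting a request that sets both fields is a choice made for this sketch, not documented server behavior.

```python
from typing import Any, Dict, List, Optional


def normalize_messages(
    input: Optional[Any] = None,
    messages: Optional[List[Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Expand the `input` shorthand into an explicit `messages` list."""
    if input is not None and messages is not None:
        # Local choice for this sketch: refuse ambiguous requests.
        raise ValueError("pass either input or messages, not both")
    if input is not None:
        return [{"role": "user", "content": input}]
    return list(messages or [])
```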
Iterable[InputUnionMember1]
class ImageContent: …
class ToolCallContent: …
class ReasoningContent: …
messages: Optional[Iterable[Message]]
The messages to be sent to the agent.
class MessageCreate: …
Request to create a message
The content of the message.
List[LettaMessageContentUnion]
class ImageContent: …
class ToolCallContent: …
class ReasoningContent: …
class ApprovalCreate: …
Input to approve or deny a tool call request
approvals: Optional[List[Approval]]
The list of approval responses
class ToolReturn: …
tool_return: Union[List[ToolReturnUnionMember0], str]
The tool return value - either a string or list of content parts (text/image)
List[ToolReturnUnionMember0]
class MessageToolReturnCreate: …
Submit tool return(s) from client-side tool execution.
This is the preferred way to send tool results back to the agent after client-side tool execution. It is equivalent to sending an ApprovalCreate with tool return approvals, but provides a cleaner API for the common case.
List of tool returns from client-side execution
tool_return: Union[List[ToolReturnUnionMember0], str]
The tool return value - either a string or list of content parts (text/image)
List[ToolReturnUnionMember0]
override_model: Optional[str]
Model handle to use for this request instead of the agent’s default model. This allows sending a message to a different model without changing the agent’s configuration.
override_system: Optional[str]
Optional per-request system prompt override. When set, this is passed directly to the underlying LLM request and bypasses the persisted/compiled system message for that request.
return_logprobs: Optional[bool]
If True, returns log probabilities of the output tokens in the response. Useful for RL training. Only supported for OpenAI-compatible providers (including SGLang).
return_token_ids: Optional[bool]
If True, returns token IDs and logprobs for ALL LLM generations in the agent step, not just the last one. Uses the SGLang native /generate endpoint. Returns a 'turns' field with TurnTokenData for each assistant/tool turn. Required for proper multi-turn RL training with loss masking.
stream_tokens: Optional[bool]
Flag to determine if individual tokens should be streamed, rather than streaming per step (only used when streaming=true).
streaming: Optional[Literal[False]]
If True, returns a streaming response (Server-Sent Events). If False (default), returns a complete response.
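When streaming is enabled, the response arrives as Server-Sent Events. The sketch below parses generic SSE text lines into JSON payloads. `iter_sse_json` is a hypothetical helper, and both the JSON encoding of each `data:` payload and the `[DONE]` end-of-stream sentinel are assumptions that this page does not confirm. In practice the SDK's own streaming iterator would normally handle this for you.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_json(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the decoded JSON payload of each SSE event in `lines`."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keepalive
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one leading space is stripped per the SSE format
            data_parts.append(value)
            continue
        if line == "" and data_parts:
            # A blank line terminates the event; join multi-line data fields.
            payload = "\n".join(data_parts)
            data_parts = []
            if payload == "[DONE]":  # assumed end-of-stream sentinel
                return
            yield json.loads(payload)
```

Other SSE fields such as `event:` and `id:` are ignored here for brevity.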
top_logprobs: Optional[int]
Number of most likely tokens to return at each position (0-20). Requires return_logprobs=True.
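The two constraints on top_logprobs (a 0-20 range and the dependency on return_logprobs) can be checked on the client before a request is sent. This is a sketch with a hypothetical helper name, `logprob_options`; the server may validate these fields differently.

```python
from typing import Any, Dict, Optional


def logprob_options(
    return_logprobs: bool = False,
    top_logprobs: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate and collect the log-probability request options."""
    if top_logprobs is not None:
        if not return_logprobs:
            raise ValueError("top_logprobs requires return_logprobs=True")
        if not 0 <= top_logprobs <= 20:
            raise ValueError("top_logprobs must be between 0 and 20")
    options: Dict[str, Any] = {"return_logprobs": return_logprobs}
    if top_logprobs is not None:
        options["top_logprobs"] = top_logprobs
    return options
```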
Returns
class LettaResponse: …
Response object from an agent interaction, consisting of the new messages generated by the agent and usage statistics.
The type of the returned messages can be either Message or LettaMessage, depending on what was specified in the request.
Attributes:
- messages (List[Union[Message, LettaMessage]]): The messages returned by the agent.
- usage (LettaUsageStatistics): The usage statistics
The messages returned by the agent.
class SystemMessage: …
A message generated by the system. Never streamed back on a response, only used for cursor pagination.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- content (str): The message content sent by the system
class UserMessage: …
A message sent by the user. Never streamed back on a response, only used for cursor pagination.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- content (Union[str, List[LettaUserMessageContentUnion]]): The message content sent by the user (can be a string or an array of multi-modal content parts)
The message content sent by the user (can be a string or an array of multi-modal content parts)
class ReasoningMessage: …
Representation of an agent’s internal reasoning.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- source (Literal["reasoner_model", "non_reasoner_model"]): Whether the reasoning content was generated natively by a reasoner model or derived via prompting
- reasoning (str): The internal reasoning of the agent
- signature (Optional[str]): The model-generated signature of the reasoning step
otid: Optional[str]
The offline threading id (OTID). Set by the client to deduplicate requests. Used for idempotency in background streaming mode — each message in a request must have a unique OTID. Retries of the same request should reuse the same OTIDs.
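The OTID rule (unique per message, reused on retry) can be sketched by assigning IDs once and keeping the resulting payload for any retries. `with_otids` is a hypothetical helper; using UUID strings as OTIDs and carrying `otid` on each request message as a dictionary key are assumptions made for this sketch.

```python
import uuid
from typing import Any, Dict, List


def with_otids(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of `messages`, adding a fresh `otid` where one is missing."""
    return [
        m if m.get("otid") else {**m, "otid": str(uuid.uuid4())}
        for m in messages
    ]
```

Calling the helper once and resending the same list on retry keeps the OTIDs stable, because messages that already carry an `otid` are returned unchanged.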
class HiddenReasoningMessage: …
Representation of an agent’s internal reasoning where reasoning content has been hidden from the response.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- state (Literal["redacted", "omitted"]): Whether the reasoning content was redacted by the provider or simply omitted by the API
- hidden_reasoning (Optional[str]): The internal reasoning of the agent
class ToolCallMessage: …
A message representing a request to call a tool (generated by the LLM to trigger tool execution).
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- tool_call (Union[ToolCall, ToolCallDelta]): The tool call
otid: Optional[str]
The offline threading id (OTID). Set by the client to deduplicate requests. Used for idempotency in background streaming mode — each message in a request must have a unique OTID. Retries of the same request should reuse the same OTIDs.
tool_calls: Optional[ToolCalls]
List[ToolCall]
class ToolReturnMessage: …
A message representing the return value of a tool call (generated by Letta executing the requested tool).
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- tool_return (str): The return value of the tool (deprecated, use tool_returns)
- status (Literal["success", "error"]): The status of the tool call (deprecated, use tool_returns)
- tool_call_id (str): A unique identifier for the tool call that generated this message (deprecated, use tool_returns)
- stdout (Optional[List[str]]): Captured stdout (e.g. prints, logs) from the tool invocation (deprecated, use tool_returns)
- stderr (Optional[List[str]]): Captured stderr from the tool invocation (deprecated, use tool_returns)
- tool_returns (Optional[List[ToolReturn]]): List of tool returns for multi-tool support
otid: Optional[str]
The offline threading id (OTID). Set by the client to deduplicate requests. Used for idempotency in background streaming mode — each message in a request must have a unique OTID. Retries of the same request should reuse the same OTIDs.
tool_return: Union[List[ToolReturnUnionMember0], str]
The tool return value - either a string or list of content parts (text/image)
List[ToolReturnUnionMember0]
class AssistantMessage: …
A message sent by the LLM in response to user input. Used in the LLM context.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- content (Union[str, List[LettaAssistantMessageContentUnion]]): The message content sent by the agent (can be a string or an array of content parts)
class ApprovalRequestMessage: …
A message representing a request for approval to call a tool (generated by the LLM to trigger tool execution).
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- tool_call (ToolCall): The tool call
otid: Optional[str]
The offline threading id (OTID). Set by the client to deduplicate requests. Used for idempotency in background streaming mode — each message in a request must have a unique OTID. Retries of the same request should reuse the same OTIDs.
tool_calls: Optional[ToolCalls]
The tool calls that have been requested by the LLM to run and are pending approval
List[ToolCall]
class ApprovalResponseMessage: …
A message representing a response from the user indicating whether a tool has been approved to run.
Args:
- id (str): The ID of the message
- date (datetime): The date the message was created in ISO format
- name (Optional[str]): The name of the sender of the message
- approve (bool): Whether the tool has been approved
- approval_request_id: The ID of the approval request
- reason (Optional[str]): An optional explanation for the provided approval status
approvals: Optional[List[Approval]]
The list of approval responses
class ToolReturn: …
tool_return: Union[List[ToolReturnUnionMember0], str]
The tool return value - either a string or list of content parts (text/image)
List[ToolReturnUnionMember0]
class SummaryMessage: …
A message representing a summary of the conversation. Sent to the LLM as a user or system message depending on the provider.
compaction_stats: Optional[CompactionStats]
class EventMessage: …
A message notifying the developer that an event has occurred (e.g. a compaction). Events are NOT part of the context window.
usage: Usage
The usage statistics of the agent.
cache_write_tokens: Optional[int]
The number of input tokens written to cache (Anthropic only). None if not reported by provider.
cached_input_tokens: Optional[int]
The number of input tokens served from cache. None if not reported by provider.
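Because the cache fields may be None when the provider does not report them, any derived metric needs to handle that case. The sketch below computes the fraction of prompt tokens served from cache. `cached_input_fraction` is a hypothetical helper, and treating cached_input_tokens as a subset of prompt_tokens is an assumption this page does not confirm.

```python
from typing import Any, Dict, Optional


def cached_input_fraction(usage: Dict[str, Any]) -> Optional[float]:
    """Return cached_input_tokens / prompt_tokens, or None if unavailable."""
    cached = usage.get("cached_input_tokens")
    prompt = usage.get("prompt_tokens")
    if cached is None or not prompt:
        # Provider did not report cache data, or there were no prompt tokens.
        return None
    return cached / prompt
```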
logprobs: Optional[Logprobs]
turns: Optional[List[Turn]]
Token data for all LLM generations in multi-turn agent interaction. Includes token IDs and logprobs for each assistant turn, plus tool result content. Only present if return_token_ids was enabled. Used for RL training with loss masking.
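For loss masking, a training pipeline typically keeps only the tokens the model generated. The sketch below collects output token IDs from assistant turns, using the field names that appear in the example response on this page (role, output_ids). `assistant_token_ids` is a hypothetical helper, and the assumption that non-assistant turns should contribute no trainable tokens is specific to this sketch.

```python
from typing import Any, Dict, List, Optional


def assistant_token_ids(turns: Optional[List[Dict[str, Any]]]) -> List[int]:
    """Concatenate output_ids from assistant turns, skipping all other turns."""
    token_ids: List[int] = []
    for turn in turns or []:
        if turn.get("role") == "assistant":
            token_ids.extend(turn.get("output_ids") or [])
    return token_ids
```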
Create Message
import os

from letta_client import Letta

client = Letta(
    api_key=os.environ.get("LETTA_API_KEY"),  # This is the default and can be omitted
)

# With streaming left at its default (false), the call returns a complete LettaResponse.
response = client.agents.messages.create(
    agent_id="agent-123e4567-e89b-42d3-8456-426614174000",
    input="Hello",
)
for message in response.messages:
    print(message)
Returns Examples
{
"messages": [
{
"id": "id",
"content": "content",
"date": "2019-12-27T18:11:19.117Z",
"is_err": true,
"message_type": "system_message",
"name": "name",
"otid": "otid",
"run_id": "run_id",
"sender_id": "sender_id",
"seq_id": 0,
"step_id": "step_id"
}
],
"stop_reason": {
"stop_reason": "end_turn",
"message_type": "stop_reason"
},
"usage": {
"cache_write_tokens": 0,
"cached_input_tokens": 0,
"completion_tokens": 0,
"context_tokens": 0,
"message_type": "usage_statistics",
"prompt_tokens": 0,
"reasoning_tokens": 0,
"run_ids": [
"string"
],
"step_count": 0,
"total_tokens": 0
},
"logprobs": {
"content": [
{
"token": "token",
"logprob": 0,
"top_logprobs": [
{
"token": "token",
"logprob": 0,
"bytes": [
0
]
}
],
"bytes": [
0
]
}
],
"refusal": [
{
"token": "token",
"logprob": 0,
"top_logprobs": [
{
"token": "token",
"logprob": 0,
"bytes": [
0
]
}
],
"bytes": [
0
]
}
]
},
"turns": [
{
"role": "assistant",
"content": "content",
"output_ids": [
0
],
"output_token_logprobs": [
[
{}
]
],
"tool_name": "tool_name"
}
]
}
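Each entry in messages carries a message_type discriminator, as in the example response above, so a client can group or dispatch on it. The following sketch works on the response parsed into a plain dictionary; `group_by_message_type` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_message_type(response: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Group the response's messages by their message_type field."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for message in response.get("messages", []):
        # Messages without a discriminator are collected under "unknown".
        groups[message.get("message_type", "unknown")].append(message)
    return dict(groups)
```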