Conversations
Create Conversation
List Conversations
Retrieve Conversation
Update Conversation
Delete Conversation
Cancel Conversation
Recompile Conversation
Models
Conversation = object { id, agent_id, created_at, 8 more }
Represents a conversation on an agent for concurrent messaging.
id: string
The unique identifier of the conversation.
agent_id: string
The ID of the agent this conversation belongs to.
created_at: optional string
The timestamp when the object was created.
created_by_id: optional string
The ID of the user that created this object.
in_context_message_ids: optional array of string
The IDs of in-context messages for the conversation.
isolated_block_ids: optional array of string
IDs of blocks that are isolated (specific to this conversation, overriding agent defaults).
last_updated_by_id: optional string
The ID of the user that last updated this object.
model: optional string
The model handle for this conversation (overrides agent's model). Format: provider/model-name.
model_settings: optional OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more } or object { max_output_tokens, parallel_tool_calls, provider_type, 5 more } or AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more } or 11 more
The model settings for this conversation (overrides agent's model settings).
OpenAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 4 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openai"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
Sglang = object { max_output_tokens, parallel_tool_calls, provider_type, 5 more }
SGLang model configuration (OpenAI-compatible runtime with SGLang-specific parsing).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "sglang"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
tool_call_parser: optional string
SGLang tool call parser name (for example 'glm47', 'qwen25', or 'hermes').
AnthropicModelSettings = object { effort, max_output_tokens, parallel_tool_calls, 6 more }
effort: optional "low" or "medium" or "high" or "max"
Effort level for supported Anthropic models (controls token spending). 'max' is only available on Opus 4.6. Not setting this gives similar performance to 'high'.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "anthropic"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
thinking: optional object { budget_tokens, type }
The thinking configuration for the model.
budget_tokens: optional number
The maximum number of tokens the model can use for extended thinking.
type: optional "enabled" or "disabled"
The type of thinking to use.
verbosity: optional "low" or "medium" or "high"
Soft control for how verbose model output should be, used for GPT-5 models.
GoogleAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_ai"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
GoogleVertexModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_vertex"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
AzureModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "azure"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
XaiModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
xAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "xai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Zai = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "zai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking: optional object { clear_thinking, type }
The thinking configuration for GLM-4.5+ models.
clear_thinking: optional boolean
If false, preserved thinking is used (recommended for agents).
type: optional "enabled" or "disabled"
Whether thinking is enabled or disabled.
GroqModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Groq model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "groq"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
DeepseekModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
DeepSeek model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "deepseek"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
TogetherModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Together AI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "together"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
BedrockModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
AWS Bedrock model configuration.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "bedrock"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Openrouter = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
OpenRouter model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openrouter"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
ChatgptOAuth = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
ChatGPT OAuth model configuration (uses ChatGPT backend API).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "chatgpt_oauth"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "low" or "medium" or 2 more
The reasoning effort level for GPT-5.x and o-series models.
temperature: optional number
The temperature of the model.
summary: optional string
A summary of the conversation.
updated_at: optional string
The timestamp when the object was last updated.
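As an illustration of how a client might read a Conversation object, the sketch below splits the `model` handle into provider and model name and reads the `provider_type` value that identifies which settings variant is in use. The helper names (`split_model_handle`, `settings_variant`) and all sample values are hypothetical and not part of the API; only the field names come from the schema above. Splitting at the first slash is an assumption, made so that model names containing a slash stay intact.

```python
from typing import Any, Optional

# Hypothetical sample payload: field names follow the Conversation schema,
# values are made up for illustration.
conversation: dict[str, Any] = {
    "id": "conv-123",
    "agent_id": "agent-456",
    "model": "anthropic/example-model",
    "model_settings": {
        "provider_type": "anthropic",
        "max_output_tokens": 1024,
        "thinking": {"type": "enabled", "budget_tokens": 512},
    },
}


def split_model_handle(handle: str) -> tuple[str, str]:
    """Split a 'provider/model-name' handle at the first slash."""
    provider, sep, name = handle.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model-name', got {handle!r}")
    return provider, name


def settings_variant(conv: dict[str, Any]) -> Optional[str]:
    """Return the provider_type of the settings override, or None if unset."""
    settings = conv.get("model_settings") or {}
    return settings.get("provider_type")


provider, model_name = split_model_handle(conversation["model"])
variant = settings_variant(conversation)
```

Because `model` and `model_settings` are both optional, a client should expect either to be missing, in which case the agent's own model and settings apply.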
CreateConversation = object { isolated_block_labels, model, model_settings, summary }
Request model for creating a new conversation.
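A minimal sketch of assembling a request body for this model, assuming the body is sent as JSON. The helper `build_create_conversation` is hypothetical, as are the block label, model handle, and settings values; only the four field names come from the schema. Omitting unset fields instead of sending them as null is an assumption about how the server treats optional fields.

```python
import json
from typing import Any, Optional


def build_create_conversation(
    isolated_block_labels: Optional[list[str]] = None,
    model: Optional[str] = None,
    model_settings: Optional[dict[str, Any]] = None,
    summary: Optional[str] = None,
) -> dict[str, Any]:
    """Collect the optional CreateConversation fields, dropping unset ones."""
    fields = {
        "isolated_block_labels": isolated_block_labels,
        "model": model,
        "model_settings": model_settings,
        "summary": summary,
    }
    return {key: value for key, value in fields.items() if value is not None}


# Illustrative values only: one isolated block plus an OpenAI-style override.
body = build_create_conversation(
    isolated_block_labels=["scratchpad"],
    model="openai/example-model",
    model_settings={
        "provider_type": "openai",
        "temperature": 0.2,
        "response_format": {"type": "json_object"},
    },
)
payload = json.dumps(body)
```

Every field is optional, so calling the helper with no arguments yields an empty body, which leaves the conversation on the agent's defaults.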
isolated_block_labels: optional array of string
List of block labels that should be isolated (conversation-specific) rather than shared across conversations. New blocks will be created as copies of the agent's blocks with these labels.
model: optional string
The model handle for this conversation (overrides agent's model). Format: provider/model-name.
model_settings: optional OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more } or object { max_output_tokens, parallel_tool_calls, provider_type, 5 more } or AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more } or 11 more
The model settings for this conversation (overrides agent's model settings).
OpenAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 4 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openai"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
Sglang = object { max_output_tokens, parallel_tool_calls, provider_type, 5 more }
SGLang model configuration (OpenAI-compatible runtime with SGLang-specific parsing).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "sglang"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
tool_call_parser: optional string
SGLang tool call parser name (for example 'glm47', 'qwen25', or 'hermes').
AnthropicModelSettings = object { effort, max_output_tokens, parallel_tool_calls, 6 more }
effort: optional "low" or "medium" or "high" or "max"
Effort level for supported Anthropic models (controls token spending). 'max' is only available on Opus 4.6. Not setting this gives similar performance to 'high'.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "anthropic"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
thinking: optional object { budget_tokens, type }
The thinking configuration for the model.
budget_tokens: optional number
The maximum number of tokens the model can use for extended thinking.
type: optional "enabled" or "disabled"
The type of thinking to use.
verbosity: optional "low" or "medium" or "high"
Soft control for how verbose model output should be, used for GPT-5 models.
GoogleAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_ai"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
GoogleVertexModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_vertex"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
AzureModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "azure"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
XaiModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
xAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "xai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Zai = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "zai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking: optional object { clear_thinking, type }
The thinking configuration for GLM-4.5+ models.
clear_thinking: optional boolean
If false, preserved thinking is used (recommended for agents).
type: optional "enabled" or "disabled"
Whether thinking is enabled or disabled.
GroqModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Groq model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "groq"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
DeepseekModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
DeepSeek model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "deepseek"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
TogetherModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Together AI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "together"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
BedrockModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
AWS Bedrock model configuration.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "bedrock"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Openrouter = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
OpenRouter model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openrouter"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
ChatgptOAuth = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
ChatGPT OAuth model configuration (uses ChatGPT backend API).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "chatgpt_oauth"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "low" or "medium" or 2 more
The reasoning effort level for GPT-5.x and o-series models.
temperature: optional number
The temperature of the model.
summary: optional string
A summary of the conversation.
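As a rough illustration of the override described for `model`, the sketch below picks the conversation-level handle when one is set and falls back to the agent's handle otherwise. The helper `effective_model`, the sample IDs, and the model handles are hypothetical placeholders and not part of the API. The code only mirrors the documented intent on the client side.

```python
from typing import Any, Dict


def effective_model(conversation: Dict[str, Any], agent_model: str) -> str:
    """Return the conversation's model handle if set, otherwise the agent's."""
    # `model` is optional on a Conversation; a missing or null value means no override.
    return conversation.get("model") or agent_model


# Hypothetical sample objects; IDs and handles are placeholders.
with_override = {"id": "conv-1", "agent_id": "agent-1", "model": "provider-a/model-x"}
without_override = {"id": "conv-2", "agent_id": "agent-1"}

print(effective_model(with_override, "provider-b/model-y"))     # provider-a/model-x
print(effective_model(without_override, "provider-b/model-y"))  # provider-b/model-y
```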
UpdateConversation = object { model, model_settings, summary }
Request model for updating a conversation.
model: optional string
The model handle for this conversation (overrides agent's model). Format: provider/model-name.
model_settings: optional OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more } or object { max_output_tokens, parallel_tool_calls, provider_type, 5 more } or AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more } or 11 more
The model settings for this conversation (overrides agent's model settings).
OpenAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 4 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openai"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
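For orientation, an OpenAIModelSettings value might be assembled as shown in this minimal sketch. Only the field names and enum strings come from the schema above; the numeric values are arbitrary, and wrapping the settings under a `model_settings` key simply follows the UpdateConversation shape.

```python
import json

# Hypothetical values; field names and enum strings are taken from the schema.
openai_settings = {
    "provider_type": "openai",
    "max_output_tokens": 1024,
    "parallel_tool_calls": False,
    "reasoning": {"reasoning_effort": "low"},
    "response_format": {"type": "json_object"},
    "strict": True,
}

# Serialize as it could appear inside an update request body.
body = json.dumps({"model_settings": openai_settings}, indent=2)
print(body)
```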
Sglang = object { max_output_tokens, parallel_tool_calls, provider_type, 5 more }
SGLang model configuration (OpenAI-compatible runtime with SGLang-specific parsing).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "sglang"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
tool_call_parser: optional string
SGLang tool call parser name (for example 'glm47', 'qwen25', or 'hermes').
AnthropicModelSettings = object { effort, max_output_tokens, parallel_tool_calls, 6 more }
effort: optional "low" or "medium" or "high" or "max"
Effort level for supported Anthropic models (controls token spending). 'max' is only available on Opus 4.6. Not setting this gives similar performance to 'high'.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "anthropic"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
thinking: optional object { budget_tokens, type }
The thinking configuration for the model.
budget_tokens: optional number
The maximum number of tokens the model can use for extended thinking.
type: optional "enabled" or "disabled"
The type of thinking to use.
verbosity: optional "low" or "medium" or "high"
Soft control for how verbose model output should be, used for GPT-5 models.
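The `effort` and `thinking` fields accept a small set of string values, so a client may want to check them before sending. The sketch below does that for AnthropicModelSettings; `make_anthropic_settings` is a hypothetical helper, and the budget value in the usage line is arbitrary.

```python
from typing import Any, Dict, Optional

# Enum values as listed in the schema above.
ALLOWED_EFFORT = {"low", "medium", "high", "max"}
ALLOWED_THINKING_TYPES = {"enabled", "disabled"}


def make_anthropic_settings(
    effort: Optional[str] = None,
    budget_tokens: Optional[int] = None,
    thinking_type: str = "enabled",
) -> Dict[str, Any]:
    """Build an AnthropicModelSettings dict, rejecting unknown enum values."""
    if effort is not None and effort not in ALLOWED_EFFORT:
        raise ValueError(f"unknown effort: {effort!r}")
    if thinking_type not in ALLOWED_THINKING_TYPES:
        raise ValueError(f"unknown thinking type: {thinking_type!r}")

    settings: Dict[str, Any] = {"provider_type": "anthropic"}
    if effort is not None:
        settings["effort"] = effort

    thinking: Dict[str, Any] = {"type": thinking_type}
    if budget_tokens is not None:
        thinking["budget_tokens"] = budget_tokens
    settings["thinking"] = thinking
    return settings


settings = make_anthropic_settings(effort="high", budget_tokens=2048)
print(settings)
```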
GoogleAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_ai"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
GoogleVertexModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_vertex"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
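Note that the two Google variants name their fields `response_schema` and `thinking_config`, where most other providers use `response_format` and `thinking`. The following sketch builds either variant; `google_settings` is a hypothetical helper and the budget value is arbitrary.

```python
from typing import Any, Dict


def google_settings(
    provider_type: str, thinking_budget: int, include_thoughts: bool = False
) -> Dict[str, Any]:
    """Build a GoogleAIModelSettings or GoogleVertexModelSettings dict."""
    if provider_type not in {"google_ai", "google_vertex"}:
        raise ValueError(f"not a Google provider type: {provider_type!r}")
    return {
        "provider_type": provider_type,
        # Google variants use `response_schema`, not `response_format`.
        "response_schema": {"type": "text"},
        "thinking_config": {
            "include_thoughts": include_thoughts,
            "thinking_budget": thinking_budget,
        },
    }


vertex = google_settings("google_vertex", thinking_budget=512)
print(vertex["thinking_config"])
```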
AzureModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "azure"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
XaiModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
xAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "xai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Zai = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "zai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking: optional object { clear_thinking, type }
The thinking configuration for GLM-4.5+ models.
clear_thinking: optional boolean
If False, preserved thinking is used (recommended for agents).
type: optional "enabled" or "disabled"
Whether thinking is enabled or disabled.
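The `clear_thinking` flag is inverted relative to what one might expect: setting it to false keeps (preserves) thinking. A small sketch that makes the inversion explicit, using a hypothetical helper `zai_settings`:

```python
from typing import Any, Dict


def zai_settings(thinking_enabled: bool, preserve_thinking: bool = True) -> Dict[str, Any]:
    """Build a Zai settings dict with an explicit 'preserve' switch."""
    return {
        "provider_type": "zai",
        "thinking": {
            "type": "enabled" if thinking_enabled else "disabled",
            # clear_thinking=False means preserved thinking is used.
            "clear_thinking": not preserve_thinking,
        },
    }


print(zai_settings(thinking_enabled=True))
```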
GroqModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Groq model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "groq"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
DeepseekModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
DeepSeek model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "deepseek"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
TogetherModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Together AI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "together"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
BedrockModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
AWS Bedrock model configuration.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "bedrock"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Openrouter = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
OpenRouter model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openrouter"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
ChatgptOAuth = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
ChatGPT OAuth model configuration (uses ChatGPT backend API).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "chatgpt_oauth"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "low" or "medium" or 2 more
The reasoning effort level for GPT-5.x and o-series models.
temperature: optional number
The temperature of the model.
summary: optional string
A summary of the conversation.
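Since all three UpdateConversation fields are optional, a request body can carry only the fields being changed. The sketch below builds such a body; `build_update_body` is a hypothetical helper, the values are placeholders, and it assumes that omitted fields are left unchanged by the server, which the schema above does not state.

```python
import json
from typing import Any, Dict, Optional


def build_update_body(
    model: Optional[str] = None,
    model_settings: Optional[Dict[str, Any]] = None,
    summary: Optional[str] = None,
) -> Dict[str, Any]:
    """Return an UpdateConversation body containing only the fields that were set."""
    candidates = {"model": model, "model_settings": model_settings, "summary": summary}
    # Drop unset fields so the body mentions only what should change (assumption).
    return {key: value for key, value in candidates.items() if value is not None}


body = build_update_body(model="provider-a/model-x", summary="Billing follow-up")
print(json.dumps(body))  # {"model": "provider-a/model-x", "summary": "Billing follow-up"}
```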
Conversations Messages
List Conversation Messages
Send Conversation Message
Retrieve Conversation Stream
Compact Conversation
Models
CompactionRequest = object { compaction_settings }
compaction_settings: optional object { clip_chars, mode, model, 4 more }
Configuration for conversation compaction / summarization.
Per-model settings (temperature, max tokens, etc.) are derived from the default configuration for that handle.
clip_chars: optional number
The maximum length of the summary in characters. If None, no clipping is performed.
mode: optional "all" or "sliding_window" or "self_compact_all" or "self_compact_sliding_window"
The summarization technique to use.
model: optional string
Model handle to use for sliding_window/all summarization (format: provider/model-name). If None, uses lightweight provider-specific defaults.
model_settings: optional OpenAIModelSettings { max_output_tokens, parallel_tool_calls, provider_type, 4 more } or object { max_output_tokens, parallel_tool_calls, provider_type, 5 more } or AnthropicModelSettings { effort, max_output_tokens, parallel_tool_calls, 6 more } or 11 more
Optional model settings used to override defaults for the summarizer model.
OpenAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 4 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openai"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
Sglang = object { max_output_tokens, parallel_tool_calls, provider_type, 5 more }
SGLang model configuration (OpenAI-compatible runtime with SGLang-specific parsing).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "sglang"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "minimal" or "low" or 3 more
The reasoning effort to use when generating text with reasoning models.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
tool_call_parser: optional string
SGLang tool call parser name (for example 'glm47', 'qwen25', or 'hermes').
AnthropicModelSettings = object { effort, max_output_tokens, parallel_tool_calls, 6 more }
effort: optional "low" or "medium" or "high" or "max"
Effort level for supported Anthropic models (controls token spending). 'max' is only available on Opus 4.6. Not setting this gives similar performance to 'high'.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "anthropic"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
strict: optional boolean
Enable strict mode for tool calling. When true, tool outputs are guaranteed to match JSON schemas.
temperature: optional number
The temperature of the model.
thinking: optional object { budget_tokens, type }
The thinking configuration for the model.
budget_tokens: optional number
The maximum number of tokens the model can use for extended thinking.
type: optional "enabled" or "disabled"
The type of thinking to use.
verbosity: optional "low" or "medium" or "high"
Soft control for how verbose model output should be, used for GPT-5 models.
GoogleAIModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_ai"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
GoogleVertexModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "google_vertex"
The type of the provider.
response_schema: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response schema for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking_config: optional object { include_thoughts, thinking_budget }
The thinking configuration for the model.
include_thoughts: optional boolean
Whether to include thoughts in the model's response.
thinking_budget: optional number
The thinking budget for the model.
AzureModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Azure OpenAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "azure"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
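The three response format objects differ only in their `type` value, with `json_schema` carrying one extra field. A minimal sketch of a helper that builds any of them follows; `make_response_format` is a hypothetical name, not part of any SDK.

```python
def make_response_format(kind, schema=None):
    """Build one of the three response format objects described above.

    Hypothetical helper for illustration only.
    """
    if kind == "json_schema":
        if schema is None:
            raise ValueError("json_schema format requires a schema map")
        return {"type": "json_schema", "json_schema": schema}
    if kind in ("text", "json_object"):
        return {"type": kind}
    raise ValueError(f"unknown response format type: {kind!r}")


# Example: Azure settings that request a JSON object response.
azure_settings = {
    "provider_type": "azure",
    "temperature": 0.0,
    "response_format": make_response_format("json_object"),
}
```

The same helper applies to every provider in this list whose settings expose `response_format`.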
XaiModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
xAI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "xai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Zai = object { max_output_tokens, parallel_tool_calls, provider_type, 3 more }
Z.ai (ZhipuAI) model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "zai"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
thinking: optional object { clear_thinking, type }
The thinking configuration for GLM-4.5+ models.
clear_thinking: optional boolean
If false, preserved thinking is used, so earlier reasoning is not cleared (recommended for agents).
type: optional "enabled" or "disabled"
Whether thinking is enabled or disabled.
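To make the interaction between `type` and `clear_thinking` concrete, here is a small sketch of Z.ai settings with preserved thinking switched on, plus a check for that combination. The helper name `uses_preserved_thinking` and the token limit are assumptions made for the example.

```python
# Illustrative Z.ai settings: thinking enabled and not cleared between turns.
zai_settings = {
    "provider_type": "zai",
    "max_output_tokens": 4096,
    "thinking": {"type": "enabled", "clear_thinking": False},
}


def uses_preserved_thinking(settings):
    """Return True when thinking is enabled and clear_thinking is explicitly False.

    Hypothetical helper; it only inspects the dictionary and calls no API.
    """
    thinking = settings.get("thinking") or {}
    return thinking.get("type") == "enabled" and thinking.get("clear_thinking") is False
```

Because both fields are optional, the check treats a missing `thinking` object as "not preserved" rather than guessing at a server-side default.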
GroqModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Groq model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "groq"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
DeepseekModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
DeepSeek model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "deepseek"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
TogetherModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
Together AI model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "together"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
BedrockModelSettings = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
AWS Bedrock model configuration.
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "bedrock"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
Openrouter = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
OpenRouter model configuration (OpenAI-compatible).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "openrouter"
The type of the provider.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
The response format for the model.
TextResponseFormat = object { type }
Response format for plain text responses.
type: optional "text"
The type of the response format.
JsonSchemaResponseFormat = object { json_schema, type }
Response format for JSON schema-based responses.
json_schema: map[unknown]
The JSON schema of the response.
type: optional "json_schema"
The type of the response format.
JsonObjectResponseFormat = object { type }
Response format for JSON object responses.
type: optional "json_object"
The type of the response format.
temperature: optional number
The temperature of the model.
ChatgptOAuth = object { max_output_tokens, parallel_tool_calls, provider_type, 2 more }
ChatGPT OAuth model configuration (uses ChatGPT backend API).
max_output_tokens: optional number
The maximum number of tokens the model can generate.
parallel_tool_calls: optional boolean
Whether to enable parallel tool calling.
provider_type: optional "chatgpt_oauth"
The type of the provider.
reasoning: optional object { reasoning_effort }
The reasoning configuration for the model.
reasoning_effort: optional "none" or "low" or "medium" or 2 more
The reasoning effort level for GPT-5.x and o-series models.
temperature: optional number
The temperature of the model.
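The provider settings listed here form a union distinguished by `provider_type`, and each variant accepts a slightly different set of fields. The sketch below encodes the field sets shown in this section and reports any keys that do not belong to the selected provider. `ALLOWED_FIELDS` and `unexpected_fields` are hypothetical names, and the table covers only the providers documented in this part of the reference.

```python
# Fields shared by every variant in this section.
COMMON = {"max_output_tokens", "parallel_tool_calls", "provider_type", "temperature"}

# Per-provider field sets, transcribed from the schema listing above.
ALLOWED_FIELDS = {
    "google_vertex": COMMON | {"response_schema", "thinking_config"},
    "azure": COMMON | {"response_format"},
    "xai": COMMON | {"response_format"},
    "zai": COMMON | {"response_format", "thinking"},
    "groq": COMMON | {"response_format"},
    "deepseek": COMMON | {"response_format"},
    "together": COMMON | {"response_format"},
    "bedrock": COMMON | {"response_format"},
    "openrouter": COMMON | {"response_format"},
    "chatgpt_oauth": COMMON | {"reasoning"},
}


def unexpected_fields(settings):
    """Return the sorted keys that the chosen provider's schema does not list."""
    provider = settings.get("provider_type")
    if provider not in ALLOWED_FIELDS:
        raise ValueError(f"provider_type not covered by this sketch: {provider!r}")
    return sorted(set(settings) - ALLOWED_FIELDS[provider])
```

For example, passing `response_format` together with `provider_type: "chatgpt_oauth"` is flagged, since that variant lists `reasoning` but no response format field.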
prompt: optional string
The prompt to use for summarization. If not set, the mode-specific default prompt is used.
prompt_acknowledgement: optional boolean
Whether to include an acknowledgement post-prompt (helps prevent non-summary outputs).
sliding_window_percentage: optional number
The percentage of the context window to keep post-summarization (only used in sliding window modes).
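As a rough illustration of what `sliding_window_percentage` controls, the sketch below estimates how many tokens of context would remain after summarization. It assumes the value is expressed as a fraction between 0 and 1; the reference does not state the unit, so treat that as an assumption. `tokens_kept` is a hypothetical helper and the context window size is an example value.

```python
def tokens_kept(context_window, sliding_window_percentage):
    """Estimate the tokens retained after summarization.

    Assumes sliding_window_percentage is a fraction in [0, 1]; this unit is
    an assumption, not something the reference confirms.
    """
    if not 0.0 <= sliding_window_percentage <= 1.0:
        raise ValueError("expected a fraction between 0 and 1")
    return int(context_window * sliding_window_percentage)


# Illustrative settings using the three fields described above.
summary_settings = {
    "prompt": "Summarize the conversation so far in under 200 words.",
    "prompt_acknowledgement": True,
    "sliding_window_percentage": 0.25,
}

kept = tokens_kept(128_000, summary_settings["sliding_window_percentage"])
```

With these example numbers, a quarter of a 128,000-token window is kept, which leaves the remainder free for new messages.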