Models
List LLM Models
LlmConfig object { context_window, model, model_endpoint_type, 24 more }
Configuration for Language Model (LLM) connection and generation parameters.
Deprecated: LLMConfig should not be used as an input or return type in API calls. Use the schemas in letta.schemas.model (ModelSettings, OpenAIModelSettings, etc.) instead. To convert, use the _to_model() method or the Model._from_llm_config() method.
model_endpoint_type: "openai" or "anthropic" or "google_ai" or 27 more
The endpoint type for the model.
enable_reasoner: optional boolean
Whether the model should use extended thinking, if it is a ‘reasoning’-style model.
frequency_penalty: optional number
Positive values penalize new tokens according to how often they already appear in the text so far, making the model less likely to repeat the same line verbatim. Per OpenAI, a number between -2.0 and 2.0.
max_reasoning_tokens: optional number
Configurable thinking budget for extended thinking. Used for enable_reasoner and also for Google Vertex models like Gemini 2.5 Flash. Minimum value is 1024 when used with enable_reasoner.
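The numeric limits stated so far (the -2.0 to 2.0 range for frequency_penalty and the 1024-token minimum for max_reasoning_tokens when enable_reasoner is set) can be checked on the client before a request is sent. The following is a minimal sketch only: check_generation_params is a hypothetical helper, not part of any SDK, and it encodes nothing beyond the two limits quoted in this listing.

```python
from typing import List, Optional


def check_generation_params(
    enable_reasoner: Optional[bool] = None,
    max_reasoning_tokens: Optional[int] = None,
    frequency_penalty: Optional[float] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the values look valid.

    Hypothetical helper: it only encodes the two limits quoted in the
    field descriptions, not any server-side validation.
    """
    problems: List[str] = []
    # frequency_penalty is documented as a number between -2.0 and 2.0.
    if frequency_penalty is not None and not -2.0 <= frequency_penalty <= 2.0:
        problems.append("frequency_penalty must be between -2.0 and 2.0")
    # The 1024 minimum applies only when enable_reasoner is in use.
    if (
        enable_reasoner
        and max_reasoning_tokens is not None
        and max_reasoning_tokens < 1024
    ):
        problems.append(
            "max_reasoning_tokens must be at least 1024 when enable_reasoner is set"
        )
    return problems
```

Unset (None) values are treated as valid, since every field checked here is optional.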
max_tokens: optional number
The maximum number of tokens to generate. If not set, the model will use its default value.
parallel_tool_calls: optional boolean (deprecated)
Deprecated: Use model_settings to configure parallel tool calls instead. If set to True, enables parallel tool calling. Defaults to False.
put_inner_thoughts_in_kwargs: optional boolean
If set to True, puts ‘inner_thoughts’ as a kwarg in the function call. This helps function-calling performance and the generation of inner thoughts.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
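As an illustration of the three response_format variants, the sketch below builds one payload per variant as a plain dictionary and tells them apart. The "type" values ("text", "json_object", "json_schema") and the contents of json_schema are assumptions modeled on the OpenAI convention; this listing names the fields but not their values. variant_name is a hypothetical helper.

```python
from typing import Any, Dict

# Assumed "type" discriminators; the listing gives field names only.
TEXT_FORMAT: Dict[str, Any] = {"type": "text"}
JSON_OBJECT_FORMAT: Dict[str, Any] = {"type": "json_object"}
JSON_SCHEMA_FORMAT: Dict[str, Any] = {
    "type": "json_schema",
    # Hypothetical payload, shown only to give the variant a body.
    "json_schema": {
        "name": "weather_report",
        "schema": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}


def variant_name(response_format: Dict[str, Any]) -> str:
    """Name the union variant that a response_format payload corresponds to."""
    kind = response_format.get("type")
    # Only the JSON-schema variant carries a json_schema field.
    if kind == "json_schema" and "json_schema" in response_format:
        return "JsonSchemaResponseFormat"
    if kind == "json_object":
        return "JsonObjectResponseFormat"
    if kind == "text":
        return "TextResponseFormat"
    raise ValueError(f"unrecognized response_format: {response_format!r}")
```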
return_logprobs: optional boolean
Whether to return log probabilities of the output tokens. Useful for RL training.
return_token_ids: optional boolean
Whether to return token IDs for all LLM generations via the SGLang native endpoint. Required for multi-turn RL training with loss masking. Only works with the SGLang provider.
strict: optional boolean
Enable strict mode for tool calling. When true, tool schemas include strict: true and additionalProperties: false, guaranteeing tool outputs match JSON schemas.
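To make the effect of strict mode concrete, the sketch below shows one way a tool schema could be rewritten to carry strict: true and additionalProperties: false. It assumes an OpenAI-style tool layout with a top-level "parameters" JSON Schema; to_strict_schema and the example tool are hypothetical and are not the server's actual implementation.

```python
import copy
from typing import Any, Dict


def _close_objects(schema: Any) -> None:
    """Set additionalProperties: false on every object schema, recursively."""
    if not isinstance(schema, dict):
        return
    if schema.get("type") == "object":
        schema["additionalProperties"] = False
    for child in schema.get("properties", {}).values():
        _close_objects(child)
    if "items" in schema:
        _close_objects(schema["items"])


def to_strict_schema(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Return a strict copy of a tool schema, leaving the input untouched.

    Assumes the tool keeps its argument schema under "parameters".
    """
    strict_tool = copy.deepcopy(tool)
    strict_tool["strict"] = True
    _close_objects(strict_tool.get("parameters"))
    return strict_tool


# Hypothetical tool definition used only for illustration.
weather_tool = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
strict_weather_tool = to_strict_schema(weather_tool)
```

Closing each object schema with additionalProperties: false is what lets a schema-constrained decoder reject arguments the tool never declared.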
temperature: optional number
The temperature to use when generating text with the model. A higher temperature will result in more random text.
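Why a higher temperature gives more random text can be seen in the textbook temperature-scaled softmax, sketched below. This is the standard formulation and not necessarily how any particular provider implements sampling; the function name and the example logits are illustrative only.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # illustrative scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.5)  # low temperature
flat = softmax_with_temperature(logits, 2.0)   # high temperature
```

At the lower temperature the top-scoring token takes a larger share of the probability mass; at the higher temperature the distribution flattens, so sampling picks lower-ranked tokens more often.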
tool_call_parser: optional string
SGLang tool call parser name (e.g. ‘glm47’, ‘qwen25’, ‘hermes’). Used by the SGLang native adapter to parse tool calls from raw model output.
Model object { context_window, max_context_window, model, 28 more }
context_window: number (deprecated)
Deprecated: Use the ‘max_context_window’ field instead. The context window size for the model.
model_endpoint_type: "openai" or "anthropic" or "google_ai" or 26 more (deprecated)
Deprecated: Use the ‘provider_type’ field instead. The endpoint type for the model.
enable_reasoner: optional boolean (deprecated)
Deprecated: Whether or not the model should use extended thinking if it is a ‘reasoning’ style model.
frequency_penalty: optional number (deprecated)
Deprecated: Positive values penalize new tokens based on their existing frequency in the text so far.
max_reasoning_tokens: optional number (deprecated)
Deprecated: Configurable thinking budget for extended thinking.
parallel_tool_calls: optional boolean (deprecated)
Deprecated: If set to True, enables parallel tool calling.
put_inner_thoughts_in_kwargs: optional boolean (deprecated)
Deprecated: Puts ‘inner_thoughts’ as a kwarg in the function call.
response_format: optional TextResponseFormat { type } or JsonSchemaResponseFormat { json_schema, type } or JsonObjectResponseFormat { type }
return_logprobs: optional boolean
Whether to return log probabilities of the output tokens. Useful for RL training.
return_token_ids: optional boolean
Whether to return token IDs for all LLM generations via the SGLang native endpoint. Required for multi-turn RL training with loss masking. Only works with the SGLang provider.
strict: optional boolean
Enable strict mode for tool calling. When true, tool schemas include strict: true and additionalProperties: false, guaranteeing tool outputs match JSON schemas.
temperature: optional number (deprecated)
Deprecated: The temperature to use when generating text with the model.
tool_call_parser: optional string
SGLang tool call parser name (e.g. ‘glm47’, ‘qwen25’, ‘hermes’). Used by the SGLang native adapter to parse tool calls from raw model output.
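Because the Model object still carries deprecated fields next to their replacements, client code may want to prefer the new field and fall back to the old one. The sketch below does this over a plain dictionary for the two replacements named in this listing (max_context_window for context_window, provider_type for model_endpoint_type). The helper names and the sample payload are hypothetical.

```python
from typing import Any, Dict, Optional


def effective_context_window(model: Dict[str, Any]) -> Optional[int]:
    """Prefer max_context_window, falling back to the deprecated context_window."""
    value = model.get("max_context_window")
    return value if value is not None else model.get("context_window")


def effective_provider(model: Dict[str, Any]) -> Optional[str]:
    """Prefer provider_type, falling back to the deprecated model_endpoint_type."""
    value = model.get("provider_type")
    return value if value is not None else model.get("model_endpoint_type")


# Hypothetical payload from an older server that only sets the deprecated fields.
legacy_model = {"model": "example-model", "context_window": 8192, "model_endpoint_type": "openai"}
window = effective_context_window(legacy_model)
provider = effective_provider(legacy_model)
```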