List LLM Models
Headers
Bearer authentication, sent as a header of the form Bearer <token>.
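As a minimal sketch of the header described above, the helper below builds the header mapping for a request. The function name `bearer_headers` is hypothetical, and the header name `Authorization` is an assumption based on the standard Bearer scheme; this page only states the `Bearer <token>` form.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Build request headers carrying a Bearer token.

    Assumption: the token travels in the standard `Authorization` header.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {"Authorization": f"Bearer {token}"}
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client is in use.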
Query parameters
Response
If set to True, passes ‘inner_thoughts’ as a kwarg in the function call. This helps with function-calling performance and with the generation of inner thoughts.
The handle for this config, in the format provider/model-name.
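The provider/model-name format can be split on the first slash. The sketch below is illustrative only: `split_handle` is a hypothetical helper, and the assumption that any further slashes belong to the model name is not stated on this page.

```python
def split_handle(handle: str) -> tuple[str, str]:
    """Split a handle of the form provider/model-name into its two parts.

    Assumption: only the first "/" separates provider from model name,
    so any later slashes stay in the model name.
    """
    provider, sep, model_name = handle.partition("/")
    if not sep or not provider or not model_name:
        raise ValueError(f"expected provider/model-name, got {handle!r}")
    return provider, model_name
```

For example, `split_handle("example-provider/example-model")` yields the provider and model name as a tuple; both names here are placeholders.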
Configurable thinking budget for extended thinking. Used for enable_reasoner and also for Google Vertex models like Gemini 2.5 Flash. Minimum value is 1024 when used with enable_reasoner.
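The minimum stated above can be checked on the client side before a config is sent. This is a sketch under assumptions: `check_thinking_budget` and its `budget` parameter are hypothetical names, and only the 1024 minimum with enable_reasoner comes from the description; no other bounds are assumed.

```python
MIN_REASONER_BUDGET = 1024  # minimum stated for use with enable_reasoner


def check_thinking_budget(budget: int, enable_reasoner: bool) -> int:
    """Return the budget unchanged, or raise if it is below the stated minimum.

    The minimum is only enforced when enable_reasoner is True; other
    constraints the server may apply are not modeled here.
    """
    if enable_reasoner and budget < MIN_REASONER_BUDGET:
        raise ValueError(
            f"thinking budget must be at least {MIN_REASONER_BUDGET} "
            f"when enable_reasoner is set, got {budget}"
        )
    return budget
```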
Number between -2.0 and 2.0 (definition from OpenAI). Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
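To make the effect concrete, the toy sketch below subtracts the penalty from each token's score once per prior occurrence. It is a simplified illustration, not the provider's implementation: `apply_frequency_penalty`, the dictionary-of-scores representation, and the sample tokens are all assumptions; only the -2.0 to 2.0 range comes from the description.

```python
from collections import Counter


def apply_frequency_penalty(
    logits: dict[str, float], generated: list[str], penalty: float
) -> dict[str, float]:
    """Lower each token's score in proportion to how often it already appeared.

    Simplified illustration: score - penalty * count_so_far.
    """
    if not -2.0 <= penalty <= 2.0:
        raise ValueError("penalty must be between -2.0 and 2.0")
    counts = Counter(generated)  # missing tokens count as 0
    return {tok: score - penalty * counts[tok] for tok, score in logits.items()}
```

With a positive penalty, a token that has already appeared twice loses twice the penalty from its score, while unseen tokens are untouched; a negative penalty has the opposite effect and encourages repetition.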
Soft control for how verbose model output should be, used for GPT-5 models.
The cost tier for the model (cloud only).