Create Chat Completion

Create a chat completion using a Letta agent via an OpenAI-compatible endpoint. This endpoint provides full OpenAI API compatibility. The agent is selected by the `model` parameter in the request, which should contain an agent ID in the format `agent-...`. When streaming is enabled (`stream=true`), the response is a stream of Server-Sent Events carrying `ChatCompletionChunk` objects.
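Because the endpoint is OpenAI-compatible, the request body follows the standard chat-completions schema. A minimal sketch of building such a payload, where the agent ID and server URL are placeholders rather than real values:

```python
import json

# Placeholder values: replace with your actual agent ID and Letta server URL.
AGENT_ID = "agent-00000000-0000-0000-0000-000000000000"
BASE_URL = "https://your-letta-server.example/v1"

def build_chat_request(messages, stream=False, temperature=None):
    """Build an OpenAI-compatible chat completion payload for a Letta agent.

    The agent is selected by passing its ID in the 'model' field.
    """
    payload = {
        "model": AGENT_ID,   # agent ID in 'agent-...' format selects the agent
        "messages": messages,
        "stream": stream,    # True -> Server-Sent Events of ChatCompletionChunk
    }
    if temperature is not None:
        payload["temperature"] = temperature
    return payload

body = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    stream=True,
)
print(json.dumps(body, indent=2))
```

The payload would then be POSTed to the chat completions path on your server with the `Authorization` header described below.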

Authentication

`Authorization`: Bearer token

Header authentication of the form `Bearer <token>`

Request

This endpoint expects an object.
- `model` (string, Required): ID of the model to use
- `messages` (list of objects, Required): Messages comprising the conversation so far
- `temperature` (double or null, Optional, >=0 and <=2): Sampling temperature
- `top_p` (double or null, Optional, >=0 and <=1): Nucleus sampling parameter
- `n` (integer or null, Optional, >=1, defaults to 1): Number of chat completion choices to generate
- `stream` (boolean or null, Optional, defaults to false): Whether to stream back partial progress
- `stop` (string or list of strings or null, Optional): Sequences where the API will stop generating
- `max_tokens` (integer or null, Optional): Maximum number of tokens to generate
- `presence_penalty` (double or null, Optional, >=-2 and <=2): Presence penalty
- `frequency_penalty` (double or null, Optional, >=-2 and <=2): Frequency penalty
- `user` (string or null, Optional): A unique identifier representing your end-user
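The numeric ranges above can be checked client-side before sending a request. A minimal sketch, assuming a helper of our own naming (not part of the API):

```python
def validate_sampling_params(temperature=None, top_p=None, n=None,
                             presence_penalty=None, frequency_penalty=None):
    """Raise ValueError if an optional parameter is outside its documented range."""
    checks = [
        # (name, value, lower bound, upper bound) per the request schema
        ("temperature", temperature, 0, 2),
        ("top_p", top_p, 0, 1),
        ("n", n, 1, float("inf")),
        ("presence_penalty", presence_penalty, -2, 2),
        ("frequency_penalty", frequency_penalty, -2, 2),
    ]
    for name, value, lo, hi in checks:
        if value is not None and not (lo <= value <= hi):
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")

validate_sampling_params(temperature=0.7, top_p=0.9, n=1)  # passes silently
```

Note that null (`None`) values are simply skipped, since every constrained parameter is optional.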

Response

Successful response
- `id` (string)
- `choices` (list of objects)
- `created` (integer)
- `model` (string)
- `object` (`"chat.completion"`, defaults to `chat.completion`)
- `service_tier` (enum or null): Allowed values:
- `system_fingerprint` (string or null)
- `usage` (object or null)
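A successful non-streaming response can be unpacked like any OpenAI chat completion. A sketch using a hand-built sample body (the field values are illustrative only):

```python
import json

# Illustrative sample response body matching the fields listed above.
sample = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "agent-00000000-0000-0000-0000-000000000000",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hello!"},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
})

def first_message(response_body: str) -> str:
    """Extract the assistant text from the first choice of a chat completion."""
    data = json.loads(response_body)
    assert data["object"] == "chat.completion"
    return data["choices"][0]["message"]["content"]

print(first_message(sample))  # Hello!
```

With `stream=true`, the body is instead a sequence of Server-Sent Events whose `data:` payloads are `ChatCompletionChunk` objects, each carrying a partial delta rather than a complete message.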

Errors