Create Chat Completion

Create a chat completion using a Letta agent via an OpenAI-compatible endpoint. This endpoint provides full OpenAI API compatibility. The agent is selected by the `model` parameter in the request, which should contain an agent ID in the format `agent-...`. When streaming is enabled (`stream=true`), the response is a stream of Server-Sent Events carrying `ChatCompletionChunk` objects.
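Because the endpoint is OpenAI-compatible, the request body follows the standard chat-completions schema. A minimal sketch of building such a payload, where the agent ID and server URL are placeholders rather than real values:

```python
import json

# Placeholder values: replace with your actual agent ID and Letta server URL.
AGENT_ID = "agent-00000000-0000-0000-0000-000000000000"
BASE_URL = "https://your-letta-server.example/v1"

def build_chat_request(messages, stream=False, temperature=None):
    """Build an OpenAI-compatible chat completion payload for a Letta agent.

    The agent is selected by passing its ID in the 'model' field.
    """
    payload = {
        "model": AGENT_ID,   # agent ID in 'agent-...' format selects the agent
        "messages": messages,
        "stream": stream,    # True -> Server-Sent Events of ChatCompletionChunk
    }
    if temperature is not None:
        payload["temperature"] = temperature
    return payload

body = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    stream=True,
)
print(json.dumps(body, indent=2))
```

The payload would then be POSTed to the chat completions path on your server with the `Authorization` header described below.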

Authentication

`Authorization`: Bearer token

Header authentication of the form `Bearer <token>`

Request

This endpoint expects an object.
- `model` (string, Required): ID of the model to use
- `messages` (list of objects, Required): Messages comprising the conversation so far
- `temperature` (double or null, Optional, >=0 and <=2): Sampling temperature
- `top_p` (double or null, Optional, >=0 and <=1): Nucleus sampling parameter
- `n` (integer or null, Optional, >=1, defaults to 1): Number of chat completion choices to generate
- `stream` (boolean or null, Optional, defaults to false): Whether to stream back partial progress
- `stop` (string or list of strings or null, Optional): Sequences where the API will stop generating
- `max_tokens` (integer or null, Optional): Maximum number of tokens to generate
- `presence_penalty` (double or null, Optional, >=-2 and <=2): Presence penalty
- `frequency_penalty` (double or null, Optional, >=-2 and <=2): Frequency penalty
- `user` (string or null, Optional): A unique identifier representing your end-user
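The numeric ranges above can be checked client-side before sending a request. A minimal sketch, assuming a helper of our own naming (not part of the API):

```python
def validate_sampling_params(temperature=None, top_p=None, n=None,
                             presence_penalty=None, frequency_penalty=None):
    """Raise ValueError if an optional parameter is outside its documented range."""
    checks = [
        # (name, value, lower bound, upper bound) per the request schema
        ("temperature", temperature, 0, 2),
        ("top_p", top_p, 0, 1),
        ("n", n, 1, float("inf")),
        ("presence_penalty", presence_penalty, -2, 2),
        ("frequency_penalty", frequency_penalty, -2, 2),
    ]
    for name, value, lo, hi in checks:
        if value is not None and not (lo <= value <= hi):
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")

validate_sampling_params(temperature=0.7, top_p=0.9, n=1)  # passes silently
```

Note that null (`None`) values are simply skipped, since every constrained parameter is optional.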

Response

Successful response
- `id` (string)
- `choices` (list of objects)
- `created` (integer)
- `model` (string)
- `object` (`"chat.completion"`, defaults to `chat.completion`)
- `service_tier` (enum or null): Allowed values:
- `system_fingerprint` (string or null)
- `usage` (object or null)
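A successful non-streaming response can be unpacked like any OpenAI chat completion. A sketch using a hand-built sample body (the field values are illustrative only):

```python
import json

# Illustrative sample response body matching the fields listed above.
sample = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "agent-00000000-0000-0000-0000-000000000000",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hello!"},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
})

def first_message(response_body: str) -> str:
    """Extract the assistant text from the first choice of a chat completion."""
    data = json.loads(response_body)
    assert data["object"] == "chat.completion"
    return data["choices"][0]["message"]["content"]

print(first_message(sample))  # Hello!
```

With `stream=true`, the body is instead a sequence of Server-Sent Events whose `data:` payloads are `ChatCompletionChunk` objects, each carrying a partial delta rather than a complete message.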

Errors