CLI Commands
The letta-evals command-line interface lets you run evaluations, validate configurations, and inspect available components.
Quick overview:
- run - Execute an evaluation suite (most common)
- validate - Check suite configuration without running
- list-extractors - Show available extractors
- list-graders - Show available grader functions
- Exit codes - 0 for pass, 1 for fail (useful for CI/CD)
Typical workflow:
- Validate your suite:
letta-evals validate suite.yaml
- Run evaluation:
letta-evals run suite.yaml --output results/
- Check exit code:
echo $?
(0 = passed, 1 = failed)
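The three steps above can be chained in a small shell function. This is a sketch rather than part of letta-evals: the function name `validate_then_run` and the `LETTA_EVALS` variable (which only lets you point at a different binary) are hypothetical.

```shell
# Hypothetical override for the binary name; defaults to the real CLI.
LETTA_EVALS="${LETTA_EVALS:-letta-evals}"

# Validate a suite, run it, and return the run's exit code
# (0 = passed, 1 = failed).
validate_then_run() {
  suite="$1"
  out_dir="$2"

  # Stop early if the configuration is invalid.
  "$LETTA_EVALS" validate "$suite" || return 1

  # Run the evaluation and save results to the given directory.
  "$LETTA_EVALS" run "$suite" --output "$out_dir"
  status=$?

  if [ "$status" -eq 0 ]; then
    echo "evaluation passed"
  else
    echo "evaluation failed (exit code $status)"
  fi
  return "$status"
}
```

Called as `validate_then_run suite.yaml results/`, the function returns the same exit code as the run, so it can be used directly in a script condition.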
run
Run an evaluation suite.
Arguments
suite.yaml: Path to the suite configuration file (required)
Options
--output, -o
Save results to a directory.
Creates:
- results/header.json: Evaluation metadata
- results/summary.json: Aggregate metrics and configuration
- results/results.jsonl: Per-sample results (one JSON object per line)
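Because results.jsonl holds one JSON object per line, a line count gives the number of evaluated samples without parsing the file. A minimal sketch follows; the function name `count_samples` is hypothetical, and nothing is assumed about the fields inside each JSON object.

```shell
# Print the number of per-sample records in <results_dir>/results.jsonl.
count_samples() {
  results_dir="$1"
  file="$results_dir/results.jsonl"

  if [ ! -f "$file" ]; then
    echo "no results.jsonl in $results_dir" >&2
    return 1
  fi

  # grep -c . counts non-empty lines, so a trailing blank line is ignored.
  # grep exits non-zero when the count is 0; "|| true" keeps that from
  # being treated as an error.
  grep -c . "$file" || true
}
```

For example, `count_samples results/` prints the sample count for a run saved with `--output results/`.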
--quiet, -q
Quiet mode - only show pass/fail result.
Output:
--max-concurrent
Maximum concurrent sample evaluations. Default: 15
Higher values speed up evaluation but use more resources.
--api-key
Letta API key (overrides LETTA_API_KEY environment variable).
--base-url
Letta server base URL (overrides suite config and environment variable).
--project-id
Letta project ID for cloud deployments.
--cached, -c
Path to cached results (JSONL) for re-grading trajectories without re-running the agent.
Use this to test different graders on the same agent trajectories.
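One way to compare graders is to loop over several suite files while reusing a single cached results file. The sketch below assumes each suite file differs only in its grader configuration; the function name `regrade_cached` and the `LETTA_EVALS` override are hypothetical.

```shell
# Hypothetical override for the binary name; defaults to the real CLI.
LETTA_EVALS="${LETTA_EVALS:-letta-evals}"

# Re-grade one cached results file against several suite files.
# Usage: regrade_cached <cached.jsonl> <suite.yaml> [<suite.yaml> ...]
regrade_cached() {
  cached="$1"
  shift

  if [ ! -f "$cached" ]; then
    echo "cached results not found: $cached" >&2
    return 1
  fi

  failed=0
  for suite in "$@"; do
    echo "re-grading with $suite"
    # --cached reuses the stored trajectories, so the agent is not re-run.
    "$LETTA_EVALS" run "$suite" --cached "$cached" || failed=1
  done

  # Non-zero if any suite failed its gate.
  return "$failed"
}
```

A call such as `regrade_cached results/results.jsonl suite_a.yaml suite_b.yaml` (file names are placeholders) returns 0 only if every suite passes.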
--num-runs
Run the evaluation multiple times to measure consistency. Default: 1
Output with multiple runs:
- Each run creates a separate run_N/ directory with individual results
- An aggregate_stats.json file contains statistics across all runs (mean, standard deviation, pass rate)
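After a multi-run evaluation, a quick check of the output layout can confirm that every run was written. The sketch below assumes the run_N/ directories and aggregate_stats.json are created under the directory passed to `--output`; the function name `summarize_runs` is hypothetical.

```shell
# Count run_N/ directories and report whether aggregate_stats.json exists.
summarize_runs() {
  out_dir="$1"
  n=0

  for d in "$out_dir"/run_*/; do
    # When nothing matches, the pattern stays literal; skip it.
    [ -d "$d" ] || continue
    n=$((n + 1))
  done
  echo "runs: $n"

  if [ -f "$out_dir/aggregate_stats.json" ]; then
    echo "aggregate: present"
  else
    echo "aggregate: missing"
  fi
}
```

For instance, after `letta-evals run suite.yaml --num-runs 3 --output results/`, calling `summarize_runs results/` reports how many run directories were found.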
Examples
Basic run:
Save results:
Letta Cloud:
Quiet CI mode:
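As a sketch, the four cases above can be expressed with only the flags documented in this section. The wrapper function names and the `LETTA_EVALS` override are hypothetical, and the Letta Cloud variant assumes the API key and project ID are passed explicitly rather than read from the environment.

```shell
# Hypothetical override for the binary name; defaults to the real CLI.
LETTA_EVALS="${LETTA_EVALS:-letta-evals}"

# Basic run: evaluate the suite with default settings.
basic_run() { "$LETTA_EVALS" run "$1"; }

# Save results: write header.json, summary.json and results.jsonl to a directory.
save_run() { "$LETTA_EVALS" run "$1" --output "$2"; }

# Letta Cloud: pass credentials and project explicitly (assumed usage).
cloud_run() {
  "$LETTA_EVALS" run "$1" --api-key "$LETTA_API_KEY" --project-id "$LETTA_PROJECT_ID"
}

# Quiet CI mode: print only the pass/fail result.
quiet_run() { "$LETTA_EVALS" run "$1" --quiet; }
```

Each wrapper returns the exit code of the underlying command, for example `quiet_run suite.yaml && echo ok`.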
Exit Codes
- 0: Evaluation passed (gate criteria met)
- 1: Evaluation failed (gate criteria not met or error)
validate
Validate a suite configuration without running it.
Checks:
- YAML syntax is valid
- Required fields are present
- Paths exist
- Configuration is consistent
- Grader/extractor combinations are valid
Output on success:
Output on error:
list-extractors
List all available extractors.
Output:
list-graders
List all available grader functions.
Output:
help
Show help information.
Show help for a specific command:
Environment Variables
LETTA_API_KEY
API key for Letta authentication.
LETTA_BASE_URL
Letta server base URL.
LETTA_PROJECT_ID
Letta project ID (for cloud).
OPENAI_API_KEY
OpenAI API key (for rubric graders).
Configuration Priority
Configuration values are resolved in this order (highest to lowest priority):
- CLI arguments (--api-key, --base-url, --project-id)
- Suite YAML configuration
- Environment variables
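The precedence above amounts to "first non-empty value wins". The following sketch illustrates that rule only; it is not how letta-evals implements it, and the function name `resolve_setting` is hypothetical.

```shell
# Print the first non-empty value among:
#   $1 = CLI argument, $2 = suite YAML value, $3 = environment value.
resolve_setting() {
  cli_value="$1"
  suite_value="$2"
  env_value="$3"

  if [ -n "$cli_value" ]; then
    printf '%s\n' "$cli_value"
  elif [ -n "$suite_value" ]; then
    printf '%s\n' "$suite_value"
  else
    printf '%s\n' "$env_value"
  fi
}
```

With no CLI value, `resolve_setting "" "https://suite.example" "https://env.example"` prints the suite value; passing a non-empty first argument would override both.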
Using in CI/CD
GitHub Actions
GitLab CI
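In either system, the job's script step could call a small wrapper like the one below and rely on the exit code to pass or fail the job. This is a sketch under assumptions: the function name `ci_eval` and the `LETTA_EVALS` override are hypothetical, the job is assumed to target a deployment that needs `LETTA_API_KEY`, and `--quiet` and `--output` are assumed to work together.

```shell
# Hypothetical override for the binary name; defaults to the real CLI.
LETTA_EVALS="${LETTA_EVALS:-letta-evals}"

# Run a suite in CI and propagate the evaluation's exit code.
# Usage: ci_eval <suite.yaml> [<output_dir>]
ci_eval() {
  suite="$1"
  out_dir="${2:-results/}"

  # Fail fast if the secret was not injected into the job environment.
  if [ -z "${LETTA_API_KEY:-}" ]; then
    echo "LETTA_API_KEY is not set" >&2
    return 1
  fi

  # --quiet keeps the log to the pass/fail result;
  # --output keeps the result files so the job can archive them.
  "$LETTA_EVALS" run "$suite" --quiet --output "$out_dir"
}
```

A job step would then end with `ci_eval suite.yaml results/`, so exit code 1 from the evaluation fails the pipeline.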
Debugging
Common Issues
“Agent file not found”
“Connection refused”
“Invalid API key”
Next Steps
- Understanding Results - Interpreting evaluation output
- Suite YAML Reference - Complete configuration options
- Getting Started - Complete tutorial with examples