Letta Evals Documentation
Welcome to the documentation for Letta Evals Kit, a framework for evaluating Letta AI agents.
Table of Contents
Getting Started
- Getting Started - Installation, first evaluation, and core concepts
Core Concepts
- Overview - Understanding the evaluation framework
- Suites - Evaluation suite configuration
- Datasets - Creating and managing test datasets
- Targets - What you’re evaluating
- Graders - How responses are scored
- Extractors - Extracting submissions from agent output
- Gates - Pass/fail criteria
Graders
- Grader Overview - Understanding grader types
- Tool Graders - Built-in and custom function graders
- Rubric Graders - LLM-as-judge evaluation
- Multi-Metric Grading - Evaluating with multiple metrics
Extractors
- Extractor Overview - Understanding extractors
- Built-in Extractors - All available extractors
- Custom Extractors - Writing your own extractors
Configuration
- Suite YAML Reference - Complete YAML schema
- Target Configuration - Target setup options
- Grader Configuration - Grader parameters
- Environment Variables - Environment setup
Advanced Usage
- Custom Graders - Writing custom grading functions
- Multi-Turn Conversations - Testing conversational memory and state
- Agent Factories - Programmatic agent creation
- Multi-Model Evaluation - Testing across models
- Setup Scripts - Pre-evaluation setup
- Memory Block Testing - Testing agent memory
- Result Streaming - Real-time results and caching
Results & Metrics
- Understanding Results - Result structure and interpretation
- Metrics - Aggregate statistics
- Output Formats - JSON, JSONL, and console output
CLI Reference
Examples
- Example Walkthroughs - Detailed example explanations
API Reference
- Data Models - Pydantic models reference
- Decorators - The @grader and @extractor decorators