---
title: Letta Evals | Letta Docs
description: Introduction to Letta's evaluation framework for testing and measuring agent performance.
---

**Systematic testing for stateful AI agents.** Validate changes, prevent regressions, and ship with confidence.

Test agent memory, tool usage, multi-turn conversations, and state evolution with automated grading and pass/fail gates.

**Ready to start?** Jump to [Getting Started](/guides/evals/getting-started/index.md) or learn the [Core Concepts](/guides/evals/concepts/overview/index.md) first.

## Core Concepts

Understand the building blocks of evaluations:

- [Suites](/guides/evals/concepts/suites/index.md) - Configure your evaluation
- [Datasets](/guides/evals/concepts/datasets/index.md) - Define test cases
- [Targets](/guides/evals/concepts/targets/index.md) - Specify the agent to test
- [Graders](/guides/evals/concepts/graders/index.md) - Score agent outputs
- [Extractors](/guides/evals/concepts/extractors/index.md) - Extract content from responses
- [Gates](/guides/evals/concepts/gates/index.md) - Set pass/fail criteria

### Grading & Extraction

Choose how to score your agents:

- [Tool Graders](/guides/evals/graders/tool-graders/index.md) - Fast, deterministic grading with Python functions
- [Rubric Graders](/guides/evals/graders/rubric-graders/index.md) - Flexible LLM-as-judge evaluation
- [Built-in Extractors](/guides/evals/extractors/builtin/index.md) - Pre-built content extractors
- [Multi-Metric Grading](/guides/evals/graders/multi-metric/index.md) - Evaluate multiple dimensions

### Advanced

- [Custom Graders](/guides/evals/advanced/custom-graders/index.md) - Write your own grading logic
- [Custom Extractors](/guides/evals/extractors/custom/index.md) - Build custom extractors
- [Multi-Turn Conversations](/guides/evals/advanced/multi-turn-conversations/index.md) - Test memory and state
- [Suite YAML Reference](/guides/evals/configuration/suite-yaml/index.md) - Complete configuration schema

### Reference

- [CLI Commands](/guides/evals/cli/commands/index.md) - Command-line interface
- [Understanding Results](/guides/evals/results/overview/index.md) - Interpret metrics
- [Troubleshooting](/guides/evals/troubleshooting/index.md) - Common issues and solutions

## Resources

- **[GitHub Repository](https://github.com/letta-ai/letta-evals)** - Source code, issues, and contributions
- **[PyPI Package](https://pypi.org/project/letta-evals/)** - Install with `pip install letta-evals`
