Custom Extractors
Custom Extractors
Section titled “Custom Extractors”Create your own extractors to pull exactly what you need from agent trajectories.
While built-in extractors cover common cases (last assistant message, tool arguments, memory blocks), custom extractors let you implement specialized extraction logic for your specific use case.
Why Custom Extractors?
Section titled “Why Custom Extractors?”Use custom extractors when you need to:
- Extract structured data: Parse JSON fields from agent responses
- Filter specific patterns: Extract code blocks, URLs, or formatted content
- Combine data sources: Merge information from multiple messages or memory blocks
- Count occurrences: Track how many times something happened in the conversation
- Complex logic: Implement domain-specific extraction that built-ins can’t handle
Example: You want to test if your agent correctly stores fruit preferences in memory using the memory_insert tool. A custom extractor can grab the tool call arguments, and a custom grader can verify the fruit name is in the right memory block.
Quick Example
Section titled “Quick Example”Here’s a real custom extractor that pulls memory_insert tool call arguments:
from typing import Listfrom letta_client import LettaMessageUnion, ToolCallMessagefrom letta_evals.decorators import extractor
@extractordef memory_insert_extractor(trajectory: List[List[LettaMessageUnion]], config: dict) -> str: """Extract memory_insert tool call arguments from trajectory.""" for turn in trajectory: for message in turn: if isinstance(message, ToolCallMessage) and message.tool_call.name == "memory_insert": return message.tool_call.arguments
return "{}" # Return empty JSON if not foundThis extractor:
- Loops through all conversation turns
- Finds
ToolCallMessageobjects - Checks if the tool is
memory_insert - Returns the JSON arguments
- Returns
"{}"if no matching tool call found
You can then pair this with a custom grader to verify the arguments are correct (see Custom Graders).
Basic Structure
Section titled “Basic Structure”from typing import List, Optionalfrom letta_client import LettaMessageUnion, AgentStatefrom letta_evals.decorators import extractor
@extractordef my_extractor( trajectory: List[List[LettaMessageUnion]], config: dict, agent_state: Optional[AgentState] = None) -> str: """Your custom extraction logic.""" # Extract and return content return extracted_textThe @extractor Decorator
Section titled “The @extractor Decorator”The @extractor decorator registers your function:
from letta_evals.decorators import extractor
@extractor # Makes this available as "my_extractor"def my_extractor(trajectory, config, agent_state=None): ...Function Signature
Section titled “Function Signature”Required Parameters
Section titled “Required Parameters”trajectory: List of conversation turns, each containing messagesconfig: Dictionary with extractor configuration from YAML
Optional Parameters
Section titled “Optional Parameters”agent_state: Agent state (only needed if extracting from memory blocks or other agent state). Most extractors only need the trajectory.
Return Value
Section titled “Return Value”Must return a string - the extracted content to be graded.
Trajectory Structure
Section titled “Trajectory Structure”The trajectory is a list of turns:
[ # Turn 1 [ UserMessage(...), AssistantMessage(...), ToolCallMessage(...), ToolReturnMessage(...) ], # Turn 2 [ AssistantMessage(...) ]]Message types:
UserMessage: User inputAssistantMessage: Agent responseToolCallMessage: Tool invocationToolReturnMessage: Tool resultSystemMessage: System messages
Configuration
Section titled “Configuration”Access extractor config via the config parameter:
extractor: my_extractorextractor_config: max_length: 100 # Truncate output at 100 chars include_metadata: true # Include metadata in extraction@extractordef my_extractor(trajectory, config, agent_state=None): max_length = config.get("max_length", 500) include_metadata = config.get("include_metadata", False) ...Examples
Section titled “Examples”Extract Last N Messages
Section titled “Extract Last N Messages”from letta_evals.decorators import extractorfrom letta_evals.extractors.utils import get_assistant_messages, flatten_content
@extractordef last_n_messages(trajectory, config, agent_state=None): """Extract the last N assistant messages.""" n = config.get("n", 3) messages = get_assistant_messages(trajectory) last_n = messages[-n:] if len(messages) >= n else messages contents = [flatten_content(msg.content) for msg in last_n] return "\n".join(contents)Usage:
extractor: last_n_messages # Use custom extractorextractor_config: n: 3 # Extract last 3 assistant messagesExtract JSON Field
Section titled “Extract JSON Field”import jsonfrom letta_evals.decorators import extractorfrom letta_evals.extractors.utils import get_assistant_messages, flatten_content
@extractordef json_field(trajectory, config, agent_state=None): """Extract a specific field from JSON response.""" field_name = config.get("field", "result") messages = get_assistant_messages(trajectory)
if not messages: return ""
content = flatten_content(messages[-1].content)
try: data = json.loads(content) return str(data.get(field_name, "")) except json.JSONDecodeError: return ""Usage:
extractor: json_field # Parse JSON from agent responseextractor_config: field: result # Extract the "result" field from JSONExtract Code Blocks
Section titled “Extract Code Blocks”import refrom letta_evals.decorators import extractorfrom letta_evals.extractors.utils import get_assistant_messages, flatten_content
@extractordef code_blocks(trajectory, config, agent_state=None): """Extract all code blocks from messages.""" language = config.get("language", None) # Optional: filter by language messages = get_assistant_messages(trajectory)
code_pattern = r'```(?:(\w+)\n)?(.*?)```' all_code = []
for msg in messages: content = flatten_content(msg.content) matches = re.findall(code_pattern, content, re.DOTALL)
for lang, code in matches: if language is None or lang == language: all_code.append(code.strip())
return "\n\n".join(all_code)Usage:
extractor: code_blocks # Extract code from markdown blocksextractor_config: language: python # Optional: only extract Python code blocksExtract Tool Call Count
Section titled “Extract Tool Call Count”from letta_client import ToolCallMessagefrom letta_evals.decorators import extractor
@extractordef tool_call_count(trajectory, config, agent_state=None): """Count how many times a specific tool was called.""" tool_name = config.get("tool_name") count = 0
for turn in trajectory: for message in turn: if isinstance(message, ToolCallMessage): if tool_name is None or message.tool_call.name == tool_name: count += 1
return str(count)Usage:
extractor: tool_call_count # Count tool invocationsextractor_config: tool_name: search # Optional: count only "search" tool callsExtract Multiple Memory Blocks
Section titled “Extract Multiple Memory Blocks”from letta_evals.decorators import extractor
@extractordef multiple_memory_blocks(trajectory, config, agent_state=None): """Extract and concatenate multiple memory blocks.""" if agent_state is None: return ""
block_labels = config.get("block_labels", ["human", "persona"]) separator = config.get("separator", "\n---\n")
blocks = [] for block in agent_state.memory.blocks: if block.label in block_labels: blocks.append(f"{block.label}: {block.value}")
return separator.join(blocks)Usage:
extractor: multiple_memory_blocks # Combine multiple memory blocksextractor_config: block_labels: [human, persona] # Which blocks to extract separator: "\n---\n" # How to separate them in outputHelper Utilities
Section titled “Helper Utilities”The framework provides helper functions:
get_assistant_messages
Section titled “get_assistant_messages”from letta_evals.extractors.utils import get_assistant_messages
messages = get_assistant_messages(trajectory)# Returns list of AssistantMessage objectsget_last_turn_messages
Section titled “get_last_turn_messages”from letta_evals.extractors.utils import get_last_turn_messagesfrom letta_client import AssistantMessage
messages = get_last_turn_messages(trajectory, AssistantMessage)# Returns assistant messages from last turnflatten_content
Section titled “flatten_content”from letta_evals.extractors.utils import flatten_content
text = flatten_content(message.content)# Converts complex content to plain textAgent State Requirements
Section titled “Agent State Requirements”If your extractor needs agent state, include it in the signature:
@extractordef my_extractor(trajectory, config, agent_state: Optional[AgentState] = None): if agent_state is None: raise RuntimeError("This extractor requires agent_state")
# Use agent_state.memory.blocks, etc. ...The runner will automatically fetch agent state when your extractor is used.
Note: Fetching agent state adds overhead. Only use when necessary.
Using Custom Extractors
Section titled “Using Custom Extractors”Method 1: Custom Evaluators File
Section titled “Method 1: Custom Evaluators File”Create custom_evaluators.py:
from letta_evals.decorators import extractor
@extractordef my_extractor(trajectory, config, agent_state=None): ...The file will be discovered automatically if in the same directory.
Method 2: Setup Script
Section titled “Method 2: Setup Script”Use a setup script to import custom extractors before the suite runs:
from letta_evals.models import SuiteSpecimport custom_extractors # Imports and registers your @extractor functions
def prepare_environment(suite: SuiteSpec) -> None: # Runs before evaluation starts passsetup_script: setup.py:prepare_environment # Import custom extractors
graders: my_metric: extractor: my_extractor # Now available from custom_extractorsTesting Your Extractor
Section titled “Testing Your Extractor”from letta_client import AssistantMessage
# Mock trajectorytrajectory = [ [ AssistantMessage( content="The answer is 42", role="assistant" ) ]]
config = {"max_length": 100}result = my_extractor(trajectory, config)print(f"Extracted: {result}")Best Practices
Section titled “Best Practices”- Handle empty trajectories: Check if messages exist
- Return strings: Always return a string, not None
- Use config for flexibility: Make behavior configurable
- Document required config: Explain config parameters
- Handle errors gracefully: Return empty string on error
- Keep it fast: Extractors run for every sample
- Use helper utilities: Leverage built-in functions
Next Steps
Section titled “Next Steps”- Built-in Extractors - Learn from examples
- Custom Graders - Pair with custom grading
- Core Concepts - How extractors work