Multi-Turn Conversations
Multi-turn conversations allow you to test how agents handle context across multiple exchanges.
This is essential for stateful agents where behavior depends on conversation history.
Why Use Multi-Turn?
Multi-turn conversations enable testing that single-turn prompts cannot:
- Memory storage: Verify agents persist information to memory blocks
- Tool call sequences: Test multi-step workflows
- Context retention: Ensure agents remember details from earlier turns
- State evolution: Track how agent state changes across interactions
- Conversational coherence: Test if agents maintain context appropriately
Format
Single-Turn (Default)
Multi-Turn
The agent processes each input in sequence, with state carrying over between turns.
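To make the difference between the two formats concrete, here is a minimal Python sketch. The sample shapes (an `input` that is either a string or a list of strings, plus a `ground_truth` field) are assumptions for illustration, and `EchoAgent` is a toy stand-in rather than a real agent client.

```python
from typing import List, Union

# Hypothetical sample shapes: the field names "input" and "ground_truth"
# are assumptions for illustration, not a confirmed schema.
single_turn = {"input": "What is the capital of France?", "ground_truth": "Paris"}
multi_turn = {
    "input": ["My name is Alice.", "What is my name?"],
    "ground_truth": "Alice",
}


def as_turns(sample_input: Union[str, List[str]]) -> List[str]:
    """Normalize a sample's input so a single string becomes a one-turn list."""
    if isinstance(sample_input, str):
        return [sample_input]
    return list(sample_input)


class EchoAgent:
    """Toy stand-in for a stateful agent: it keeps every message it has seen."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def send(self, message: str) -> str:
        self.history.append(message)
        return f"turn {len(self.history)}: {message}"


def run_sample(sample: dict) -> List[str]:
    """Send each turn in order to one agent so state carries over between turns."""
    agent = EchoAgent()
    return [agent.send(turn) for turn in as_turns(sample["input"])]


responses = run_sample(multi_turn)
```

The key design point is that one agent instance handles every turn of a sample, so whatever it learned in turn 1 is still available in turn 2.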
Example 1: Memory Recall Testing
Test if the agent remembers information across turns:
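As a rough sketch of what a recall test checks, the Python below uses a hypothetical sample (the `input` and `ground_truth` field names are assumptions) and a toy `RecallAgent` in place of a real agent. Only the final reply is graded, since the earlier turn exists to set up context.

```python
from typing import Dict, List

# Hypothetical sample: field names are assumptions for illustration.
sample = {
    "input": [
        "Hi, my name is Alice and I work in Berlin.",
        "What city do I work in?",
    ],
    "ground_truth": "Berlin",
}


class RecallAgent:
    """Toy agent that remembers the city mentioned after 'I work in'."""

    def __init__(self) -> None:
        self.facts: Dict[str, str] = {}

    def send(self, message: str) -> str:
        # Questions are answered from stored facts; statements may add facts.
        if message.endswith("?"):
            return f"You work in {self.facts.get('city', 'an unknown city')}."
        marker = "I work in "
        if marker in message:
            self.facts["city"] = message.split(marker, 1)[1].rstrip(".")
            return "Got it."
        return "Okay."


def passes_recall(sample: dict) -> bool:
    """Run all turns against one agent and grade only the last reply."""
    agent = RecallAgent()
    replies: List[str] = [agent.send(turn) for turn in sample["input"]]
    return sample["ground_truth"] in replies[-1]
```

Dropping the first turn from `sample["input"]` makes the check fail, which is the behavior a recall test is meant to expose.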
Suite configuration:
Example 2: Memory Correction Testing
Test if the agent correctly updates memory when users correct themselves:
Suite configuration:
Key difference: The memory_block extractor verifies that the agent actually stored the corrected information in memory, not just that it responded correctly. This tests real memory persistence.
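The following Python sketch illustrates why the two kinds of check can disagree. `SloppyAgent` is a deliberately buggy toy, and the `last_assistant` and `memory_block` functions here are simplified stand-ins named after the extractors, not the library's implementations; the `"human"` block label is also an assumption.

```python
from typing import Dict, List


class SloppyAgent:
    """Toy agent that replies correctly but never corrects its stored memory."""

    def __init__(self) -> None:
        self.memory: Dict[str, str] = {}  # stands in for memory blocks, keyed by label
        self.replies: List[str] = []

    def send(self, message: str) -> str:
        color = message.rstrip(".").split()[-1]
        # Bug on purpose: the block is written once and later corrections are ignored.
        self.memory.setdefault("human", f"favorite color: {color}")
        reply = f"Noted, your favorite color is {color}."
        self.replies.append(reply)
        return reply


def last_assistant(agent: SloppyAgent) -> str:
    """Stand-in for a response extractor: the final thing the agent said."""
    return agent.replies[-1]


def memory_block(agent: SloppyAgent, label: str) -> str:
    """Stand-in for a memory extractor: the stored contents of one block."""
    return agent.memory.get(label, "")


agent = SloppyAgent()
for turn in ["My favorite color is blue.", "Actually, my favorite color is green."]:
    agent.send(turn)

response_ok = "green" in last_assistant(agent)       # the reply mentions the correction
memory_ok = "green" in memory_block(agent, "human")  # the stored block still says blue
```

Here the response-based check passes while the memory-based check fails, so only the memory check catches the stale stored value.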
When to Test Memory Blocks vs. Responses
Use the last_assistant or all_assistant extractors when:
- Testing what the agent says in conversation
- Verifying response content and phrasing
- Checking conversational coherence
Use the memory_block extractor when:
- Verifying information was actually stored in memory
- Testing memory updates and corrections
- Validating persistent state changes
- Ensuring the agent’s internal state is correct
See the multiturn-memory-block-extractor example for a complete working implementation.
Next Steps
- Datasets - Creating test datasets
- Extractors - Extracting from trajectories
- Targets - Agent lifecycle and testing behavior