Open
Labels
core: [Component] This issue is related to the core interface and implementation
Description
Is your feature request related to a problem? Please describe.
Yes. Currently, building resilient multi-agent systems with AgentTool requires significant custom code:
- No built-in timeout mechanism: developers must write custom wrappers to add timeout protection
- No automatic fallback: routing to an alternative agent depends on LLM reasoning and prompt engineering
- No result validation: there is no way to verify that a sub-agent's result is complete
- Complexity leakage: all sub-agent events are exposed, so internal complexity is hard to hide from users
Example Problem:
When a sub-agent times out or fails, the parent agent must manually handle the error, decide whether to retry, choose an alternative agent, and format user-friendly error messages. This requires:
- Custom TimeoutAgentTool wrapper
- Complex prompt engineering for routing
- Manual error handling logic
- Additional agents for error recovery
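To illustrate the kind of wrapper developers end up writing today, here is a minimal sketch. It does not use the ADK API: `AgentFn`, `run_with_timeout`, `slow_agent` and `broken_agent` are hypothetical names, and the sub-agent is modelled as a plain async callable. It only shows the timeout and error-conversion logic that each project currently has to reimplement.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical stand-in for a sub-agent call; this is NOT an ADK type.
AgentFn = Callable[[str], Awaitable[Any]]


async def run_with_timeout(agent_fn: AgentFn, request: str, timeout: float) -> dict:
    """Run agent_fn and turn a timeout or exception into a structured result."""
    try:
        result = await asyncio.wait_for(agent_fn(request), timeout=timeout)
        return {"status": "ok", "result": result}
    except asyncio.TimeoutError:
        # wait_for cancels the pending call before raising.
        return {"status": "timeout", "error": f"sub-agent exceeded {timeout}s"}
    except Exception as exc:
        return {"status": "error", "error": str(exc)}


async def slow_agent(request: str) -> str:
    # Simulates a sub-agent that takes longer than a short timeout.
    await asyncio.sleep(0.5)
    return f"answer to {request}"


async def broken_agent(request: str) -> str:
    # Simulates a sub-agent that fails outright.
    raise ValueError("upstream service unavailable")


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(slow_agent, "q", timeout=0.05)))
    print(asyncio.run(run_with_timeout(broken_agent, "q", timeout=0.05)))
```

Even this small version leaves retry, fallback routing and user-facing error formatting to the caller, which is the gap the request describes.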
Impact:
- High barrier to entry for building resilient multi-agent systems
- Inconsistent error handling across different implementations
- Difficult to test timeout and failure scenarios
- Poor user experience when errors occur
Describe the solution you'd like
Add built-in resilience features to AgentTool:
1. Built-in Timeout Support
AgentTool(
    agent=sub_agent,
    timeout=30.0,  # Timeout in seconds
    timeout_handler='error',  # How to handle a timeout: 'error', 'fallback', or 'retry'
)
2. Automatic Fallback Configuration
AgentTool(
    agent=primary_agent,
    fallback_agent=fallback_agent,
    fallback_on_timeout=True,
    fallback_on_error=True,
    fallback_on_partial_result=False,
)
3. Result Validation
AgentTool(
    agent=sub_agent,
    validate_result=True,
    required_fields=['summary', 'sources'],  # For structured output
    result_validator=lambda r: len(r.get('summary', '')) > 100,
)
4. Event Filtering
AgentTool(
    agent=sub_agent,
    stream_events=False,  # True streams all events; False streams only the final result
    hide_intermediate_steps=True,  # Hide tool calls, show only results
)
5. Partial Result Handling
AgentTool(
    agent=sub_agent,
    handle_partial_results='retry',  # How to handle partial results: 'error', 'retry', or 'return'
    partial_result_threshold=0.8,  # 0.8 means 80% complete counts as valid
)
Describe alternatives you've considered
Alternative 1: Custom Wrappers (Current Approach)
Pros:
- Works today without ADK changes
- Flexible and customizable
- Non-breaking
Cons:
- Requires significant custom code
- Inconsistent across implementations
- Hard to maintain
- High barrier to entry
Alternative 2: Plugin-Based Solution
Pros:
- Extensible
- Doesn't require ADK core changes
Cons:
- Still requires custom code
- Less discoverable
- More complex API
Alternative 3: Built-in Support (Proposed)
Pros:
- Simple, consistent API
- Low barrier to entry
- Better developer experience
- Easier to test
Cons:
- Requires ADK core changes
- Need to maintain backward compatibility
Recommendation: Built-in support is the best long-term solution, as it makes resilience patterns a first-class feature.
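To make the proposed semantics more concrete, the sketch below shows one way timeout, fallback and result validation might compose. It is an assumption about the intended behaviour, not an ADK implementation: `resilient_call`, `is_valid`, `stalled_primary` and `backup` are hypothetical names, sub-agents are modelled as async callables returning a dict, and only the parameter names `timeout`, `fallback_on_timeout`, `fallback_on_error`, `required_fields` and `result_validator` are taken from the proposal.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Hypothetical stand-ins; these are NOT ADK types.
AgentFn = Callable[[str], Awaitable[dict]]
Validator = Callable[[dict], bool]


def is_valid(result: dict, required_fields: Sequence[str],
             validator: Optional[Validator]) -> bool:
    """Check required keys first, then the optional custom validator."""
    if any(field not in result for field in required_fields):
        return False
    return validator(result) if validator is not None else True


async def resilient_call(primary: AgentFn, request: str, *, timeout: float,
                         fallback: Optional[AgentFn] = None,
                         fallback_on_timeout: bool = True,
                         fallback_on_error: bool = True,
                         required_fields: Sequence[str] = (),
                         result_validator: Optional[Validator] = None) -> dict:
    """Try primary; on timeout or error, optionally try fallback; then validate."""
    try:
        result = await asyncio.wait_for(primary(request), timeout=timeout)
    except asyncio.TimeoutError:
        if fallback is None or not fallback_on_timeout:
            raise
        result = await asyncio.wait_for(fallback(request), timeout=timeout)
    except Exception:
        if fallback is None or not fallback_on_error:
            raise
        result = await asyncio.wait_for(fallback(request), timeout=timeout)
    if not is_valid(result, required_fields, result_validator):
        raise ValueError("sub-agent result failed validation")
    return result


async def stalled_primary(request: str) -> dict:
    # Simulates a primary agent that never answers in time.
    await asyncio.sleep(0.5)
    return {"summary": "late", "sources": []}


async def backup(request: str) -> dict:
    # Simulates a fallback agent that returns a complete structured result.
    return {"summary": "s" * 120, "sources": ["doc-1"]}


if __name__ == "__main__":
    result = asyncio.run(resilient_call(
        stalled_primary, "q", timeout=0.05, fallback=backup,
        required_fields=["summary", "sources"],
        result_validator=lambda r: len(r.get("summary", "")) > 100,
    ))
    print(result["sources"])
```

In this sketch a failed validation raises rather than triggering the fallback, which loosely mirrors `fallback_on_partial_result=False` in the proposal. Retry and event filtering are left out to keep the example short.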
Additional context
Sample Implementation
I've created a working sample (#4086) that demonstrates:
- Custom TimeoutAgentTool wrapper
- Integration with ReflectAndRetryToolPlugin
- Prompt-based dynamic routing
- Error recovery patterns