Skip to content

Conversation

@moonbox3
Copy link
Contributor

@moonbox3 moonbox3 commented Jan 5, 2026

Motivation and Context

Tools with approval_mode="always_require" were not executed after user approval. Instead, the response showed "Changes confirmed and applied successfully!" without actually invoking the tool.

When AG-UI received an approval payload ({"accepted": true}), it was:

  1. Converted to TextContent with is_tool_result flag, routed to HumanInTheLoopOrchestrator
  2. The orchestrator only handled confirm_changes (predictive state) approvals, not real function approvals
  3. sanitize_tool_history injected synthetic "skipped" results, causing model API errors

The fixes are in:

  • _message_adapters.py: Convert approval payloads to FunctionApprovalResponseContent (which the agent framework knows how to execute). Remove any existing tool results for the call_id since the framework will re-execute.
  • _message_hygiene.py: Recognize FunctionApprovalResponseContent and skip synthetic result injection for those call_ids.

Additionally, for 3010:

Shared state wasn't properly handled when updated via the UI.

Description

Code Cleanup

Refactored the agent_framework_ag_ui package to reduce duplication and improve maintainability:

New modules under _orchestration/:

  • _helpers.py: Extracted orchestration helper functions (message selection, tool call tracking, approval handling)
  • _predictive_state.py: Consolidated predictive state logic into PredictiveStateHandler class

Improvements:

  • Added shared utilities to _utils.py: safe_json_parse, get_role_value, role constants
  • Replaced Any types with proper SDK types (ChatMessage, Contents) for better type safety
  • Made ConfirmationStrategy properly abstract with @abstractmethod decorators on properties
  • Reduced code complexity in _orchestrators.py

All existing tests pass (251 tests).

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

@moonbox3
Copy link
Contributor Author

moonbox3 commented Jan 5, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/ag-ui/agent_framework_ag_ui
   _confirmation_strategies.py101595%210–211, 213, 215, 217
   _events.py2621195%197, 216, 245, 259, 282, 362–363, 392, 403, 426, 482
   _message_adapters.py4494589%68, 78–79, 88–91, 157, 221–222, 259, 263, 266, 269, 273, 290, 307, 329, 365, 377, 428, 445–446, 507, 511, 517, 525–526, 528, 532–535, 548, 637–640, 642, 759, 763, 766, 769, 777–778
   _orchestrators.py3594687%104, 113, 180, 413, 468–470, 479–480, 488, 490–492, 496, 505, 538, 540–542, 544–547, 549, 551–552, 554, 556–561, 635–637, 641–646, 690–692, 695
   _utils.py95792%70, 73, 86, 109, 155, 254, 259
packages/ag-ui/agent_framework_ag_ui/_orchestration
   _helpers.py1865470%60, 109, 112, 149, 152, 156, 174, 177, 181, 198, 201, 204, 224, 236, 240, 255, 260, 291–308, 310–313, 317–318, 332, 351, 353–355, 362, 382, 384, 387–391
   _predictive_state.py98980%5–8, 10, 12, 14, 17, 20, 31–36, 38, 40–41, 43, 57–58, 60–62, 64–71, 73, 75, 84–89, 91, 105–107, 109–110, 117–120, 122, 124–125, 127, 129, 138, 140–145, 147–148, 150–154, 156, 158, 172, 174–177, 179–182, 184, 186–190, 192, 194, 204–206, 213–214, 216, 226, 228–230
   _state_manager.py50198%95
TOTAL17621288683% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
2614 154 💤 0 ❌ 0 🔥 58.066s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug where tools with approval_mode="always_require" were not executed after user approval. The issue occurred because approval payloads ({"accepted": true}) were converted to generic TextContent instead of FunctionApprovalResponseContent, preventing the agent framework from executing the approved tools.

Key Changes

  • Modified approval payload handling to create FunctionApprovalResponseContent objects that the framework can execute
  • Added logic to filter out stale tool results when re-executing approved tools
  • Updated message hygiene to recognize and properly handle approval responses

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 12 comments.

File Description
python/packages/ag-ui/tests/test_agent_wrapper_comprehensive.py Added integration tests for approval and rejection flows to verify tools execute correctly
python/packages/ag-ui/agent_framework_ag_ui/_orchestration/_message_hygiene.py Added logic to detect FunctionApprovalResponseContent and skip synthetic result injection for approved calls
python/packages/ag-ui/agent_framework_ag_ui/_message_adapters.py Modified to convert approval payloads to FunctionApprovalResponseContent and remove stale tool results
Comments suppressed due to low confidence (2)

python/packages/ag-ui/agent_framework_ag_ui/_message_adapters.py:125

  • The new approval payload handling logic (lines 69-125) that converts tool results with {"accepted": true/false} to FunctionApprovalResponseContent lacks direct unit test coverage in test_message_adapters.py. While integration tests exist in test_agent_wrapper_comprehensive.py, consider adding a focused unit test that verifies this conversion logic in isolation, including edge cases like:
  • Approval payload with matching function call
  • Approval payload without matching function call (confirm_changes fallback)
  • Rejection payload ({"accepted": false})
  • Approval payload with existing tool result that gets filtered out
            if is_approval:
                # Look for the matching function call in previous messages to create
                # a proper FunctionApprovalResponseContent. This enables the agent framework
                # to execute the approved tool (fix for GitHub issue #3034).
                parsed_approval = json.loads(result_content)
                accepted = parsed_approval.get("accepted", False)

                # Find the function call that matches this tool_call_id
                matching_func_call = None
                for prev_msg in result:
                    role_val = prev_msg.role.value if hasattr(prev_msg.role, "value") else str(prev_msg.role)
                    if role_val != "assistant":
                        continue
                    for content in prev_msg.contents or []:
                        if isinstance(content, FunctionCallContent):
                            if content.call_id == tool_call_id and content.name != "confirm_changes":
                                matching_func_call = content
                                break

                if matching_func_call:
                    # Remove any existing tool result for this call_id since the framework
                    # will re-execute the tool after approval. Keeping old results causes
                    # OpenAI API errors ("tool message must follow assistant with tool_calls").
                    result = [
                        m
                        for m in result
                        if not (
                            (m.role.value if hasattr(m.role, "value") else str(m.role)) == "tool"
                            and any(
                                isinstance(c, FunctionResultContent) and c.call_id == tool_call_id
                                for c in (m.contents or [])
                            )
                        )
                    ]

                    # Create FunctionApprovalResponseContent for the agent framework
                    approval_response = FunctionApprovalResponseContent(
                        approved=accepted,
                        id=str(tool_call_id),
                        function_call=matching_func_call,
                    )
                    chat_msg = ChatMessage(
                        role=Role.USER,
                        contents=[approval_response],
                    )
                else:
                    # No matching function call found - this is likely a confirm_changes approval
                    # Keep the old behavior for backwards compatibility
                    chat_msg = ChatMessage(
                        role=Role.USER,
                        contents=[TextContent(text=str(result_content))],
                        additional_properties={"is_tool_result": True, "tool_call_id": str(tool_call_id or "")},
                    )
                if "id" in msg:
                    chat_msg.message_id = msg["id"]
                result.append(chat_msg)
                continue

python/packages/ag-ui/agent_framework_ag_ui/_message_adapters.py:287

  • This import of module json is redundant, as it was previously imported on line 5.
                    import json

Copy link
Member

@eavanvalkenburg eavanvalkenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, otherwise looks good.

@moonbox3 moonbox3 changed the title Python: fix(ag-ui): Execute tools with approval_mode="always_require" after user approval Python: fix(ag-ui): Execute tools with approval_mode and fix shared state Jan 8, 2026
@moonbox3 moonbox3 changed the title Python: fix(ag-ui): Execute tools with approval_mode and fix shared state Python: fix(ag-ui): Execute tools with approval_mode, fix shared state, code cleanup Jan 9, 2026
@moonbox3 moonbox3 added this pull request to the merge queue Jan 9, 2026
Merged via the queue into microsoft:main with commit 88968da Jan 9, 2026
23 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in Agent Framework Jan 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Python: [AG-UI] Tool not executed after approval in Human-in-the-loop Python: AG-UI, Tool calls suppress shared state message injection

3 participants