Summary
AgentTool.run_async silently drops code_execution_result.output when wrapping an inner LlmAgent that uses a code_executor (BuiltInCodeExecutor, AgentEngineSandboxCodeExecutor). Parent agents receive response={'result': ''} on a substantial fraction of invocations even though the sandbox successfully ran code and produced output. Observed rates across 30+ production dispatches on gemini-3-flash-preview: 38% baseline, up to 62% under some configurations. After the local subclass fix described below: 0% on two consecutive clean runs.
Affected file: google/adk/tools/agent_tool.py:267-270 (as of v1.27.2; confirmed unchanged through v1.31.1 at time of writing).
Relation to prior issue #180: #180 (closed) reported the same underlying behavior on gemini-2.0-flash + BuiltInCodeExecutor. It was closed with a prompt-level workaround — instructing the inner agent to "present EXACTLY what was returned." That workaround is fragile because (a) it relies on the inner LLM's self-consistency under compaction, (b) it doesn't help when the inner LLM emits no text part at all (the dominant failure mode with Gemini 3's code-exec path), and (c) it shifts the bug burden onto every team using the pattern. This issue proposes a structural code fix that removes the responsibility from users.
Current behavior
# google/adk/tools/agent_tool.py
if last_content is None or last_content.parts is None:
  return ''
merged_text = '\n'.join(
    p.text for p in last_content.parts if p.text and not p.thought
)
The extraction inspects p.text only. When the inner Gemini model finishes a turn with only executable_code + code_execution_result parts (no wrapping text summary — a documented Gemini code-exec behavior), merged_text = '' and the parent tool sees an empty result.
The parent LLM has no way to recover — it sees an empty string, not the sandbox's stdout or tracebacks. This prevents intelligent retry behavior and forces either (a) silent loss of computational output or (b) manual code-path workarounds.
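To make the failure mode concrete, here is a minimal self-contained sketch. It uses simple stand-in dataclasses rather than the real google.genai types (which carry more fields), but mirrors the shape the extraction operates on: a final turn containing only a code_execution_result part and no text part.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-ins for the relevant fields of google.genai content parts
# (simplified for illustration; not the real SDK types).
@dataclass
class CodeExecutionResult:
    output: str

@dataclass
class Part:
    text: Optional[str] = None
    thought: bool = False
    code_execution_result: Optional[CodeExecutionResult] = None

# A final turn with no text part -- only the executed code's result,
# the dominant shape observed on Gemini 3's code-exec path.
parts = [
    Part(code_execution_result=CodeExecutionResult(output="IRR = 0.1432\n")),
]

# The current extraction logic, applied to these parts:
merged_text = '\n'.join(p.text for p in parts if p.text and not p.thought)
print(repr(merged_text))  # -> '' -- the sandbox output is dropped
```

Every part is filtered out because p.text is None, so the join produces an empty string even though the sandbox output sits right there in the part list.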
Reproducer
from google.adk.agents import LlmAgent
from google.adk.models.google_llm import Gemini
from google.adk.code_executors import AgentEngineSandboxCodeExecutor
from google.adk.tools.agent_tool import AgentTool
inner = LlmAgent(
    name="calculator",
    model=Gemini(
        model="gemini-3-flash-preview",
        generation_config={"max_output_tokens": 16384},
    ),
    tools=[],
    code_executor=AgentEngineSandboxCodeExecutor(
        agent_engine_resource_name="projects/.../reasoningEngines/..."
    ),
    include_contents="default",
)
tool = AgentTool(agent=inner, skip_summarization=False)
# Call via parent agent with request='Compute IRR for these cash flows: [...]'
# Observe response['result'] == '' on ~30% of runs despite audit logs showing
# successful Python execution with stdout containing the IRR value.
Proposed fix
A small extension to the extraction, which additionally captures code_execution_result.output when the text parts are empty:
if last_content is None or last_content.parts is None:
  return ''
merged_text = '\n'.join(
    p.text for p in last_content.parts if p.text and not p.thought
)
# NEW: fall back to code_execution_result.output when text parts are empty
if not merged_text.strip():
  code_outputs = [
      p.code_execution_result.output
      for p in last_content.parts
      if p.code_execution_result and p.code_execution_result.output
  ]
  if code_outputs:
    merged_text = '\n\n'.join(code_outputs)
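Applied to the same kind of part list, the fallback recovers the sandbox output. A self-contained check using stand-in dataclasses (again, not the real google.genai types):

```python
from dataclasses import dataclass
from typing import Optional

# Simplified stand-ins for the content-part fields the fix touches.
@dataclass
class CodeExecutionResult:
    output: str

@dataclass
class Part:
    text: Optional[str] = None
    thought: bool = False
    code_execution_result: Optional[CodeExecutionResult] = None

# Two code-execution parts, no text parts.
parts = [
    Part(code_execution_result=CodeExecutionResult(output="IRR = 0.1432")),
    Part(code_execution_result=CodeExecutionResult(output="done")),
]

# Existing extraction: yields '' because there are no text parts.
merged_text = '\n'.join(p.text for p in parts if p.text and not p.thought)

# Proposed fallback: surface code_execution_result.output instead.
if not merged_text.strip():
    code_outputs = [
        p.code_execution_result.output
        for p in parts
        if p.code_execution_result and p.code_execution_result.output
    ]
    if code_outputs:
        merged_text = '\n\n'.join(code_outputs)

print(merged_text)  # -> 'IRR = 0.1432\n\ndone'
```

When any non-empty text part exists, merged_text.strip() is truthy and the fallback never runs, so existing behavior is unchanged.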
Reference workaround
Subclass approach that preserves base-class semantics and adds the fallback: CodeExecutionAgentTool in utils/code_execution_audit.py.
Measured impact across 30+ dispatches: empty-response rate on the "code-ran-no-text" class dropped from 38-62% to 0% on the two most recent runs; parent LLMs now receive actionable diagnostic content (including sandbox tracebacks) instead of empty strings, enabling intelligent retry/recovery.
Request for maintainers
Would you prefer:
(a) A PR adding the one-block fix above directly to AgentTool.run_async?
(b) A new CodeExecutionAgentTool subclass shipped alongside AgentTool?
(c) Documentation of the subclass pattern as the recommended approach for code-executor-bearing sub-agents?
Happy to contribute in whichever form best fits the project's direction.
Environment
- ADK: v1.27.2 (also reproduces on v1.28.x-v1.31.1 — no change to affected path)
- google-genai: 1.x
- Model: gemini-3-flash-preview (also observed on gemini-2.5-flash)
- Executor: AgentEngineSandboxCodeExecutor, BuiltInCodeExecutor
- Python: 3.13
Impact
Without the fix, any production workflow using AgentTool to wrap a code-executor sub-agent silently loses 38-62% of computational output (per the measurements above). Parent LLMs cannot recover because they see no diagnostic content. This particularly affects multi-agent patterns that work around the Gemini API's code_execution + function_calling mutual-exclusion constraint via AgentTool decomposition.