Support functions as output_type, as well as lists of functions and other types #1785


Merged

DouweM merged 25 commits into main from output-type-callable on May 27, 2025

Conversation

@DouweM (Contributor) commented May 20, 2025

Consider reviewing the first commit by itself and then the rest as one diff!
Commit 1 brings in some output handling refactoring borrowed from #1628, to make sure we don't hard-code this against the tool-call output mode (as the original PR did). Also makes it less of a rebase hell for me :) This commit does not change any behavior.

Example:

# Imports assumed for this example; Foo and Bar are example output models defined elsewhere.
from pydantic_ai import Agent, RunContext, ToolOutput


def foobar_ctx(ctx: RunContext[int], x: str, y: int) -> str:
    return f'{x} {y}'


async def foobar_plain(x: int, y: int) -> int:
    return x * y


marker: ToolOutput[bool | tuple[str, int]] = ToolOutput(bool | tuple[str, int])  # type: ignore
agent = Agent(output_type=[Foo, Bar, foobar_ctx, ToolOutput(foobar_plain), marker])

To do:

  • Handle the case where the function has an argument of type RunContext; this value should be injected, not obtained from the model (a rough sketch of this follows the list).
  • Support bound instance methods
  • Docs
    • Output
    • Tools (as this effectively lets you force a tool call)
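
For context, here is a rough sketch of the injection described in the first to-do item. This is not the actual pydantic-ai implementation; takes_run_context is a deliberately naive, hypothetical check:

import inspect
from typing import Any


def takes_run_context(function: Any) -> bool:
    # Naive check: does the first parameter's annotation mention RunContext?
    params = list(inspect.signature(function).parameters.values())
    return bool(params) and 'RunContext' in str(params[0].annotation)


async def call_output_function(function: Any, ctx: Any, model_args: dict[str, Any]) -> Any:
    # The model only supplies model_args; ctx is injected by the framework,
    # never requested from or produced by the model.
    if takes_run_context(function):
        result = function(ctx, **model_args)
    else:
        result = function(**model_args)
    if inspect.isawaitable(result):
        result = await result
    return result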

github-actions bot commented May 20, 2025

Docs Preview

commit: 6111ad1
Preview URL: https://157f12b8-pydantic-ai-previews.pydantic.workers.dev

@DouweM DouweM marked this pull request as ready for review May 21, 2025 22:50
@dmontagu (Contributor) left a comment

Generally looks good. I think we should just confirm the changes to FunctionSchema are okay with Samuel (he'll probably just rubber-stamp it, but still), and I'd like to get @Kludex's take on the crazy OutputType types, but I'm okay with it if he is.

@dmontagu (Contributor)

Oh, we also need to add some typing-related tests.

@DouweM DouweM force-pushed the output-type-callable branch from a1c793e to 56e196d Compare May 23, 2025 15:46
@DouweM DouweM force-pushed the output-type-callable branch from 56e196d to 3ff6e74 Compare May 23, 2025 17:14

hyperlint-ai bot commented May 23, 2025

PR Change Summary

Enhanced output handling to support functions as output types, improving flexibility in agent responses.

  • Refactored output handling to avoid hard-coded dependencies on tool-call output mode.
  • Introduced support for functions as output types, allowing agents to return results from function calls.
  • Updated documentation to reflect changes in output handling and added examples for new features.

Modified Files

  • docs/multi-agent-applications.md
  • docs/output.md
  • docs/tools.md


@DouweM DouweM requested a review from dmontagu May 23, 2025 18:05
@burningion commented May 23, 2025

One thing I'd also like to be able to do here is make a single LLM call with my Agent.

Ideally it'd behave the same as a direct model call:

https://ai.pydantic.dev/direct/

Except it would also expose the MCP server, and (optionally) force a Pydantic model response in a single call.

Am I missing something or does that interface not exist in the Agent yet?

@DouweM (Contributor, Author) commented May 23, 2025

What I've done here to make output_type take types, functions, async functions, ToolOutput markers, and sequences of any of those works perfectly with pyright's type inference, but unfortunately not with mypy, and not (yet) with pyrefly and ty.

Specifically, none of the three support Sequence[type[T]]: (issues for mypy, pyrefly, ty)

from __future__ import annotations

from dataclasses import dataclass

from typing_extensions import (
    Generic,
    Sequence,
    TypeVar,
    assert_type,
)

T = TypeVar("T")


@dataclass
class Agent(Generic[T]):
    output_type: Sequence[type[T]]


class Foo:
    pass


class Bar:
    pass


# pyright - works
# mypy - error: Expression is of type "Agent[object]", not "Agent[Foo | Bar]"  [assert-type]
# pyrefly - assert_type(Agent[Foo], Agent[Bar | Foo]) failed + Argument `list[type[Bar] | type[Foo]]` is not assignable to parameter `output_type` with type `Sequence[type[Foo]]` in function `Agent.__init__`
# ty - `Agent[Foo | Bar]` and `Agent[Unknown]` are not equivalent types
assert_type(Agent([Foo, Bar]), Agent[Foo | Bar])

# pyright - works
# mypy - error: Expression is of type "Agent[Never]", not "Agent[int | str]"  [assert-type]
# pyrefly - assert_type(Agent[int], Agent[str | int]) failed + Argument `list[type[str] | type[int]]` is not assignable to parameter `output_type` with type `Sequence[type[int]]` in function `Agent.__init__`
# ty - `Agent[int | str]` and `Agent[Unknown]` are not equivalent types
assert_type(Agent([int, str]), Agent[int | str])

# works
assert_type(Agent[Foo | Bar]([Foo, Bar]), Agent[Foo | Bar])

# works
assert_type(Agent[int | str]([int, str]), Agent[int | str])

Ty doesn't support Callable[..., T]: (issue)

from dataclasses import dataclass

from typing_extensions import (
    Callable,
    Generic,
    TypeVar,
    assert_type,
)

T = TypeVar("T")


@dataclass
class Agent(Generic[T]):
    output_type: Callable[..., T]


def func() -> int:
    return 1


# pyright, mypy, pyrefly - works
# ty - `Agent[int]` and `Agent[Unknown]` are not equivalent types + Expected `((...) -> T) | ((...) -> Awaitable[T])`, found `def func() -> int`
assert_type(Agent(func), Agent[int])

# works
assert_type(Agent[int](func), Agent[int])

And mypy (and ty, because of the issue above) doesn't support Callable[..., T] | Callable[..., Awaitable[T]]: (issue for mypy)

from dataclasses import dataclass

from typing_extensions import (
    Awaitable,
    Callable,
    Generic,
    TypeVar,
    assert_type,
)

T = TypeVar("T")


@dataclass
class Agent(Generic[T]):
    output_type: Callable[..., T] | Callable[..., Awaitable[T]]


async def coro() -> bool:
    return True


def func() -> int:
    return 1


# pyright, mypy, pyrefly - works
# ty - `Agent[int]` and `Agent[Unknown]` are not equivalent types + Expected `((...) -> T) | ((...) -> Awaitable[T])`, found `def func() -> int`
assert_type(Agent(func), Agent[int])

# mypy - error: Argument 1 to "Agent" has incompatible type "Callable[[], Coroutine[Any, Any, bool]]"; expected "Callable[..., Never] | Callable[..., Awaitable[Never]]"  [arg-type]
coro_agent = Agent(coro)
# pyright, pyrefly - works
# mypy - error: Expression is of type "Agent[Any]", not "Agent[bool]"
# ty - `Agent[bool]` and `Agent[Unknown]` are not equivalent types
assert_type(coro_agent, Agent[bool])

# works
assert_type(Agent[bool](coro), Agent[bool])

The issue with Sequence[type[T]] can possibly be worked around with one of these tricks, which will require users to use a helper function or builder instead of a simple list:

@dataclass
class Output(Generic[T]):
    output_type: type[T]

    def or_[S](self, output_type: type[S]) -> Output[T | S]:
        return Output(self.output_type | output_type)  # type: ignore

# or 

def output_type[T](*args: type[T]) -> Output[T]:
    raise NotImplementedError
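
With either trick, usage would then look roughly like this (a hypothetical API sketch, not what was merged):

agent = Agent(output_type=Output(Foo).or_(Bar))
# or
agent = Agent(output_type=output_type(Foo, Bar))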

The issue with Callable[..., T] | Callable[..., Awaitable[T]] is less severe because pyright and pyrefly handle it correctly, and ty doesn't handle Callable[..., T] right at all, so it's possible they simply haven't implemented this yet. But it is tricky because if the return type of a Callable is an awaitable (meaning it's an async function), both sides of the union are a valid match, so it's not technically a bug in the typechecker to match T to the coroutine itself instead of the coroutine's return type. A workaround here could look like a helper function or builder that has overloads that prefer the Awaitable[T] match over the simple T match through the order in which they're defined. (I verified that swapping the union members to be Callable[..., Awaitable[T]] | Callable[..., T] doesn't change anything, so the order is not a factor)
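
As a rough sketch of that overload-based workaround (OutputFunction and as_output_function are hypothetical names, not pydantic-ai API), the Awaitable[T] overload is listed first so type checkers try it before the plain T one:

from typing import Any, Awaitable, Callable, Generic, TypeVar, overload

T = TypeVar('T')


class OutputFunction(Generic[T]):
    # Hypothetical marker wrapping an output function.
    def __init__(self, function: Callable[..., T] | Callable[..., Awaitable[T]]) -> None:
        self.function = function


@overload
def as_output_function(function: Callable[..., Awaitable[T]]) -> OutputFunction[T]: ...
@overload
def as_output_function(function: Callable[..., T]) -> OutputFunction[T]: ...
def as_output_function(function: Callable[..., Any]) -> OutputFunction[Any]:
    # Async functions hit the first overload, so T is inferred as the awaited
    # return type rather than the coroutine object itself.
    return OutputFunction(function)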

We could also decide we're fine with all of this because it works correctly on all typecheckers when you explicitly specify the generic parameters with Agent[...](...). Documenting that people should do that feels similar to the workaround we already tell people to use with output_type=Union[...], which can't be typechecked until https://peps.python.org/pep-0747/ lands. I've updated the docs to cover the new edge cases (with mypy in particular).
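
Concretely, with pydantic-ai that explicit form would look roughly like this (a sketch, assuming Agent is parametrized by its deps type and its output type, with Foo and Bar standing in for real output models):

agent = Agent[None, Foo | Bar](output_type=[Foo, Bar])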

I've filed some issues with the type checkers to see what their teams say.

@DouweM (Contributor, Author) commented May 27, 2025

In discussion with @dmontagu we decided to merge this as is, with output_type taking a sequence of types (e.g. [Foo, Bar]), despite this not being inferred correctly by mypy, because:

  1. alternative APIs (e.g. Output(Foo).or_(Bar)) are significantly more clunky, to the point that people would likely prefer to just use a sequence and manually annotate the type
  2. we can always add a mypy-type-safe alternative later, if the demand is there

@DouweM DouweM merged commit 45d0ff2 into main May 27, 2025
19 checks passed
@DouweM DouweM deleted the output-type-callable branch May 27, 2025 22:48
@HamzaFarhan (Contributor) commented May 29, 2025

When would this be released?

@DouweM (Contributor, Author) commented May 29, 2025

@HamzaFarhan Done! https://github.com/pydantic/pydantic-ai/releases/tag/v0.2.12

Kludex pushed a commit that referenced this pull request May 30, 2025
@bootloop-noah commented Jun 4, 2025

@DouweM in the case of using output_type for agent delegation like in the router scenario (as described in the docs here and here), we'd expect it to operate like a directed graph but with a "decision point".

The final result.output reflects this, but calling result.all_messages() only shows the UserPromptPart, ToolCallPart, and the ToolReturnPart from the router agent, which just says "Final result processed.".

Is there a way to add the results of the delegated agent back into the main message history so it's something more like ModelRequest (router)->ModelRequest (output agent)->ModelResponse (output agent response) and matches the underlying graph?

@HamzaFarhan (Contributor)

from dotenv import load_dotenv
from pydantic_ai import Agent, RunContext

load_dotenv()


maths_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a maths tutor. Given a question, you will provide a step by step solution.",
)


async def hand_off_to_maths_agent(ctx: RunContext, query: str) -> str:
    res = await maths_agent.run(query)
    # Splice the delegated agent's messages into the router run's history.
    ctx.messages += res.new_messages()
    return res.output


poet_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a poet. Given a topic, you will provide a poem.",
)


async def hand_off_to_poet_agent(ctx: RunContext, query: str) -> str:
    res = await poet_agent.run(query)
    ctx.messages += res.new_messages()
    return res.output


router_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a router. Given a user query, you will route it to the appropriate agent.",
    output_type=[hand_off_to_maths_agent, hand_off_to_poet_agent],
)


async def main():
    query = "Calculate 10 + 10"
    result = await router_agent.run(query)
    for message in result.all_messages():
        print(message, "\n")
    print(result.output)


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

@bootloop-noah commented Jun 4, 2025

@DouweM @HamzaFarhan great, thank you! After implementing this, there seems to be an error when we want to maintain state across multiple runs of the router in a case like:

...

async def hand_off_to_maths_agent(ctx: RunContext, query: str) -> str:
    res = await maths_agent.run(query)
    ctx.messages += res.new_messages()
    return res.output

...

async def main():
    query = "Calculate 10 + 10"
    result = await router_agent.run(query)
    
    query = "Write a poem about your answer"
    result = await router_agent.run(query, message_history=result.all_messages())

This raises a ModelHTTPError for both Gemini and OpenAI models (haven't tried any others). It looks like a validator maybe isn't looking ahead? The messages definitely contain a matching ToolReturnPart. Is there a way to extend the context messages after the agent has returned? Here's the error:

pydantic_ai.exceptions.ModelHTTPError: status_code: 400, model_name: gpt-4o, body: {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_KNSIrALYz0K6cTvpEIVVV7lW", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}

result.all_messages() =

ModelRequest(parts=[UserPromptPart(content='Calculate 10 + 10', timestamp=datetime.datetime(2025, 6, 4, 0, 33, 24, 203261, tzinfo=datetime.timezone.utc))], instructions='You are a router. Given a user query, you will route it to the appropriate agent.') 

ModelResponse(parts=[ToolCallPart(tool_name='final_result_hand_off_to_maths_agent', args='{"query":"Calculate 10 + 10"}', tool_call_id='call_KNSIrALYz0K6cTvpEIVVV7lW')], usage=Usage(requests=1, request_tokens=118, response_tokens=25, total_tokens=143, details={'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0, 'cached_tokens': 0}), model_name='gpt-4o-2024-08-06', timestamp=datetime.datetime(2025, 6, 4, 0, 33, 24, tzinfo=datetime.timezone.utc), vendor_id='chatcmpl-BeWMGkev6eeWK9arEZ17o3OVyWUan') 

ModelRequest(parts=[UserPromptPart(content='Calculate 10 + 10', timestamp=datetime.datetime(2025, 6, 4, 0, 33, 25, 554116, tzinfo=datetime.timezone.utc))], instructions='You are a maths tutor. Given a question, you will provide a step by step solution.') 

ModelResponse(parts=[TextPart(content='To calculate \\(10 + 10\\), simply add the two numbers together:\n\n\\[ \n10 + 10 = 20 \n\\]\n\nSo, the answer is \\(20\\).')], usage=Usage(requests=1, request_tokens=46, response_tokens=37, total_tokens=83, details={'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0, 'cached_tokens': 0}), model_name='gpt-4o-2024-08-06', timestamp=datetime.datetime(2025, 6, 4, 0, 33, 25, tzinfo=datetime.timezone.utc), vendor_id='chatcmpl-BeWMHmqsk9Nip3TEfkAbHZevvLe4V') 

ModelRequest(parts=[ToolReturnPart(tool_name='final_result_hand_off_to_maths_agent', content='Final result processed.', tool_call_id='call_KNSIrALYz0K6cTvpEIVVV7lW', timestamp=datetime.datetime(2025, 6, 4, 0, 33, 26, 305492, tzinfo=datetime.timezone.utc))])

@HamzaFarhan (Contributor)

Ooh, right. So it looks like this is how a "handoff" is meant to be.

@HamzaFarhan (Contributor)

For what it's worth, here's a hack:

from dataclasses import replace

from dotenv import load_dotenv
from pydantic_ai import Agent, RunContext
from pydantic_ai.messages import ModelMessage, ModelRequest, ToolCallPart, ToolReturnPart

load_dotenv()


maths_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a maths tutor. Given a question, you will provide a step by step solution.",
)


async def hand_off_to_maths_agent(ctx: RunContext, query: str) -> str:
    res = await maths_agent.run(query)
    ctx.messages += res.new_messages()
    return res.output


poet_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a poet. Given a topic, you will provide a poem.",
)


async def hand_off_to_poet_agent(ctx: RunContext, query: str) -> str:
    res = await poet_agent.run(query)
    ctx.messages += res.new_messages()
    return res.output


router_agent = Agent(
    model="google-gla:gemini-2.0-flash",
    instructions="You are a router. Given a user query, you will route it to the appropriate agent.",
    output_type=[hand_off_to_maths_agent, hand_off_to_poet_agent],
)


def filter_tool_parts(messages: list[ModelMessage], filter_str: str) -> list[ModelMessage]:
    # Drop tool-call and tool-return parts whose tool name contains filter_str,
    # so the remaining history has no dangling hand-off tool calls.
    filtered_messages: list[ModelMessage] = []
    for message in messages:
        if isinstance(message, ModelRequest):
            filtered_parts = [
                part
                for part in message.parts
                if not (isinstance(part, ToolReturnPart) and filter_str in part.tool_name)
            ]
            if filtered_parts:
                filtered_messages.append(replace(message, parts=filtered_parts))
        else:
            filtered_parts = [
                part
                for part in message.parts
                if not (isinstance(part, ToolCallPart) and filter_str in part.tool_name)
            ]
            if filtered_parts:
                filtered_messages.append(replace(message, parts=filtered_parts))
    return filtered_messages


async def main():
    query = "Calculate 10 + 10"
    result = await router_agent.run(query)
    message_history = filter_tool_parts(result.all_messages(), "hand_off")
    sep = "\n" + "-" * 100 + "\n"

    print(sep)
    for message in message_history:
        print(message, "\n")
    print(f"{sep}{result.output}{sep}")

    query = "Write a poem about your answer"
    result = await router_agent.run(query, message_history=message_history)
    print(result.output)


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())


Successfully merging this pull request may close these issues.

Support for returning response directly from tool