_(This example is complete, it can be run "as is")_

Some models (e.g. Gemini) natively support semi-structured return values, while others (e.g. OpenAI) expect text but are generally just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON.
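For example, here is a minimal sketch (the model choice and data are illustrative) of a tool that returns a plain Python `dict`; with a text-only model, the dict is serialized to JSON before being sent back as the tool result:

```python
from pydantic_ai import Agent

agent = Agent("openai:gpt-4o")


@agent.tool_plain
def get_user(user_id: int) -> dict[str, str | int]:
    """Return a user record as a plain Python object (hard-coded for this sketch)."""
    # For models that only accept text tool results, this dict is
    # serialized to JSON automatically before being sent back.
    return {"id": user_id, "name": "Alice", "role": "admin"}


result = agent.run_sync("What role does user 123 have?")
print(result.output)
```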
### Advanced Tool Returns

For scenarios where you need more control over both the tool's return value and the content sent to the model, you can use [`ToolReturn`][pydantic_ai.messages.ToolReturn]. This is particularly useful when you want to:

- Provide rich multi-modal content (images, documents, etc.) to the model as context
- Separate the programmatic return value from the model's context
- Include additional metadata that shouldn't be sent to the LLM

Here's an example of a computer automation tool that captures screenshots and provides visual feedback (the screen-capture and click helpers are placeholder stubs to replace with your own automation code):

```python
import time

from pydantic_ai import Agent
from pydantic_ai.messages import BinaryContent, ToolReturn

agent = Agent("openai:gpt-4o")


def capture_screen() -> bytes:
    """Placeholder for your own screen-capture helper."""
    return b"..."  # return real PNG bytes in your implementation


def perform_click(x: int, y: int) -> None:
    """Placeholder for your own mouse-automation helper."""


@agent.tool_plain
def click_and_capture(x: int, y: int) -> ToolReturn:
    """Click at the given coordinates and capture before/after screenshots."""
    before_screenshot = capture_screen()
    perform_click(x, y)
    time.sleep(0.5)  # give the UI a moment to update
    after_screenshot = capture_screen()

    return ToolReturn(
        # The value sent back to the model as the tool's result.
        return_value=f"Successfully clicked at ({x}, {y})",
        # Rich multi-modal context the model receives as a separate user message.
        content=[
            f"Clicked at coordinates ({x}, {y}). Here is the before/after comparison:",
            "Before:",
            BinaryContent(data=before_screenshot, media_type="image/png"),
            "After:",
            BinaryContent(data=after_screenshot, media_type="image/png"),
            "Please analyze the changes and suggest next steps.",
        ],
        # Available to your application, but never sent to the LLM.
        metadata={
            "coordinates": {"x": x, "y": y},
            "action_type": "click_and_capture",
            "timestamp": time.time(),
        },
    )


# The model receives the rich visual content for analysis
# while your application can access the structured return_value and metadata
result = agent.run_sync("Click on the submit button and tell me what happened")
print(result.output)
# The model can analyze the screenshots and provide detailed feedback
```
- **`return_value`**: The actual return value used in the tool response. This is what gets serialized and sent back to the model as the tool's result.
- **`content`**: A sequence of content (text, images, documents, etc.) that provides additional context to the model. This appears as a separate user message.
- **`metadata`**: Optional metadata that your application can access but is not sent to the LLM. Useful for logging, debugging, or additional processing. Some other AI frameworks call this feature "artifacts".

This separation allows you to provide rich context to the model while maintaining clean, structured return values for your application logic.
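For instance, continuing from the `click_and_capture` example above, your application can pull the pieces back out of the run's message history; a rough sketch (the exact message layout shown here is an assumption and may vary between versions):

```python
from pydantic_ai.messages import ModelRequest, ToolReturnPart, UserPromptPart

for message in result.all_messages():
    if isinstance(message, ModelRequest):
        for part in message.parts:
            if isinstance(part, ToolReturnPart):
                # The serialized return_value that was sent back as the tool result.
                print("Tool result:", part.content)
            elif isinstance(part, UserPromptPart):
                # User-prompt parts include the original prompt and any extra
                # multi-modal content attached via ToolReturn.content.
                print("User content:", part.content)
```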
## Function Tools vs. Structured Outputs
As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for structured responses; thus, a model might have access to many tools, some of which call function tools while others end the run and produce a final output.
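As a rough sketch of that overlap (the model, schema, and tool here are made up for illustration), a single agent can expose both a function tool and a structured `output_type`, with the model seeing both through the same tools/functions mechanism:

```python
from pydantic import BaseModel

from pydantic_ai import Agent


class CityInfo(BaseModel):
    city: str
    country: str


# Both the lookup tool and the CityInfo output schema are presented to the
# model through its tools/functions API.
agent = Agent("openai:gpt-4o", output_type=CityInfo)


@agent.tool_plain
def lookup_country(city: str) -> str:
    """Toy lookup used only for this sketch."""
    return "United Kingdom" if city == "London" else "unknown"


result = agent.run_sync("Where is London?")
print(result.output)
```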