# Direct Model Requests

The `direct` module provides low-level methods for making imperative requests to LLMs where the only abstraction is input and output schema translation, enabling you to use all models with the same API.

These methods are thin wrappers around the [`Model`][pydantic_ai.models.Model] implementations, offering a simpler interface when you don't need the full functionality of an [`Agent`][pydantic_ai.Agent].

The following functions are available:

- [`model_request`][pydantic_ai.direct.model_request]: Make a non-streamed async request to a model
- [`model_request_sync`][pydantic_ai.direct.model_request_sync]: Make a non-streamed synchronous request to a model
- [`model_request_stream`][pydantic_ai.direct.model_request_stream]: Make a streamed async request to a model

## Basic Example

Here's a simple example demonstrating how to use the direct API to make a basic request:

```python title="direct_basic.py"
from pydantic_ai.direct import model_request_sync
from pydantic_ai.messages import ModelRequest

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-3-5-haiku-latest',
    [ModelRequest.user_text_prompt('What is the capital of France?')]
)

print(model_response.parts[0].content)
#> Paris
print(model_response.usage)
"""
Usage(requests=1, request_tokens=56, response_tokens=1, total_tokens=57, details=None)
"""
```

_(This example is complete, it can be run "as is")_

## Advanced Example with Tool Calling

You can also use the direct API to work with function/tool calling.

Even here we can use Pydantic to generate the JSON schema for the tool:

```python
from pydantic import BaseModel
from typing_extensions import Literal

from pydantic_ai.direct import model_request
from pydantic_ai.messages import ModelRequest
from pydantic_ai.models import ModelRequestParameters
from pydantic_ai.tools import ToolDefinition


class Divide(BaseModel):
    """Divide two numbers."""

    numerator: float
    denominator: float
    on_inf: Literal['error', 'infinity'] = 'infinity'


async def main():
    # Make a request to the model with tool access
    model_response = await model_request(
        'openai:gpt-4.1-nano',
        [ModelRequest.user_text_prompt('What is 123 / 456?')],
        model_request_parameters=ModelRequestParameters(
            function_tools=[
                ToolDefinition(
                    name=Divide.__name__.lower(),
                    description=Divide.__doc__ or '',
                    parameters_json_schema=Divide.model_json_schema(),
                )
            ],
            allow_text_output=True,  # Allow model to either use tools or respond directly
        ),
    )
    print(model_response)
    """
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='divide',
                args={'numerator': '123', 'denominator': '456'},
                tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b',
                part_kind='tool-call',
            )
        ],
        usage=Usage(
            requests=1,
            request_tokens=55,
            response_tokens=7,
            total_tokens=62,
            details=None,
        ),
        model_name='gpt-4.1-nano',
        timestamp=datetime.datetime(...),
        kind='response',
    )
    """
```

_(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)_
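A natural follow-up, sketched here with plain Pydantic (no model request needed): the `args` in the `ToolCallPart` above came back as strings, and validating them with the same `Divide` model coerces them to floats before you execute the division yourself:

```python
from pydantic import BaseModel
from typing_extensions import Literal


class Divide(BaseModel):
    """Divide two numbers."""

    numerator: float
    denominator: float
    on_inf: Literal['error', 'infinity'] = 'infinity'


# Validate the args dict from the ToolCallPart shown above;
# Pydantic coerces the string values to floats
call = Divide.model_validate({'numerator': '123', 'denominator': '456'})
result = call.numerator / call.denominator
print(round(result, 4))
#> 0.2697
```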

## When to Use the direct API vs Agent

The direct API is ideal when:

1. You need more direct control over model interactions
2. You want to implement custom behavior around model requests
3. You're building your own abstractions on top of model interactions

For most application use cases, the higher-level [`Agent`][pydantic_ai.Agent] API provides a more convenient interface with additional features such as built-in tool execution, retrying, structured output parsing, and more.

## OpenTelemetry or Logfire Instrumentation

As with [agents][pydantic_ai.Agent], you can enable OpenTelemetry/Logfire instrumentation with just a few extra lines:

```python {title="direct_instrumented.py" hl_lines="1 6 7"}
import logfire

from pydantic_ai.direct import model_request_sync
from pydantic_ai.messages import ModelRequest

logfire.configure()
logfire.instrument_pydantic_ai()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-3-5-haiku-latest',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
)

print(model_response.parts[0].content)
#> Paris
```

_(This example is complete, it can be run "as is")_

You can also enable OpenTelemetry on a per-call basis:

```python {title="direct_instrumented.py" hl_lines="1 6 12"}
import logfire

from pydantic_ai.direct import model_request_sync
from pydantic_ai.messages import ModelRequest

logfire.configure()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-3-5-haiku-latest',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
    instrument=True,
)

print(model_response.parts[0].content)
#> Paris
```

See [Debugging and Monitoring](logfire.md) for more details, including how to instrument with plain OpenTelemetry without Logfire.