# Add settings to Model base class #2136

Open · wants to merge 11 commits into `main`
29 changes: 24 additions & 5 deletions docs/agents.md
This structure allows you to configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`,
`timeout`, and more.

There are three ways to apply these settings, with a clear precedence order:

1. **Model-level defaults** - Set when creating a model instance via the `settings` parameter. These serve as the base defaults for that model.
2. **Agent-level defaults** - Set during [`Agent`][pydantic_ai.agent.Agent] initialization via the `model_settings` argument. These override model defaults.
3. **Run-time overrides** - Passed to `run{_sync,_stream}` functions via the `model_settings` argument. These have the highest priority and override both agent and model defaults.

**Settings Precedence**: Run-time > Agent > Model
> **Review comment (Contributor):** I think we can drop this line as the list already explicitly states the priority.

> **Review comment (Contributor):** Actually, it may be useful to write here that the settings are merged, not replaced.


For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior,
you can do the following:

```py
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.settings import ModelSettings

# 1. Model-level defaults
model = OpenAIModel(
    'gpt-4o',
    settings=ModelSettings(temperature=0.8, max_tokens=500)  # Base defaults
)

# 2. Agent-level defaults (override model defaults)
agent = Agent(model, model_settings=ModelSettings(temperature=0.5))

# 3. Run-time overrides (highest priority)
result_sync = agent.run_sync(
    'What is the capital of Italy?',
    model_settings=ModelSettings(temperature=0.0)  # Final temperature: 0.0
)
print(result_sync.output)
#> Rome
```

The final request uses `temperature=0.0` (from the run-time override) and `max_tokens=500` (from the model defaults), demonstrating that settings are merged rather than replaced, with run-time values taking precedence.
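
As a rough sketch of the merge semantics (an assumption based on `ModelSettings` being a `TypedDict`, not the exact implementation), later layers win key by key while untouched keys survive:

```python
from pydantic_ai.settings import ModelSettings

def merge(base: ModelSettings | None, overrides: ModelSettings | None) -> ModelSettings | None:
    # Sketch only: overrides win key-by-key, untouched keys survive from base
    if base and overrides:
        return ModelSettings(**{**base, **overrides})
    return base or overrides

effective = merge(
    merge(
        ModelSettings(temperature=0.8, max_tokens=500),  # model defaults
        ModelSettings(temperature=0.5),                  # agent defaults
    ),
    ModelSettings(temperature=0.0),                      # run-time overrides
)
print(effective)
#> {'temperature': 0.0, 'max_tokens': 500}
```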

!!! note "Model Settings Support"
Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like `FallbackModel`, `WrapperModel`, and `InstrumentedModel` don't have their own settings - they use the settings of their underlying models.

### Model specific settings

If you wish to further customize model behavior, you can use a subclass of [`ModelSettings`][pydantic_ai.settings.ModelSettings], like [`GeminiModelSettings`][pydantic_ai.models.gemini.GeminiModelSettings], associated with your model of choice.
33 changes: 33 additions & 0 deletions docs/models/index.md
!!! note
Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`.

### Per-Model Settings

You can configure different `ModelSettings` for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations:
> **Review comment (Contributor):** Suggested change:
>
> ```diff
> -You can configure different `ModelSettings` for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations:
> +You can configure different [`ModelSettings`][pydantic_ai.settings.ModelSettings] for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations:
> ```


```python {title="fallback_model_per_settings.py"}
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.settings import ModelSettings

# Configure each model with provider-specific optimal settings
openai_model = OpenAIModel(
    'gpt-4o',
    settings=ModelSettings(temperature=0.7, max_tokens=1000)  # Higher creativity for OpenAI
)
anthropic_model = AnthropicModel(
    'claude-3-5-sonnet-latest',
    settings=ModelSettings(temperature=0.2, max_tokens=1000)  # Lower temperature for consistency
)

fallback_model = FallbackModel(openai_model, anthropic_model)
agent = Agent(fallback_model)

result = agent.run_sync('Write a creative story about space exploration')
print(result.output)
"""
In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it.
"""
```

In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request.

In this next example, we demonstrate the exception-handling capabilities of `FallbackModel`.
If all models fail, a [`FallbackExceptionGroup`][pydantic_ai.exceptions.FallbackExceptionGroup] is raised, which
contains all the exceptions encountered during the `run` execution.
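
A minimal sketch of handling that group (assuming `FallbackExceptionGroup` behaves like a standard `ExceptionGroup`; the model names are illustrative):

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import FallbackExceptionGroup
from pydantic_ai.models.fallback import FallbackModel

agent = Agent(FallbackModel('openai:gpt-4o', 'anthropic:claude-3-5-sonnet-latest'))

try:
    result = agent.run_sync('What is the capital of Italy?')
    print(result.output)
except FallbackExceptionGroup as group:
    # Inspect every failure collected while walking the fallback chain
    for exc in group.exceptions:
        print(f'{type(exc).__name__}: {exc}')
```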
16 changes: 14 additions & 2 deletions pydantic_ai_slim/pydantic_ai/agent.py
```diff
@@ -674,12 +674,24 @@ async def main():
         # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code.
         output_validators = cast(list[_output.OutputValidator[AgentDepsT, RunOutputDataT]], self._output_validators)
 
+        # Merge model settings from any Model
+        if isinstance(model_used, InstrumentedModel):
+            # For InstrumentedModel, get settings from the wrapped model
+            wrapped_model_settings = getattr(model_used.wrapped, 'settings', None)
```
> **Review comment (Contributor):** Can we define `settings` on `WrapperModel` (which `InstrumentedModel` inherits from) to return `self.wrapped.settings`, so we don't need to do this special-casing?

> **Review comment (Contributor):** That may require us to rename the field on `Model` to `_settings`, so that we can define `settings` as a property that can be overridden on `WrapperModel`.

```diff
+            if wrapped_model_settings is not None:
+                model_settings = merge_model_settings(wrapped_model_settings, model_settings)
+        else:
+            # For regular models, use their settings directly
+            current_settings = getattr(model_used, 'settings', None)
```
> **Review comment (Contributor):** Can this be `model_used.settings`?

```diff
+            if current_settings is not None:
+                model_settings = merge_model_settings(current_settings, model_settings)
 
         model_settings = merge_model_settings(self.model_settings, model_settings)
         usage_limits = usage_limits or _usage.UsageLimits()
 
         if isinstance(model_used, InstrumentedModel):
-            instrumentation_settings = model_used.settings
-            tracer = model_used.settings.tracer
+            instrumentation_settings = model_used.instrumentation_settings
+            tracer = model_used.instrumentation_settings.tracer
         else:
             instrumentation_settings = None
             tracer = NoOpTracer()
```
8 changes: 8 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/__init__.py
```diff
@@ -322,6 +322,14 @@ class Model(ABC):
 
     _profile: ModelProfileSpec | None = None
 
+    def __init__(self, *, settings: ModelSettings | None = None) -> None:
+        """Initialize the model with optional settings.
+
+        Args:
+            settings: Model-specific settings that will be used as defaults for this model.
+        """
+        self.settings: ModelSettings | None = settings
+
     @abstractmethod
     async def request(
         self,
```
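
This base `__init__` is what lets each concrete model accept the new parameter with a one-line `super().__init__(settings=settings)` call. A minimal sketch of the pattern (`EchoModel` is hypothetical, not part of the PR):

```python
from pydantic_ai.models import Model
from pydantic_ai.settings import ModelSettings

class EchoModel(Model):
    """Hypothetical model showing how implementations forward settings.

    Abstract methods like request() are omitted from this sketch.
    """

    def __init__(self, model_name: str, *, settings: ModelSettings | None = None):
        super().__init__(settings=settings)  # stored as model-level defaults
        self._model_name = model_name
```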
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/anthropic.py
```diff
@@ -127,6 +127,7 @@
         *,
         provider: Literal['anthropic'] | Provider[AsyncAnthropic] = 'anthropic',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize an Anthropic model.
 
@@ -136,7 +137,9 @@ def __init__(
             provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an
                 instance of `Provider[AsyncAnthropic]`. If not provided, the other parameters will be used.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: Default model settings for this model instance.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
 
         if isinstance(provider, str):
```
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/bedrock.py
```diff
@@ -201,6 +201,7 @@
         *,
         provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize a Bedrock model.
 
@@ -212,7 +213,9 @@
                 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be
                 created using the other parameters.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: Model-specific settings that will be used as defaults for this model.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
 
         if isinstance(provider, str):
```
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/cohere.py
```diff
@@ -111,6 +111,7 @@
         *,
         provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize an Cohere model.
 
@@ -121,7 +122,9 @@ def __init__(
                 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be
                 created using the other parameters.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: Model-specific settings that will be used as defaults for this model.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
 
         if isinstance(provider, str):
```
1 change: 1 addition & 0 deletions pydantic_ai_slim/pydantic_ai/models/fallback.py
```diff
@@ -42,6 +42,7 @@ def __init__(
             fallback_models: The names or instances of the fallback models to use upon failure.
             fallback_on: A callable or tuple of exceptions that should trigger a fallback.
         """
+        super().__init__()
         self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]]
 
         if isinstance(fallback_on, tuple):
```
12 changes: 11 additions & 1 deletion pydantic_ai_slim/pydantic_ai/models/function.py
```diff
@@ -52,7 +52,12 @@ class FunctionModel(Model):
 
     @overload
     def __init__(
-        self, function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None
+        self,
+        function: FunctionDef,
+        *,
+        model_name: str | None = None,
+        profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ) -> None: ...
 
     @overload
@@ -62,6 +67,7 @@ def __init__(
         stream_function: StreamFunctionDef,
         model_name: str | None = None,
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ) -> None: ...
 
     @overload
@@ -72,6 +78,7 @@ def __init__(
         stream_function: StreamFunctionDef,
         model_name: str | None = None,
        profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ) -> None: ...
 
     def __init__(
@@ -81,6 +88,7 @@ def __init__(
         stream_function: StreamFunctionDef | None = None,
         model_name: str | None = None,
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize a `FunctionModel`.
 
@@ -91,9 +99,11 @@ def __init__(
             stream_function: The function to call for streamed requests.
             model_name: The name of the model. If not provided, a name is generated from the function names.
             profile: The model profile to use.
+            settings: Model-specific settings that will be used as defaults for this model.
         """
         if function is None and stream_function is None:
             raise TypeError('Either `function` or `stream_function` must be provided')
+        super().__init__(settings=settings)
         self.function = function
         self.stream_function = stream_function
 
```
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/gemini.py
```diff
@@ -133,6 +133,7 @@
         *,
         provider: Literal['google-gla', 'google-vertex'] | Provider[httpx.AsyncClient] = 'google-gla',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize a Gemini model.
 
@@ -142,7 +143,9 @@
                 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`.
                 If not provided, a new provider will be created using the other parameters.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: Default model settings for this model instance.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
         self._provider = provider
 
```
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/google.py
```diff
@@ -151,6 +151,7 @@
         *,
         provider: Literal['google-gla', 'google-vertex'] | Provider[genai.Client] = 'google-gla',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize a Gemini model.
 
@@ -160,7 +161,9 @@
                 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`.
                 If not provided, a new provider will be created using the other parameters.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: The model settings to use. Defaults to None.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
 
         if isinstance(provider, str):
```
3 changes: 3 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/groq.py
```diff
@@ -120,6 +120,7 @@
         *,
         provider: Literal['groq'] | Provider[AsyncGroq] = 'groq',
         profile: ModelProfileSpec | None = None,
+        settings: ModelSettings | None = None,
     ):
         """Initialize a Groq model.
 
@@ -130,7 +131,9 @@ def __init__(
                 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be
                 created using the other parameters.
             profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
+            settings: Model-specific settings that will be used as defaults for this model.
         """
+        super().__init__(settings=settings)
         self._model_name = model_name
 
         if isinstance(provider, str):
```
25 changes: 14 additions & 11 deletions pydantic_ai_slim/pydantic_ai/models/instrumented.py
```diff
@@ -182,23 +182,26 @@ def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]:
 GEN_AI_REQUEST_MODEL_ATTRIBUTE = 'gen_ai.request.model'
 
 
-@dataclass
+@dataclass(init=False)
 class InstrumentedModel(WrapperModel):
     """Model which wraps another model so that requests are instrumented with OpenTelemetry.
 
     See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.
     """
 
-    settings: InstrumentationSettings
-    """Configuration for instrumenting requests."""
+    instrumentation_settings: InstrumentationSettings
+    """Instrumentation settings for this model."""
 
     def __init__(
         self,
         wrapped: Model | KnownModelName,
         options: InstrumentationSettings | None = None,
     ) -> None:
         super().__init__(wrapped)
-        self.settings = options or InstrumentationSettings()
+        # Store instrumentation settings separately from model settings
```
> **Review comment (Contributor):** I don't think we need this comment or the next one, with the different field names it's clear enough what's happening.

```diff
+        self.instrumentation_settings = options or InstrumentationSettings()
+        # Initialize base Model with no settings to avoid storing InstrumentationSettings there
+        Model.__init__(self, settings=None)
```
> **Review comment (Contributor):** If we make `_settings` a field on `Model` with default value `None` (like `_profile` currently is), we can skip this init.


```diff
     async def request(
         self,
@@ -260,7 +263,7 @@ def _instrument(
 
         record_metrics: Callable[[], None] | None = None
         try:
-            with self.settings.tracer.start_as_current_span(span_name, attributes=attributes) as span:
+            with self.instrumentation_settings.tracer.start_as_current_span(span_name, attributes=attributes) as span:
 
                 def finish(response: ModelResponse):
                     # FallbackModel updates these span attributes.
@@ -278,12 +281,12 @@ def _record_metrics():
                         'gen_ai.response.model': response_model,
                     }
                     if response.usage.request_tokens:  # pragma: no branch
-                        self.settings.tokens_histogram.record(
+                        self.instrumentation_settings.tokens_histogram.record(
                             response.usage.request_tokens,
                             {**metric_attributes, 'gen_ai.token.type': 'input'},
                         )
                     if response.usage.response_tokens:  # pragma: no branch
-                        self.settings.tokens_histogram.record(
+                        self.instrumentation_settings.tokens_histogram.record(
                             response.usage.response_tokens,
                             {**metric_attributes, 'gen_ai.token.type': 'output'},
                         )
@@ -294,8 +297,8 @@ def _record_metrics():
                 if not span.is_recording():
                     return
 
-                events = self.settings.messages_to_otel_events(messages)
-                for event in self.settings.messages_to_otel_events([response]):
+                events = self.instrumentation_settings.messages_to_otel_events(messages)
+                for event in self.instrumentation_settings.messages_to_otel_events([response]):
                     events.append(
                         Event(
                             'gen_ai.choice',
@@ -328,9 +331,9 @@ def _record_metrics():
                 record_metrics()
 
     def _emit_events(self, span: Span, events: list[Event]) -> None:
-        if self.settings.event_mode == 'logs':
+        if self.instrumentation_settings.event_mode == 'logs':
             for event in events:
-                self.settings.event_logger.emit(event)
+                self.instrumentation_settings.event_logger.emit(event)
         else:
             attr_name = 'events'
             span.set_attributes(
```
4 changes: 4 additions & 0 deletions pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py
```diff
@@ -42,6 +42,10 @@ class MCPSamplingModel(Model):
     [`ModelSettings`][pydantic_ai.settings.ModelSettings], so this value is used as fallback.
     """
 
+    def __post_init__(self):
+        """Initialize the base Model class."""
+        super().__init__()
```
> **Review comment (Contributor):** We can likely also remove this if `_settings: ModelSettings | None = None` becomes a `Model` field.


```diff
     async def request(
         self,
         messages: list[ModelMessage],
```
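
Several review comments above converge on the same refactor. A rough sketch of that suggestion (class shapes here are assumptions, not code from this PR): make `_settings` a defaulted attribute on `Model`, exposed through a `settings` property that `WrapperModel` overrides to delegate to the wrapped model:

```python
from abc import ABC

from pydantic_ai.settings import ModelSettings

class Model(ABC):
    """Sketch: class-level default, like `_profile` currently is."""

    _settings: ModelSettings | None = None

    @property
    def settings(self) -> ModelSettings | None:
        return self._settings

class WrapperModel(Model):
    def __init__(self, wrapped: Model):
        self.wrapped = wrapped

    @property
    def settings(self) -> ModelSettings | None:
        # Delegates to the wrapped model, removing the special-casing in agent.py
        return self.wrapped.settings
```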