Add tenacity utilities/integration for improved retry handling #2282

Merged
merged 6 commits into from
Jul 25, 2025
3 changes: 3 additions & 0 deletions docs/api/retries.md
@@ -0,0 +1,3 @@
# `pydantic_ai.retries`

::: pydantic_ai.retries
338 changes: 338 additions & 0 deletions docs/retries.md
@@ -0,0 +1,338 @@
# HTTP Request Retries

Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports.
This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors.

## Overview

The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates
seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.

## Installation

To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group:

```bash
pip/uv-add 'pydantic-ai-slim[retries]'
```

## Usage Example

The following example builds an HTTP client that retries on selected status codes and connection errors, honoring `Retry-After` headers when they are present:

```python {title="smart_retry_example.py"}
from httpx import AsyncClient, HTTPStatusError
from tenacity import (
    AsyncRetrying,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type
)
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after
from pydantic_ai.providers.openai import OpenAIProvider

def create_retrying_client():
    """Create a client with smart retry handling for multiple error types."""

    def should_retry_status(response):
        """Raise exceptions for retryable HTTP status codes."""
        if response.status_code in (429, 502, 503, 504):
            response.raise_for_status()  # This will raise HTTPStatusError

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # Retry on HTTP errors and connection issues
            retry=retry_if_exception_type((HTTPStatusError, ConnectionError)),
            # Smart waiting: respects Retry-After headers, falls back to exponential backoff
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300
            ),
            # Stop after 5 attempts
            stop=stop_after_attempt(5),
            # Re-raise the last exception if all retries fail
            reraise=True
        ),
        validate_response=should_retry_status
    )
    return AsyncClient(transport=transport)

# Use the retrying client with a model
client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

## Wait Strategies

### wait_retry_after

The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers:

```python {title="wait_strategy_example.py"}
from pydantic_ai.retries import wait_retry_after
from tenacity import wait_exponential

# Basic usage - respects Retry-After headers, falls back to exponential backoff
wait_strategy_1 = wait_retry_after()

# Custom configuration
wait_strategy_2 = wait_retry_after(
    fallback_strategy=wait_exponential(multiplier=2, max=120),
    max_wait=600  # Never wait more than 10 minutes
)
```

This wait strategy:
- Automatically parses `Retry-After` headers from HTTP 429 responses
- Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`)
- Falls back to your chosen strategy when no header is present
- Respects the `max_wait` limit to prevent excessive delays
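
To make the two header formats concrete, here is a minimal standard-library sketch of how a `Retry-After` value could be turned into a capped delay. It is not the `pydantic_ai.retries` implementation; the function name `parse_retry_after` and its `now` parameter are hypothetical and exist only for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: str, now: datetime, max_wait: float = 300.0) -> Optional[float]:
    """Return the seconds to wait, or None if the header value is unusable.

    Hypothetical helper for illustration; not part of pydantic_ai.
    """
    value = value.strip()
    if value.isdigit():
        # Seconds format, e.g. "30"
        delay = float(value)
    else:
        try:
            # HTTP date format, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            target = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None  # A caller would use its fallback strategy here
        if target.tzinfo is None:
            # Treat a date without zone information as UTC
            target = target.replace(tzinfo=timezone.utc)
        delay = (target - now).total_seconds()
    # Never negative, never longer than the cap
    return min(max(delay, 0.0), max_wait)


now = datetime(2015, 10, 21, 7, 27, 30, tzinfo=timezone.utc)
seconds_form = parse_retry_after("30", now)
date_form = parse_retry_after("Wed, 21 Oct 2015 07:28:00 GMT", now)
capped = parse_retry_after("900", now, max_wait=300.0)
unusable = parse_retry_after("soon", now)
```

Returning `None` for an unparseable value mirrors the "fall back to your chosen strategy" behavior described above, and passing `now` explicitly keeps the sketch deterministic.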

## Transport Classes

### AsyncTenacityTransport

For asynchronous HTTP clients (recommended for most use cases):

```python {title="async_transport_example.py"}
from httpx import AsyncClient
from tenacity import AsyncRetrying, stop_after_attempt
from pydantic_ai.retries import AsyncTenacityTransport

# Create the basic components
async_retrying = AsyncRetrying(stop=stop_after_attempt(3), reraise=True)

def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.
    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = AsyncTenacityTransport(
    controller=async_retrying,   # AsyncRetrying instance
    validate_response=validator  # Optional response validator
)

# Create a client using the transport:
client = AsyncClient(transport=transport)
```

### TenacityTransport

For synchronous HTTP clients:

```python {title="sync_transport_example.py"}
from httpx import Client
from tenacity import Retrying, stop_after_attempt
from pydantic_ai.retries import TenacityTransport

# Create the basic components
retrying = Retrying(stop=stop_after_attempt(3), reraise=True)

def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.
    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = TenacityTransport(
    controller=retrying,         # Retrying instance
    validate_response=validator  # Optional response validator
)

# Create a client using the transport
client = Client(transport=transport)
```

## Common Retry Patterns

### Rate Limit Handling with Retry-After Support

```python {title="rate_limit_handling.py"}
from httpx import AsyncClient, HTTPStatusError
from tenacity import AsyncRetrying, stop_after_attempt, retry_if_exception_type, wait_exponential
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after

def create_rate_limit_client():
    """Create a client that respects Retry-After headers from rate limiting responses."""
    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            retry=retry_if_exception_type(HTTPStatusError),
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300  # Don't wait more than 5 minutes
            ),
            stop=stop_after_attempt(10),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()  # Raises HTTPStatusError for 4xx/5xx
    )
    return AsyncClient(transport=transport)

# Example usage
client = create_rate_limit_client()
# Client is now ready to use with any HTTP requests and will respect Retry-After headers
```

The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff.

### Network Error Handling

```python {title="network_error_handling.py"}
import httpx
from tenacity import AsyncRetrying, retry_if_exception_type, wait_exponential, stop_after_attempt
from pydantic_ai.retries import AsyncTenacityTransport

def create_network_resilient_client():
    """Create a client that handles network errors with retries."""
    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            retry=retry_if_exception_type((
                httpx.TimeoutException,
                httpx.ConnectError,
                httpx.ReadError
            )),
            wait=wait_exponential(multiplier=1, max=10),
            stop=stop_after_attempt(3),
            reraise=True
        )
    )
    return httpx.AsyncClient(transport=transport)

# Example usage
client = create_network_resilient_client()
# Client will now retry on timeout, connection, and read errors
```

### Custom Retry Logic

```python {title="custom_retry_logic.py"}
import httpx
from tenacity import AsyncRetrying, retry_if_exception, wait_exponential, stop_after_attempt
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after

def create_custom_retry_client():
    """Create a client with custom retry logic."""
    def custom_retry_condition(exception):
        """Custom logic to determine if we should retry."""
        if isinstance(exception, httpx.HTTPStatusError):
            # Retry on server errors but not client errors
            return 500 <= exception.response.status_code < 600
        return isinstance(exception, (httpx.TimeoutException, httpx.ConnectError))

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # tenacity passes a retry state, not the exception, to `retry`;
            # retry_if_exception adapts a predicate over the raised exception
            retry=retry_if_exception(custom_retry_condition),
            # Honor Retry-After headers when present,
            # with custom exponential backoff as fallback
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=2, max=30),
                max_wait=120
            ),
            stop=stop_after_attempt(5),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()
    )
    return httpx.AsyncClient(transport=transport)

client = create_custom_retry_client()
# Client will retry server errors (5xx) and network errors, but not client errors (4xx)
```

## Using with Different Providers

The retry transports work with any provider that accepts a custom HTTP client:

### OpenAI

```python {title="openai_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

### Anthropic

```python {title="anthropic_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = AnthropicModel('claude-3-5-sonnet-20241022', provider=AnthropicProvider(http_client=client))
agent = Agent(model)
```

### Any OpenAI-Compatible Provider

```python {title="openai_compatible_with_retries.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel(
    'your-model-name',  # Replace with actual model name
    provider=OpenAIProvider(
        base_url='https://api.example.com/v1',  # Replace with actual API URL
        api_key='your-api-key',  # Replace with actual API key
        http_client=client
    )
)
agent = Agent(model)
```

## Best Practices

1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times.

2. **Use Exponential Backoff**: This helps avoid overwhelming servers during outages.

3. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times.

4. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible.

5. **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.)

6. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns.
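
As a rough illustration of the circuit breaker idea from the last item, the sketch below stops sending requests after a run of consecutive failures and allows a trial request once a cool-down has elapsed. The `CircuitBreaker` class, its parameters, and the injectable `clock` are all hypothetical; neither `pydantic_ai` nor tenacity is known to provide this class, and a production breaker would also need to be safe under concurrent use.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical minimal circuit breaker, for illustration only."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return True if a request may be sent right now."""
        if self._opened_at is None:
            return True  # Closed: traffic flows normally
        # Open: only allow a trial request once the cool-down has elapsed
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the breaker and forget earlier failures."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


# A fake clock keeps the demonstration deterministic
fake_time = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: fake_time[0])
breaker.record_failure()
breaker.record_failure()
blocked = not breaker.allow_request()
fake_time[0] = 10.0
trial_allowed = breaker.allow_request()
```

A breaker like this would sit in front of the retrying client: check `allow_request()` before each call, then report the outcome with `record_success()` or `record_failure()`.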

## Error Handling

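When the controller is configured with `reraise=True`, as in the examples in this document, the retry transports re-raise the last exception once all retry attempts fail (without it, tenacity raises its own `RetryError` instead). Make sure to handle these exceptions appropriately in your application: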

```python {title="error_handling_example.py" requires="smart_retry_example.py"}
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```
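
The re-raise behavior can be sketched without any third-party dependency. The helper below is a hypothetical stand-in that mimics a controller with a fixed attempt limit and `reraise=True`; it is not how the transports are implemented, and it skips waiting between attempts. Which exception type actually reaches your code depends on your retry configuration and on how the provider wraps transport errors, which this sketch does not cover.

```python
def call_with_retries(func, max_attempts=3):
    """Hypothetical helper: retry on ConnectionError, then re-raise the last one.

    Assumes max_attempts >= 1.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # Out of attempts: propagate the last exception unchanged


calls = []


def always_down():
    """Simulate a service that never recovers."""
    calls.append(1)
    raise ConnectionError("service unavailable")


# The caller is responsible for handling the exception that survives all retries
try:
    call_with_retries(always_down, max_attempts=3)
    outcome = "ok"
except ConnectionError as exc:
    outcome = f"gave up after {len(calls)} attempts: {exc}"
```

The same shape applies to a real agent call: wrap it in `try`/`except` for the exception types your retry configuration lets through, and decide there whether to surface the error, degrade gracefully, or queue the work for later.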

## Performance Considerations

- Retries add latency to requests, especially with exponential backoff
- Consider the total timeout for your application when configuring retry behavior
- Monitor retry rates to detect systemic issues
- Use async transports for better concurrency when handling multiple requests
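
To budget for total timeout, it can help to estimate the worst-case time spent waiting between attempts. The sketch below assumes a doubling schedule in which the n-th wait is `multiplier * 2 ** (n - 1)`, capped at `max_wait`; that is an assumption about the backoff shape rather than a guarantee about any particular tenacity version. It also ignores the duration of the requests themselves and any longer delay requested through `Retry-After`. The function name `worst_case_backoff` is hypothetical.

```python
def worst_case_backoff(attempts: int, multiplier: float = 1.0, max_wait: float = 60.0) -> float:
    """Sum the waits between attempts under an assumed capped doubling schedule."""
    # There is one wait between each pair of consecutive attempts,
    # so `attempts` tries involve `attempts - 1` waits
    waits = [min(multiplier * 2 ** (n - 1), max_wait) for n in range(1, attempts)]
    return sum(waits)


# 5 attempts, waits of 1 + 2 + 4 + 8 seconds
five_attempts = worst_case_backoff(5, multiplier=1.0, max_wait=60.0)

# 10 attempts, waits of 1 + 2 + 4 + 8 + 16 + 32 + 60 + 60 + 60 seconds
ten_attempts = worst_case_backoff(10, multiplier=1.0, max_wait=60.0)
```

Under these assumptions, five attempts add at most 15 seconds of waiting while ten attempts add up to 243 seconds, which shows how quickly a higher attempt limit can exceed an application-level timeout.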

For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/).
2 changes: 2 additions & 0 deletions mkdocs.yml
@@ -43,6 +43,7 @@ nav:
- thinking.md
- direct.md
- common-tools.md
- retries.md
- MCP:
- mcp/index.md
- mcp/client.md
@@ -101,6 +102,7 @@ nav:
- api/models/mcp-sampling.md
- api/profiles.md
- api/providers.md
- api/retries.md
- api/pydantic_graph/graph.md
- api/pydantic_graph/nodes.md
- api/pydantic_graph/persistence.md