Skip to content

Commit 852648e

Browse files
committed
Implement standardized provider interface and registry
Addresses issue #1534 by creating: - Provider protocol and base class implementation - Provider registry system for registration and discovery - Sample implementation for OpenAI provider - Documentation for provider system This is the first step toward reorganizing the provider architecture.
1 parent 2581e38 commit 852648e

File tree

5 files changed

+458
-0
lines changed

5 files changed

+458
-0
lines changed

docs/concepts/providers.md

Lines changed: 137 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,137 @@
1+
# Provider System
2+
3+
Instructor supports multiple Large Language Model (LLM) providers through a standardized provider interface. This documentation explains how the provider system works and how to implement new providers.
4+
5+
## Provider Interface
6+
7+
All providers in Instructor implement a common interface that defines the expected behavior and capabilities. This interface is defined by the `ProviderProtocol` in `instructor.providers.base`.
8+
9+
```python
10+
from instructor.providers.base import ProviderProtocol
11+
```
12+
13+
The protocol requires the following properties and methods:
14+
15+
- `supported_modes`: List of `Mode` values supported by the provider
16+
- `capabilities()`: Dictionary of provider capabilities (streaming, function_calling, etc.)
17+
- `create_client()`: Factory method to create an Instructor client for this provider
18+
- `create()`: Method to create structured output from messages
19+
- `create_stream()`: Method to stream structured output from messages
20+
21+
## Base Provider Implementation
22+
23+
A base implementation of the provider interface is available in `ProviderBase`, which handles common functionality like mode validation:
24+
25+
```python
26+
from instructor.providers.base import ProviderBase
27+
```
28+
29+
## Using Provider Registry
30+
31+
Providers register themselves with the registry using the `register_provider` decorator:
32+
33+
```python
34+
from instructor.providers import register_provider
35+
36+
@register_provider("openai")
37+
class OpenAIProvider(ProviderBase):
38+
# Implementation...
39+
```
40+
41+
You can list all available providers and get a specific provider:
42+
43+
```python
44+
from instructor.providers import list_providers, get_provider
45+
46+
# List all providers
47+
providers = list_providers()
48+
print(providers) # ['openai', 'anthropic', ...]
49+
50+
# Get a specific provider
51+
openai_provider = get_provider("openai")
52+
```
53+
54+
## Creating a New Provider
55+
56+
To create a new provider, implement the `ProviderBase` class and register it:
57+
58+
```python
59+
from typing import List, Dict, Any, Type, Union, Iterator, Optional
60+
from pydantic import BaseModel
61+
62+
from instructor.mode import Mode
63+
from instructor.client import Instructor, AsyncInstructor
64+
from instructor.providers import register_provider
65+
from instructor.providers.base import ProviderBase
66+
67+
68+
@register_provider("my_provider")
69+
class MyProvider(ProviderBase):
70+
"""My custom LLM provider."""
71+
72+
_supported_modes = [Mode.JSON]
73+
74+
@classmethod
75+
def create_client(
76+
cls,
77+
client: Any,
78+
provider_id: Optional[str] = None,
79+
**kwargs
80+
) -> Union[Instructor, AsyncInstructor]:
81+
# Implementation...
82+
83+
def create(
84+
self,
85+
response_model: Type[BaseModel],
86+
messages: List[Dict[str, Any]],
87+
mode: Mode,
88+
**kwargs
89+
) -> BaseModel:
90+
# Implementation...
91+
92+
def create_stream(
93+
self,
94+
response_model: Type[BaseModel],
95+
messages: List[Dict[str, Any]],
96+
mode: Mode,
97+
**kwargs
98+
) -> Iterator[BaseModel]:
99+
# Implementation...
100+
```
101+
102+
## Provider Capabilities
103+
104+
The `capabilities()` method returns a dictionary of provider capabilities:
105+
106+
```python
107+
{
108+
"streaming": True, # Supports streaming responses
109+
"function_calling": True, # Supports function calling
110+
"async": True, # Supports async operations
111+
"multimodal": False # Supports multimodal inputs
112+
}
113+
```
114+
115+
## Mode Support
116+
117+
Each provider indicates which modes it supports through the `supported_modes` property. Common modes include:
118+
119+
- `Mode.JSON`: Provider uses JSON mode for output format
120+
- `Mode.TOOLS`: Provider uses tools/functions for output format
121+
- `Mode.MARKDOWN`: Provider uses markdown for output format
122+
123+
Providers validate that requested modes are supported:
124+
125+
```python
126+
def validate_mode(self, mode: Mode) -> None:
127+
if mode not in self.supported_modes:
128+
supported_str = ", ".join(str(m) for m in self.supported_modes)
129+
raise ValueError(
130+
f"Mode {mode} not supported by this provider. "
131+
f"Supported modes: {supported_str}"
132+
)
133+
```
134+
135+
## Provider-Specific Arguments
136+
137+
Providers can accept additional arguments through `**kwargs` in the `create_client()`, `create()`, and `create_stream()` methods.

instructor/providers/__init__.py

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
"""
2+
Provider registry and interface for Instructor.
3+
4+
This module defines the provider protocol and registry for Instructor.
5+
Providers should be registered here to be discoverable by the auto_client.
6+
"""
7+
8+
from typing import Optional, TypeVar
9+
10+
# Type for provider base class
11+
T = TypeVar("T", bound="ProviderBase")
12+
13+
# Global registry of providers
14+
_provider_registry: dict[str, type[T]] = {}
15+
16+
17+
def register_provider(name: str):
18+
"""
19+
Decorator to register a provider class.
20+
21+
Args:
22+
name: Unique provider identifier
23+
"""
24+
25+
def decorator(cls: type[T]) -> type[T]:
26+
_provider_registry[name] = cls
27+
return cls
28+
29+
return decorator
30+
31+
32+
def get_provider(name: str) -> Optional[type[T]]:
33+
"""
34+
Get provider by name.
35+
36+
Args:
37+
name: Provider name
38+
39+
Returns:
40+
Provider class or None if not found
41+
"""
42+
return _provider_registry.get(name)
43+
44+
45+
def list_providers() -> list[str]:
46+
"""
47+
List all registered providers.
48+
49+
Returns:
50+
List of provider names
51+
"""
52+
return list(_provider_registry.keys())

instructor/providers/base.py

Lines changed: 129 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,129 @@
1+
"""
2+
Base provider interfaces for Instructor.
3+
4+
This module defines the base provider protocols and classes that all providers should implement.
5+
"""
6+
7+
from abc import abstractmethod
8+
from typing import Protocol, TypeVar, Generic, Any, Optional, Union, ClassVar
9+
from collections.abc import Iterator
10+
from pydantic import BaseModel
11+
12+
from ..mode import Mode
13+
from ..client import Instructor, AsyncInstructor
14+
15+
16+
# Type variable for response model
17+
T = TypeVar("T", bound=BaseModel)
18+
19+
20+
class ProviderProtocol(Protocol, Generic[T]):
21+
"""Protocol defining what all providers must implement."""
22+
23+
@property
24+
def supported_modes(self) -> list[Mode]:
25+
"""Modes supported by this provider."""
26+
...
27+
28+
@classmethod
29+
def capabilities(cls) -> dict[str, bool]:
30+
"""Provider capabilities (streaming, function_calling, etc.)"""
31+
...
32+
33+
@classmethod
34+
def create_client(
35+
cls, client: Any, provider_id: Optional[str] = None, **kwargs
36+
) -> Union[Instructor, AsyncInstructor]:
37+
"""Create an Instructor client for this provider."""
38+
...
39+
40+
def create(
41+
self,
42+
response_model: type[T],
43+
messages: list[dict[str, Any]],
44+
mode: Mode,
45+
**kwargs,
46+
) -> T:
47+
"""Create structured output from messages."""
48+
...
49+
50+
def create_stream(
51+
self,
52+
response_model: type[T],
53+
messages: list[dict[str, Any]],
54+
mode: Mode,
55+
**kwargs,
56+
) -> Iterator[T]:
57+
"""Stream structured output from messages."""
58+
...
59+
60+
61+
class ProviderBase(Generic[T]):
62+
"""Base implementation for providers with common functionality."""
63+
64+
# Class variables that should be overridden by subclasses
65+
_supported_modes: ClassVar[list[Mode]] = []
66+
67+
@property
68+
def supported_modes(self) -> list[Mode]:
69+
"""Get modes supported by this provider."""
70+
return self._supported_modes
71+
72+
@classmethod
73+
def capabilities(cls) -> dict[str, bool]:
74+
"""
75+
Get provider capabilities.
76+
77+
Returns a dictionary with keys:
78+
- streaming: Whether the provider supports streaming
79+
- function_calling: Whether the provider supports function calling
80+
- async: Whether the provider supports async operations
81+
- multimodal: Whether the provider supports multimodal inputs
82+
"""
83+
return {
84+
"streaming": hasattr(cls, "create_stream"),
85+
"function_calling": hasattr(cls, "create_with_functions"),
86+
"async": hasattr(cls, "acreate"),
87+
"multimodal": hasattr(cls, "supports_multimodal")
88+
and cls.supports_multimodal(),
89+
}
90+
91+
@classmethod
92+
def supports_multimodal(cls) -> bool:
93+
"""Whether this provider supports multimodal inputs."""
94+
return False
95+
96+
@classmethod
97+
@abstractmethod
98+
def create_client(
99+
cls, client: Any, provider_id: Optional[str] = None, **kwargs
100+
) -> Union[Instructor, AsyncInstructor]:
101+
"""
102+
Create an Instructor client for this provider.
103+
104+
Args:
105+
client: The provider's native client instance
106+
provider_id: Optional provider identifier (e.g., URL or name)
107+
**kwargs: Additional provider-specific arguments
108+
109+
Returns:
110+
An instance of Instructor or AsyncInstructor
111+
"""
112+
raise NotImplementedError()
113+
114+
def validate_mode(self, mode: Mode) -> None:
115+
"""
116+
Validate that the mode is supported by this provider.
117+
118+
Args:
119+
mode: The mode to validate
120+
121+
Raises:
122+
ValueError: If the mode is not supported
123+
"""
124+
if mode not in self.supported_modes:
125+
supported_str = ", ".join(str(m) for m in self.supported_modes)
126+
raise ValueError(
127+
f"Mode {mode} not supported by this provider. "
128+
f"Supported modes: {supported_str}"
129+
)

0 commit comments

Comments
 (0)