@aksg87 aksg87 commented Aug 8, 2025

Description

Introduces a provider registry system enabling third-party providers to be dynamically registered and discovered through a plugin architecture. Users can now integrate custom LLM backends (Azure OpenAI, AWS Bedrock, custom inference servers) without modifying core LangExtract code.

Feature

How Has This Been Tested?

$ tox -e py310
$ pytest tests/registry_test.py -v
$ pytest tests/factory_test.py::FactoryTest -v
$ python examples/custom_provider_plugin/test_example_provider.py

Verified backward compatibility with existing providers and tested custom plugin with entry points.

Checklist:

  • I have read and acknowledged Google's Open Source Code of conduct.
  • I have read the Contributing page, and I either signed the Google Individual CLA or am covered by my company's Corporate CLA.
  • I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
  • I have made any needed documentation changes, or noted in the linked issue(s) that documentation elsewhere needs updating.
  • I have added tests, or I have ensured existing tests cover the changes.
  • I have followed Google's Python Style Guide and ran `pylint` over the affected code.

Key Changes

Provider Registry (`langextract/providers/registry.py`)

  • Pattern-based registration with priority resolution
  • Automatic discovery via Python entry points
  • Lazy loading for performance
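
To make the three bullets above concrete, here is a minimal sketch of pattern-based registration with priority resolution and lazy loading. It is not the code in `langextract/providers/registry.py`: the names `register`, `register_lazy`, and `resolve`, the regex-based matching, and the tie-breaking rule (latest registration wins among equal priorities) are all assumptions made for illustration.

```python
import importlib
import re
from typing import Callable, List, Tuple

# Hypothetical registry storage: (priority, compiled patterns, loader).
# The loader is a zero-argument callable so a provider class can be
# imported only when it is first needed.
_ENTRIES: List[Tuple[int, List[re.Pattern], Callable[[], type]]] = []


def register(*patterns: str, priority: int = 0):
    """Class decorator that registers a provider for model-ID patterns."""
    compiled = [re.compile(p) for p in patterns]

    def decorator(cls: type) -> type:
        _ENTRIES.append((priority, compiled, lambda: cls))
        return cls

    return decorator


def register_lazy(*patterns: str, target: str, priority: int = 0) -> None:
    """Register a 'module:ClassName' target without importing it yet."""
    module_name, _, class_name = target.partition(":")

    def loader() -> type:
        # The import happens here, on first resolution, not at registration.
        return getattr(importlib.import_module(module_name), class_name)

    _ENTRIES.append((priority, [re.compile(p) for p in patterns], loader))


def resolve(model_id: str) -> type:
    """Return the provider class whose pattern matches with highest priority."""
    matches = [
        (priority, index, loader)
        for index, (priority, patterns, loader) in enumerate(_ENTRIES)
        if any(p.search(model_id) for p in patterns)
    ]
    if not matches:
        raise ValueError(f"No provider registered for model ID {model_id!r}")
    # Assumed tie-break: among equal priorities, the latest registration wins.
    _, _, loader = max(matches, key=lambda m: (m[0], m[1]))
    return loader()


@register(r"^demo-", priority=0)
class GeneralProvider:
    """Illustrative provider that matches any 'demo-' model ID."""


@register(r"^demo-large", priority=10)
class LargeProvider:
    """Illustrative provider that overrides GeneralProvider for 'demo-large'."""
```

In this sketch, `resolve("demo-large-v2")` matches both patterns, and the higher priority selects `LargeProvider`, while `resolve("demo-small")` falls back to `GeneralProvider`.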

Factory Enhancements (`langextract/factory.py`)

  • `ModelConfig` dataclass for structured configuration
  • Explicit provider selection when patterns overlap
  • Full backward compatibility maintained
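
As a rough illustration of explicit provider selection, the sketch below shows how a configuration dataclass might let a caller name a provider directly when several patterns match the same model ID. Only the name `ModelConfig` comes from this PR; its fields (`model_id`, `provider`, `provider_kwargs`) and the `create_model` helper are assumptions, not the actual `langextract/factory.py` API.

```python
import dataclasses
from typing import Any, Callable, Dict, Optional


@dataclasses.dataclass(frozen=True)
class ModelConfig:
    """Hypothetical structured configuration; field names are assumed."""

    model_id: str
    provider: Optional[str] = None  # explicit provider name, if any
    provider_kwargs: Dict[str, Any] = dataclasses.field(default_factory=dict)


def create_model(
    config: ModelConfig,
    providers_by_name: Dict[str, type],
    resolve_by_pattern: Callable[[str], type],
):
    """Instantiate a provider, preferring an explicit name over patterns."""
    if config.provider is not None:
        try:
            provider_cls = providers_by_name[config.provider]
        except KeyError:
            raise ValueError(f"Unknown provider: {config.provider!r}") from None
    else:
        # No explicit choice: fall back to pattern-based resolution.
        provider_cls = resolve_by_pattern(config.model_id)
    return provider_cls(model_id=config.model_id, **config.provider_kwargs)


class ProviderA:
    """Illustrative provider that stores its arguments."""

    def __init__(self, model_id: str, **kwargs: Any) -> None:
        self.model_id = model_id
        self.kwargs = kwargs


class ProviderB(ProviderA):
    """Second illustrative provider claiming the same model IDs."""


PROVIDERS = {"ProviderA": ProviderA, "ProviderB": ProviderB}

# Pattern resolution alone would pick ProviderA here ...
default_model = create_model(
    ModelConfig(model_id="shared-model"), PROVIDERS, lambda _id: ProviderA
)
# ... while an explicit provider name overrides it.
explicit_model = create_model(
    ModelConfig(model_id="shared-model", provider="ProviderB",
                provider_kwargs={"timeout": 30}),
    PROVIDERS,
    lambda _id: ProviderA,
)
```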

Plugin Example (`examples/custom_provider_plugin/`)

  • Complete working example with entry point configuration
  • Shows how to create custom providers for any backend

Documentation

  • Comprehensive provider system README with architecture diagrams
  • Step-by-step plugin creation guide

Breaking Changes

No breakage is anticipated; full backward compatibility is maintained. Because provider loading changed significantly internally, please report any unexpected behavior.

@github-actions github-actions bot added the size/XL label (Pull request with over 1000 lines changed - too large) Aug 8, 2025
@aksg87 aksg87 force-pushed the feat/provider-registry-infrastructure branch from 6f53efd to 392c436 Compare August 8, 2025 11:04
@aksg87 aksg87 added the ready-to-merge label (Triggers live API tests for PRs from forks) Aug 8, 2025
@aksg87 aksg87 force-pushed the feat/provider-registry-infrastructure branch 4 times, most recently from e55f845 to f57ae05 Compare August 8, 2025 11:35
Introduces a provider registry system enabling third-party providers to be dynamically registered and discovered through a plugin architecture. Users can now integrate custom LLM backends (Azure OpenAI, AWS Bedrock, custom inference servers) without modifying core LangExtract code.

Fixes #80, #67, #54, #49, #48, #53

Key Changes:

**Provider Registry** (`langextract/providers/registry.py`)
- Pattern-based registration with priority resolution
- Automatic discovery via Python entry points
- Lazy loading for performance

**Factory Enhancements** (`langextract/factory.py`)
- `ModelConfig` dataclass for structured configuration
- Explicit provider selection when patterns overlap
- Full backward compatibility maintained

**Plugin Example** (`examples/custom_provider_plugin/`)
- Complete working example with entry point configuration
- Shows how to create custom providers for any backend

**Documentation**
- Comprehensive provider system README with architecture diagrams
- Step-by-step plugin creation guide

**Dependencies**
- Move openai to optional dependencies
- Update tox.ini to include openai in test environments
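
Making a package optional usually means the code that needs it must fail with a clear message when it is absent. A minimal sketch of such an import guard follows; the helper `require_optional` and the suggested install command (including the extra name) are assumptions for illustration, not code from this PR.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or explain how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The extra name in the hint is hypothetical; use whatever the
        # package's metadata actually declares.
        raise ImportError(
            f"{module_name!r} is required for this provider. "
            f"Try: pip install 'langextract[{extra}]'"
        ) from exc
```

A provider would call something like `require_optional("openai", "openai")` inside its constructor, so that importing the core package never requires the optional dependency.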

**Lint Fixes**
- Add appropriate pylint suppressions for legitimate patterns
- Fix unused variable warnings in tests
- Address import and global statement warnings

No anticipated breakage - full backward compatibility maintained. Given significant internal changes to provider loading, issues should be reported if unexpected behavior is encountered.
@aksg87 aksg87 force-pushed the feat/provider-registry-infrastructure branch from f57ae05 to 9c06ee7 Compare August 8, 2025 11:37
@google google deleted a comment from github-actions bot Aug 8, 2025
@google google deleted a comment from github-actions bot Aug 8, 2025
@aksg87 aksg87 merged commit 00acc43 into main Aug 8, 2025
12 of 13 checks passed
@aksg87 aksg87 deleted the feat/provider-registry-infrastructure branch August 8, 2025 11:52
github-actions bot commented Aug 8, 2025

No linked issues found. Please add the corresponding issues in the pull request description.
Use GitHub automation to close the issue when a PR is merged

@aksg87 aksg87 self-assigned this Aug 10, 2025
aksg87 added a commit that referenced this pull request Aug 21, 2025
…le (#97)
sinnaj pushed a commit to sinnaj/langextract that referenced this pull request Sep 3, 2025
…le (google#97)
aksg87 added a commit that referenced this pull request Sep 12, 2025
…le (#97)