aksg87
Collaborator

@aksg87 aksg87 commented Aug 8, 2025

Description

Introduces a provider registry system enabling third-party providers to be dynamically registered and discovered through a plugin architecture. Users can now integrate custom LLM backends (Azure OpenAI, AWS Bedrock, custom inference servers) without modifying core LangExtract code.

Fixes #80, #67, #54, #49, #48, #53

Feature

How Has This Been Tested?

$ tox -e py310
$ pytest tests/registry_test.py -v
$ pytest tests/factory_test.py::FactoryTest -v
$ python examples/custom_provider_plugin/test_example_provider.py

Verified backward compatibility with existing providers and tested custom plugin with entry points.

Checklist:

  • I have read and acknowledged Google's Open Source Code of conduct.
  • I have read the Contributing page, and I either signed the Google Individual CLA or am covered by my company's Corporate CLA.
  • I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
  • I have made any needed documentation changes, or noted in the linked issue(s) that documentation elsewhere needs updating.
  • I have added tests, or I have ensured existing tests cover the changes.
  • I have followed Google's Python Style Guide and ran `pylint` over the affected code.

Key Changes

Provider Registry (`langextract/providers/registry.py`)

  • Pattern-based registration with priority resolution
  • Automatic discovery via Python entry points
  • Lazy loading for performance
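
A minimal sketch of how pattern-based registration with priority resolution could work. The names `register`, `resolve`, and the two provider classes are illustrative assumptions, not the code in `langextract/providers/registry.py`; the tie-breaking rule is likewise an assumption.

```python
import re

# Hypothetical in-memory registry of (compiled pattern, priority, provider class).
_ENTRIES = []


def register(*patterns, priority=0):
    """Class decorator mapping model-ID regex patterns to a provider class."""
    def decorator(cls):
        for pattern in patterns:
            _ENTRIES.append((re.compile(pattern), priority, cls))
        return cls
    return decorator


def resolve(model_id):
    """Return the matching provider class with the highest priority."""
    matches = [(prio, cls) for pat, prio, cls in _ENTRIES if pat.search(model_id)]
    if not matches:
        raise ValueError(f"No provider registered for model_id={model_id!r}")
    # max() returns the first maximal item, so ties go to the earliest registration.
    return max(matches, key=lambda m: m[0])[1]


@register(r"^gemini", priority=10)
class BuiltInProvider:
    """Stand-in for a built-in provider."""


@register(r"^gemini-custom", priority=20)
class CustomProvider:
    """Stand-in for a third-party provider with a more specific pattern."""
```

Here `resolve("gemini-custom-1")` matches both patterns, and the higher priority selects `CustomProvider`.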

Factory Enhancements (`langextract/factory.py`)

  • `ModelConfig` dataclass for structured configuration
  • Explicit provider selection when patterns overlap
  • Full backward compatibility maintained
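
As a rough illustration of structured configuration with explicit provider selection, the sketch below uses a small dataclass and a toy factory. The field names, `create_model`, and the two provider classes are assumptions for illustration only and may differ from what `langextract/factory.py` defines.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass(frozen=True)
class ModelConfig:
    """Illustrative config; field names are assumptions, not the PR's definition."""
    model_id: str
    provider: Optional[str] = None  # explicit provider name; skips pattern matching
    provider_kwargs: dict = dataclasses.field(default_factory=dict)


class ProviderA:
    def __init__(self, model_id, **kwargs):
        self.model_id = model_id
        self.kwargs = kwargs


class ProviderB(ProviderA):
    pass


# Both hypothetical providers claim the "gpt" prefix, so their patterns overlap.
_BY_PREFIX = [("gpt", ProviderA), ("gpt", ProviderB)]
_BY_NAME = {cls.__name__: cls for _, cls in _BY_PREFIX}


def create_model(config):
    """Instantiate a provider, preferring the explicitly named one if given."""
    if config.provider is not None:
        cls = _BY_NAME[config.provider]
    else:
        candidates = [c for prefix, c in _BY_PREFIX if config.model_id.startswith(prefix)]
        if not candidates:
            raise ValueError(f"No provider matches {config.model_id!r}")
        cls = candidates[0]  # first registered match wins in this sketch
    return cls(config.model_id, **config.provider_kwargs)
```

Without `provider`, the first matching class is used; setting `provider="ProviderB"` removes the ambiguity when patterns overlap.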

Plugin Example (`examples/custom_provider_plugin/`)

  • Complete working example with entry point configuration
  • Shows how to create custom providers for any backend

Documentation

  • Comprehensive provider system README with architecture diagrams
  • Step-by-step plugin creation guide

Breaking Changes

No breaking changes are anticipated; full backward compatibility is maintained. Because provider loading changed significantly internally, please report any unexpected behavior.

@github-actions github-actions bot added the size/XL Pull request with over 1000 lines changed - too large label Aug 8, 2025
@aksg87 aksg87 force-pushed the feat/provider-registry-infrastructure branch from f7b1c67 to bb67654 Compare August 8, 2025 10:33
**Dependencies**
- Move openai to optional dependencies
- Update tox.ini to include openai in test environments
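
One common way to handle an optional dependency is to import it only when needed and fail with an actionable message. The helper `require_optional` and the `langextract[openai]` extra name in the usage note are assumptions for illustration, not confirmed by this PR.

```python
import importlib


def require_optional(module_name, package, extra):
    """Import an optional module, or raise with an install hint if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this provider. "
            f"Install it with: pip install '{package}[{extra}]'"
        ) from exc
```

A provider could call `require_optional("openai", "langextract", "openai")` inside its constructor, so users who never select that provider do not need the package installed.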

**Lint Fixes**
- Add appropriate pylint suppressions for legitimate patterns
- Fix unused variable warnings in tests
- Address import and global statement warnings

@aksg87 aksg87 force-pushed the feat/provider-registry-infrastructure branch from bb67654 to 489db72 Compare August 8, 2025 10:45
@aksg87 aksg87 added the ready-to-merge Triggers live API tests for PRs from forks label Aug 8, 2025
github-actions bot commented Aug 8, 2025

❌ Live API tests failed. Please check the workflow logs for details.

@aksg87 aksg87 closed this Aug 8, 2025
@aksg87 aksg87 deleted the feat/provider-registry-infrastructure branch August 8, 2025 10:50
Development

Successfully merging this pull request may close these issues.

no support for OpenAI compatible model, with params of base_url, model_id, api_key
