Add LiteLLM chat and embedding model providers. #2026
Conversation
Just adding a note here that this PR should close #1575.
Pull Request Overview
This pull request adds LiteLLM as a new provider for both chat and embedding models in GraphRAG, significantly expanding the library's model support to include many additional providers through LiteLLM's unified interface.
- Implements LiteLLM chat and embedding model classes with comprehensive request wrappers
- Adds robust retry mechanisms with multiple strategies (native, exponential backoff, random wait, incremental wait); see the sketch after this list
- Implements static rate limiting with RPM/TPM controls and threading support
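For readers unfamiliar with the retry strategies listed above, here is a minimal sketch of the exponential-backoff-with-jitter idea in isolation. It is illustrative only: the function name, defaults, and structure are assumptions made for this example, not the request wrappers implemented in this PR.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_exponential_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> T:
    """Call `func`, retrying failures with exponentially growing, jittered waits."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            # Double the wait on each attempt, cap it, and add jitter so that
            # concurrent callers do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** (attempt - 1)))
            time.sleep(delay + random.uniform(0, base_delay))
    raise AssertionError("unreachable")
```

The other listed strategies presumably differ only in how the wait between attempts is computed (random or incremental) or in delegating retries elsewhere (native).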
Reviewed Changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| pyproject.toml | Adds the litellm dependency to the project |
| graphrag/config/enums.py | Adds new LitellmChat and LitellmEmbedding model types |
| graphrag/config/models/language_model_config.py | Extends configuration to support LiteLLM with a model_provider field and validation (see the sketch below the table) |
| graphrag/config/defaults.py | Adds a model_provider default configuration |
| graphrag/language_model/factory.py | Registers the new LiteLLM model types in the factory |
| graphrag/language_model/providers/litellm/ | Complete LiteLLM implementation, including chat/embedding models, retry services, rate limiting, and request wrappers |
| tests/unit/litellm_services/ | Comprehensive test suite covering the retry mechanisms and rate-limiting functionality |
| dictionary.txt | Adds litellm to the project dictionary |
| .semversioner/next-release/ | Marks this as a minor version release |
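As background on how LiteLLM's unified interface is typically used (this is LiteLLM's public API, not code from this PR): LiteLLM selects the backing provider from a "provider/model" prefix on the model name, which is plausibly what the new model_provider configuration field feeds into. The pairing below is an assumption for illustration; the PR's actual wiring may differ.

```python
import litellm


def chat_once(model_provider: str, model: str, prompt: str) -> str:
    """Send one chat completion through LiteLLM's provider-agnostic interface."""
    # LiteLLM routes the request based on the "<provider>/<model>" prefix.
    response = litellm.completion(
        model=f"{model_provider}/{model}",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Example usage (requires the provider's API key in the environment):
# print(chat_once("openai", "gpt-4o-mini", "Say hello."))
```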
Resolved review threads (outdated):
- graphrag/language_model/providers/litellm/services/rate_limiter/rate_limiter.py
- graphrag/language_model/providers/litellm/request_wrappers/with_retries.py (two threads)
```python
if model_config.api_base:
    base_args["api_base"] = model_config.api_base
if model_config.api_version:
    base_args["api_version"] = model_config.api_version
if model_config.api_key:
    base_args["api_key"] = model_config.api_key
if model_config.organization:
    base_args["organization"] = model_config.organization
if model_config.proxy:
    base_args["proxy"] = model_config.proxy
if model_config.audience:
    base_args["audience"] = model_config.audience
```
Could this be replaced by something like the following?

```python
base_args.update({
    "api_base": model_config.api_base,
    ...
})
```
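One subtle difference worth flagging (my own note, not from the review thread): a plain update would copy None for any unset field, whereas the original if checks only copy values that are set. A compact way to keep that behavior, reusing the base_args and model_config names from the excerpt above, could look like this:

```python
optional_args = {
    "api_base": model_config.api_base,
    "api_version": model_config.api_version,
    "api_key": model_config.api_key,
    "organization": model_config.organization,
    "proxy": model_config.proxy,
    "audience": model_config.audience,
}
# Copy only the values that are actually set, mirroring the original `if` checks.
base_args.update({key: value for key, value in optional_args.items() if value})
```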
Closing in favor of #2051