LLM Bridge is a centralized service for managing and proxying API requests to large language models. It supports multiple providers and offers a unified API interface, simplifying the process of using and developing with various models.
- 🚀 Unified API interface compatible with OpenAI's format
- 🔄 Supports both streaming (SSE) and WebSocket connections
- 🛠 Supports multiple popular LLM providers:
  - OpenAI
  - Google Gemini
  - DeepSeek
  - Other providers compatible with the OpenAI format
- 🔌 Flexible proxy configuration
- 📝 Structured JSON logging
- 🔑 API key management and authentication
- 📊 Token counting and usage statistics
- Python 3.8+
- pip
1. Clone the repository:

   ```bash
   git clone https://github.com/Rundao/LLM-Bridge.git
   cd llm-bridge
   ```

2. Install dependencies

   (Optional) Create a conda virtual environment:

   ```bash
   conda create -n llm-bridge python=3.12
   conda activate llm-bridge
   ```

   Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure environment variables

   ```bash
   cp .env.example .env
   ```

   Then edit the `.env` file and fill in the necessary configuration:

   ```env
   ACCESS_API_KEYS=your-access-key-1,your-access-key-2
   CLOSEAI_API_KEY=your-closeai-key
   GEMINI_API_KEY=your-gemini-key
   DEEPSEEK_API_KEY=your-deepseek-key
   ```

   Here, `ACCESS_API_KEYS` is used to authenticate incoming API requests; the other keys are the API keys for the respective providers.

4. Start the service

   ```bash
   cd src && uvicorn main:app --reload --port 1219
   ```

   The service will be available at http://localhost:1219.
Example using curl:
```bash
curl http://localhost:1219/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-access-key" \
  -d '{
    "model": "closeai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```
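If you prefer Python, here is a minimal sketch of the same streaming request using the `requests` library. It assumes the gateway relays OpenAI-style SSE lines (`data: {...}`, terminated by `data: [DONE]`); adjust the parsing if your provider's stream differs.

```python
import json
import requests

API_URL = "http://localhost:1219/v1/chat/completions"
API_KEY = "your-access-key"  # one of your ACCESS_API_KEYS

payload = {
    "model": "closeai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
}

# Stream the response and print content deltas as they arrive.
with requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content", "")
        print(delta, end="", flush=True)
```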
Example using Cherry Studio:
- Click "Settings" in the bottom left corner.
- In "Model Provider", click "Add" and choose "OpenAI" as the provider type.
- Enter one of your `ACCESS_API_KEYS` in the "API Key" field.
- Enter `http://127.0.0.1:1219` in the "API URL" field. Note that some software (such as Cherry Studio) automatically appends `/v1/chat/completions` to the URL, so adjust it to match your setup.
- Click "Manage" to add models.
- Check the connectivity and start using it.
Connect to the WebSocket endpoint at `/v1/ws` for real-time bidirectional communication:
```javascript
const ws = new WebSocket('ws://localhost:1219/v1/ws');

ws.onmessage = function(event) {
    console.log('Received:', event.data);
};

// Wait for the connection to open before sending the request.
ws.onopen = function() {
    ws.send(JSON.stringify({
        type: 'chat',
        api_key: 'your-access-key',
        payload: {
            model: 'closeai/gpt-4o-mini',
            messages: [{role: 'user', content: 'Hello'}]
        }
    }));
};
```
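For a Python client, a minimal sketch using the third-party `websockets` package might look like the following. The request schema mirrors the JavaScript example above; the shape of the server's replies is not specified here, so they are simply printed as received.

```python
import asyncio
import json

import websockets  # pip install websockets


async def chat():
    async with websockets.connect("ws://localhost:1219/v1/ws") as ws:
        # Same message shape as the JavaScript example above.
        await ws.send(json.dumps({
            "type": "chat",
            "api_key": "your-access-key",
            "payload": {
                "model": "closeai/gpt-4o-mini",
                "messages": [{"role": "user", "content": "Hello"}],
            },
        }))
        # Print whatever the server sends back until the connection closes.
        async for message in ws:
            print("Received:", message)


asyncio.run(chat())
```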
Specify the provider by prefixing the model name. For example:
- CloseAI models: `closeai/gpt-4o`, `closeai/gpt-4o-mini`
- Gemini models: `gemini/gemini-2.0-pro-exp-02-05`
- DeepSeek models: `deepseek/deepseek-chat`

You can use the `/v1/models` endpoint to retrieve a complete list of supported models.
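As a quick check, the sketch below queries the models endpoint with `requests`; it assumes the response follows OpenAI's list format (`{"object": "list", "data": [...]}`).

```python
import requests

resp = requests.get(
    "http://localhost:1219/v1/models",
    headers={"Authorization": "Bearer your-access-key"},
)
resp.raise_for_status()

# Assuming an OpenAI-style list response: {"object": "list", "data": [{"id": ...}, ...]}
for model in resp.json().get("data", []):
    print(model.get("id"))
```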
```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Auth
    participant Router
    participant Adapter
    participant LLM

    Client->>Gateway: Send chat request
    Gateway->>Auth: Verify API Key
    Auth-->>Gateway: Return user permissions
    Gateway->>Router: Pass request context
    Router->>Router: Select model based on strategy
    Router->>Adapter: Call corresponding adapter
    Adapter->>Adapter: Normalize request format
    Adapter->>LLM: Async API call
    LLM-->>Adapter: Return raw response
    Adapter->>Adapter: Normalize error handling
    Adapter-->>Router: Return unified format
    Router-->>Gateway: Return processed result
    Gateway->>Gateway: Record audit log
    Gateway-->>Client: Return final response
```
```
llm-bridge/
├── configs/
│   └── config.yaml              # Global configuration
├── src/
│   ├── core/
│   │   ├── gateway/             # FastAPI-based request handlers
│   │   │   ├── http_handler.py  # REST API handler
│   │   │   └── websocket_handler.py
│   │   └── router.py            # Request routing
│   ├── adapters/
│   │   ├── base.py              # Abstract base class
│   │   ├── openai.py            # OpenAI format adapter
│   │   └── gemini.py            # Gemini API adapter
│   ├── infrastructure/
│   │   ├── config.py            # Configuration management
│   │   └── logging.py           # Structured logging
│   └── main.py                  # Service entry point
├── docs/                        # Documentation
├── requirements.txt
└── README.md
```
Configure supported models and their settings in `configs/config.yaml`. Each model can have basic settings and parameter customization through `param_config`:
```yaml
providers:
  closeai:
    base_url: "https://api.openai-proxy.org/v1/chat/completions"
    requires_proxy: false
    models:
      gpt-4o:
        max_tokens: 8192
        timeout: 120
      o3-mini:
        max_tokens: 4096
        timeout: 60
        param_config:
          add_params:
            reasoning_effort: "medium"  # Add new parameter
      deepseek-reasoner:
        max_tokens: 8192
        timeout: 180
        param_config:
          update_params:
            temperature: 0.6  # Update parameter value
```
The `param_config` section supports four types of parameter customization:

- `add_params`: Add new parameters to the request
  - Use case: When a model requires additional parameters not in the standard API
  - Example: Adding a model-specific parameter like `reasoning_effort`

    ```yaml
    add_params:
      reasoning_effort: "medium"  # Values: low, medium, high
    ```

- `update_params`: Modify values of existing parameters
  - Use case: When a model needs specific parameter values for optimal performance
  - Example: Setting a fixed temperature for consistent output

    ```yaml
    update_params:
      temperature: 0.6
    ```

- `rename_params`: Change parameter names
  - Use case: When a model uses different names for standard parameters
  - Example: Renaming `max_tokens` to match the model's API

    ```yaml
    rename_params:
      max_tokens: "max_reasoning_token"
    ```

- `delete_params`: Remove parameters from the request
  - Use case: When certain parameters should not be sent to specific models
  - Example: Removing unsupported parameters

    ```yaml
    delete_params:
      - "presence_penalty"
      - "frequency_penalty"
    ```

Parameter customization is processed in this order: `update_params` → `add_params` → `rename_params` → `delete_params`.
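To make the ordering concrete, here is a small illustrative sketch (not the project's actual code, which lives in the adapters) of how a request body could be transformed according to a `param_config` entry, applying the four operations in the order listed above.

```python
def apply_param_config(params: dict, param_config: dict) -> dict:
    """Illustrative only: apply param_config rules in the documented order."""
    result = dict(params)

    # 1. update_params: overwrite values of existing parameters
    for key, value in param_config.get("update_params", {}).items():
        result[key] = value

    # 2. add_params: add new parameters
    for key, value in param_config.get("add_params", {}).items():
        result.setdefault(key, value)

    # 3. rename_params: move values to provider-specific names
    for old_name, new_name in param_config.get("rename_params", {}).items():
        if old_name in result:
            result[new_name] = result.pop(old_name)

    # 4. delete_params: drop parameters the provider does not accept
    for key in param_config.get("delete_params", []):
        result.pop(key, None)

    return result


# Example: the o3-mini entry from the configuration above
config = {"add_params": {"reasoning_effort": "medium"}}
print(apply_param_config({"max_tokens": 4096, "temperature": 1.0}, config))
```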
Configure logging settings in `configs/config.yaml`:

```yaml
logging:
  format: "json"      # json or text
  output:
    file:
      path: "logs/llm-bridge.log"
      max_size: 10485760   # 10MB
      backup_count: 5
    console: true
  level: "info"       # debug, info, warning, error
```
- Create a new adapter in `src/adapters/` that implements the `ModelAdapter` interface (a minimal sketch is shown below)
- Add the provider configuration to `configs/config.yaml`
- Update the Router class to support the new adapter
- Add the corresponding API key to your `.env` file
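As a starting point, a new adapter might look roughly like the sketch below. The method name, constructor signature, and use of `httpx` are assumptions for illustration only; check `src/adapters/base.py` for the actual `ModelAdapter` interface and follow the existing `openai.py` adapter as the reference.

```python
# src/adapters/myprovider.py -- illustrative sketch only; the real ModelAdapter
# interface is defined in src/adapters/base.py and may differ from this.
import httpx

from .base import ModelAdapter  # assumed import path


class MyProviderAdapter(ModelAdapter):
    """Hypothetical adapter for an OpenAI-compatible provider."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.api_key = api_key

    async def chat_completion(self, payload: dict) -> dict:
        # Forward the (already normalized) request to the upstream API.
        async with httpx.AsyncClient(timeout=120) as client:
            response = await client.post(
                self.base_url,
                headers={"Authorization": f"Bearer {self.api_key}"},
                json=payload,
            )
            response.raise_for_status()
            return response.json()
```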
The service provides standardized error handling:
- 400: Bad Request (invalid parameters)
- 401: Unauthorized (invalid API key)
- 429: Too Many Requests (rate limit exceeded)
- 500: Internal Server Error
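On the client side, a simple way to cope with these statuses is to retry on 429 and fail fast otherwise. The sketch below uses `requests` and makes no assumption about the JSON body of error responses.

```python
import time

import requests


def post_with_retry(url: str, headers: dict, payload: dict, max_retries: int = 3):
    """Retry on 429 (rate limit) with exponential backoff; raise on other errors."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code == 429 and attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # back off before retrying
            continue
        resp.raise_for_status()  # raises for 4xx/5xx, including a final 429
        return resp.json()
```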
MIT License
Contributions are welcome! Please submit your issues and pull requests to help improve the project.