Open
Labels: enhancement (New feature or request), priority:low (Low priority. Could take 1+ month to resolve)
Description
Feature Request
Providers like OpenAI impose rate limits (for example, a maximum number of requests per minute).
This feature would let llm studio wait out the limit (or keep retrying) when necessary, so that the call succeeds instead of erroring, even if it takes longer.
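A minimal sketch of the "wait it out / keep trying" behavior, assuming a hypothetical `RateLimitError` standing in for the provider's HTTP 429 response (names and parameters here are illustrative, not part of any existing llm studio API):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for a provider's HTTP 429 rate-limit error (hypothetical name)."""


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` and retry on rate-limit errors with exponential backoff.

    Waits 1s, 2s, 4s, ... plus random jitter so that parallel callers
    do not all retry at the same instant. Re-raises after `max_retries`.
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

The `sleep` parameter is injected only to make the sketch testable; in practice the defaults would apply.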
Advanced feature:
If it were aware of the user's exact rate limit (which depends, for example, on their OpenAI tier), it could also decide which prompts to send at what time, maximizing throughput under the limit without overstepping it (in cases where requests run in parallel).
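The advanced feature could be sketched as a sliding-window scheduler: before each request, it checks how many requests were sent in the last minute and sleeps just long enough to stay under the limit. This is an assumption about how the feature might work, not an existing implementation; the `requests_per_minute` value would come from the user's provider tier.

```python
import collections
import time


class RequestScheduler:
    """Spread requests so a known requests-per-minute limit is never exceeded."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.limit = requests_per_minute
        self.clock = clock
        self.sleep = sleep
        self.sent = collections.deque()  # timestamps of requests in the last 60 s

    def acquire(self):
        """Block until one more request fits inside the 60-second window."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            # Sleep until the oldest request in the window expires.
            self.sleep(60 - (now - self.sent[0]))
            now = self.clock()
            while self.sent and now - self.sent[0] >= 60:
                self.sent.popleft()
        self.sent.append(self.clock())
```

Each worker would call `acquire()` before sending its prompt; `clock` and `sleep` are injectable only so the behavior can be tested without real waiting.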
Motivation
Adds robustness to LLM calls. The user does not need to worry about their application breaking when making too many requests per minute.
Your contribution
Discussion