requests per minute and day limiting #260
Replies: 6 comments 4 replies
-
I want this for another reason: GitHub is banning people for overconsumption when using the GitHub Copilot API. I don't want to get my account suspended 😳
-
Hello, I have the same problem: Anthropic limits me to 50,000 words per minute, and I have to stay under that to avoid an error. Please implement this setting (or make it automatic from known provider limits?).
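To make the request concrete, here is a rough sketch of what a per-minute (or per-day) budget could look like. It is only an illustration in Python, not how Roo Code implements anything: the `WindowLimiter` class, its method names, and the injectable `clock` are all hypothetical, and the 50,000 figure is just the limit mentioned above.

```python
import time
from collections import deque


class WindowLimiter:
    """Sliding-window budget: at most `limit` units per `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable so the limiter can be tested
        self.events = deque()       # (timestamp, cost) pairs, oldest first
        self.used = 0               # total cost currently inside the window

    def _expire(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, cost = self.events.popleft()
            self.used -= cost

    def wait_time(self, cost=1):
        """Seconds to wait before `cost` units fit in the window (0.0 if now)."""
        if cost > self.limit:
            raise ValueError("cost exceeds the window limit")
        now = self.clock()
        self._expire(now)
        if self.used + cost <= self.limit:
            return 0.0
        freed = 0
        for timestamp, event_cost in self.events:
            freed += event_cost
            if self.used - freed + cost <= self.limit:
                return timestamp + self.window - now
        return 0.0  # not reached: expiring everything always frees enough

    def record(self, cost=1):
        """Register a request that was actually sent."""
        self.events.append((self.clock(), cost))
        self.used += cost


# Hypothetical setup: one limiter per window, checked before each request.
per_minute = WindowLimiter(limit=50_000, window_seconds=60)
per_day = WindowLimiter(limit=1_000, window_seconds=86_400)
```

Before sending, a caller would sleep for the larger of the two `wait_time` values and then call `record` on both. The per-day limit of 1,000 is a made-up number for the example.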
-
Does the per-profile rate limiting help enough with this request?
-
Some API responses actually tell you exactly when to retry. Here's a "429 Too Many Requests" error body from Gemini (I cut and pasted from the Roo Code pane, and formatted):
{
  "error": {
    "message": {
      "error": {
        "code": 429,
        "message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",
        "status": "RESOURCE_EXHAUSTED",
        "details": [
          {
            "@type": "type.googleapis.com/google.rpc.QuotaFailure",
            "violations": [
              {
                "quotaMetric": "generativelanguage.googleapis.com/generate_content_paid_tier_input_token_count",
                "quotaId": "GenerateContentPaidTierInputTokensPerModelPerMinute",
                "quotaDimensions": {
                  "location": "global",
                  "model": "gemini-2.5-flash"
                },
                "quotaValue": "1000000"
              }
            ]
          },
          {
            "@type": "type.googleapis.com/google.rpc.Help",
            "links": [
              {
                "description": "Learn more about Gemini API quotas",
                "url": "https://ai.google.dev/gemini-api/docs/rate-limits"
              }
            ]
          },
          {
            "@type": "type.googleapis.com/google.rpc.RetryInfo",
            "retryDelay": "56s"
          }
        ]
      }
    },
    "code": 429,
    "status": "Too Many Requests"
  }
}
Note that it specifically tells you when to retry: the RetryInfo entry in details carries "retryDelay": "56s".
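For illustration, pulling that value out of the error body above could look roughly like this in Python. This is a sketch, not Roo Code's actual handling: the function name `find_retry_delay_seconds` is made up, and it assumes the body has already been decoded into nested dicts and lists. It walks the whole structure instead of hard-coding the path, since the extra error/message wrapping may differ between providers.

```python
import re
from typing import Any, Optional

RETRY_INFO_TYPE = "type.googleapis.com/google.rpc.RetryInfo"


def find_retry_delay_seconds(body: Any) -> Optional[float]:
    """Return the first RetryInfo delay found in a decoded JSON body, in seconds."""
    if isinstance(body, dict):
        if body.get("@type") == RETRY_INFO_TYPE:
            # The delay is a string such as "56s"; accept whole or fractional seconds.
            match = re.fullmatch(r"(\d+(?:\.\d+)?)s", str(body.get("retryDelay", "")))
            return float(match.group(1)) if match else None
        children = list(body.values())
    elif isinstance(body, list):
        children = body
    else:
        return None
    for child in children:
        delay = find_retry_delay_seconds(child)
        if delay is not None:
            return delay
    return None


# Trimmed copy of the error body shown above.
sample = {
    "error": {
        "message": {
            "error": {
                "code": 429,
                "status": "RESOURCE_EXHAUSTED",
                "details": [
                    {"@type": "type.googleapis.com/google.rpc.Help", "links": []},
                    {"@type": RETRY_INFO_TYPE, "retryDelay": "56s"},
                ],
            }
        },
        "code": 429,
        "status": "Too Many Requests",
    }
}

print(find_retry_delay_seconds(sample))  # 56.0
```

If no RetryInfo entry is present, the function returns None, so a caller can fall back to its own default delay.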
-
Had a similar issue and found this feature request. I have an implementation idea for this issue. Made a short write-up. Not sure how feasible it is, but it may help. I hope it addresses the user's issue. I think it may address mine.
-
@mrubens Would you take another look at this feature request? Summary: monitor API responses for HTTP 429 "Too Many Requests" responses from the LLM provider. If one is present, use the numeric value in the retryDelay parameter (in error/message/error/details[]) of the response as the number of seconds to wait before retrying the API call.
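The behaviour summarized above could be sketched like this in Python. Everything here is hypothetical and only meant to show the control flow: `send` stands in for whatever makes the provider call and is assumed to return a `(status_code, retry_delay_seconds, result)` tuple, with the delay already parsed from the response (or None if the provider gave none).

```python
import time
from typing import Any, Callable, Optional, Tuple

Attempt = Tuple[int, Optional[float], Any]


def call_with_retry(
    send: Callable[[], Attempt],
    max_attempts: int = 4,
    fallback_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it stops returning 429, waiting as the provider asks."""
    for attempt in range(1, max_attempts + 1):
        status, retry_delay, result = send()
        if status != 429:
            return result
        if attempt < max_attempts:
            # Prefer the provider's own delay; fall back to a fixed wait.
            sleep(retry_delay if retry_delay is not None else fallback_delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the loop can be exercised without really waiting. A real version would probably also cap very long delays and surface the wait to the user instead of blocking silently.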
-
Since roo-cline works so quickly, most APIs will quickly rate-limit you. Once this happens, you often have to press retry manually. This could be paired with rotating keys/providers to make everything faster.