Exponential delay on retry should be optional/removed #1360
Replies: 6 comments 1 reply
-
Yeah, it makes sense to give users more control over the exponent and the cap. Thank you for flagging this.
-
Interesting, is anyone else working on this bug?
-
Yeah, I would love this. Sometimes I have to wait 200 seconds or so after multiple failures.
-
@mrubens what about just changing this line https://github.com/RooVetGit/Roo-Code/blob/main/src/core/Cline.ts#L1017 from: which (with default
to:
a few retries at 5s doesn't seem too bad, and if the user sets a more aggressive
-
Especially given the behavior seen with the Gemini free model, it seems likely that this issue will bite new users more often, so a tweak to the default behavior might be preferable to adding more params.
-
What about clamping the max retry delay to something like 30 seconds?
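A minimal sketch of what such a clamp could look like, covering both the "control over the exponent and the cap" idea and the 30-second ceiling suggested in this thread. This is not Roo Code's actual implementation: the names `RetryDelayOptions` and `retryDelaySeconds`, the doubling factor, and the defaults are all assumptions for illustration. Only the 5-second starting delay and the 30-second cap come from the discussion.

```typescript
// Hypothetical helper for illustration only; not taken from Cline.ts.
interface RetryDelayOptions {
  baseSeconds: number; // delay before the first retry
  factor: number;      // growth multiplier per attempt (1 means no growth)
  maxSeconds: number;  // upper bound on any single delay
}

// Assumed defaults: 5s base (from the thread), doubling, 30s cap (as proposed).
const defaultRetryDelay: RetryDelayOptions = {
  baseSeconds: 5,
  factor: 2,
  maxSeconds: 30,
};

// attempt is zero-based: 0 is the first retry.
function retryDelaySeconds(
  attempt: number,
  opts: RetryDelayOptions = defaultRetryDelay,
): number {
  const uncapped = opts.baseSeconds * Math.pow(opts.factor, attempt);
  return Math.min(opts.maxSeconds, Math.ceil(uncapped));
}

// Delays for the first five retries with the assumed defaults.
const schedule = [0, 1, 2, 3, 4].map((n) => retryDelaySeconds(n));
console.log(schedule.join(", ")); // 5, 10, 20, 30, 30
```

With these assumptions, setting `factor` to 1 gives a constant 5-second retry, which is effectively the "make it optional" variant, while raising `maxSeconds` restores the longer waits for providers that need them.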
-
Which version of the app are you using?
v3.3.14
Which API Provider are you using?
Google Gemini
Which Model are you using?
gemini-2.0-flash
What happened?
I'm not sure what the reason was for introducing exponentialDelay for API request retries, but it gets out of control with Gemini models. The Gemini API has two cases in which an API request fails: either the user exceeded the per-minute quota, or an uncontrollable shared quota kicked in. The shared quota is applied randomly to everyone when the API is "busy" and is randomly lifted every few seconds. When that happens, the delay grows from 5 to 40-80 seconds, and it ends up just waiting for over a minute without even trying to call the API. I don't think any API requires this cooldown period anyway. This feature should either be optional or removed, imho, or at least be capped at something like 30-45 seconds before the next retry.
Steps to reproduce
Relevant API REQUEST output
Additional context
No response