
feat: implement smart exponential backoff with rate limit headers for OpenAI embedder #6068


Draft · wants to merge 2 commits into base: main

Conversation

daniel-lxs
Collaborator

Summary

This PR implements intelligent rate limit handling for the OpenAI embedder by utilizing response headers to calculate optimal retry delays. The implementation ensures no embedding batches are lost due to rate limiting while minimizing unnecessary wait times.

Problem

Previously, the OpenAI embedder used a simple exponential backoff strategy with a fixed retry limit (3 attempts). This approach had several issues:

  • Batches could be lost if rate limits persisted beyond 3 retries
  • Fixed exponential backoff didn't consider actual rate limit reset times
  • Multiple concurrent batches could hit the API simultaneously when rate limited
  • Excessive retry logs flooded stderr

Solution

1. Smart Backoff Using Rate Limit Headers

  • Extracts OpenAI's rate limit headers:
    • x-ratelimit-limit-requests / x-ratelimit-limit-tokens
    • x-ratelimit-remaining-requests / x-ratelimit-remaining-tokens
    • x-ratelimit-reset-requests / x-ratelimit-reset-tokens
  • Calculates optimal wait time based on the maximum of request and token reset times
  • Adds 10% buffer to account for clock differences
  • Falls back to exponential backoff when headers are unavailable
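The calculation described above can be sketched as follows. The header names are OpenAI's documented `x-ratelimit-*` set; the 10% buffer and the exponential fallback mirror the PR description, but the function names, the duration-string format handling, and the base delay of 500ms are illustrative assumptions, not the PR's actual identifiers.

```typescript
/** Parse OpenAI-style reset durations like "1s", "6m0s", or "850ms" into milliseconds. */
function parseResetDuration(value: string | undefined): number | undefined {
	if (!value) return undefined
	let ms = 0
	let matched = false
	// Durations may combine units, e.g. "6m0s"
	for (const m of value.matchAll(/(\d+(?:\.\d+)?)(ms|s|m|h)/g)) {
		matched = true
		const n = parseFloat(m[1])
		ms += m[2] === "ms" ? n : m[2] === "s" ? n * 1000 : m[2] === "m" ? n * 60_000 : n * 3_600_000
	}
	return matched ? ms : undefined
}

/** Wait for the later of the request/token reset times plus a 10% buffer,
 *  falling back to exponential backoff when headers are unavailable. */
function computeBackoffMs(headers: Record<string, string>, attempt: number): number {
	const resets = [
		parseResetDuration(headers["x-ratelimit-reset-requests"]),
		parseResetDuration(headers["x-ratelimit-reset-tokens"]),
	].filter((v): v is number => v !== undefined)
	if (resets.length > 0) {
		return Math.round(Math.max(...resets) * 1.1) // 10% buffer for clock skew
	}
	return 500 * 2 ** (attempt - 1) // fallback: plain exponential backoff
}
```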

2. Infinite Retries for Rate Limits

  • HTTP 429 (rate limit) errors now retry indefinitely
  • Ensures no embedding batches are lost
  • Other errors (401, 500, etc.) fail immediately without retries
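The retry policy above can be sketched as a small wrapper: retry 429s indefinitely, rethrow everything else immediately, and only log on the first attempt. The wrapper's signature, the `status` property on the error, and the injected backoff callback are illustrative assumptions, not the embedder's real API.

```typescript
/** Retry `fn` forever on HTTP 429; fail fast on any other error. */
async function withRateLimitRetry<T>(fn: () => Promise<T>, backoffMs: (attempt: number) => number): Promise<T> {
	for (let attempt = 1; ; attempt++) {
		try {
			return await fn()
		} catch (err: any) {
			if (err?.status !== 429) throw err // 401, 500, etc. fail immediately
			if (attempt === 1) {
				// Only the first retry is logged, to avoid flooding stderr
				console.warn(`Rate limit hit, retrying in ${backoffMs(attempt)}ms (attempt 1/\u221e)`)
			}
			await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)))
		}
	}
}
```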

3. Global Rate Limit Coordination

  • Implemented mutex-based coordination using async-mutex
  • Prevents multiple concurrent batches from hitting the API when rate limited
  • All batches wait for the global rate limit to clear before proceeding
  • Thread-safe access to rate limit state
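The coordination idea above can be sketched as a shared wait that every batch honors. The PR uses the async-mutex package; this dependency-free gate is an illustrative stand-in with the same effect, and both function names are hypothetical: the first batch to see a 429 sets a module-level wait, and all batches await it before calling the API.

```typescript
let globalWait: Promise<void> | null = null // shared across all concurrent batches

/** Every batch awaits the shared wait (if any) before hitting the API. */
async function waitForGlobalRateLimit(): Promise<void> {
	// Loop in case a new wait was set while we slept; no logging here,
	// matching the "silent waiting" behavior described above.
	while (globalWait) await globalWait
}

/** First batch to see a 429 sets a shared wait that all batches honor. */
function setGlobalRateLimit(delayMs: number): void {
	if (globalWait) return // a wait is already in progress
	const p: Promise<void> = new Promise((resolve) =>
		setTimeout(() => {
			if (globalWait === p) globalWait = null // clear only our own wait
			resolve()
		}, delayMs),
	)
	globalWait = p
}
```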

4. Reduced Logging

  • Only logs rate limit warnings on first retry
  • Silent waiting during global rate limit periods
  • Prevents stderr flooding

Code Changes

Modified Files:

  • src/services/code-index/embedders/openai.ts

    • Added rate limit header extraction
    • Implemented smart backoff calculation
    • Added global rate limit state with mutex
    • Modified retry logic for infinite retries on 429
    • Reduced logging frequency
  • src/services/code-index/embedders/__tests__/openai.spec.ts

    • Updated tests for new retry behavior
    • Added tests for smart backoff calculation
    • Added tests for rate limit header parsing
    • Reset global state in beforeEach

Testing

All existing tests pass with the following updates:

  • Rate limit errors are tested to retry indefinitely
  • Smart backoff calculation is tested with various header formats
  • Non-rate-limit errors are tested to fail immediately
  • Global rate limit coordination is implicitly tested

Example Log Output

Before:

Rate limit hit, retrying in 500ms (attempt 1/3)
Rate limit hit, retrying in 1000ms (attempt 2/3)
Rate limit hit, retrying in 2000ms (attempt 3/3)
[DirectoryScanner] Error processing batch: Failed to create embeddings after 3 attempts

After:

Rate limit hit, retrying in 33000ms (attempt 1/∞)
Rate limits - Requests: 0/60, Tokens: 0/150000
[Silent waiting for subsequent retries]

Benefits

  1. No Data Loss: Infinite retries ensure all batches are eventually processed
  2. Optimal Throughput: Smart backoff minimizes wait times based on actual rate limits
  3. Better Resource Usage: Mutex prevents thundering herd problem
  4. Cleaner Logs: Reduced logging prevents stderr flooding
  5. Consistent Behavior: Aligns with openai-compatible embedder implementation

Breaking Changes

None. The public API is unchanged; only the internal retry behavior has changed.

Related Issues

  • Fixes rate limit handling issues where batches were lost after 3 retries
  • Addresses log flooding during rate limiting periods
  • Implements feature parity with openai-compatible embedder

Checklist

  • Code follows project style guidelines
  • Tests have been added/updated
  • All tests pass
  • No breaking changes to public API
  • Documentation has been updated (inline comments)

… OpenAI embedder

- Extract rate limit headers from OpenAI API responses
- Calculate optimal wait times based on reset headers
- Implement infinite retries for rate limit errors (HTTP 429)
- Add mutex-based global rate limit coordination
- Reduce logging to prevent stderr flooding
- Update tests for new retry behavior
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jul 22, 2025
@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Draft / In Progress] in Roo Code Roadmap Jul 22, 2025

delve-auditor bot commented Jul 22, 2025

No security or compliance issues detected. Reviewed everything up to 2d15f06.

Security Overview
  • 🔎 Scanned files: 2 changed file(s)
Detected Code Changes

  • Enhancement (openai.ts): implement smart exponential backoff with rate limit headers; add mutex-based global rate limit coordination; reduce logging for rate limits
  • Enhancement (openai.spec.ts): update tests for new retry behavior
  • Bug Fix (check-services.sh): update service start command references
  • Bug Fix (package.json): fix evals command
  • Bug Fix (new-run.tsx): add missing unit to timeout display
  • Configuration (package.json): update service commands
  • Configuration (migrations/*): remove migration files

Reply to this PR with @delve-auditor followed by a description of what change you want and we'll auto-submit a change to this PR to implement it.

@daniel-lxs
Collaborator Author

@roomote-agent use switch_mode to switch to pr-fixer mode and fix the unit tests on this PR, do not create a new PR.


roomote bot commented Jul 22, 2025

Hi @daniel-lxs! I see your request to fix the unit tests on this PR. I notice the platform-unit-test checks are failing on both Ubuntu and Windows. Let me check out the branch and investigate the failing tests.

- Switch to real timers for non-rate-limit error retry tests
- Prevents async promise rejections from being detected as unhandled
- Fixes failing CI tests on Ubuntu and Windows platforms

roomote bot commented Jul 22, 2025

Hi @daniel-lxs! I've successfully fixed the unit tests on this PR.

Summary of changes:

  • Fixed unhandled promise rejections in the OpenAI embedder tests
  • Modified two test cases that were testing non-rate-limit error retries to use real timers instead of fake timers
  • This prevents async promise rejections from being detected as unhandled by Vitest

Results:

✅ All CI checks are now passing
✅ platform-unit-test (ubuntu-latest) - PASS
✅ platform-unit-test (windows-latest) - PASS

The fix was minimal and only affected the test implementation, not the actual embedder logic. The tests still verify the same behavior but now handle the async nature of the retries properly.

Projects
Status: PR [Draft / In Progress]
3 participants