-
Notifications
You must be signed in to change notification settings - Fork 56
fix(wsConnection): resolve WebSocket connectivity with exponential backoff #1110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds an internal WebSocket proxy endpoint (/internal/sync-websocket) to nginx, sets MAIN_SERVER_HOST=app in two app-sync Docker Compose files, and implements exponential-backoff WebSocket reconnection logic in the sync server that reads MAIN_SERVER_HOST (fallback localhost). Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Nginx as Nginx (prod)
participant App as app_backend
rect rgb(235,245,255)
note over Nginx: New internal endpoint (/internal/sync-websocket)
Client->>Nginx: WS Upgrade /internal/sync-websocket
Nginx->>App: Proxy (HTTP/1.1, Upgrade, X-Forwarded-*)
App-->>Nginx: 101 Switching Protocols
Nginx-->>Client: 101 Switching Protocols
end
Client-)Nginx: Bi-directional frames
Nginx-)App: Bi-directional frames
sequenceDiagram
autonumber
participant Sync as Sync Server
participant Main as Main Server
rect rgb(245,255,245)
note over Sync: Build URL using MAIN_SERVER_HOST (fallback: localhost)
Sync->>Main: Attempt WebSocket connect
alt open success
Main-->>Sync: onopen
note over Sync: reset reconnectAttempts
else error/close
Main-->>Sync: onerror/onclose (code, reason)
Sync->>Sync: scheduleReconnect(delay = min(60s, 2^n * 2s) ± jitter)
Sync-->>Main: reconnect after delay
end
end
note over Sync: On send when socket not OPEN ➜ schedule reconnect
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @naSim087, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and reliability of WebSocket connectivity between the sync server and the main application server. It addresses potential networking issues in containerized environments by correctly configuring hostnames and introduces a sophisticated exponential backoff retry mechanism to ensure persistent WebSocket connections, even in the face of transient network disruptions. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a robust WebSocket reconnection mechanism using exponential backoff for the sync-server. It also correctly configures networking for containerized environments by adding the MAIN_SERVER_HOST
environment variable and setting up the necessary nginx location block for WebSocket proxying. The changes are well-implemented and significantly improve the resilience of the sync-server's connection to the main server. I've added one suggestion to further improve the backoff strategy by adding jitter, which is a best practice to prevent synchronized reconnection attempts from multiple clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
deployment/portable/docker-compose.sync-version.yml (1)
21-21
: Good addition; host resolves via Docker DNS.Optional: parameterize for non-compose setups.
- - MAIN_SERVER_HOST=app + - MAIN_SERVER_HOST=${MAIN_SERVER_HOST:-app}deployment/portable/docker-compose.sync-build.yml (1)
24-24
: Mirrors runtime compose; looks good.Optional: make it overrideable via env for flexibility.
- - MAIN_SERVER_HOST=app + - MAIN_SERVER_HOST=${MAIN_SERVER_HOST:-app}server/sync-server.ts (1)
73-79
: Log human-readable close reason.
reason
is a Buffer; current log may be unreadable.- mainServerWebSocket.on("close", (code, reason) => { - Logger.warn( - `WebSocket connection to main server closed (code: ${code}, reason: ${reason || "unknown"})`, - ) + mainServerWebSocket.on("close", (code, reason) => { + const reasonText = + Buffer.isBuffer(reason) ? reason.toString("utf8") : String(reason || "unknown") + Logger.warn( + `WebSocket connection to main server closed (code: ${code}, reason: ${reasonText})`, + ) mainServerWebSocket = null scheduleReconnect() })deployment/nginx.prod.conf (1)
200-216
: WS proxy looks correct; consider restricting exposure.Endpoint is publicly reachable on the main server. If the app enforces Authorization, fine; otherwise add network/IP allowlists or an auth_request gate to reduce attack surface.
- Confirm the upstream WS handler validates
Authorization: Bearer <METRICS_SECRET>
.- If not, restrict by source IP (docker network ranges) or add rate limits to this location.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deployment/nginx.prod.conf
(1 hunks)deployment/portable/docker-compose.sync-build.yml
(1 hunks)deployment/portable/docker-compose.sync-version.yml
(1 hunks)server/sync-server.ts
(1 hunks)
🔇 Additional comments (1)
server/sync-server.ts (1)
45-47
: Confirm WebSocket endpoint and port configuration
server/server.ts
exposes/internal/sync-websocket
;sync-server.ts
connects to the same path.- Bun.serve listens on
config.port
; verifyconfig.port
is loaded fromprocess.env.APP_PORT
(default 3001) so it matches NGINX upstream.- Ensure
MAIN_SERVER_HOST
matches the actual app host (container name orhost.docker.internal
in prod).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
server/sync-server.ts (1)
103-109
: Close handler looks good.Properly logs the close event with code and reason, then schedules a reconnect. Note that the indefinite retry concern mentioned in the error handler comment also applies here.
🧹 Nitpick comments (1)
server/sync-server.ts (1)
366-366
: Consider adding graceful shutdown cleanup.While not critical (Node.js cleans up on exit), explicitly closing the WebSocket connection and clearing the reconnect timeout on graceful shutdown (e.g., SIGTERM) would be a best practice improvement.
Example implementation:
process.on('SIGTERM', () => { Logger.info('SIGTERM received, cleaning up...') if (reconnectTimeout) { clearTimeout(reconnectTimeout) } if (mainServerWebSocket && mainServerWebSocket.readyState === WebSocket.OPEN) { mainServerWebSocket.close(1000, 'Server shutting down') } process.exit(0) })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
server/sync-server.ts
(2 hunks)
🔇 Additional comments (6)
server/sync-server.ts (6)
40-41
: LGTM! State variables properly initialized.The reconnection state variables are correctly typed and initialized for tracking retry attempts and managing the reconnect timer.
44-60
: Excellent guard against duplicate connections.The implementation correctly prevents overlapping connection attempts by clearing pending timers and checking for OPEN/CONNECTING states before initiating a new connection.
62-72
: LGTM! Proper configuration and validation.The environment variable handling for
MAIN_SERVER_HOST
aligns with the Docker Compose changes in the PR, and the upfrontMETRICS_SECRET
validation prevents connection attempts when authentication is misconfigured.
112-146
: Excellent exponential backoff implementation with jitter.The reconnection strategy correctly implements:
- Exponential backoff (2s → 4s → 8s → 16s → 32s → 60s max)
- Jitter (0-500ms) to prevent thundering herd, addressing the previous review comment
- Guards against scheduling when already connected/connecting
- Proper timeout cleanup
84-87
: LGTM! Proper reset on successful connection.Correctly resets the retry counter when the connection is established, ensuring the backoff starts fresh after any future disconnection.
171-177
: LGTM! Improved reconnection logic.The changes correctly:
- Use
!== WebSocket.OPEN
to cover all non-ready states (CONNECTING, CLOSING, CLOSED)- Call
scheduleReconnect()
instead of immediate connection, ensuring the exponential backoff strategy is respected
## [3.17.1](v3.17.0...v3.17.1) (2025-10-15) ### Bug Fixes * **wsConnection:** resolve WebSocket connectivity with exponential backoff ([#1110](#1110)) ([802c353](802c353))
Description
Add nginx WebSocket location block for /internal/sync-websocket
Fix container networking using MAIN_SERVER_HOST env var
Implement exponential backoff retry (2s-60s delays)
Update Docker Compose configurations
Testing
Additional Notes
Summary by CodeRabbit
New Features
Chores