Skip to content

Conversation

naSim087
Copy link
Contributor

@naSim087 naSim087 commented Oct 15, 2025

Description

Add nginx WebSocket location block for /internal/sync-websocket
Fix container networking using MAIN_SERVER_HOST env var
Implement exponential backoff retry (2s-60s delays)
Update Docker Compose configurations

Testing

Additional Notes

Summary by CodeRabbit

  • New Features

    • Sync service now auto-reconnects with exponential backoff for improved stability during transient network issues.
  • Chores

    • Added a configurable environment variable to set the main server host for the sync service.
    • Updated production proxy to expose an internal WebSocket endpoint for sync traffic with proper proxy handling and timeouts.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 15, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds an internal WebSocket proxy endpoint (/internal/sync-websocket) to nginx, sets MAIN_SERVER_HOST=app in two app-sync Docker Compose files, and implements exponential-backoff WebSocket reconnection logic in the sync server that reads MAIN_SERVER_HOST (fallback localhost).

Changes

Cohort / File(s) Summary
Nginx internal WebSocket proxy
deployment/nginx.prod.conf
Adds /internal/sync-websocket location proxying to app_backend with HTTP/1.1, Upgrade headers, forwarded IP headers, timeouts, access_log off; the block appears twice (two server contexts).
Docker Compose: app-sync env
deployment/portable/docker-compose.sync-build.yml, deployment/portable/docker-compose.sync-version.yml
Adds MAIN_SERVER_HOST=app to the app-sync service environment.
Sync server reconnection logic
server/sync-server.ts
Adds reconnect state and scheduling, exponential-backoff with jitter (base 2s, max 60s), logs attempts/errors/close reason, reads MAIN_SERVER_HOST (fallback localhost) to build main server URL, schedules reconnects instead of immediate retries when not OPEN.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Nginx as Nginx (prod)
  participant App as app_backend

  rect rgb(235,245,255)
  note over Nginx: New internal endpoint (/internal/sync-websocket)
  Client->>Nginx: WS Upgrade /internal/sync-websocket
  Nginx->>App: Proxy (HTTP/1.1, Upgrade, X-Forwarded-*)
  App-->>Nginx: 101 Switching Protocols
  Nginx-->>Client: 101 Switching Protocols
  end

  Client-)Nginx: Bi-directional frames
  Nginx-)App: Bi-directional frames
Loading
sequenceDiagram
  autonumber
  participant Sync as Sync Server
  participant Main as Main Server

  rect rgb(245,255,245)
  note over Sync: Build URL using MAIN_SERVER_HOST (fallback: localhost)
  Sync->>Main: Attempt WebSocket connect
  alt open success
    Main-->>Sync: onopen
    note over Sync: reset reconnectAttempts
  else error/close
    Main-->>Sync: onerror/onclose (code, reason)
    Sync->>Sync: scheduleReconnect(delay = min(60s, 2^n * 2s) ± jitter)
    Sync-->>Main: reconnect after delay
  end
  end

  note over Sync: On send when socket not OPEN ➜ schedule reconnect
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • zereraz
  • junaid-shirur
  • devesh-juspay
  • kalpadhwaryu

Poem

I twitch my whiskers—webs now weave,
A socket hums where packets cleave.
With backoff hops—2, 4, then more—
I’ll reconnect from burrowed floor.
MAIN_SERVER_HOST points out the way—thump! 🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “fix(wsConnection): resolve WebSocket connectivity with exponential backoff” clearly captures the primary purpose of the changeset, namely enhancing WebSocket reconnection logic with exponential backoff, and follows a concise, conventional commit style.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/wsConnection

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @naSim087, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and reliability of WebSocket connectivity between the sync server and the main application server. It addresses potential networking issues in containerized environments by correctly configuring hostnames and introduces a sophisticated exponential backoff retry mechanism to ensure persistent WebSocket connections, even in the face of transient network disruptions.

Highlights

  • Nginx WebSocket Proxy: A new Nginx location block has been added to proxy WebSocket connections for the /internal/sync-websocket endpoint to the app_backend service, ensuring proper routing for internal sync communication.
  • Container Networking Fix: The MAIN_SERVER_HOST environment variable is now utilized in Docker Compose configurations to correctly resolve the main application server's hostname within the container network, replacing the previous localhost assumption.
  • Exponential Backoff for WebSocket Reconnection: The sync server's WebSocket client now implements an exponential backoff strategy for reconnecting to the main server. This retry mechanism starts with a 2-second delay and gradually increases up to a maximum of 60 seconds, improving connection resilience.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust WebSocket reconnection mechanism using exponential backoff for the sync-server. It also correctly configures networking for containerized environments by adding the MAIN_SERVER_HOST environment variable and setting up the necessary nginx location block for WebSocket proxying. The changes are well-implemented and significantly improve the resilience of the sync-server's connection to the main server. I've added one suggestion to further improve the backoff strategy by adding jitter, which is a best practice to prevent synchronized reconnection attempts from multiple clients.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
deployment/portable/docker-compose.sync-version.yml (1)

21-21: Good addition; host resolves via Docker DNS.

Optional: parameterize for non-compose setups.

-      - MAIN_SERVER_HOST=app
+      - MAIN_SERVER_HOST=${MAIN_SERVER_HOST:-app}
deployment/portable/docker-compose.sync-build.yml (1)

24-24: Mirrors runtime compose; looks good.

Optional: make it overrideable via env for flexibility.

-      - MAIN_SERVER_HOST=app
+      - MAIN_SERVER_HOST=${MAIN_SERVER_HOST:-app}
server/sync-server.ts (1)

73-79: Log human-readable close reason.

reason is a Buffer; current log may be unreadable.

-  mainServerWebSocket.on("close", (code, reason) => {
-    Logger.warn(
-      `WebSocket connection to main server closed (code: ${code}, reason: ${reason || "unknown"})`,
-    )
+  mainServerWebSocket.on("close", (code, reason) => {
+    const reasonText =
+      Buffer.isBuffer(reason) ? reason.toString("utf8") : String(reason || "unknown")
+    Logger.warn(
+      `WebSocket connection to main server closed (code: ${code}, reason: ${reasonText})`,
+    )
     mainServerWebSocket = null
     scheduleReconnect()
   })
deployment/nginx.prod.conf (1)

200-216: WS proxy looks correct; consider restricting exposure.

Endpoint is publicly reachable on the main server. If the app enforces Authorization, fine; otherwise add network/IP allowlists or an auth_request gate to reduce attack surface.

  • Confirm the upstream WS handler validates Authorization: Bearer <METRICS_SECRET>.
  • If not, restrict by source IP (docker network ranges) or add rate limits to this location.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 836c107 and b8b08db.

📒 Files selected for processing (4)
  • deployment/nginx.prod.conf (1 hunks)
  • deployment/portable/docker-compose.sync-build.yml (1 hunks)
  • deployment/portable/docker-compose.sync-version.yml (1 hunks)
  • server/sync-server.ts (1 hunks)
🔇 Additional comments (1)
server/sync-server.ts (1)

45-47: Confirm WebSocket endpoint and port configuration

  • server/server.ts exposes /internal/sync-websocket; sync-server.ts connects to the same path.
  • Bun.serve listens on config.port; verify config.port is loaded from process.env.APP_PORT (default 3001) so it matches NGINX upstream.
  • Ensure MAIN_SERVER_HOST matches the actual app host (container name or host.docker.internal in prod).

shivamashtikar
shivamashtikar previously approved these changes Oct 15, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
server/sync-server.ts (1)

103-109: Close handler looks good.

Properly logs the close event with code and reason, then schedules a reconnect. Note that the indefinite retry concern mentioned in the error handler comment also applies here.

🧹 Nitpick comments (1)
server/sync-server.ts (1)

366-366: Consider adding graceful shutdown cleanup.

While not critical (Node.js cleans up on exit), explicitly closing the WebSocket connection and clearing the reconnect timeout on graceful shutdown (e.g., SIGTERM) would be a best practice improvement.

Example implementation:

process.on('SIGTERM', () => {
  Logger.info('SIGTERM received, cleaning up...')
  if (reconnectTimeout) {
    clearTimeout(reconnectTimeout)
  }
  if (mainServerWebSocket && mainServerWebSocket.readyState === WebSocket.OPEN) {
    mainServerWebSocket.close(1000, 'Server shutting down')
  }
  process.exit(0)
})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8b08db and 93ffcf5.

📒 Files selected for processing (1)
  • server/sync-server.ts (2 hunks)
🔇 Additional comments (6)
server/sync-server.ts (6)

40-41: LGTM! State variables properly initialized.

The reconnection state variables are correctly typed and initialized for tracking retry attempts and managing the reconnect timer.


44-60: Excellent guard against duplicate connections.

The implementation correctly prevents overlapping connection attempts by clearing pending timers and checking for OPEN/CONNECTING states before initiating a new connection.


62-72: LGTM! Proper configuration and validation.

The environment variable handling for MAIN_SERVER_HOST aligns with the Docker Compose changes in the PR, and the upfront METRICS_SECRET validation prevents connection attempts when authentication is misconfigured.


112-146: Excellent exponential backoff implementation with jitter.

The reconnection strategy correctly implements:

  • Exponential backoff (2s → 4s → 8s → 16s → 32s → 60s max)
  • Jitter (0-500ms) to prevent thundering herd, addressing the previous review comment
  • Guards against scheduling when already connected/connecting
  • Proper timeout cleanup

84-87: LGTM! Proper reset on successful connection.

Correctly resets the retry counter when the connection is established, ensuring the backoff starts fresh after any future disconnection.


171-177: LGTM! Improved reconnection logic.

The changes correctly:

  1. Use !== WebSocket.OPEN to cover all non-ready states (CONNECTING, CLOSING, CLOSED)
  2. Call scheduleReconnect() instead of immediate connection, ensuring the exponential backoff strategy is respected

@shivamashtikar shivamashtikar merged commit 802c353 into main Oct 15, 2025
4 checks passed
@shivamashtikar shivamashtikar deleted the fix/wsConnection branch October 15, 2025 11:29
oindrila-b pushed a commit that referenced this pull request Oct 15, 2025
## [3.17.1](v3.17.0...v3.17.1) (2025-10-15)

### Bug Fixes

* **wsConnection:** resolve WebSocket connectivity with exponential backoff ([#1110](#1110)) ([802c353](802c353))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants