Skip to content

[automatic failover] Improve failover resilience by evaluating circuit breaker before retries #4178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ggivo
Copy link
Collaborator

@ggivo ggivo commented Jun 11, 2025

Problem

Currently, the circuit breaker is evaluated only after all retry attempts are exhausted. This delays failover to healthy endpoints in unstable network conditions, as the system must wait for all retries to complete before marking an endpoint as unhealthy.

Solution

This PR changes the evaluation order so that the circuit breaker counts each connection error separately, even during retry attempts. This allows the circuit breaker to open more quickly when an endpoint is experiencing issues, enabling faster failover to healthy endpoints.

Changes

  • Modified the order of evaluation between circuit breaker and retry mechanisms
  • Added integration test to verify the new behavior

This commit modifies the order in which circuit breaker and retry mechanisms
are evaluated during failover scenarios. Previously, the circuit breaker was
applied after all retries were exhausted, which could delay failover in
unstable network conditions. Now, the circuit breaker counts each connection
error separately, even during retry attempts.
@ggivo ggivo force-pushed the automatic-failover-cb-retry-reorder branch from bdf52bf to 8836a10 Compare June 11, 2025 14:37
@ggivo ggivo requested a review from Copilot June 11, 2025 14:59
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reorders the circuit breaker and retry decorations so that each connection error immediately counts toward opening the circuit breaker, and adds an integration test to verify the new behavior.

  • Swap retry and circuit breaker in CircuitBreakerCommandExecutor to evaluate the circuit breaker before retry attempts.
  • Add an integration test (testCircuitBreakerCountsEachConnectionErrorSeparately) to validate per-attempt error counting and faster failover.
  • Adjust the formatter-maven-plugin version in the POM.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/main/java/redis/clients/jedis/mcf/CircuitBreakerCommandExecutor.java Move withCircuitBreaker ahead of withRetry so the circuit breaker triggers before retrying
src/test/java/redis/clients/jedis/failover/FailoverIntegrationTest.java Add integration test covering the new circuit breaker behavior under retry scenarios
formatter-pom.xml Change formatter-maven-plugin version from 2.23.0 to 2.16.0
Comments suppressed due to low confidence (4)

src/main/java/redis/clients/jedis/mcf/CircuitBreakerCommandExecutor.java:38

  • [nitpick] Consider adding a brief comment explaining why the retry decorator was moved below the circuit breaker to clarify the execution order for future maintainers.
supplier.withCircuitBreaker(cluster.getCircuitBreaker());

src/main/java/redis/clients/jedis/mcf/CircuitBreakerCommandExecutor.java:39

  • Add a unit test specifically for CircuitBreakerCommandExecutor to confirm that the circuit breaker is evaluated before retry logic, ensuring this new order is exercised and protected against regressions.
supplier.withRetry(cluster.getRetry());

formatter-pom.xml:23

  • [nitpick] Verify that downgrading the formatter-maven-plugin is intentional and compatible with your formatter rules; consider aligning to the latest stable plugin to benefit from fixes.
<version>2.16.0</version>

src/test/java/redis/clients/jedis/failover/FailoverIntegrationTest.java:289

  • [nitpick] The inline comment for the failure rate threshold is split across multiple lines; consider consolidating it into a single, clear comment to improve readability.
.circuitBreakerFailureRateThreshold(50) // 50% failure

@ggivo ggivo self-assigned this Jun 11, 2025
@ggivo ggivo requested review from uglide and atakavci June 11, 2025 16:32
@@ -20,7 +20,7 @@
<plugin>
<groupId>net.revelc.code.formatter</groupId>
<artifactId>formatter-maven-plugin</artifactId>
<version>2.23.0</version>
<version>2.16.0</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so what is the issue with 2.23 ?

Copy link
Collaborator Author

@ggivo ggivo Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atakavci
2.23 requires Java 11.
There is a slight difference in the formatted code if we format the code using 2.23.0 vs 2.16.0 (used in the pom.xml with regular build).
To avoid the difference decided to align the versions.

Copy link
Contributor

@atakavci atakavci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ggivo ggivo merged commit bce08f7 into redis:feature/automatic-failover Jun 12, 2025
9 of 12 checks passed
@ggivo ggivo deleted the automatic-failover-cb-retry-reorder branch June 16, 2025 06:37
atakavci pushed a commit to atakavci/jedis that referenced this pull request Jun 17, 2025
…t breaker before retries (redis#4178)

* fix: Change evaluation order of circuit breaker and retry mechanisms

This commit modifies the order in which circuit breaker and retry mechanisms
are evaluated during failover scenarios. Previously, the circuit breaker was
applied after all retries were exhausted, which could delay failover in
unstable network conditions. Now, the circuit breaker counts each connection
error separately, even during retry attempts.

* format

* format

* use java-8 compatible formater version

* fix AutomaticFailoverTest

* formating
atakavci pushed a commit to atakavci/jedis that referenced this pull request Jun 17, 2025
…t breaker before retries (redis#4178)

* fix: Change evaluation order of circuit breaker and retry mechanisms

This commit modifies the order in which circuit breaker and retry mechanisms
are evaluated during failover scenarios. Previously, the circuit breaker was
applied after all retries were exhausted, which could delay failover in
unstable network conditions. Now, the circuit breaker counts each connection
error separately, even during retry attempts.

* format

* format

* use java-8 compatible formater version

* fix AutomaticFailoverTest

* formating
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants