Skip to content

Fix failing network partition test #4118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 25, 2025
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package redis.clients.jedis.authentication;

import static org.awaitility.Awaitility.await;
import static org.awaitility.Durations.TWO_SECONDS;
import static org.awaitility.Durations.FIVE_SECONDS;
import static org.awaitility.Durations.ONE_HUNDRED_MILLISECONDS;
import static org.junit.Assert.assertEquals;
Expand All @@ -15,6 +16,8 @@
import static org.mockito.Mockito.spy;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
import static org.hamcrest.Matchers.in;
import static org.hamcrest.Matchers.is;

import java.io.IOException;
import java.util.ArrayList;
Expand All @@ -32,6 +35,7 @@

import org.awaitility.Awaitility;
import org.awaitility.Durations;
import org.hamcrest.MatcherAssert;
import org.junit.BeforeClass;
import org.junit.FixMethodOrder;
import org.junit.Test;
Expand All @@ -58,6 +62,7 @@
import redis.clients.jedis.exceptions.JedisAccessControlException;
import redis.clients.jedis.exceptions.JedisConnectionException;
import redis.clients.jedis.scenario.FaultInjectionClient;
import redis.clients.jedis.scenario.FaultInjectionClient.TriggerActionResponse;

@FixMethodOrder(MethodSorters.NAME_ASCENDING)
public class RedisEntraIDIntegrationTests {
Expand Down Expand Up @@ -341,34 +346,44 @@ public void networkPartitionEvictionTest() {
jedis.del(key);
}

triggerNetworkFailure();
TriggerActionResponse actionResponse = triggerNetworkFailure();

JedisConnectionException aclException = assertThrows(JedisConnectionException.class, () -> {
for (int i = 0; i < 50; i++) {
String key = UUID.randomUUID().toString();
jedis.set(key, "value");
assertEquals("value", jedis.get(key));
jedis.del(key);
while (!actionResponse.isCompleted(ONE_HUNDRED_MILLISECONDS, TWO_SECONDS, FIVE_SECONDS)) {
for (int i = 0; i < 50; i++) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using a blocking command like bloop. Should be more predictable than relaying on multiple get/set

  • issue blpop (guarantee that we have active connection)
  • trigger network failure (this should drop the connection )
  • wait for action to complete and validate there was a connection exception

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH these are really two different test scenarios:

  • what happens when there is a network failure during a blocking operation execution
  • what happens when there is a network failure during executing multiple short-lived operations

In a perfect world we should be testing both, but in this case I would go for either.

Copy link
Collaborator

@ggivo ggivo Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think we should test both,
Looking at the test, we are trying to simulate network failure while the connection is being actively used and check for exceptions. Blocking operation should keep the connection opened till the network issue is simulated. My idea is to hopefully reduce the flakiness

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking operations removes timeouts so in case of nothing shows up on the line, test would end up hanging. Other point around test content, as @ggivo suggested, test case only one target which is to check the behavior in case of network failure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atakavci
What about adding a timeout to the test itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be more predictable than relaying on multiple get/set ...

i am not strongly against it but its not clear to me what would improve between;

  • multiple calls where we expect any one to fail in n seconds
  • single call we expect to fail in n seconds

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought some flakiness might come from not throwing an exception between the calls if the server is reconnected too fast, but it seems not very likely.
Agree, let's keep it as it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see, than i failed to make the PR stating the issue clear enough. Let me try to improve the description.

String key = UUID.randomUUID().toString();
jedis.set(key, "value");
assertEquals("value", jedis.get(key));
jedis.del(key);
}
}
});

assertEquals("Unexpected end of stream.", aclException.getMessage());
String[] expectedMessages = new String[] { "Unexpected end of stream.",
"java.net.SocketException: Connection reset" };
MatcherAssert.assertThat(aclException.getMessage(), is(in(expectedMessages)));
Awaitility.await().pollDelay(Durations.ONE_HUNDRED_MILLISECONDS).atMost(Durations.TWO_SECONDS)
.until(() -> {
String key = UUID.randomUUID().toString();
jedis.set(key, "value");
assertEquals("value", jedis.get(key));
jedis.del(key);
return true;
try {
String key = UUID.randomUUID().toString();
jedis.set(key, "value");
assertEquals("value", jedis.get(key));
jedis.del(key);
return true;
} catch (Exception e) {
log.debug("attempt to reconnect after network failure, connection has not been re-established yet:"
+ e.getMessage());
return false;
}
});
}
}

private void triggerNetworkFailure() {
private TriggerActionResponse triggerNetworkFailure() {
HashMap<String, Object> params = new HashMap<>();
params.put("bdb_id", endpointConfig.getBdbId());

FaultInjectionClient.TriggerActionResponse actionResponse = null;
TriggerActionResponse actionResponse = null;
String action = "network_failure";
try {
log.info("Triggering {}", action);
Expand All @@ -377,5 +392,6 @@ private void triggerNetworkFailure() {
fail("Fault Injection Server error:" + e.getMessage());
}
log.info("Action id: {}", actionResponse.getActionId());
return actionResponse;
}
}
Loading