Open
Description
After applying a patch to test context reuse in BarrierNewTest, I observe that the test passes when using the TCP transport, but fails when using TCP_LAZY.
Steps to Reproduce:
- Apply the following patch to gloo/test/barrier_test.cc:
diff --git a/gloo/test/barrier_test.cc b/gloo/test/barrier_test.cc
index 1b76f37..c666b86 100644
--- a/gloo/test/barrier_test.cc
+++ b/gloo/test/barrier_test.cc
@@ -87,15 +87,25 @@ class BarrierNewTest : public BaseTest,
                        public ::testing::WithParamInterface<NewParam> {};
 TEST_P(BarrierNewTest, Default) {
-  const auto transport = std::get<0>(GetParam());
+  const auto transport = TCP_LAZY;
   const auto contextSize = std::get<1>(GetParam());
+  auto hashStore = std::make_shared<::gloo::rendezvous::HashStore>();
+
   spawn(transport, contextSize, [&](std::shared_ptr<Context> context) {
     BarrierOptions opts(context);
     // Run barrier to synchronize processes after starting.
     barrier(opts);
+    auto lateDestroyContext = std::make_shared<::gloo::rendezvous::Context>(
+        context->rank, context->size, context->base);
+    lateDestroyContext->connectFullMesh(hashStore, context->getDevice());
+    BarrierOptions opts1(lateDestroyContext);
+
+    barrier(opts1);
+
+
     // Take turns in sleeping for a bit and checking that all processes
     // saw that artificial delay through the barrier.
     auto singleProcessDelay = std::chrono::milliseconds(10);
- Run the barrier test suite.
- Observe that BarrierNewDefault/BarrierNewTest.Default/1 hangs.
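Because the failure is a hang rather than an assertion, it may help to bound the wait so the run reports a failure instead of blocking. The following is only a minimal sketch: `finishesWithin` is a hypothetical helper that is not part of Gloo or GoogleTest, and it uses nothing beyond the C++ standard library.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper (not a Gloo API): runs `fn` on a detached thread and
// returns true if it completed within `timeout`, false if it is still running.
// An exception thrown by `fn` is rethrown to the caller.
bool finishesWithin(std::function<void()> fn,
                    std::chrono::milliseconds timeout) {
  // The promise lives on the heap so the detached thread can safely outlive
  // this call when the timeout expires.
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> finished = done->get_future();

  std::thread([fn = std::move(fn), done]() {
    try {
      fn();
      done->set_value();
    } catch (...) {
      done->set_exception(std::current_exception());
    }
  }).detach();

  if (finished.wait_for(timeout) != std::future_status::ready) {
    return false;  // Still blocked: treat as a hang.
  }
  finished.get();  // Propagates any exception from `fn`.
  return true;
}
```

In the patched test, the second barrier could then be wrapped as `finishesWithin([&] { barrier(opts1); }, std::chrono::seconds(30))`, with the 30-second limit being an arbitrary choice. If the call times out, the detached thread is still running and still holds references to the test's locals, so the process should abort at that point rather than continue.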
Expected Behavior
Both TCP and TCP_LAZY transports should behave identically and pass the test.
Environment
Gloo version: fe67c4b (main branch as of 16.05.2025)