Bug Report: BarrierNewTest fails with TCP_LAZY transport after using additional context #443

@piotrchmiel

Description

After applying a patch to test context reuse in BarrierNewTest, the test passes with the TCP transport but fails (hangs) with TCP_LAZY.

Steps to Reproduce:

  1. Apply the following patch to gloo/test/barrier_test.cc:
diff --git a/gloo/test/barrier_test.cc b/gloo/test/barrier_test.cc
index 1b76f37..c666b86 100644
--- a/gloo/test/barrier_test.cc
+++ b/gloo/test/barrier_test.cc
@@ -87,15 +87,25 @@ class BarrierNewTest : public BaseTest,
                        public ::testing::WithParamInterface<NewParam> {};

 TEST_P(BarrierNewTest, Default) {
-  const auto transport = std::get<0>(GetParam());
+  const auto transport = TCP_LAZY;
   const auto contextSize = std::get<1>(GetParam());

+  auto hashStore = std::make_shared<::gloo::rendezvous::HashStore>();
+
   spawn(transport, contextSize, [&](std::shared_ptr<Context> context) {
     BarrierOptions opts(context);

     // Run barrier to synchronize processes after starting.
     barrier(opts);

+    auto lateDestroyContext = std::make_shared<::gloo::rendezvous::Context>(
+      context->rank, context->size, context->base);
+    lateDestroyContext->connectFullMesh(hashStore, context->getDevice());
+    BarrierOptions opts1(lateDestroyContext);
+
+    barrier(opts1);
+
+
     // Take turns in sleeping for a bit and checking that all processes
     // saw that artificial delay through the barrier.
     auto singleProcessDelay = std::chrono::milliseconds(10);
  2. Run the barrier test suite.
  3. Observe that BarrierNewDefault/BarrierNewTest.Default/1 hangs.
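
To run only the affected test case and make the hang visible as a timeout, something like the following sketch may help. The binary path is an assumption (a CMake build in `./build` with tests enabled) and the `GLOO_TEST_BIN` variable is hypothetical; adjust both to your setup. `--gtest_filter` is the standard GoogleTest flag, and exit status 124 is what GNU `timeout` returns when the time limit is hit.

```shell
# Assumed location of the test binary; override via the (hypothetical) GLOO_TEST_BIN variable.
GLOO_TEST_BIN="${GLOO_TEST_BIN:-./build/gloo/test/gloo_test}"
# Select only the BarrierNewTest.Default instances.
FILTER='BarrierNewDefault/BarrierNewTest.Default/*'

if [ -x "$GLOO_TEST_BIN" ]; then
  # Abort after 120 seconds so a hang does not block the shell forever.
  timeout 120 "$GLOO_TEST_BIN" --gtest_filter="$FILTER"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "test timed out (likely hang)"
  fi
else
  echo "test binary not found at $GLOO_TEST_BIN"
fi
```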

Expected Behavior
Both TCP and TCP_LAZY transports should behave identically and pass the test.

Environment
Gloo version: fe67c4b (main 16.05.2025)
