This repository reproduces the Bazel bug described in #26292. It implements a basic multiplex sandboxed worker in Rust that copies an input file to an output file.
- Clone Bazel
- Apply this patch to Bazel:
diff --git a/src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnRunner.java b/src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnRunner.java
index 7ef16da893b..378ac308633 100644
--- a/src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnRunner.java
+++ b/src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnRunner.java
@@ -635,6 +635,10 @@ final class WorkerSpawnRunner implements SpawnRunner {
() -> {
resourceManager.acquireResourceOwnership();
+ try {
+ Thread.sleep(10000);
+ } catch (InterruptedException exception) {}
+
Worker w = worker;
try {
if (canCancel) {
This patch increases the likelihood of the race condition occurring by making the Bazel server wait before collecting a response from a cancelled work request that lost the dynamic execution race.
- Build Bazel and take note of the path to the Bazel executable
- Set up a remote execution service, check out this repository, and configure remote execution in user.bazelrc
I'm running a local Buildfarm cluster, so my user.bazelrc looks like this:
build --remote_executor=grpc://localhost:8980
build --remote_cache=grpc://localhost:8980
- Build some targets using the worker
$ USE_BAZEL_VERSION=<YOUR BAZEL REPOSITORY>/bazel-bin/src/bazel bazel build //:all
- While those targets are building:
  - Take note of the "Multiplexer for CopyFile found no semaphore" messages
  - Take note that as more targets build, fewer run locally and more run remotely
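
To make the first observation easier to quantify, the warnings can be counted from a saved copy of the build output. This is a minimal sketch, not part of the repository: the function name `count_semaphore_warnings` and the file name `build.log` are made up for illustration, and it assumes the message appears in the build's console output rather than only in the Bazel server log.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a saved build log that contain
# the "found no semaphore" message quoted in the steps above.
count_semaphore_warnings() {
  # grep -c prints the number of matching lines. It exits with status 1
  # when nothing matches, so "|| true" keeps the function from failing
  # in scripts that run under "set -e".
  grep -c "Multiplexer for CopyFile found no semaphore" "$1" || true
}

# Usage (assumption: the message is written to stdout or stderr):
#   USE_BAZEL_VERSION=<YOUR BAZEL REPOSITORY>/bazel-bin/src/bazel bazel build //:all 2>&1 | tee build.log
#   count_semaphore_warnings build.log
```

Comparing the count between a run against the patched Bazel and a run against an unpatched one gives a rough sense of how much the added sleep widens the race window.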