Auto merge of #569 - Mark-Simulacrum:avoid-parking, r=pietroalbini
Adjust some of the code around the worker deadlock
This switches to a Condvar associated with the graph lock to maintain the
blocked worker pool. On its own this is just a simplification, but it makes
the fixes for the following two cases easier:
* mark_as_failed is called with a try!/? operator, which means that even if
progress was made on some parts of a task, we may not have reached the
unparking code. The notification is now moved up to just after the graph lock
is re-acquired; this ensures that regardless of what happens, other threads
will have a chance to run.
* Exiting via Finished did not unpark any blocked threads.
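
The ordering described in the first bullet can be sketched roughly as follows.
This is a minimal illustration, not the code from this PR: the `Graph` and
`Shared` types, the `finish_task` function, and the body and signature of
`mark_as_failed` are all assumptions made up for the example. Only the idea of
notifying a Condvar immediately after re-acquiring the graph lock comes from
the description above.

```rust
use std::sync::{Condvar, Mutex};

// Hypothetical stand-in for the task graph; the real type is not shown here.
struct Graph {
    completed: usize,
    failed: usize,
}

struct Shared {
    graph: Mutex<Graph>,
    // Paired with `graph`: blocked workers wait on this instead of parking.
    wakeup: Condvar,
}

// Hypothetical fallible bookkeeping, playing the role of mark_as_failed.
fn mark_as_failed(graph: &mut Graph, succeed: bool) -> Result<(), String> {
    if succeed {
        graph.failed += 1;
        Ok(())
    } else {
        Err("could not record failure".to_string())
    }
}

// Called by a worker once it has run a task outside of the graph lock.
fn finish_task(shared: &Shared, task_ok: bool, bookkeeping_ok: bool) -> Result<(), String> {
    // Re-acquire the graph lock.
    let mut graph = shared.graph.lock().unwrap();
    // Notify first, so that an early return via `?` below cannot skip the wakeup.
    shared.wakeup.notify_all();
    if task_ok {
        graph.completed += 1;
    } else {
        mark_as_failed(&mut graph, bookkeeping_ok)?;
    }
    Ok(())
}

fn main() {
    let shared = Shared {
        graph: Mutex::new(Graph { completed: 0, failed: 0 }),
        wakeup: Condvar::new(),
    };
    // The bookkeeping fails and `?` returns early, but waiters were already notified.
    let result = finish_task(&shared, false, false);
    println!("early return: {:?}", result);
}
```

Woken workers still have to re-take the graph lock before they can look at the
graph, so notifying while the lock is held only schedules them to run after
the guard is dropped.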
In practice, I suspect that the second of these is the cause of our bug. The
following is an excerpt of the log before worker-7 stalls out in thread park (in
the original version of this code). worker-7 blocks on the root node; the other
workers all reach the root node as well, but exit via Finished without waking
worker-7. With the new code, worker-7 would get woken each time one of the other
workers finishes, letting it also notice that the root is finished and exit.
```
worker-7 | NodeIndex(40): this is blocked
worker-7 | NodeIndex(0): this is blocked
marking node running: cleanup of crate kivo360/rusty_web_app as complete
worker-4 | NodeIndex(0): walked to node root
worker-4 | NodeIndex(0): neighbors: [NodeIndex(40)]
worker-4 | NodeIndex(40): walked to node crate completed
worker-4 | NodeIndex(40): neighbors: []
worker-4 | NodeIndex(40): marked as complete
marking node crate completed as complete
worker-6 | NodeIndex(0): walked to node root
worker-6 | NodeIndex(0): neighbors: []
worker-8 | NodeIndex(0): walked to node root
worker-8 | NodeIndex(0): neighbors: []
worker-9 | NodeIndex(0): walked to node root
worker-9 | NodeIndex(0): neighbors: []
worker-3 | NodeIndex(0): walked to node root
worker-3 | NodeIndex(0): neighbors: []
worker-2 | NodeIndex(0): walked to node root
worker-2 | NodeIndex(0): neighbors: []
worker-0 | NodeIndex(0): walked to node root
worker-0 | NodeIndex(0): neighbors: []
worker-5 | NodeIndex(0): walked to node root
worker-5 | NodeIndex(0): neighbors: []
worker-1 | NodeIndex(0): walked to node root
worker-1 | NodeIndex(0): neighbors: []
```
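
The scenario in this log can be imitated with a small stand-alone program. It
is only a sketch under simplifying assumptions: the `Shared` struct, the
`root_finished` flag, and the `run` function are hypothetical names, and the
whole graph is reduced to a single boolean. One thread plays worker-7 and
blocks until the root is finished; the others exit via the finished path and
notify the Condvar on their way out.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Hypothetical shared state: the graph is reduced to "is the root finished?".
struct Shared {
    root_finished: Mutex<bool>,
    wakeup: Condvar,
}

// Spawns `finishers` workers that exit via the finished path, plus one worker
// that blocks on the root. Returns the number of threads that exited.
// With `finishers == 0` nobody ever notifies and this would block forever.
fn run(finishers: usize) -> usize {
    let shared = Arc::new(Shared {
        root_finished: Mutex::new(false),
        wakeup: Condvar::new(),
    });

    // Plays the role of worker-7: waits until the root is marked finished.
    let blocked = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let mut done = shared.root_finished.lock().unwrap();
            // Loop to cope with spurious wakeups and unrelated notifications.
            while !*done {
                done = shared.wakeup.wait(done).unwrap();
            }
        })
    };

    let handles: Vec<_> = (0..finishers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut done = shared.root_finished.lock().unwrap();
                *done = true;
                // The fix: wake blocked workers when exiting as finished.
                shared.wakeup.notify_all();
            })
        })
        .collect();

    let mut exited = 0;
    for handle in handles {
        handle.join().unwrap();
        exited += 1;
    }
    blocked.join().unwrap();
    exited + 1
}

fn main() {
    println!("threads exited: {}", run(9));
}
```

Removing the `notify_all` call reproduces the stall whenever the blocked worker
starts waiting before the others finish, which matches the behaviour suspected
above for the original code.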
r? `@pietroalbini`