-
Notifications
You must be signed in to change notification settings - Fork 25
Stoppable Submissions in Rust #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main-dev
Are you sure you want to change the base?
Conversation
|
Hi @adamreichold! The proposed variant doesn't cover the synchronization between threads. Other threads need a way to know that the work has stopped. There is probably a way to reduce the contention on the |
The synchronization happens with the same atomics that are used to submit any work item to the workers.
By broadcasting the |
|
I am not sure, what "broadcasting" means? What kind of logic will be executed in the CPU - which instructions? |
stop Contention?
|
The same as when any other work item/trampoline is submitted to the thread pool. The atomic instructions used to modify Maybe think of it this way: If you already have a method to run arbitrary closures on a given set of threads, do you really need additional shared state to tell these threads to exit? Or if a reference to an authority helps: I did not come up with this. I think the first time I read about this method to handle shutting down a thread pool was in an old blog post by Herb Sutter. (Of course, I can't find it now...) |
stop Contention?
It means what the existing method called "broadcast" does: Ensure that a given closure is called exactly one by each worker thread in the pool. Only the minor technicality that we are not interested in the result and do not need to run this on the current thread means that the actual method isn't called. |
|
I think an example may help. I think we are tackling two different use-cases:
The |
And the But the "message to stop", so to speak, is packaged into a closure and transferred using the existing work submission mechanism instead of using the out-of-band (This is also why the extra trampoline argument is required. The point of this is to avoid shared global state. It could also be a return value, but I figured this might be more costly if every work has to produce it compared to ignoring the argument. It could also be an array with one (non-atomic) flag per worker which is then indexed using the thread index, but that is just much more complicated than the on-stack state for no good reason.)
Every test case in this repository runs this code when it shut downs its thread pool. There is nothing extra to see here, this is just a more efficient mechanism to implement the existing shutdown semantics. |
|
Thanks for the explanations! We can have such logic in the lambda, but we also need a way to propagate the signal handlers and other interruptions originating outside of the loop. I'll come back to this in a couple of weeks and will try to incorporate your suggestions 🤗 |
| let context = inner.context(); | ||
| unsafe { | ||
| trampoline(context, thread_index); | ||
| trampoline(context, thread_index, &mut stop); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use an out pointer instead of a return value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial idea was that the overhead might be lower as those work items which do not want to stop the pool (so basically all but one) can just ignore the additional argument (instead of producing a useless false as a return value). I should pull the initialization out of the loop though so that the worker loop does not pay the price of it for each work item...
I think there is an ever nicer way which does not need to touch the trampoline signature at all but using the identity of the stop_trampoline, c.f. adamreichold/fork-join-scope@cdb63bf But transplanting it here is not that trivial because of the rules around function items/pointers and their comparison. So personally, I would prefer to switch to using dyn Fn(usize) + Sync as the work items instead of manually splitting context and trampoline first before implementing that.
|
I have tested the |
|
Thanks for suggestions, @adamreichold! And for the kind words, @chengts95 😊 I'll be adding early-stoppable parallel iterators in the upcoming v3, together with Zig support and other features. Let me know if you have any other ideas worth exploring 🤗 |
|
Hi. Maybe we can have an aysnc par for and allow manually join the threads. |
No description provided.