## Short version
Like others, I would like to be able to switch the thread-pool/scheduler backend of rayon while keeping parallel iterators. I really love the fact that most of rayon's parallelization boils down to `join()`, so one way would be to have a trait `ParallelizationContext` with a function `join()`, and, for every consuming iterator function, to provide an alternate version that takes a `ParallelizationContext` parameter. I think it's quite possible that you have considered this option previously; either way, I would be very happy to hear your thoughts on it.
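To make the idea concrete, here is a minimal sketch of what such a trait could look like. All names (`ParallelizationContext`, `Sequential`) are hypothetical and not part of rayon's current API; a rayon-backed implementation would simply forward to `rayon::join`.

```rust
// Hypothetical trait abstracting over rayon's join(). Illustrative only.
pub trait ParallelizationContext {
    /// Run both closures, possibly in parallel, and return both results.
    fn join<A, B, RA, RB>(&self, a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

/// Zero-sized context that always runs both closures sequentially,
/// on the current thread, with no thread pool involved.
pub struct Sequential;

impl ParallelizationContext for Sequential {
    fn join<A, B, RA, RB>(&self, a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        // No parallelism: just call the closures one after the other.
        (a(), b())
    }
}

fn main() {
    let (x, y) = Sequential.join(|| 1 + 1, || 2 + 2);
    assert_eq!((x, y), (2, 4));
}
```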
## My use case
I'm working on a library that implements various algorithms. The size of the input these algorithms are used on varies greatly, and in most cases parallel execution is overkill. There are also some cases where I absolutely need single-threaded execution, e.g. explicitly single-threaded benchmarks. On the other hand, there are also cases where inputs are large and parallelization is very important.
I'm considering the following options to let the user specify whether/how to use parallelization inside the algorithms:
- Use parallel iterators everywhere, and set the number of threads in rayon's thread pool dynamically. However, this does not allow fine-grained control.
- Allow the user to optionally pass a rayon `ThreadPool` and run the algorithm in that pool using `ThreadPool::install()`. Without modifications to rayon, this seems like the best solution, but it feels very hacky. Furthermore, it means that when the algorithm is single-threaded, rayon will still instantiate the global thread pool, and the user additionally has to instantiate a single-threaded pool. Finally, when implementing algorithms, one can easily forget the call to `install()`, resulting in the algorithm silently running in the global thread pool.
- Modify or fork rayon to make parallel iterators configurable with the parallelization/scheduling provider. This is the option I'd like to propose, but if you have any feedback on the other solutions, that is also welcome!
## Some more details
As explained before, my idea would be to introduce a trait, perhaps named `ParallelizationContext`, that models the `join()` function. There would be a zero-sized singleton that delegates to rayon's global `join()`, as well as one for sequential execution. Each function that consumes an iterator (like `for_each()` etc.) would get a variant, say `for_each_with_parallelization()`, mirroring the `new_with_alloc()` that std introduced recently on nightly, that additionally takes a `ParallelizationContext`. The old functions would delegate to the new ones, using the global rayon `join()`. If default features are disabled, the global singleton `ParallelizationContext` that delegates to the global `join()` would be disabled, and rayon would not create a global thread pool. This way, no breaking changes are introduced.
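The delegation pattern above can be sketched as follows, using a toy "parallel slice" in place of rayon's real iterator machinery. Every name here (`ParallelizationContext`, `Sequential`, `ParSlice`, `for_each_with_parallelization`) is illustrative, not an existing rayon API; for brevity the default context is sequential, where real rayon would default to a global-pool context.

```rust
// Hypothetical trait modelling join(), as proposed above.
pub trait ParallelizationContext {
    fn join<A, B, RA, RB>(&self, a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

/// Sequential context: runs both closures on the current thread.
pub struct Sequential;

impl ParallelizationContext for Sequential {
    fn join<A, B, RA, RB>(&self, a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        (a(), b())
    }
}

/// Toy stand-in for a parallel iterator over a slice.
pub struct ParSlice<'a, T>(pub &'a [T]);

impl<'a, T: Sync> ParSlice<'a, T> {
    /// The existing-style method delegates to the new variant with a
    /// default context (real rayon would use a global-pool context here).
    pub fn for_each<F: Fn(&T) + Sync>(&self, f: F) {
        self.for_each_with_parallelization(&Sequential, &f);
    }

    /// New variant: splits recursively via the context's join().
    pub fn for_each_with_parallelization<C, F>(&self, ctx: &C, f: &F)
    where
        C: ParallelizationContext + Sync,
        F: Fn(&T) + Sync,
    {
        match self.0.len() {
            0 => {}
            1 => f(&self.0[0]),
            n => {
                let (left, right) = self.0.split_at(n / 2);
                ctx.join(
                    || ParSlice(left).for_each_with_parallelization(ctx, f),
                    || ParSlice(right).for_each_with_parallelization(ctx, f),
                );
            }
        }
    }
}

fn main() {
    use std::sync::atomic::{AtomicU64, Ordering};
    let sum = AtomicU64::new(0);
    ParSlice(&[1u64, 2, 3, 4]).for_each(|x| {
        sum.fetch_add(*x, Ordering::Relaxed);
    });
    assert_eq!(sum.load(Ordering::Relaxed), 10);
}
```

Because the old method delegates to the new one, existing callers keep working unchanged, which is the "no breaking changes" property described above.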
I imagine this proposal is either to be discarded because of some problems I'm not aware of, might make sense as a fork of rayon, or would be a good contribution to rayon itself. In the latter case, I'd be happy to do most of the implementation work!