Using rayon under a Mutex can lead to deadlocks #592

@cuviper

https://users.rust-lang.org/t/using-rayon-in-code-locked-by-a-mutex/20119

@kornelski presented the following pseudo-code representing a possible deadlock:

fn mutex_and_par() {
   some_mutex.lock().par_iter().collect();
}

collection.par_iter().for_each(mutex_and_par);

I think the way this fails is something like:

  • Thread 1 starts processing one of the for_each(mutex_and_par) calls.
    • grabs the lock
    • splits up the par_iter() into a few jobs
    • starts the collect() on one of those jobs
  • Another thread steals one of those jobs to help out
  • Thread 1 finishes its current job
    • looks for the other jobs and sees they're stolen elsewhere
    • goes to steal more work itself while it's waiting...
    • finds another of the for_each(mutex_and_par) calls
    • tries to grab the lock (again) ... DEADLOCK
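
The sequence above can be imitated on a single thread without rayon. This is only a sketch under assumptions: `mutex_and_par_sim` and its `nested` flag are hypothetical names, the direct recursive call stands in for the worker stealing another `for_each` item while it waits, and `try_lock` is used so that the conflict is reported instead of hanging.

```rust
use std::sync::{Mutex, TryLockError};

/// Hypothetical stand-in for the locked section of `mutex_and_par`.
/// `nested` marks the call that the same thread picks up while it still
/// holds the lock. Returns `true` if any level found the mutex already locked.
fn mutex_and_par_sim(shared: &Mutex<Vec<u32>>, nested: bool) -> bool {
    // `try_lock` reports the conflict. A plain `lock()` here would never
    // return (it may deadlock or panic), since std's Mutex is not reentrant.
    match shared.try_lock() {
        Ok(guard) => {
            // Placeholder for the parallel work done under the lock.
            let _sum: u32 = guard.iter().sum();
            if nested {
                false
            } else {
                // Simulates the worker picking up another for_each item
                // while waiting for its stolen jobs; `guard` is still alive.
                mutex_and_par_sim(shared, true)
            }
        }
        Err(TryLockError::WouldBlock) => true,
        Err(TryLockError::Poisoned(_)) => panic!("mutex poisoned"),
    }
}
```

Calling `mutex_and_par_sim(&Mutex::new(vec![1, 2, 3]), false)` takes the lock at the outer level and then observes `WouldBlock` at the nested level, which is the point where the real code would block on `lock()`.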

If run serially, this code would never be a problem, but a work-stealing thread pool can make a thread implicitly re-acquire a lock it already holds. I'm not sure what we can do about this, or if there's any better advice we can give than just "don't call rayon under a mutex"...
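
One way to follow the "don't call rayon under a mutex" advice is to copy the data out while holding the lock, release it, and only then do the parallel part. The sketch below is an assumption about how that might look, not a fix from rayon itself: `snapshot_then_process` and `square_all` are hypothetical names, squaring is placeholder work, and scoped std threads stand in for `par_iter()` so the example has no dependencies.

```rust
use std::sync::Mutex;
use std::thread;

/// Placeholder per-chunk work.
fn square_all(part: &[u64]) -> Vec<u64> {
    part.iter().map(|x| x * x).collect()
}

/// Copies the data out under the lock, releases the lock, then runs the
/// parallel part on the copy.
fn snapshot_then_process(shared: &Mutex<Vec<u64>>) -> Vec<u64> {
    // The guard is a temporary, so the lock is released at the end of this
    // statement, before any parallel work starts.
    let snapshot: Vec<u64> = shared.lock().unwrap().clone();

    let (left, right) = snapshot.split_at(snapshot.len() / 2);

    // Scoped threads stand in for rayon's `par_iter()` here.
    thread::scope(|s| {
        let handle = s.spawn(|| square_all(left));
        let mut out_right = square_all(right);
        let mut out = handle.join().unwrap();
        out.append(&mut out_right);
        out
    })
}
```

The trade-off is that the parallel work sees a snapshot, so this only fits cases where the lock does not need to cover the whole computation.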
