Deprecate (and remove) MPI_Accumulate #24

@devreal

This was mentioned during the WG call today by @jeffhammond and after some thought I really like the idea, so I put down my thoughts on it.

MPI_Accumulate is the root of all evil when it comes to atomic operation performance in MPI. It allows users to mutate an unbounded number of elements with element-wise atomicity guarantees that span all accumulate functions (including single-element MPI_Fetch_and_op). No hardware in existence today (and likely none in the future) provides efficient accumulation of more than a few elements at a time, so implementations have to fall back to software emulation to guarantee atomicity between MPI_Accumulate and MPI_Fetch_and_op. This prevents MPI_Fetch_and_op from making proper use of network hardware and has been a source of great frustration.

In essence, the MPI standard contains a function that prevents us from using low-level hardware features. It has spurred a line of proposals to mitigate its impact (#8, https://github.com/mpi-forum/mpi-standard/pull/93) that went nowhere and are merely band-aids. It's also one of the main drivers for the new allocation function (#22). Instead of spending another decade trying to overcome these shortcomings, we should remove multi-element accumulate.

But I want to accumulate megabytes of data?!

Sure, MPI RMA provides you with all the functions needed to implement get-reduce-put with support from the hardware for data movement. We also provide mutual exclusion. With continuations (https://github.com/mpiwg-hybrid/mpi-standard/pull/1), you could even do that without blocking on the get or put. Or you can implement something akin to active messages (AMs) using send/recv, if that fits your needs. A function that cannot make proper use of hardware capabilities (and inhibits others from doing so) has no place in an API that aims at exposing low-level hardware features. You wouldn't accept a language that cannot make use of CPU atomic memory operations (AMOs) for any reasonable system-level coding, either.

To summarize

  1. Deprecate MPI_Accumulate, MPI_Raccumulate, MPI_Get_accumulate, and MPI_Rget_accumulate.
  2. Introduce request-based fetch-op (https://github.com/mpi-forum/mpi-standard/pull/107) to provide an alternative to MPI_Rget_accumulate for single elements.
  3. To bridge the time until removal, add an info assertion that you won't use MPI_Accumulate anymore so that we can ignore it.
