Pylops-mpi is a Python library built on top of PyLops, designed to enable distributed and parallel processing of large-scale linear algebra operations and computations.
To install pylops-mpi, you need to have Message Passing Interface (MPI) and optionally Nvidia's Collective Communication Library (NCCL) installed on your system.
-
Download and Install MPI: Visit the official MPI website to download an appropriate MPI implementation for your system. Follow the installation instructions provided by the MPI vendor.
-
Verify MPI Installation: After installing MPI, verify its installation by opening a terminal or command prompt and running the following command:
mpiexec --version
-
Install pylops-mpi: Once MPI is installed and verified, you can proceed to install
pylops-mpi
viapip
:pip install pylops-mpi
-
(Optional) To enable the NCCL backend for multi-GPU systems, install
cupy
andnccl
viapip
:pip install cupy-cudaXx nvidia-nccl-cuX
with
X=11,12
.
Alternatively, if the Conda package manager is used to setup the Python environment, steps 1 and 2 can be skipped and mpi4py
can be installed directly alongside the MPI distribution of choice:
conda install -c conda-forge mpi4py X
with X=mpich, openmpi, impi_rt, msmpi
. Similarly step 4 can be accomplished using:
conda install -c conda-forge cupy nccl
See the docs (Installation) for more information.
Once you have installed the prerequisites and pylops-mpi, you can run pylops-mpi using the mpiexec
command.
Here is an example on how to run a python script called <script_name>.py
:
mpiexec -n <NUM_PROCESSES> python <script_name>.py
The following example is a modified version of
PyLops' README_ starting
example that can handle a 2D-array distributed across ranks over the first dimension
via the DistributedArray
object:
import numpy as np
from pylops_mpi import DistributedArray, Partition
# Initialize DistributedArray with partition set to Scatter
nx, ny = 11, 21
x = np.zeros((nx, ny), dtype=np.float64)
x[nx // 2, ny // 2] = 1.0
x_dist = pylops_mpi.DistributedArray.to_dist(
x=x.flatten(),
partition=Partition.SCATTER)
# Distributed first-derivative
D_op = pylops_mpi.MPIFirstDerivative((nx, ny), dtype=np.float64)
# y = Dx
y_dist = D_op @ x_dist
# xadj = D^H y
xadj_dist = D_op.H @ y_dist
# xinv = D^-1 y
x0_dist = pylops_mpi.DistributedArray(D_op.shape[1], dtype=np.float64)
x0_dist[:] = 0
xinv_dist = pylops_mpi.cgls(D_op, y_dist, x0=x0_dist, niter=10)[0]
Note that the DistributedArray
class provides the to_dist
class method that accepts a NumPy array as input and converts it into an instance of the DistributedArray
class. This method is used to transform a regular NumPy array into a DistributedArray that is distributed and processed across multiple nodes or processes.
Moreover, the DistributedArray
class provides also fundamental mathematical operations, such as element-wise addition, subtraction, multiplication, dot product, and an equivalent of the np.linalg.norm
function that operate in a distributed fashion,
thus utilizing the efficiency of the MPI/NCC; protocols. This enables efficient computation and processing of large-scale distributed arrays.
The MPI test scripts are located in the tests
folder.
Use the following command to run the tests:
mpiexec -n <NUM_PROCESSES> pytest tests/ --with-mpi
where the --with-mpi
option tells pytest to enable the pytest-mpi
plugin, allowing the tests to utilize the MPI functionality.
Similarly, to run the NCCL test scripts in the tests_nccl
folder,
use the following command to run the tests:
mpiexec -n <NUM_PROCESSES> pytest tests_nccl/ --with-mpi
The official documentation of Pylops-MPI is available here. Visit the official docs to learn more about pylops-mpi.
- Rohan Babbar, rohanbabbar04
- Yuxi Hong, hongyx11
- Matteo Ravasi, mrava87
- Tharit Tangkijwanichakul, tharittk