Initial implementation of SUMMA like MatrixMult #136

Open: astroC86 wants to merge 14 commits into main

@astroC86 astroC86 commented Jun 2, 2025

No description provided.

@astroC86 astroC86 changed the title from "Initial implementation of SUMMA like Matrix Mul #129" to "Initial implementation of SUMMA like Matrix Mul" Jun 2, 2025
Contributor

@mrava87 mrava87 left a comment

@astroC86 good start!

As you will see, I left a number of comments. Many are stylistic (I still have to review test_matrixmult), but in some cases I am not entirely sure I can follow, especially as I am not sure whether this is meant to match the algorithm in the Appendix of your GSoC proposal.

In general, I think it is important that you add a well-written docstring to the SUMMAMatrixMult method and comments in both the code and the example. After that I will do another full review, and we will hopefully be closer to a final version that we can merge into the pylops-mpi library 🤓

Finally, whilst I think that this algorithm is very interesting and worth having, we should make sure we understand whether this is really the SUMMA algorithm from https://www.netlib.org/lapack/lawnspdf/lawn96.pdf (I am not so sure, as there they block each matrix over both rows and columns). We should then either refer to a paper that implements your algorithm, or give it a name (not SUMMA) and explain it in some detail in the Notes section of the class docstring.
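For reference, the blocking over both rows and columns mentioned above can be sketched as a serial simulation. This is only an illustration of the textbook SUMMA scheme on a square p x p process grid, not the code in this PR: the names `summa_simulated`, `get_block`, and `matmul` are hypothetical, the "broadcasts" are plain slicing, and plain Python lists are used so the sketch runs without NumPy or MPI. It assumes all matrix dimensions are divisible by p.

```python
def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def get_block(M, i, j, bs_r, bs_c):
    """Return block (i, j) of M, where blocks have size bs_r x bs_c."""
    return [row[j * bs_c:(j + 1) * bs_c] for row in M[i * bs_r:(i + 1) * bs_r]]


def summa_simulated(A, B, p):
    """Simulate SUMMA on a p x p grid (hypothetical helper, sizes divisible by p)."""
    m, k, n = len(A), len(B), len(B[0])
    bm, bk, bn = m // p, k // p, n // p
    # Process (i, j) owns blocks A_ij and B_ij and accumulates C_ij.
    C_blocks = {(i, j): [[0] * bn for _ in range(bm)]
                for i in range(p) for j in range(p)}
    for step in range(p):
        for i in range(p):
            # Block A_(i, step) would be broadcast along grid row i.
            A_panel = get_block(A, i, step, bm, bk)
            for j in range(p):
                # Block B_(step, j) would be broadcast along grid column j.
                B_panel = get_block(B, step, j, bk, bn)
                update = matmul(A_panel, B_panel)
                C_ij = C_blocks[(i, j)]
                for r in range(bm):
                    for c in range(bn):
                        C_ij[r][c] += update[r][c]
    # Reassemble the global C from the per-process blocks.
    C = []
    for i in range(p):
        for r in range(bm):
            row = []
            for j in range(p):
                row.extend(C_blocks[(i, j)][r])
            C.append(row)
    return C


A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[r * 6 + c for c in range(6)] for r in range(4)]
C = summa_simulated(A, B, p=2)
```

In this scheme every process holds a 2D block of A, B, and C, which is the property to check the PR's layout against when deciding whether the SUMMA name applies.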


super().__init__(shape=shape, dtype=np.dtype(dtype), base_comm=base_comm)

def _matvec(self, x: DistributedArray) -> DistributedArray:
Contributor

Please add some comments to the code; so far it is very hard to follow.

@mrava87 mrava87 changed the title from "Initial implementation of SUMMA like Matrix Mul" to "Initial implementation of SUMMA like MatrixMult" Jun 9, 2025
@mrava87 mrava87 added the enhancement (New feature or request) label Jun 9, 2025
f"Rank {rank}: z_dist shape {z_dist.local_array.shape} != expected {expected_z_shape}"
)

# Verify adjoint result values
Contributor

@mrava87 mrava87 Jun 9, 2025

I think we should follow what is done in all our tests: use .asarray() on y_dist and z_dist, and compare with C_true and Z_true only on rank 0.

See for example

if rank == 0:

Author

@astroC86 astroC86 Jun 13, 2025

This won't be possible because the output is replicated across processes, so when we call .asarray() (an allgather) we won't end up with
C_0||C_1||C_2||C_3
but instead with
C_0||C_1||C_0||C_1||C_2||C_3||C_2||C_3

Contributor

I know, I have been thinking a bit about this and I have a working solution 😄 I think we should add a _allgather_subcomm method to DistributedArray and, when a sub-communicator is used, have asarray return the effective array (C_0||C_1||C_2||C_3) instead of the replicated array (C_0||C_1||C_0||C_1||C_2||C_3||C_2||C_3). To avoid breaking backward compatibility, we could add an optional parameter (masked, or something along these lines) that returns the effective array if True and the total replicated array if False (the default). Since it is a minor change, I am going to push this code here, together with what I worked out in the example to compare the full matrix on rank 0. Then you can do the same for the tests.
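The difference between the replicated and the effective array can be illustrated with a small serial simulation. This is a sketch under stated assumptions, not the pylops-mpi implementation: `allgather`, `effective_gather`, and `block_ids` are hypothetical names, the 8-rank layout is simply read off the concatenation quoted above, and the proposed `_allgather_subcomm` method is not reproduced here.

```python
def allgather(local_blocks):
    """Simulate an allgather: concatenate every rank's local block in rank order."""
    return [x for block in local_blocks for x in block]


def effective_gather(local_blocks, block_ids):
    """Keep only the first copy of each block id, then concatenate by block id."""
    first_copy = {}
    for block, bid in zip(local_blocks, block_ids):
        first_copy.setdefault(bid, block)
    return [x for bid in sorted(first_copy) for x in first_copy[bid]]


# Assumed layout: rank r holds a copy of block C_{block_ids[r]}.
block_ids = [0, 1, 0, 1, 2, 3, 2, 3]
# Toy data: block C_b holds the two values 10*b and 10*b + 1.
local_blocks = [[10 * bid, 10 * bid + 1] for bid in block_ids]

replicated = allgather(local_blocks)                    # C_0||C_1||C_0||C_1||...
effective = effective_gather(local_blocks, block_ids)   # C_0||C_1||C_2||C_3
```

Only the effective array can be compared directly against a reference result on rank 0, which is why the plain allgather is not enough for the tests.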

@astroC86 astroC86 force-pushed the astroC86-SUMMA branch 2 times, most recently from 12d73e4 to f22f5ec June 13, 2025 20:02