Adds distributed row gatherer #1589
One issue that I have is the constructor. It takes a
If I can't come up with anything better, I guess I will use that.
Do we need to have the
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #1589 +/- ##
==========================================
Coverage ? 88.64%
==========================================
Files ? 849
Lines ? 71101
Branches ? 0
==========================================
Hits ? 63029
Misses ? 8072
Partials ? 0
Signed-off-by: Marcel Koch <marcel.koch@kit.edu>
- only allocate if necessary
- synchronize correct executor

Co-authored-by: Pratik Nayak <pratik.nayak@kit.edu>

- split tests into core and backend part
- fix formatting
- fix openmpi pre 4.1.x macro

Co-authored-by: Pratik Nayak <pratik.nayak4@gmail.com>
Co-authored-by: Yu-Hsiang M. Tsai <yhmtsai@gmail.com>
Signed-off-by: Marcel Koch <marcel.koch@kit.edu>

- add copy/move tests
- undo using MPI_Init_thread
- add extra host_recv_buffer_
- create row-gatherer as unique_ptr

Co-authored-by: Yu-Hsiang M. Tsai <yhmtsai@gmail.com>

The `LinOp::apply` function creates temporary clones to match the operator's executor, but this leads to wrong behavior if MPI doesn't support GPU buffers.

Right now the row gatherer (RG) doesn't support (blocking) apply, so it doesn't make much sense to keep it as a `LinOp`.

- documentation
- format
- unused code

Co-authored-by: Yu-Hsiang M. Tsai <yhmtsai@gmail.com>

…r of cores required.
This PR adds a distributed row gatherer. This operator essentially provides the communication required in our matrix apply.
Besides the normal apply (which is blocking), it also provides two asynchronous calls. One version takes an additional `workspace` parameter, which is used as the send buffer. This version can be called multiple times without restrictions, as long as a different workspace is used for each call. The other version has no workspace parameter and uses an internal buffer instead. As a consequence, it can only be called a second time once the request of the previous call has been waited on; otherwise, it throws.

This is the second part of splitting up #1546.
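
The reuse rule for the two asynchronous variants can be illustrated with a toy model. This is only a sketch and not the code in this PR: `ToyRowGatherer`, `ToyRequest`, and the vector-based signatures are hypothetical, and a plain copy stands in for the actual non-blocking MPI exchange. It shows why the workspace variant can be called repeatedly, while the internal-buffer variant has to throw until the previous request has been waited on.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a communication request handle.
class ToyRequest {
public:
    explicit ToyRequest(bool* in_flight = nullptr) : in_flight_(in_flight) {}

    // Waiting marks the transfer as done and frees the internal buffer,
    // if this request was using it.
    void wait()
    {
        if (in_flight_) {
            *in_flight_ = false;
            in_flight_ = nullptr;
        }
    }

private:
    bool* in_flight_;
};

// Hypothetical toy model of a row gatherer; not the Ginkgo class.
class ToyRowGatherer {
public:
    explicit ToyRowGatherer(std::vector<std::size_t> send_idxs)
        : send_idxs_(std::move(send_idxs))
    {}

    // Variant with a caller-provided workspace as send buffer. No shared
    // state is touched, so overlapping calls are fine as long as each call
    // gets its own workspace.
    ToyRequest apply_async(const std::vector<double>& b,
                           std::vector<double>& x,
                           std::vector<double>& workspace) const
    {
        pack(b, workspace);
        x = workspace;  // stand-in for the non-blocking exchange
        return ToyRequest{};
    }

    // Variant using the internal send buffer. A second call before the
    // previous request was waited on would overwrite data in flight,
    // so it throws instead.
    ToyRequest apply_async(const std::vector<double>& b,
                           std::vector<double>& x)
    {
        if (in_flight_) {
            throw std::runtime_error(
                "previous request has not been waited on");
        }
        in_flight_ = true;
        pack(b, send_buffer_);
        x = send_buffer_;  // stand-in for the non-blocking exchange
        return ToyRequest{&in_flight_};
    }

private:
    // Gathers the rows listed in send_idxs_ into the given buffer.
    void pack(const std::vector<double>& b, std::vector<double>& buffer) const
    {
        buffer.resize(send_idxs_.size());
        for (std::size_t i = 0; i < send_idxs_.size(); ++i) {
            buffer[i] = b.at(send_idxs_[i]);
        }
    }

    std::vector<std::size_t> send_idxs_;
    std::vector<double> send_buffer_;
    bool in_flight_ = false;
};
```

In this model, two back-to-back calls with distinct workspaces both succeed, whereas a second call to the two-argument overload throws until `wait()` has been called on the request returned by the first.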
It also introduces some intermediate changes, which could be extracted out beforehand:
- a type-erased `DenseCache`
- making `detail::run` easier to use (now part of "Use index_map in distributed::matrix" #1544)

PR Stack: