
Conversation

Collaborator

@fritzgoebel fritzgoebel commented Oct 31, 2024

This PR adds a new distributed matrix format useful for domain decomposition preconditioners such as BDDC. It is in preparation for eventually adding BDDC to Ginkgo.

Instead of assembling the global matrix, each rank stores its local contribution to the global matrix in a globally non-assembled state, where degrees of freedom on the subdomain interfaces are "shared" between multiple ranks. The global matrix application uses a restriction matrix R that maps into an enriched space in which each shared degree of freedom appears with the multiplicity of the ranks sharing it. In this space, the local contributions can be applied independently of each other before R^T is applied to sum up the local parts of the result vector.
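
Written as a formula, the global operator application described above is (a sketch of the idea; A_p denotes rank p's local contribution and A_BD the block-diagonal operator formed from them):

```
y = R^T * A_BD * R * x,   with A_BD = blockdiag(A_1, ..., A_P)
```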

This PR relies on:

TODO:

  • General Tests for matrix setup

@fritzgoebel fritzgoebel added is:new-feature A request or implementation of a feature that does not exist yet. type:matrix-format This is related to the Matrix formats 1:ST:WIP This PR is a work in progress. Not ready for review. is:experimental This is an experimental feature/PR/issue/module. mod:all This touches all Ginkgo modules. type:distributed-functionality labels Oct 31, 2024
@fritzgoebel fritzgoebel self-assigned this Oct 31, 2024
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. type:solver This is related to the solvers labels Oct 31, 2024
@fritzgoebel fritzgoebel marked this pull request as draft October 31, 2024 12:05
@fritzgoebel fritzgoebel changed the base branch from develop to dd_base October 31, 2024 13:54
@pratikvn pratikvn self-requested a review January 22, 2025 13:29
@MarcelKoch MarcelKoch added this to the Ginkgo 1.10.0 milestone Mar 13, 2025
@fritzgoebel fritzgoebel changed the base branch from dd_base to develop March 17, 2025 17:57
@fritzgoebel fritzgoebel marked this pull request as ready for review March 17, 2025 17:58
@MarcelKoch MarcelKoch self-requested a review March 18, 2025 07:59
Member

@pratikvn pratikvn left a comment

Nice work on the documentation! Mostly looks good to me. A few comments.

Comment on lines 383 to 387
check_and_adjust_buffer_size(const size_type nrhs) const
{
auto exec = this->get_executor();
auto comm = this->get_communicator();
}
Member

Seems to be incomplete? It does not use nrhs.

GKO_ASSERT_ARRAY_EQ(d_non_local_col_idxs, non_local_col_idxs);
}

std::default_random_engine engine;
Member

Not needed here.

TupleTypenameNameGenerator);


TYPED_TEST(DdMatrix, ReadsDistributed)
Member

I think tests for different row and col partitions would also be useful to have.

* 1 | 0 1 ! 1 0 | | 1 0 | | 1 |
* 1 | 0 0 ! 0 1 | | 0 1 | | 0 |
* ```
* With these operators and ablock diagonal 4x4 matrix A_BD
Member

Suggested change
* With these operators and ablock diagonal 4x4 matrix A_BD
* With these operators and a block diagonal 4x4 matrix A_BD

Comment on lines 89 to 95
* second and third, this would lead to a restriction operator R
* ```
* Part-Id Global Local Non-Local
* 0 | 1 0 ! 0 | | 1 0 | | |
* 0 | 0 1 ! 0 | | 0 1 | | |
* |---------| ---->
* 1 | 0 1 ! 0 | | 0 | | 1 |
* 1 | 0 0 ! 1 | | 1 | | 0 |
Member

I think this would be the prolongation operator, right? It's mapping from a coarser space to a finer space with a left apply.

Collaborator Author

It restricts the global vector to the local problems. Agreed, the global space for the block-diagonal matrix is larger, but for each rank it's a restriction. We can discuss this though if you disagree.
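
A small worked illustration of the quoted 4x3 operator R (a sketch based on the example above, where x1 is the degree of freedom shared between parts 0 and 1):

```
R   * (x0, x1, x2)^T     = (x0, x1, x1, x2)^T    duplicates the shared entry x1
R^T * (y0, y1, y2, y3)^T = (y0, y1 + y2, y3)^T   sums the two contributions to the shared entry
```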

Comment on lines 211 to 213
* by the process. The local matrix still considers these and the
* restriction and prolongation operators take care of fetching /
* re-distributing the corresponding vector entries.
Member

Suggested change
* by the process. The local matrix still considers these and the
* restriction and prolongation operators take care of fetching /
* re-distributing the corresponding vector entries.
* by the process. The local matrix still considers these and the
* restriction and prolongation operators take care of fetching /
* re-distributing the corresponding vector entries.

* @param data The device_matrix_data structure.
* @param partition The global left and right partition.
*
* @return the index_map induced by the partitions and the matrix structure
Member

Does not return anything. Same for all read_distributed functions below.

@fritzgoebel
Collaborator Author

format!

Co-authored-by: fritzgoebel <fritz.goebel@tum.de>
Member

@pratikvn pratikvn left a comment

Some questions and minor issues.

Comment on lines +116 to +139
* auto part = Partition<...>::build_from_mapping(...);
* auto mat = Matrix<...>::create(exec, comm);
* mat->read_distributed(matrix_data, part);
* ```
* This will set the dimensions of the global and local matrices and generate
* the restriction and prolongation matrices automatically by deducing the sizes
* from the partition.
*
* By default the Matrix type uses Csr for the local matrix and the storage of
* the local and non-local parts of the restriction and prolongation matrices.
* It is possible to explicitly change the datatype for the local matrices, with
* the constraint that the new type should implement the LinOp and
* ReadableFromMatrixData interface. The type can be set by:
* ```
* auto mat = Matrix<ValueType, LocalIndexType[, ...]>::create(
* exec, comm,
* Coo<ValueType, LocalIndexType>::create(exec).get());
* ```
* Alternatively, the helper function with_matrix_type can be used:
* ```
* auto mat = Matrix<ValueType, LocalIndexType>::create(
* exec, comm,
* with_matrix_type<Coo>());
* ```
Member

This needs to be updated? It still uses Matrix instead of DdMatrix.
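
For illustration, the first quoted snippet would presumably read as follows with the new class name (a sketch, assuming DdMatrix keeps the create/read_distributed interface used in the quoted documentation):

```
auto part = Partition<...>::build_from_mapping(...);
auto mat = DdMatrix<...>::create(exec, comm);
mat->read_distributed(matrix_data, part);
```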

Comment on lines +144 to +145
* experimental::distributed::Matrix *A; // distributed matrix
* experimental::distributed::Vector *b, *x; // distributed multi-vectors
Member

Same here

*/
template <typename ValueType = default_precision,
typename LocalIndexType = int32, typename GlobalIndexType = int64>
class DdMatrix
Member

I feel that DdMatrix might be a bit vague. What do you think about UnassembledMatrix or maybe SubstructuredMatrix?

Comment on lines +136 to +137
data, make_temporary_clone(exec, partition).get(),
make_temporary_clone(exec, partition).get(), local_part,
Member

Do you still need both row and column partitions here? The kernel can probably be simplified to use just one partition.

auto input = gko::test::generate_random_device_matrix_data<
value_type, global_index_type>(
num_rows, num_cols,
std::uniform_int_distribution<int>(static_cast<int>(num_cols - 1),
Member

Did you mean to have fully dense rows here?

Comment on lines +245 to +246
std::uniform_int_distribution<int>(static_cast<int>(num_cols),
static_cast<int>(num_cols)),
Member

Same question here

}


TYPED_TEST(DdMatrix, CanApplyToSingleVector)
Member

Probably also makes sense to add some apply tests to ensure exceptions are thrown in case of dimension mismatches.

Member

@MarcelKoch MarcelKoch left a comment

Looks mostly good. The most important remarks are about clarifying the documentation a bit more.

* restriction and prolongation operators take care of fetching /
* re-distributing the corresponding vector entries.
*
* @param data The device_matrix_data structure.
Member

It should be noted somewhere that the data has to use global indexing. Maybe you could also add a small note on how to get to global indexing via the index map if only local indexing is available.

Comment on lines +73 to +74
* operator, which are distributed matrices
* (gko::experimental::distributed::Matrix) defined through the global indices
Member

This sounds more like an implementation detail. I don't think it's relevant for users that the internal data structure for the restriction/prolongation operator is a distributed matrix. It might even make sense to switch that to a distributed::RowGatherer. All I'm saying is that we should not unnecessarily tie ourselves to a particular internal data structure now.



/**
* The DdMatrix class defines a (MPI-)distributed matrix.
Member

There should be a big note that this type only supports additive overlap. What I mean by that is that local rows on different processes which map to the same global row have to be added up. This is the correct behavior for non-overlapping DD methods, but it doesn't match overlapping ones. For those, some local rows may only be read from other processes and not added up.
I think it's fine to not support those kinds of overlapping DD methods for now, we just have to make clear that the current implementation has this restriction.
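
Restating the comment as a formula (a sketch of what additive overlap means here):

```
A(i, :) = sum_p A_p(i, :)   over all processes p holding a local row that maps to global row i
```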

* With a partition where rank 0 owns the first two rows and rank 1 the
* third, this would lead to a restriction operator R
* ```
* Part-Id Global Local Non-Local
Member

TBH, I don't think differentiating between local and non-local is necessary. If users are interested in that, they can look at the docs for the distributed::Matrix.

Comment on lines +83 to +85
* | 4 -2 0 | | 0 0 0 | | 4 -2 0 |
* | -2 2 0 | | 0 2 -2 | | -2 4 -2 |
* | 0 0 0 | | 0 -2 4 | | 0 -2 4 |
Member

Maybe note that the zero rows are not stored locally, i.e. each rank only stores a 2x2 matrix.
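
For illustration, with that convention the quoted example would be stored as (a sketch derived from the matrices above; rank 0 owns rows/columns 0-1, rank 1 owns rows/columns 1-2):

```
rank 0 stores  |  4 -2 |      rank 1 stores  |  2 -2 |
               | -2  2 |                     | -2  4 |
```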

Comment on lines +132 to +133
device_matrix_data<value_type, global_index_type> data_copy{exec, data};
auto arrays = data_copy.empty_out();
Member

This copy is not necessary, right? You only need a copy of the values for the local data; the indices are only read during the mapping to local indexing.

make_temporary_clone(exec, partition).get(), local_part,
non_owning_row_idxs, non_owning_col_idxs));

auto map = gko::experimental::distributed::index_map<LocalIndexType,
Member

nit:

Suggested change
auto map = gko::experimental::distributed::index_map<LocalIndexType,
auto map = index_map<LocalIndexType,

Comment on lines +147 to +149
arrays.col_idxs, gko::experimental::distributed::index_space::combined);
auto local_row_idxs = map.map_to_local(
arrays.row_idxs, gko::experimental::distributed::index_space::combined);
Member

nit

Suggested change
arrays.col_idxs, gko::experimental::distributed::index_space::combined);
auto local_row_idxs = map.map_to_local(
arrays.row_idxs, gko::experimental::distributed::index_space::combined);
arrays.col_idxs, index_space::combined);
auto local_row_idxs = map.map_to_local(
arrays.row_idxs, index_space::combined);

TupleTypenameNameGenerator);


TYPED_TEST(DdMatrix, BuildsEmptyIsSameAsRef)
Member

This test isn't building anything; it tests the filter, so maybe rename it (and the other tests):

Suggested change
TYPED_TEST(DdMatrix, BuildsEmptyIsSameAsRef)
TYPED_TEST(DdMatrix, FiltersEmptyIsSameAsRef)

@@ -1,4 +1,5 @@
ginkgo_create_common_and_reference_test(assembly MPI_SIZE 3)
ginkgo_create_common_and_reference_test(dd_matrix MPI_SIZE 3)
Member

You can use this instead of the #ifndef GKO_COMPILING_DPCPP macro:

Suggested change
ginkgo_create_common_and_reference_test(dd_matrix MPI_SIZE 3)
ginkgo_create_common_and_reference_test(dd_matrix MPI_SIZE 3 DISABLE_EXECUTORS dpcpp)

@MarcelKoch MarcelKoch removed this from the Ginkgo 1.10.0 milestone May 21, 2025