Skip to content

Conversation

CoxyMielle
Copy link
Collaborator

Implementation of simple_mspm (AB = C) and mspm (aAB+bC = C) in the reference and omp executor and adding corresponding tests.

@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats mod:all This touches all Ginkgo modules. labels Aug 13, 2025
@ginkgo-bot
Copy link
Member

Error: The following files need to be formatted:

omp/matrix/dense_kernels.cpp
reference/matrix/dense_kernels.cpp
reference/test/matrix/dense_kernels.cpp
test/matrix/dense_kernels.cpp

You can find a formatting patch under Artifacts here or run format! if you have write access to Ginkgo

@yhmtsai yhmtsai requested review from upsj and yhmtsai August 13, 2025 13:48
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Aug 13, 2025
Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make the code format fitting into Ginkgo requirement, please install pre-commit and run pre-commit install to install the ginkgo config.
after installing that, it should format your code when you try to commit something.
To apply it to the codes already in branch, you can use re-commit run --from-ref origin/develop --to-ref HEAD

const matrix::Csr<ValueType, IndexType>* b,
const matrix::Dense<ValueType>* beta, matrix::Dense<ValueType>* c)
{
// TODO: implement c = alpha * a * b + beta * c with single thread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: implement c = alpha * a * b + beta * c with single thread

you can delete this, too

const matrix::Csr<ValueType, IndexType>* b,
matrix::Dense<ValueType>* c)
{
// TODO: implement c = a * b with single thread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: implement c = a * b with single thread

sorry for my wrong copy. you can delete this comment here

const matrix::Csr<ValueType, IndexType>* b,
matrix::Dense<ValueType>* c)
{
// TODO: implement c = a * b with single thread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: implement c = a * b with single thread

const matrix::Csr<ValueType, IndexType>* b,
const matrix::Dense<ValueType>* beta, matrix::Dense<ValueType>* c)
{
// TODO: implement c = alpha * a * b + beta * c with single thread
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: implement c = alpha * a * b + beta * c with single thread

Comment on lines +153 to +154
const auto a_vals = acc::helper::build_const_rrm_accessor<ValueType>(a);
const auto b_vals = acc::helper::build_const_rrm_accessor<ValueType>(b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unnecessary because the kernel is working for uniform precision now.
Does it affect your performance?

for(IndexType k=zero<IndexType>(); k<b->get_size()[0]; k++){
const auto val_A = define_multiplication_operand(row, k);
//iterate over the non-zero values of a row
for(IndexType idx_B=b_rowptrs[k]; idx_B<b_rowptrs[k+1]; idx_B++){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for(IndexType idx_B=b_rowptrs[k]; idx_B<b_rowptrs[k+1]; idx_B++){
for(auto idx_b=b_rowptrs[k]; idx_b<b_rowptrs[k+1]; idx_b++){

also for the name

initialize_accumulator(th_acc_begin_ptr, sub_acc_size, row);
//iterate over the whole matrix b
for(IndexType k=zero<IndexType>(); k<b->get_size()[0]; k++){
const auto val_A = define_multiplication_operand(row, k);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +181 to +182
auto out_ptr = c_vals_ptr + row*c->get_stride();
std::copy(th_acc_begin_ptr, th_acc_end_ptr, out_ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Maybe I missing that. Could you remind me why you need the accumulator array?
From the implementation, one thread handle a row, so you can directly operate on the output data without accumulator, right?

Comment on lines +215 to +216
std::transform( //initialize the accumulator with c + beta
begin_row_c_vals_ptr, begin_row_c_vals_ptr + acc_size,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::transform( //initialize the accumulator with c + beta
begin_row_c_vals_ptr, begin_row_c_vals_ptr + acc_size,
//initialize the accumulator with c + beta
std::transform(
begin_row_c_vals_ptr, begin_row_c_vals_ptr + acc_size,

I would move comment out of function call to let clang-format do format unless there is a good reason.

Comment on lines +165 to +167
auto advanced_def_mult_operand = [a, alpha](IndexType row, IndexType k){
return alpha->at(0, 0) * a->at(row, k); //multiply a(row,k) by alpha
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto advanced_def_mult_operand = [a, alpha](IndexType row, IndexType k){
return alpha->at(0, 0) * a->at(row, k); //multiply a(row,k) by alpha
};
auto advanced_def_mult_operand = [alpha](auto a_val, auto b_val){
return alpha->at(0, 0) * a_val * b_val; //multiply a(row,k) by alpha
};

from the name, I was expecting this form. also this will reduce the unclear data access from the function call, but I do not have strong opinion on this yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

1:ST:ready-for-review This PR is ready for review mod:all This touches all Ginkgo modules. reg:testing This is related to testing. type:matrix-format This is related to the Matrix formats

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants