Skip to content

Conversation

@yasahi-hpc
Copy link
Collaborator

This PR aims at allowing transpose function to perform deep_copy if transpose is not needed.

  • Using execution space instance appropriately in transpose unit-tests
  • Skip some case when map is identical. I found that the test will result in test failures in fftshift tests in Release build on Cuda backend. This does not happen in CPU backends or Cuda backend with Debug build. This happens only with 4D cases.

@yasahi-hpc yasahi-hpc self-assigned this Oct 26, 2025
@yasahi-hpc yasahi-hpc added enhancement New feature or request cleanup labels Oct 26, 2025
Comment on lines 1433 to 1438
std::size_t dst_i0 = (map[0] == 1) ? i1
: (map[0] == 2) ? i2
: (map[0] == 3) ? i3
: (map[0] == 4) ? i4
: (map[0] == 5) ? i5
: (map[0] == 6) ? i6
Copy link

@PaulGannay PaulGannay Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to this PR.
Maybe you should have an array std::size_t idx[7], that you'd use to iterate over idx[0], idx[1], ...
An other array std::size_t dst_idx[7]
And then, you could write

for (int i=0; i<7;++i) {
  dst_idx[i] = idx[map[i]];
}

And maybe you could use a recursive template for the 7 and put the whole loop in a function that could be reused across all the dimensions?

Copy link

@PaulGannay PaulGannay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in KokkosFFT_transpose.hpp looks correct, but the test file is so big that it is becoming hard to review, it would be best if you could find a way to factorise some of the code to make the changes easier to track.
It is orthogonal with the work done in this PR though.

Copy link
Member

@tretre91 tretre91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@yasahi-hpc
Copy link
Collaborator Author

Thanks @PaulGannay @tretre91
I will refactor first in another PR
As you said, tests can be simplified

@yasahi-hpc
Copy link
Collaborator Author

@PaulGannay I opened #348 focusing on the execution space instance. Can I have your review please.

@yasahi-hpc
Copy link
Collaborator Author

@PaulGannay I opened #349 to make reference transposed view simpler. Actually, I encountered an issue with this change alone. The failed test is skipped now. Can I have your review please.

@yasahi-hpc yasahi-hpc force-pushed the copy-if-map-is-identical branch from af5d2a7 to fbb74ae Compare October 30, 2025 10:41
@yasahi-hpc
Copy link
Collaborator Author

Hi, I have rebased after PRs #348 and #349
May I have your another review please @PaulGannay

Copy link

@PaulGannay PaulGannay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yasahi-hpc
Copy link
Collaborator Author

@PaulGannay

Thanks! I will merge this

@yasahi-hpc yasahi-hpc merged commit 88fd4f3 into kokkos:main Oct 30, 2025
39 checks passed
@yasahi-hpc yasahi-hpc deleted the copy-if-map-is-identical branch October 30, 2025 12:42
@yasahi-hpc yasahi-hpc mentioned this pull request Oct 31, 2025
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cleanup enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants