Member

@yhmtsai yhmtsai commented Nov 15, 2024

A sub-group size of 8, the default for most kernels (I assume), is available on the A770, but kernels built with it cannot run or terminate successfully there. This might come from a mismatch between driver, kernel, and compiler.
We pass -fsycl-default-sub-group-size=16 so that the jobs do not hang for now.
Interestingly, the cooperative group with sub-group size 8 works for now.
I was inspired to try this because the job with 8 hangs but the job with 16 works.
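
For anyone reproducing this locally, here is a minimal sketch of how the workaround flag could be toggled without editing the cmake line by hand. The helper `build_cxx_flags` and the variable `SUB_GROUP_SIZE` are hypothetical names, not part of this PR; only `-fsycl-default-sub-group-size` and the base flags come from the CI command. The script prints the cmake invocation instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): build the CMAKE_CXX_FLAGS value,
# appending the sub-group-size workaround only when a size is given.
build_cxx_flags() {
    size="$1"
    flags="-Wpedantic -ffp-model=precise"
    if [ -n "$size" ]; then
        # -fsycl-default-sub-group-size is the flag used in this PR.
        flags="$flags -fsycl-default-sub-group-size=$size"
    fi
    printf '%s\n' "$flags"
}

# SUB_GROUP_SIZE is an assumed variable name: unset means "use 16",
# set but empty means "no workaround".
CXX_FLAGS="$(build_cxx_flags "${SUB_GROUP_SIZE-16}")"
echo "cmake .. -DCMAKE_CXX_FLAGS=\"$CXX_FLAGS\""
```

Running it with `SUB_GROUP_SIZE=` (empty) prints the command without the workaround, which makes it easy to check whether the hang still occurs on a given machine.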

@yhmtsai yhmtsai added the plat:intel This is related to the Intel compilers. label Nov 15, 2024
@yhmtsai yhmtsai requested review from a team November 15, 2024 18:23
@yhmtsai yhmtsai self-assigned this Nov 15, 2024
@ginkgo-bot ginkgo-bot added the reg:ci-cd This is related to the continuous integration system. label Nov 15, 2024
@yhmtsai yhmtsai force-pushed the a770_action_workaround branch 2 times, most recently from f76e723 to e2a9110 Compare November 16, 2024 13:03
@yhmtsai yhmtsai force-pushed the a770_action_workaround branch from e2a9110 to 229df3c Compare November 16, 2024 20:41
Member

@upsj upsj left a comment

LGTM!

cmake .. -DCMAKE_INSTALL_PREFIX=install_ginkgo -DCMAKE_CXX_FLAGS="-Wpedantic -ffp-model=precise" -DCMAKE_CXX_COMPILER=${{ matrix.config.compiler }} -DCMAKE_BUILD_TYPE=${{ matrix.config.build_type }} -DGINKGO_MIXED_PRECISION=${{ matrix.config.mixed }} -DGINKGO_BUILD_CUDA=OFF -DGINKGO_BUILD_HIP=OFF -DGINKGO_BUILD_MPI=OFF -DGINKGO_DPCPP_SINGLE_MODE=ON
make -j8
ONEAPI_DEVICE_SELECTOR=level_zero:gpu ctest -j10 --output-on-failure
cmake .. -GNinja -DCMAKE_INSTALL_PREFIX=install_ginkgo -DCMAKE_CXX_FLAGS="-Wpedantic -ffp-model=precise -fsycl-default-sub-group-size=16 -Wno-unused-command-line-argument -Wno-deprecated" -DCMAKE_CXX_COMPILER=${{ matrix.config.compiler }} -DCMAKE_BUILD_TYPE=${{ matrix.config.build_type }} -DGINKGO_MIXED_PRECISION=${{ matrix.config.mixed }} -DGINKGO_BUILD_CUDA=OFF -DGINKGO_BUILD_HIP=OFF -DGINKGO_BUILD_MPI=OFF -DGINKGO_DPCPP_SINGLE_MODE=ON
Member


I would keep -Wdeprecated around, especially now that we are starting to fix those deprecated usages.

@yhmtsai yhmtsai added the 1:ST:do-not-merge Please do not merge PR this yet. label Nov 18, 2024
Member Author

yhmtsai commented Nov 19, 2024

The workaround is no longer needed, as the machine runs normally now.

@yhmtsai yhmtsai closed this Nov 19, 2024
Labels

1:ST:do-not-merge Please do not merge PR this yet. plat:intel This is related to the Intel compilers. reg:ci-cd This is related to the continuous integration system.

3 participants