-
Notifications
You must be signed in to change notification settings - Fork 295
Fix use of CUDA CCs in LLVM easyblock and cleanup #3755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Fix use of CUDA CCs in LLVM easyblock and cleanup #3755
Conversation
easybuild/easyblocks/l/llvm.py
Outdated
if gpu_archs: | ||
self.runtimes_cmake_args['LIBOMPTARGET_DEVICE_ARCHITECTURES'] = '%s' % '|'.join(gpu_archs) | ||
self._cmakeopts['LIBOMPTARGET_DEVICE_ARCHITECTURES'] = ';'.join(gpu_archs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is going to work, and I would be extremely hesitant towards reverting this.
The sanity check may pass, simply because all architectures are being built, but this is not what we want.
See e.g. #3675 (comment) for LLVM 19. I think this is also something where @bedroge had issues with, which we resolved with the currently implemented approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually see the opposite: I did a test with the CLang and the LLVM easyblock of the "same" easyconfig and did a diff:
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_35.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_37.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_50.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_52.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_53.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_60.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_61.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_62.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_70.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_72.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_75.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_86.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_87.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_89.bc.
Nur in /tmp/install-old/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0/lib64: libomptarget-nvptx-sm_90.bc.
I inspected the CMake files and every CMake value with prefix LIBOMPTARGET
get automatically passed through to the sub-builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were significant changes with LLVM 19 & LLVM 20, which might influence the results a bit.
I just checked my LLVM 19 & 20 installs on my machines, and with the currently implemented approach, only the passed architectures are built.
I don't have an LLVM 18.1.8 installation with the new EasyBlock available to compare to, and unfortunately jsc-zen3
hasn't either. Maybe we missed something for this version in particular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a companion PR for testing with some easyconfigs of various versions. Build is running, will check
IIRC LLVM 20 doesn't use those anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking a Clang installation done with the old EasyBlock just before EB 5.0.0 (4.9.5dev), I can see that all architectures are built:
jreuter@ZAM054 /opt/EasyBuild/apps/software/Clang/18.1.8-GCCcore-13.3.0-CUDA-12.6.0 find . -name "*.bc"
./lib/libomptarget-amdgpu-gfx1150.bc
./lib/libomptarget-amdgpu-gfx900.bc
./lib/libomptarget-amdgpu-gfx941.bc
./lib/libomptarget-amdgpu-gfx1032.bc
./lib/libomptarget-amdgpu-gfx902.bc
./lib/libomptarget-nvptx-sm_89.bc
./lib/libomptarget-amdgpu-gfx801.bc
./lib/libomptarget-amdgpu-gfx1103.bc
./lib/libomptarget-amdgpu-gfx1010.bc
./lib/libomptarget-nvptx-sm_37.bc
./lib/libomptarget-nvptx-sm_80.bc
./lib/libomptarget-nvptx-sm_86.bc
./lib/libomptarget-nvptx-sm_61.bc
./lib/libomptarget-nvptx-sm_70.bc
./lib/libomptarget-nvptx-sm_50.bc
./lib/libomptarget-amdgpu-gfx1030.bc
./lib/libomptarget-nvptx-sm_62.bc
./lib/libomptarget-nvptx-sm_75.bc
./lib/libomptarget-nvptx-sm_72.bc
./lib/libomptarget-amdgpu-gfx1031.bc
./lib/libomptarget-amdgpu-gfx90c.bc
./lib/libomptarget-nvptx-sm_87.bc
./lib/libomptarget-nvptx-sm_60.bc
./lib/libomptarget-amdgpu-gfx1035.bc
./lib/libomptarget-amdgpu-gfx803.bc
./lib/libomptarget-nvptx-sm_35.bc
./lib/libomptarget-nvptx-sm_53.bc
./lib/libomptarget-amdgpu-gfx942.bc
./lib/libomptarget-amdgpu-gfx940.bc
./lib/libomptarget-amdgpu-gfx1101.bc
./lib/libomptarget-amdgpu-gfx1100.bc
./lib/libomptarget-amdgpu-gfx90a.bc
./lib/libomptarget-amdgpu-gfx906.bc
./lib/libomptarget-amdgpu-gfx701.bc
./lib/libomptarget-amdgpu-gfx700.bc
./lib/libomptarget-amdgpu-gfx908.bc
./lib/libomptarget-amdgpu-gfx1033.bc
./lib/libomptarget-amdgpu-gfx1151.bc
./lib/libomptarget-amdgpu-gfx1034.bc
./lib/libomptarget-amdgpu-gfx1036.bc
./lib/libomptarget-nvptx-sm_90.bc
./lib/libomptarget-amdgpu-gfx1102.bc
./lib/libomptarget-nvptx-sm_52.bc
Compared to LLVM 19.1.1 with the latest EasyBlock:
jreuter@ZAM054 /opt/EasyBuild/apps/software/LLVM/19.1.1-GCCcore-13.3.0 find . -name "*.bc"
./lib/x86_64-unknown-linux-gnu/libomptarget-nvptx-sm_80.bc
./lib/x86_64-unknown-linux-gnu/libomptarget-nvptx-sm_75.bc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I might have made a mistake during the comparison. I'm doing a rebuild of both versions now.
The idea was to simplify this and make it easier to understand in the future. To me this runtimes_cmake_args
thing was highly suspicious/confusing especially with the pipe separator and it isn't clear either if and why this argument isn't required for the "main" build.
In the source you can see that there are some variables
and the LIBOMP*
prefix passed through by default. And according to git blame
this has been virtually always the case.
So I'm confident this is the best way, especially as I couldn't find any examples doing it the current way which makes me thing this is "safer".
I think building a few versions (see the EC PR) with this should be enough to check that this works for all 3 major versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked LLVM 18.1.8 with your changes, and there it seems to work with your solution as well.
$ find . -name "*.bc"
./lib/libomptarget-nvptx-sm_75.bc
./lib/libomptarget-amdgpu-gfx1101.bc
./lib/libomptarget-nvptx-sm_80.bc
Lets check the other ones for safety as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've redone the tests with Clang 18.1.8 once with the clang easyblock and once with the updated llvm easyblock, both with CCC=7.0,8.0, and can still see that the clang easyblock builds all architectures. Diff of the install folder restricted to added files and ignoring the lib64 symlink:
Only in Clang-Clang/easybuild/reprod/easyblocks: clang.py.
Only in Clang-Clang/lib: libgomp.so.
Only in Clang-Clang/lib: libiomp5.so.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1010.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1030.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1031.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1032.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1033.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1034.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1035.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1036.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1100.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1101.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1102.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1103.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1150.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx1151.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx700.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx701.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx801.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx803.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx900.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx902.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx906.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx908.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx90a.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx90c.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx940.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx941.bc.
Only in Clang-Clang/lib: libomptarget-amdgpu-gfx942.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_35.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_37.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_50.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_52.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_53.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_60.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_61.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_62.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_72.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_75.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_86.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_87.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_89.bc.
Only in Clang-Clang/lib: libomptarget-nvptx-sm_90.bc.
Only in Clang-LLVM/bin: count.
Only in Clang-LLVM/bin: FileCheck.
Only in Clang-LLVM/bin: lli-child-target.
Only in Clang-LLVM/bin: llvm-jitlink-executor.
Only in Clang-LLVM/bin: llvm-PerfectShuffle.
Only in Clang-LLVM/bin: not.
Only in Clang-LLVM/bin: obj2yaml.
Only in Clang-LLVM/bin: split-file.
Only in Clang-LLVM/bin: UnicodeNameMappingGenerator.
Only in Clang-LLVM/bin: yaml2obj.
Only in Clang-LLVM/bin: yaml-bench.
Only in Clang-LLVM/easybuild/reprod/easyblocks: llvm.py.
Only in Clang-LLVM/include: x86_64-pc-linux.
Only in Clang-LLVM/include: x86_64-pc-linux-gnu.
Only in Clang-LLVM/lib/clang/18/lib: x86_64-pc-linux.
Only in Clang-LLVM/lib/clang/18/lib: x86_64-pc-linux-gnu.
Only in Clang-LLVM/lib: libLLVM-18.so.
Only in Clang-LLVM/lib: libLLVM.so.
Only in Clang-LLVM/lib: libLLVM.so.18.1.
Only in Clang-LLVM/lib: libMLIR.so.
Only in Clang-LLVM/lib: libMLIR.so.18.1.
Only in Clang-LLVM/lib: x86_64-pc-linux.
Only in Clang-LLVM/lib: x86_64-pc-linux-gnu.
libgomp
and libiomp
are symlinks to libomp
. Not sure why LLVM doesn't create them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might worth checking if this has some influence:
Nur in Clang-Clang/lib: libgomp.so.
Nur in Clang-Clang/lib: libiomp5.so.
One can select the OpenMP library with -fopenmp=[...]
IIRC.
If choosing libgomp
still works, I wouldn't care that the files are missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That option seemed to have been considered in an ancient Clang version but dropped. Given that those are symlinks I don't see a reason to not install them just in case.
easybuild/easyblocks/l/llvm.py
Outdated
self.runtimes_cmake_args['LIBOMPTARGET_PLUGINS_TO_BUILD'] = '%s' % '|'.join(self.offload_targets) | ||
self._cmakeopts['LIBOMPTARGET_PLUGINS_TO_BUILD'] = ';'.join(self.offload_targets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comment.
I don't see why we should revert this when the current approach is proven to work just fine.
We've tested this very extensively already, both with CUDA & ROCm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding offload targets, see above.
The remainder looks good to me at a quick first glance ( except the failing CI of course 😄 )
Test report by @Thyre Overview of tested easyconfigs (in order)
Build succeeded for 0 out of 1 (1 easyconfigs in total) The CMake option changes break builds where offload architectures need to be passed. |
I had messed up quoting one argument. Shall we introduce a function that does |
As for the CI failure: I'd rather make the function more flexible and use that: easybuilders/easybuild-framework#4913 IMO this is better than having to repeat our logic of determining the CCCs in many places. I had set the build option locally which is why I didn't notice it. |
Both sound like a good idea 👍 |
@Flamefire Can you look into resolving the merge conflict after the merge of #3741 ? |
2041060
to
19db109
Compare
Rebased on develop after merged PRs Currently seeing |
Test report by @Flamefire Overview of tested easyconfigs (in order)
Build succeeded for 2 out of 2 (2 easyconfigs in total) |
Test report by @Flamefire Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 2 (2 easyconfigs in total) |
@Flamefire What's up with the failed tests, are those triggered by the changes in this PR? (If not, let's follow up on those in a subsequent PR) |
I though they were not as I couldn't find what was wrong. But it seems they are triggered by this PR as my cross-check test on the same node went through: #3770 (comment) I'll check again... |
be6669b
to
c84b506
Compare
c84b506
to
9d75652
Compare
easybuild/easyblocks/l/llvm.py
Outdated
@@ -234,16 +235,22 @@ def __init__(self, *args, **kwargs): | |||
"""Initialize LLVM-specific variables.""" | |||
super().__init__(*args, **kwargs) | |||
|
|||
if self.cfg['usepolly'] is not None: | |||
self.log.deprecated("Use of easyconfig parameter 'usepolly', replace by 'use_polly'", '6.0') | |||
self.cfg['use_polly'] = self.cfg['usepolly'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only be done if use_polly
is not specified. If both are set, use_polly
value should be used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case would both be set? That would IMO only be done in the easyconfig in which case I don't see an indication on which should be preferred over the other, if they'd be different in the first place.
use_polly
has a default of False
, so for this change we'd need to change it to None
which makes the documentation worse (but should work without additional changes). So I would just ignore if both are set and keep the current behavior. Or if we handle it, I'd make it an error if both are set, due to the above reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we deprecate usepolly
, then it should never override the value set in use_polly
.
If that means using None
as default value for use_polly
, so be it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change implemented in #3812, this branch rebased: https://github.com/easybuilders/easybuild-easyblocks/compare/86e11ecb5de9c7b0e74e900b4aaef6b9250c06b8..631bb23a8b75ae2a858a60e527a6fb7f031ef153
9d75652
to
b740bfb
Compare
I factored out 2 other PRs to make this smaller. and rebased this on them |
86e11ec
to
631bb23
Compare
With 652 on Sapphire Rapids, 326 on AMD EPYC with H100 GPUs (exactly half!) This is already happening with #3812 which shouldn't cause any such failures. Retesting with develop branch Build looks fine on AMD EPYC (CPU node) |
Test report by @boegel Overview of tested easyconfigs (in order)
Build succeeded for 1 out of 1 (1 easyconfigs in total) |
@Flamefire this needs a rebase now that #3812 is merged? |
631bb23
to
b0e4f75
Compare
- Use f-Strings - Update comments - Use list append/extend
`LooseVersion(self.version)` in `sanity_check_step` can be done only once. Comparisons of a LooseVersion to a string already converts the string.
b0e4f75
to
d56e66a
Compare
Rebase only to replace the |
(created using
eb --new-pr
)Combined #3812 & #3815 with additional fixes. Total:
usepolly
touse_polly
consistent with other optionsget_cuda_cc_template_value
andlist_to_cmake_arg
to avoid comments and simplify coderuntimes_cmake_args
(Variables prefixed with e.g.LIBOMPTARGET
get passed directly and should rather be passed to the main configure)super()
call, f-Strings and list changesAdditional:
build_openmp_offload
for passing CUDA CCs to allow use with LLVM 18 (i.e.LIBOMPTARGET_DEVICE_ARCHITECTURES
is now set for LLVM 18 too, since LLVM 17.0 this is defaulted to "all" in CMakeLists.txt)nvptx_target_cond
block)Test easyconfigs: easybuilders/easybuild-easyconfigs#23028
Fixes #3758