Also check the EasyBuild hooks when checking missing installations, restrict the CUDA license hook to only trigger under specific conditions #1075

Merged
5 commits merged into EESSI:2023.06-software.eessi.io on May 20, 2025

Conversation

ocaisa
Member

@ocaisa ocaisa commented May 6, 2025

No description provided.


eessi-bot bot commented May 6, 2025

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/sapphirerapids, x86_64/intel/skylake_avx512, x86_64/intel/cascadelake, x86_64/intel/icelake, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi.io-2023.06-software


eessi-bot bot commented May 6, 2025

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi.io-2023.06-software

@gpu-bot-ugent

gpu-bot-ugent bot commented May 6, 2025

Instance eessi-bot-vsc-ugent is configured to build for:

  • architectures: x86_64/amd/zen3
  • repositories: eessi-hpc.org-2023.06-software, eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

@eessi-bot-toprichard

Instance rt-Grace-jr is configured to build for:

  • architectures: aarch64/nvidia/grace
  • repositories: eessi.io-2023.06-software

@eessi-bot-surf

eessi-bot-surf bot commented May 6, 2025

Instance eessi-bot-surf is configured to build for:

  • architectures: x86_64/amd/zen4, x86_64/amd/zen2
  • repositories: eessi-hpc.org-2023.06-software, eessi.io-2023.06-software, eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat

@ocaisa
Member Author

ocaisa commented May 6, 2025

Ha, that's good, #1074 was merged while the CI here was running, so it does indeed work, see https://github.com/EESSI/software-layer/actions/runs/14855477571/job/41707688661?pr=1075

@ocaisa ocaisa added the enhancement New feature or request label May 6, 2025
Avoids
```
fatal: origin/2023.06-software.eessi.io...HEAD: no merge base
```
boegel previously approved these changes May 20, 2025
@boegel
Contributor

boegel commented May 20, 2025

@ocaisa What needs to happen here to make CI happy?

@ocaisa
Member Author

ocaisa commented May 20, 2025

An upstream PR has overwritten the hooks, so the new check has caught exactly what it was meant to catch. Ingesting any PR that updates the hooks will fix the problem. I have a couple of different updates to the hooks that I want to make; I'll do the simplest one now and add it to this PR.

@ocaisa ocaisa changed the title Also check the EasyBuild hooks when checking missing installations Also check the EasyBuild hooks when checking missing installations, restrict the CUDA license hook to only trigger under specific conditions May 20, 2025
@ocaisa
Member Author

ocaisa commented May 20, 2025

@boegel This now solves the issue seen in https://gitlab.com/eessi/support/-/issues/116#note_2440400281

@ocaisa
Member Author

ocaisa commented May 20, 2025

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2


eessi-bot bot commented May 20, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)

@eessi-bot-deucalion

eessi-bot-deucalion bot commented May 20, 2025

Updates by the bot instance eessi-bot-deucalion (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2 from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/amd/zen2
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/amd/zen2 resulted in:

    • no jobs were submitted
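
The "expanded format" lines above suggest that the bot rewrites short argument names into long ones before handling a command. A minimal sketch of that expansion, assuming only the two aliases visible in this thread (`repo` and `arch`); the function name and error handling are hypothetical and this is not the bot's actual implementation:

```python
# Hypothetical sketch: only the aliases visible in this thread are covered.
SHORT_TO_LONG = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line):
    """Return the command with short argument names replaced by long ones."""
    text = line.strip()
    # Drop the leading "bot:" marker used in PR comments, if present.
    if text.startswith("bot:"):
        text = text[len("bot:"):].strip()
    action, *args = text.split()
    expanded = []
    for arg in args:
        # Split on the first ':' only, so values may contain further colons.
        key, sep, value = arg.partition(":")
        if not sep:
            raise ValueError(f"argument without ':' separator: {arg!r}")
        expanded.append(f"{SHORT_TO_LONG.get(key, key)}:{value}")
    return " ".join([action] + expanded)


print(expand_bot_command(
    "bot: build repo:eessi.io-2023.06-software "
    "instance:eessi-bot-mc-aws arch:x86_64/amd/zen2"
))
```

Arguments that have no alias in the table, such as `instance`, pass through unchanged, which matches the expanded command reported by the bot instances above.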

@eessi-bot-surf

eessi-bot-surf bot commented May 20, 2025

Updates by the bot instance eessi-bot-surf (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2 from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/amd/zen2
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/amd/zen2 resulted in:

    • no jobs were submitted

@eessi-bot-toprichard

Updates by the bot instance rt-Grace-jr (click for details)
  • account ocaisa has NO permission to send commands to the bot


eessi-bot bot commented May 20, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.05/pr_1075/64285

date job status comment
May 20 08:16:30 UTC 2025 submitted job id 64285 awaits release by job manager
May 20 08:17:08 UTC 2025 released job awaits launch by Slurm scheduler
May 20 08:23:11 UTC 2025 running job 64285 is running
May 20 08:30:19 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-64285.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-17477294020.tar.gz
size: 0 MiB (16060 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/init/easybuild/eb_hooks.py
May 20 08:30:19 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:x86_64_amd_zen2+default
P: perf: 438.113 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:x86_64_amd_zen2+default
P: perf: 431.135 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /775175bf @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.79 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /52707c40 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 1.78 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node %device_type=cpu /b1aacda9 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 3.86 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node %device_type=cpu /c6bad193 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 4.16 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.58 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:x86_64_amd_zen2+default
P: latency: 0.54 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7198.01 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:x86_64_amd_zen2+default
P: bandwidth: 7175.87 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-64285.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
May 20 11:19:29 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-17477294020.tar.gz to S3 bucket succeeded
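
The build report above marks the job as successful when certain messages are absent from the job output file and others are present. A minimal sketch of that kind of check, with the patterns copied from the first Details list; how the bot actually applies them is an assumption, and `check_job_output` is a hypothetical name:

```python
import re

# Patterns copied from the Details list of the build report; the matching
# logic itself is assumed, not taken from the bot's code.
MUST_NOT_MATCH = [r"FATAL:", r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_MATCH = [r"No missing installations", r"\.tar\.gz created!"]


def check_job_output(text):
    """Return (ok, problems) for the contents of a job output file."""
    problems = []
    for pattern in MUST_NOT_MATCH:
        if re.search(pattern, text):
            problems.append(f"found message matching {pattern}")
    for pattern in MUST_MATCH:
        if not re.search(pattern, text):
            problems.append(f"no message matching {pattern}")
    return (not problems, problems)


ok, problems = check_job_output("No missing installations\nfoo.tar.gz created!\n")
print(ok, problems)
```

Requiring the positive messages as well as the absence of error messages means that an empty or truncated output file is not reported as a success.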

@@ -965,52 +965,56 @@ def post_postproc_cuda(self, *args, **kwargs):
Remove files from CUDA installation that we are not allowed to ship,
and replace them with a symlink to a corresponding installation under host_injections.
"""
if self.name == 'CUDA':
Member Author

The change here is actually quite minimal. The previous condition was

if self.name == 'CUDA' and eessi_installation:

which meant that anyone who tried to install CUDA outside of an EESSI installation would (incorrectly) get the EasyBuildError.
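
The difference between the two conditions can be sketched as follows. This is an illustration under stated assumptions, not the code in `eb_hooks.py`: the regular expression used to recognise an EESSI installation path, the `HookError` stand-in for `EasyBuildError`, the function names, and the returned strings are all hypothetical.

```python
import re

# Assumed pattern for "an EESSI installation"; the real check may differ.
EESSI_PREFIX_REGEX = r"^/cvmfs/[^/]+/versions/"


class HookError(Exception):
    """Stand-in for EasyBuildError so the sketch runs without EasyBuild."""


def old_behaviour(name, installdir):
    eessi_installation = bool(re.search(EESSI_PREFIX_REGEX, installdir))
    if name == 'CUDA' and eessi_installation:
        return "license handling applied"
    # A CUDA install outside EESSI also ends up here: the reported problem.
    raise HookError("CUDA-specific hook triggered unexpectedly")


def new_behaviour(name, installdir):
    if name == 'CUDA':
        if re.search(EESSI_PREFIX_REGEX, installdir):
            return "license handling applied"
        # Non-EESSI CUDA installs are skipped with a message, not an error.
        return f"hook not triggered for installation path {installdir}"
    raise HookError("CUDA-specific hook triggered for non-CUDA software")


print(new_behaviour("CUDA", "/home/ocaisa/eessi/versions/2023.06/software/CUDA/12.1.1"))
```

With the name test separated from the installation-path test, the error branch is reached only when the hook runs for software other than CUDA.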

Member Author

@ocaisa ocaisa May 20, 2025

This change works as expected (tested using EESSI-extend):

ocaisa@~/EESSI/software-layer(hooks_check)$ module list

Currently Loaded Modules:
  1) EESSI/2023.06   2) EasyBuild/5.0.0   3) EESSI-extend/2023.06-easybuild

ocaisa@~/EESSI/software-layer(hooks_check)$ eb CUDA-12.1.1.eb --rebuild --accept-eula-for=CUDA --hooks=eb_hooks.py
...
== Running post-postproc hook...
== EESSI hook to respect CUDA license not triggered for installation path /home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/CUDA/12.1.1

@ocaisa ocaisa added the ready-to-deploy Mark a PR as ready to deploy label May 20, 2025
@adammccartney

Nice, this also works for us on x86-64/zen4. Thanks!

@boegel boegel added bot:deploy Ask bot to deploy missing software installations to EESSI and removed ready-to-deploy Mark a PR as ready to deploy labels May 20, 2025
@eessi-bot-toprichard

Label bot:deploy has been set by user boegel, but this person does not have permission to trigger deployments

@boegel
Contributor

boegel commented May 20, 2025

staging PR merged...

@ocaisa
Member Author

ocaisa commented May 20, 2025

@boegel Ingested and CI passing now

@boegel boegel merged commit fb040d6 into EESSI:2023.06-software.eessi.io May 20, 2025
71 of 119 checks passed

eessi-bot bot commented May 20, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.05/pr_1075/64285'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.05.20

@boegel
Contributor

boegel commented May 20, 2025

@ocaisa The updated CI workflow may need a follow-up here, since the post-merge run of the modified CI workflow failed with:

 fatal: ambiguous argument 'origin/...HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
File not changed in PR. Using default branch version.
fatal: invalid object name 'origin/'.
Error: Process completed with exit code 128.

see https://github.com/EESSI/software-layer/actions/runs/15139304479/job/42558788609
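
The revision `origin/...HEAD` in the log above has an empty branch name between `origin/` and `...`, which would be consistent with the base branch being unset in the post-merge run; that diagnosis is a guess, not something confirmed in this thread. A minimal sketch of a guard for that case, assuming the workflow compares against `origin/<base>...HEAD` (the function names and the fallback behaviour are hypothetical):

```python
import subprocess


def diff_range(base_ref):
    """Return 'origin/<base>...HEAD', or None when no base branch is known."""
    base_ref = (base_ref or "").strip()
    if not base_ref:
        return None
    return f"origin/{base_ref}...HEAD"


def file_changed(path, base_ref):
    """Report whether `path` differs between the merge base and HEAD."""
    rev_range = diff_range(base_ref)
    if rev_range is None:
        # No PR context: treat the file as unchanged instead of passing an
        # invalid revision such as 'origin/...HEAD' to git.
        return False
    result = subprocess.run(
        ["git", "diff", "--name-only", rev_range, "--", path],
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


print(diff_range("2023.06-software.eessi.io"))
print(file_changed("eb_hooks.py", ""))
```

In a GitHub Actions workflow, the base branch would typically come from the `GITHUB_BASE_REF` environment variable, which is set only for runs triggered by pull request events, so a run triggered by the merge itself would see an empty value.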

Labels
bot:deploy Ask bot to deploy missing software installations to EESSI enhancement New feature or request
3 participants