Ginkgo requires a few MPI slots at configure time, so allow oversubscription #23078

Merged
3 commits merged into easybuilders:develop
Jun 13, 2025

ocaisa
Member

@ocaisa ocaisa commented Jun 11, 2025

(created using eb --new-pr)

I also noticed that compilation can be memory-hungry, so I limited the build parallelism. One test is flaky, as confirmed by the developers.
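
For illustration only (this is not copied from the PR diff): a minimal sketch of how an easyconfig could express the two changes described above. Easyconfigs use Python syntax, and `preconfigopts` and `maxparallel` are standard EasyBuild easyconfig parameters. The specific values, and the choice of the Open MPI 4.x environment variable `OMPI_MCA_rmaps_base_oversubscribe`, are assumptions; other MPI versions or implementations may need a different setting.

```python
# Hypothetical easyconfig fragment; values are illustrative assumptions,
# not the contents of the merged easyconfig.

# Allow Open MPI to start more ranks than there are available slots while
# the configure step runs its MPI checks (assumed Open MPI 4.x setting).
preconfigopts = 'export OMPI_MCA_rmaps_base_oversubscribe=true && '

# Cap the number of parallel build jobs to reduce peak memory use
# during compilation (the value 8 is an arbitrary example).
maxparallel = 8
```

Setting the variable through `preconfigopts` keeps the oversubscription scoped to the configure command rather than the whole build environment.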

@ocaisa
Member Author

ocaisa commented Jun 11, 2025

@boegelbot please test @ jsc-zen3

@boegelbot
Collaborator

@ocaisa: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23078 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23078 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 6675

Test results coming soon (I hope)...

- notification for comment with ID 2963401420 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@ocaisa ocaisa marked this pull request as draft June 11, 2025 17:19
@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/374c6e1906348cfe37fe321071a6d110 for a full test report.

@ocaisa ocaisa marked this pull request as ready for review June 12, 2025 12:26
@ocaisa
Member Author

ocaisa commented Jun 12, 2025

@boegelbot please test @ jsc-zen3-a100

@boegelbot
Collaborator

@ocaisa: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23078 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23078 --ntasks=8 --partition=jsczen3g --gres=gpu:1 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 6676

Test results coming soon (I hope)...

- notification for comment with ID 2966507217 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.42.06, Python 3.9.21
See https://gist.github.com/boegelbot/672c96a86c4f19eb8f1a5bff35152bc4 for a full test report.

@ocaisa
Member Author

ocaisa commented Jun 12, 2025

I'm carrying out a lot of test builds in EESSI/software-layer#1127; the signs are good.

@ocaisa
Member Author

ocaisa commented Jun 13, 2025

@boegelbot please test @ jsc-zen3-a100

@boegelbot
Collaborator

@ocaisa: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23078 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23078 --ntasks=8 --partition=jsczen3g --gres=gpu:1 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 6687

Test results coming soon (I hope)...

- notification for comment with ID 2970463094 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.42.06, Python 3.9.21
See https://gist.github.com/boegelbot/ea7365ed41d931616f775d5f009d4c7f for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Jun 13, 2025

Test report by @boegel
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3611.doduo.os - Linux RHEL 9.4, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.9.18
See https://gist.github.com/boegel/51dc54b3ccf47233ce8c6009602431c4 for a full test report.

@smoors smoors added this to the next release (5.1.1?) milestone Jun 13, 2025
@smoors
Contributor

smoors commented Jun 13, 2025

Going in, thanks @ocaisa!

@smoors smoors merged commit 6300ec9 into easybuilders:develop Jun 13, 2025
8 checks passed
@boegel boegel added bug fix and removed change labels Jun 13, 2025
@boegel
Member

boegel commented Jun 13, 2025

Test report by @boegel
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
node3300.joltik.os - Linux RHEL 9.4, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), 1 x NVIDIA Tesla V100-SXM2-32GB, 570.133.20, Python 3.9.18
See https://gist.github.com/boegel/f8572c4f5de03738dd18f277f2068654 for a full test report.

@ocaisa ocaisa deleted the 20250611180302_new_pr_Ginkgo190 branch June 14, 2025 05:14