
filter whiteout files with find (instead of using grep) in create_tarball.sh #21


Merged
trz42 merged 4 commits into EESSI:main on Jun 24, 2025

Conversation

bedroge
Contributor

@bedroge bedroge commented Jun 24, 2025

This should solve the issues in the tarball creation steps that we see in, for instance, EESSI/software-layer#1131. There are some empty directories in the init folder, and for those the grep returns 1. By adding || true (we already did that for a few other grep commands), the command will always return 0.

edit: instead of adding || true to the grep command, we decided to add an option to the find command for excluding whiteout files.
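For context, here is a minimal sketch of the failure mode described above. The pattern and the empty input are illustrative assumptions, not the actual lines from create_tarball.sh; the point is only that grep exits with status 1 when it selects no lines, and that `|| true` masks this.

```shell
# grep exits with status 1 when it selects no lines, e.g. on empty input
# (as would happen when an empty directory yields no file names).
status_empty=0
printf '' | grep -v '/\.wh\.' > /dev/null || status_empty=$?
echo "grep on empty input: exit status ${status_empty}"

# Appending '|| true' hides that status, which is the approach
# the first version of this PR took.
status_masked=0
{ printf '' | grep -v '/\.wh\.' > /dev/null || true; } || status_masked=$?
echo "with '|| true': exit status ${status_masked}"
```

The drawback of `|| true` is that it would also hide genuine grep errors (exit status 2), which is one argument for filtering in find instead.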

Contributor

@trz42 trz42 left a comment


Looks good.

An alternative way might be to use plain find with an additional condition, e.g.,

find /path/to/upper -type f ! -name '.wh.*'

Anyhow, up to you to decide if that's worth another change or we go with your change.
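A small sketch of how this suggestion behaves, using a throwaway directory. The layout and file names are made up for illustration, and plain files named `.wh.*` stand in for whiteout entries.

```shell
# Hypothetical stand-in for an overlay upper directory.
upper=$(mktemp -d)
mkdir -p "${upper}/init"
touch "${upper}/init/bash" "${upper}/init/.wh.oldfile"

# List regular files, skipping anything named like a whiteout entry.
kept=$(find "${upper}" -type f ! -name '.wh.*' | sort)
echo "${kept}"

# Unlike grep, find still exits with 0 when nothing matches.
empty=$(mktemp -d)
find_status=0
find "${empty}" -type f ! -name '.wh.*' || find_status=$?
echo "find on empty directory: exit status ${find_status}"

rm -rf "${upper}" "${empty}"
```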

Contributor Author

bedroge commented Jun 24, 2025

> Looks good.
>
> An alternative way might be to use plain find with an additional condition, e.g.,
>
> find /path/to/upper -type f ! -name '.wh.*'
>
> Anyhow, up to you to decide if that's worth another change or we go with your change.

No strong preference, but I changed the find commands.

Contributor Author

bedroge commented Jun 24, 2025

Tested this version interactively in a bot/inspect.sh session for PR EESSI/software-layer#1131:

$ ./create_tarball.sh /tmp 2023.06 x86_64/amd/zen2 ""  /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz
>> tmpdir: /tmp/tmp.fecT0K6YhQ
>> Collecting list of files/directories to include in tarball via /tmp/software.eessi.io/overlay-upper/versions...
handling Bazel/6.1.0-GCCcore-12.3.0
handling Java/11.0.27
handling ml_dtypes/.wh..wh..opq
handling ml_dtypes/0.3.2-gfbf-2023a
handling tensorboard/.wh..wh..opq
handling tensorboard/2.15.1-gfbf-2023a
wrote file list to /tmp/tmp.fecT0K6YhQ/files.list.txt
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27
2023.06/software/linux/x86_64/amd/zen2/software/ml_dtypes/0.3.2-gfbf-2023a
2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a
wrote module file list to /tmp/tmp.fecT0K6YhQ/module_files.list.txt
Bazel/6.1.0-GCCcore-12.3.0
Java/11.0.27
ml_dtypes/.wh..wh..opq
ml_dtypes/0.3.2-gfbf-2023a
tensorboard/.wh..wh..opq
tensorboard/2.15.1-gfbf-2023a
>> Creating tarball /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz from /cvmfs/software.eessi.io/versions/...
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/bazel
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/easyblocks/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/easyblocks/bazel.py
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/Bazel-6.1.0-GCCcore-12.3.0.env
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/Bazel-6.1.0-GCCcore-12.3.0.eb
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/hooks/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/reprod/hooks/eb_hooks.py
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/easybuild-Bazel-6.1.0-20250624.084608.log.bz2
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/Bazel-6.1.0-GCCcore-12.3.0-easybuild-devel
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/Bazel-6.3.1_add-symlinks-in-runfiles.patch
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/Bazel-6.1.0-GCCcore-12.3.0.eb
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/easybuild/easybuild-Bazel-6.1.0-20250624.084608_test_report.md
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/man/
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/man/ja_JP.UTF-8/
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/man/ja_JP.UTF-8/man1/
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/man/ja_JP.UTF-8/man1/rmid.1
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27/man/ja_JP.UTF-8/man1/javadoc.1

<lots of output>

2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a/lib/python3.11/site-packages/docs/__pycache__/conf.cpython-311.pyc
/eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz created!
>> Cleaning up tmpdir /tmp/tmp.fecT0K6YhQ...

Contributor Author

bedroge commented Jun 24, 2025

Hmm, there are still some .wh.* things in the output, let me check if that's expected...

Contributor Author

bedroge commented Jun 24, 2025

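Found it: it was due to the -o (or) between the file type tests. Because the implicit and binds more tightly than -o, the -name test only gets combined with the type test right next to it, so entries matching the other type still get through. Actually we don't need that -o, as you can list multiple types at once. From the find man page:

> To search for more than one type at once, you can supply the combined list of type letters separated by a comma `,' (GNU extension).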
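The precedence issue can be reproduced with a small sketch. The directory layout and the specific type letters (`f` and `d`) are assumptions for illustration; the exact find command in create_tarball.sh is not shown in this conversation.

```shell
# Hypothetical layout: one package directory containing an opaque-whiteout marker.
demo=$(mktemp -d)
mkdir -p "${demo}/pkg/1.0"
touch "${demo}/pkg/.wh..wh..opq" "${demo}/pkg/1.0/file"

# Problematic form: the implicit AND binds more tightly than -o, so this parses as
#   -type f -o ( -type d -a ! -name '.wh.*' )
# and the whiteout file still matches through the '-type f' branch.
buggy=$(find "${demo}" -type f -o -type d ! -name '.wh.*' | grep -c '/\.wh\.' || true)

# GNU find: a single -type test with a comma-separated list of types.
fixed=$(find "${demo}" -type f,d ! -name '.wh.*' | grep -c '/\.wh\.' || true)

# Portable alternative: group the type tests explicitly with parentheses.
grouped=$(find "${demo}" \( -type f -o -type d \) ! -name '.wh.*' | grep -c '/\.wh\.' || true)

echo "whiteout entries listed: buggy=${buggy} fixed=${fixed} grouped=${grouped}"
rm -rf "${demo}"
```

The comma-separated form is shorter but relies on GNU find; the parenthesized form should also work with other find implementations.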

Running the updated version results in the correct output:

$ ./create_tarball.sh /tmp 2023.06 x86_64/amd/zen2 ""  /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz
>> tmpdir: /tmp/tmp.bzSpAga59r
>> Collecting list of files/directories to include in tarball via /tmp/software.eessi.io/overlay-upper/versions...
handling Bazel/6.1.0-GCCcore-12.3.0
handling Java/11.0.27
handling ml_dtypes/0.3.2-gfbf-2023a
handling tensorboard/2.15.1-gfbf-2023a
wrote file list to /tmp/tmp.bzSpAga59r/files.list.txt
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27
2023.06/software/linux/x86_64/amd/zen2/software/ml_dtypes/0.3.2-gfbf-2023a
2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a
wrote module file list to /tmp/tmp.bzSpAga59r/module_files.list.txt
Bazel/6.1.0-GCCcore-12.3.0
Java/11.0.27
ml_dtypes/0.3.2-gfbf-2023a
tensorboard/2.15.1-gfbf-2023a
>> Creating tarball /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz from /cvmfs/software.eessi.io/versions/...
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/bazel

...

2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a/lib/python3.11/site-packages/docs/__pycache__/
2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a/lib/python3.11/site-packages/docs/__pycache__/conf.cpython-311.pyc
/eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz created!
>> Cleaning up tmpdir /tmp/tmp.bzSpAga59r...

Contributor Author

bedroge commented Jun 24, 2025

Removed one more grep ... || true, which was used for finding installation directories that are not named like .wh.*; it is now replaced by a find with -maxdepth 0.
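As a rough sketch of that idea (the paths below are invented for the example and do not come from the script): with -maxdepth 0, find does not descend at all and only tests the paths given on the command line, so it can act as a name filter on a list of candidates.

```shell
# Hypothetical software directory with one real installation and one whiteout marker.
sw=$(mktemp -d)
mkdir -p "${sw}/ml_dtypes/0.3.2-gfbf-2023a"
touch "${sw}/ml_dtypes/.wh..wh..opq"

# Only the starting points themselves are tested; whiteout-style names are dropped.
installs=$(find "${sw}/ml_dtypes/0.3.2-gfbf-2023a" "${sw}/ml_dtypes/.wh..wh..opq" \
    -maxdepth 0 ! -name '.wh.*')
echo "${installs}"

# When every candidate is filtered out, find prints nothing but still exits with 0,
# so no '|| true' is needed.
only_whiteout_status=0
find "${sw}/ml_dtypes/.wh..wh..opq" -maxdepth 0 ! -name '.wh.*' || only_whiteout_status=$?
echo "exit status: ${only_whiteout_status}"

rm -rf "${sw}"
```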

One more test with this version:

$ ./create_tarball.sh /tmp 2023.06 x86_64/amd/zen2 ""  /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz
>> tmpdir: /tmp/tmp.ZGLM8HISv9
>> Collecting list of files/directories to include in tarball via /tmp/software.eessi.io/overlay-upper/versions...
handling Bazel/6.1.0-GCCcore-12.3.0
handling Java/11.0.27
handling ml_dtypes/0.3.2-gfbf-2023a
handling tensorboard/2.15.1-gfbf-2023a
wrote file list to /tmp/tmp.ZGLM8HISv9/files.list.txt
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0
2023.06/software/linux/x86_64/amd/zen2/software/Java/11.0.27
2023.06/software/linux/x86_64/amd/zen2/software/ml_dtypes/0.3.2-gfbf-2023a
2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a
wrote module file list to /tmp/tmp.ZGLM8HISv9/module_files.list.txt
Bazel/6.1.0-GCCcore-12.3.0
Java/11.0.27
ml_dtypes/0.3.2-gfbf-2023a
tensorboard/2.15.1-gfbf-2023a
>> Creating tarball /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz from /cvmfs/software.eessi.io/versions/...
2023.06/software/linux/x86_64/amd/zen2/modules/all/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/all/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lang/Java/11.0.27.lua
2023.06/software/linux/x86_64/amd/zen2/modules/devel/Bazel/6.1.0-GCCcore-12.3.0.lua
2023.06/software/linux/x86_64/amd/zen2/modules/tools/ml_dtypes/0.3.2-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/modules/lib/tensorboard/2.15.1-gfbf-2023a.lua
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/
2023.06/software/linux/x86_64/amd/zen2/software/Bazel/6.1.0-GCCcore-12.3.0/bin/bazel

...

2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a/lib/python3.11/site-packages/docs/__pycache__/
2023.06/software/linux/x86_64/amd/zen2/software/tensorboard/2.15.1-gfbf-2023a/lib/python3.11/site-packages/docs/__pycache__/conf.cpython-311.pyc
/eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-17507550540.tar.gz created!
>> Cleaning up tmpdir /tmp/tmp.ZGLM8HISv9...

Contributor

@trz42 trz42 left a comment


Nice!

@trz42 trz42 merged commit a676158 into EESSI:main Jun 24, 2025
45 checks passed
Member

Neves-P commented Jun 24, 2025

Here it is, working in the wild: EESSI/software-layer#1130 (comment)
🎉

@bedroge bedroge changed the title from "add || true to grep commands in create_tarball.sh" to "filter whiteout files with find (instead of using grep) in create_tarball.sh" on Jun 25, 2025