-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Improve Ci cache #16709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Ci cache #16709
Conversation
linux build test: 5m → 2m really like linux build test improvement, as it's a bottleneck for the other tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @blaginin -- this looks good to me
TIL about https://github.com/apache/datafusion/actions/caches which is great
I did look at the timings for the CI jobs on this PR and on main and they didn't seem all that different. However since most of this caching is happening on main I suspect we'll see the benefits after we merge this
Thanks again for working on this area
@@ -45,5 +45,7 @@ runs: | |||
rustup component add rustfmt | |||
- name: Setup rust cache | |||
uses: Swatinem/rust-cache@v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL that rust-cache
caches the Rust binaries too
https://github.com/Swatinem/rust-cache?tab=readme-ov-file#cache-details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, something I noticed was maybe we should pin to an actual hash rather than a tag that is updated
https://github.com/Swatinem/rust-cache/tags
I think security best practice would be to pin to Swatinem/rust-cache@98c8021 for example rather than https://github.com/Swatinem/rust-cache/releases/tag/v2
However, this wasn't introduced in this PR so I don't think it needs to be fixed here
@@ -45,5 +45,7 @@ runs: | |||
rustup component add rustfmt | |||
- name: Setup rust cache | |||
uses: Swatinem/rust-cache@v2 | |||
with: | |||
save-if: ${{ github.ref_name == 'main' }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also have the shared-key: amd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a macos run, so I don't this we should use shared key with other runs (which are on linux)
If shared key isn't set, job-based one is used: https://github.com/Swatinem/rust-cache/blob/7e1e2d0a10862b34e5df481373b2b0f295d1a2ef/action.yml#L10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should be good as is!
This line is added because we now cache CI runs on any branch, which takes up all our free CI space 😱 (which also made it hard to test during development)
https://github.com/apache/datafusion/actions/runs/16207285209/job/45760367631?pr=16709 |
Yes, exactly. When doing the development, I removed "save-if" condition: https://github.com/apache/datafusion/actions/runs/16128523988 |
Well, let's merge it and see how it works in action and iterate ! |
before after seems to be working! |
13% faster -- not bad! |
Which issue does this PR close?
Rationale for this change
CI is slow - and for many stages, downloading dependencies / build is takes a considerable amount of time:
All these are useless and can be checked / compiled once on main:
What changes are included in this PR?
We can reduce build time if we start caching dependencies.
The size of one cache build is roughly 500 MB:
Github has a cache size of 10 GB - so I think we should be able to cache build on
main
for the slowest steps on main - to make them faster!