Skip to content

[rush] Split ProjectChangeAnalyzer, fix build cache hashes #4476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 17, 2024

Conversation

dmichon-msft
Copy link
Contributor

@dmichon-msft dmichon-msft commented Jan 9, 2024

Summary

Fixes #4400

Makes #3994 more feasible.

Details

Splits the functionality of ProjectChangeAnalyzer into two components:

  1. ProjectChangeAnalyzer#_tryGetSnapshotProviderAsync - Performs all the async I/O operations to determine the current state of the repository, including taking a snapshot of process.env so that per-operation hashes can be lazily computed in a stable manner.
  2. InputsSnapshot - Synchronous data structure representing the state of the repository, which can be queried for operation-level state hashes, or tracked file lists. Performs the evaluation of input/output collisions as part of evaluating said queries. Includes evaluation of incrementalBuildIgnoredGlobs and dependsOnEnvVars, but does not include the hashes from dependencies, since the Operation graph is not provided to this API.

Moves build cache hash computation to the beforeExecuteOperations tap in CacheableOperationPlugin and updates the computation to factor in the runtime operation graph, rather than only the npm dependencies.

For a specific example of how this alters the build cache, consider a repository with one project and two phases in rush build:
Project A
Phases _phase:first and _phase:second, with _phase:second depends on self: ['_phase:first']

Define a command line parameter --some-flag-for-first that is forwarded only to _phase:first.

Before this change, running rush build and rush build --some-flag-for-first would produce different cache IDs for A (_phase:first), but the same cache id for A (_phase:second).

With this change the cache IDs for A (_phase:second) are also impacted by the presence of the --some-flag-for-first, since it can change the output of A (_phase:first).

Added a new flag enabled to Operation to control whether or not a given operation in the graph should be executed. Updated graph construction to use this flag for operations that are dependencies of the user's selection, but not directly included (e.g. when using --only or --impacted-by), or during watch-mode incremental builds.

Added a new property cacheHashSalt to build-cache.json to allow repository maintainers to globally force cache misses if they have altered the build process in a way that Rush can't currently detect, for example if Rush plugins emit files that would be picked up by the cache.

How it was tested

For the splitting, added unit tests for InputSnapshot and ProjectChangeAnalyzer#_tryGetSnapshotProviderAsync.

For the build cache IDs, temporarily removed applicability of the --production CLI parameter from _phase:test and validated that running node ./apps/rush/lib/start-dev.js test and node ./apps/rush/lib/start-dev.js test --production still produced different cache keys for _phase:test.

For comparison, validated that the command-line.json alternation and existing rush test and rush test --production had a collision on the cache keys for _phase:test.

Validated cache reads/writes using the heft-node-everything-test project. Confirmed that cache is not written during a rush rebuild -o, but can be read during rush build -o. Confirmed that cache reads can occur during rush start -o, even on subsequent iterations that leverage the new :incremental suffixed scripts.

Validate the cacheHashSalt property by setting the salt to different values and validating cache/miss hit depending on whether or not a prior build had happened with the same salt.

Impacted documentation

Any documentation about how build cache keys are computed.

Copy link
Member

@D4N14L D4N14L left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do a pre-publish to verify that everything that is changing here doesn't break our flow, since we're very likely the largest customer on this tool.

@dmichon-msft dmichon-msft force-pushed the split-project-change-analyzer branch from c7d98ec to 1836e05 Compare January 19, 2024 23:54
@dmichon-msft dmichon-msft force-pushed the split-project-change-analyzer branch 2 times, most recently from f4a0556 to 5a6794d Compare September 27, 2024 00:31
Copy link
Contributor Author

@dmichon-msft dmichon-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed feedback.

@dmichon-msft dmichon-msft force-pushed the split-project-change-analyzer branch from 13b7dbc to 2dcdd49 Compare October 3, 2024 19:55
@iclanton iclanton force-pushed the split-project-change-analyzer branch from 2dcdd49 to d8dc986 Compare October 3, 2024 22:50
@dmichon-msft dmichon-msft force-pushed the split-project-change-analyzer branch from d8dc986 to 5638027 Compare October 4, 2024 00:56
@dmichon-msft dmichon-msft force-pushed the split-project-change-analyzer branch from e57e3dc to 4edbbb8 Compare October 4, 2024 20:50
@iclanton iclanton merged commit 3530cb2 into microsoft:main Oct 17, 2024
4 checks passed
@octogonz octogonz mentioned this pull request Nov 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Closed
Development

Successfully merging this pull request may close these issues.

[rush] Build cache keys do not correctly account for configuration changes in dependencies
3 participants