Skip to content

Conversation

@Ma27
Copy link
Member

@Ma27 Ma27 commented Jan 16, 2025

Given that CppNix' n-e-j already uses my patches to bring back constituents[1], I decided to also submit the part that allows globbing in hydra-eval-jobs to this repo[2].

This basically allows you to do

constituents = [ "nixos.*" ];

to select all jobs starting with nixos. as constituents. Additionally, it fixes a bug in Hydra (that's also present here probably) where transitive constituent jobs may be discarded silently if declared by string instead of by derivation.

Given that the code became slightly larger and this part of the evaluation shouldn't blow up main even more, I decided to move it to its own compilation unit. Also fixed all the suggestions from the linter in the diff.

[1] https://determinate.systems/posts/hydra-deployment-source-of-truth/#aggregate-jobs
[2] NixOS/hydra#1425

cc @ctheune @Mic92 @Ericson2314

@Ma27 Ma27 force-pushed the constituent-globs branch from 1a82e69 to aec6fa1 Compare January 16, 2025 13:33
@Mic92
Copy link
Member

Mic92 commented Jan 16, 2025

 FAILED tests/test_eval.py::test_constituents_error - AssertionError: assert False
 +  where False = <built-in method startswith of str object at 0x8e6b90>('doesnteval: error: derivation ')
 +    where <built-in method startswith of str object at 0x8e6b90> = 'doesnteval: error:\n       … from call site\n         at /nix/store/75ixv345i0q2hqvfim7xg881ca8mag96-source/tests/ass...ument passed to builtins.toString\n\n       error: cannot coerce a set to a string: { }\ndoesntexist: does not exist\n'.startswith

Just shorten the error message check to doesnteval: error:.
No need to poke internal error messages beyond that.

@Ma27 Ma27 force-pushed the constituent-globs branch from aec6fa1 to 411a4ec Compare January 16, 2025 14:03
@Ma27
Copy link
Member Author

Ma27 commented Jan 16, 2025

Marking as draft: some tests from the Hydra PR fail and I don't know why. Unfortunately I cannot really debug since the test environment of the CppNix Hydra is broken on both master and the n-e-j branch, so this will have to wait.

@Ma27 Ma27 marked this pull request as draft January 16, 2025 14:58
@Ma27 Ma27 force-pushed the constituent-globs branch from 411a4ec to e3baca0 Compare January 23, 2025 12:21
Given that CppNix' n-e-j already uses my patches to bring back
constituents[1], I decided to also submit the part that allows globbing
in hydra-eval-jobs to this repo[2].

This basically allows you to do

    constituents = [ "nixos.*" ];

to select all jobs starting with nixos. as constituents.
Additionally, it fixes a bug in Hydra (that's also present here probably) where
transitive constituent jobs may be discarded silently if declared by string
instead of by derivation.

Given that the code became slightly larger and this part of the
evaluation shouldn't blow up `main` even more, I decided to move it to
its own compilation unit. Also fixed all the suggestions from the linter
in the diff.

[1] https://determinate.systems/posts/hydra-deployment-source-of-truth/#aggregate-jobs
[2] NixOS/hydra#1425
@Ma27 Ma27 force-pushed the constituent-globs branch from e3baca0 to abea976 Compare January 23, 2025 12:24
@Ma27 Ma27 marked this pull request as ready for review January 23, 2025 12:24
@Ma27
Copy link
Member Author

Ma27 commented Jan 23, 2025

Should be ready for review again now, found the bug (drvPath of transitive constituents wasn't updated correctly when using named constituents).
Will file a draft PR for Hydra based on this branch in a minute.

Ma27 added a commit to flyingcircusio/hydra that referenced this pull request Jan 23, 2025
Depends on nix-community/nix-eval-jobs#349 & NixOS#1421.

Almost equivalent to NixOS#1425, but with a small change: when having e.g. an
aggregate job with a glob that matches nothing, the jobset evaluation is
failed now. This was the intended behavior before (hydra-eval-jobset
fails hard if an aggregate is broken), the code-path was never reached
however since the aggregate was never marked as broken in this case
before.
@Ericson2314
Copy link
Contributor

Do we know what the error is?

Remove newline and `startswith` that effectively depend on timing.
@Ma27 Ma27 force-pushed the constituent-globs branch from b791b8c to 13b6da5 Compare February 7, 2025 16:29
@Ma27
Copy link
Member Author

Ma27 commented Feb 7, 2025

Should be good now. Removed the portions that appear timing sensitive.

Comment on lines +10 to +14
// This is copied from `libutil/topo-sort.hh` in CppNix and slightly modified.
// However, I needed a way to use strings as identifiers to sort, but still be
// able to put AggregateJob objects into this function since I'd rather not have
// to transform back and forth between a list of strings and AggregateJobs in
// resolveNamedConstituents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an explanation for why you didn't just use libutil/topo-sort.hh?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like if instead of having

std::set<AggregateJob> you had std::map<std::string, AggregateJob> you could both

  1. Use libutil/topo-sort.hh as-is

  2. Use the automatic auto operator<=>(const AggregateJob &) const; rather than one which is "not lawful"

  3. The final std::vector<std::string> to std::vector<std::pair<std::string, AggregateJob>> would be not too bad, either a simple or std::transform?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do this in d9febaf but it seems to compute the wrong result.
So reverted the commit again. However feel free to fix this if you see what is wrong. To me it seems these both sorting functions are not functionally equivalent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks for trying!

lf- pushed a commit to lix-project/lix that referenced this pull request Mar 2, 2025
Hydra used to support aggregate jobs that only succeeded when their
constituents succeed. This is still used by e.g. nixpkgs[1].

Prior art:
* https://git.lix.systems/lix-project/nix-eval-jobs/pulls/17: got ported
  into the CppNix implementation[2]
* nix-community/nix-eval-jobs#349: implements
  glob expressions for constituents - something we needed at work. This
  also restructures the code a bit which is what I re-used here. The
  globbing is not part of this patch.

Essentially, the following things happen here (assuming `--constituents`
is set):

* Derivations with `_hydraAggregate = true;` are considered aggregates.
  These are not written to stdout when received by a worker, but stored
  until the end.

* Constituents can be drv paths or strings (that must be the `attr` of
  another job). In that case, the derivation of the aggregate job is
  rewritten so that it depends on the drv of the constituent job.

* At the very end the aggregate jobs are also written to stdout.

Additionally, this fixes one bug, the old `hydra-eval-jobs`
implementation had (and we actually hit at work):

Given the leaf jobs `packages.foo` & `packages.bar`, an aggregate job
`aggregate0` with

    _hydraAggregate = true;
    constituents = [ "packages.bar" "packages.foo" ];

and an aggregate job `aggregate1` with

    constituents = [ "aggregate0" ];

then it may happen depending on the order of evaluation that `aggregate1`
depends on the old derivation of `aggregate0` (i.e. the one without
rewritten constituents) and doesn't depend on `packages.foo` and
`packages.bar` because it was rewritten before `aggregate0` was
rewritten.

This is done in here correctly, but topologically sorting the aggregate
jobs before rewriting those.

[1] https://github.com/NixOS/nixpkgs/blob/bba6b37c9d0898867a7d9c38a1b5b77efcfb07b9/nixos/release-combined.nix#L69
[2] nix-community/nix-eval-jobs#340

Change-Id: I5baad5e57336b4985ef8595e903814de83eb01c1
@Mic92
Copy link
Member

Mic92 commented Mar 26, 2025

Will work on this in #364

@Mic92 Mic92 closed this Mar 26, 2025
zowoq pushed a commit to qowoz/hydra that referenced this pull request Mar 28, 2025
Depends on nix-community/nix-eval-jobs#349 & NixOS#1421.

Almost equivalent to NixOS#1425, but with a small change: when having e.g. an
aggregate job with a glob that matches nothing, the jobset evaluation is
failed now. This was the intended behavior before (hydra-eval-jobset
fails hard if an aggregate is broken), the code-path was never reached
however since the aggregate was never marked as broken in this case
before.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants