-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
add some "no invalidations" tests for the sysimage #57884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
a09863b
to
885c03e
Compare
02674e3
to
26efc09
Compare
1efd53f
to
2fc945e
Compare
2fc945e
to
18504c1
Compare
115b46a
to
9cd212b
Compare
d99e0a8
to
8abdb2c
Compare
Actually I had a discussion with @timholy on Slack regarding this PR back in March, amazingly it seems I can still read it: @timholy was basically advocating to check for good inference instead of checking for invalidations, although TBH I'm not sure they really checked out this PR. I'm pretty sure:
|
To expand on this: I want to ensure that user packages can create new types without causing invalidation of the sysimage. Otherwise there's a conflict between wanting to use Julia naturally in one's package and having to "ensure" the sysimage doesn't get invalidated (which is not something package authors should need to worry about). This PR does not forbid all invalidation or all world-splitting. |
Furthermore, as can be seen from the test set names, this PR currently only encompasses some examples of adding subtypes of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no denying that this tests for something practically important: that certain common extensions don't trigger invalidation and thus longer TTFX. My concern is that it will catch issues for which there is no obvious fault and no sensible resolution.
For example, suppose someone submits a PR that breaks this test; let's say that adding a new foo
method invalidates the compiled code for bar
. Any one of the following changes in the PR would "fix" this test, but only the ones marked (*) represent an improvement in Julia:
- add a dummy
foo
method to Base, pushing the number of applicable methods abovemax_methods
for our default AbstractInterpreter (currently 3) - if there are already >3
foo
methods but the invalidation inbar
happens becausebar
is partially inferrable so that you at least know you're dealing withfoo(::Real)
and there are only a couple of applicable methods, you could worsen inference inbar
so that inference only knows that it'sfoo(::Any)
. Thus more than 3foo
methods are potentially applicable at that call site and you've exceededmax_methods
by makingbar
worse. - stop precompiling
bar
so it can't be invalidated - (maybe *) add an
invokelatest
orBase.invoke_in_world
inbar
- (*) improve inference in
bar
I'm not saying there isn't a reason to consider any but the last; indeed, there was at least one case where we did the first of these because it prevented a common cause of massive invalidations. But this case can also be easily tested: @test length(methods, (foo, T)) > 3
.
I think it's fair to say that the only case that's generically beneficial is the last one. To me this argues that we need an "inference quality" test. This PR is a very indirect way of testing inference quality, and I think we'd be better served by directly testing what matters.
I disagree on some points:
|
91b09e0
to
dec4a8b
Compare
I don't understand this conclusion. You yourself list "improve inference in |
incoming opinion from the peanut gallery: I generally associate unit tests (aka everything running under something like this of course a bit blurs the line between testing "correctness" vs testing "quality" because changes in inference can and do change the results of operations. but I think personally speaking I would still associate this more with a "quality" pass than something to be put in unit tests. it could even still be in CI, just under
I think I tend to disagree with this sentiment, in that why not do both! |
I agree, speaking generally. The issue is that the sysimage and precompilation are really essential features for Julia, otherwise TTFX becomes unbearable. So I believe that sysimage invalidation should block a PR from being merged, thus these tests should be in Julia's test suite.
Agree, apart from getting this PR merged, I definitely want more metrics available in CI. Of both the "inference quality" and of "invalidation count" kinds. |
1266a12
to
bd269ba
Compare
6d07c9d
to
8967b5e
Compare
ref: #47093 |
f233ddd
to
9c8c8a1
Compare
The `eltype` function was one of the few functions in the sysimage with a `max_methods` value (the world-splitting threshold) greater than the default. This was a workaround for the unnecessarily large number of methods of `eltype(::Type{<:Tuple})`. The `max_methods` value was increased in PR JuliaLang#48322 to help effect inference. Reduce the number of `eltype(::Type{<:Tuple})` methods, which then also allows keeping the `max_methods` for `eltype` at the default value. The intent here is to guard against unnecessary invalidation of the sysimage or other precompiled code, which might require decreasing the `max_methods` value to the natural minimum for many interface functions, by which I mean "generic function meant to have methods added to it by package authors". See also: PR JuliaLang#57884.
The `eltype` function was one of the few functions in the sysimage with a `max_methods` value (the world-splitting threshold) greater than the default. This was a workaround for the unnecessarily large number of methods of `eltype(::Type{<:Tuple})`. The `max_methods` value was increased in PR JuliaLang#48322 to help effect inference. Reduce the number of `eltype(::Type{<:Tuple})` methods, which then also allows keeping the `max_methods` for `eltype` at the default value. The intent here is to guard against unnecessary invalidation of the sysimage or other precompiled code, which might require decreasing the `max_methods` value to the natural minimum for many interface functions, by which I mean "generic function meant to have methods added to it by package authors". I intend to approach this issue in following PRs. See also: PR JuliaLang#57884. Regarding future work: I consider the "natural minimum" value for interface functions taking types to be **two**, because the bottom type subtypes each type. If the interface function doesn't accept types, the natural minimum should be **one**.
Test that defining certain handpicked type-method pairs (as a user) doesn't result in any invalidation of the sysimage. Some of the tests are marked as broken. Merging this will make the contribution process more demanding: when someone makes a PR they might need to fix unrelated type stability issues, or decrease `max_methods` for some function, before being able to merge their initial PR. However that's better than ignoring the constant danger of type instability/invalidation regressions making the sysimage and precompilation less useful. This change should only be merged while up-to-date with the `master` branch. Fixes JuliaLang#47022
6c43f23
to
bc6ab5c
Compare
The `eltype` function was one of the few functions in the sysimage with a `max_methods` value (the world-splitting threshold) greater than the default. This was a workaround for the unnecessarily large number of methods of `eltype(::Type{<:Tuple})`. The `max_methods` value was increased in PR JuliaLang#48322 to help effect inference. Reduce the number of `eltype(::Type{<:Tuple})` methods, which then also allows keeping the `max_methods` for `eltype` at the default value. The intent here is to guard against unnecessary invalidation of the sysimage or other precompiled code, which might require decreasing the `max_methods` value to the natural minimum for many interface functions, by which I mean "generic function meant to have methods added to it by package authors". I intend to approach this issue in following PRs. See also: PR JuliaLang#57884. Regarding future work: I consider the "natural minimum" value for interface functions taking types to be **two**, because the bottom type subtypes each type. If the interface function doesn't accept types, the natural minimum should be **one**.
Test that defining certain handpicked type-method pairs (as a user) doesn't result in any invalidation of the sysimage.
Some of the tests are marked as broken.
Merging this will make Julia's contribution process more demanding: when someone makes a PR they might need to fix unrelated type stability issues, or decrease
max_methods
for some function, before being able to merge their initial PR. However that's better than ignoring the constant danger of type instability/invalidation regressions making the sysimage and precompilation less useful.This change should only be merged while up-to-date with the
master
branch.Fixes #47022