blog/2020/05/invalidations.md
19 additions & 18 deletions
Original file line number
Diff line number
Diff line change
@@ -42,8 +42,8 @@ end
42
42
Here I've defined two functions; `f` is a function with two very simple methods,
43
43
and `arrayf` is a function with just one method that supports `Any` argument at all.
44
44
When you call `applyf`, Julia will compile _specialized_ versions on demand for the
45
-
particular types of `container` that you're using at that moment (even though I didn't
46
-
use a single type in its definition!).
45
+
particular types of `container` that you're using at that moment, even though I didn't
46
+
specify a single type in its definition.
47
47
48
48
If you call `applyf([100, 200])`, Julia will compile and use a version of `applyf` specifically
49
49
created for `Vector{Int}`. Since the element type (`Int`) is a part of the `container`'s type, it
@@ -80,12 +80,12 @@ In this case, you can see that Julia knew those two `arrayref` statements would
80
80
return a `Bool`, and since it knows the value of `f(::Bool)` it just went
81
81
ahead and computed the result at compile time for you.
82
82
83
-
At the end of these experiments, hidden away in Julia's "method cache" there will
83
+
After calling `applyf` with both sets of arguments, hidden away in Julia's "method cache" there will
84
84
be two `MethodInstance`s of `applyf`, one specialized for `Vector{Int}` and the other specialized for `Vector{Bool}`.
85
85
You don't normally see these, but Julia manages them for you; anytime you write
86
86
code that calls `applyf`, it checks to see if this previous compilation work can be reused.
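
To make the two specializations concrete, here is a minimal sketch. The definitions of `f` and `applyf` below are assumptions reconstructed from the description above (two one-line methods for `f`, and an untyped `applyf` that indexes its container); the post's actual definitions may differ.

```julia
# Assumed definitions (hypothetical; reconstructed from the prose above).
f(x::Int) = 1
f(x::Bool) = 2

# No type annotation on `container`: Julia specializes on whatever is passed in.
function applyf(container)
    x1 = f(container[1])
    x2 = f(container[2])
    return x1 + x2
end

r_int  = applyf([100, 200])     # triggers compilation for Vector{Int}
r_bool = applyf([true, false])  # triggers a second compilation for Vector{Bool}

# A later call with an already-seen argument type can reuse the compiled code.
r_again = applyf([3, 4])
```

Nothing in the calling code distinguishes the first call from the reused one; the bookkeeping happens behind the scenes.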
87
87
88
-
For the purpose of this blog post, things start to get especially interesting if we try the following:
88
+
For the purpose of this blog post, things start to get especially interesting if we use a container that can store elements with different types (here, type `Any`):
89
89
90
90
```julia-repl
91
91
julia> c = Any[1, false];
@@ -215,7 +215,7 @@ julia> @btime applyf($c)
215
215
It's almost a tenfold difference.
216
216
If `applyf` is performance-critical, you'll be very happy that Julia tries to give you the best version it can, given the available information.
217
217
But this leaves the door open to invalidation, which means recompilation the next time you use `applyf`.
218
-
If method invalidation happens often, this might contribute to making Julia "feel" sluggish.
218
+
If method invalidation happens often, this might contribute to making Julia feel sluggish.
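
A minimal sketch of why invalidation cannot simply be skipped, assuming hypothetical definitions of `f` and `applyf` like those described earlier: once a new method of `f` is defined, compiled code that relied on `f` having only `Int` and `Bool` methods can no longer be trusted, so Julia has to recompile it to keep returning the right answer.

```julia
# Assumed definitions (hypothetical; the post's own may differ).
f(x::Int) = 1
f(x::Bool) = 2
applyf(container) = f(container[1]) + f(container[2])

c = Any[1, false]
before = applyf(c)                  # compiled while `f` has exactly two methods

f(x::String) = 3                    # a new method changes what `f` can do

after_old = applyf(c)               # old inputs still give the same answer
after_new = applyf(Any["a", true])  # and the new method is honored
```

The results stay correct either way; what changes is that the work done for the earlier compilation may have to be repeated.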
219
219
220
220
## How common is method invalidation?
221
221
@@ -405,7 +405,7 @@ This list is ordered from least- to most-consequential in terms of total number
405
405
The final entry, for `(::Type{X})(x::Real) where X<:FixedPoint`, triggered the invalidation of what nominally appear to be more than 350 `MethodInstance`s.
406
406
(There is no guarantee that these methods are all disjoint from one another;
407
407
the results are represented as a tree, where each node links to its callers.)
408
-
In contrast, the first entry is responsible for just two invalidations.
408
+
In contrast, the first three entries are responsible for a tiny handful of invalidations.
409
409
410
410
One does not have to look at this list for very long to see that the majority of the invalidated methods are due to [method ambiguity].
411
411
Consider the line `...char.jl:48 with MethodInstance for (::Type{T} where T<:AbstractChar)(::Int32)`.
@@ -471,11 +471,11 @@ Consequently, we can turn our attention to other cases.
471
471
472
472
For now we'll skip `trees[end-1]`, and consider `trees[end-2]` which results from defining
473
473
`sizeof(::Type{X}) where X<:FixedPoint`.
474
-
Simply put, this looks like a method that we don't need; perhaps it dates from some confusion, or an era where perhaps it was necessary.
474
+
There's a perfectly good default definition, so this looks like a method that we don't need; it may date from some confusion, or from an era when it was necessary.
475
475
So we've discovered an easy place where a developer could do something to productively decrease the number of invalidations, in this case by just deleting the method.
476
476
477
-
You'll also notice one example where the new method is *less specific*.
478
-
It is not clear why such methods should be invalidating, and this may be a Julia bug.
477
+
Rarely (in other packages) you'll notice cases where the new method is *less specific*.
478
+
It is not clear why such methods should be invalidating, and this may be either a SnoopCompile or Julia bug.
@@ -537,8 +539,8 @@ Let's go back to our table above, and count the number of invalidations in each
537
539
| DifferentialEquations | 5152 | 18 | 6218 |
538
540
539
541
The numbers in this table don't add up to those in the first, for a variety of reasons (here there is no attempt to remove duplicates, here we don't count "mt_cache" invalidations which were included in the first table, etc.).
540
-
In general terms, the last two columns should probably be fixed by changes in how Julia does invalidations; the first column indicates invalidations that should either be fixed in packages, Julia's own code, or will need to remain unfixed.
541
-
The good news is that these counts reveal that much will likely be fixed by "automated" means.
542
+
In general terms, the last two columns should probably be fixed by changes in how Julia does invalidations; the first column is a mixture: some invalidations might be removed by changes in Julia (if they are due to partial specialization), while others should be fixed in packages, in Base and the standard libraries, or will need to remain unfixed.
543
+
The good news is that these counts reveal that more than half of all invalidations will likely be fixed by "automated" means.
542
544
However, it appears that there will need to be a second round in which package developers inspect individual invalidations to determine what, if anything, can be done to remediate them.
543
545
544
546
## Fixing invalidations
@@ -559,7 +561,7 @@ Here our focus is on those marked "more specific," since those are cases where i
559
561
### Fixing type instabilities
560
562
561
563
In engineering Julia and Revise to reduce invalidations, at least two cases were fixed by resolving type-instabilities.
562
-
For example, one set of invalidations happened because `CodeTracking`, a dependency of Revise's, defines new methods for `Base.PkgId`.
564
+
For example, one set of invalidations happened because CodeTracking, a dependency of Revise's, defines new methods for `Base.PkgId`.
563
565
It turns out that this triggered an invalidation of `_tryrequire_from_serialized`, which is used to load packages.
564
566
Fortunately, it turned out to be an easy fix: one section of `_tryrequire_from_serialized` had a passage
565
567
@@ -582,7 +584,7 @@ immediately after the `for` statement to fix the problem.
582
584
Not only does this fix the invalidation, but it lets the compiler generate better code.
583
585
584
586
The other case was a call from `Pkg` of `keys` on an AbstractDict of unknown type
585
-
(due to a caller's `@nospecialize` annotation).
587
+
(due to partial specialization).
586
588
Replacing `keys(dct)` with `Base.KeySet(dct)` (which is the default return value of `keys`) eliminated a very consequential invalidation, one that triggered seconds-long latencies in the next `Pkg` command after loading Revise.
587
589
The benefits of this change in Pkg's code went far beyond helping Revise; any package depending on the OrderedCollections package (which is a dependency of Revise and what actually triggered the invalidation) got the same benefit.
588
590
With these and a few other relatively simple changes, loading Revise no longer forces Julia to recompile much of Pkg's code the next time you try to update packages.
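
The `keys` replacement can be sketched as follows. The function names are hypothetical and this is not Pkg's actual code; `@nospecialize` is used here only as a stand-in for any situation where the compiler knows no more than `dct::AbstractDict`.

```julia
# Hypothetical helper: the compiler only knows `dct::AbstractDict`, so the call
# to `keys` could resolve to any method, including ones that packages add later.
function first_key_generic(@nospecialize(dct::AbstractDict))
    return first(keys(dct))
end

# Same logic, but constructing the default key view directly. No call to `keys`
# remains, so new `keys` methods have nothing here to invalidate.
function first_key_direct(@nospecialize(dct::AbstractDict))
    return first(Base.KeySet(dct))
end

d = Dict(:a => 1)
k_generic = first_key_generic(d)
k_direct  = first_key_direct(d)
uses_default = keys(d) isa Base.KeySet
```

The substitution is only equivalent for dictionary types that rely on the default `keys`; a type that defines its own `keys` method would behave differently, so the change needs to be checked against the dictionaries that can actually reach the call.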
@@ -646,7 +648,6 @@ For addressing this purely at the level of Julia code, perhaps the best approach
646
648
647
649
```julia-repl
648
650
julia> show(node)
649
-
julia> show(node)
650
651
MethodInstance for reduce_empty(::Function, ::Type{T} where T)
651
652
MethodInstance for reduce_empty(::Base.BottomRF{typeof(max)}, ::Type{VersionNumber})
652
653
MethodInstance for reduce_empty_iter(::Base.BottomRF{typeof(max)}, ::Set{VersionNumber}, ::Base.HasEltype)
@@ -712,7 +713,7 @@ perhaps a nicer approach would be to allow one to supply `init` as a keyword arg
712
713
While this is not supported on Julia versions up through 1.5, it's a feature that seems to make sense, and this analysis suggests that it might also allow developers to make code more robust against certain kinds of invalidation.
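
As an illustration of the idea rather than the exact call discussed above: `reduce` already accepts an `init` keyword, so this sketch uses it as a stand-in. The collection and version numbers are made up for the example.

```julia
versions = Set{VersionNumber}()   # empty, so there is no element to seed the reduction

# With an explicit `init`, the empty case has a well-defined answer, so no
# empty-collection fallback needs to be looked up.
newest_empty = reduce(max, versions; init = v"0")

push!(versions, v"1.2.3", v"1.10.0")
newest = reduce(max, versions; init = v"0")
```

Supplying the seed value at the call site keeps the empty-collection decision in the caller's hands.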
713
714
714
715
As this hopefully illustrates, there's often more than one way to "fix" an invalidation.
715
-
Finding the best approach may require that we as a community develop experience with this novel consideration.
716
+
Finding the best approach may require some experimentation.
716
717
717
718
## Summary
718
719
@@ -721,7 +722,7 @@ These advantages come with a few costs, and here we've explored one of them, met
721
722
While Julia's core developers have been aware of its cost for a long time,
722
723
we're only now starting to get tools to analyze it in a manner suitable for a larger population of users and developers.
723
724
Because it's not been easy to measure previously, it would not be surprising if there are numerous opportunities for improvement waiting to be discovered.
724
-
One might hope that the next period of development might see significant improvement in getting packages to work together without stomping on each other's toes.
725
+
One might hope that the next period of development will see significant improvement in getting packages to work together gracefully without stomping on each other's toes.