-
-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: blog post on invalidations #794
Conversation
blog/2020/05/invalidations.md
Outdated
| DataFrames | 4048 | | ||
| JuMP | 4666 | | ||
| Makie | 6118 | | ||
| DifferentialEquations | 6777 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wow. Can I see this list? I'm a bad bad man.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not future-proof given the reporting will likely change, but you can just throw this action into a PR for the report https://github.com/ianshmean/Plots.jl/pull/1/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not bad, it's mostly a consequence of having lots of dependencies that extend low-level functionality. (Packages that are essentially totally isolated do not cause invalidations, it's when you start extending functionality that this becomes an issue.) It's not very hard to generate this yourself if you can build Julia, but to save you the trouble I've posted it here. About 50% of the nominal invalidations come from just 7 methods (out of 90 total that cause invalidations). And the packages for these 7 are:
StaticArrays
StaticArrays
Intervals
Unitful
ProgressLogging
SymbolicUtils
Intervals
I should probably develop printing options that, e.g., allow you to cull the ones with few children, but for now this is the state-of-the-art.
Once Julia stops considering ambiguous type-intersections as invalidating, your total will go down a lot (estimated to fix about two-thirds of "your" total count).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StaticArrays seems to be showing up in a lot of these. If that's in the sysimage, I wonder how many downstream packages get a faster compile time (+ the ambiguous fix).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's in the sysimage
You can just put it in your sysimage and try (PackageCompiler.jl)
I greatly expanded the information about how to fix problems and added an overall summary. Having chosen a demo problem to investigate, it seemed reasonable to also submit the fix, which is JuliaLang/julia#35839 (it will also require some changes to Pkg). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments related to the site generator, they can be ignored.
Thanks @tlienart! Any tips on controlling text color? It's not necessary, but the diagnostics like |
No unfortunately that would require having a specific highlighter for code output, doable but not done :) |
Just FYI, I proposed to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ This is a really incredible deep dive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very helpful suggestions. Thanks for taking the time to read it so carefully!
blog/2020/05/invalidations.md
Outdated
!=(x, y) = !(x == y) | ||
``` | ||
|
||
Since such definitions account for hundreds of nominal invalidations, it would be well worth considering whether it is possible to delete the custom `!=` methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! It's certainly possible to trigger with ==
:
julia> using SnoopCompile
julia> trees = invalidation_trees(@snoopr Base.:(==)(::Dict, ::Integer) = false)
1-element Array{SnoopCompile.MethodInvalidations,1}:
insert ==(::Dict, ::Integer) in Main at REPL[2]:1 invalidated:
backedges: MethodInstance for ==(::Dict{String,Union{Base.SHA1, String}}, ::Any) triggered MethodInstance for isequal(::Dict{String,Union{Base.SHA1, String}}, ::Any) (14 children) ambiguous
MethodInstance for ==(::Dict, ::Any) triggered MethodInstance for isequal(::Dict, ::Any) (4 children) more specific
It just happens to turn out that ==(::Any, ::Any)
has no dependencies in a fresh Julia, so there is nothing to invalidate:
julia> mi = instance(==, (Any, Any))
julia> mi = instance(!=, (Any, Any))
MethodInstance for !=(::Any, ::Any)
julia> mi.backedges
15-element Array{Any,1}:
MethodInstance for cmp(::AbstractString, ::String)
MethodInstance for _show_nonempty(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::SubArray{T,2,P,I,L} where L where I where P where T, ::String)
MethodInstance for convert(::Type, ::AbstractDict)
MethodInstance for convert(::Type{Dict}, ::AbstractDict)
MethodInstance for cmp(::AbstractString, ::SubString{String})
MethodInstance for registered_name(::Pkg.Types.Context, ::Base.UUID)
MethodInstance for cmp(::SubString{String}, ::AbstractString)
MethodInstance for print_matrix(::IOContext{REPL.Terminals.TTYTerminal}, ::Array{Int64,1}, ::String, ::String, ::String, ::String, ::String, ::String, ::Int64, ::Int64)
MethodInstance for print_matrix(::IOContext{REPL.Terminals.TTYTerminal}, ::Array{String,1}, ::String, ::String, ::String, ::String, ::String, ::String, ::Int64, ::Int64)
MethodInstance for serialize_array_data(::Sockets.TCPSocket, ::Array)
MethodInstance for cmp(::SubString, ::String)
MethodInstance for common_prefix(::Any)
MethodInstance for cmp(::AbstractString, ::AbstractString)
MethodInstance for history_move_prefix(::REPL.LineEdit.PrefixSearchState, ::REPL.REPLHistoryProvider, ::AbstractString, ::Bool, ::Any)
MethodInstance for print_matrix(::IOContext{REPL.Terminals.TTYTerminal}, ::Array{Any,1}, ::String, ::String, ::String, ::String, ::String, ::String, ::Int64, ::Int64)
where instance
can be obtained from JuliaLang/julia#35855.
FWIW, I found it helpful for my own understanding to also look at the problem from "the other side" — that is, just looking at what backedges exist and watching how they're updated/changed. Once I got this down, the invalidations became more obvious to me. The key being that the backedges from abstract specializations/entries are the ones that are going to be prone to invalidation by method addition:
Here's a session that was helpful to me:
julia> function backedges(f)
mm = methods(f)
mt = mm.mt
println("### Method table backedges ###")
mtbe = isdefined(mt, :backedges) ? mt.backedges : []
for (tt, mi) in Iterators.partition(mtbe, 2)
println(tt => mi)
end
println("### Method instance backedges ###")
for m in mm.ms
m.specializations === nothing && continue
Base.visit(m.specializations) do mi
!isdefined(mi, :backedges) && return
for edge in mi.backedges
println(mi.specTypes => edge)
end
end
end
end
backedges (generic function with 1 method)
julia> f(x::Int) = 1
f (generic function with 1 method)
julia> f(x::Bool) = 2
f (generic function with 2 methods)
julia> function applyf(container)
x1 = f(container[1])
x2 = f(container[2])
return x1 + x2
end
applyf (generic function with 1 method)
julia> backedges(f)
### Method table backedges ###
### Method instance backedges ###
julia> applyf([1,2])
2
julia> backedges(f)
### Method table backedges ###
### Method instance backedges ###
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Int64,1})
julia> applyf([false, true])
4
julia> backedges(f)
### Method table backedges ###
### Method instance backedges ###
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Bool,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Int64,1})
julia> applyf(Any[1, false])
3
julia> backedges(f)
### Method table backedges ###
Tuple{typeof(f),Any} => MethodInstance for applyf(::Array{Any,1})
### Method instance backedges ###
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Bool,1})
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Any,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Int64,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Any,1})
julia> f(x) = 3
f (generic function with 3 methods)
julia> backedges(f)
### Method table backedges ###
### Method instance backedges ###
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Bool,1})
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Any,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Int64,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Any,1})
julia> applyf(Any[1, false])
3
julia> backedges(f)
### Method table backedges ###
### Method instance backedges ###
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Bool,1})
Tuple{typeof(f),Bool} => MethodInstance for applyf(::Array{Any,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Int64,1})
Tuple{typeof(f),Int64} => MethodInstance for applyf(::Array{Any,1})
Tuple{typeof(f),Any} => MethodInstance for applyf(::Array{Any,1}) It's rather interesting how type-instabilities in core Julia code have this non-obvious secondary "cost." |
@mbauman, I wondered about showing that directly, but it was already a long blog post and I didn't really feel like turning it into a tutorial on all of Julia's internal structures (since those are liable to change anyway). But perhaps it would make sense to advertise this via MethodAnalysis, since a level of indirection provides mechanisms for support as internals change.
I agree that type-instability is, in practice, a major contributor, but really this can happen any time a method has a non-concrete signature. Demo: julia> using SnoopCompile
julia> f(::Integer) = 1
f (generic function with 1 method)
julia> function applyf(container)
x1 = f(container[1])
x2 = f(container[2])
return x1 + x2
end
applyf (generic function with 1 method)
julia> c = Any[1, true]
2-element Array{Any,1}:
1
true
julia> applyf(c)
2
julia> trees = invalidation_trees(@snoopr f(::Bool) = 2)
1-element Array{SnoopCompile.MethodInvalidations,1}:
insert f(::Bool) in Main at REPL[6]:1 invalidated:
mt_backedges: signature Tuple{typeof(f),Any} triggered MethodInstance for applyf(::Array{Any,1}) (0 children) more specific
backedges: superseding f(::Integer) in Main at REPL[2]:1 with MethodInstance for f(::Integer) (1 children) more specific
2 mt_cache
julia> applyf(c)
3 Obviously invalidation was necessary here. It's possible any time you call a method with a more specific set of types than its signature and then later provide a new method that gets called for those types. But the general ethic against type-piracy means that in practice a lot of stuff won't invalidate. Hence type-instability is a major back door through which invalidation becomes more of a risk. Interesting observation: above I got both an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for writing this. I enjoyed reading it!
I'd say that calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Tim for this great informative blog post (and great tooling to go with it).
While I love hacky workarounds as temporary solutions, I also feel rather uncomfortable that this might encourage people to remove abstractions, or make people fear their cost. I doubt this is what you intend here at all, but I think some readers might reach that conclusion and it would be useful to divert them away from it!
As to the mechanics of working around unnecessary invalidations, I wonder whether some generalization of JuliaLang/julia#35844 (with a more useful user-facing API) would be helpful. Instead of turning maximum(version_numbers)
into some call to mapreduce
, it would be much less objectionable (to me!) to turn it into @please_do_some_fixed_world_hack maximum(version_numbers)
. At least that would clearly mark it as an ugly workaround while preserving the intention of the original code :-)
Anyway thanks again for writing this, it was a fun read and very informative.
The right way to treat this is as a running diary of what choices we'd have to make to fix invalidations. Writing them down helps me think it through and encourages other people to do the same, and when we don't like what we see that means we need to come up with a better approach. This morning's I'll make sure I continue to try to document the painful parts of this just to provoke myself and others to complain. But unless it drags on (and it's not, it's going really fast right now), I'll wait to actually merge this until the most important fixes to Julia itself have been made and we can brag about how much better 1.6 will be. |
I'm looking into this for JuMP, jump-dev/JuMP.jl#2278, and I'd appreciate some advice. I've got the number of invalidations down from 6630 to 1896, but it didn't have any noticeable impact on compile time. I'm also not really sure whether we should be fixing these at the JuMP level, or in the dependent packages with type instability: jump-dev/JuMP.jl#2278 (comment). This is part of an effort to speed up JuMP's tests, where compile time is 1000 times larger than runtime jump-dev/JuMP.jl#2277. cc @mlubin |
Co-authored-by: Matt Bauman <[email protected]>
Co-authored-by: Ian Fiske <[email protected]>
71da453
to
806a5c3
Compare
OK, after a long hiatus I came back and polished this up. I'm still waiting for a youtube video to convert (it's been stuck on "99% processed" for hours, not sure what's up with that...), so inserting the video link will have to wait until that's finished. Can someone post a link to the preview, which I understand is generated for all PRs? I am not successfully figuring out where to find it. @JeffBezanson and @vtjnash, I took the liberty of adding you as authors, since you've done a lot to reduce invalidations. Let me know if that's OK, and feel free to request changes in the content! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really useful. Thanks!
Barring concerns, I'll merge this Monday. |
I fixed the date and added a link to the video: https://www.youtube.com/watch?v=7VbXbI6OmYo. The main part is a tutorial in how to use SnoopCompile/Cthulhu to fix inference-related invalidations. |
Great vid! For the preview: https://julialang.netlify.app/previews/pr794/blog/2020/08/invalidations/ The trick is the |
|
||
First, let's look at a simple way (one that is not always precisely accurate, see SnoopCompile's documentation) to collect data on a package (in this case, loading the SIMD package): | ||
|
||
![snoopr_simd](/assets/blog/2020-invalidations/SIMD_invalidations.png) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the output is fairly wide for the first one and it squashes to the width of the page. We can either choose to truncate and keep the fontsize similar or see the full display. Preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep it as it is if reproducing a similar screenshot is non-trivial since you probably already fixed that :P
blog/2020/08/invalidations.md
Outdated
|
||
Using [SnoopCompile], we can count the number of invalidations triggered by loading various packages into a fresh Julia session: | ||
|
||
| Package | Version | # invalidations (Julia 1.5) | # invalidations (Julia 1.6) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion.. Perhaps it's better to reference this as master (githash, 1.6-pre)
, given that 1.6 is fast evolving? I'm sure this post will be well-visited beyond the finalization of 1.6, so just saying 1.6 might confuse things a little?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a good suggestion. Enough of my PRs have been merged that I can now comfortably commit to a git-sha, so let's merge this sucker.
Several of us have been poking at invalidations lately. I've finally gotten some decent tooling developed and my head wrapped around sources. Here's a summary of my findings.