-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiling top-level expressions forces global binding resolution #14055
Comments
Just happened upon this and have been thinking about it a bit. There are kind of two issues. (1) Inference and codegen cause all the bindings they examine to get resolved. That's not ideal, but is a practical necessity. (2) This case, where the same block of code reads and writes a new global variable. It's not a problem inside functions, because the function would have to contain a We could tell inference and codegen not to resolve bindings in top-level code, but that could lead to really bad and unpredictable performance. Or a hybrid approach could be avoiding binding resolution just for names that are assigned somewhere (those being the "less predictable" bindings). |
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. # Motivation The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. # Implementation In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. # Demo ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) # 2 methods for generic function "f" from Main: [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` # Performance consideration On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. # Semantic TODO - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? # Implementation TODO - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that `const` bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this. The reasons for this change are manifold: 1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of #22721. 2. A secondary motivation is to make `const`-redefinition no longer UB (although `const` redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in #54099. 3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (#53958). 4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like #14055. In this PR: - `Binding` gets `min_world`/`max_world` fields like `CodeInstance` - Various lookup functions walk this linked list using the current task world_age as a key - Inference accumulates world bounds as it would for methods - Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef - One primary complication is that our IR definition permits `const` globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in `Core.Compiler` that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds. - A new `@world` macro can be used to access bindings from old world ages. This is used in printing for old objects. - The `const`-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed. Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like: ``` f() = Any g(::f()) = 1 f() = Int ``` Does not redefine `g`. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings. ``` julia> struct Foo a::Int end julia> g() = Foo(1) g (generic function with 1 method) julia> g() Foo(1) julia> f(::Foo) = 1 f (generic function with 1 method) julia> fold = Foo(1) Foo(1) julia> struct Foo a::Int b::Int end julia> g() ERROR: MethodError: no method matching Foo(::Int64) The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it. Closest candidates are: Foo(::Int64, ::Int64) @ Main REPL[6]:2 Foo(::Any, ::Any) @ Main REPL[6]:2 Stacktrace: [1] g() @ Main ./REPL[2]:1 [2] top-level scope @ REPL[7]:1 julia> f(::Foo) = 2 f (generic function with 2 methods) julia> methods(f) [1] f(::Foo) @ REPL[8]:1 [2] f(::@world(Foo, 0:26898)) @ REPL[4]:1 julia> fold @world(Foo, 0:26898)(1) ``` On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly. - [ ] Do we want to change the resolution time of bindings to (semantically) resolve them immediately? - [ ] Do we want to introduce guard bindings when inference assumes the absence of a binding? - [ ] Precompile re-validation - [ ] Various cleanups in the accessors - [ ] Invert the order of the binding linked list to make the most recent one always the head of the list - [ ] CodeInstances need forward edges for GlobalRefs not part of the uninferred code - [ ] Generated function support
Sorry the title is so vague but I can't get my head around what is happening here. Minimal code which reproduces this bug:
Output:
Removing the
if
orfor
blocks results in this bug not triggering, so it looks like this has something to do with soft vs hard scope maybe. Rebinding an existing variable name is also required.The text was updated successfully, but these errors were encountered: