-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Very WIP: New optimizer #26079
Very WIP: New optimizer #26079
Conversation
As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.
As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.
As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.
As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.
As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.
From discussion with Keno, we're thinking of merging this soon behind a feature-flag. He thought he might have time today to remove all of the debug statements, re-enable the old optimization passes, and squash/rebase the PR. |
Rebased, removed disabled optimization passes, squashed, flag added to inference to turn on the new optimizer. |
base/client.jl
Outdated
@@ -298,11 +298,6 @@ function exec_options(opts) | |||
# load ~/.julia/config/startup.jl file | |||
startup && load_julia_startup() | |||
|
|||
if repl || is_interactive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I assume this was broken at some point, so carelessly deleted.
Worth to run nanosoldier? Just to make sure. |
Sure, it is disabled by default, so in theory, we should be fine, but who knows: |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
The tests compiler and cmdlineargs needs some minor fixup, then this seems good to go. |
e550803
to
644a375
Compare
This introduces a new framework for middle-end optimizations in julia, based on SSA form IR. Much work is still left to be done, but thanks to Jameson's work this is in a bootstrappable form, so we're merging it behind a feature flag to allow for quick iteration. Documentation of the new IR format is available in the devdocs. Co-authored-by: Jameson Nash <[email protected]>
FreeBSD and OS X hit their respective intermittent kernel bugs. AppVeyor is as on master. Other tests went through though. Merging. As a reminder, this is just getting the code in, the actual functionality is still behind an (off by default) flag. |
Implicated in dot-call global scope slow down: #26767. I thought this was hidden behind a feature flag—is there somewhat it may affect performance in global scope even when the feature flag is turned off? If so, can we mitigate that somehow? |
This PR introduces a new IR form and optimizer to address some of the performance problems encountered in #25261. The IR is pure SSA form and borrows from B3 and LLVM in its design choices. I'm planning to write a comprehensive design document once this is a bit further along. The new IR contains Phi nodes, so a significant part of this is adding support for those in codegen and the optimizer.
The current status is that this bootstraps (in the sense of building itself) fine on top of an existing system image, and can bootstrap a new system image to the same place as doing an external bootstrap on master (somewhere around version.jl at which point it gets confused about being not-quite-Base). It also passes
core
andstring
tests, in the externally bootstrapped version. It does not however, bootstrap itself internally yet (keeps recursing into inference), so this is quite an early state. Opening this PR to give @vtjnash and @JeffBezanson a place to collaborate on this.Once this is done, we should be able to delete most of the existing passes from optimize.jl, so I'm hoping that this'll be overall net negative.
TODO status:
turning on optimization passesline number infoDONEhandling meta nodes correctlyDONEpossible improvements (like not adding edges fromerror()::Undef{}
)