Very WIP: New optimizer #26079

Keno · 2018-02-16T06:10:28Z

This PR introduces a new IR form and optimizer to address some of the performance problems encountered in #25261. The IR is pure SSA form and borrows from B3 and LLVM in its design choices. I'm planning to write a comprehensive design document once this is a bit further along. The new IR contains Phi nodes, so a significant part of this is adding support for those in codegen and the optimizer.

The current status is that this bootstraps (in the sense of building itself) fine on top of an existing system image, and can bootstrap a new system image to the same place as doing an external bootstrap on master (somewhere around version.jl at which point it gets confused about being not-quite-Base). It also passes core and string tests, in the externally bootstrapped version. It does not however, bootstrap itself internally yet (keeps recursing into inference), so this is quite an early state. Opening this PR to give @vtjnash and @JeffBezanson a place to collaborate on this.

Once this is done, we should be able to delete most of the existing passes from optimize.jl, so I'm hoping that this'll be overall net negative.

TODO status:

~~turning on optimization passes~~
~~line number info~~ DONE
~~handling meta nodes correctly~~ DONE
~~possible improvements (like not adding edges from error()::Undef{})~~
replace the naive algorithms with the fast ones / performance
handling enter/exit
cleaning up code / removing debug spew

As was mentioned in #25988, there is a handy trick where you can load a second copy of Base on top of an existing copy. This is useful for at least two reasons: 1. Base printing is available, so things like MethodErrors print nicely 2. Even if the load fails, the resulting (broken) copy of base is inspectable by standard introspection tools from the REPL, as long as you're a bit careful not to mix types from the two copies of Base. However, as I mentioned in #26079, this only actually works until about version.jl, at which point things crash. This is because at that point it tries to use PCRE which uses `Ref(0)`, which is actually an abstract type in Core, even though the type of the constructed object (`RefValue`) is in Base. As a result, the new Base gets the wrong kind of `RefValue` (the one from the original `Base`) and things break. Luckily this is easily fixed by using an explicit `RefValue` call in the relevant places. A second problem we run into is that `module`s nested under our new `Base`, get a default import of the old `Base` (unless we declare the new Base to be the global top module, but that would break the REPL subsequent to loading the new Base, which breaks reason 2 above). I suggest (and implement in this PR) to have the default import be the next topmodule along the parent link chain (as we already do for syntax defined in Base), which makes this work. A small related detail is that in all such modules `import Base: x`, needs to be replaced by `import .Base: x`, to make sure we resolve the identifier `Base` (as imported from our new top module) rather than the global name `Base` (which still refers to the old module). I changed sysimg.jl to avoid loading stdlibs in second Base mode, to avoid having to implement the same changes there. Since the stdlibs are already decoupled from Base, they can already be developed separately fairly easily, so there's not much reason to include them in this trick. For completeness, there's a couple of ways to use this trick, but perhaps the simplest is: ``` cd("base") baremodule NotBase Core.include(NotBase, "sysimg.jl") end ``` from the REPL.

vtjnash · 2018-02-24T20:24:52Z

From discussion with Keno, we're thinking of merging this soon behind a feature-flag. He thought he might have time today to remove all of the debug statements, re-enable the old optimization passes, and squash/rebase the PR.

Keno · 2018-02-25T20:41:40Z

Rebased, removed disabled optimization passes, squashed, flag added to inference to turn on the new optimizer.

KristofferC · 2018-02-25T21:00:38Z

base/client.jl

@@ -298,11 +298,6 @@ function exec_options(opts)
    # load ~/.julia/config/startup.jl file
    startup && load_julia_startup()

-    if repl || is_interactive


Good catch. I assume this was broken at some point, so carelessly deleted.

KristofferC · 2018-02-25T21:01:58Z

Worth to run nanosoldier? Just to make sure.

Keno · 2018-02-25T21:04:24Z

Sure, it is disabled by default, so in theory, we should be fine, but who knows:
@nanosoldier runbenchmarks(ALL, vs=":master")

nanosoldier · 2018-02-26T02:11:16Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

vtjnash · 2018-02-26T14:52:52Z

The tests compiler and cmdlineargs needs some minor fixup, then this seems good to go.

This introduces a new framework for middle-end optimizations in julia, based on SSA form IR. Much work is still left to be done, but thanks to Jameson's work this is in a bootstrappable form, so we're merging it behind a feature flag to allow for quick iteration. Documentation of the new IR format is available in the devdocs. Co-authored-by: Jameson Nash <[email protected]>

Keno · 2018-02-28T05:04:13Z

FreeBSD and OS X hit their respective intermittent kernel bugs. AppVeyor is as on master. Other tests went through though. Merging. As a reminder, this is just getting the code in, the actual functionality is still behind an (off by default) flag.

StefanKarpinski · 2018-04-13T16:54:22Z

Implicated in dot-call global scope slow down: #26767. I thought this was hidden behind a feature flag—is there somewhat it may affect performance in global scope even when the feature flag is turned off? If so, can we mitigate that somehow?

Keno added this to the 1.0 milestone Feb 16, 2018

Keno mentioned this pull request Feb 17, 2018

Iteration protocol change #25261

Merged

3 tasks

Keno mentioned this pull request Feb 19, 2018

Make the "second Base copy" trick actually work #26111

Merged

vtjnash mentioned this pull request Feb 20, 2018

ensure promotion rules don't alter eltype values #26109

Merged

Keno force-pushed the kf/phinode branch from 94ef0dc to 558b99c Compare February 20, 2018 22:26

vtjnash mentioned this pull request Feb 24, 2018

Add IdSet and fix IdDict performance problem #26054

Merged

Keno force-pushed the kf/phinode branch from be8fabc to 205573c Compare February 25, 2018 20:40

KristofferC reviewed Feb 25, 2018

View reviewed changes

Keno force-pushed the kf/phinode branch 2 times, most recently from e550803 to 644a375 Compare February 27, 2018 01:09

Keno and others added 2 commits February 27, 2018 14:35

[NewOptimizer] Track inlining info

20e02b8

Keno force-pushed the kf/phinode branch from 644a375 to 20e02b8 Compare February 27, 2018 19:35

Keno merged commit 2345371 into master Feb 28, 2018

martinholters deleted the kf/phinode branch February 28, 2018 07:04

LilithHafner mentioned this pull request Aug 27, 2022

Switch from sorted vector to heap in optimizer #46510

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Very WIP: New optimizer #26079

Very WIP: New optimizer #26079

Keno commented Feb 16, 2018 •

edited

Loading

vtjnash commented Feb 24, 2018

Keno commented Feb 25, 2018

KristofferC Feb 25, 2018

Keno Feb 25, 2018

KristofferC commented Feb 25, 2018

Keno commented Feb 25, 2018

nanosoldier commented Feb 26, 2018

vtjnash commented Feb 26, 2018

Keno commented Feb 28, 2018

StefanKarpinski commented Apr 13, 2018

Very WIP: New optimizer #26079

Very WIP: New optimizer #26079

Conversation

Keno commented Feb 16, 2018 • edited Loading

vtjnash commented Feb 24, 2018

Keno commented Feb 25, 2018

KristofferC Feb 25, 2018

Choose a reason for hiding this comment

Keno Feb 25, 2018

Choose a reason for hiding this comment

KristofferC commented Feb 25, 2018

Keno commented Feb 25, 2018

nanosoldier commented Feb 26, 2018

vtjnash commented Feb 26, 2018

Keno commented Feb 28, 2018

StefanKarpinski commented Apr 13, 2018

Keno commented Feb 16, 2018 •

edited

Loading