-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MIR/two-phase borrowck takes a lot of time and memory for the ucd crate #53643
Comments
cc @rust-lang/wg-compiler-nll |
Should we add this to perf? Seems like maybe yes |
I can confirm this is "really dang slow" in my local testing. Doing a profile. |
The TL;DR is that there are an "awful lot of borrows". We spend something like 61% of our time propagating borrows -- and the rest I think is spent just comparing against borrows. I would expect #53328 might help here, but maybe there is something simpler we can do too. Last time we solved this via #53176 -- I suspect a slightly expanded version ought to apply here. |
Could there a be a fast(er) path for "trivial" |
A note for memory usage, I've tested this crate with #53327 and similarly to |
I believe we can safely skip a shared borrow of a local variable if the following conditions are met:
Unfortunately, that last condition -- "declared as immutable" -- does not hold in the case of ucd. This is because the locals in question are temporaries, and we always create temporaries as mutable. We could check a more general condition: basically it doesn't matter how the local is declared, it only matters if it is assigned to after the borrow, but that would require us to duplicate the code that knows about assignments etc. We could also convince the MIR builder to create immutable temporaries in some conditions. In this particular case, the temporary is created via a rust/src/librustc_mir/build/expr/as_rvalue.rs Lines 65 to 68 in 7625c03
One could imagine threading down some information about what sort of temporary to create (e.g., "this is a shared borrow, so create an immutable temporary"). Putting that together it feels like the steps needed here are:
The |
working on it |
Here's some Callgrind output. The hottest function by far is
The hot path goes through the |
There's something quadratic about the way
|
I'm also puzzled by this part of the code:
It seems like tracking visited-ness at the statement level shouldn't be necessary -- we should instead be able to do it at the basic block level, right? I.e. we can check if a basic block has been visited, and if it hasn't, we can process all the statements within it without any further checking. If I'm right about that, then the |
@nnethercote you commented the same three times :) |
@bjorn3 github experienced a 500 internal error last night and caused a ton of comments posted at that exact moment to duplicate |
@nikomatsakis the strategy you outlined above should take care of the execution time issue, but the memory usage issue, at least according to @SimonSapin 's analysis, is happening earlier in the pipeline, during |
Ah never mind; discussing with lqd led me to realize I had misread Simon’s comment and had thought the current memory regression was far more grave |
In principle yes, but it's hard to get it just right. In particular, the borrow can start in the middle of a basic block -- and you can loop back to the start. So I've considered doing the tracking at the basic block level, but not marking the initial block fixed, and then having an extra check for when you loop back to the start point. In any case, I do think we can do better than this code, but I think the best fix for UCD is simply not to have the borrows in the first place. And I had hoped to rework the loop as part of #53328 -- but I guess no reason not to do it earlier too, it's an independnt task |
@pnkfelix just for the record, the max-rss results with #53327 applied are:
Thus the memory use is 136%. Not so bad. |
This function was hot for In short, I agree that fixing it both ways is a good idea :) |
Yes, I agree. In particular, there will be cases where we can't just make the borrows go away. =) (I do also mildly worry about messing up some optimization here and ignoring borrows when we ought not to.) |
Yesterday, I started looking at rewriting |
|
Just to expand on the quadratic-ness of
It's clearly iterating through every 12th or 14th of BB's statements, and for each one it iterates from that statement to the BB's end. Boom, quadratic. |
Bumping to RC 2 — important goal, but not an RC blocker. |
Skip a shared borrow of a immutable local variables issue #53643 r? @nikomatsakis
So it's not 110% CPU and 135% RAM. Seems fine. Closing. |
Setup:
Baseline: 7.10s, peak RSS 542,664 KB
Output
With MIR-based two-phase borrowck: 67s, peak RSS 2,347,764 KB. A 9× increase in time and 4× increase in memory.
Output
Note however that the current Nightly (tested above) is already a lot better than nightly-2018-07-17 (currently used in Servo) where the baseline is about the same but MIR-based two-phase borrowck takes 25,660,988 KB (25 GB) and 457 seconds (7m37), a 47× and 64× increase from baseline.
In both cases, the outlier in
-Z time-passes
points to this item of the crate:https://github.com/sourtin/libucd/blob/fc89190397d21e43ab9b2246ab5b4aad88524875/src/tables/rem.rs#L4126-L7836
The text was updated successfully, but these errors were encountered: