incr.comp.: Fix Span hashing in various ways #46301

michaelwoerister · 2017-11-27T13:33:49Z

This PR contains two sets of changes that warrant some discussion:

The first commit makes the compiler persist macro expansion contexts and restores them during deserialization. The latter is done by directly allocating the expansion context without trying to restore the parent field in MarkData or the prev_ctxt field in SyntaxContextData. I think this is fine for now since those fields are not used after HIR lowering and incremental compilation only kicks in after that point. But I'd like some feedback from @jseyfried, @nrc, and others who are dealing with macros. Would you consider this a hack? Should I use some kind of poison value for parent and prev_ctxt so we can do runtime checks?
The second commit makes the compiler store spans as (file, line, column, length) in the incr. comp. cache instead of as byte offsets into the previous FileMap. It turned out that the previous approach silently breaks when loading something from the cache that is defined in a changed source file. The downside (or, depending on your point of view, upside) of the new approach is that it ICEs on corrupt span values; whereas the old approach would just pass them on, maybe making them a little more corrupted in the process :)
Since the compiler seems to produce tons of invalid span values, we have to deal with them somehow. For now I chose to solve this by detecting invalid span values during serializations and explicitly serialize those as a special value. During deserialization these values get turned into DUMMY_SP. In order for that not to break -Zincremental-verify-ich, I also adapted the HashStable implementation to behave semantically equivalent: All invalid values, including DUMMY_SP produce the same hash value distinct from all hashes a valid span would produce. I think this is a valid approach under the assumption that invalid spans are never legitimately used to produce a valid result.

r? @nikomatsakis
cc @rust-lang/compiler

The previous method ran into problems because ICH would treat Spans as (file,line,col) but the cache contained byte offsets and its possible for the latter to change while the former stayed stable.

nikomatsakis

Hmm, so this code seems ok to me, but I feel like I may not be the right reviewer. I could try to dig more deeply, but ultimately I don't have a strong insight into whether these changes will violate some sort of invariant.

I suppose that either way they represent progress over the status quo and hence I am in favor of landing.

One question: I didn't see any form of new tests. Is there a reason this is hard to write targeted tests for that show how the new system is better?

nikomatsakis · 2017-11-27T23:21:44Z

src/librustc/ich/hcx.rs

-                std_hash::Hash::hash(&loc2.1, hasher);
-                std_hash::Hash::hash(&loc2.2, hasher);
+        if span.hi < span.lo {
+            return std_hash::Hash::hash(&TAG_INVALID_SPAN, hasher);


I wonder why we generate so many invalid spans... how confident are you that these don't show up in diagnostics etc? I am not that confident. =) Are there asserts testing that hypothesis?

What gives me some confidence is that the output would be unreadable. That is, such spans span multiple file so displaying one of them in diagnostics would show two partial files and a random number of unrelated files that happen to be located in between the two initial files in the codemap. I haven't seen something like that yet.

I dimly recall adding some asserts of this kind in the code that generates snippets and things, though I can't find that now.

michaelwoerister · 2017-11-28T11:36:14Z

One question: I didn't see any form of new tests. Is there a reason this is hard to write targeted tests for that show how the new system is better?

Right, hm. So things only start to break once you actually cache many things that contain spans -- which is only done on a local branch of mine. So the existing tests will cover this but in the current state of the compiler it might be hard to exercise the relevant code paths.

michaelwoerister · 2017-11-28T16:42:41Z

It would be great if we could make some progress here. Maybe @nrc could have a quick look, especially at the first commit (b28e036)?

jseyfried · 2017-11-28T20:29:48Z

@michaelwoerister

I think this is fine for now since those fields are not used after HIR lowering and incremental compilation only kicks in after that point.

Those fields are used after lowering to HIR since fields, methods, and everything else that gets resolved after lowering is hygienic with macros 2.0. For example,

mod foo {
    struct S { x: i32 }
    pub macro m() {
        // Due to hygiene, the following always resolves, and is never a privacy error, wherever `m!()` is invoked.
        let s: S = S { x: 0 };
        s.x;
    }
}
// ...
fn f() { foo::m!(); }

In this example, we need the hygiene info for the two x's to be able to privacy-check the expansion correctly.

That being said, we (I) still need to implement to serializing hygiene info into the crate metadata anyway to support nested macro definitions across crate boundaries (i.e. a macro that defines a macro, the latter which gets used from another crate, see the limitation section of #40847), and I don't want to block you on this. I would be OK landing this as is, which would force me to do this :)

michaelwoerister · 2017-11-29T11:25:55Z

Thanks a lot for taking a look, @jseyfried!
What kind of errors would we potentially see due to the changes in this PR? Spurious resolution errors in things expanded from macros?

I'd be OK with landing too since:

I've run rather large test suites with a compiler implementing these changes, without running into errors, and
there's still a little bit of time before this will be used by default in an extensive way, so @jseyfried has a window of properly implementing this cross-crate functionality.

bors · 2017-11-30T13:48:16Z

☔ The latest upstream changes (presumably #46299) made this pull request unmergeable. Please resolve the merge conflicts.

nikomatsakis · 2017-11-30T20:02:47Z

OK. r=me

jseyfried · 2017-11-30T20:53:10Z

@michaelwoerister
I believe it could regress hygiene 2.0 privacy in limited instances. However, IIRC names from the cache are at least as unhygienic already, so this change should make anything much worse.

michaelwoerister · 2017-12-01T13:55:42Z

Thanks everyone! I'm closing this in favor of #46338 which contains the commits from this PR.

@nikomatsakis

incr.comp.: Load cached diagnostics lazily and allow more things in the cache. This PR implements makes two changes: 1. Diagnostics are loaded lazily from the incr. comp. cache now. This turned out to be necessary for correctness because diagnostics contain `Span` values and deserializing those requires that the source file they point to is still around in the current compilation session. Obviously this isn't always the case. Loading them lazily allows for never touching diagnostics that are not valid anymore. 2. The compiler can now deal with there being no cache entry for a given query invocation. Before, all query results of a cacheable query were always expected to be present in the cache. Now, the compiler can fall back to re-computing the result if there is no cache entry found. This allows for caching things that we cannot force from dep-node (like the `symbol_name` query). In such a case we'll just have a "best effort" caching strategy. ~~This PR is based on #46301 (=first 2 commits), so please don't merge until that has landed. The rest of the commits are ready for review though.~~ r? @nikomatsakis

michaelwoerister added 2 commits November 27, 2017 13:21

incr.comp.: Properly hash and encode macro expansion information.

b28e036

incr.comp.: Store Spans as (file,line,col,length) in incr.comp. cache.

faca970

The previous method ran into problems because ICH would treat Spans as (file,line,col) but the cache contained byte offsets and its possible for the latter to change while the former stayed stable.

rust-highfive assigned nikomatsakis Nov 27, 2017

carols10cents added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 27, 2017

nikomatsakis reviewed Nov 27, 2017

View reviewed changes

michaelwoerister mentioned this pull request Nov 28, 2017

incr.comp.: Load cached diagnostics lazily and allow more things in the cache. #46338

Merged

michaelwoerister mentioned this pull request Nov 30, 2017

incr.comp.: Current serialization logic for Spans messes with fingerprint stability #46059

Closed

kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 30, 2017

michaelwoerister closed this Dec 1, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

incr.comp.: Fix Span hashing in various ways #46301

incr.comp.: Fix Span hashing in various ways #46301

michaelwoerister commented Nov 27, 2017

nikomatsakis left a comment

nikomatsakis Nov 27, 2017

michaelwoerister Nov 28, 2017

nikomatsakis Nov 30, 2017

michaelwoerister commented Nov 28, 2017

michaelwoerister commented Nov 28, 2017

jseyfried commented Nov 28, 2017 •

edited

Loading

michaelwoerister commented Nov 29, 2017

bors commented Nov 30, 2017

nikomatsakis commented Nov 30, 2017

jseyfried commented Nov 30, 2017

michaelwoerister commented Dec 1, 2017

incr.comp.: Fix Span hashing in various ways #46301

incr.comp.: Fix Span hashing in various ways #46301

Conversation

michaelwoerister commented Nov 27, 2017

nikomatsakis left a comment

Choose a reason for hiding this comment

nikomatsakis Nov 27, 2017

Choose a reason for hiding this comment

michaelwoerister Nov 28, 2017

Choose a reason for hiding this comment

nikomatsakis Nov 30, 2017

Choose a reason for hiding this comment

michaelwoerister commented Nov 28, 2017

michaelwoerister commented Nov 28, 2017

jseyfried commented Nov 28, 2017 • edited Loading

michaelwoerister commented Nov 29, 2017

bors commented Nov 30, 2017

nikomatsakis commented Nov 30, 2017

jseyfried commented Nov 30, 2017

michaelwoerister commented Dec 1, 2017

jseyfried commented Nov 28, 2017 •

edited

Loading