incr.comp.: Maybe optimize case of dependency-less anonymous queries. #45408

michaelwoerister · 2017-10-20T10:07:31Z

Some kinds of queries can end up with no dependencies. That is a valid case if their computation solely depends on their query key and not on anything in the environment. erase_regions_ty is such a case.

We could optimize this case by not allocating a DepNode for such query invocations, which in turn would also save registering reads of that DepNode.

@nikomatsakis, should we do this? If so, I could write some mentoring instructions.

michaelwoerister · 2018-02-14T16:10:33Z

The current plan is to implement this optimization via not instantiating anonymous nodes at all. Instead their edges should be duplicated to readers.

nikomatsakis · 2018-02-14T20:05:21Z

I'm a fan of not instantiating anonymous nodes at all. I think that simplifies the overall model and may be a perf win to boot. Seems good.

cjgillot · 2020-04-18T22:01:38Z

Is this still relevant? If yes, could you explain what you mean by "not instantiating anonymous nodes"?

nikomatsakis · 2020-04-20T22:07:41Z

I think it's still relevant, but I'd have to dig more into the code to be sure. An "anonymous node" is a node in the dependency graph that doesn't correspond to a query. They primarily occur in trait selection, if I recall.

wesleywiser · 2020-09-15T12:59:45Z

The query system has changed quite a bit since the last time I looked but I will try to leave some notes in case this is helpful:

The query system is defined in compiler/rustc_query_system
A query is anonymous if this is true

rust/compiler/rustc_query_system/src/query/config.rs

Line 74 in 90b1f5a

const ANON: bool;
I think a query depends solely on its key and not on the environment if no_tcx = true

rust/compiler/rustc_query_system/src/dep_graph/graph.rs

Line 225 in 90b1f5a

no_tcx: bool,
I believe the edge recording happens here

rust/compiler/rustc_query_system/src/dep_graph/graph.rs

Line 1004 in 90b1f5a

fn intern_node(

tgnottingham · 2020-11-08T00:13:29Z

I'm far from an expert in this area, but I wanted to note some potential wrinkles.

As I understand it, anonymous nodes provide two things, which we probably want to preserve:

They enable, or at least interact with (sorry, hazy on the details), caching the queries' results within a single compilation session. If we make multiple calls to the same anonymous query in a single session, we don't have to recompute the result.

I'm not sure what has to change to preserve this behavior if we remove the anonymous nodes. The caching system does have some dependency on DepNodeIndex, which wouldn't exist for an anonymous query if it wasn't allocated.
They track dependencies. Suppose query A depends on anonymous query B, which depends on queries C and D. If we remove anonymous nodes, we need to make A depend on C and D directly, or we lose that dependency info.

I'm not sure if this is trivial to handle. As we execute A, we build up its dependency list. We do that by adding the dependency's DepNodeIndex to A's dependency list. But if B isn't ever allocated, it doesn't have an index, so we're already in trouble. We really want to depend on C and D, anyway, so maybe we can get around it. But where are we getting that list of C and D from if it isn't stored somewhere? And if we're storing it somewhere, how much are we gaining by not just storing it in a node for the anonymous query B? There may be a good answer, it's just not clear to me.

Finally, I want to note that this may not actually be an optimization. If we copy B's dependencies into things that depend on B, this could actually result in more storage being used, depending on the queries involved. Multiple queries can depend on B. If B depends on 10 queries, now we've replaced each dependency on B with 10 dependencies. In a way, anonymous nodes are an optimization for this case. They deduplicate a set of common edges.
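To make the A/B/C/D scenario above concrete, here is a minimal sketch of what "copying B's dependencies into A" could look like. It is a toy model, not rustc code: `NodeId`, `TaskDeps`, and `Reads` are hypothetical names invented for illustration. Note that the list of C and D still has to live somewhere (here, inside `TaskDeps::Inlined`), which is exactly the storage question raised above.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a dep-graph node index (not the real DepNodeIndex).
type NodeId = u32;

/// What a finished task hands back to whoever reads its result.
enum TaskDeps {
    /// A regular query with its own node: readers record a single edge to it.
    Node(NodeId),
    /// An anonymous task that never got a node: readers must copy its reads.
    Inlined(BTreeSet<NodeId>),
}

/// Collects the reads of the task that is currently executing.
#[derive(Default)]
struct Reads(BTreeSet<NodeId>);

impl Reads {
    /// Record a dependency, flattening it if the dependency has no node of its own.
    fn record(&mut self, dep: &TaskDeps) {
        match dep {
            TaskDeps::Node(id) => {
                self.0.insert(*id);
            }
            TaskDeps::Inlined(ids) => {
                self.0.extend(ids.iter().copied());
            }
        }
    }
}
```

In this model, if B reads `Node(2)` (C) and `Node(3)` (D) and is then returned as `Inlined`, a reader A that records B ends up with the edge set {2, 3} and no edge to B itself.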

michaelwoerister · 2020-11-17T15:32:20Z

@tgnottingham's analysis is pretty spot on, afaict. To give some context (which is unfortunately lacking in the issue description):

Anonymous dep-nodes were conceived in order to represent things in the dep-graph for which we don't have a query key. In particular we used them for things cached internally in the trait selection system. Their main purpose is to enable dependency tracking for things that don't fit in the query system -- and I think we always considered them as a kind of crutch that we wanted to get rid of as soon as possible.

Some time later we discovered that we could use them as a performance optimization for queries that don't get cached on disk, like erase_regions_ty. They are more performant because they don't need to compute the DepNode fingerprint from the query key.

This issue talks about handling dependency tracking differently: Instead of introducing an anonymous node and pointing the dep-edge to it, we just duplicate the anonymous node's outgoing edges. E.g.

A <-------+  +------ C
          |  |
          |  v
         (anon) 
          |  ^
          |  |
B <-------+  +------ D

becomes

A <---------------- C
^                   |
|                   |
+-----------+       |
            |       |
+---------- | ------+
|           |
v           |
B <---------D

That way, we would still keep all dependencies intact. However, as @tgnottingham points out, this might not actually be an optimization. The number of edges becomes n * m instead of n + m. In the case of erase_regions_ty where n == 0, this would be beneficial, but for other cases it might make things a lot worse.
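The edge-count comparison can be written down as a tiny sketch. Here `n` is taken to be the number of dependencies of the anonymous node and `m` the number of its readers, which matches the `n == 0` case for erase_regions_ty; the function names are made up for this illustration.

```rust
/// Edges stored when the anonymous node is kept:
/// `n` edges from the node to its dependencies plus `m` edges from its readers.
fn edges_with_anon_node(n: usize, m: usize) -> usize {
    n + m
}

/// Edges stored when the anonymous node is dropped and each of the `m`
/// readers gets its own copy of all `n` dependencies.
fn edges_when_inlined(n: usize, m: usize) -> usize {
    n * m
}
```

With `n = 0` and `m = 5`, inlining stores 0 edges instead of 5. With `n = 10` and `m = 5`, it stores 50 instead of 15. With `n = 1`, inlining stores `m` edges instead of `1 + m`, so it never costs more than keeping the node.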

I'm wondering if we could omit allocating anon dep-nodes IFF they have zero outgoing edges? I'm not sure if that could lead to complications. Probably yes.

Avoid creating anonymous nodes with zero or one dependency. Anonymous nodes are only useful to encode dependencies, and cannot be replayed from one compilation session to another. As such, anonymous nodes without dependency are always green. Anonymous nodes with only one dependency are equivalent to this dependency. cc rust-lang#45408 cc `@michaelwoerister`
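A rough sketch of the zero-or-one-dependency rule described above might look like the following. This is not the code from the PR: `AnonOutcome` and `classify_anon_task` are hypothetical names, `DepNodeIndex` is a plain integer alias here, and `reads` is assumed to be already deduplicated.

```rust
/// Hypothetical stand-in for the real dep-graph index type.
type DepNodeIndex = u32;

/// What to do with an anonymous task once its reads are known.
#[derive(Debug, PartialEq)]
enum AnonOutcome {
    /// No dependencies: nothing can invalidate the result, so no node is needed.
    NoNode,
    /// Exactly one dependency: readers can point at that dependency directly.
    Reuse(DepNodeIndex),
    /// Two or more dependencies: keep a real anonymous node to deduplicate the edges.
    Allocate(Vec<DepNodeIndex>),
}

/// Decide how to represent an anonymous task from its (deduplicated) reads.
fn classify_anon_task(reads: Vec<DepNodeIndex>) -> AnonOutcome {
    match reads.len() {
        0 => AnonOutcome::NoNode,
        1 => AnonOutcome::Reuse(reads[0]),
        _ => AnonOutcome::Allocate(reads),
    }
}
```

The design point is that only the multi-dependency case pays for a node, which is also the only case where the node saves edges by letting readers share it.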

cjgillot · 2021-11-07T10:23:27Z

Marking as fixed by #85337.

michaelwoerister added the A-incr-comp Area: Incremental compilation label Oct 20, 2017

michaelwoerister mentioned this issue Oct 20, 2017

make erase_regions_ty query anonymous #45364

Merged

TimNN added the C-cleanup Category: PRs that clean code up or issues documenting cleanup. label Oct 22, 2017

michaelwoerister changed the title ~~incr.comp.: Maybe optimize case of dependency-less non-input queries.~~ incr.comp.: Maybe optimize case of dependency-less anonymous queries. Oct 23, 2017

michaelwoerister mentioned this issue Jan 22, 2018

Tracking Issue for Incremental Compilation #47660

Open

32 tasks

wesleywiser mentioned this issue Sep 14, 2020

Ongoing projects rust-lang/wg-incr-comp#1

Open

arora-aman mentioned this issue Sep 22, 2020

Don't allocate DepNode if anonymous #77070

Closed

jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 22, 2020

cjgillot mentioned this issue May 15, 2021

Avoid creating anonymous nodes with zero or one dependency. #85337

Merged

cjgillot closed this as completed Nov 7, 2021
