incr.comp.: Use a set implementation optimized for small item counts for deduplicating read-edges. #45577
Something like this? https://crates.io/crates/vec_map
@leonardo-m No, this structure is more like https://docs.rs/david-set/0.1.2/david_set/struct.Set.html.
// Many kinds of nodes often only have between 0 and 3 edges, so we provide a
// specialized set implementation that does not allocate for those small counts.
#[derive(Debug, PartialEq, Eq)]
enum DepNodeIndexSet {
Could you move this to a generic data structure in `rustc_data_structures`?
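For illustration, a generic version of such a set might look roughly like the sketch below. The name `SmallSet`, its variants, and its methods are hypothetical and are not taken from the PR or from `rustc_data_structures`; the sketch only assumes that up to three items are kept inline and that larger sets spill into a `HashSet`.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical generic small set: up to three items live inline (no heap
/// allocation); a fourth distinct item moves everything into a `HashSet`.
#[derive(Debug)]
pub enum SmallSet<T> {
    Zero,
    One(T),
    Two(T, T),
    Three(T, T, T),
    Many(HashSet<T>),
}

impl<T: Copy + Eq + Hash> SmallSet<T> {
    pub fn new() -> Self {
        Self::Zero
    }

    /// Inserts `item`; returns `true` if it was not already present.
    pub fn insert(&mut self, item: T) -> bool {
        let next = match *self {
            Self::Zero => Self::One(item),
            Self::One(a) => {
                if a == item {
                    return false;
                }
                Self::Two(a, item)
            }
            Self::Two(a, b) => {
                if a == item || b == item {
                    return false;
                }
                Self::Three(a, b, item)
            }
            Self::Three(a, b, c) => {
                if a == item || b == item || c == item {
                    return false;
                }
                // First allocation happens only here, on the fourth distinct item.
                let mut set = HashSet::new();
                set.insert(a);
                set.insert(b);
                set.insert(c);
                set.insert(item);
                Self::Many(set)
            }
            Self::Many(ref mut set) => return set.insert(item),
        };
        *self = next;
        true
    }

    pub fn contains(&self, item: &T) -> bool {
        match self {
            Self::Zero => false,
            Self::One(a) => a == item,
            Self::Two(a, b) => a == item || b == item,
            Self::Three(a, b, c) => a == item || b == item || c == item,
            Self::Many(set) => set.contains(item),
        }
    }

    pub fn len(&self) -> usize {
        match self {
            Self::Zero => 0,
            Self::One(..) => 1,
            Self::Two(..) => 2,
            Self::Three(..) => 3,
            Self::Many(set) => set.len(),
        }
    }

    /// `true` while no `HashSet` has been allocated.
    pub fn is_inline(&self) -> bool {
        !matches!(self, Self::Many(_))
    }
}
```

Restricting the sketch to `T: Copy` keeps the inline variants trivially movable between states; a version for non-`Copy` items would need to move values out of `self` instead.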
Could you try and get some numbers on the performance improvement or the hit rate? Even if you don't, r=me with the set moved to
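One simple way to express the "hit rate" asked about here is the fraction of nodes whose deduplicated edge count fits into the inline representation. The helper below is a hypothetical sketch: `inline_hit_rate` and its parameters are made up for illustration, and the edge counts would have to come from real instrumentation of the compiler.

```rust
/// Fraction of nodes whose number of distinct read-edges fits inline.
/// Returns `None` when there are no nodes to measure.
pub fn inline_hit_rate(edge_counts: &[usize], inline_capacity: usize) -> Option<f64> {
    if edge_counts.is_empty() {
        return None;
    }
    // A "hit" is a node that never needs the heap-allocated fallback.
    let hits = edge_counts
        .iter()
        .filter(|&&count| count <= inline_capacity)
        .count();
    Some(hits as f64 / edge_counts.len() as f64)
}
```

With an inline capacity of 3, a node with four distinct edges counts as a miss, since it would force the allocation the PR is trying to avoid.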
@bors try Preparing for perf.
incr.comp.: Use a set implementation optimized for small item counts for deduplicating read-edges.
Many kinds of `DepNodes` will only ever have between zero and three edges originating from them (see e.g. #45063 (comment)), so let's try to avoid allocating a `HashSet` in those cases.
r? @nikomatsakis
☀️ Test successful - status-travis
@rust-lang/infra perf check requested from #45577 (comment).
This is fine, but an Note: ArrayVec is already in the codebase.
The improvement/regression is negligible. There is even a +17.4% max-rss (memory use) in the
Thanks for kicking off the performance measurement, @kennytm!
@julian-seward1 and I ran into this function as one of the hotter ones while profiling an incremental build of the regex crate. The compiler spends roughly 0.9% and 1.6% of cycles in the
The main intended optimization here is to get rid of the heap allocation for small hash sets. It's a bit surprising that the effect on the overall instruction count is so small. I would have hoped for -0.5% instead of -0.1%.
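As an illustration of the same goal (no hash set allocation when a node has only a few read-edges), here is a different technique from the enum in this PR: scan the already-recorded reads linearly while they are few, and build a `HashSet` lazily once they exceed a threshold. This is a minimal sketch under stated assumptions; `ReadRecorder`, `record_read`, and `LINEAR_SCAN_LIMIT` are hypothetical names, and plain `u32` values stand in for dep-node indices.

```rust
use std::collections::HashSet;

/// Hypothetical threshold below which a linear scan replaces the hash set.
const LINEAR_SCAN_LIMIT: usize = 3;

/// Records read-edges in first-seen order, without duplicates.
#[derive(Debug, Default)]
pub struct ReadRecorder {
    reads: Vec<u32>,
    // Allocated only once more than `LINEAR_SCAN_LIMIT` distinct reads exist.
    seen: Option<HashSet<u32>>,
}

impl ReadRecorder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records a read of `source`; returns `true` if the edge is new.
    pub fn record_read(&mut self, source: u32) -> bool {
        let is_new = match &mut self.seen {
            Some(seen) => seen.insert(source),
            None => !self.reads.contains(&source),
        };
        if !is_new {
            return false;
        }
        self.reads.push(source);
        // Switch to hashing once linear scans would start to get expensive.
        if self.seen.is_none() && self.reads.len() > LINEAR_SCAN_LIMIT {
            self.seen = Some(self.reads.iter().copied().collect());
        }
        true
    }

    pub fn reads(&self) -> &[u32] {
        &self.reads
    }

    pub fn uses_hash_set(&self) -> bool {
        self.seen.is_some()
    }
}
```

Note that the `Vec` of reads still allocates; the sketch only avoids the second allocation for the deduplication set, which is the one this comment is concerned with.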
This must be some other effect. I don't see how this could introduce any noticeable increase in memory consumption.
Thanks for your comments, everyone! Maybe I'll tinker some more with this in my spare time. Closing for now.
@michaelwoerister Probably it's because jemalloc is very efficient? 😄