[mono] Optimize startup vtable setup #101312
…undant calls to it during application startup
I'll come back and do a more thorough review. At first glance this looks good.
I'd prefer that if the release build is doing a fast path and returning some answer, the debug build should do the slow path and compare that the answer it gets matches the release build's fast answer.
I usually try to repro customer issues on a local debug build and it would be quite annoying if it was giving a different result.
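A minimal sketch of that request, assuming a toy one-entry cache: on a cache hit, a checked build re-runs the slow path and asserts that it agrees with the cached answer, while other builds return the cached answer directly. The names `toy_check`, `toy_check_slow`, and `toy_cache` are hypothetical stand-ins, not the runtime's functions; only `ENABLE_CHECKED_BUILD` comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-entry cache keyed on a pair of pointers. */
typedef struct {
	const void *a, *b;
	bool valid, result;
} toy_cache_entry;

static toy_cache_entry toy_cache;
static int slow_calls; /* counts slow-path invocations, for illustration only */

/* Stand-in for an expensive check; the predicate itself is arbitrary. */
static bool
toy_check_slow (const int *a, const int *b)
{
	slow_calls++;
	return *a < *b;
}

static bool
toy_check (const int *a, const int *b)
{
	if (toy_cache.valid && toy_cache.a == a && toy_cache.b == b) {
#ifdef ENABLE_CHECKED_BUILD
		/* Checked/debug builds: recompute and compare against the cache. */
		assert (toy_check_slow (a, b) == toy_cache.result);
#endif
		return toy_cache.result;
	}
	/* Cache miss: take the slow path and remember the answer. */
	bool result = toy_check_slow (a, b);
	toy_cache = (toy_cache_entry){ a, b, true, result };
	return result;
}
```

With this shape, both build flavors return the same answer, and a stale or wrong cache entry fails loudly in the checked build instead of silently changing behavior.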
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
Verify cache in checked builds
`ENABLE_CHECKED_BUILD` is defined to mean "Enable additional checks" and is enabled in checked and debug builds. Therefore this performance optimization should be enabled when `ENABLE_CHECKED_BUILD` is *not* defined. Ref: dotnet#101312
* Add new [ptr, ptr] -> ptr simdhash variant for caching
* Cache `mono_class_implement_interface_slow` because we perform many redundant calls to it during application startup
* Verify cache in checked builds
…ides (dotnet#101445) `ENABLE_CHECKED_BUILD` is defined to mean "Enable additional checks" and is enabled in checked and debug builds. Therefore this performance optimization should be enabled when `ENABLE_CHECKED_BUILD` is *not* defined. Ref: dotnet#101312
During startup (mostly in interpreted builds, but also a little bit in AOT) we spend a good chunk of time setting up vtables, and a lot of that time is spent in `mono_class_implement_interface_slow`. Once a check enters that slow path, all checks underneath it also stay on the slow path, which can result in a (small) exponential explosion of recursive checks that scan moderately large arrays, comparing A against B. The interface inheritance chains on BCL types are quite deep now in some cases thanks to things like generic arithmetic.
This PR adds a simple simdhash-based cache for `mono_class_implement_interface_slow`. In my testing it has a cache hit rate of ~60% during runs of System.Runtime.Tests and System.Text.Json.Tests, along with a cache hit rate of 40-50% on simpler applications. The number of expensive checks optimized out this way is fairly significant - tens of thousands on those test suites. Improvements from this should be more dramatic for more complex codebases.
The cache implementation is somewhat suboptimal - it will involve temporary allocations if multiple threads are racing to initialize vtables, and when the cache gets too big we have to clear it instead of pruning the oldest entries, which reduces the effective hit rate - but the memory usage is deterministic and based on my profiles the performance characteristics are good.
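To illustrate the clear-when-full tradeoff described above, here is a minimal single-threaded sketch of a fixed-capacity cache keyed on a pair of pointers. It uses plain open addressing rather than simdhash, ignores the multi-thread race entirely, and every name in it (`pair_cache`, `pair_cache_add`, the capacity constants) is hypothetical rather than taken from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAIR_CACHE_CAPACITY 8 /* power of two; tiny for demonstration */
#define PAIR_CACHE_MAX_LOAD 6 /* clear the whole cache at this many entries */

typedef struct {
	const void *k1, *k2;
	void *value;
	bool used;
} pair_cache_slot;

typedef struct {
	pair_cache_slot slots[PAIR_CACHE_CAPACITY];
	size_t count;
} pair_cache;

static size_t
pair_cache_hash (const void *k1, const void *k2)
{
	uintptr_t h = (uintptr_t)k1 * 31u + (uintptr_t)k2;
	h ^= h >> 4; /* fold in higher bits, since pointers are aligned */
	return (size_t)h & (PAIR_CACHE_CAPACITY - 1);
}

static bool
pair_cache_try_get (const pair_cache *c, const void *k1, const void *k2, void **out)
{
	size_t i = pair_cache_hash (k1, k2);
	for (size_t n = 0; n < PAIR_CACHE_CAPACITY; n++) {
		const pair_cache_slot *s = &c->slots[(i + n) & (PAIR_CACHE_CAPACITY - 1)];
		if (!s->used)
			return false; /* an empty slot ends the probe sequence */
		if (s->k1 == k1 && s->k2 == k2) {
			*out = s->value;
			return true;
		}
	}
	return false;
}

static void
pair_cache_add (pair_cache *c, const void *k1, const void *k2, void *value)
{
	/* No eviction order is tracked, so a full cache is simply emptied. */
	if (c->count >= PAIR_CACHE_MAX_LOAD)
		*c = (pair_cache){ 0 };
	assert (c->count < PAIR_CACHE_CAPACITY); /* guarantees a free slot below */
	size_t i = pair_cache_hash (k1, k2);
	for (;;) {
		pair_cache_slot *s = &c->slots[i];
		if (!s->used) {
			*s = (pair_cache_slot){ k1, k2, value, true };
			c->count++;
			return;
		}
		if (s->k1 == k1 && s->k2 == k2) {
			s->value = value; /* existing key: overwrite */
			return;
		}
		i = (i + 1) & (PAIR_CACHE_CAPACITY - 1);
	}
}
```

Because the table never grows, memory use is fixed up front; the cost is that the add which triggers a clear throws away every earlier entry, which is the hit-rate penalty mentioned above.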
This PR also disables `verify_class_overrides` for types inside corlib unless you're building for debug - @lambdageek pointed out that we don't really need to verify corlib types since csc should never generate invalid types for code under our control. This verification is a source of some of these redundant checks, though there are still plenty even with it disabled.
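The gating described here could look roughly like the sketch below, assuming the decision is keyed on `ENABLE_CHECKED_BUILD` as the follow-up commit message states. The types `toy_image` and `toy_class` and the helper `toy_should_verify_overrides` are hypothetical simplifications, not the runtime's actual structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for an assembly image and a class. */
typedef struct {
	bool is_corlib;
} toy_image;

typedef struct {
	const toy_image *image;
} toy_class;

/* Decide whether override verification should run for a class. */
static bool
toy_should_verify_overrides (const toy_class *klass)
{
	assert (klass != NULL && klass->image != NULL);
#ifdef ENABLE_CHECKED_BUILD
	/* Checked/debug builds keep the additional checks for every type. */
	return true;
#else
	/* Other builds trust corlib and only verify types from other images. */
	return !klass->image->is_corlib;
#endif
}
```

Types outside corlib are verified in every configuration; only the corlib answer changes with the build flavor.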