-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inliner: Extend IL limit for profiled call-sites, allow inlining for switches. #55478
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we see any switches getting inlined in the P0/P1 tests? If not, consider adding some test cases for them.
Also we might want to enhance jit stress to explore some of these same new inlining capabilities.
Otherwise LGTM.
src/coreclr/jit/fgprofile.cpp
Outdated
|
||
if ((fgFirstBB != nullptr) && (fgPgoSource == ICorJitInfo::PgoSource::Static)) | ||
{ | ||
const BasicBlock::weight_t sufficientSamples = 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This used to be 1000 -- I assume you're increasing this to keep prejit image size small?
If so, you should add a comment describing how this value influences prejit size.
If not, you might comment on what the impact of changing this would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1000 samples feels like a plenty of evidence that something is hot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted to 1000. Yes, this changes was supposed to decrease the prejitted size, I was using this histogram:
(weights in SPC, the right column starts at 50000).
However, I don't need to save some space with it anymore as I've found an unrelated issue that bloated size for no reason (binary expressions like "arg op cns" used to leave "cns" on top of the pushed stack so there were lots of false-positive foldable-branches/switches).
Not sure about the runtime tests yet (will check), but jit-diff (
Diff is negative! for a change that supposed to inline more methods than in the baseline. |
…it produced many false-positive foldable switches/branches.
…nto jit-inliner-3
One more note regarding switches - runtime tests have several hundreds of them including foldable ones. But even non-foldable can be inlined in the stress mode, because we apply additional +10 multiplier in that mode. |
…debugger_custom_views * 'main' of github.com:thaystg/runtime: (125 commits) [wasm] [debugger] Support method calls (dotnet#55458) [debugger] Fix debugging after hot reloading (dotnet#55599) Inliner: Extend IL limit for profiled call-sites, allow inlining for switches. (dotnet#55478) DiagnosticSourceEventSource supports base class properties (dotnet#55613) [mono] Fix race during mono_image_storage_open (dotnet#55201) [mono] Add wrapper info for native func wrappers. (dotnet#55602) H/3 and Quic AppContext switch (dotnet#55332) Compression.ZipFile support for Unix Permissions (dotnet#55531) [mono] Fix skipping of static methods during IMT table construction. (dotnet#55610) Combine System.Private.Xml TrimmingTests projects (dotnet#55606) fix name conflict with Configuration class (dotnet#55597) Finish migrating RSAOpenSsl from RSA* to EVP_PKEY* Disable generic math (dotnet#55540) Obsolete CryptoConfig.EncodeOID (dotnet#55592) Address System.Net.Http.WinHttpHandler's nullable warnings targeting .NETCoreApp (dotnet#54995) Enable Http2_MultipleConnectionsEnabled_ConnectionLimitNotReached_ConcurrentRequestsSuccessfullyHandled (dotnet#55572) Fix Task.WhenAny failure mode when passed ICollection of zero tasks (dotnet#55580) Consume DistributedContextPropagator in DiagnosticsHandler (dotnet#55392) Add property ordering feature (dotnet#55586) Reduce subtest count in Reflection (dotnet#55537) ...
This PR:
switch
(closes JIT: Methods with switches aren't inlineable #55336) - these used to be non-inlineable always.As the result - it significantly improves several TE benchmarks in FullPGO mode (will post the results for all of the TE benchmarks later):
SPC's R2R size with
--Ot
: 10.35Mb, +0.9% size increase (it's 10.42Mb in Main).SPC's R2R size with
-O
(default mode): 9.93Mb