[API Proposal]: MethodImplOptions.Cold for marking cold methods for the JIT #84333
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
Background and motivation
Having methods that always perform expensive operations, such as the growing methods in growable collections, is common in high performance code like the BCL. It'd be useful to be able to mark those with a flag being the reverse of AggressiveInlining, which would tell the JIT to always treat blocks that end up with a call to it as cold, heavily pessimize inlining it unless a method always ends up calling it, and prefer hoisting branches with it to the end of the generated code, similar to how throw helpers are currently handled.
I've thought about this while working on #82146 and noticing the JIT placing the call to Grow, which is expected to be rare and expensive, above a single mov which is expected to happen in 90% of cases.
I'm not really sure what the exact name of the flag should be, my ideas include: Cold, Unlikely, Rare, Expensive.
cc @tannergooding @EgorBo @AndyAyersMS
API Proposal
namespace System.Runtime.CompilerServices;
public enum MethodImplOptions
{
    Cold = 1024
}
API Usage
There are quite a few places in the BCL that have "always cold" methods that should be marked with it, for example List.Grow that always ends up allocating a new array and copying data to it:
runtime/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs
Lines 447 to 455 in e13f0dc
Alternative Designs
Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly, or using more flexible method intrinsics from #4966 that are however harder to implement and require the library user, not the author, to mark the code with those.
Risks
Partial overlap with Assume intrinsics and PGO, more work for the JIT to do.
|
At scale, PGO is going to be more reliable and cheaper than annotating methods manually. Also, PGO is AOT friendly and does not need to slow down the startup when the data is collected statically ahead of time. |
PGO can also say that a method is profitable to be inlined, yet Even if static PGO can be collected for AOT apps, I doubt most apps will do so, afaik most native projects do not rely on PGO. |
From my understanding your attribute should just slightly improve code layout by moving |
As I mentioned offline, if you merge something like this into the BCL, wait a few days for our profiling process to update the data that feeds into See eg #49520 (comment) where we had similar discussions. This process should work equally well for the library code shared by Native AOT, though I am not sure if the profile data gets fed into that toolchain currently. To see similar effects directly during BCL development you can run a suitable benchmark with TieredPGO enabled and inspect the Tier1 codegen. |
You could say the same about throw helpers, reordering them manually is always an option, yet the JIT handles them as cold separately. |
While I agree PGO should be the main choice for most scenarios, there are always going to be scenarios where it is not applicable or where ensuring it gets the "right" data is significant effort. Static PGO is entirely dependent on the workloads you process as part of collection. Additionally, while we may someday achieve better parity with the extended optimizations native compilers may provide, I find it extremely unlikely that we will surpass them such that developer-provided hints are truly unnecessary.
.NET has multiple scenarios to consider, including both JIT and AOT, as well as a range of platforms (Windows, macOS, Linux, Android, iOS, etc). Not all of our features (such as Dynamic PGO) are available everywhere, nor do we ourselves use some of our functionality everywhere (e.g. Static PGO is still off for macOS by default, as far as I recall).
Native compilers, despite having significantly more time to spend doing intraprocedural analysis and optimizations in general, still provide multiple types of these "guided optimizations". That is, they provide both a type of static PGO and a type of code hints provided via attributes, custom pragma/keywords, or more recently official language features covering features that have long existed across multiple supporting compiler implementations. While most native code bases do not use these features, perf sensitive code bases and hot paths still do. Some much more extensively than others, particularly where the perf is extremely sensitive.
With the introduction of official language features around these attributes, code bases are more likely to adopt them in the future as well. This has been seen repeatedly over the years with other features that have moved from being compiler specific to official features.
Allowing developers to provide basic hints around "hot" (likely) vs "cold" (unlikely) is a natural next step for .NET and meshes nicely with the overall directions we've been making. The two biggest issues would be:
For 1, I think the solution is relatively simple as we have existing prior art. We should simply use one of the remaining free bits to indicate that an "extended attribute" exists and should be resolved. This works much like how
For 2, I think we just need to make a decision. In general, I would presume that the intent is "developer hints" take precedence over "static PGO" but that they are less preferred to "dynamic PGO". That is |
The manual likely/unlikely annotations via method calls are more powerful. If we were to do something here, should we start with those? We do not need multiple mechanisms to do the same thing. |
I think that would be reasonable and cover the same general need. |
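For readers following along, here is a rough sketch of what call-site likely/unlikely hints could look like. The names BranchHints, Likely, Unlikely and Demo are hypothetical and are not an existing .NET API; in this sketch they are plain identity functions with no effect on codegen, and a real version would need the JIT to recognize them as intrinsics.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helpers (assumption: not part of the BCL). They return the
// condition unchanged; only a JIT that treats them as intrinsics could use
// them to weight the branch.
public static class BranchHints
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool Likely(bool condition) => condition;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool Unlikely(bool condition) => condition;
}

public static class Demo
{
    // The caller, not the callee, states that the error path is rare.
    public static int ParseDigit(char c)
    {
        if (BranchHints.Unlikely((uint)(c - '0') > 9))
        {
            throw new FormatException("Expected a decimal digit.");
        }

        return c - '0';
    }
}
```

Compared with a method-level flag, the hint here is attached to one branch at one call site, which is what makes it more flexible but also pushes the annotation burden onto the caller.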
The thing is that a flag like this is slightly different:
|
The annotation via a method call can be the very first thing in a method to achieve the same effect.
Yes, there are multiple different variants of cold as you have pointed out. Almost never executed (e.g. throw helper), executed typically once (e.g. static constructor or one time initialization), executed relatively less often (your
We do not want to have flags like this with overloaded meaning. We have an overloaded flag like that today (
It is not that simple once you address the problem with shortage of MethodImplOptions bits that @tannergooding pointed out. |
Yes, this is potentially a useful distinction. However, it can be problematic particularly for public API surface in that often whether something is "likely" or "unlikely" is per caller. It's often only truly declarable by the callee in a few cases such as
Saying something is cold is saying it is unlikely to be repeatedly executed and therefore typically won't show up in the flamegraph of a profile. There are indeed different kinds of cold here from "never expected" to "expected once" to "expected rarely" and that may be worth considering.
They aren't that uncertain. They come with whatever semantics/heuristics we want to give them, much as they do in C/C++. Marking a block as There are other complications with the method attr as well, such as determining what happens if the block contains more than just one cold call or if we wanted to expand to also allow annotating hot calls, etc. |
Background and motivation
Having methods that always perform expensive operations, such as the growing methods in growable collections, is common in high performance code like the BCL. It'd be useful to be able to mark those with a flag being the reverse of AggressiveInlining, which would tell the JIT to always treat blocks that end up with a call to it as cold, heavily pessimize inlining it unless a method always ends up calling it, and prefer hoisting branches with it to the end of the generated code, similar to how throw helpers are currently handled.
I've thought about this while working on #82146 and noticing the JIT placing the call to Grow, which is expected to be rare and expensive, above a single mov which is expected to happen in 90% of cases.
I'm not really sure what the exact name of the flag should be, my ideas include: Cold, Unlikely, Rare, Expensive.
cc @tannergooding @EgorBo @AndyAyersMS
API Proposal
namespace System.Runtime.CompilerServices;
public enum MethodImplOptions
{
    Cold = 1024
}
API Usage
There are quite a few places in the BCL that have "always cold" methods that should be marked with it, for example List.Grow that always ends up allocating a new array and copying data to it:
runtime/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs
Lines 447 to 455 in e13f0dc
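To make the intended usage concrete, here is a minimal sketch of the hot/cold split in a growable collection. GrowableBuffer and its members are illustrative names, not the actual List code referenced above. MethodImplOptions.Cold is only the proposed flag and does not exist, so it appears in a comment; NoInlining is used as the closest approximation available today.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative type (assumption: not BCL code).
public sealed class GrowableBuffer<T>
{
    private T[] _items = new T[4];
    private int _size;

    public int Count => _size;

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_size)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _items[index];
        }
    }

    // Hot path: a bounds check and a store in the common case.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Add(T item)
    {
        T[] items = _items;
        int size = _size;
        if ((uint)size < (uint)items.Length)
        {
            _size = size + 1;
            items[size] = item;
        }
        else
        {
            AddWithResize(item);
        }
    }

    // Cold path: always allocates and copies.
    // Under the proposal this would be [MethodImpl(MethodImplOptions.Cold)];
    // that flag is hypothetical, so NoInlining stands in for it here.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void AddWithResize(T item)
    {
        int size = _size;
        Grow(size + 1);
        _size = size + 1;
        _items[size] = item;
    }

    private void Grow(int capacity)
    {
        int newCapacity = Math.Max(_items.Length * 2, capacity);
        Array.Resize(ref _items, newCapacity);
    }
}
```

NoInlining only keeps the resize logic out of the caller; it does not tell the JIT that the branch reaching it is unlikely, which is the gap the proposed flag is meant to cover.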
Alternative Designs
Alternative designs include relying on PGO to detect such methods, which is less reliable, slows down startup and isn't as AOT friendly, or using more flexible method intrinsics from #4966 that are however harder to implement and require the library user, not the author, to mark the code with those.
Risks
Partial overlap with Assume intrinsics and PGO, more work for the JIT to do.