Use a separate thread for tiered compilation background work #45901
MusicStore Windows affinity: 1 proc [figures: Before, After]
MusicStore Linux --cpuset-cpus=1 [figures: Before, After]
MusicStore Linux --cpus=1 [figures: Before, After]
Other testing
The diff of |
There might be a little fishiness in the cross-thread signaling but other than that all looked good to me.
- Makes it easier to manage how much time is spent performing background work like rejitting, and allows yielding more frequently with just Sleep without incurring thread pool overhead, which is useful in CPU-limited cases
- A min/max range is determined for how long background work will be done before yielding the thread. The max is the same as before, 50 ms. For now the min is `processor count` ms (capped to the max), such that in CPU-limited cases the thread would yield more frequently in order to not monopolize too much of the limited CPU resources for background work, and in cases with a larger number of processors, where the background work is typically less intrusive to foreground work, it would yield less frequently.
- At the same time, progress should be made on background work such that steady-state perf would be reached in reasonable time. Yielding too frequently can slow down the background work too much. The sleep duration is measured to identify oversubscribed situations, in which the thread yields less frequently and makes faster progress on the background work.
- Due to less time spent rejitting in some CPU-limited cases, steady-state performance may be reached a bit later in favor of fewer spikes along the way
- When the portable thread pool is enabled, a side effect of using a managed worker thread for tiering background work was that several GC-heavy microbenchmarks regressed. Tiering was the only thing using the thread pool in those tests, and stack-walking the managed thread was slower due to the presence of GC refs. It's not too concerning; the benchmarks are just measuring something different from before, but in any case this change also resolves that issue.

Fixes dotnet#44211.
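As a rough illustration of the min/max range described above, here is a minimal Python sketch. It is not the runtime's code. The 50 ms max and the `processor count` ms min (capped to the max) come from the description; the function names and, in particular, the way a measured sleep overrun stretches the work slice are assumptions made only for illustration.

```python
MAX_WORK_DURATION_MS = 50  # from the description: the max is unchanged at 50 ms


def min_work_duration_ms(processor_count: int) -> int:
    """Minimum time to work before yielding: `processor count` ms, capped to the max."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    return min(processor_count, MAX_WORK_DURATION_MS)


def next_work_duration_ms(processor_count: int,
                          requested_sleep_ms: float,
                          measured_sleep_ms: float) -> float:
    """Pick how long to work before the next yield.

    ASSUMPTION (hypothetical policy, not taken from the PR): if the sleep took
    longer than requested, treat the machine as oversubscribed and scale the
    work slice up by the overrun ratio, never exceeding the max.
    """
    if requested_sleep_ms <= 0:
        raise ValueError("requested_sleep_ms must be positive")
    low = min_work_duration_ms(processor_count)
    # A ratio below 1 (sleep returned early) is clamped so the slice never
    # drops under the minimum.
    overrun = max(1.0, measured_sleep_ms / requested_sleep_ms)
    return min(low * overrun, MAX_WORK_DURATION_MS)
```

With one processor the sketch yields after about 1 ms of work when sleeps return on time, and with 64 processors the min and max coincide at 50 ms, which matches the intent that larger machines yield less often.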
Rebased to fix conflicts
Looks good, thanks Kount!