Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fix hill climbing float overflow #14505

Merged
merged 2 commits into from
Oct 19, 2017
Merged

Fix hill climbing float overflow #14505

merged 2 commits into from
Oct 19, 2017

Conversation

kouvel
Copy link
Member

@kouvel kouvel commented Oct 14, 2017

  • When hill climbing finds that it wants to decrease the thread count but can't because the thread count is already the minimum, it instead tries to increase the sampling interval by a factor of up to 10 depending on how much it wanted to decrease the thread count
  • The ratio was being used incorrectly (used max instead of min), and sometimes the ratio can be so large that the conversion to int after the float math overflows
  • If something in the process enabled floating point exceptions, it may also crash
    • There doesn't appear to be a clean way to disable hill climbing, added a config variable that disables it in case a workaround is necessary for some other reason in the future
  • Fixed to avoid overflow in the math to what was probably intended
  • There may be another bug in GetWaveComponent() that causes values of such high magnitude to be generated, I'll leave that investigation for when that in particular becomes a real issue

- When hill climbing finds that it wants to decrease the thread count but can't because the thread count is already the minimum, it instead tries to increase the sampling interval by a factor of up to 10 depending on how much it wanted to decrease the thread count
- The ratio was being used incorrectly (used max instead of min), and sometimes the ratio can be so large that the conversion to int after the float math overflows
- If something in the process enabled floating point exceptions, it may also crash
  - There doesn't appear to be a clean way to disable hill climbing, added a config variable that disables it in case a workaround is necessary for some other reason in the future
- Fixed to avoid overflow in the math to what was probably intended
- There may be another bug in GetWaveComponent() that causes values of such high magnitude to be generated, I'll leave that investigation for when that in particular becomes a real issue
@kouvel kouvel added this to the 2.1.0 milestone Oct 14, 2017
@kouvel kouvel self-assigned this Oct 14, 2017
@kouvel kouvel requested a review from vancem October 14, 2017 05:29
@kouvel kouvel requested a review from stephentoub October 18, 2017 02:52
@kouvel
Copy link
Member Author

kouvel commented Oct 18, 2017

Ping for review please

if (MinLimitTotalWorkerThreads == MaxLimitTotalWorkerThreads)
{
IsHillClimbingDisabled = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the min and max are changed by the dev with calls to ThreadPool.SetMin/MaxThreads. Could we go from a situation where hill climbing is disabled to one where it shouldn't be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, this is unnecessary as well, I'll remove it

@@ -335,7 +335,7 @@ int HillClimbing::Update(int currentThreadCount, double sampleDuration, int numC
// we'll simply stay at minThreads much longer, and only occasionally try a higher value.
//
if (ratio.r < 0.0 && newThreadCount == ThreadpoolMgr::MinLimitTotalWorkerThreads)
*pNewSampleInterval = (int)(0.5 + m_currentSampleInterval * (10.0 * max(-ratio.r, 1.0)));
*pNewSampleInterval = (int)(0.5 + m_currentSampleInterval * (10.0 * min(-ratio.r, 1.0)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the overflow concerns, what kind of impact was the max vs min having on actual thread growth? I'm wondering if this might actually have substantial implications and need some broader testing on a variety of scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread growth calculation was doing the proper min/max above this code to limit ratio.r to {-1..1}. An effect of this issue is that the new sample interval becomes int32_min after overflow (exception on overflow is disabled by default) and it's added to a GetTickCount() value to determine the next threshold for sampling (stored as unsigned), so the interval becomes very large and hill climbing gets disabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread growth calculation was doing the proper min/max above this code to limit ratio.r to {-1..1}.

Ah. Ok.

@kouvel
Copy link
Member Author

kouvel commented Oct 19, 2017

@stephentoub, can I consider this reviewed?

@kouvel
Copy link
Member Author

kouvel commented Oct 19, 2017

@dotnet-bot test OSX10.12 x64 Checked Build and Test

@kouvel
Copy link
Member Author

kouvel commented Oct 19, 2017

Thanks!

@kouvel kouvel merged commit a589e39 into dotnet:master Oct 19, 2017
@kouvel kouvel deleted the HillClimbingFix branch October 19, 2017 03:51
kouvel added a commit to kouvel/runtime that referenced this pull request Oct 20, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants