Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle early wakeup of counter loop #57170

Merged

Conversation

josalem
Copy link
Contributor

@josalem josalem commented Aug 10, 2021

Fix #56695

  • prevent a negative delta value

Copy of comment from #56695:

I think I've worked out what happened.

The new logic:

TimeSpan delta = now - _nextPollingTimeStamp;
if (delta > TimeSpan.Zero && _pollingIntervalInMilliseconds > 0)
	_nextPollingTimeStamp += TimeSpan.FromMilliseconds(_pollingIntervalInMilliseconds * Math.Ceiling(delta.TotalMilliseconds / _pollingIntervalInMilliseconds));

can have wonky behavior if the sleep in CounterGroup.PollForValues wakes up early.

delta can be negative, which will result in _nextPollingTimeStamp not being updated. This means that sleepDurationInMilliseconds will be set to 1 and the counter will fire again on the next loop because _nextPollingTimeStamp < now + new TimeSpan.FromMilliseconds(1) will be true.

private static void PollForValues()
{
AutoResetEvent? sleepEvent = null;
// Cache of onTimer callbacks for each CounterGroup.
// We cache these outside of the scope of s_counterGroupLock because
// calling into the callbacks can cause a re-entrancy into CounterGroup.Enable()
// and result in a deadlock. (See https://github.com/dotnet/runtime/issues/40190 for details)
var onTimers = new List<CounterGroup>();
while (true)
{
int sleepDurationInMilliseconds = int.MaxValue;
lock (s_counterGroupLock)
{
sleepEvent = s_pollingThreadSleepEvent;
foreach (CounterGroup counterGroup in s_counterGroupEnabledList!)
{
DateTime now = DateTime.UtcNow;
if (counterGroup._nextPollingTimeStamp < now + new TimeSpan(0, 0, 0, 0, 1))
{
onTimers.Add(counterGroup);
}
int millisecondsTillNextPoll = (int)((counterGroup._nextPollingTimeStamp - now).TotalMilliseconds);
millisecondsTillNextPoll = Math.Max(1, millisecondsTillNextPoll);
sleepDurationInMilliseconds = Math.Min(sleepDurationInMilliseconds, millisecondsTillNextPoll);
}
}
foreach (CounterGroup onTimer in onTimers)
{
onTimer.OnTimer();
}
onTimers.Clear();
if (sleepDurationInMilliseconds == int.MaxValue)
{
sleepDurationInMilliseconds = -1; // WaitOne uses -1 to mean infinite
}
sleepEvent?.WaitOne(sleepDurationInMilliseconds);
}
}

A fix would be to switch to the following:

private const TimeSpan MinimumPollingInterval = TimeSpan.FromSeconds(1);

// ...

TimeSpan delta = now - _nextPollingTimeStamp;
delta = MinimumPollingInterval > delta ? MinimumPollingInterval : delta;
if (_pollingIntervalInMilliseconds > 0)
	_nextPollingTimeStamp += TimeSpan.FromMilliseconds(_pollingIntervalInMilliseconds * Math.Ceiling(delta.TotalMilliseconds / _pollingIntervalInMilliseconds));

The now - _nextPollingTimeStamp is calculating the difference between when the counter woke up (now) and when it was supposed to wake up (_nextPollingTimeStamp). This is supposed to then increment _nextPollingTimeStamp to the next discrete increment of _pollingIntervalInMillseconds from the origin time. This prevents the interval from sliding if counters take sufficiently long amounts of time to calculate. This updated logic without the loop, however, only increments if now > _nextPollingTimeStamp (if the counter woke up after the intended timestamp).

We'll need to backport these changes to 3.1 and 5.0 if they are approved.

* prevent a negative delta value
@josalem josalem added this to the 6.0.0 milestone Aug 10, 2021
@josalem josalem requested a review from a team August 10, 2021 20:50
@josalem josalem self-assigned this Aug 10, 2021
@ghost
Copy link

ghost commented Aug 10, 2021

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.

Issue Details

Fix #56695

  • prevent a negative delta value

Copy of comment from #56695:

I think I've worked out what happened.

The new logic:

TimeSpan delta = now - _nextPollingTimeStamp;
if (delta > TimeSpan.Zero && _pollingIntervalInMilliseconds > 0)
	_nextPollingTimeStamp += TimeSpan.FromMilliseconds(_pollingIntervalInMilliseconds * Math.Ceiling(delta.TotalMilliseconds / _pollingIntervalInMilliseconds));

can have wonky behavior if the sleep in CounterGroup.PollForValues wakes up early.

delta can be negative, which will result in _nextPollingTimeStamp not being updated. This means that sleepDurationInMilliseconds will be set to 1 and the counter will fire again on the next loop because _nextPollingTimeStamp < now + new TimeSpan.FromMilliseconds(1) will be true.

private static void PollForValues()
{
AutoResetEvent? sleepEvent = null;
// Cache of onTimer callbacks for each CounterGroup.
// We cache these outside of the scope of s_counterGroupLock because
// calling into the callbacks can cause a re-entrancy into CounterGroup.Enable()
// and result in a deadlock. (See https://github.com/dotnet/runtime/issues/40190 for details)
var onTimers = new List<CounterGroup>();
while (true)
{
int sleepDurationInMilliseconds = int.MaxValue;
lock (s_counterGroupLock)
{
sleepEvent = s_pollingThreadSleepEvent;
foreach (CounterGroup counterGroup in s_counterGroupEnabledList!)
{
DateTime now = DateTime.UtcNow;
if (counterGroup._nextPollingTimeStamp < now + new TimeSpan(0, 0, 0, 0, 1))
{
onTimers.Add(counterGroup);
}
int millisecondsTillNextPoll = (int)((counterGroup._nextPollingTimeStamp - now).TotalMilliseconds);
millisecondsTillNextPoll = Math.Max(1, millisecondsTillNextPoll);
sleepDurationInMilliseconds = Math.Min(sleepDurationInMilliseconds, millisecondsTillNextPoll);
}
}
foreach (CounterGroup onTimer in onTimers)
{
onTimer.OnTimer();
}
onTimers.Clear();
if (sleepDurationInMilliseconds == int.MaxValue)
{
sleepDurationInMilliseconds = -1; // WaitOne uses -1 to mean infinite
}
sleepEvent?.WaitOne(sleepDurationInMilliseconds);
}
}

A fix would be to switch to the following:

private const TimeSpan MinimumPollingInterval = TimeSpan.FromSeconds(1);

// ...

TimeSpan delta = now - _nextPollingTimeStamp;
delta = MinimumPollingInterval > delta ? MinimumPollingInterval : delta;
if (_pollingIntervalInMilliseconds > 0)
	_nextPollingTimeStamp += TimeSpan.FromMilliseconds(_pollingIntervalInMilliseconds * Math.Ceiling(delta.TotalMilliseconds / _pollingIntervalInMilliseconds));

The now - _nextPollingTimeStamp is calculating the difference between when the counter woke up (now) and when it was supposed to wake up (_nextPollingTimeStamp). This is supposed to then increment _nextPollingTimeStamp to the next discrete increment of _pollingIntervalInMillseconds from the origin time. This prevents the interval from sliding if counters take sufficiently long amounts of time to calculate. This updated logic without the loop, however, only increments if now > _nextPollingTimeStamp (if the counter woke up after the intended timestamp).

We'll need to backport these changes to 3.1 and 5.0 if they are approved.

Author: josalem
Assignees: josalem
Labels:

area-System.Diagnostics.Tracing

Milestone: 6.0.0

* bug fix is preexisting: if a user other than the one who is listening sets the interval to 0 while the counter is enabled, it makes the counter continuously submit values. Changed this so that an interval value of <= 0 results in the counter being disabled
@josalem
Copy link
Contributor Author

josalem commented Aug 13, 2021

CI doesn't seem to be starting for some reason...

@josalem
Copy link
Contributor Author

josalem commented Aug 13, 2021

/azp run runtime

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ghost
Copy link

ghost commented Aug 13, 2021

Hello @hoyosjs!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@hoyosjs
Copy link
Member

hoyosjs commented Aug 13, 2021

Failuer is not unique to this PR (see #57174)

@hoyosjs
Copy link
Member

hoyosjs commented Aug 13, 2021

Created
#57375

@hoyosjs hoyosjs merged commit 3042bd7 into dotnet:main Aug 13, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Strange events from System.Runtime EventCounters under Linux
4 participants