Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.x: Fix PeriodicTimerSystemClockMonitor concurrency & failure behavior #528

Merged

Conversation

akarnokd
Copy link
Collaborator

This PR fixes the PeriodicTimerSystemClockMonitor to:

  • avoid restarting the timer when a new listener joins,
  • avoid stopping the timer when one listener leaves (and there could be others still listening),
  • avoid the data races between the NewTimer and the TimerChanged which involves using unix time milliseconds for the absolute time that can be atomically manipulated via Volatile/Interlocked,
  • do not fail if the SYNC_MAXRETRIES has been reached but start backing off and sleeping a bit before retrying.

Fixes: #270

@danielcweber
Copy link
Collaborator

avoid restarting the timer when a new listener joins

This probably is not really what happens since there is AddRef and Release in SystemClock, although it should definitely be handled by the PeriodicTimerSystemClockMonitor.


if (n >= SYNC_MAXRETRIES)
throw new InvalidOperationException(Strings_Core.FAILED_CLOCK_MONITORING);
if (Math.Abs(SystemClock.UtcNow.ToUnixTimeMilliseconds() - now) <= SYNC_MAXDELTA)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does a new timer have to be created, just because the previous one took to long to create? Can't we deal with that otherwise?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my understanding of the original logic:

The logic saves the Now in the _lastTime field and when the scheduled task runs, it reads it out to determine how much time has passed since the start of the timer and the first tick of the timer. If, for some reason the start itself gets delayed and runs eventually, you may end up with a clock-drift notification right away. Instead, the previous timer is cancelled and a new round is attempted. I'd think when the drift does actually happen this very first time, the timer action will restart the timer anyway so my guess is that this just makes the syncing happen earlier.

@akarnokd
Copy link
Collaborator Author

This probably is not really what happens since there is AddRef and Release in SystemClock

Indeed, interesting. Kind of hard to determine the intention here since the class itself is public and the notification is designed with an event: both implying the feature could be used from the outside and perhaps on multiple occasions. I'd say the safety should be added because of it.

@danielcweber
Copy link
Collaborator

I'd say the safety should be added because of it.

Definitely, but can we get rid of AddRef and Release then?

@akarnokd
Copy link
Collaborator Author

I don't think so. If removed, it would do this all the time:

s_serviceSystemClockChanged.Value.SystemClockChanged += OnSystemClockChanged;

potentially registering that method over and over resulting in duplicate notifications.

if (Interlocked.Increment(ref _timerActive) == 1)
{
NewTimer();
}

_systemClockChanged += value;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't thread safe, if we deal with Interlocked above, this all should go into a lock then.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though += on a delegate is generates as an atomic operation by the compiler.

Copy link
Collaborator

@danielcweber danielcweber Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this is just non atomic read, change and write operation on a field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't changed that line so if that's true, you've been running with an unsafe add for a long time now. Or, the compiler is smart enough to produce += as an atomic exchange:

https://stackoverflow.com/a/3522556/61158

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler would do magic if the event add/remove code wasn't explicitly implemented. I think the unsafe add is harmless since that class only has only one consumer anyway. That being said, I agree that all of that looks horribly broken.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really odd, what this code does (creating a resource on first subscribe etc) is solvable easily by Rx itself! It should be redone with Rx.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would create a cyclic dependency with the standard Rx operators. What if you'd want to subscribe to this after some time delay, which would require this to be setup already...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert the atomic counting in the event, the class and events are not thread safe even without, and I guess they don't have to be. I'll merge this PR then.

@danielcweber
Copy link
Collaborator

I think an oddity here is that the code could perfectly well use Rx itself...FromEvent, RefCount, all that stuff.

@akarnokd
Copy link
Collaborator Author

I think there would be a chance of circular dependency when schedulers come into play.

@danielcweber danielcweber merged commit bd7178f into dotnet:master Jun 5, 2018
@akarnokd akarnokd deleted the PeriodicTimerSystemClockMonitorFix branch June 26, 2018 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed to start monitoring system clock changes
2 participants