Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subscriber: lighter-weight timestamp recording #254

Open
hawkw opened this issue Jan 7, 2022 · 3 comments
Open

subscriber: lighter-weight timestamp recording #254

hawkw opened this issue Jan 7, 2022 · 3 comments
Assignees
Labels
C-subscriber Crate: console-subscriber. S-performance Severity: performance. S-refactor Severity: refactor.

Comments

@hawkw
Copy link
Member

hawkw commented Jan 7, 2022

Currently, console-subscriber records every timestamp by calling SystemTime::now(). This is not necessarily the most efficient way to record a timestamp; every SystemTime::now() call requires a system call.

Also, perhaps more importantly, I don't think using system timestamps is really the most correct approach. The main reason the console records timestamps at all is to calculate durations. However, the system clock can go backwards; it's not guaranteed to be monotonic. This means that durations calculated using SystemTime might be incorrect. We should really be using monotonic time, such as Instants.

I believe the main reason console-subscriber doesn't currently use monotonic timestamps is that it sends timestamps on the wire using prost_types::Timestamp, which has a From<SystemTime> impl. Since Instants are opaque, there's no From<Instant> implementation for protobuf timestamps.

One solution might be to switch to recording timestamps using Instants, but anchor them against a fixed SystemTime, so that they can be converted into SystemTimes. The way we would do this is by recording a fixed SystemTime and Instant pair when the ConsoleLayer is created. Then, when we need to serialize an Instant timestamp, we would take the duration elapsed between that Instant and the "anchor" instant. Since the anchor Instant and anchor SystemTime represent (nearly) the same timestamp, we could then convert the Instant timestamp into a SystemTime for serialization, by adding the elapsed duration from the anchor Instant to the anchor SystemTime.

Another option we might want to consider is using the quanta crate by @tobz. This would offer a few advantages. In particular, quanta::Instant does provide an Into<prost_types::Timestamp> implementation. Additionally, quanta::Instants can be converted into u64s (unlike std's instants). This would allow them to be stored in atomics, which could be beneficial for #238. Finally, quanta may introduce less overhead compared std::time::Instant, at least on x86 platforms, where it uses rdtsc

@hawkw hawkw added C-subscriber Crate: console-subscriber. S-performance Severity: performance. S-refactor Severity: refactor. labels Jan 7, 2022
@hawkw hawkw self-assigned this Jan 7, 2022
@tobz
Copy link
Member

tobz commented Jan 8, 2022

For what it's worth: Instant::as_u64 has been removed in a yet-to-be-released change as part of slimming down the public API prior to a 1.0 release. Users have used crossbeam's AtomicCell to atomically store an Instant, though, as it can safely be transmuted to an AtomicU64.

As far as the conversion to prost_types::Timestamp, it does exist, but related to your comment about system time not being monotonic, I haven't thought very deeply about the nature of what it would mean to always convert a quanta::Instant to what is effectively std::time::SystemTime, in terms of accurately representing time that is anchored to UTC. While our calibration source is always the equivalent of CLOCK_MONOTONIC, which is adjusted to ultimately converge with an NTP-corrected version of UTC, we don't re-calibrate ourselves over time, so there is the possibility for drift.

There's likely a way, or ways, to compensate for this if conversion to an UTC-anchored time is required, and I'm happy to try and provide an honest description of what quanta can and cannot provide here if you do want to consider seriously switch to it.

@hawkw
Copy link
Member Author

hawkw commented Jan 8, 2022

For what it's worth: Instant::as_u64 has been removed in a yet-to-be-released change as part of slimming down the public API prior to a 1.0 release. Users have used crossbeam's AtomicCell to atomically store an Instant, though, as it can safely be transmuted to an AtomicU64.

Hmm, I'm a bit bummed about this, as I would have kind of preferred to be able to do it without requiring an additional dependency on crossbeam just to get the AtomicCell type. But, that's fine, I guess.

but related to your comment about system time not being monotonic, I haven't thought very deeply about the nature of what it would mean to always convert a quanta::Instant to what is effectively std::time::SystemTime, in terms of accurately representing time that is anchored to UTC.

In this case, we don't actually need a UTC-anchored timestamp at all; all these timestamps are only used to calculate durations. The only reason we send timestamps on the wire rather than durations is so that the console UI can continue to calculate durations across updates.

It's possible what we really should be doing, rather than using protobuf's timestamp type in the wire format, is sending our own monotonic, non-UTC instant type instead...but I was lazy and didn't want to write that.

@hawkw
Copy link
Member Author

hawkw commented Jan 8, 2022

I guess crossbeam_util is probably in enough people's dependency trees already that it's maybe okay to add the dependency...

hawkw added a commit that referenced this issue Feb 17, 2022
## Motivation

Currently, if compiled in debug mode, the `console-subscriber` crate's
`PollStats::end_poll` function will panic if a poll duration is negative
(i.e. if the poll's end
timestamp occurred before its start timestamp). Because the console
currently uses non-monotonic `SystemTime`s rather than monotonic
`Instant`s (due to serialization requirements), this means we are
potentially succeptible to clock skew adjustments. If one occurs during
a poll, and the system clock goes backwards, we may panic here. This
isn't great!

## Solution

This branch fixes the panic by removing the assertions, and changing
`end_poll` to simply bail and print a warning rather than panicking when
encountering a negative poll duration. This is a short-term solution; a
better long-term solution would be to change `console-subscriber` to use
monotonic `Instant`s rather than non-monotonic `SystemTime`s (as
discussed [here][1] and [here][2]).

Fixes #286

[1]:#254
[2]: #286 (comment)
hawkw added a commit that referenced this issue Feb 17, 2022
…287)

## Motivation

Currently, if compiled in debug mode, the `console-subscriber` crate's
`PollStats::end_poll` function will panic if a poll duration is negative
(i.e. if the poll's end
timestamp occurred before its start timestamp). Because the console
currently uses non-monotonic `SystemTime`s rather than monotonic
`Instant`s (due to serialization requirements), this means we are
potentially succeptible to clock skew adjustments. If one occurs during
a poll, and the system clock goes backwards, we may panic here. This
isn't great!

## Solution

This branch fixes the panic by removing the assertions, and changing
`end_poll` to simply bail and print a warning rather than panicking when
encountering a negative poll duration. This is a short-term solution; a
better long-term solution would be to change `console-subscriber` to use
monotonic `Instant`s rather than non-monotonic `SystemTime`s (as
discussed [here][1] and [here][2]).

Fixes #286

[1]:#254
[2]: #286 (comment)
hawkw added a commit that referenced this issue Feb 17, 2022
## Motivation

Currently, the `console-subscriber` crate records all timestamps as
`SystemTime`s. This is because they are eventually sent over the wire as
protobuf `Timestamp`s, which can be constructed from a `SystemTime`.
They cannot be constructed from `Instant`s, because `Instant` is opaque,
and does not expose access to the underlying OS time.

However, using `SystemTime` is not really correct for our use case. We
only use timestamps for calculating durations; we only have to serialize
them because some durations are calculated in the console UI rather than
in-process. We *don't* need timestamps that are globally consistent with
a shared timebase, but we *do* need monotonicity --- using `SystemTime`
leaves us vulnerable to clock skew, if (for example), an NTP clock skew
adjustment causes the system clock to run backwards far enough that a
poll appears to end "before" it started (as in issue #286). If we were
using monotonic `Instant`s, all polls should always have positive
durations, but with `SystemTime`s, this isn't necessarily the case.

Furthermore, `Instant::now()` may have less performance overhead than
`SystemTime::now()`, at least on some platforms.

## Solution

This branch changes `console-subscriber` to always take timestamps using
`Instant::now()` rather than using `SystemTime::now()`, and store all
timestamps as `Instant`s. In order to convert these `Instant`s into
`SystemTime`s that can be sent over the wire, we construct a reference
`TimeAnchor`, consisting of a paired `Instant` and `SystemTime` recorded
at the same time when the `ConsoleLayer` is constructed. We can then
construct "system times" that are monotonic, by calculating the duration
between a given `Instant` and the anchor `Instant`, and adding that
duration to the anchor `SystemTime`. These are not *real* system
timestamps, as they will never run backwards if the system clock is
adjusted; they are relative only to the base process start time as
recorded by the anchor. However, they *are* monotonic, and all durations
calculated from them will be reasonable.

This is part of the change I proposed in #254. I'm not going to close
that issue yet, though, as it also described potentially switching to
use the `quanta` crate rather than `std::time::Instant` to reduce the
overhead of recording monotonic timestamps.

Fixes #286
hawkw added a commit that referenced this issue Feb 17, 2022
## Motivation

Currently, the `console-subscriber` crate records all timestamps as
`SystemTime`s. This is because they are eventually sent over the wire as
protobuf `Timestamp`s, which can be constructed from a `SystemTime`.
They cannot be constructed from `Instant`s, because `Instant` is opaque,
and does not expose access to the underlying OS time.

However, using `SystemTime` is not really correct for our use case. We
only use timestamps for calculating durations; we only have to serialize
them because some durations are calculated in the console UI rather than
in-process. We *don't* need timestamps that are globally consistent with
a shared timebase, but we *do* need monotonicity --- using `SystemTime`
leaves us vulnerable to clock skew, if (for example), an NTP clock skew
adjustment causes the system clock to run backwards far enough that a
poll appears to end "before" it started (as in issue #286). If we were
using monotonic `Instant`s, all polls should always have positive
durations, but with `SystemTime`s, this isn't necessarily the case.

Furthermore, `Instant::now()` may have less performance overhead than
`SystemTime::now()`, at least on some platforms.

## Solution

This branch changes `console-subscriber` to always take timestamps using
`Instant::now()` rather than using `SystemTime::now()`, and store all
timestamps as `Instant`s. In order to convert these `Instant`s into
`SystemTime`s that can be sent over the wire, we construct a reference
`TimeAnchor`, consisting of a paired `Instant` and `SystemTime` recorded
at the same time when the `ConsoleLayer` is constructed. We can then
construct "system times" that are monotonic, by calculating the duration
between a given `Instant` and the anchor `Instant`, and adding that
duration to the anchor `SystemTime`. These are not *real* system
timestamps, as they will never run backwards if the system clock is
adjusted; they are relative only to the base process start time as
recorded by the anchor. However, they *are* monotonic, and all durations
calculated from them will be reasonable.

This is part of the change I proposed in #254. I'm not going to close
that issue yet, though, as it also described potentially switching to
use the `quanta` crate rather than `std::time::Instant` to reduce the
overhead of recording monotonic timestamps.

Fixes #286
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-subscriber Crate: console-subscriber. S-performance Severity: performance. S-refactor Severity: refactor.
Projects
None yet
Development

No branches or pull requests

2 participants