server: configurable clock offset verification threshold #94999
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
We use RPC pings to estimate the remote clock offset, and terminate the node if its offset is more than `MaxOffset` away from the median clock reading.

Some clusters run with fairly precise clocks and may want to reduce `MaxOffset` to e.g. 10 ms. At these scales, uncertainty in the clock offset estimate can lead to false positives, prematurely terminating the node. Furthermore, the latency measurements from the RPC heartbeats can be wildly inaccurate because they are vulnerable to RPC head-of-line blocking (#93397), which also affects the clock uncertainty measurements.

We should provide an option to set an explicit threshold for these uncertainty checks, rather than using `MaxOffset`, such that nodes can run with e.g. a 10 ms max offset but only self-terminate when they see obviously incorrect clock offsets of e.g. 1 second.

Jira issue: CRDB-23265