CockroachDB requires moderate levels of clock synchronization to preserve data consistency. For this reason, when a node detects that its clock is out of sync with at least half of the other nodes in the cluster by 80% of the maximum offset allowed (500ms by default), it spontaneously shuts down. This avoids the risk of violating serializable consistency and causing stale reads and write skews, but it's important to prevent clocks from drifting too far in the first place by running NTP or other clock synchronization software on each node.
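As a rough illustration of the rule above, the following Python sketch decides whether a node would consider itself out of sync. The function name `should_shut_down`, the input format (one measured offset in milliseconds per other node), and the strict `>` comparison are assumptions made for this example; it is a simplified model, not CockroachDB's actual implementation.

```python
from typing import Sequence

MAX_OFFSET_MS = 500.0      # default maximum offset, as stated above
TOLERATED_FRACTION = 0.8   # the node reacts at 80% of the maximum offset


def should_shut_down(
    offsets_ms: Sequence[float],
    max_offset_ms: float = MAX_OFFSET_MS,
) -> bool:
    """Return True if this node is out of sync with at least half of the other nodes.

    offsets_ms holds the measured clock offset (in milliseconds) between
    this node and each other node; the sign of each offset is ignored.
    """
    if not offsets_ms:
        # With no other nodes to compare against, there is nothing to detect.
        return False
    threshold_ms = TOLERATED_FRACTION * max_offset_ms
    out_of_sync = sum(1 for offset in offsets_ms if abs(offset) > threshold_ms)
    # "At least half" of the other nodes, written without division.
    return 2 * out_of_sync >= len(offsets_ms)
```

With the default settings, a node that is 450ms and 420ms away from two of its four peers would cross the threshold, while a node that is far from only one of four peers would not.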
The one rare case to note is when a node's clock suddenly jumps beyond the maximum offset before the node detects it. Although extremely unlikely, this could occur, for example, when running CockroachDB inside a VM and the VM hypervisor decides to migrate the VM to different hardware with a different time. In this case, there can be a small window of time between when the node's clock becomes unsynchronized and when the node spontaneously shuts down. During this window, it would be possible for a client to read stale data and write data derived from stale reads. To protect against this, we recommend using the `server.clock.forward_jump_check_enabled` and `server.clock.persist_upper_bound_interval` cluster settings.
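To make the idea of a sudden forward jump concrete, here is a minimal Python sketch that compares how far the wall clock advanced between two samples with how far a monotonic clock advanced over the same period. The names `ClockSample`, `forward_jump_ms`, and `is_forward_jump` are hypothetical, and the tolerance of half the maximum offset is an assumption chosen for illustration; the sketch does not describe how the cluster settings above are implemented.

```python
from dataclasses import dataclass


@dataclass
class ClockSample:
    wall_ms: float       # wall-clock reading, in milliseconds
    monotonic_ms: float  # monotonic-clock reading taken at the same moment


def forward_jump_ms(prev: ClockSample, curr: ClockSample) -> float:
    """Return how much further the wall clock advanced than the monotonic clock."""
    wall_delta = curr.wall_ms - prev.wall_ms
    monotonic_delta = curr.monotonic_ms - prev.monotonic_ms
    return wall_delta - monotonic_delta


def is_forward_jump(
    prev: ClockSample,
    curr: ClockSample,
    max_offset_ms: float = 500.0,
    tolerated_fraction: float = 0.5,  # assumed tolerance, for illustration only
) -> bool:
    """Return True if the wall clock jumped forward by more than the tolerance."""
    return forward_jump_ms(prev, curr) > tolerated_fraction * max_offset_ms
```

In the VM migration scenario, the monotonic clock would report that about 100ms passed between two samples while the wall clock reports a much larger step, and that difference is what the check flags.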
When setting up clock synchronization:
- We recommend using Google Public NTP or Amazon Time Sync Service with the clock sync service you are already using (e.g., `ntpd`, `chrony`). For example, if you are already using `ntpd`, configure `ntpd` to point to the Google or Amazon time server.

    {{site.data.alerts.callout_info}}
    Amazon Time Sync Service is only available within Amazon EC2, so hybrid environments should use Google Public NTP.
    {{site.data.alerts.end}}
- In a hybrid cluster, GCE machines should use Google's internal NTP service and AWS machines should use Amazon Time Sync Service. The Google and Amazon services handle "smearing" the leap second in compatible ways.
- If you do not want to use the Google or Amazon time sources, you can use `chrony` and enable client-side leap smearing, unless the time source you're using already does server-side smearing. In most cases, we recommend the Google Public NTP time source because it handles smearing the leap second. If you use a different NTP time source that doesn't smear the leap second, you must configure client-side smearing manually and do so in the same way on each machine.
- Do not run more than one clock sync service on VMs where `cockroach` is running.
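To show why smearing has to be configured the same way everywhere, the following Python sketch models a 24-hour linear smear centered on the leap second, which is the noon-to-noon UTC scheme that Google and Amazon document for their services. The function name `smeared_fraction` is hypothetical, and the model is an approximation that ignores the extra SI second inside the window.

```python
from datetime import datetime, timedelta, timezone

# The smear runs from 12 hours before to 12 hours after the leap second.
SMEAR_HALF_WINDOW = timedelta(hours=12)


def smeared_fraction(now: datetime, leap_midnight: datetime) -> float:
    """Return the fraction (0.0 to 1.0) of the leap second absorbed so far.

    leap_midnight is the UTC midnight at which the leap second occurs.
    """
    start = leap_midnight - SMEAR_HALF_WINDOW
    window = 2 * SMEAR_HALF_WINDOW
    elapsed = now - start
    # Dividing two timedeltas yields a float; clamp it outside the window.
    return min(1.0, max(0.0, elapsed / window))


# Example: the leap second at the end of 2016-12-31 UTC.
leap = datetime(2017, 1, 1, tzinfo=timezone.utc)
halfway = smeared_fraction(leap, leap)  # half of the second has been absorbed
```

Under this model, a machine that smears and a machine that does not can disagree by roughly half a second around the leap second, which is on the order of the default maximum offset. That is why every machine should follow the same smearing scheme.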
For guidance on synchronizing clocks, see the tutorial for your deployment environment:
Environment | Featured Approach |
---|---|
On-Premises | Use NTP with Google's external NTP service. |
AWS | Use the Amazon Time Sync Service. |
Azure | Disable Hyper-V time synchronization and use NTP with Google's external NTP service. |
DigitalOcean | Use NTP with Google's external NTP service. |
GCE | Use NTP with Google's internal NTP service. |