CockroachDB requires moderate levels of clock synchronization to preserve data consistency. For this reason, when a node detects that its clock is out of sync with at least half of the other nodes in the cluster by 80% of the maximum offset allowed (500ms by default), it spontaneously shuts down. This avoids the risk of consistency anomalies, but it's best to prevent clocks from drifting too far in the first place by running clock synchronization software on each node.
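To make the shutdown threshold concrete, the sketch below works through the arithmetic: with the default 500ms maximum offset, 80% is 400ms. The function names are hypothetical helpers for illustration, not part of CockroachDB, and treating an offset of exactly 400ms as out of tolerance is an assumption about the boundary.

```shell
# Hypothetical helpers illustrating the 80% rule; not CockroachDB code.

# Print the offset (in ms) at which a node would shut down:
# 80% of the maximum offset passed as the first argument.
shutdown_threshold_ms() {
  local max_offset_ms="$1"
  echo $(( max_offset_ms * 80 / 100 ))
}

# Succeed if a measured offset (integer ms, sign ignored) is below
# the threshold. The maximum offset defaults to 500ms.
offset_within_tolerance() {
  local measured_ms="${1#-}"        # strip a leading minus sign
  local max_offset_ms="${2:-500}"
  [ "$measured_ms" -lt "$(shutdown_threshold_ms "$max_offset_ms")" ]
}
```

For example, `offset_within_tolerance 12` succeeds, while `offset_within_tolerance -450` fails because 450ms is past the 400ms threshold.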
{% if page.title contains "Digital Ocean" or page.title contains "On-Premises" %}
`ntpd` should keep offsets in the single-digit milliseconds, so that software is featured here, but other methods of clock synchronization are suitable as well.
- SSH to the first machine.
- Disable `timesyncd`, which tends to be active by default on some Linux distributions:
  {% include copy-clipboard.html %}
  $ sudo timedatectl set-ntp no
  Verify that `timesyncd` is off:
  {% include copy-clipboard.html %}
  $ timedatectl
  Look for `Network time on: no` or `NTP enabled: no` in the output.
- Install the `ntp` package:
  {% include copy-clipboard.html %}
  $ sudo apt-get install ntp
- Stop the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp stop
- Sync the machine's clock with Google's NTP service:
  {% include copy-clipboard.html %}
  $ sudo ntpd -b time.google.com
  To make this change permanent, in the `/etc/ntp.conf` file, remove or comment out any lines starting with `server` or `pool` and add the following lines:
  {% include copy-clipboard.html %}
  server time1.google.com iburst
  server time2.google.com iburst
  server time3.google.com iburst
  server time4.google.com iburst
  Restart the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp start
  {{site.data.alerts.callout_info}}We recommend Google's NTP service because it handles "smearing" the leap second. If you use a different NTP service that doesn't smear the leap second, be sure to configure client-side smearing in the same way on each machine. See the Production Checklist for details.{{site.data.alerts.end}}
- Verify that the machine is using a Google NTP server:
  {% include copy-clipboard.html %}
  $ sudo ntpq -p
  The active NTP server will be marked with an asterisk.
- Repeat these steps for each machine where a CockroachDB node will run.
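When repeating the verification step on many machines, it can help to script it. The following is a minimal sketch, assuming `ntpq -p` prints its usual peer table, in which the selected peer's line starts with `*` and the offset in milliseconds is the ninth column. `selected_peer` is a hypothetical helper name, not part of the `ntp` package.

```shell
# Hypothetical helper: read `ntpq -p` output on stdin and print
# "<peer> <offset_ms>" for the peer marked with an asterisk.
# Prints nothing if no peer has been selected yet.
selected_peer() {
  awk '/^\*/ {
    sub(/^\*/, "", $1)   # drop the leading asterisk from the peer name
    print $1, $9         # column 9 is assumed to be the offset in ms
  }'
}
```

A possible invocation is `sudo ntpq -p | selected_peer`. Because `ntpq` may truncate long host names in its first column, compare the printed peer against a prefix such as `time1.google` rather than the full host name.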
{% elsif page.title contains "Google" %}
Compute Engine instances are preconfigured to use NTP, which should keep offsets in the single-digit milliseconds. However, Google can’t predict how external NTP services, such as `pool.ntp.org`, will handle the leap second. Therefore, you should:
- Configure each GCE instance to use Google's internal NTP service.
- All nodes in the cluster must be synced to the same time source, or to different sources that implement leap second smearing in the same way. See the Production Checklist for details.
{% elsif page.title contains "AWS" %}
Amazon provides the Amazon Time Sync Service, which uses a fleet of satellite-connected and atomic reference clocks in each AWS Region to deliver accurate current time readings. The service also smears the leap second.
- Configure each AWS instance to use the internal Amazon Time Sync Service.
  - Per the above instructions, ensure that `/etc/chrony.conf` on the instance contains the line `server 169.254.169.123 prefer iburst minpoll 4 maxpoll 4` and that other `server` or `pool` lines are commented out.
  - To verify that Amazon Time Sync Service is being used, run `chronyc sources -v` and check for a line containing `* 169.254.169.123`. The `*` denotes the preferred time server.
- All nodes in the cluster must be synced to the same time source, or to different sources that implement leap second smearing in the same way. See the Production Checklist for details.
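The configuration check described above can be automated. The following is a minimal sketch, assuming the file uses one directive per line and `#` for comments. Both function names are hypothetical helpers, not part of chrony or any AWS tooling.

```shell
# Hypothetical helper: print the active (uncommented) server/pool
# lines of the config file given as the first argument.
active_time_sources() {
  grep -E '^[[:space:]]*(server|pool)[[:space:]]' "$1"
}

# Hypothetical helper: succeed only if the file has at least one
# active time source and every active source is the Amazon Time
# Sync Service address, 169.254.169.123.
only_amazon_time_sync() {
  local conf="$1" sources
  sources=$(active_time_sources "$conf") || return 1   # nothing active
  # grep -v selects lines that are NOT the Amazon server line;
  # finding any such line means another source is still active.
  ! printf '%s\n' "$sources" |
    grep -Evq '^[[:space:]]*server[[:space:]]+169\.254\.169\.123([[:space:]]|$)'
}
```

For example, `only_amazon_time_sync /etc/chrony.conf && echo "OK"` prints `OK` only when the file passes both conditions. This checks the file only; `chronyc sources -v` remains the way to confirm which server is actually in use.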
{% elsif page.title contains "Azure" %}
`ntpd` should keep offsets in the single-digit milliseconds, so that software is featured here. However, to run `ntpd` properly on Azure VMs, it's necessary to first unbind the Time Synchronization device used by the Hyper-V technology running Azure VMs; this device aims to synchronize time between the VM and its host operating system but has been known to cause problems.
- SSH to the first machine.
- Find the ID of the Hyper-V Time Synchronization device:
  {% include copy-clipboard.html %}
  $ curl -O https://raw.githubusercontent.com/torvalds/linux/master/tools/hv/lsvmbus
  {% include copy-clipboard.html %}
  $ python lsvmbus -vv | grep -w "Time Synchronization" -A 3
  VMBUS ID 12: Class_ID = {9527e630-d0ae-497b-adce-e80ab0175caf} - [Time Synchronization]
      Device_ID = {2dd1ce17-079e-403c-b352-a1921ee207ee}
      Sysfs path: /sys/bus/vmbus/devices/2dd1ce17-079e-403c-b352-a1921ee207ee
      Rel_ID=12, target_cpu=0
- Unbind the device, using the `Device_ID` from the previous command's output:
  {% include copy-clipboard.html %}
  $ echo <DEVICE_ID> | sudo tee /sys/bus/vmbus/drivers/hv_util/unbind
- Install the `ntp` package:
  {% include copy-clipboard.html %}
  $ sudo apt-get install ntp
- Stop the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp stop
- Sync the machine's clock with Google's NTP service:
  {% include copy-clipboard.html %}
  $ sudo ntpd -b time.google.com
  To make this change permanent, in the `/etc/ntp.conf` file, remove or comment out any lines starting with `server` or `pool` and add the following lines:
  {% include copy-clipboard.html %}
  server time1.google.com iburst
  server time2.google.com iburst
  server time3.google.com iburst
  server time4.google.com iburst
  Restart the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp start
  {{site.data.alerts.callout_info}}We recommend Google's NTP service because it handles "smearing" the leap second. If you use a different NTP service that doesn't smear the leap second, be sure to configure client-side smearing in the same way on each machine. See the Production Checklist for details.{{site.data.alerts.end}}
- Verify that the machine is using a Google NTP server:
  {% include copy-clipboard.html %}
  $ sudo ntpq -p
  The active NTP server will be marked with an asterisk.
- Repeat these steps for each machine where a CockroachDB node will run.
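Copying the `Device_ID` by hand is error-prone when the steps are repeated on many machines. The following is a minimal sketch of extracting it automatically, assuming the `lsvmbus -vv` output has the shape shown in the sample above: a `[Time Synchronization]` line followed by a `Device_ID = {...}` line. `time_sync_device_id` is a hypothetical helper name, not part of `lsvmbus`.

```shell
# Hypothetical helper: read `lsvmbus -vv` output on stdin and print the
# Device_ID of the Time Synchronization device without its braces.
time_sync_device_id() {
  awk '
    /\[Time Synchronization\]/ { found = 1 }   # entered the right device block
    found && /Device_ID/ {
      gsub(/[{}]/, "", $3)   # "Device_ID = {uuid}": the uuid is field 3
      print $3
      exit                   # stop after the first match
    }
  '
}
```

Under those assumptions, `python lsvmbus -vv | time_sync_device_id` prints the ID to pass to the unbind step. Check the printed value against the full `lsvmbus` output before unbinding anything.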
{% endif %}