CockroachDB requires moderate levels of clock synchronization to preserve data consistency. For this reason, when a node detects that its clock is out of sync with at least half of the other nodes in the cluster by 80% of the maximum offset allowed (500ms by default), it spontaneously shuts down. This avoids the risk of consistency anomalies, but it's best to prevent clocks from drifting too far in the first place by running clock synchronization software on each node.
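To make the threshold concrete: with the default maximum offset of 500ms, 80% works out to 400ms. The following is a minimal sketch of that arithmetic, not CockroachDB's actual implementation. The function names `tolerated_offset_ms` and `exceeds_tolerance` are hypothetical, offsets are assumed to be whole milliseconds, and counting "at least half" as `exceeded * 2 >= total` with a strict `>` comparison is an assumption about how the rule would be applied.

```shell
#!/usr/bin/env bash

# Hypothetical helper: 80% of the maximum offset, in milliseconds.
tolerated_offset_ms() {
  local max_offset_ms=$1
  echo $(( max_offset_ms * 80 / 100 ))
}

# Hypothetical helper: succeeds if at least half of the measured offsets
# to other nodes (in ms, may be negative) exceed the tolerated offset.
# Usage: exceeds_tolerance <max_offset_ms> <offset_ms>...
exceeds_tolerance() {
  local max_offset_ms=$1
  shift
  local limit total=$# exceeded=0 off
  limit=$(tolerated_offset_ms "$max_offset_ms")
  for off in "$@"; do
    # Compare absolute values: a clock can be ahead or behind.
    if (( off < 0 )); then off=$(( -off )); fi
    if (( off > limit )); then exceeded=$(( exceeded + 1 )); fi
  done
  (( total > 0 && exceeded * 2 >= total ))
}
```

For example, `exceeds_tolerance 500 450 -420 10 5` succeeds because two of the four offsets are beyond 400ms, while `exceeds_tolerance 500 450 10 5` does not.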
{% if page.title contains "Digital Ocean" or page.title contains "On-Premises" %}
`ntpd` should keep offsets in the single-digit milliseconds, so that software is featured here, but other methods of clock synchronization are suitable as well.
- SSH to the first machine.
- Disable `timesyncd`, which tends to be active by default on some Linux distributions:
  {% include copy-clipboard.html %}
  $ sudo timedatectl set-ntp no
  Verify that `timesyncd` is off:
  {% include copy-clipboard.html %}
  $ timedatectl
  Look for `Network time on: no` or `NTP enabled: no` in the output.
- Install the `ntp` package:
  {% include copy-clipboard.html %}
  $ sudo apt-get install ntp
- Stop the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp stop
- Sync the machine's clock with Google's NTP service:
  {% include copy-clipboard.html %}
  $ sudo ntpd -b time.google.com
  To make this change permanent, in the `/etc/ntp.conf` file, remove or comment out any lines starting with `server` or `pool` and add the following lines:
  {% include copy-clipboard.html %}
  server time1.google.com iburst
  server time2.google.com iburst
  server time3.google.com iburst
  server time4.google.com iburst
  Restart the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp start
  {{site.data.alerts.callout_info}}We recommend Google's external NTP service because it handles "smearing" the leap second. If you use a different NTP service that doesn't smear the leap second, you must configure client-side smearing manually and do so in the same way on each machine.{{site.data.alerts.end}}
- Verify that the machine is using a Google NTP server:
  {% include copy-clipboard.html %}
  $ sudo ntpq -p
  The active NTP server will be marked with an asterisk.
- Repeat these steps for each machine where a CockroachDB node will run.
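Because these steps are repeated on every machine, the `ntp.conf` edit and the final check can be scripted. The following is a minimal sketch under a few assumptions: the helper names `use_google_ntp` and `active_peer` are hypothetical, the config path is passed as an argument so the edit can be tried on a copy first, and `active_peer` expects `ntpq -p` output on standard input. The function is meant to be run once per file, since it appends the server lines each time it is called.

```shell
#!/usr/bin/env bash

# Hypothetical helper: comment out existing "server"/"pool" lines in an
# ntp.conf-style file and append Google's four NTP servers.
# A backup of the original file is kept as <file>.bak.
use_google_ntp() {
  local conf=$1 i
  sed -i.bak -E 's/^[[:space:]]*(server|pool)[[:space:]]/# &/' "$conf"
  for i in 1 2 3 4; do
    echo "server time${i}.google.com iburst" >> "$conf"
  done
}

# Hypothetical helper: print the peer that `ntpq -p` marks with an
# asterisk. Reads the ntpq output on stdin; prints nothing if no peer
# has been selected yet.
active_peer() {
  awk '/^\*/ { sub(/^\*/, "", $1); print $1 }'
}
```

A possible use is `use_google_ntp /etc/ntp.conf` (run with sufficient privileges) followed, after restarting the daemon, by `sudo ntpq -p | active_peer`. Note that `ntpq` may truncate long host names in its peer listing, so compare the printed name with that in mind.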
{% elsif page.title contains "Google" %}
Compute Engine instances are preconfigured to use NTP, which should keep offsets in the single-digit milliseconds. However, Google can’t predict how external NTP services, such as `pool.ntp.org`, will handle the leap second. Therefore, you should:
- Configure each GCE instance to use Google's internal NTP service.
- If you plan to run a hybrid cluster across GCE and other cloud providers or environments, configure the non-GCE machines to use Google's external NTP service.
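The two cases above can be sketched as a small helper that prints the `server` lines for a machine's NTP configuration. This is an illustration only: the function name `ntp_server_lines` and its `gce`/`external` arguments are hypothetical, the use of `ntpd`-style `server ... iburst` lines is an assumption, and `metadata.google.internal` is assumed to be the host name of Google's internal NTP service as seen from a GCE instance.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print ntp.conf-style server lines.
#   gce      -> Google's internal NTP service (GCE instances only)
#   external -> Google's external NTP service (non-GCE machines)
ntp_server_lines() {
  local i
  case "$1" in
    gce)
      # Assumption: the internal service is reachable under this name.
      echo "server metadata.google.internal iburst"
      ;;
    external)
      for i in 1 2 3 4; do
        echo "server time${i}.google.com iburst"
      done
      ;;
    *)
      echo "usage: ntp_server_lines gce|external" >&2
      return 1
      ;;
  esac
}
```

In a hybrid cluster, the GCE instances would use the `gce` lines and every other machine the `external` lines.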
{% elsif page.title contains "AWS" %}
Amazon provides the Amazon Time Sync Service, which uses a fleet of satellite-connected and atomic reference clocks in each AWS Region to deliver accurate current time readings. The service also smears the leap second.
- If you plan to run your entire cluster on AWS, configure each AWS instance to use the internal Amazon Time Sync Service.
- However, if you plan to run a hybrid cluster across AWS and other cloud providers or environments, configure all machines to use Google's external NTP service, which is comparably accurate and also handles "smearing" the leap second.
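Since every machine in a cluster should use the same time source, it can help to check each machine's configuration against the chosen setup. The following is a minimal sketch, assuming an `ntp.conf`- or `chrony.conf`-style file with `server` and `pool` lines. The function names and the `aws-only`/`hybrid` mode names are hypothetical, and `169.254.169.123` is assumed to be the link-local address of the Amazon Time Sync Service.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the hosts of all active (uncommented)
# server/pool lines from a config read on stdin.
configured_time_sources() {
  awk '$1 == "server" || $1 == "pool" { print $2 }'
}

# Hypothetical helper: succeed only if the config on stdin has at least
# one time source and every source matches the chosen mode.
#   aws-only -> Amazon Time Sync Service
#   hybrid   -> Google's external NTP service
check_time_sources() {
  local mode=$1 expected hosts
  case "$mode" in
    aws-only) expected='^169\.254\.169\.123$' ;;
    hybrid)   expected='^time[1-4]?\.google\.com$' ;;
    *)        return 2 ;;
  esac
  hosts=$(configured_time_sources)
  # grep -v selects hosts that do NOT match; there must be none.
  [ -n "$hosts" ] && ! grep -Evq "$expected" <<< "$hosts"
}
```

For example, `check_time_sources hybrid < /etc/ntp.conf` fails if any active `server` or `pool` line still points somewhere other than Google's external NTP service.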
{% elsif page.title contains "Azure" %}
`ntpd` should keep offsets in the single-digit milliseconds, so that software is featured here. However, to run `ntpd` properly on Azure VMs, it's necessary to first unbind the Time Synchronization device used by the Hyper-V technology running Azure VMs; this device aims to synchronize time between the VM and its host operating system but has been known to cause problems.
- SSH to the first machine.
- Find the ID of the Hyper-V Time Synchronization device:
  {% include copy-clipboard.html %}
  $ curl -O https://raw.githubusercontent.com/torvalds/linux/master/tools/hv/lsvmbus
  {% include copy-clipboard.html %}
  $ python lsvmbus -vv | grep -w "Time Synchronization" -A 3
  VMBUS ID 12: Class_ID = {9527e630-d0ae-497b-adce-e80ab0175caf} - [Time Synchronization]
      Device_ID = {2dd1ce17-079e-403c-b352-a1921ee207ee}
      Sysfs path: /sys/bus/vmbus/devices/2dd1ce17-079e-403c-b352-a1921ee207ee
      Rel_ID=12, target_cpu=0
- Unbind the device, using the `Device_ID` from the previous command's output:
  {% include copy-clipboard.html %}
  $ echo <DEVICE_ID> | sudo tee /sys/bus/vmbus/drivers/hv_util/unbind
- Install the `ntp` package:
  {% include copy-clipboard.html %}
  $ sudo apt-get install ntp
- Stop the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp stop
- Sync the machine's clock with Google's NTP service:
  {% include copy-clipboard.html %}
  $ sudo ntpd -b time.google.com
  To make this change permanent, in the `/etc/ntp.conf` file, remove or comment out any lines starting with `server` or `pool` and add the following lines:
  {% include copy-clipboard.html %}
  server time1.google.com iburst
  server time2.google.com iburst
  server time3.google.com iburst
  server time4.google.com iburst
  Restart the NTP daemon:
  {% include copy-clipboard.html %}
  $ sudo service ntp start
  {{site.data.alerts.callout_info}}We recommend Google's NTP service because it handles "smearing" the leap second. If you use a different NTP service that doesn't smear the leap second, be sure to configure client-side smearing in the same way on each machine.{{site.data.alerts.end}}
- Verify that the machine is using a Google NTP server:
  {% include copy-clipboard.html %}
  $ sudo ntpq -p
  The active NTP server will be marked with an asterisk.
- Repeat these steps for each machine where a CockroachDB node will run.
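When repeating the unbind step on many VMs, copying the `Device_ID` by hand is error-prone. The following is a minimal sketch that extracts it from `lsvmbus -vv` output; the function name `time_sync_device_id` is hypothetical. It prints the ID without the surrounding braces, on the assumption that the unbind file expects the same form that appears in the `Sysfs path` of the sample output above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `lsvmbus -vv` output on stdin and print the
# Device_ID of the Time Synchronization device, without braces.
time_sync_device_id() {
  grep -w -A 3 "Time Synchronization" \
    | sed -n -E 's/.*Device_ID = \{([0-9a-fA-F-]+)\}.*/\1/p' \
    | head -n 1
}
```

One way to use it is `python lsvmbus -vv | time_sync_device_id`, then pass the printed value to the unbind command in place of `<DEVICE_ID>` after confirming it matches the device you intend to unbind.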
{% endif %}