Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add chrony for time sync for Azure image #240

Merged
merged 1 commit into from
Jun 15, 2020

Conversation

nader-ziada
Copy link
Contributor

Chrony is the only way to sync time from host using PTP source according to the azure team

Related to issue 539 in cluster-api-provider-azure

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 21, 2020
@nader-ziada
Copy link
Contributor Author

/cc @CecileRobertMichon

{
"type": "shell",
"inline": [
"sudo apt-get -qq update && sudo apt-get -qqy install chrony && sudo systemctl restart chrony.service"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be moved to the Ansible code instead? I suspect there will also need to be some additional modifications to the chrony config to configure the ptp time source, as described in https://docs.microsoft.com/en-us/azure/virtual-machines/linux/time-sync#chrony.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will look into how to do that using ansible instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, if there was a daemon already present, need to make sure that's disabled and removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ubuntu bionic docs say to install it https://ubuntu.com/blog/ubuntu-bionic-using-chrony-to-configure-ntp but I will double check

@CecileRobertMichon
Copy link
Contributor

CecileRobertMichon commented May 21, 2020

@nader-ziada how do you plan on testing this change? We should at least make sure the result is what we want before changing image builder (ie. building an image with these changes and using that to build a CAPZ cluster).

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 21, 2020
Comment on lines 28 to 31
- name: fix chrony drift
shell: chronyc makestep 1.0 -1
args:
executable: /bin/bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step isn't relevant at the point of creating an image, as new machines are going to be created anyway with a completely different time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does chronyc actually change the config, or just the current process config?

Copy link
Member

@randomvariable randomvariable May 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked locally on Fedora 32:

cluster-api on  master [⇡$] at ☸️  kind-kind took 5s
➜ sudo chronyc makestep 1.0 -1
200 OK

cluster-api on  master [⇡$] at ☸️  kind-kind
➜ cat /etc/chrony.conf
# These servers were defined in the installation:
pool time.google.com iburst
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).

# Record the rate at which the system clock gains/losses time.
driftfile /var/lib/chrony/drift

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Enable kernel synchronization of the real-time clock (RTC).
rtcsync

# Enable hardware timestamping on all interfaces that support it.
#hwtimestamp *

# Increase the minimum number of selectable sources required to adjust
# the system clock.
#minsources 2

# Allow NTP client access from local network.
#allow 192.168.0.0/16

# Serve time even if not synchronized to a time source.
#local stratum 10

# Specify file containing keys for NTP authentication.
keyfile /etc/chrony.keys

# Get TAI-UTC offset and leap seconds from the system tz database.
leapsectz right/UTC

# Specify directory for log files.
logdir /var/log/chrony

# Select which information is logged.
#log measurements statistics tracking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be actually changing the current process, i need to double check

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to add something like

- name: Configure PTP
  lineinfile:
    path: /etc/chrony/chrony.conf
    create: yes
    line: refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

- name: Ensure makestep parameter set as per Azure recommendation
  lineinfile:
    path: /etc/chrony/chrony.conf
    regexp: '^makestep'
    line: makestep 1.0 -1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I was just trying to to figure out how to do this in ansible!


- name: restart chrony.service to pick up config changes
systemd:
state: restarted
Copy link
Member

@randomvariable randomvariable May 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state: restarted
enabled: yes
state: restarted

Better to be explicit that we want it enabled on boot. The restart isn't technically necessary because of image creation.

@nader-ziada
Copy link
Contributor Author

@CecileRobertMichon yes, I build the image and created a vm with it manually, next will try it with a cluster

@CecileRobertMichon
Copy link
Contributor

/cc @jackfrancis

(related to Azure/aks-engine#2552)

@k8s-ci-robot
Copy link
Contributor

@CecileRobertMichon: GitHub didn't allow me to request PR reviews from the following users: jackfrancis.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @jackfrancis

(related to Azure/aks-engine#2552)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@nader-ziada
Copy link
Contributor Author

created the image and VM from it and this is what I see:

$ chronyc tracking
Reference ID    : 50484330 (PHC0)
Stratum         : 1
Ref time (UTC)  : Thu May 21 23:35:23 2020
System time     : 0.000013028 seconds slow of NTP time
Last offset     : -0.000015507 seconds
RMS offset      : 0.000255382 seconds
Frequency       : 72.325 ppm slow
Residual freq   : -0.054 ppm
Skew            : 1.491 ppm
Root delay      : 0.000000001 seconds
Root dispersion : 0.000033918 seconds
Update interval : 8.0 seconds
Leap status     : Normal
$ chronyc activity
200 OK
8 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
$ chronyc sources
210 Number of sources = 9
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    12    +85us[  +98us] +/-   17us
^- alphyn.canonical.com          2   6   357    22   +532us[ +548us] +/-   68ms
^- golem.canonical.com           2   6   377    16   +816us[ +834us] +/-   69ms
^- chilipepper.canonical.com     2   6   377    18   +851us[ +868us] +/-   73ms
^- pugot.canonical.com           2   6   377    19   +685us[ +702us] +/-   82ms
^- atl0.fairy.mattnordhoff.>     2   6   377    21   +542us[ +559us] +/-   29ms
^- europa.netbunker.org          2   6   377    18  +1980us[+1997us] +/-   29ms
^- t1.time.bf1.yahoo.com         2   6   377    19  +1042us[+1059us] +/-   46ms
^- 198.255.68.106                2   6   377    19  +1253us[+1270us] +/-   80ms
~$ cat /etc/chrony/chrony.conf
# Welcome to the chrony configuration file. See chrony.conf(5) for more
# information about usuable directives.

# This will use (up to):
# - 4 sources from ntp.ubuntu.com which some are ipv6 enabled
# - 2 sources from 2.ubuntu.pool.ntp.org which is ipv6 enabled as well
# - 1 source from [01].ubuntu.pool.ntp.org each (ipv4 only atm)
# This means by default, up to 6 dual-stack and up to 2 additional IPv4-only
# sources will be used.
# At the same time it retains some protection against one of the entries being
# down (compare to just using one of the lines). See (LP: #1754358) for the
# discussion.
#
# About using servers from the NTP Pool Project in general see (LP: #104525).
# Approved by Ubuntu Technical Board on 2011-02-08.
# See http://www.pool.ntp.org/join.html for more information.
pool ntp.ubuntu.com        iburst maxsources 4
pool 0.ubuntu.pool.ntp.org iburst maxsources 1
pool 1.ubuntu.pool.ntp.org iburst maxsources 1
pool 2.ubuntu.pool.ntp.org iburst maxsources 2

# This directive specify the location of the file containing ID/key pairs for
# NTP authentication.
keyfile /etc/chrony/chrony.keys

# This directive specify the file into which chronyd will store the rate
# information.
driftfile /var/lib/chrony/chrony.drift

# Uncomment the following line to turn logging on.
#log tracking measurements statistics

# Log files location.
logdir /var/log/chrony

# Stop bad estimates upsetting machine clock.
maxupdateskew 100.0

# This directive enables kernel synchronisation (every 11 minutes) of the
# real-time clock. Note that it can’t be used along with the 'rtcfile' directive.
rtcsync

# Step the system clock instead of slewing it if the adjustment is larger than
# one second, but only in the first three clock updates.
makestep 1.0 -1

refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

@nader-ziada
Copy link
Contributor Author

I created a few Azure clusters using the image I generated from here and everything is working fine, not sure if there is a special test we need to check on the cluster, but at least this doesn't break anything and chrony is running as in the above comment.

@nader-ziada
Copy link
Contributor Author

@CecileRobertMichon is it possible for you to try this out?

@CecileRobertMichon
Copy link
Contributor

@detiber @randomvariable do we have plans to include E2E testing as a presubmit for image-buidler PRs? Seems like it would be useful in cases like this

@CecileRobertMichon
Copy link
Contributor

@nader-ziada this looks good to me, but we really need a way to be able to test changes to the image in a non-manual way in the future :)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 29, 2020
@nader-ziada
Copy link
Contributor Author

@CecileRobertMichon i can work on tests but better if we can discuss what scope to cover

@nader-ziada
Copy link
Contributor Author

@detiber ready for another look, Thanks

@detiber
Copy link
Member

detiber commented Jun 1, 2020

/cc @codenrhoden to speak to testing plans for image-builder

@k8s-ci-robot k8s-ci-robot requested a review from codenrhoden June 1, 2020 17:51
@k8s-ci-robot
Copy link
Contributor

@detiber: GitHub didn't allow me to request PR reviews from the following users: image-builder, to, speak, testing, for.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @codenrhoden to speak to testing plans for image-builder

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Comment on lines +23 to +26
- name: Ansible apt install chrony
apt:
name: chrony
state: present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be wrapped in a conditional to only run when the OS is Ubuntu similar to:

- name: install aws agents Ubuntu
shell: snap install amazon-ssm-agent --classic
when: ansible_distribution == "Ubuntu"

Relatively minor nit, but would have an impact if/when other OS images are supported for Azure images.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to add check for Ubuntu

Chrony is the only way to sync time from host using PTP source according to the azure team
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 1, 2020
@detiber
Copy link
Member

detiber commented Jun 1, 2020

/lgtm
/assign @codenrhoden

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 1, 2020
@nader-ziada
Copy link
Contributor Author

Hi @codenrhoden , do you have any concerns with this change?

@codenrhoden
Copy link
Contributor

/approve

Thanks for this @nader-ziada! And for trying out clusters with CAPZ first since there are no blocking PR tests still...

@CecileRobertMichon:

do we have plans to include E2E testing as a presubmit for image-buidler PRs? Seems like it would be useful in cases like this

Short answer is "yes". 😆 But definitely lacking on details so far. But I am in 100% agreement that this is necessary. It's in my near-term plans to start to focus on this.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: codenrhoden, nader-ziada

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 15, 2020
@k8s-ci-robot k8s-ci-robot merged commit bfcc794 into kubernetes-sigs:master Jun 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants