Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove ntp and ntpdate dependencies #5806

Merged
merged 3 commits into from
Feb 24, 2021
Merged

Remove ntp and ntpdate dependencies #5806

merged 3 commits into from
Feb 24, 2021

Conversation

rmol
Copy link
Contributor

@rmol rmol commented Feb 19, 2021

Status

Ready for review

Description of Changes

On Focal, stop installing the ntp and ntpdate packages, and update the system time with systemd-timesyncd.

Fixes #5730.
Fixes #5795.

I also updated molecule/testinfra/conftest to make the class that provides attribute access to test variables aware of distribution-specific variables, such that you can ask for apparmor_enforce, and if that's a dictionary containing your target distribution, e.g. focal, it will return that item from the dictionary.

Finally, there were some complete iptables rulesets in the testinfra vars that didn't seem to be used anywhere, so I removed those.

Testing

  • make build-debs-focal && make staging-focal
  • confirm that the ntp and ntpdate packages are not installed on either server
  • confirm that time has been synchronized to NTP servers on both machines:
    • timedatectl show should contain NTPSynchronized=yes
    • timedatectl show-timesync should contain ServerName=ntp.ubuntu.com, with an NTPMessage indicating that the server has been reached

Deployment

For Focal systems, this replaces the universe packages ntp and ntpdate with the admin package systemd-timesyncd.

The iptables rules for NTP required changes:

  • systemd-timesyncd does not control its source port
  • it also runs as the systemd-timesync user

Checklist

If you made changes to the server application code:

  • Linting (make lint) and tests (make test) pass in the development container

If you made changes to the system configuration:

If you made non-trivial code changes:

  • I have written a test plan and validated it for this PR

Choose one of the following:

  • I have opened a PR in the docs repo for these changes, or will do so later
  • I would appreciate help with the documentation
  • These changes do not require documentation

Copy link
Contributor

@kushaldas kushaldas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is missing the part to enable the systemd-timesyncd service. This is what causing the failure in CI.

root@app-staging:/home/vagrant# sudo systemctl start systemd-timesyncd
root@app-staging:/home/vagrant# sudo systemctl status systemd-timesyncd
● systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)                                                   
     Active: active (running) since Mon 2021-02-22 10:17:38 UTC; 1s ago
       Docs: man:systemd-timesyncd.service(8)
   Main PID: 2563 (systemd-timesyn)
     Status: "Initial synchronization to time server 91.189.89.199:123 (ntp.ubuntu.com)."                                                              
      Tasks: 2 (limit: 1080)
     Memory: 1.6M
     CGroup: /system.slice/systemd-timesyncd.service
             └─2563 /lib/systemd/systemd-timesyncd

Feb 22 10:17:38 app-staging systemd[1]: Starting Network Time Synchronization...                                                                       
Feb 22 10:17:38 app-staging systemd[1]: Started Network Time Synchronization.                                                                          
Feb 22 10:17:38 app-staging systemd-timesyncd[2563]: Initial synchronization to time server 91.189.89.199:123 (ntp.ubuntu.com).                        
root@app-staging:/home/vagrant# timedatectl show
Timezone=Etc/UTC
LocalRTC=no
CanNTP=yes
NTP=yes
NTPSynchronized=yes
TimeUSec=Mon 2021-02-22 10:17:46 UTC
RTCTimeUSec=Mon 2021-02-22 10:17:46 UTC

@@ -60,8 +60,8 @@
{% endfor -%}

# NTP rules
-A OUTPUT -p udp --sport 123 --dport 123 -m owner --uid-owner root -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT -m comment --comment "ntp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we won't support new installs on Xenial starting on 1.8.0, it is possible orgs will want to run playbooks against Xenial instances in the time period between 1.8.0 and the time where we eliminate Xenial support (e.g. Onion Service v2 -> v3 migration). The firewall rules are applied on playbook run, but ntp/ntpdate are not explicitly removed as part of the playbook run. Do we understand the implications of having both systemd-timesync and ntp/ntpdate installed at the same time (with presumably ntpdate being blocked by iptables) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I've missed several places where both need to be supported. The CI failure @kushaldas noted is because the timedatectl command differs on Xenial, and as you noted I need to split the NTP iptables rules.

The systemd-timesyncd package conflicts with ntp, though, so won't be present on Xenial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, systemd-timesyncd won't be present on Xenial because that package doesn't exist for Xenial....

@rmol rmol force-pushed the remove-ntp branch 2 times, most recently from 12a37c7 to 28a2b32 Compare February 22, 2021 23:58
@conorsch
Copy link
Contributor

Seeing a rather perplexing CI error in https://app.circleci.com/pipelines/github/freedomofpress/securedrop/1966/workflows/1ed06d35-a3ed-423c-86be-5fd17a41212e/jobs/51116, where the time is reported as unsynchronized, but only in CI. We've kicked CI yet again to make sure it's consistent. @kushaldas, would appreciate another set of eyes here, see if you can get this one over the finish line.

@kushaldas
Copy link
Contributor

kushaldas commented Feb 23, 2021

I could not get the service running after reboot automatically. A few notes:

  • timedatectl command output will change systemd/systemd@3ec530a
  • if I start the service manually, it starts
  • ConditionVirtualization=!container means it should work on vms in our CI/staging
  • timedatectl set-ntp on starts the service on command line

Wondering the following chain is not showing it:

# systemd-analyze critical-chain sysinit.target
sysinit.target @1.959s
└─haveged.service @1.959s
  └─systemd-random-seed.service @397ms +1.558s
    └─systemd-remount-fs.service @321ms +54ms
      └─systemd-journald.socket @230ms
        └─-.mount @179ms
          └─system.slice @179ms
            └─-.slice @179ms

@rmol
Copy link
Contributor Author

rmol commented Feb 23, 2021

In which environment did systemd-timesyncd not start at boot for you?

  • ConditionVirtualization=!container means it should work on vms in our CI/staging

That's in /usr/lib/systemd/system/systemd-timesyncd.service on Focal. Did you mean that you think we need that, or that it looks to you like it should be working right now?

  • timedatectl set-ntp on starts the service on command line

This is being done in the latest playbook on this branch. Well, it has timedatectl set-ntp true; true and false worked in my testing.

@conorsch
Copy link
Contributor

Working on reproducing the behavior described above. @rmol, do you mind rebasing on latest develop? I see a conflict with a common vars file.

@kushaldas
Copy link
Contributor

kushaldas commented Feb 23, 2021

That's in /usr/lib/systemd/system/systemd-timesyncd.service on Focal. Did you mean that you think we need that, or that it looks to you like it should be working right now?

it should be working right now, all your work looks solid. Works nicely if I start manually, but not on automatic reboots.

@rmol rmol force-pushed the remove-ntp branch 3 times, most recently from f6dd0a8 to ff98c5a Compare February 23, 2021 15:55
Copy link
Contributor

@kushaldas kushaldas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a massive 7 hours of debug failure from me, @poettering provided help and also found the issue.

The vboxadd-service.service is causing the trouble. We should remove it via our playbook.

root@app-staging:/home/vagrant# systemctl cat vboxadd-service.service
# /lib/systemd/system/vboxadd-service.service
[Unit]
SourcePath=/opt/VBoxGuestAdditions-6.1.12/init/vboxadd-service
Description=
Before=runlevel2.target runlevel3.target runlevel4.target runlevel5.target shutdown.target 
After=vboxadd.service 
Conflicts=shutdown.target systemd-timesyncd.service

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/opt/VBoxGuestAdditions-6.1.12/init/vboxadd-service start
ExecStop=/opt/VBoxGuestAdditions-6.1.12/init/vboxadd-service stop

[Install]
WantedBy=multi-user.target

rmol and others added 2 commits February 23, 2021 12:59
Expanding the tests already there to ensure we've got a predictable end
state.
@rmol rmol force-pushed the remove-ntp branch 4 times, most recently from c97e417 to 4df8c36 Compare February 23, 2021 22:33
@conorsch
Copy link
Contributor

Noticed a lint failure: https://app.circleci.com/pipelines/github/freedomofpress/securedrop/2001/workflows/5d5dd039-91b4-446a-9f52-803f6b04ea1f/jobs/51439 so I amended the latest commit. We should get a full CI run out of it now.

Copy link
Contributor

@kushaldas kushaldas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All is good, with the systemd-timesyncd service running properly.

@kushaldas kushaldas merged commit d45857d into develop Feb 24, 2021
@emkll emkll deleted the remove-ntp branch February 24, 2021 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apparmor denial for ntpd on Focal system clock error expected to always happen?
4 participants