
update.qubes-vm (and updater) salt not updating nightlies in dev env #485

Closed

emkll opened this issue Mar 4, 2020 · 18 comments


@emkll
Contributor

emkll commented Mar 4, 2020

In the development environment, upgrades were not completely applied to templates after running the updater.

After running the updater, the following command, or the Qubes Updater GUI, I observed that not all packages were updated:

sudo qubesctl --skip-dom0 --targets sd-viewer-buster-template state.sls update.qubes-vm

The workstation packages are not updated:

user@sd-viewer-buster-template:~$ sudo apt list --upgradable 
Listing... Done
securedrop-log/unknown 0.1.0-dev-20200304-060751+buster all [upgradable from: 0.1.0-dev-20200303-060618+buster]
securedrop-workstation-svs-disp/unknown 0.2.1-dev-20200304-061326+buster all [upgradable from: 0.2.1-dev-20200303-061304+buster]
user@sd-viewer-buster-template:~$ 

Upgrading the VMs manually (sudo apt upgrade) correctly updates them.
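For reference, the manual sequence inside the affected TemplateVM looks like this (a minimal sketch using standard apt commands):

sudo apt update
sudo apt list --upgradable
sudo apt upgrade -y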

@emkll changed the title from "qubes.updatevm salt not updating nightlies in dev env" to "update.qubes-vm (and updater) salt not updating nightlies in dev env" on Mar 4, 2020
@zenmonkeykstop
Contributor

Also confirmed in the staging environment:

  • ran the SD updater; updates applied and the system rebooted
  • updates still available for sd-{app,log,proxy}-buster-template (a dom0 loop to check this is sketched after this list):
    • securedrop-client and securedrop-log for app
    • securedrop-log for log
    • whonix packages, securedrop-log, and securedrop-proxy for proxy
  • sudo apt upgrade applies the updates in each case, though on proxy manual input is required because of a formatting change in /etc/apt/sources.list.d/debian.list
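A quick way to re-run that check across the three templates from dom0 (a sketch; qvm-run -p streams the command's output back, and the template names are taken from the list above):

for vm in sd-app-buster-template sd-log-buster-template sd-proxy-buster-template; do
    echo "== $vm =="
    qvm-run -p "$vm" 'sudo apt-get update -q > /dev/null && apt list --upgradable'
done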

@eloquence
Member

If you see this happening, @conorsch recommends cloning the impacted template(s) for investigation purposes.
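A minimal sketch of that clone step in dom0 (the -debug suffix is just an example name):

qvm-clone sd-viewer-buster-template sd-viewer-buster-template-debug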

@emkll
Contributor Author

emkll commented Mar 4, 2020

The apt logs don't seem to provide any clues; no entries from today are present:

user@sd-viewer-buster-template:~$ cat /var/log/apt/history.log

Start-Date: 2020-03-02  09:12:13
Commandline: apt-get -q -y -o DPkg::Options::=--force-confold -o DPkg::Options::=--force-confdef dist-upgrade
Upgrade: securedrop-log:amd64 (0.1.0-dev-20200226-060606+buster, 0.1.0-dev-20200302-060514+buster), securedrop-workstation-svs-disp:amd64 (0.2.0+buster, 0.2.0-dev-20200302-061051+buster)
End-Date: 2020-03-02  09:12:21

Start-Date: 2020-03-03  08:14:17
Commandline: apt-get -q -y -o DPkg::Options::=--force-confold -o DPkg::Options::=--force-confdef dist-upgrade
Upgrade: securedrop-log:amd64 (0.1.0-dev-20200302-060514+buster, 0.1.0-dev-20200303-060618+buster), securedrop-workstation-svs-disp:amd64 (0.2.0-dev-20200302-061051+buster, 0.2.1-dev-20200303-061304+buster)
End-Date: 2020-03-03  08:14:27

@eloquence
Member

sudo apt upgrade applies the updates in each case, though on proxy manual input is required because of a formatting change in /etc/apt/sources.list.d/debian.list

Under normal operation, is the updater expected to automatically handle cases (e.g., by using the default option) where manual input would be required if sudo apt upgrade were run from the command line?
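For dpkg conffile prompts, at least, the updater already runs apt non-interactively, as the history.log above shows; whether the debian.list prompt is of that kind is an assumption here:

apt-get -q -y -o DPkg::Options::=--force-confold -o DPkg::Options::=--force-confdef dist-upgrade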

@emkll
Contributor Author

emkll commented Mar 4, 2020

On VMs that fail to update, I observe a salt error in /var/log/syslog of the template:

Mar  4 13:34:20 localhost qrexec-agent[446]: executed root:QUBESRPC qubes.VMRootShell disp-mgmt-sd-viewer-buster-temp pid 903
Mar  4 13:34:20 localhost systemd[1]: Started Session c8 of user root.
Mar  4 13:34:21 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp: SALT_ARGV: ['/usr/bin/python3', '/var/tmp/.root_62a99a_salt/salt-call', '--retcode-passthrough', '--local', '--metadata', '--out', 'json', '-l', 'quiet', '-c', '/var/tmp/.root_62a99a_salt', '--', 'state.pkg', '/var/tmp/.root_62a99a_salt/salt_state.tgz', 'test=None', 'pkg_sum=49c691c134ffcbd38761065dc7be2c975fac7b2f523ad788698de2dcbc40338b', 'hash_type=sha256'] 
Mar  4 13:34:21 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp: _edbc7885e4f9aac9b83b35999b68d015148caf467b78fa39c05f669c0ff89878 
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp: Traceback (most recent call last): 
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:   File "/var/tmp/.root_62a99a_salt/salt-call", line 27, in <module> 
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:     salt_call() 
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:   File "/var/tmp/.root_62a99a_salt/pyall/salt/scripts.py", line 445, in salt_call 
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:     client.run()
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:   File "/var/tmp/.root_62a99a_salt/pyall/salt/cli/call.py", line 57, in run
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:     caller.run()
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:   File "/var/tmp/.root_62a99a_salt/pyall/salt/cli/caller.py", line 119, in run
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:     ret = self.call()
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:   File "/var/tmp/.root_62a99a_salt/pyall/salt/cli/caller.py", line 232, in call
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp:     func.__module__].__context__.get('retcode', 0)
Mar  4 13:34:27 localhost qubes.VMRootShell-disp-mgmt-sd-viewer-buster-temp: KeyError: 'salt.loaded.int.module.state'
Mar  4 13:34:27 localhost qrexec-agent[446]: send exit code 1
Mar  4 13:34:27 localhost qrexec-agent[446]: pid 903 exited with 1
Mar  4 13:34:27 localhost systemd[1]: session-c8.scope: Succeeded.
Mar  4 13:34:27 localhost qrexec-agent[446]: eintr
Mar  4 13:34:28 localhost qubes-gui[452]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Mar  4 13:34:28 localhost qubes-gui[452]:       after 134 requests (134 known processed) with 0 events remaining.

@conorsch
Contributor

conorsch commented Mar 4, 2020

Confirming that the above stack trace appears in TemplateVM syslogs when VMs fail to update. Meanwhile, the external output of the salt command in dom0 is deceptively benign:

[user@dom0 ~]$ sudo qubesctl --show-output --skip-dom0 --target debian-10-1 state.sls update.qubes-vm
debian-10-1:

[user@dom0 ~]$ echo $?
0

In the above test, debian-10-1 is a direct clone of debian-10. Other TemplateVMs, however, update just fine, so we still need to identify the root cause of the breakage in any given VM.
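Since the dom0 exit code is misleading, one way to check whether a given template hit this failure is to search its syslog for the traceback (a sketch, run inside the TemplateVM):

sudo grep -B1 -A1 "KeyError: 'salt.loaded.int.module.state'" /var/log/syslog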

@emkll
Contributor Author

emkll commented Mar 4, 2020

Opened ticket upstream QubesOS/qubes-issues#5705

@emkll
Contributor Author

emkll commented Mar 4, 2020

The problem does not appear to be isolated to update.qubes-vm, but rather seems to be a generalized salt issue: a clean Workstation install on Qubes 4.0.3 no longer works, TemplateVMs are not provisioned, and I observe the same error as above in the logs.

@emkll
Contributor Author

emkll commented Mar 4, 2020

A workaround for this issue, which I've tested locally: dnf downgrade salt in the fedora-30 template. It's unfortunately somewhat impractical given how we enforce auto-updates, but we now understand the root cause. I have updated the upstream ticket.

@emkll
Contributor Author

emkll commented Mar 5, 2020

Temporary workaround (downgrades and freezes upstream salt packages):

Run the following command in Fedora-30 TemplateVM:

sudo dnf downgrade -y salt && echo "exclude=salt salt-ssh" | sudo tee -a /etc/dnf/dnf.conf

To revert the changes, run the following command in the Fedora-30 TemplateVM:

sudo sed -i '/^exclude=/ d' /etc/dnf/dnf.conf
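A quick sanity check after either step (assuming no other exclude= lines are in use in dnf.conf):

grep '^exclude=' /etc/dnf/dnf.conf || echo 'no package excludes configured'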

@conorsch
Contributor

Performed a prod install on test hardware. To pull in the updated upstream package, specifically v4.0.9 of qubes-mgmt-salt-dom0-update, I ran sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing, as documented here. All VMs installed correctly.

Additionally, to check the upgrade behavior, I waited until updates were available for whonix-ws-15, then used the GUI Qubes updater to apply them. After comparing the versions installed in the TemplateVM before and after the run, I confirmed that the packages were properly updated via salt.

Leaving this ticket open since there's still a workaround required for prod installs:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

which should be run as the first step on a new machine. Once v4.0.9 lands in the stable repo, we can close out this issue.
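To confirm the fixed package has landed, a simple check in dom0 (the expected version per this thread is 4.0.9 or later):

rpm -q qubes-mgmt-salt-dom0-update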

@conorsch
Contributor

v4.0.9 of qubes-mgmt-salt-dom0-update has been promoted to stable (see the upstream issue); however, I'm having trouble pulling it in:

[screenshot: qubes-dom0-update failure while pulling in qubes-mgmt-salt-dom0-update v4.0.9]

I've not enabled testing repos on the machine where I ran those commands. Might be a transient error during repo updates. Can anyone else reproduce?
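For anyone retrying, a minimal dom0 attempt targeting just that package (qubes-dom0-update accepts package names as arguments):

sudo qubes-dom0-update qubes-mgmt-salt-dom0-update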

@kushaldas
Contributor

kushaldas commented Mar 25, 2020

I've not enabled testing repos on the machine where I ran those commands. Might be a transient error during repo updates. Can anyone else reproduce?

@conorsch I managed to install/update it properly; sudo qubes-dom0-update managed to pull it in. It must be an issue with the rpm repository not being synced properly.

@emkll
Contributor Author

emkll commented Mar 25, 2020

Reporting the same as @kushaldas: the update worked well for me locally. I reverted the dnf changes in the fedora-30 template, and both GUI updaters worked as expected.

@redshiftzero
Contributor

Confirming that after updating qubes-mgmt-salt-dom0-update and fedora-30's salt-ssh, running our updater works as expected now.

@kushaldas
Contributor

Confirming that after updating qubes-mgmt-salt-dom0-update and fedora-30's salt-ssh, running our updater works as expected now.

Double-confirming the same. Everything is finally updated on this side.

@emkll
Contributor Author

emkll commented Mar 25, 2020

Confirming that both the Qubes updater and the workstation updater work well on a clean install (4.0.3).

@conorsch
Contributor

Transient error indeed: all updates successfully applied this morning. 🎉
