
qvm-backup error: Connection to qubesd terminated, followed by traceback #5004

Closed
andrewdavidwong opened this issue Apr 28, 2019 · 13 comments
Labels
C: core, P: major, r4.0-dom0-stable, r4.1-dom0-stable, T: bug

Comments

@andrewdavidwong (Member)

Qubes OS version

R4.0

Affected component(s) or functionality

qvm-backup

Brief summary

While making a backup, qvm-backup fails with the message Connection to qubesd terminated, reconnecting in 1.0 seconds, followed by the usual VM list and a traceback (see below).

To Reproduce

$ qvm-backup [...]
Making a backup... 0% [...] Making a backup... 63.58%app: Connection to qubesd terminated, reconnecting in 1.0 seconds

Backup error: Got empty response from qubesd. See journalctl in dom0 for details.
------------------+--------------+--------------+
               VM |         type |         size |
------------------+--------------+--------------+
[...]
------------------+--------------+--------------+
      Total size: |                    20.0 GiB |
------------------+--------------+--------------+
VMs not selected for backup:
[...]


Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/qubesadmin/app.py", line 540, in qubesd_call
    client_socket.connect(qubesadmin.config.QUBESD_SOCKET)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/qvm-run", line 5, in <module>
    sys.exit(main())
  File "/usr/lib/python3.5/site-packages/qubesadmin/tools/qvm_run.py", line 199, in main
    args = parser.parse_args(args, app=app)
  File "/usr/lib/python3.5/site-packages/qubesadmin/tools/__init__.py", line 396, in parse_args
    action.parse_qubes_app(self, namespace)
  File "/usr/lib/python3.5/site-packages/qubesadmin/tools/__init__.py", line 170, in parse_qubes_app
    namespace.domains += [app.domains[vm_name]]
  File "/usr/lib/python3.5/site-packages/qubesadmin/app.py", line 86, in __getitem__
    if not self.app.blind_mode and item not in self:
  File "/usr/lib/python3.5/site-packages/qubesadmin/app.py", line 107, in __contains__
    self.refresh_cache()
  File "/usr/lib/python3.5/site-packages/qubesadmin/app.py", line 60, in refresh_cache
    'admin.vm.List'
  File "/usr/lib/python3.5/site-packages/qubesadmin/app.py", line 543, in qubesd_call
    'Failed to connect to qubesd service: %s', str(e))
qubesadmin.exc.QubesDaemonCommunicationError: Failed to connect to qubesd service: [Errno 2] No such file or directory

Additional context

This qvm-backup is run as part of a nightly script. It has worked fine, with no changes, on almost all other nights for a long time (months to years). I made no configuration changes between this run and the previous successful run.
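For reference, a quick way to confirm whether qubesd is actually down when this happens (suggested checks only; the socket path is printed from qubesadmin.config, the same constant referenced in the traceback above):

systemctl status qubesd
python3 -c 'import qubesadmin.config; print(qubesadmin.config.QUBESD_SOCKET)'

If the unit is inactive, or the printed socket path does not exist, the client tools fail exactly as in the traceback above.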

andrewdavidwong added the T: bug, C: core, and P: default labels on Apr 28, 2019
andrewdavidwong added this to the Release 4.0 updates milestone on Apr 28, 2019
@marmarek (Member)

Looks like qubesd was restarted during the process. Maybe you've installed updates at the same time? Anyway, see journalctl messages about that time.
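For example, to pull just that window from the journal (the timestamps below are illustrative, taken from the failure time in this report):

journalctl --since "2019-04-28 04:15:00" --until "2019-04-28 04:25:00"
journalctl -u qubesd --since "2019-04-28 04:15:00"

The second command restricts the output to the qubesd unit itself.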

@andrewdavidwong (Member, Author)

> Maybe you've installed updates at the same time?

Nope.

> Anyway, see journalctl messages about that time.

Here it is:

Apr 28 04:20:02 dom0 systemd[1]: Stopping Qubes OS daemon...
Apr 28 04:20:02 dom0 qubesd[3069]: caught SIGTERM, exiting
Apr 28 04:20:02 dom0 widget-wrapper[5179]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:02 dom0 kernel: audit: type=1131 audit(1556443202.175:2015): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:02 dom0 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:02 dom0 libvirtd[3111]: 2019-04-28 09:20:02.174+0000: 3111: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
Apr 28 04:20:02 dom0 widget-wrapper[5171]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:02 dom0 systemd[1]: Stopped Qubes OS daemon.
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: balance_when_enough_memory(xen_free_memory=57244247, total_mem_pref=2135820288.0, total_available_memory=18571457111.0)
Apr 28 04:20:02 dom0 qubesd[3069]: Traceback (most recent call last):
Apr 28 04:20:02 dom0 qubesd[3069]:   File "/usr/lib64/python3.5/runpy.py", line 193, in _run_module_as_main
Apr 28 04:20:02 dom0 qubesd[3069]:     "__main__", mod_spec)
Apr 28 04:20:02 dom0 qubesd[3069]:   File "/usr/lib64/python3.5/runpy.py", line 85, in _run_code
Apr 28 04:20:02 dom0 qubesd[3069]:     exec(code, run_globals)
Apr 28 04:20:02 dom0 qubesd[3069]:   File "/usr/lib/python3.5/site-packages/qubes/tarwriter.py", line 179, in <module>
Apr 28 04:20:02 dom0 qubesd[3069]:     main()
Apr 28 04:20:02 dom0 qubesd[3069]:   File "/usr/lib/python3.5/site-packages/qubes/tarwriter.py", line 169, in main
Apr 28 04:20:02 dom0 qubesd[3069]:     copy_sparse_data(input_file, output, sparse_map)
Apr 28 04:20:02 dom0 qubesd[3069]:   File "/usr/lib/python3.5/site-packages/qubes/tarwriter.py", line 126, in copy_sparse_data
Apr 28 04:20:02 dom0 qubesd[3069]:     output_stream.write(buf[:read])
Apr 28 04:20:02 dom0 qubesd[3069]: BrokenPipeError: [Errno 32] Broken pipe
Apr 28 04:20:02 dom0 widget-wrapper[5166]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:02 dom0 widget-wrapper[5166]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:02 dom0 systemd[1]: Starting Qubes OS daemon...
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: left_memory=411059886 acceptors_count=1
Apr 28 04:20:03 dom0 widget-wrapper[5179]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:03 dom0 widget-wrapper[5166]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:03 dom0 widget-wrapper[5166]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:03 dom0 widget-wrapper[5171]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Apr 28 04:20:03 dom0 systemd[1]: Started Qubes OS daemon.
Apr 28 04:20:03 dom0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:03 dom0 kernel: audit: type=1130 audit(1556443203.759:2016): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

@marmarek (Member)

> Apr 28 04:20:02 dom0 systemd[1]: Stopping Qubes OS daemon...

Ok, now the question is: why?
Wild guess - logrotate triggered at that time, which does restart qubesd to reopen its logs? Any signs of that?
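Two ways to check for that (suggestions only; the exact file names under /etc/logrotate.d/ may differ between releases):

journalctl --since "2019-04-28 04:15:00" --until "2019-04-28 04:25:00" | grep -iE 'logrotate|cron'
grep -rA3 postrotate /etc/logrotate.d/ /etc/logrotate.conf

The first shows whether the daily logrotate run fell into the same window; the second lists any postrotate hook that restarts qubesd.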

@andrewdavidwong (Member, Author)

> Wild guess - logrotate triggered at that time, which does restart qubesd to reopen its logs? Any signs of that?

Yes, looks like you're right:

Apr 28 04:19:49 dom0 qmemman.systemstate[3078]: mem-set domain 0 to 16489283638
Apr 28 04:19:49 dom0 qmemman.systemstate[3078]: mem-set domain 218 to 4194304000
Apr 28 04:20:01 dom0 anacron[27544]: Job `cron.daily' started
Apr 28 04:20:01 dom0 run-parts[28013]: (/etc/cron.daily) starting logrotate
Apr 28 04:20:01 dom0 systemd[1]: Stopping Qubes memory management daemon...
Apr 28 04:20:01 dom0 systemd[1]: Stopped Qubes memory management daemon.
Apr 28 04:20:01 dom0 kernel: audit: type=1131 audit(1556443201.727:2013): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubes-qmemman comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:01 dom0 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubes-qmemman comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:01 dom0 systemd[1]: Starting Qubes memory management daemon...
Apr 28 04:20:02 dom0 qmemman.daemon[28033]: MIN_PREFMEM=419430400 DOM0_MEM_BOOST=367001600 CACHE_FACTOR=1.3
Apr 28 04:20:02 dom0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubes-qmemman comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:02 dom0 systemd[1]: Started Qubes memory management daemon.
Apr 28 04:20:02 dom0 kernel: audit: type=1130 audit(1556443202.045:2014): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubes-qmemman comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: balance_when_enough_memory(xen_free_memory=57244247, total_mem_pref=0, total_available_memory=57244247)
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: balance_when_enough_memory(xen_free_memory=57244247, total_mem_pref=0, total_available_memory=57244247)
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: balance_when_enough_memory(xen_free_memory=57244247, total_mem_pref=1781167308.8, total_available_memory=14748583306.2)
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: balance_when_enough_memory(xen_free_memory=57244247, total_mem_pref=2256655974.4, total_available_memory=18450621424.600002)
Apr 28 04:20:02 dom0 qmemman.daemon.algo[28033]: left_memory=164459468 acceptors_count=1
Apr 28 04:20:02 dom0 systemd[1]: Stopping Qubes OS daemon...
Apr 28 04:20:02 dom0 qubesd[3069]: caught SIGTERM, exiting


andrewdavidwong added the P: major label and removed the P: default label on Aug 4, 2019
rustybird added a commit to rustybird/qubes-core-admin that referenced this issue Feb 4, 2021
stderr already goes to syslog. Stop duplicating the messages to
/var/lib/qubes/qubes.log (or /tmp/qubes.log), so logrotate doesn't have
to restart qubesd - causing problems such as aborted backups.

Fixes QubesOS/qubes-issues#5004
@qubesos-bot

Automated announcement from builder-github

The package qubes-core-dom0-4.1.19-1.fc32 has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update
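To verify afterwards which build ended up installed in dom0 (a standard rpm query; the package name is taken from the announcement above):

rpm -q qubes-core-dom0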

@qubesos-bot

Automated announcement from builder-github

The package qubes-core-dom0-4.1.20-1.fc32 has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@aesrentai

I encountered this same error (R4.0, fully updated dom0). I used the GUI to start a backup, went to bed (only dom0 and system qubes were running), and woke up to the error "Backup error: Got empty response from qubesd. See journalctl in dom0 for details."

Relevant portion of the dom0 log:

Jul 18 03:40:02 dom0 run-parts[19632]: (/etc/cron.daily) starting logrotate
Jul 18 03:40:03 dom0 systemd[1]: Stopping Qubes memory management daemon...
Jul 18 03:40:03 dom0 systemd[1]: Stopped Qubes memory management daemon.
Jul 18 03:40:03 dom0 systemd[1]: Starting Qubes memory management daemon...
Jul 18 03:40:06 dom0 systemd[1]: Started Qubes memory management daemon.
Jul 18 03:40:06 dom0 qubesd[2420]: caught SIGTERM, exiting
Jul 18 03:40:06 dom0 systemd[1]: Stopping Qubes OS daemon...
Jul 18 03:40:06 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 kernel: audit: type=1131 audit(1626604806.515:266): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 18 03:40:06 dom0 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=qubesd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jul 18 03:40:06 dom0 systemd[1]: Stopped Qubes OS daemon.
Jul 18 03:40:07 dom0 libvirtd[2460]: 2021-07-18 10:40:06.515+0000: 2460: error : virNetSocketReadWire:1808 : End of file while reading data: Input/output error
Jul 18 03:40:07 dom0 widget-wrapper[7131]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 widget-wrapper[7147]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:06 dom0 systemd[1]: Starting Qubes OS daemon...
Jul 18 03:40:07 dom0 widget-wrapper[7147]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 widget-wrapper[7131]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:07 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:08 dom0 widget-wrapper[7147]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:08 dom0 widget-wrapper[7131]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:08 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:08 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:09 dom0 widget-wrapper[7147]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:09 dom0 widget-wrapper[7131]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:09 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:09 dom0 widget-wrapper[7140]: Connection to qubesd terminated, reconnecting in 1.0 seconds
Jul 18 03:40:10 dom0 systemd[1]: Started Qubes OS daemon.

It appears the fix was pushed to the R4.1 repos, but I don't know whether it made it to the R4.0 repos. Considering the backup I'm trying to make is absolutely massive, is a workaround just to cancel the logrotate job temporarily?
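One possible stopgap, untested and assuming the restart is triggered by the daily logrotate run via run-parts as shown in the logs above: make the cron.daily script non-executable for the duration of the backup so run-parts skips it, then restore it afterwards.

sudo chmod -x /etc/cron.daily/logrotate   # before starting the long backup
sudo chmod +x /etc/cron.daily/logrotate   # after the backup finishes

Scheduling the backup so it does not overlap the cron.daily window would avoid the problem without touching anything.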

@DemiMarie

Pinging @marmarek

@andrewdavidwong (Member, Author)

I can confirm that this is still a problem on 4.0. Here's a thread that's probably reporting the same thing.

@DemiMarie

This needs to be backported. @marmarek ping

marmarek added the backport pending label on Nov 12, 2021
marmarek pushed a commit to QubesOS/qubes-core-admin that referenced this issue Dec 3, 2021
@qubesos-bot

Automated announcement from builder-github

The component core-admin (including package qubes-core-dom0-4.0.59-1.fc25) has been pushed to the r4.0 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component core-admin (including package qubes-core-dom0-4.0.59-1.fc25) has been pushed to the r4.0 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

DemiMarie removed the backport pending label on Dec 31, 2021