Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to restore vm with disk size smaller than default #7176

Closed
a-barinov opened this issue Jan 2, 2022 · 23 comments · Fixed by QubesOS/qubes-core-admin-client#238
Closed
Labels
affects-4.1 This issue affects Qubes OS 4.1. C: core diagnosed Technical diagnosis has been performed (see issue comments). P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. pr submitted A pull request has been submitted for this issue. r4.1-bookworm-cur-test r4.1-bullseye-cur-test r4.1-buster-cur-test r4.1-centos-stream8-cur-test r4.1-dom0-stable r4.1-fc36-stable r4.1-fc37-stable r4.1-fc38-stable r4.2-host-stable r4.2-vm-bookworm-stable r4.2-vm-bullseye-stable r4.2-vm-centos-stream8-stable r4.2-vm-fc36-stable r4.2-vm-fc37-stable r4.2-vm-fc38-cur-test T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@a-barinov
Copy link

Qubes OS release

4.1.0-rc3

Brief summary

When trying to restore a TemplateVM with root or private volume smaller than default (10G and 2G respectively), restore fails with error "not enough data".
I have not tested other vm types or backups created in Qubes 4.0.

Steps to reproduce

  1. Create a new TemplateVM.
  2. Shrink root volume to 2.5Gb and private volume to 0.5Gb, this requires a bit of manual fs, lvm and partition (in case of root volume) resizing.
  3. Backup the VM.
  4. Try restoring from the backup.

Expected behavior

VM getting restored from backup and can be started without errors.

Actual behavior

Trying to restore, you get the following error:

qubesadmin.backup: -> Restoring debian-full...
qubesadmin.backup: Extracting data: 1.8 GiB to restore
qubesadmin.backup: Failed to restore volume root of VM debian-full1: Data import failed: not enough data (copied 2684354560 bytes, expected 10737418240 bytes)
qubesadmin.backup: Failed to restore volume private of VM debian-full1: Data import failed: not enough data (copied 536870912 bytes, expected 2147483648 bytes)
qubesadmin.backup: -> Done.

Restored VM then cannot start.

@a-barinov a-barinov added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. labels Jan 2, 2022
@andrewdavidwong andrewdavidwong added C: core needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Jan 2, 2022
@andrewdavidwong andrewdavidwong added this to the Release 4.1 milestone Jan 2, 2022
@rustybird
Copy link

  1. This line of the restore code should check for inequality

  2. Storage driver support:

    1. file-reflink: works
    2. lvm_thin: see this comment
    3. legacy file: looks similar to lvm_thin

@marmarek
Copy link
Member

marmarek commented Jan 2, 2022

qubesadmin.backup: Failed to restore volume root of VM debian-full1: Data import failed: not enough data (copied 2684354560 bytes, expected 10737418240 bytes)

This is 2.5G, while the root volume is 10G normally, so it isn't just about missing resize, it really got not enough data.
EDIT: not enough coffee...

@andrewdavidwong andrewdavidwong added diagnosed Technical diagnosis has been performed (see issue comments). and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Jan 3, 2022
@palainp
Copy link

palainp commented Apr 5, 2022

I ran into the same type of issue with a template build for a mirageOS unikernel (qubes-mirage-firewall). My builder.conf have TEMPLATE_ROOT_SIZE ?= 50MiB which should be enough for the unikernel and I can see in the build-logs/template-mirage.log:

...
--> Creating template config file...
--> Linking /home to /rw/home...
--> Linking /usr/local to /rw/usrlocal...
Reducing image size (calling cleanup_image)...
9268 -rw-r--r-- 1 user user 52428800 Apr  5 09:53 qubeized_images/mirage-firewall/root.img
--> Cleaning up image file...
--> Compacting image file...
mnt: 40 MiB (41944064 bytes) trimmed
9204 -rw-r--r-- 1 user user 52428800 Apr  5 09:53 qubeized_images/mirage-firewall/root.img
--> Unmounting qubeized_images/mirage-firewall/root.img
Qubeized image stored at: qubeized_images/mirage-firewall/root.img
...

and when I try to install it into dom0:

...
mirage-firewall: Importing data
Traceback (most recent call last):
  File "/bin/qvm-template-postprocess", line 5, in <module>
    sys.exit(main())
  File "/usr/lib/python3.8/site-packages/qubesadmin/tools/qvm_template_postprocess.py", line 431, in main
    loop.run_until_complete(post_install(args))
  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/lib/python3.8/site-packages/qubesadmin/tools/qvm_template_postprocess.py", line 297, in post_install
    import_root_img(vm, args.dir)
  File "/usr/lib/python3.8/site-packages/qubesadmin/tools/qvm_template_postprocess.py", line 107, in import_root_img
    vm.volumes['root'].import_data(stream=tar.stdout)
  File "/usr/lib/python3.8/site-packages/qubesadmin/storage.py", line 270, in import_data
    self._qubesd_call('Import', payload_stream=stream)
  File "/usr/lib/python3.8/site-packages/qubesadmin/storage.py", line 76, in _qubesd_call
    return self.app.qubesd_call(
  File "/usr/lib/python3.8/site-packages/qubesadmin/app.py", line 730, in qubesd_call
    return self._parse_qubesd_response(stdout)
  File "/usr/lib/python3.8/site-packages/qubesadmin/base.py", line 109, in _parse_qubesd_response
    raise exc_class(format_string, *args)
qubesadmin.exc.QubesException: Data import failed: not enough data (copied 52428800 bytes, expected 10737418240 bytes)
warning: %post(qubes-template-mirage-firewall-4.0.6-202204051348.noarch) scriptlet failed, exit status 1
...

The root.img size seems to be the same from the logfile and in dom0, but the "< default=10G" size trigger the issue.

EDIT: add the full backtrace

@palainp
Copy link

palainp commented Apr 5, 2022

Not sure if that can help, and my knowledge does not permit me to try that patch, but as we seem to have the root.img size in qvm_template_postprocess.py:def import_root_img(vm, source_dir), can it be possible to change (https://github.com/QubesOS/qubes-core-admin-client/blob/400ea49e84bfba30f50b658ca0b23e4bb18c52f2/qubesadmin/tools/qvm_template_postprocess.py#L119)?

        with subprocess.Popen(['tar', 'xSOf', root_path + '.tar'],
                stdout=subprocess.PIPE) as tar:
-            vm.volumes['root'].import_data(stream=tar.stdout)
+            vm.volumes['root'].import_data_with_size(stream=tar.stdout, root_size) 
        if tar.returncode != 0:
            raise qubesadmin.exc.QubesException('root.img extraction failed')

And also at https://github.com/QubesOS/qubes-core-admin-client/blob/400ea49e84bfba30f50b658ca0b23e4bb18c52f2/qubesadmin/tools/qvm_template_postprocess.py#L134

@rustybird
Copy link

@palainp Neat! import_data_with_size() instead of the current resize() + import_data() would work even for storage drivers that don't support shrinking.

qubesadmin.backup.restore has the relevant code, but there's lots of indirection so it might have to be refactored a bit.

@a-barinov
Copy link
Author

OK, I was kind of able to fix this, but due to too many API layers this is not the cleanest fix.

Contrary to what @rustybird said previously, the error message is actually triggered in admin.vm.volume.Import script, section starting from line 69. Commenting it out will get rid of the error, but your restored volumes will not be resized. I therefore decided to look for a cleaner way which sent me down a rabbit hole...

First, inspired by @palainp I made a similar change to restore code in restore.py

+++ restore.py  2022-04-30 20:04:24.962001232 +0200
@@ -1866,7 +1866,7 @@
     def _handle_volume_data(self, vm, volume, stream):
         '''Wrap volume data import with logging'''
         try:
-            volume.import_data(stream)
+            volume.import_data_with_size(stream=stream, size=volume.size)
         except Exception as err:  # pylint: disable=broad-except
             self.log.error('Failed to restore volume %s of VM %s: %s',
                 volume.name, vm.name, err)

This was not enough though, as for whatever reason a function just below patched prohibits volume downsizing during restore, which I consider to be a bug. So one additional fix is needed:

+++ restore.py  2022-04-30 20:04:24.962001232 +0200
@@ -1874,8 +1874,7 @@
     def _handle_volume_size(self, vm, volume, size):
         '''Wrap volume resize with logging'''
         try:
-            if volume.size < size:
-                volume.resize(size)
+            volume.resize(size)
         except Exception as err:  # pylint: disable=broad-except
             self.log.error('Failed to resize volume %s of VM %s to %d: %s',
                 volume.name, vm.name, size, err)

This fixes restore code to properly take care of volume downsizing. I was hoping that this will do the job, but turned out the underlying storage driver cannot downsize volumes! I think this was done to prevent user from reducing volume size from qube settings or commend line (which is a right thing to do), but baking this into storage backend does sounds way too extreme to me. Anyway, I'm not keen to edit underlying Qubes API so I decided to fully restore the ability to downsize volumes.

As I use LVM backend, all the changes I needed are limited to lvm.py.

First, I had to remove backend restriction to downsize volumes:

+++ lvm.py  2022-04-30 21:00:51.311000681 +0200
@@ -607,12 +607,12 @@
             msg = 'Can not resize reađonly volume {!s}'.format(self)
             raise qubes.storage.StoragePoolException(msg)
 
-        if size < self.size:
-            raise qubes.storage.StoragePoolException(
-                'For your own safety, shrinking of %s is'
-                ' disabled (%d < %d). If you really know what you'
-                ' are doing, use `lvresize` on %s manually.' %
-                (self.name, size, self.size, self.vid))
 
         if size == self.size:
             return

Second, we need to allow extend operation to downsize volumes:

+++ lvm.py  2022-04-30 21:00:51.311000681 +0200
@@ -766,7 +766,7 @@
                    '--virtualsize=' + cmd[3] + 'B', '--', cmd[1]]
     elif action == 'extend':
         assert len(cmd) == 3, 'wrong number of arguments for extend'
-        lvm_cmd = ["lvextend", "--size=" + cmd[2] + 'B', '--', cmd[1]]
+        lvm_cmd = ["lvresize", "--force", "--size=" + cmd[2] + 'B', '--', cmd[1]]
     elif action == 'activate':
         assert len(cmd) == 2, 'wrong number of arguments for activate'
         lvm_cmd = ['lvchange', '--activate=y', '--', cmd[1]]

Keep in mind this will likely enable system-wide LVM volumes downsize. A cleaner approach would be to modify Qubes internal API so that restore operation could indicate that volume downsize shall be allowed. Alternatively, all the downsize checks can be move to appropriate frontends, leaving my added backend downsize capability intact. I definitely prefer second solution as more architecturally correct ne.

@DemiMarie
Copy link

@a-barinov You, I, and @marmarek all agree that the downsize checks do not belong in qubesd. Care to send a PR with your changes?

@a-barinov
Copy link
Author

You'll have it in a few days. Let me verify that other frontends have appropriate downsize checks, and also check other backends.

rustybird added a commit to rustybird/qubes-core-admin-client that referenced this issue Apr 10, 2023
Migrate from resize() followed by unspecified-size import_data() to the
newer method import_data_with_size(), which fixes two bugs.

1. When restoring a volume with a size that's smaller than the default,
   the resize step didn't attempt to shrink (and not all storage drivers
   would have supported it anyway), which caused the import step to fail
   with the message "not enough data":

Fixes QubesOS/qubes-issues#7176

2. When restoring a volume with a size that's not divisible by 4 MiB
   (e.g. backed up from file-reflink), LVM Thin implicitly rounds up the
   resize to next multiple, which again caused the import step to fail
   in the same way:

https://forum.qubes-os.org/t/backup-didnt-restore-private-volume/17874
marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Apr 13, 2023
Allow admin.vm.volume.ImportWithSize too, not only admin.vm.volume.Import.

QubesOS/qubes-issues#7176
@andrewdavidwong andrewdavidwong added the pr submitted A pull request has been submitted for this issue. label Apr 17, 2023
@qubesos-bot
Copy link

Automated announcement from builder-github

The package core-admin-client has been pushed to the r4.2 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing bullseye-testing (or appropriate equivalent for your template version), then use the standard update command:

sudo apt-get update && sudo apt-get dist-upgrade

Changes included in this update

@qubesos-bot
Copy link

Automated announcement from builder-github

The package core-admin-client has been pushed to the r4.2 testing repository for the CentOS centos-stream8 template.
To test this update, please install it with the following command:

sudo yum update --enablerepo=qubes-vm-r4.2-current-testing

Changes included in this update

@qubesos-bot
Copy link

Automated announcement from builder-github

The component core-admin-client (including package core-admin-client) has been pushed to the r4.2 stable repository for the Fedora template.
To install this update, please use the standard update command:

sudo dnf update

Changes included in this update

@qubesos-bot
Copy link

Automated announcement from builder-github

The component core-admin-client (including package core-admin-client) has been pushed to the r4.2 stable repository for the Fedora template.
To install this update, please use the standard update command:

sudo dnf update

Changes included in this update

marmarek pushed a commit to QubesOS/qubes-core-admin-client that referenced this issue Jun 22, 2023
Migrate from resize() followed by unspecified-size import_data() to the
newer method import_data_with_size(), fixing two bugs.

1. When restoring a volume with a size that's smaller than the default,
   the resize step didn't attempt to shrink (and not all storage drivers
   would have supported it anyway), which caused the import step to fail
   with the message "not enough data":

Fixes QubesOS/qubes-issues#7176

2. When restoring a volume with a size that's not divisible by 4 MiB
   (e.g. backed up from file-reflink), LVM Thin implicitly rounds up the
   resize to next multiple, which again caused the import step to fail
   in the same way:

https://forum.qubes-os.org/t/backup-didnt-restore-private-volume/17874
(cherry picked from commit ba9b24d)
@qubesos-bot
Copy link

Automated announcement from builder-github

The component core-admin-client (including package python3-qubesadmin-4.1.31-1.fc32) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot
Copy link

Automated announcement from builder-github

The component core-admin-client (including package python3-qubesadmin-4.1.31-1.fc32) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot
Copy link

Automated announcement from builder-github

The package qubes-core-admin-client_4.1.31-1 has been pushed to the r4.1 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing buster-testing (or appropriate equivalent for your template version), then use the standard update command:

sudo apt-get update && sudo apt-get dist-upgrade

Changes included in this update

@andrewdavidwong andrewdavidwong added the affects-4.1 This issue affects Qubes OS 4.1. label Aug 8, 2023
@qubesos-bot
Copy link

Automated announcement from builder-github

The package core-admin-client has been pushed to the r4.1 testing repository for the CentOS centos-stream8 template.
To test this update, please install it with the following command:

sudo yum update --enablerepo=qubes-vm-r4.1-current-testing

Changes included in this update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-4.1 This issue affects Qubes OS 4.1. C: core diagnosed Technical diagnosis has been performed (see issue comments). P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. pr submitted A pull request has been submitted for this issue. r4.1-bookworm-cur-test r4.1-bullseye-cur-test r4.1-buster-cur-test r4.1-centos-stream8-cur-test r4.1-dom0-stable r4.1-fc36-stable r4.1-fc37-stable r4.1-fc38-stable r4.2-host-stable r4.2-vm-bookworm-stable r4.2-vm-bullseye-stable r4.2-vm-centos-stream8-stable r4.2-vm-fc36-stable r4.2-vm-fc37-stable r4.2-vm-fc38-cur-test T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants