HVM is significantly slower than PVH (Xen 4.14) #6174
I cannot reproduce it locally, but it seems HVM is significantly slower than PVH (at least its boot time). This may be slow enough on openQA (thanks to nested virt) to hit the timeout.
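For anyone trying to reproduce this locally, a rough way to compare the two modes is to time the start of the same VM in PVH and then HVM mode (the VM name below is just an example, and `qvm-start` returning is only a coarse proxy for boot completion):

```sh
# Boot the same standalone VM in both modes and compare wall-clock start time.
qvm-prefs fedora-32-test virt_mode pvh
time qvm-start fedora-32-test
qvm-shutdown --wait fedora-32-test

qvm-prefs fedora-32-test virt_mode hvm
time qvm-start fedora-32-test

# Inside the VM, systemd-analyze gives a rough per-phase breakdown of the boot.
qvm-run -p fedora-32-test systemd-analyze
```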
Attempted to test this, just providing observations: on the very first boot the VM did not start, and the errors observed on the console involved systemd-udevd invoking the oom-killer 7 seconds after boot. Unsure if related, as the logs do not line up with what is originally posted. Attached: fed32-hvm-oom.txt. Could not reproduce after the very first boot, even with changing memory values and keeping the in-VM kernel:
Ok, so this is OOM in the initramfs, where swap is not enabled yet. And apparently Fedora now runs fsck there... That said, I did once get the oom-killer killing systemd-udevd in a fedora-32 based VM (after the initramfs phase, but still before enabling swap), with the dom0-provided kernel. Where are the times when Linux was happy with 128MB RAM in total...
I suspect it largely depends on config options. There are Linux devices (Azure Sphere) running happily with 4MiB of RAM. And fsck is known to be somewhat RAM-intensive, especially on large partitions. That's why OpenBSD turns on swap first, and I think we should do the same.
Ideally, we should enable swap before doing just about anything else.
Grub scripts are very persistent in trying to use whatever is currently mounted as /. Even if (in a TemplateVM) /dev/xvda3 is currently mounted directly, all the configuration should use /dev/mapper/dmroot so that it also works in an AppVM. GRUB_DEVICE is used in various places as the root device (including constructing the root= parameter in some versions). Force it to /dev/mapper/dmroot. QubesOS/qubes-issues#6174
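For illustration, a minimal sketch of forcing this in `/etc/default/grub` (the exact change in the Qubes packages may be implemented differently; this relies on grub-mkconfig sourcing `/etc/default/grub` after it computes `GRUB_DEVICE`):

```sh
# /etc/default/grub (sketch)
# Always point the generated root= parameter at the device-mapper name,
# so the same config works in both TemplateVMs and AppVMs.
GRUB_DEVICE=/dev/mapper/dmroot
# Use the device path rather than a filesystem UUID for root=.
GRUB_DISABLE_LINUX_UUID=true
```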
fsck may require a significant amount of RAM; enable swap earlier to avoid an out-of-memory condition. QubesOS/qubes-issues#6174
fsck may require a significant amount of RAM; enable swap earlier to avoid an out-of-memory condition. Implement this as a separate service unit, not a swap unit, because the latter requires udev to be running (implicit dependency on dev-xvdc1.device), which is not the case before remounting the root filesystem read-write. QubesOS/qubes-issues#6174
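A minimal sketch of what such a unit could look like (the unit name and ordering targets are illustrative, not necessarily what Qubes ships; it assumes the swap device is `/dev/xvdc1`, already formatted as swap, as implied by the `dev-xvdc1.device` dependency above):

```ini
# early-swap.service (hypothetical name)
[Unit]
Description=Enable swap early, before fsck can run the VM out of memory
DefaultDependencies=no
# Check for the plain device node instead of depending on dev-xvdc1.device,
# so the unit does not pull in udev.
ConditionPathExists=/dev/xvdc1
Before=local-fs-pre.target systemd-fsck-root.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/swapon /dev/xvdc1

[Install]
WantedBy=sysinit.target
```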
Swap may be some factor here, but definitely not the only one. HVM seems to boot significantly slower even with the same kernel from dom0.
Part of it is because of stubdomain startup (10s between …).
One more thing worth testing: …
From quick testing, … What does appear to be constant is a ~4 second pause during …
Fixes QubesOS/qubes-issues#6174

Conflicts:
	qemu/patches/series

Fixup, since the other branch has more patches.
Relying on RDRAND for security / entropy quality is strongly discouraged anyhow, as per: …
In the context of this issue it is not a problem, because the stubdomain does not use the RNG for any security-critical task. There is no crypto involved, etc. One could argue it may make ASLR for qemu less effective, but we don't consider qemu trusted, so it is not a huge deal (and remember the RDRAND issues are still very hypothetical - see below). In the broader context of RDRAND, I don't think we should worry about backdoors there. Or rather: if you consider intentional backdoors in your CPU a valid threat, throw away that CPU. There is really no difference in how such a hypothetical backdoor could work - whether that would be a predictable RDRAND, a reaction to some magic value in any other instruction, or anything else. We could worry about its effectiveness - unintentional bugs - which indeed is hard to reason about, since it is opaque.
Linux inside an HVM will allocate 64MB for bouncing DMA (SWIOTLB) by default. If no real PCI device is assigned, that's way too much, and wastes over 15% of the VM's initial memory. With real PCI devices it's usually too much too, but that's very device-specific, so don't risk breaking it. In the other cases, reduce the default to 4MB. Note a PVH domain will not allocate SWIOTLB anyway, as no PCI devices are there at all. This difference contributes to the VM start time, so reducing SWIOTLB should also improve that part. QubesOS/qubes-issues#6174 (cherry picked from commit c774fd4)
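For reference, the SWIOTLB size can be tuned from the kernel command line, which is one way to implement the reduction described above (whether Qubes applies it exactly this way is not shown here); the parameter counts 2 KiB slabs:

```sh
# Append to the kernel command line, e.g. via /etc/default/grub (sketch):
#   2048 slabs x 2 KiB = 4 MiB, versus the default 32768 slabs x 2 KiB = 64 MiB.
GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX swiotlb=2048"
```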
This issue is being closed because: …
If anyone believes that this issue should be reopened, please leave a comment saying so.
Observation
openQA test in scenario qubesos-4.1-pull-requests-x86_64-system_tests_pvgrub_salt_storage@64bit fails in TC_41_HVMGrub_fedora-32
Possibly related VM console entries:
Note the double root=. That isn't necessarily the root cause. The same test works on debian-10.
Test suite description
Set up a fedora-32 StandaloneVM (HVM) with 'kernel' set to none.
Reproducible
Fails since (at least) Build 2020103122-4.1 (current job)
Expected result
Last good: 2020103116-4.1 (or more recent)
Further details
Always latest result in this scenario: latest