--vm-test fails with ZFS on root after updating lock file #350

Open
Sirius902 opened this issue Jul 17, 2024 · 1 comment

Comments

@Sirius902

I have a configuration based on nixos-anywhere-examples. The only things I've changed are:

  • Changed root filesystem to ZFS.
  • Switched to systemd-boot.
  • Added nixosConfigurations.vm (the same as hetzner-cloud but with disko.devices.disk.disk1.device = "/dev/vda"; to use in QEMU).

With the configuration as-is, nix run github:nix-community/nixos-anywhere -- --flake "path:.#vm" --vm-test succeeds, and installing it in a QEMU guest with nix run github:nix-community/nixos-anywhere -- --flake "path:.#vm" root@<ip> also works.

However, this is no longer the case after updating the lock file with nix flake update. The update changes flake.lock as follows:

• Updated input 'disko':
    'github:nix-community/disko/0b178c0554421a6171fc8afb3fb1675511f31377' (2023-09-26)
  → 'github:nix-community/disko/bad376945de7033c7adc424c02054ea3736cf7c4' (2024-07-15)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/e12483116b3b51a185a33a272bf351e357ba9a99' (2023-09-21)
  → 'github:NixOS/nixpkgs/9355fa86e6f27422963132c2c9aeedb0fb963d93' (2024-07-16)

Running the command with --vm-test now fails with the following error:

error: builder for '/nix/store/qqg5a968s0xabipxnz3s01km3g77i321-vm-test-run-disko-nixos-disko.drv' failed with exit code 1;
       last 10 log lines:
       >     driver.run_tests()
       >   File "/nix/store/1jxaawzgwla9qf3ksnzd4a8h0b1ija5n-nixos-test-driver-1.1/lib/python3.12/site-packages/test_driver/driver.py", line 166, in run_tests
       >     self.test_script()
       >   File "/nix/store/1jxaawzgwla9qf3ksnzd4a8h0b1ija5n-nixos-test-driver-1.1/lib/python3.12/site-packages/test_driver/driver.py", line 158, in test_script
       >     exec(self.tests, symbols, None)
       >   File "<string>", line 46, in <module>
       >   File "/nix/store/1jxaawzgwla9qf3ksnzd4a8h0b1ija5n-nixos-test-driver-1.1/lib/python3.12/site-packages/test_driver/machine.py", line 611, in succeed
       >     raise Exception(f"command `{command}` failed (exit code {status})")
       > Exception: command `test -e /mnt/home/testfile` failed (exit code 1)
       > kill vlan (pid 7)
       For full logs, run 'nix log /nix/store/qqg5a968s0xabipxnz3s01km3g77i321-vm-test-run-disko-nixos-disko.drv'.

Note that installing on the same QEMU guest with the root@<ip> command still works without errors, and the guest boots successfully after the installation. Only running with --vm-test has this issue.

@daroot

daroot commented Jul 22, 2024

I've run into the same problem. Using a disko config that I know worked on both the vm-test and actual hardware as of 2023-12-20, I'm now seeing the same error as shown above, with the critical error in the logs being:

vm-test-run-disko-skeleton-disko> machine # + rm -rf /tmp/tmp.Muxv79PDAU
vm-test-run-disko-skeleton-disko> (finished: must succeed: /nix/store/z8838sjswryrjg6rwy1ck2drf184s7qn-disko, in 29.34 seconds)
vm-test-run-disko-skeleton-disko> machine: must succeed: mkdir -p /mnt/home
vm-test-run-disko-skeleton-disko> (finished: must succeed: mkdir -p /mnt/home, in 0.32 seconds)
vm-test-run-disko-skeleton-disko> machine: must succeed: touch /mnt/home/testfile
vm-test-run-disko-skeleton-disko> (finished: must succeed: touch /mnt/home/testfile, in 0.30 seconds)
vm-test-run-disko-skeleton-disko> machine: must succeed: /nix/store/fv7mfyig4sv48lfggx2m6kvy6893dpd0-disko-format
vm-test-run-disko-skeleton-disko> machine # ++ mktemp -d
vm-test-run-disko-skeleton-disko> machine # + disko_devices_dir=/tmp/tmp.9HItfC04BG
vm-test-run-disko-skeleton-disko> machine # + trap 'rm -rf "$disko_devices_dir"' EXIT
vm-test-run-disko-skeleton-disko> machine # + mkdir -p /tmp/tmp.9HItfC04BG
vm-test-run-disko-skeleton-disko> machine # + device=/dev/vdb
vm-test-run-disko-skeleton-disko> machine # + imageSize=2G
vm-test-run-disko-skeleton-disko> machine # + name=main
vm-test-run-disko-skeleton-disko> machine # + type=disk
vm-test-run-disko-skeleton-disko> machine # + device=/dev/vdb
vm-test-run-disko-skeleton-disko> machine # + efiGptPartitionFirst=1
vm-test-run-disko-skeleton-disko> machine # + type=gpt
vm-test-run-disko-skeleton-disko> machine # + blkid /dev/vdb
vm-test-run-disko-skeleton-disko> machine # /dev/vdb: PTUUID="eae70d3c-a8ea-460b-b02e-14b3dfdb8ee5" PTTYPE="gpt"
vm-test-run-disko-skeleton-disko> machine # + sgdisk --align-end --new=1:0:+512M --change-name=1:disk-main-ESP --typecode=1:EF00 /dev/vdb
vm-test-run-disko-skeleton-disko> machine # Could not create partition 1 from 8386560 to 9435135
vm-test-run-disko-skeleton-disko> machine # Error encountered; not saving changes.
vm-test-run-disko-skeleton-disko> machine # + sgdisk --change-name=1:disk-main-ESP --typecode=1:EF00 /dev/vdb
vm-test-run-disko-skeleton-disko> machine # + partprobe /dev/vdb
vm-test-run-disko-skeleton-disko> machine # + udevadm trigger --subsystem-match=block
vm-test-run-disko-skeleton-disko> machine # + udevadm settle
vm-test-run-disko-skeleton-disko> machine # + sgdisk --align-end --new=2:0:-0 --change-name=2:disk-main-rootfs --typecode=2:8300 /dev/vdb
vm-test-run-disko-skeleton-disko> machine # Could not create partition 2 from 8386560 to 8388574
vm-test-run-disko-skeleton-disko> machine # Error encountered; not saving changes.
vm-test-run-disko-skeleton-disko> machine # + sgdisk --change-name=2:disk-main-rootfs --typecode=2:8300 /dev/vdb

In particular, sgdisk does not seem to be doing the right thing: it tries to create each partition at an unusual offset and fails. That makes me think that either the test is not properly clearing the block device (in nixos-anywhere or in disko itself), or the behavior of sgdisk has changed (it was updated from 1.0.9 to 1.0.10 in NixOS/nixpkgs#297099).
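
To make the "unusual offset" concrete, here is a rough sketch of the arithmetic behind the sector numbers in the sgdisk errors above. It rests on two assumptions that the log does not confirm: that the device uses 512-byte logical sectors, and that the GPT has the standard layout with 33 sectors reserved at the end of the disk for the backup table and header. The variable names are made up for illustration.

```python
# Sector numbers copied from the two "Could not create partition" lines.
FIRST_FREE = 8386560        # start sgdisk picked for both partitions
P1_REQUESTED_END = 9435135  # end requested for partition 1 (+512M)
LAST_USABLE = 8388574       # end sgdisk picked for partition 2 (-0)

SECTOR_BYTES = 512          # assumption: 512-byte logical sectors
BACKUP_GPT_SECTORS = 33     # assumption: standard backup GPT (32 entry sectors + 1 header)
MIB = 1024 * 1024


def sectors_to_mib(sectors: int) -> float:
    """Convert a sector count to MiB under the assumed sector size."""
    return sectors * SECTOR_BYTES / MIB


# Size of the range requested for partition 1 (inclusive bounds).
p1_size_mib = sectors_to_mib(P1_REQUESTED_END - FIRST_FREE + 1)

# Offset of the first free sector from the start of the disk.
first_free_offset_mib = sectors_to_mib(FIRST_FREE)

# Implied disk size: last usable sector, plus one, plus the backup GPT.
disk_size_mib = sectors_to_mib(LAST_USABLE + 1 + BACKUP_GPT_SECTORS)

# Sectors actually left between the first free sector and the last usable one.
free_sectors = LAST_USABLE - FIRST_FREE + 1

# How far past the last usable sector the partition 1 request reaches.
overshoot_sectors = P1_REQUESTED_END - LAST_USABLE

print(f"partition 1 request: {p1_size_mib} MiB")
print(f"first free sector at: {first_free_offset_mib} MiB")
print(f"implied disk size: {disk_size_mib} MiB")
print(f"free sectors at that offset: {free_sectors}")
print(f"partition 1 overshoots by: {overshoot_sectors} sectors")
```

If those assumptions hold, the 512 MiB request for partition 1 starts only about 1 MiB before the end of the usable area, so it cannot fit. That would be consistent with the device already being almost fully allocated when sgdisk ran, but it does not by itself show whether the cause is a missing wipe or a change in sgdisk.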

I'm trying to isolate which link in the chain of nixos-anywhere, disko, and gptfdisk actually broke.
