overlay.d: add 30lvmdevices overlay #2566

Merged
merged 2 commits into from
Aug 30, 2023

Conversation

dustymabe
Member

Populate an lvmdevices(8) file to keep LVM from autoactivating every
device it sees in a system. By default, systems will get a "blank"
configuration file with a comment in it explaining what it is used
for. There is also a one-time "populate" service that will run and
add any devices it sees into the devices file. This serves to
import existing devices on upgrading systems or on new systems with
pre-existing LVM devices attached. See the tracker issue [1] for more
information.

[1] coreos/fedora-coreos-tracker#1517

@travier travier left a comment

Some comments, but it looks great otherwise.

DefaultDependencies=false
# Since our ConditionPathExists file lives under /var/lib/ let's make
# sure they are set up if those filesystems/mountpoints exist separately.
After=var.mount var-lib.mount
Member

I think this is just RequiresMountsFor=/var/lib

Member Author

In a lot of systems there may be no extra filesystem for /var or /var/lib. RequiresMountsFor= will add a Requires= too, which we don't want IIUC.

Member

This operates correctly even if there isn't a distinct filesystem; see some of the examples in:

$ grep -r RequiresMounts /usr/lib/systemd/system
/usr/lib/systemd/system/rpmdb-migrate.service:RequiresMountsFor=/var
/usr/lib/systemd/system/rpmdb-rebuild.service:RequiresMountsFor=/usr
/usr/lib/systemd/system/rpm-ostreed.service:RequiresMountsFor=/boot
/usr/lib/systemd/system/ostree-boot-complete.service:RequiresMountsFor=/boot
/usr/lib/systemd/system/ostree-finalize-staged-hold.service:RequiresMountsFor=/sysroot /boot
/usr/lib/systemd/system/ostree-finalize-staged.service:RequiresMountsFor=/sysroot /boot
/usr/lib/systemd/system/basic.target:RequiresMountsFor=/var /var/tmp
/usr/lib/systemd/system/podman-clean-transient.service:RequiresMountsFor=%t/containers
/usr/lib/systemd/system/[email protected]:RequiresMountsFor=%t/containers
/usr/lib/systemd/system/rpcbind.service:RequiresMountsFor=/run/rpcbind
/usr/lib/systemd/system/systemd-journal-flush.service:RequiresMountsFor=/var/log/journal
/usr/lib/systemd/system/systemd-machined.service:RequiresMountsFor=/var/lib/machines
/usr/lib/systemd/system/[email protected]:RequiresMountsFor=/var/lib/machines/%i
/usr/lib/systemd/system/systemd-portabled.service:RequiresMountsFor=/var/lib/portables
/usr/lib/systemd/system/systemd-random-seed.service:RequiresMountsFor=/var/lib/systemd/random-seed
/usr/lib/systemd/system/systemd-rfkill.socket:RequiresMountsFor=/var/lib/systemd/rfkill
/usr/lib/systemd/system/systemd-update-utmp-runlevel.service:RequiresMountsFor=/var/log/wtmp
/usr/lib/systemd/system/systemd-update-utmp.service:RequiresMountsFor=/var/log/wtmp

Member Author

I was going mostly off of the description in the man page which says:

RequiresMountsFor=
    Takes a space-separated list of absolute paths. Automatically adds dependencies
    of type Requires= and After= for all mount units required to access the
    specified path.

From that description RequiresMountsFor=/var/lib would translate directly into Requires=var-lib.mount and After=var-lib.mount which would force coreos-populate-lvmdevices.service to get dropped from the transaction in systemd 254+.

And indeed, Requires=var-lib.mount and After=var-lib.mount yields:

[core@qemu0 ~]$ systemctl status coreos-populate-lvmdevices.service 
○ coreos-populate-lvmdevices.service - CoreOS Populate LVM Devices File
     Loaded: loaded (/usr/lib/systemd/system/coreos-populate-lvmdevices.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead)
       Docs: https://github.com/coreos/fedora-coreos-tracker/issues/1517

Aug 28 20:18:29 localhost systemd[1]: coreos-populate-lvmdevices.service: Cannot add dependency job to transaction, deleting job coreos-populate-lvmdevices.service/start again: Unit var-lib.mount not found.
Aug 28 20:18:29 localhost systemd[1]: coreos-populate-lvmdevices.service: Cannot add dependency job, ignoring: Unit var-lib.mount not found.
Aug 28 20:19:18 qemu0 systemd[1]: coreos-populate-lvmdevices.service: Next restart interval calculated as: 100ms

but RequiresMountsFor=/var/lib apparently has different behavior because it works fine (passes the test).

I guess maybe the difference is RequiresMountsFor= will just look at what mounts exist and only mark them required if they exist and are required to access that path?
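
(As an aside, a quick way to double-check what actually got added is to ask systemd for the resolved dependency lists of the loaded unit, e.g.:

    $ systemctl show -p Requires -p After coreos-populate-lvmdevices.service

which prints the final Requires= and After= values after RequiresMountsFor= has been expanded.)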

Member Author

Another question is will RequiresMountsFor=/var/lib also add a requirement for var.mount? I would assume it would have done both var.mount and var-lib.mount but then why does systemd have RequiresMountsFor=/var /var/tmp separate in https://github.com/systemd/systemd/blob/337e8504f71074686a00dbec4ed9542a0a0c2224/units/basic.target#L21 ?

Member

I guess maybe the difference is RequiresMountsFor= will just look at what mounts exist and only mark them required if they exist and are required to access that path?

From a quick glance at the source code (see unit_add_mount_dependencies) it looks like that.

Member Author

OK, added RequiresMountsFor=/var /var/lib in the latest upload.
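
For context, the dependency-related lines in the unit now look roughly like this (a sketch stitched together from the snippets discussed in this thread, not the full unit file):

    [Unit]
    Description=CoreOS Populate LVM Devices File
    DefaultDependencies=false
    # Make sure /var and /var/lib are mounted first if they are separate
    # filesystems, without hard-requiring mount units that may not exist.
    RequiresMountsFor=/var /var/lib
    Before=kubelet.service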

Member Author

I guess maybe the difference is RequiresMountsFor= will just look at what mounts exist and only mark them required if they exist and are required to access that path?

From a quick glance at the source code (see unit_add_mount_dependencies) it looks like that.

Yeah. I was looking at that function but it wasn't 100% clear to me. I'm guessing manager_load_unit_prepare will load a unit for var-lib.mount even if no configuration exists for one, and it'll get deleted from the transaction later. At least that's what the debug-level systemd logging looked like it was doing when I tested it.

# On OpenShift/Kubernetes we want to ensure we run before kubernetes
# comes up where storage drivers may be initiated and present more LVM
# devices to the system that we don't want to be considered.
Before=kubelet.service
Member

A bit of a layering violation here, which ultimately at least for OCP we'll fix with something in the shape of openshift/machine-config-operator#3865

But obviously a good idea to have here now.

Member Author

Yeah. It's probably not needed since we have DefaultDependencies=false, but I was trying to make sure we had the case covered. I'm happy to remove it if you think it's safe to do so.

@dustymabe
Member Author

Once we get the code review close to done I'll spin another RHCOS/OpenShift test for this. Let me know.

@teigland

Hi, I happened to pull up this issue again today while researching the design of the lvm feature (bug 2224641) that I hope would handle this in the future (for this and other image-based installations).

The basic idea is that on first boot, lvm would create a new system.devices file for the root VG if no system.devices exists. The method I see above is including all visible PVs. Is it important for you to include all PVs vs just devices for the root VG?

Also, I'm thinking about whether the automatic creation of system.devices can happen by default, or whether it needs to be enabled explicitly. If it needs to be enabled, then you'd need to do something like edit a file in the image to turn it on before booting the image for the first time. How difficult would that editing be?

Looking at the script for populating system.devices, it looks like a fine approach, except it's unfortunate that it has to do activation, which is a fairly significant divergence from normal system behavior (although it makes sense that first boot has some latitude for doing things differently.) I can see why an activation command would be needed in the script, and don't have any good suggestion. I'd note that 'vgchange -aay' is the standard/preferred command for autoactivation, as opposed to 'pvscan --cache -aay' which is no longer used in RHEL9 (although it should still work.)

@dustymabe
Member Author

Hi, I happened to pull up this issue again today while researching the design of the lvm feature (bug 2224641) that I hope would handle this in the future (for this and other image-based installations).

The basic idea is that on first boot, lvm would create a new system.devices file for the root VG if no system.devices exists. The method I see above is including all visible PVs. Is it important for you to include all PVs vs just devices for the root VG?

With CoreOS systems it's almost impossible for the root filesystem to be on a VG. Ignition doesn't support LVM and we don't offer any way to provision a system to use LVM for the root filesystem so that's not a use case I'm considering here.

The goal of this work is to identify and add all existing LVM devices on the system to a devices file. Think if someone had added a device to a VG called vgdata and then an LV on top and ran mkfs and mounted that under /var/data/. That's the use case I'm trying to capture here. i.e. on upgrade their LV continues to work even though we are now delivering a devices file by default.
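
For instance (illustrative commands only, not something shipped by this PR), an existing system might have been set up along these lines:

    vgcreate vgdata /dev/sdb           # put an extra disk into a new VG
    lvcreate -n data -l 100%FREE vgdata
    mkfs.ext4 /dev/vgdata/data
    mkdir -p /var/data
    mount /dev/vgdata/data /var/data   # typically via an fstab entry or mount unit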

The service runs before kubelet is started, so hopefully any block storage delivered by storage drivers (which may have LVM on subordinate devices for guests) wouldn't be up yet.

Also, I'm thinking about whether the automatic creation of system.devices can happen by default, or whether it needs to be enabled explicitly. If it needs to be enabled, then you'd need to do something like edit a file in the image to turn it on before booting the image for the first time. How difficult would that editing be?

Right now the way this PR works is that there is an empty LVM devices file delivered with the system by default.
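
Roughly speaking (illustrative content only; the exact comment text shipped in the overlay may differ), the delivered /etc/lvm/devices/system.devices is just something like:

    # LVM will only use devices listed in this file (see lvmdevices(8)).
    # It starts out empty; coreos-populate-lvmdevices.service imports any
    # pre-existing PVs on first boot, and `lvmdevices --adddev <dev>` adds more.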

Looking at the script for populating system.devices, it looks like a fine approach, except it's unfortunate that it has to do activation, which is a fairly significant divergence from normal system behavior (although it makes sense that first boot has some latitude for doing things differently.) I can see why an activation command would be needed in the script, and don't have any good suggestion. I'd note that 'vgchange -aay' is the standard/preferred command for autoactivation, as opposed to 'pvscan --cache -aay' which is no longer used in RHEL9 (although it should still work.)

For vgchange versus pvscan, don't I need some sort of "scan" to happen to search for devices that had previously been ignored? Would vgchange do that?

I wish I didn't need any scan at all. When I ran the test (see the test added in this PR) on Fedora I didn't need the pvscan; the system just picked up the change to the devices file and it worked without the scan. On RHEL I needed the scan. Maybe it's a race condition of sorts. Either way, the pvscan was the big-hammer approach.
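
For reference, the overall shape of the populate step is roughly the following (a sketch only; it assumes the standard lvmdevices(8)/pvs(8) options and is not the exact script from this PR):

    # Look at all block devices, bypassing the (still empty) devices file,
    # and import every PV that is currently visible.
    for pv in $(pvs --devicesfile "" --noheadings -o pv_name); do
        lvmdevices --adddev "$pv"
    done
    # Then autoactivate; per the suggestion above, `vgchange -aay` is the
    # preferred form, with `pvscan --cache -aay` as the older equivalent.
    vgchange -aay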

@dustymabe
Member Author

@teigland - I'd be happy to get together to discuss this further! There's a lot of nuance here and bouncing ideas off each other would be quite useful.

@teigland

With CoreOS systems it's almost impossible for the root filesystem to be on a VG. Ignition doesn't support LVM and we don't offer any way to provision a system to use LVM for the root filesystem so that's not a use case I'm considering here.

The goal of this work is to identify and add all existing LVM devices on the system to a devices file. Think if someone had added a device to a VG called vgdata and then an LV on top and ran mkfs and mounted that under /var/data/. That's the use case I'm trying to capture here. i.e. on upgrade their LV continues to work even though we are now delivering a devices file by default.

Interesting, it's quite a different problem from the one I've been working on solving. The problem here sounds like an OS upgrade, and I'd been focusing on fresh image-based installations that use lvm for the system root. Upgrades are obviously complicated by existing system customizations -- I've never seriously considered trying to enable the devices file in a RHEL 8 to 9 upgrade. 8-to-9 upgrades are the main reason the lvmdevices feature is disabled when no system.devices exists.

@cgwalters
Member

With CoreOS systems it's almost impossible for the root filesystem to be on a VG. Ignition doesn't support LVM and we don't offer any way to provision a system to use LVM for the root filesystem so that's not a use case I'm considering here.

(Note, though, that bootc install-to-filesystem is explicitly intended to be used for things like installing on top of LVM, and I definitely plan to continue to "productize" bootc in the CoreOS context, so Ignition-LVM support is not necessary to use CoreOS with LVM in the future.)

@dustymabe
Member Author

Once we get the code review close to done I'll spin another RHCOS/OpenShift test for this. Let me know.

OK, I'm going to give this another round of testing on RHCOS/OpenShift now, and then merge if everything looks good.

@dustymabe dustymabe merged commit ae7b792 into coreos:testing-devel Aug 30, 2023
2 checks passed
@dustymabe dustymabe deleted the dusty-lvmdevices branch August 30, 2023 02:31
dustymabe added a commit to coreosbot-releng/os that referenced this pull request Aug 30, 2023