ignition-udev-service: add support for root on multipath #183
No objections to merging mostly as is but some comments:
After=ignition-setup-base.service

# Wait for multipathd
Wants=multipathd.service
One thing related to this... it'd be nice if we only started `multipathd.service` if it was actually in use; today we start it unconditionally in the initramfs and it just spews an error and then nothing happens. (And the same with iscsi.)
IOW I think this should only be started if there's a `dm.multipath=1` karg or something like that.
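The conditional start suggested above could be gated on a check like the following. This is only a sketch: `has_karg` is a hypothetical helper, and `dm.multipath=1` is the placeholder karg name floated in the comment, not an existing kernel or dracut option.

```shell
# Hypothetical helper: succeed if the exact kernel argument $1 appears
# in the command-line string $2. Runs in a subshell so that `set -f`
# (which disables globbing while we word-split) does not leak out.
has_karg() (
    set -f
    want=$1
    cmdline=$2
    for arg in $cmdline; do
        [ "$arg" = "$want" ] && exit 0
    done
    exit 1
)

# Example use inside the initramfs (karg name is an assumption):
#   if has_karg dm.multipath=1 "$(cat /proc/cmdline)"; then
#       echo "multipath requested"
#   fi
```

Matching whole words avoids false positives such as `dm.multipath=10`.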
Ah but I guess this whole service is only started if something like multipath is in use?
If that's the case, bikeshed naming: `ignition-diskful-complex-blockdev.service`?
I think the latest push is a bit better in this regard.
Type=oneshot
RemainAfterExit=yes
# Have to shell out since we need shell expansion of the *
ExecStart=-/bin/sh -c '/usr/bin/udevadm trigger -w -v /dev/dm*'
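The `ExecStart=` line shells out because systemd does not expand globs in command lines. One detail worth keeping in mind is that in POSIX sh an unmatched glob is passed through literally, so `/dev/dm*` reaches the command unchanged when no device-mapper nodes exist. A minimal sketch of a guard for that case follows; `list_dm_nodes` is a hypothetical helper, not part of this PR.

```shell
# Hypothetical helper: print every entry matching <dir>/dm*, one per line.
# Exits non-zero (and prints nothing) when the glob matches nothing.
list_dm_nodes() (
    dir=${1:-/dev}
    status=1
    for node in "$dir"/dm*; do
        # An unmatched glob stays literal, so verify the entry exists.
        [ -e "$node" ] || continue
        printf '%s\n' "$node"
        status=0
    done
    exit "$status"
)

# Example use (only triggers udev when something matched):
#   if nodes=$(list_dm_nodes /dev); then
#       udevadm trigger -w -v $nodes
#   fi
```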
I think what I find confusing here is I'd really expect there to be an explicit command for this somewhere else. Is there not similar code in dracut or multipathd?
Yeah, that's my confusion as well. I had hoped that we'd get this for free and all we really had to do was just make sure that the units we have which need ro access to `/boot` run early enough that they can't interfere with multipath rewiring the symlinks. Poking around on this to see how it happens on traditional.
The multipath systemd unit triggers `systemd-udev-trigger`. The problem we have is that we need to wait until the symlinks are updated (which won't happen if the disk is mounted), and during testing we hit a race condition. That's why I have this ugly workaround.
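To make the "wait until the symlinks are updated" step concrete, here is a rough polling sketch. It assumes the symlink has been rewired once it resolves to a `dm-*` node; `wait_for_dm_link` is a hypothetical helper, and polling like this only narrows the race rather than removing it.

```shell
# Hypothetical helper: poll until symlink $1 resolves to a dm-* node.
# $2 is the maximum number of attempts (default 20), one second apart.
wait_for_dm_link() (
    link=$1
    tries=${2:-20}
    while [ "$tries" -gt 0 ]; do
        target=$(readlink -f "$link" 2>/dev/null) || target=
        case "${target##*/}" in
            dm-*) exit 0 ;;
        esac
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    exit 1
)

# Example use:
#   wait_for_dm_link /dev/disk/by-label/root 10 || echo "still not a dm device" >&2
```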
My understanding is that on most traditional systems that run |
How are you testing this BTW? Are you just using e.g. |
Hmm, so I added |
This provides support for multipath without having to re-order everything or provide a new target. We need to trigger udev to update the device symlinks in `/dev/disk/by-{id,label,uuid}` for multipath devices after multipathd has run. This reorders some services to run in "RO" before the multipathd target is set up and then moves the "RW" after device-mapper targets have been found. Signed-off-by: Ben Howard <[email protected]>
Rename coreos-teardown-initramfs-network.* to coreos-teardown-initramfs.* and add a function to propagate automatic configuration through to boot.
You'll need to test with upstream dracut. I'm waiting on cranking the upstream dracut in FCOS for this to land first. |
The latest push works but fails on a few things ONLY when Multipath is enabled; it does not affect single pathed systems.
(which will be addressed in another PR) |
OK, been playing with the multipath patches a lot now. So to recap and reword, there are two issues:
Fixing 1 is pretty simple. This works:
diff --git a/dracut/30ignition/ignition-setup-user.service b/dracut/30ignition/ignition-setup-user.service
index 17ec3c4..85fb519 100644
--- a/dracut/30ignition/ignition-setup-user.service
+++ b/dracut/30ignition/ignition-setup-user.service
@@ -9,6 +9,10 @@ Before=local-fs-pre.target
Before=ignition-disks.service
Before=ignition-files.service
+# We want to make sure we're not racing with multipath taking ownership of the
+# boot device.
+Before=multipathd.service
+
# On diskful boots, ignition-generator adds Requires/After on
# dev-disk-by\x2dlabel-boot.device.
diff --git a/overlay.d/05core/usr/lib/dracut/modules.d/15coreos-firstboot-network/coreos-copy-firstboot-network.service b/overlay.d/05core/usr/lib/dracut/modules.d/15coreos-firstboot-network/coreos-copy-firstboot-network.service
index 6e2bf37..4f05ca3 100644
--- a/overlay.d/05core/usr/lib/dracut/modules.d/15coreos-firstboot-network/coreos-copy-firstboot-network.service
+++ b/overlay.d/05core/usr/lib/dracut/modules.d/15coreos-firstboot-network/coreos-copy-firstboot-network.service
@@ -36,6 +36,10 @@ After=dracut-cmdline.service
Requires=dev-disk-by\x2dlabel-boot.device
After=dev-disk-by\x2dlabel-boot.device
+# We want to make sure we're not racing with multipath taking ownership of the
+# boot device.
+Before=multipathd.service
+
[Service]
Type=oneshot
RemainAfterExit=yes
Fixing 2 is more tricky. When
diff --git a/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-firstboot-sysroot.service b/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-firstboot-sysroot.service
index 3ba677d..e7c05ae 100644
--- a/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-firstboot-sysroot.service
+++ b/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-firstboot-sysroot.service
@@ -22,4 +22,5 @@ Before=ostree-prepare-root.service ignition-remount-sysroot.service
[Service]
Type=oneshot
RemainAfterExit=yes
+ExecStart=/usr/bin/sh -xc "for i in {0..10}; do readlink -v /dev/disk/by-label/root; sleep 0.5; done"
ExecStart=/usr/sbin/ignition-ostree-mount-sysroot
Then, looking at the output:
The surface issue is that The suggested service here ( There's basically no easy race-free way to determine if I also played with this:
diff --git a/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-sysroot.sh b/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-sysroot.sh
index 047ba2d..7ddce38 100755
--- a/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-sysroot.sh
+++ b/overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-mount-sysroot.sh
@@ -12,6 +12,16 @@ if ! [ -b "${rootpath}" ]; then
echo "ignition-ostree-mount-sysroot: Failed to find ${rootpath}" 1>&2
exit 1
fi
+
+if [ -f /etc/multipath.conf ]; then
+ if ! multipathd show status | grep -qE '^paths: 0$'; then
+ while ! ls /dev/mapper/mpath? &>/dev/null; do
+ echo "waiting for multipathed devices to finish set up"
+ sleep 1
+ done
+ fi
+fi
+
eval $(blkid -o export ${rootpath})
mountflags=
if [ "${TYPE}" == "xfs" ]; then
It works, but it's not great. Notably, it means we wait for multipathd even if it's to set up a mount point like My vote now is to simply use an explicit Using kargs in general of course is tricky because we don't support firstboot kargs yet in the cloud case. I think RAID is more relevant to the cloud than multipath, and we'll hit up against the same issues there, so we'd definitely have to address it eventually if we go the kargs path for that too (and LUKS too once we move that to Ignition). Thoughts? |
To clarify, all these together give me working multipath (modulo the growpart service + those in #183 (comment) which will need to be adapted):
|
@jlebon thank you for playing with this -- I'll dig in on your comments tomorrow. coreos/fedora-coreos-config#392 is the second half of this, which will be needed to complete the boot |
Had a chat with @darkmuggle about this. While we're agreed the root argument is nicer from an implementation point of view (because it's explicit and doesn't require any guessing), there were some concerns about the UX. Ben brought up that the Ben suggested adding a udev rule which creates a nicer symlink for devicemapper devices if the underlying filesystem label is |
I guess the way I would frame this is that we expand our API from "rootfs is |
Yep exactly. This is also what coreos/fedora-coreos-tracker#465 was about. |
Right, yeah. Though I think I'm proposing a middle ground, where we only require a root arg if needed. |
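The "only require a root arg if needed" middle ground could look roughly like the sketch below: honor an explicit `root=` karg when present, otherwise fall back to the label-based path used elsewhere in this thread. `resolve_rootpath` is a hypothetical helper, and handling the `LABEL=` and `UUID=` forms this way is an assumption for illustration, not what the PR implements.

```shell
# Hypothetical helper: print the root device path implied by the kernel
# command-line string $1. Runs in a subshell so `set -f` stays local.
resolve_rootpath() (
    set -f
    cmdline=$1
    root=
    for arg in $cmdline; do
        case "$arg" in
            root=*) root=${arg#root=} ;;
        esac
    done
    case "$root" in
        "")      printf '%s\n' /dev/disk/by-label/root ;;
        LABEL=*) printf '%s\n' "/dev/disk/by-label/${root#LABEL=}" ;;
        UUID=*)  printf '%s\n' "/dev/disk/by-uuid/${root#UUID=}" ;;
        *)       printf '%s\n' "$root" ;;
    esac
)

# Example use:
#   rootpath=$(resolve_rootpath "$(cat /proc/cmdline)")
```

With this shape, single-path systems need no extra karg, while a multipath system can point `root=` at the device-mapper node explicitly and avoid any guessing.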
I think a guiding principle here for us is "keep the cloud and metal cases as symmetric as possible", while avoiding paying too much cost in cloud for metal things - something like that? So we ship a single initramfs which has multipath, but it doesn't make sense to start it by default - enable via karg in But IMO that leaves us neutral on "root karg by default"; it's a completely free thing to do in cloud, and if we end up with it on a nontrivial percentage of metal installs, then we're making things more symmetric. Although you had a comment here coreos/fedora-coreos-tracker#465 (comment) around integrating with the bootloader that I want to chase down a bit more. |
Right now we allocate a random UUID on each build, but everyone starting from a particular build will have the same one...so they're not unique. This will pair with a new version of coreos/ignition-dracut#183 so that we can regenerate them on firstboot and ensure that they are actually unique (as much as possible) across running systems.
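As a rough illustration of what regenerating a filesystem UUID on firstboot involves, the sketch below maps a filesystem type to the standard tool invocation. `uuid_regen_cmd` is a hypothetical dry-run helper that only prints the command; it is not the code from the referenced change, and both tools generally expect the filesystem to be unmounted.

```shell
# Hypothetical dry-run helper: print the command that would assign a
# fresh random UUID to device $2, given its filesystem type $1.
uuid_regen_cmd() {
    fstype=$1
    dev=$2
    case "$fstype" in
        xfs)            printf 'xfs_admin -U generate %s\n' "$dev" ;;
        ext2|ext3|ext4) printf 'tune2fs -U random %s\n' "$dev" ;;
        *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
    esac
}

# Example use (prints the command rather than running it):
#   uuid_regen_cmd xfs /dev/disk/by-label/root
```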
This is required because the unit needs read-only access to `/boot` in order to extract any baked in user-provided config and multipathd might claim ownership. See: coreos#183
I split out the patch needed for |
@cgwalters something to note: |
@cgwalters @jlebon thank you most kindly for the good discussion and direction. I concur that it's time to close this since the race condition still exists. |
This is required because the unit needs read-only access to `/boot` in order to extract any baked in network config and multipathd might claim ownership. See similar patch for `ignition-setup-user.service`: coreos/ignition-dracut#185 For more information, see: coreos/ignition-dracut#183 (comment)
And I've split out from #183 (comment) the bit for |
This is required because the unit needs read-only access to `/boot` in order to extract any baked in user-provided config and multipathd might claim ownership. See: #183
This provides support for multipath without having to re-order
everything or provide a new target. We need to trigger udev to update
the device symlinks in
/dev/disk/by-{id,label,uuid}
for multipath devices after multipathd has run.
This also reorders some services to run in "RO" before the multipathd target
is set up and then moves the "RW" after device-mapper targets have been
found.
Signed-off-by: Ben Howard [email protected]