Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for --replace-mode=alongside for ostree target #137

Merged
merged 1 commit into from
Nov 18, 2024

Conversation

cgwalters
Copy link
Collaborator

Ironically our support for --replace-mode=alongside breaks when we're targeting an already extant ostree host, because when we first blow away the /boot directory, this means the ostree stack loses its knowledge that we're in a booted deployment, and will attempt to GC it...

ostreedev/ostree-rs-ext@8fa019b is a key part of the fix for that.

However, a notable improvement we can do here is to grow this whole thing into a real "factory reset" mode, and this will be a compelling answer to
coreos/fedora-coreos-tracker#399

To implement this though we need to support configuring the stateroot and not just hardcode default.

@openshift-ci
Copy link

openshift-ci bot commented Oct 2, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@omertuc
Copy link
Contributor

omertuc commented Sep 18, 2024

Sorry @cgwalters , accidentally pushed the rebase to your fork instead of mine

EDIT: undid it, continuing my rebase efforts on https://github.com/omertuc/bootc/tree/137clone

EDIT2: Continuing here

@cgwalters
Copy link
Collaborator Author

I think you can just take over this PR too if you want, or open a new PR from your fork - either way.

@omertuc
Copy link
Contributor

omertuc commented Sep 19, 2024

Rebased. Without any changes, I'm facing an issue where in an ostree system, the mounted / on the host system is an overlay (-v mounted into /:/target) and so the findmnt source for it is overlay rather than a /dev/... and so it trips up lsblk later on

I'll see how I can tweak it so that it finds the right device

@omertuc
Copy link
Contributor

omertuc commented Oct 2, 2024

With additional -v /sysroot:/target -v /sysroot:/target/sysroot mounts instead of -v /:/target and --stateroot foo, this seems to work

@omertuc
Copy link
Contributor

omertuc commented Oct 2, 2024

@cgwalters thoughts on the above mounts? Do we want to require them for install on ostree targets, or should I figure out a way to make this work without them, using just the already-documented install mounts (i.e. /:/target)?

@cgwalters
Copy link
Collaborator Author

and so the findmnt source for it is overlay rather than a /dev/... and so it trips up lsblk later on

We should learn how to peel that. This is really the same thing as https://bugzilla.redhat.com/show_bug.cgi?id=2308594 and ostreedev/ostree#3198 and containers/composefs#280

Short term the simplest is the same logic as the grub patch - detect overlayfs for / and check if /sysroot exists and is mounted, if so use that.

@omertuc
Copy link
Contributor

omertuc commented Oct 8, 2024

and so the findmnt source for it is overlay rather than a /dev/... and so it trips up lsblk later on

We should learn how to peel that. This is really the same thing as bugzilla.redhat.com/show_bug.cgi?id=2308594 and ostreedev/ostree#3198 and containers/composefs#280

Short term the simplest is the same logic as the grub patch - detect overlayfs for / and check if /sysroot exists and is mounted, if so use that.

OK. Changed it so that when the target rootfs is an overlay, we'll implicitly try targetting <original_target>/sysroot instead.

It wasn't working at first and was a bit of a headache for me to debug because apparently if you mount /:/target then inside the container /target/sysroot is read-only by default, and so ensure_dir_labeled was failing, as opposed to when you mount /sysroot:/target directly, in which case it's not read-only. Took me a while to track that down chasing red herrings, and I'm still not sure who's responsible for this behavior (kernel? podman?), but after I realized it I simply moved ensure_dir_label to run only after your added let _ = crate::utils::open_dir_remount_rw... and then the rest just worked.

Current code might need a bit of touch-ups, but do you think the direction of the code in its current state is good? Should I clean it up and undraft?

@cgwalters
Copy link
Collaborator Author

It wasn't working at first and was a bit of a headache for me to debug because apparently if you mount /:/target then inside the container /target/sysroot is read-only by default, and so ensure_dir_labeled was failing, as opposed to when you mount /sysroot:/target directly, in which case it's not read-only.

I think that's possibly because it's bootc that's special casing mounting /sysroot read-write - that's how we do it outside of a container at least.

Copy link
Collaborator Author

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up!!

lib/src/install.rs Show resolved Hide resolved
lib/src/install.rs Outdated Show resolved Hide resolved
lib/src/lsm.rs Outdated Show resolved Hide resolved
lib/src/utils.rs Outdated Show resolved Hide resolved
@omertuc omertuc force-pushed the install-existing-ostree branch 4 times, most recently from 3561a64 to ed94f1e Compare October 14, 2024 17:35
@omertuc omertuc force-pushed the install-existing-ostree branch 3 times, most recently from d392280 to eed91ff Compare October 21, 2024 15:05
@omertuc
Copy link
Contributor

omertuc commented Nov 6, 2024

Forced pushed now to solve some new tiny conflict on import lines, other than that there's no conflicts vs main

@omertuc
Copy link
Contributor

omertuc commented Nov 6, 2024

Forced push again because of duplicate import

@omertuc
Copy link
Contributor

omertuc commented Nov 6, 2024

I suspect it's btrfs specific.

Yeah I was downloading Fedora CoreOS (as opposed to Silverblue) today to see if we hit similar issues, I'll report with what I find

I just tried Fedora CoreOS and I can confirm the same issue happens even with xfs, so it's not a btrfs issue

@cgwalters
Copy link
Collaborator Author

stop stop stop 😆 you're working on your old stale branch, I've already solved all these conflicts a while back

Oops! Sorry

@omertuc
Copy link
Contributor

omertuc commented Nov 12, 2024

Got around to taking a serious look at this read-only behavior on FCOS/Silverblue, it seems to simply be due to an immutable attribute on the ostree deployment, which gets preserved when the ostree deployment directory is simply mounted directly only /.

On bootc systems I assume this attribute gets lost due to usage of composefs/overlay instead of a direct mount

@cgwalters
Copy link
Collaborator Author

Oh duh of course.

On bootc systems I assume this attribute gets lost due to usage of composefs/overlay instead of a direct mount

Kind of, it's not lost so much as shadowed. But basically with composefs we don't need the immutable bit anymore, it was always a hack.

Although wait, there's two immutable bits potentially - one on the deployment root, and one on the physical /.

Ahh yes, see https://github.com/coreos/coreos-assembler/blob/58df48e0c237a638df5b57a475d3b80b1029baa5/src/osbuild-manifests/coreos.osbuild.x86_64.mpp.yaml#L560

Basically let's change our code to run chattr -i before the chmod. We could try to preserve it but I don't think it's important.

@omertuc
Copy link
Contributor

omertuc commented Nov 12, 2024

Basically let's change our code to run chattr -i before the chmod. We could try to preserve it but I don't think it's important.

Yep, force pushed with rebase and this change. However we're now hitting another problem:

TRACE exec: "ostree" "config" "--repo" "ostree/repo" "set" "sysroot.bootloader" "none"
ERROR Installing to filesystem: Creating ostree deployment: Subprocess failed: ExitStatus(unix_wait_status(256))

Looking into what's behind this one

Although wait, there's two immutable bits potentially - one on the deployment root, and one on the physical /.

Is that right? They seem to be tied from what I can see:

$ findmnt 
/                                            /dev/vda4[/ostree/deploy/fedora-coreos/deploy/6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce.0]     xfs        rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota

Removing the bit from one removes it from the other

@cgwalters
Copy link
Collaborator Author

They seem to be tied from what I can see:

This is confusing for sure! In ostree there's the term "physical root" vs "booted root". When we're in the booted system, then / is the deployment/booted root, and /sysroot is the physical root.

But mounting the filesystem from outside, / is the physical root, and /ostree/deploy/<stateroot>/<checksum>.<id> is the deployment root.

I am pretty sure what we're hitting here at first is the physical root, not the deployment root.

ERROR Installing to filesystem: Creating ostree deployment: Subprocess failed: ExitStatus(unix_wait_status(256))

Well that's busted, we should be getting stderr from that...is there nothing there?

@omertuc
Copy link
Contributor

omertuc commented Nov 12, 2024

strace -y

[pid  4103] openat(3</target/sysroot/ostree/repo>, ".", O_WRONLY|O_CLOEXEC|O_TMPFILE, 0600) = -1 EROFS (Read-only file system)

Looks like inside the container, /target and /target/sysroot are separate mounts that have to be rw remounted separately... Another thing that didn't happen in existing bootc systems

Looking into this

@omertuc
Copy link
Contributor

omertuc commented Nov 12, 2024

When remounting both, installation seems to proceed smoothly, so this seems to be the final hurdle

@omertuc omertuc force-pushed the install-existing-ostree branch 2 times, most recently from 7386012 to e75b8b1 Compare November 12, 2024 17:12
@cgwalters
Copy link
Collaborator Author

Looks like inside the container, /target and /target/sysroot are separate mounts that have to be rw remounted separately...

Ahh I think I understand the root of confusion here. I think early on we need to detect "is this an ostree system" and if so, then we need to actually look at /target/sysroot and basically do most of our operations in terms of that.

But for non-ostree systems then / is already the physical root.

The install code may need some cleanups around this. In the non-ostree case we can assume that the physical root / is writable (otherwise nothing could happen), and in the ostree case we shouldn't need to remount / writable (it doesn't make sense to do so with composefs); we need to peel it and access the physical root at /sysroot.

@omertuc
Copy link
Contributor

omertuc commented Nov 12, 2024

Looks like inside the container, /target and /target/sysroot are separate mounts that have to be rw remounted separately...

Ahh I think I understand the root of confusion here. I think early on we need to detect "is this an ostree system" and if so, then we need to actually look at /target/sysroot and basically do most of our operations in terms of that.

But for non-ostree systems then / is already the physical root.

The install code may need some cleanups around this. In the non-ostree case we can assume that the physical root / is writable (otherwise nothing could happen), and in the ostree case we shouldn't need to remount / writable (it doesn't make sense to do so with composefs); we need to peel it and access the physical root at /sysroot.

So basically what you're suggesting is to change your original suggestion here from "detect overlayfs" to "detect ostree system" ?

@cgwalters
Copy link
Collaborator Author

So basically what you're suggesting is to change your original suggestion #137 (comment) from "detect overlayfs" to "detect ostree system" ?

Either way is OK by me honestly. For grub, it does feel a bit cleaner not to do anything specific to ostree but to just check for /sysroot. In an ideal world perhaps we even "standardize" this /sysroot path for other theoretical non-ostree-but-filesystem/composefs based systems...

But for bootc we have other code specific to ostree anyways that aren't going to go away anytime soon so "detect ostree" is something we'll need to do regardless, whether it's early on or a bit later in the flow.

Comment on lines 604 to 606
let path = "sysroot".into();
let _ = crate::utils::open_dir_remount_rw(rootfs_dir, path)
.context("remounting target sysroot as read-write")?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one won't exist in the non-ostree case, that's why the install test GHA is failing (those tests install overtop the GHA runner).

This gets to what I was saying in that I think we'll need to detect this early on dispatch on it.

I think concretely I'd say we check for ostree/repo and there should be a bool existing_ostree in the State structure or so?

Copy link
Contributor

@omertuc omertuc Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see I'll rework that tomorrow

Copy link
Contributor

@omertuc omertuc Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one won't exist in the non-ostree case

This code is already inside a code branch that's ostree only. I'd argue the test failures are due to a lacking cleanup operation in fn reset_root - it only deletes the ostree deployments directory, not the ostree repo which is what we used to detect whether there's ostree on the system

@omertuc omertuc force-pushed the install-existing-ostree branch 3 times, most recently from 72159e6 to cb97d18 Compare November 18, 2024 02:36
Ironically our support for `--replace-mode=alongside` breaks
when we're targeting an already extant ostree host, because when
we first blow away the `/boot` directory, this means the ostree
stack loses its knowledge that we're in a booted deployment,
and will attempt to GC it...

ostreedev/ostree-rs-ext@8fa019b
is a key part of the fix for that.

However, a notable improvement we can do here is to grow this
whole thing into a real "factory reset" mode, and this will
be a compelling answer to
coreos/fedora-coreos-tracker#399

To implement this though we need to support configuring the
stateroot and not just hardcode `default`.

Signed-off-by: Omer Tuchfeld <[email protected]>
@cgwalters
Copy link
Collaborator Author

This is my PR originally so I can't approve it, but the code LGTM so please do so!

@cgwalters cgwalters merged commit b882540 into containers:main Nov 18, 2024
30 of 32 checks passed
@omertuc omertuc mentioned this pull request Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/install Issues related to `bootc install`
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants