Updating bootimages #381
For future reference (partly because it isn't easy to get the RHCOS version from the installer), here is the version information for 4.1.0: payload-rhcos-4.1.0

Installer 4.1.0 commit: c6517384e71e5f09931c4da5e772fdec225d02ec

Inlined metadata JSON for full posterity
One thing I wanted to write about here was that I would swear this bug was caused by UPI installs having a different bootimage than IPI, but I just used the
Oh wait, I see...actually, there are indeed different checksums for the oscontainer for IPI:
Yet... they have the same version number 😢. What I suspect happened here is that the RHCOS build failed at some point after pushing the oscontainer, leading to version number reuse, which is a known issue with the pipeline.
Currently, the MCO only supports rebooting for config changes; the MCC sets the `desiredConfig` annotation, and the MCD implicitly takes `currentConfig != desiredConfig` as permission to reboot. However, we have cases where we want to support drain+reboot for non-config changes.

The first major motivating reason for this is to handle kernel arguments injected via initial MachineConfig objects. Until we do more work, because `pivot` as shipped with OpenShift 4.1.0 doesn't know about `MachineConfig` (and we have no mechanism right now to update the bootimages: openshift/os#381), what will happen is that the MCD will land on the nodes and then go degraded, because the kernel arguments don't match what is expected. Now, we really want to handle kernel arguments at first boot, and that will be done eventually. But this gives us a mechanism to reconcile rather than go degraded.

Previously, if we'd had the MCD start a reboot on its own, we could exceed the `maxUnavailable` defined by the pool, break master etcd consistency, etc. In other words, reboots need to be managed via the MCC too. For the MCD, the `reboot-requested` source of truth is the systemd journal, as it gives us an easy way to "scope" a reboot request to a given boot.

The flow here is:

- MCC sets desiredConfig
- MCD notices desiredConfig, logs request in journal
- MCD gets its own journal message, adds `reboot-requested` annotation
- MCC approves reboot via `reboot-approved` annotation
- MCD reboots
- MCD notices it has the reboot annotation, checks to see if it's in the journal; it's not, so `reboot-requested` is removed
- MCC notices the node has `reboot-approved` without `reboot-requested`, removes `reboot-approved`

This ensures that config changes are always exercising the "reboot request" API. For now, we expose that API as `/usr/libexec/machine-config-daemon request-reboot <reason>`. For example, one could use `oc debug node` in a loop and run that on each node to perform a rolling reboot of the hosts (see the sketch below).
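As a rough illustration of that last point, here is a minimal sketch of such a rolling-reboot loop; the `chroot /host` invocation and the (omitted) wait logic are assumptions for illustration, not part of the proposal:

```bash
#!/bin/bash
# Sketch: ask each node's MCD to request a reboot, one node at a time.
# Assumes the proposed `request-reboot` subcommand exists on the host.
set -euo pipefail
for node in $(oc get nodes -o name); do
  oc debug "${node}" -- chroot /host \
    /usr/libexec/machine-config-daemon request-reboot "manual rolling reboot"
  # A real rollout would wait here for the node to reboot and report Ready
  # again before continuing; that bookkeeping is omitted for brevity.
done
```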
Because we unfortunately make common use of private Google Docs for technical discussion, here's a link to an important relevant one, "Lifecycle of RHCOS image source": https://url.corp.redhat.com/7aa24cd
I suspect this doc can't be made public? Would love to read it, if possible.
@cgwalters I'd propose we support "machine.Spec.ProviderSpec.ami=managed".
The pattern extrapolates to all providers.
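Purely as illustration, and assuming the sentinel would live under the existing AWS `providerSpec.value.ami` field (that placement is a guess, not part of the proposal), marking a machineset as "managed" might look something like:

```bash
# Hypothetical sketch: flag a worker machineset's AMI as "managed" so that an
# operator, rather than install-time pinning, owns the bootimage choice.
# The field path assumes the current AWS providerSpec layout.
oc -n openshift-machine-api patch machineset <worker-machineset> --type merge \
  -p '{"spec":{"template":{"spec":{"providerSpec":{"value":{"ami":{"id":"managed"}}}}}}}'
```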
This is noted in the issue description above:
Some terms:

The linked issue there talks about a model where the

A tricky question we need to answer here is: what do we do about the case of "oscontainer only" builds, where we didn't produce updated bootimages? Is the "release update bootimage" unset, or does it carry the same thing as the previous bootimage information?

Now... I don't think we can inject this data into
This also gets into the tricky issue of

The complexity here spikes fast because we have to think about all three of the installer-pinned bootimage, the machine-os-content, and the release update bootimages. I guess what would ideally end up happening, if we make it all work out, is that the installer-pinned bootimage is only used for the bootstrap node... except, because masters aren't provisioned by the machine API today, we'd likely end up only using the release update bootimages for workers, unless the installer learns how to parse the release image.
Do you expect we'll need to address bootimage updates before we get machine-API-provisioned control planes working? I'm not clear on the relative timing, but that would help reduce some of the complexity.
Offhand... they seem parallelizable?
If you get machine-API-created control planes, then you don't have to involve the installer in the selection of cluster bootimages. The installer would just pick whatever bootimage it needed for bootstrapping, and the machine API could manage bootimage selection and upgrades for the cluster without needing to involve anyone else in that behavior.
I wrote some of this originally here, but implementation proposals are probably better here. First, let's just say we will create

How about we add this image to the MCO, and have the (ART) RHCOS pipeline update it via automatically-submitted PRs? (Implementation point: PR submission would be asynchronous from build completion.)
Today the RHCOS bootimages come "prepivoted"; they have a custom origin set to the `machine-os-content` that matches. The idea here is to avoid an extra reboot. In practice we're always going to be updating and rebooting today, because we don't have a mechanism to update bootimages: openshift/os#381

I plan to change the RHCOS build system to stop "prepivoting"; the oscontainer will be pushed *after* the bootimages. This breaks a complicated dependency and ensures that the RHCOS "source of truth" is the cosa storage. Also, this allows one to use the MCD when e.g. booting Fedora CoreOS.

Closes: openshift#981
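For readers unfamiliar with the "custom origin" mentioned above: it can be inspected directly on a node. A minimal sketch, assuming `oc` access to the cluster:

```bash
# Open a debug shell on a node and look at the booted ostree deployment.
# On a "prepivoted" bootimage, `rpm-ostree status` reports a custom origin
# pointing at the matching machine-os-content.
oc debug node/<node-name> -- chroot /host rpm-ostree status
```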
@abhinavdahiya and I discussed trying to get

A later (but parallelizable) task is the idea to ship tooling to turn machine-os-content into a bootimage (a subset of cosa) as a derived image from machine-os-content. This is a subset of what https://github.com/coreos/coreos-assembler does - we'd basically take the ostree repo from that and run create_disk.sh (a rough sketch of that flow follows below). For this ideally we support a choice of:
With a fallback of:
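A very rough sketch of that derived-bootimage flow, under the assumptions that the ostree repo lives at `/srv/repo` inside machine-os-content and with the create_disk.sh invocation left as pseudocode (its arguments vary by coreos-assembler version):

```bash
# Sketch only, not a working recipe.
# 1. Pull the ostree repo out of the machine-os-content image (the path is an assumption).
oc image extract "${MACHINE_OS_CONTENT_PULLSPEC}" --path /srv/repo:./repo

# 2. Run coreos-assembler's create_disk.sh against that repo to produce a disk image.
#    The exact arguments differ across cosa versions, so this line is pseudocode:
# ./coreos-assembler/src/create_disk.sh ... ./repo ... rhcos.qcow2
```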
Going to move this issue over there.
THIS ISSUE IS MOVED TO openshift/enhancements#201
First, be sure you're familiar with os updates.
Currently, we have no default mechanism to update the bootimages once a cluster is installed. For example, as of 4.1.4 today, the way things work is that the installer pins RHCOS, and when a cluster is installed in e.g. AWS, the installer injects the AMI into the machinesets in `openshift-machine-api`. Nothing thereafter updates it. And because we haven't updated the pinned RHCOS for 4.1.X, today every OpenShift install uses the `4.1.0` RHCOS bootimages. (More on that below)

Now, see openshift/origin#21998 for a PR which tries to embed this metadata into the release image. From there, we could imagine having e.g. the cluster API use it to update the machinesets.
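For reference, here is one way to see what bootimage an AWS cluster is actually pinned to today; the jsonpath below assumes the AWS providerSpec layout, so treat it as a sketch:

```bash
# List each machineset in openshift-machine-api along with the AMI it references.
# The providerSpec path is an assumption based on the AWS machine API layout.
oc -n openshift-machine-api get machinesets \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.providerSpec.value.ami.id}{"\n"}{end}'
```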
However, the "fully managed AWS" environment is the easy one. We have to consider manual bare metal PXE setups as well as private OpenStack clouds. For private clouds like OpenStack (as well as AWS Outposts, for that matter), one could imagine that we ship all of the images in a container, or perhaps a "pristine" disk image in a container, plus the tooling (`cosa gf-oemid`) to stamp it, as well as an operator to pull those bootimages and upload them to glance or private AWS (a rough sketch after this paragraph).

The bare metal PXE case is unmanaged; we are probably just going to have to suffer and try to point people doing this towards managed metal.
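A minimal sketch of that stamp-and-upload idea for the OpenStack case; the `cosa gf-oemid` arguments here are an assumption (check the coreos-assembler docs for the real syntax), while the `openstack image create` call is the standard Glance upload:

```bash
# Sketch: stamp the "pristine" disk image for the target platform, then upload
# it to the private cloud's image service (Glance).
cosa gf-oemid rhcos.qcow2 openstack   # argument order/names are an assumption

openstack image create rhcos-bootimage \
  --disk-format qcow2 --container-format bare \
  --file rhcos.qcow2
```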
As of today though, we will have to support upgrading OpenShift from the `4.1.0` bootimages for... quite a while.