bootimages: Downloading and updating bootimages via release image #201
Conversation
> ## Proposal phase 3: machine-api-bootimage-updater
>
> Add a new component to https://github.com/openshift/machine-api-operator which runs `machine-os-bootimage-generator`, uploads the resulting images to the IaaS layer (AMI, OpenStack Glance, etc.) and updates the machinesets for the cluster.
Are we moving away from public AMIs?
There was some talk about possibly changing machine-controller to an image reference rather than updating each machineset. Either of these cases has the question of mixed image support. Do we plan on doing that in the future, or do we do it now?
openshift/installer#2906 suggests moving away from AMIs, yep.
> The implementation of this is basically shipping a subset of [coreos-assembler](https://github.com/coreos/coreos-assembler) as part of the OpenShift release payload, and teaching `oc` how to invoke `podman` to run it.
>
> The `generate-bootimage` implementation would download the `machine-os-bootimage-generator` container image along with the existing `machine-os-content` container image (OSTree repository), and effectively run the [create_disk.sh](https://github.com/coreos/coreos-assembler/blob/30fbac4e176c7936362efbd647c8199d927e593c/src/create_disk.sh) process or [buildextend-installer](https://github.com/coreos/coreos-assembler/blob/30fbac4e176c7936362efbd647c8199d927e593c/src/cmd-buildextend-installer) for live media, etc.
Why do this at all? If we're already pulling all the binaries to build the image, why not just publish an image that contains an ISO or whatever format?
Good question. I think the basic answer is that if we e.g. ship a container image wrapping each pre-built bootimage, that's at least 700MB per image and there are 7-8 or more images. That's...a lot. So things like `oc adm release mirror` would have to learn how to filter them, and it'd make the OpenShift release process that pushes them much slower.
It just seems more tenable to me to make this dynamic.
> So things like oc adm release mirror would have to learn how to filter them

That seems like a better approach.
Can you elaborate a bit more on why? Just in terms of avoiding needing a new nontrivial container image? Needing to ship code that was formerly "behind the firewall" to run on premise? Something else?
On the bit level, we'll know the image they are booting is the one we built; we don't have to worry about a bug in the builder producing something incorrect in their environment. It's also one less moving piece we need to worry about in-cluster. My preference is to add nothing to the cluster that doesn't 100% need to be there.
> ## Proposal phase 3: machine-api-bootimage-updater
>
> Add a new component to https://github.com/openshift/machine-api-operator which runs `machine-os-bootimage-generator`, uploads the resulting images to the IaaS layer (AMI, OpenStack Glance, etc.) and updates the machinesets for the cluster.
This proposal seems to me an extension of the current upgrade workflow driven by the Machine Config Operator.
If I understand correctly, this is introducing a new upgrading step that runs the `machine-os-bootimage-generator` against the just-pulled `machine-os-content` to build and upload an artifact during an upgrade. So this belongs, together with all the upgrading business logic, in the Machine Config Operator. And only when this last step is complete should the MCO signal a cluster upgrade as completed. Then any component in any environment, e.g. UPI, has an available artifact to consume.
I can't see why we'd want to couple this with the machine API or the machine API operator.
I'd find it more reasonable for the MCO to make the new artifact ID available via e.g. a configMap as part of this new upgrading step and let any consumer, e.g. machineSets, reference it as they see fit, e.g. `spec.Ami: "ocp-cluster-latest"`.
Watching and updating all the machineSets en masse would again effectively be an additional step for an upgrade workflow to be considered complete. And I don't think we should split or couple our upgrading process to something orthogonal like the machine API; this belongs to the Machine Config Operator in my view.
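To make the suggestion concrete, a ConfigMap published by the MCO might look roughly like the following. This is purely illustrative: the name `coreos-bootimages`, the namespace, and the key layout are all assumptions, not part of the proposal.

```yaml
# Hypothetical shape of a ConfigMap the MCO could publish after the
# bootimage-generation step; any consumer (machineSets, CloudFormation
# templates, PXE tooling) would read the artifact IDs from here.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coreos-bootimages                       # assumed name
  namespace: openshift-machine-config-operator  # assumed namespace
data:
  releaseVersion: "4.5.0"
  aws: '{"us-east-1": {"image": "ami-0example1234567890"}}'
  openstack: '{"image": "rhcos-ocp-cluster-latest"}'
```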
> So this belongs together with all the upgrading business logic in the Machine Config Operator. And only when this last step is complete should the MCO signal a cluster upgrade as completed.

I don't think it's that clear cut. Changing the bootimage has no effect on any running system; it's not clearly coupled with the main CVO upgrade logic.
> I can't see why we'd want to couple this with the machine API or the machine API operator.

Because there's nothing to do if the machineAPI isn't in use.
> I'd find it more reasonable for the MCO to make the new artifact ID available via e.g configMap as part of this new upgrading step and let any consumer e.g machineSets to reference it as they see fit e.g spec.Ami: "ocp-cluster-latest".

Again, none of this applies unless machineAPI is in use. Sure, the code could technically live in the MCO but...
Another argument for having it live in machineAPI is that it requires IaaS credentials, usage of those APIs - the MCO isn't set up with that today, but machineAPI is.
Or to expand on this - in a non-machineAPI (i.e. UPI) scenario, it'd be up to the customer to integrate this tooling into whatever they use, such as updating their AWS CloudFormation template, etc. Similar for the "manual bare metal PXE" scenario.
Another thing to note, which is called out in this proposal - this is not replacing the default in-place updates. So even in a machineAPI/IPI scenario, if we edit the machinesets as part of the default CVO flow (whether the MCO edits or machineAPI edits), we would rely on the default machineAPI semantic of not reprovisioning machines when a machineset is changed.
> Again none of this applies unless machineAPI is in use. Sure, the code could technically live in the MCO but...

I see only a tangential point here to the machine API: new machines belonging to a machineSet should come with the latest OS dictated by the MCO. The lifecycle of that OS artifact is transparent to the machine API. The machine API should be a consumer of this, just like any other component/tooling creating a new instance (e.g. CloudFormation) would be. They'd only need to know where this is available.
> Another argument for having it live in machineAPI is that it requires IaaS credentials, usage of those APIs - the MCO isn't set up with that today, but machineAPI is.

The machine API actually has no particular knowledge about IaaS credentials. It just happens to consume whatever the cloud credentials operator makes available for any component in the cluster.
> Or to expand on this - in a non-machineAPI (i.e. UPI) scenario, it'd be up to the customer to integrate this tooling into whatever they use, such as updating their AWS CloudFormation template, etc. Similar for the "manual bare metal PXE" scenario.

I think it's up to the consumer to integrate as they see fit, but to consume a consistently formatted "artifact address" given by the component which is the authoritative source of truth for cluster upgrades, i.e. the MCO. This consumer could be anything: machine API, CloudFormation, etc. They are orthogonal building blocks.
> Another thing to note, which is called out in this proposal - this is not replacing the default in-place updates. So even in a machineAPI/IPI scenario, if we edit the machinesets as part of the default CVO flow (whether the MCO edits or machineAPI edits), we would rely on the default machineAPI semantic of not reprovisioning machines when a machineset is changed.

I'm not in favour of the MCO editing machineSets to complement a cluster upgrade. That would effectively scatter our upgrade process across different components and couple our upgrade process to the machine API.
I think this proposal describes a missing step in our current upgrade workflow driven by the authoritative source of truth for cluster upgrades, i.e. the MCO. Once this is supported, any component/tooling (e.g. machine API, anything) could transparently react as they see fit to consume a consistently formatted source of truth.
I think the proposal is to:
- let the machine config operator ecosystem generate the artifact that's suitable for upload
- the machine-api-operator ecosystem consumes that artifact and creates the cloud-specific image
- the machine-api-operator ecosystem then ensures all required machinesets have the latest image.
I think the separation makes sense because machine config + RHCOS knows best how to create bootimages, and the machine-api ecosystem knows best how to talk to the cloud to create cloud-specific images and interact with Machinesets.
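The three-step split above can be sketched as a small reconciliation function. Everything here is hypothetical (the data shapes, the in-memory "upload", the naming); it only illustrates separating "create the cloud image once" from "point every machineset at it":

```python
# Sketch of the machine-api side of the proposed split: given a new
# bootimage artifact, register it once as a cloud image, then update
# any machinesets still referencing an older image. All names are
# illustrative, not real machine-api-operator code.

def reconcile_bootimages(machinesets, latest_artifact, uploaded_images):
    """Return the names of machinesets whose image field was stale.

    machinesets: list of dicts like {"name": ..., "image": ...}
    latest_artifact: version string of the newest bootimage artifact
    uploaded_images: cache mapping artifact version -> cloud image ID
    """
    # Step 1: ensure the artifact exists as a cloud-specific image.
    if latest_artifact not in uploaded_images:
        # In reality this would call the IaaS API (e.g. register an AMI).
        uploaded_images[latest_artifact] = f"img-{latest_artifact}"
    image_id = uploaded_images[latest_artifact]

    # Step 2: find machinesets that still point at an older image.
    stale = [ms["name"] for ms in machinesets if ms["image"] != image_id]

    # Step 3: update them (here in place; really a PATCH per machineset).
    for ms in machinesets:
        ms["image"] = image_id
    return stale

machinesets = [
    {"name": "worker-a", "image": "img-45.81.202003131108-0"},
    {"name": "worker-b", "image": "img-45.82.202004201200-0"},
]
cache = {"45.81.202003131108-0": "img-45.81.202003131108-0"}
stale = reconcile_bootimages(machinesets, "45.82.202004201200-0", cache)
```

Note the idempotence: running the function again immediately would find no stale machinesets and perform no uploads, which is the property a controller loop like this needs.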
I don't understand why we're not publishing public images in all the clouds. Seems like we're trying to do a ton of stuff to avoid doing that. I don't really want to support an image pipeline that lives on each cluster. The whole point of RHEL CoreOS was to avoid doing that, I thought.
That doesn't cover disconnected environments (on-premise OpenStack), nor does it help bare metal users mirror everything with just the release image, instead of the release image + RHCOS downloads externally.
It's trivial to download an image and put it on OpenStack. If you want to run disconnected, that's the price you pay.
I'm not familiar with what a ground-up bare metal installation process looks like. How much effort does downloading a bootable image add to that process today? It also doesn't seem to solve the workflow of:
In any case, for UPI installs, I don't really have a preference. For IPI installs, we should be using public images IMO.
Not as easy as you'd think - people don't understand that e.g. we require a separate RHCOS per OpenShift minor version. And further, it doesn't address the on-going issue of keeping the images up to date.
Sorry, I disagree - at Red Hat we support a hybrid cloud.
It's not at all the biggest stumbling block; bare metal admins tend to understand this stuff. Ignition is much more new, and learning how to do static IP addressing etc. are bigger issues. But another way to look at this is: with OpenShift 4 (vs 3.x), you no longer need to know how to manage/mirror/update RHEL install media (or AMIs), RPMs and containers by default. You just need to know how to mirror RHCOS media and containers. With this proposal, you just need to know how to mirror containers!
The boot images need to be available somewhere to actually boot the machines. So you still need to have a way to mirror the images, whether you build them yourself or you download them. Once the cluster is up and running, you could automatically build new images, but there's no way to automate uploading the boot images to the provisioning network/storage because we don't own it (in the UPI case). In some cases (many cases?) the provisioning infrastructure is not going to be network accessible from your application network, so an admin will need to download the artifact from the cluster (but, they still need to know which artifact) to their workstation, then upload it to the provisioning storage.
This is explained in a single sentence. There should be a single page quick start guide for every release that tells you exactly what you need to mirror.
Seems like Ignition is the wrong technology for this kind of thing. OSTree works well, but the Ignition model seems to be a large hurdle. IMO that's the elephant in the room: we made more work for ourselves because we have a very narrowly supported installation model that moved the problem from hosts we control to infrastructure we don't.
> In fact, we could aim to switch to having workers use the `bootimage.json` from the release payload immediately after it lands. A downside is this would open up potential for drift between the bootstrap+controlplane and workers.
>
> ## Proposal phase 2: oc adm release generate-bootimage
Can we include details on an option that prints the contents of bootimage.json, like mentioned here https://github.com/openshift/enhancements/pull/201/files#diff-9b570624a774881c701bbfbc18b60109R64 - a way to get the prebuilt bootimages if the user sees fit, as an alternative to generating the whole thing?
This gets into a fundamental question of whether we ship pre-built bootimages going forward in general. Clearly in "phase 1" we continue to do so, as we can't break the way openshift-install works today.
The thing is though, in any disconnected environment it'd require another step for admins to mirror on an ongoing basis.
On the other hand, we absolutely are going to need to make it ergonomic at some point to "boot a single RHCOS node" for testing. There are many valid reasons for this, among them hardware validation before trying a cluster install, experimenting with configuration "live" that one then compiles into Ignition/MachineConfig, etc. And until such time as we have `openshift-install run-one-coreos-instance` or something...people are going to want to download the bootimages directly and boot them too.
So...I'm conflicted.
Maybe what happens in the end is we do both, but the cadence of "golden bootimages" would still be once per release (for the bootstrap). There's no point in us uploading e.g. new AMIs if in practice clusters are going to use in-place update mechanisms anyways.
IOW...I think people doing UPI installs would turn to automating `oc adm release generate-bootimage`.
For phase 1, I think it's necessary for migration.
We could add an `--info` flag, defaulting to true to begin with; when we have actual generation capability we flip `--info` to false, so users are generating the images.
I think `--info` will continue to provide value in the future, because it will allow customers to get information like: what's the RHCOS version, maybe what's included, and if there are any publicly available images they can try. And as long as the output is versioned, users have a way to differentiate and automate.
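For illustration only, a versioned `--info` output could look something like the following; the schema, field names, and values are invented here to show the idea of versioned, automatable output, not a proposed format:

```json
{
  "schemaVersion": "1.0.0",
  "rhcosVersion": "45.81.202003131108-0",
  "images": {
    "aws": {"us-east-1": "ami-0example1234567890"},
    "qemu": {"location": "rhcos-45.81.202003131108-0-qemu.x86_64.qcow2.gz"}
  }
}
```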
OK, I'm thinking about this again. I guess one thing here is that what you most strongly objected to was having `openshift-install` be the frontend for this information (clearly we want to do all the other stuff here too though).
But clearly an "MVP" for this is basically what you're saying with `--info`. The question is how to get there from today, where RHCOS is pinned in the installer. I really don't want two sources of this information (it's already bad enough that docs are updated manually). So...what if we still added a hidden/secret mechanism for `oc adm release new` to download `openshift-install` and extract this data from it somehow?
The simplest would perhaps be a hidden verb like `openshift-install hidden-only-for-oc-output-coreos-build` or so.
Or, we could stick it in a special ELF section (would require `oc adm` to vendor some ELF tools).
Or, we could do a variant of openshift/installer#1422 and basically scrape the binary; the lowest-tech approach...
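The "scrape the binary" variant is the easiest to sketch: embed the metadata behind a unique sentinel string and have tooling scan for it. The sentinel and length-prefixed layout below are made up for illustration; this is not what openshift/installer#1422 actually does:

```python
import json

# Made-up marker; in practice it only needs to be unlikely to occur
# elsewhere in the binary.
SENTINEL = b"__OPENSHIFT_COREOS_METADATA__"

def embed(binary: bytes, metadata: dict) -> bytes:
    """Append sentinel + 4-byte length prefix + JSON to a binary blob."""
    payload = json.dumps(metadata).encode()
    return binary + SENTINEL + len(payload).to_bytes(4, "big") + payload

def scrape(binary: bytes) -> dict:
    """Find the last sentinel and decode the JSON that follows it."""
    idx = binary.rindex(SENTINEL)
    start = idx + len(SENTINEL)
    size = int.from_bytes(binary[start:start + 4], "big")
    return json.loads(binary[start + 4:start + 4 + size])

# Simulate embedding metadata into an installer-like blob and reading
# it back without executing the binary.
blob = embed(b"\x7fELF...installer bytes...", {"rhcos": "45.81.202003131108-0"})
meta = scrape(blob)
```

The appeal of this approach is that the consumer (`oc`) needs no ELF tooling at all, only a byte search; the downside is that nothing validates the payload actually belongs to the binary it's attached to.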
I think we should include an object in the release image like we include machine-os-content. This object can then be used to provide the information as part of `oc adm release info --boot-images`.
Trying to get this data from the installer is not correct imo. The installer's embedded data is mostly for the bootstrap host, and it is currently being used for cluster hosts.
Yes, we are all agreed on the long-term goal of getting the data out of the installer. The problem is that in the short term, having two sources of truth for this data is going to be very problematic.
Now...hmm, we could get the data into a container in the release payload and change `oc adm release new` to patch the `openshift-install` binary, the same way it does for the release image itself?
The downside of doing this is that for anyone building `openshift-install` from git...well, it just won't work. I guess we could change the installer to fetch "the latest" from a URL if built from git, basically the same way we do with the release image.
For metal IPI CI, we stuff the data into the baremetal-installer container because we have to have it. See: openshift/installer#3330
The way users get this data to use metal IPI is really painful (`openshift-install version`, then fetching the rhcos.json from GitHub for that sha!). Basically every on-premise platform needs access to the image locations...
I dislike client-side patching of the installer binary. Storing it somehow next to machine-os-content makes sense to me.
> I dislike client-side patching of the installer binary.

I'm not talking about any client-side patching. This is something that would happen in the OpenShift build system (release controller).

> Storing it somehow next to machine-os-content makes sense to me.

Yes, but the problem with that is that it's the bootstrap node that pulls the release image, so that leaves open the problem of which image to use for the bootstrap.
Now I guess a first transitional step could be changing the installer to use the metadata from the release payload for the cluster, and leave the pinned version only for the bootstrap node.
> > I dislike client-side patching of the installer binary.
>
> I'm not talking about any client side patching. This is something that would happen in the OpenShift build system (release controller).

`oc adm release extract` patches the binary. Sure, the binaries published on mirror.openshift.com already had that run to get them, but users can do that themselves too, client side. It's the default for baremetal since we don't publish ours.

> > Storing it somehow next to machine-os-content makes sense to me.
>
> Yes but the problem with that is that it's the bootstrap node that pulls the release image, so that leaves open the problem of which image to use for the bootstrap.

A user can just use `oc adm release info`? That's at least a consistent place for any release-related information.
I'm in favor of the change. It'll make a single process for mirroring for an offline install. Plus, this goes nicely with the RHEL 8.2 support for installing a fully up-to-date machine by initially installing any updated packages, beyond just those in the GA compose.
One subtle case we will need to handle eventually is that in some clouds there are "properties" of an image (beyond just the raw content) that we need to preserve. For example:
And one that came up recently is around GCP - I want to flag {F,RH}CoreOS as supporting UEFI and secure boot, which will enable TPMs. But an even trickier one in GCP is the nested virt license - we don't want to enable this by default but we do want to enable users to turn it on. We're talking about patching the installer to do this, but if we do that, eventually it'd need to write an in-cluster object too so we know to do it for any images the cluster uploads. For most of the first chunk though I think we're going to need to do something like:
For example, say that we bump the minimum required VMware version in 4.X - that would show up as
I wanted to write something up here about manually updating bootimages. It's not easy, unfortunately. In the case of AWS, for example, openshift/installer copies the AMIs w/encryption. So if, for example, one just grabs the AMIs from the RHCOS metadata in the installer and edits the machinesets to use them, you'd end up with unencrypted disks. More recently, this installer PR switches things to use AWS KMS directly (should appear in 4.5 and above). So someone could also set that up manually. In the case of e.g. GCP and other IaaS clouds, we don't upload bootable images directly; the installer creates them from a serialized form. So anyone who wants to update things would have to replicate that bit too (not particularly difficult, but definitely cloud specific). See also openshift/installer#2906 For bare metal at least, updating bootimages manually should just be a matter of e.g. changing your PXE server or whatever to reference the new images, e.g. 4.3.
> ## Proposal phase 1: bootimage.json in release image
>
> First, the RHCOS build process is changed to inject the current coreos-assembler `meta.json` output for the build into `machine-os-content`. This aims to move the "source of truth" for cluster bootimages into the release image. Nothing would use this to start - however, as soon as we switch to a machine API provisioned control plane, having that process consume this data would be a natural next step.
Could you have some details about what this means?
I've read through what I can find and I don't understand how machine-os-content is produced; there's no source location in the release image.
As a user, how would I consume this? How would I find out which is the image URL for, say, the OpenStack image? Does the meta.json become an annotation on the image or something?

```json
{
  "name": "machine-os-content",
  "annotations": {
    "io.openshift.build.commit.id": "",
    "io.openshift.build.commit.ref": "",
    "io.openshift.build.source-location": "",
    "io.openshift.build.version-display-names": "machine-os=Red Hat Enterprise Linux CoreOS",
    "io.openshift.build.versions": "machine-os=45.81.202003131108-0"
  }
}
```
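To illustrate the question, one hypothetical answer would be surfacing selected `meta.json` fields as an additional annotation alongside the existing ones; the annotation key and payload below are invented, not part of the proposal:

```json
{
  "name": "machine-os-content",
  "annotations": {
    "io.openshift.build.versions": "machine-os=45.81.202003131108-0",
    "coreos.build.meta": "{\"amis\": {\"us-east-1\": \"ami-0example\"}, \"images\": {\"openstack\": \"rhcos-openstack.x86_64.qcow2.gz\"}}"
  }
}
```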
None of this exists yet - it's a proposal.
But this is the same question as #201 (comment) right?
> I don't understand how machine-os-content is produced

Via https://github.com/coreos/coreos-assembler and https://gitlab.cee.redhat.com/coreos/redhat-coreos/
Got it, thank you - I was looking for details about how this should be implemented, to see if we could already do part of it.
The problem for us is that baremetal (and OpenStack) IPI both need to know where the RHCOS images live. Today we do awful hacks with `openshift-install version` and GitHub to fetch the data, and they don't work in CI. There, `openshift-install version` reports a sha that doesn't exist due to the rebasing strategy CI uses. We ended up with a temporary workaround and just copy rhcos.json into the installer container.
I'd be interested to see if we could get a better approach to make the data accessible to end users (this `--info` flag @abhinavdahiya mentioned).
Implementation thought; we shipped osmet and are now making use of it in FCOS (and will for RHCOS 4.6). It'd be useful to try generalizing it to be able to work across other images. Chat summary:
@jlebon said
That said, it needs to be weighed against other implementation approaches, like having the Ignition platform ID be at an easily detectable offset so we can have tooling that "punches" it in without having to do anything like actually mount the disk (i.e. run qemu). That works as long as the only difference between the disk images is the platform ID (and not possibly other things like the bootloader timeout).
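The "punch the platform ID at a detectable offset" alternative can be illustrated in a few lines: if the field sits at a known offset with a fixed width, tooling can rewrite it without mounting the image or running qemu. The offset, width, and NUL-padded layout here are invented for illustration only:

```python
# Invented layout: a fixed-width, NUL-padded platform ID field at a
# known byte offset inside the disk image. Real images would need a
# discoverable marker instead of hardcoded numbers.
PLATFORM_OFFSET = 512
PLATFORM_WIDTH = 16

def punch_platform(image: bytearray, platform: str) -> bytearray:
    """Overwrite the platform ID field in place, NUL-padding to width."""
    encoded = platform.encode()
    if len(encoded) > PLATFORM_WIDTH:
        raise ValueError("platform ID too long for field")
    image[PLATFORM_OFFSET:PLATFORM_OFFSET + PLATFORM_WIDTH] = \
        encoded.ljust(PLATFORM_WIDTH, b"\x00")
    return image

# Build a fake disk image stamped for "qemu", then retarget it to "aws"
# without any filesystem access.
disk = bytearray(1024)
punch_platform(disk, "qemu")
punch_platform(disk, "aws")
field = bytes(disk[PLATFORM_OFFSET:PLATFORM_OFFSET + PLATFORM_WIDTH])
```

This captures why the approach only works if the platform ID is the *sole* per-platform difference: the rewrite touches exactly one field, so anything else that varies (e.g. bootloader timeout) would be left stale.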
This proposes a path towards integrating the bootimage into the release image, becoming managed by the cluster. This aids worker scaleup speed, helps in mirroring OpenShift into disconnected environments, and allows the CoreOS team to avoid supporting in-place updates from OpenShift 4.1 into the foreseeable future.
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting. If this proposal is safe to close now please do so with /lifecycle stale
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting. If this proposal is safe to close now please do so with /lifecycle rotten
Rotten enhancement proposals close after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Reopen the proposal by commenting /close
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@cgwalters: Reopened this PR.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: crawford. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@cgwalters: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
I want to highlight this since important bits got lost in the github comment hiding:

> This proposes a path towards integrating the bootimage into the release image, becoming managed by the cluster. This aids worker scaleup speed, helps in mirroring OpenShift into disconnected environments, and allows the CoreOS team to avoid supporting in-place updates from OpenShift 4.1 into the foreseeable future.