No RHCOS image in baremetal Machines and MachineSets #2037

Closed
markmc opened this issue Jul 18, 2019 · 9 comments
Labels
platform/baremetal IPI bare metal hosts platform

Comments

@markmc
Contributor

markmc commented Jul 18, 2019

From openshift-metal3/kni-installer#160

In the Generate() function for masters and workers under pkgs/assets/machines, every platform except libvirt generates an RHCOS image asset and includes it in the Machines or MachineSets it generates.

Why isn't the baremetal platform doing this?

(NB - currently on baremetal the generated RHCOS image is the qcow image we use with libvirt for the bootstrap VM. If we were to include bootimage info in the generated machine manifests, we would need to make it have the proper URL instead)

Some of the underlying assumptions of the rhcos.Image abstraction:

- The same image is used for bootstrap, masters, and workers
- That image information is included in the generated machines manifests and also passed to terraform

I think we should follow that basic idea and just have an "oh hey, our bootstrap node boot image is different!" special case.

@markmc
Contributor Author

markmc commented Jul 18, 2019

/label platform/baremetal

@hardys
Contributor

hardys commented Jul 18, 2019

This is also related to openshift-metal3/kni-installer#58 - I think ideally we want to remove the image parameter completely from the install-config, since when the Ironic provisioning of masters and workers happens, the URL needed is different in each case.

That's really an internal implementation detail, and once Ironic is hosted on the bootstrap VM via openshift-metal3/kni-installer#100 we'll know where the images will be, e.g. we can set the terraform variables for instance/driverInfo based on knowledge of the provisioning IP of the bootstrap VM.

So the remaining missing piece is to set the providerSpec for the machineset, which currently gets set to the same install-config image data, which is wrong in the case where it points to the bootstrap VM, and there's no way to differentiate between master/worker images at that level.

@hardys
Contributor

hardys commented Jul 19, 2019

Looking at this, rhcos.Image is currently a string, and it's consumed like:

    machines, err = aws.Machines(clusterID.InfraID, ic, pool, string(*rhcosImage), "master", "master-user-data")

So I wonder if we can instead make Image a struct (renamed to Images?) and then have two keys, bootstrapImage and osImage?

Then for all platforms where these are the same, they'd return the same value.
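
A minimal sketch of that struct idea, purely for illustration. The `Images` type, its field names, and the `imagesFor` helper are hypothetical and not taken from the installer; the URLs are placeholders.

```go
package main

import "fmt"

// Images is a hypothetical replacement for the string-typed rhcos.Image.
// The field names mirror the "bootstrapImage" and "osImage" keys suggested above.
type Images struct {
	BootstrapImage string // image used to boot the bootstrap node
	OSImage        string // image used for masters and workers
}

// sameImage builds an Images value for platforms where one image serves every role.
func sameImage(image string) Images {
	return Images{BootstrapImage: image, OSImage: image}
}

// imagesFor is an illustrative helper, not installer code. Only the
// baremetal platform gets distinct values; all others share one image.
func imagesFor(platform, defaultImage, baremetalBootstrapImage string) Images {
	if platform == "baremetal" {
		return Images{BootstrapImage: baremetalBootstrapImage, OSImage: defaultImage}
	}
	return sameImage(defaultImage)
}

func main() {
	aws := imagesFor("aws", "ami-example", "")
	bm := imagesFor("baremetal",
		"http://example.invalid/rhcos-openstack.qcow2",
		"http://example.invalid/rhcos-qemu.qcow2")

	fmt.Println("aws images identical:", aws.BootstrapImage == aws.OSImage)
	fmt.Println("baremetal images identical:", bm.BootstrapImage == bm.OSImage)
}
```

With a shape like this, callers such as the `aws.Machines` call above would pass the OS image field rather than converting the whole asset to a string.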

@hardys
Contributor

hardys commented Jul 19, 2019

Another possibly less invasive approach would be to add a new abstraction, e.g. rhcos.OSImage, which returns rhcos.Image for all platforms except those where the bootstrap and OS images don't match?

I need to resolve this for bootstrap hosted Ironic ref #2035 so perhaps @abhinavdahiya may be able to provide some feedback on the preferred approach and I'll work on a PR, then rebase #2035 to align with the new interface and enable templating of the OSImage?

Then we can decouple the bootstrap download of that image from the providerSpec for the workers (which will use a cached copy of the same file).

One question is how do we handle updating of this image on upgrade, e.g when the cluster upgrades to a new version, will the machineset providerSpec get updated, and if so how?

@hardys
Contributor

hardys commented Jul 19, 2019

> One question is how do we handle updating of this image on upgrade, e.g when the cluster upgrades to a new version, will the machineset providerSpec get updated, and if so how?

Discussion indicates that currently no platforms support changing the boot image post deploy, and rely on the upgrade of the bootimage after provisioning. So that simplifies things a bit.

@cgwalters
Member

> Discussion indicates that currently no platforms support changing the boot image post deploy,

openshift/os#381

@hardys
Contributor

hardys commented Jul 19, 2019

> > Discussion indicates that currently no platforms support changing the boot image post deploy,
>
> openshift/os#381

Ack thanks - the issue for Baremetal IPI right now is that we download/cache the bootimage at the time of initial deployment. As support for updating the bootimage evolves we can certainly figure out how to refresh that, but it seems that we can defer that work until the initial integration is completed.

@hardys
Contributor

hardys commented Jul 22, 2019

/assign

hardys pushed a commit to hardys/rhcos-downloader that referenced this issue Jul 23, 2019
The URL provided internally by the installer is the full path
to the openstack qcow file, and the baseURI is supposed to be an
internal detail, so we need to handle this case for the planned
switch to use installer generated image references:

openshift/installer#2037
openshift/installer#2061

The previous baseURI method is maintained for backwards compatibility
and can be removed later when the installer changes are complete,
and at that point we probably want to remove the latest symlinks
as I think we need to deal with explicit image references in the
providerSpec, then have the BMO (and terraform) look at the mirror
location for the locally downloaded/compressed version.

stbenjam pushed a commit to hardys/installer that referenced this issue Jul 29, 2019
For the baremetal platform we require two different bootimages,
the QEMU one for the libvirt based bootstrap VM, and the OpenStack
one that contains the necessary Ironic config drive support to pass
data to ignition.

So we rework the Image abstraction by adding a new BootstrapImage
type/asset, which returns the same as Image in all cases except for
the baremetal platform.

This also aligns with the tfvars renamed in openshift#2044 and allows us to
pass the rhcos.QEMU image via BootstrapImage to terraform but leaves
the OpenStack image URL available for future use to deploy masters
via follow-up PRs that implement issue openshift#2060 and also correctly set
the worker machineset providerSpec for the baremetal-operator.

Related: openshift#2037
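
The commit message above describes a separate BootstrapImage that matches Image everywhere except baremetal. A rough sketch of that shape follows; the `imageLookup` type, the function names, and the flavor strings are assumptions made for this example and are not the installer's actual code.

```go
package main

import "fmt"

// Image and BootstrapImage mirror the two asset types named in the commit
// message. Both are plain strings here for simplicity.
type Image string
type BootstrapImage string

// imageLookup is a hypothetical stand-in for however image locations are
// resolved for a given flavor (for example "aws", "qemu", "openstack").
type imageLookup func(flavor string) string

// osImage returns the image used for masters and workers. On baremetal this
// is assumed to be the OpenStack flavor, per the commit message.
func osImage(platform string, lookup imageLookup) Image {
	if platform == "baremetal" {
		return Image(lookup("openstack"))
	}
	return Image(lookup(platform))
}

// bootstrapImage returns the same value as osImage in all cases except
// baremetal, where the libvirt-based bootstrap VM needs the QEMU flavor.
func bootstrapImage(platform string, lookup imageLookup) BootstrapImage {
	if platform == "baremetal" {
		return BootstrapImage(lookup("qemu"))
	}
	return BootstrapImage(osImage(platform, lookup))
}

func main() {
	// Placeholder lookup that only labels the flavor it was asked for.
	lookup := func(flavor string) string { return "rhcos-" + flavor }

	for _, platform := range []string{"aws", "baremetal"} {
		fmt.Printf("%s: os=%s bootstrap=%s\n",
			platform, osImage(platform, lookup), bootstrapImage(platform, lookup))
	}
}
```

Keeping Image unchanged means existing consumers of the machine manifests are untouched, and only the bootstrap path has to learn about the new type.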
@juliakreger

@dhellmann FYI

jhixson74 pushed a commit to jhixson74/installer that referenced this issue Dec 6, 2019
This removes the image install-config input; instead we calculate the
correct cache URL based on knowledge of how the rhcos-downloader container
works on the bootstrap VM and cluster.

Closes: openshift#2064
Closes: openshift#2037
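
As a loose illustration of what calculating a cache URL could look like: the sketch below keeps the upstream file name and points it at a local cache host. The `cacheURL` function, the `/images` path prefix, the `http` scheme, and the addresses are all assumptions for this example; the thread does not state the layout that rhcos-downloader actually uses.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// cacheURL maps an upstream image URL to a location on a local cache host.
// Assumption: the cache serves files under /images using the upstream
// file name. The real rhcos-downloader layout may differ.
func cacheURL(upstream, cacheHost string) (string, error) {
	u, err := url.Parse(upstream)
	if err != nil {
		return "", err
	}
	name := path.Base(u.Path)
	// path.Base yields "." for an empty path and "/" for a bare root.
	if name == "." || name == "/" {
		return "", fmt.Errorf("no file name in %q", upstream)
	}
	cached := url.URL{Scheme: "http", Host: cacheHost, Path: path.Join("/images", name)}
	return cached.String(), nil
}

func main() {
	got, err := cacheURL("https://example.invalid/releases/rhcos-openstack.qcow2", "192.0.2.10")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Deriving the location this way is what lets the image input drop out of the install-config: the only external knowledge needed is the upstream image reference and the address of the host serving the cache.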