
[cabpk] Support for machine images with pre-pulled container images from non-standard repos #1767

Closed
akutz opened this issue Aug 15, 2019 · 26 comments
Labels
  • area/bootstrap: Issues or PRs related to bootstrap providers
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.

Comments

akutz (Contributor) commented Aug 15, 2019

/kind feature

Note: This issue could also have been filed against the kubernetes-sigs/image-builder repository.

Describe the solution you'd like
Machine images built from the CAPI image builder contain pre-pulled container images for the following components:

  • coredns
  • etcd
  • kube-proxy
  • kube-apiserver
  • kube-controller-manager
  • kube-scheduler

Thanks to kubernetes-sigs/image-builder#16 and kubernetes-sigs/image-builder#19 by @figo, machine images may be built using custom package/container repositories as the source for the above container images.

However, this presents a problem as initially reported by @imikushin -- kubeadm's ClusterConfiguration must be used to specify a custom image repository:

// ImageRepository sets the container registry to pull images from.
// If empty, `k8s.gcr.io` will be used by default; in case of kubernetes version is a CI build (kubernetes version starts with `ci/` or `ci-cross/`)
// `gcr.io/kubernetes-ci-images` will be used as a default for control plane components and for kube-proxy, while `k8s.gcr.io`
// will be used for all the other images.
ImageRepository string `json:"imageRepository,omitempty"`
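
For illustration, a minimal kubeadm ClusterConfiguration that sets this field might look like the following sketch (the API version and registry name are illustrative assumptions):

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.3
imageRepository: registry.example.com/k8s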

There is a way around this -- manually loading the custom images into the node's local image store under the k8s.gcr.io namespace -- however, this has longer-term implications, as it hides the true origin of the custom images.

As @imikushin pointed out, his team uses a custom machine image that merges an incoming kubeadm configuration with a local version that overrides the imageRepository setting to ensure the correct, custom images are used. However, custom machine images are not a desirable solution.

On a call with Ivan and @detiber, Jason suggested the possibility of creating a ConfigMap on the management cluster that could contain overrides for the kubeadm bootstrap provider.

What we all agreed is this -- it should not be the responsibility of the end-user who is deploying a cluster to know that the machine images in use may be pre-loaded with custom container images for the Kubernetes components. Nor should it be the end-user's job to create the kubeadm configuration that specifies the use of those container images.

This issue tracks the discussion around how to support deploying machine images with pre-pulled container images from non-standard image repositories without forcing an end-user to specify the non-standard repository as part of the cluster/machine configuration.

cc @imikushin @detiber @shep @zjs @chuckha @figo

@timothysc timothysc assigned akutz, detiber, figo and chuckha and unassigned akutz Aug 15, 2019
akutz (Contributor, Author) commented Aug 19, 2019

/assign @fabriziopandini

Hi Fabrizio,

I'm assigning this to you to work with @figo on this.

detiber (Member) commented Aug 28, 2019

@fabriziopandini @figo any updates on this area?

akutz (Contributor, Author) commented Aug 28, 2019

I think this was handled by kubernetes-sigs/image-builder#26 by @figo

detiber (Member) commented Aug 29, 2019

@akutz that was the image-builder portion, but we had also discussed having a way for users creating a cluster to not need to know which kubeadm parameters must be overridden in the KubeadmConfig bootstrap configuration.

fabriziopandini (Member) commented Aug 29, 2019

Some notes from initial work:

  • We want to preserve the true origin of the custom images, not mask it by retagging them under k8s.gcr.io.
  • Similarly, we want to preserve the tag of the custom images, not mask it by retagging with an upstream release tag.
  • kubeadm can handle both, provided the image repository and the Kubernetes version are set (see the sketch after this list).
  • Preloading images into the CRI should happen before kubeadm init/join runs, as part of the (machine) image build process (we are considering two options for this).
  • How to make the image repository and Kubernetes version settings transparent to the user is still TBD.
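
A minimal sketch of a KubeadmConfig that sets both values, as referenced in the list above (the v1alpha2 API version and the registry name are illustrative assumptions):

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: controlplane-0
spec:
  clusterConfiguration:
    kubernetesVersion: v1.15.3
    imageRepository: registry.example.com/k8s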

detiber (Member) commented Aug 29, 2019

@chuckha @akutz @fabriziopandini @figo @zjs Since we already have the image-building foundation in place, I think we can isolate any additional changes needed to streamline the UX for setting/defaulting the image configuration in the KubeadmConfig to just CABPK. Given that we still have a non-trivial problem to solve, I suggest we do not block the initial v1alpha2 release on this issue and instead address it as a followup to the initial v1alpha2 release (but prior to v1alpha3). Thoughts?

chuckha (Contributor) commented Aug 29, 2019

Agreed with @detiber. In theory all the plumbing exists to allow custom image repositories, but the experience for the user is quite bad. They have to know magic characters to type in order for their cluster to work.

While this is entirely impractical, I don't see it as blocking the initial release, and it can be addressed in the kubeadm provider afterwards. Worst case, we add a field that only this provider uses/knows about, which is still backwards compatible and doesn't even warrant a minor version change. However, I think there are many things the image builder could do to help this along, such as providing some kind of script or metadata to use before kubeadm init is executed.

figo (Contributor) commented Aug 29, 2019

  • How to make the image repository and Kubernetes version settings transparent to the user is still TBD.

This is where bootstrap-provider-kubeadm plumbing is needed.

Agreed with @detiber. In theory all the plumbing exists to allow custom image repositories, but the experience for the user is quite bad. They have to know magic characters to type in order for their cluster to work.

@chuckha first of all, the default case does not change. The customization-case UX should not be too bad if it is only one additional field to configure, and documentation from the vendor should help alleviate the problem (which vendors are already doing anyway).

detiber (Member) commented Aug 29, 2019

Based on the feedback received, I'm going to bump this to v0.1.x and we can work out the rest of the logistics needed later.

chuckha (Contributor) commented Aug 29, 2019

Had a chat with @figo. Looks like one of the problems is that kubeadm uses 3.1 as a hardcoded constant for the pause container tag. This can likely be addressed at the image-build layer and potentially in kubeadm itself.

angarg commented Sep 17, 2019

I wonder if building on top of kubernetes/enhancements#1159 is a way to get this behavior. In that proposal, a user of kubeadm would supply a kustomize directory structure whose customizations are applied to the static manifests that kubeadm generates. If our provider image contained the location of our kustomization overlays, and if the kubeadm call used it, could that provide us with the mechanism we need to modify the bootstrap process?
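
As a sketch of that idea, one of the patch files in the supplied kustomize directory might retarget a static pod image like this (the patch format is assumed from that proposal, and the registry name is hypothetical):

# Strategic-merge patch applied to the kube-apiserver static pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: registry.example.com/k8s/kube-apiserver:v1.16.0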

detiber (Member) commented Sep 23, 2019

@angarg longer term, yes. However, we would need to support Kubernetes v1.(n-2).x, where n is the latest supported Kubernetes release; for example, if v1.17 were the newest release, the feature would need to exist in v1.15's kubeadm before we could depend on it. Once the minimum supported version of kubeadm has support for kustomize patches, we could start to rely on them.

akutz (Contributor, Author) commented Oct 9, 2019

ping @ncdc @vincepri @juan-lee

ncdc (Contributor) commented Oct 9, 2019

Had a chat with @figo. Looks like one of the problems is that kubeadm uses 3.1 as a hardcoded constant for the pause container tag. This can likely be addressed at the image-build layer and potentially in kubeadm itself.

You can override this, but it's not pretty: you have to pass args to the kubelet to set the --pod-infra-container-image flag.
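
A minimal sketch of plumbing that flag through a KubeadmConfig via nodeRegistration.kubeletExtraArgs (the API version and registry name are illustrative assumptions):

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: controlplane-0
spec:
  initConfiguration:
    nodeRegistration:
      kubeletExtraArgs:
        # docker-specific kubelet flag; overrides the hardcoded pause image
        pod-infra-container-image: registry.example.com/k8s/pause:3.1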

ncdc (Contributor) commented Oct 9, 2019

I have a couple of thoughts around simplifying the UX. Some of them are covered in #1227. At this point I think I'm interested in researching two complementary approaches:

  1. In the absence of a KubeadmConfig and/or an infra machine (e.g. AWSMachine), a component could create these for you and wire them up to your Machine (and do the same thing for Cluster + infrastructure ref). We could have a way to define defaults (e.g. "I want to use CAPZ for all my clusters").

  2. Either separately or in combination with the previous item, have a way to merge user-specified defaults into user-created KubeadmConfigs/AWSMachines/etc. before proceeding to generate bootstrap data, provision machines, etc. This is somewhat like using kustomize to merge in changes (see the sketch after this list).
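
A rough sketch of what the second approach could look like from the user's point of view (the defaulting mechanism, names, and registry are all hypothetical):

# User-created KubeadmConfig, deliberately minimal
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: controlplane-0
spec: {}
---
# The same object after admin-supplied defaults are merged in,
# before bootstrap data is generated
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
metadata:
  name: controlplane-0
spec:
  clusterConfiguration:
    imageRepository: registry.example.com/k8s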

@ncdc ncdc changed the title from "Support for machine images with pre-pulled container images from non-standard repos" to "[cabpk] Support for machine images with pre-pulled container images from non-standard repos" Nov 13, 2019
@ncdc ncdc transferred this issue from kubernetes-retired/cluster-api-bootstrap-provider-kubeadm Nov 13, 2019
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 13, 2019
@ncdc ncdc added this to the v0.3.0 milestone Nov 13, 2019
@ncdc ncdc added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 13, 2019
ncdc (Contributor) commented Nov 13, 2019

@akutz I moved this into CAPI

chuckha (Contributor) commented Nov 15, 2019

It's been a while, so I want to recap what I think I'm reading:

The desired outcome is to have CABPK work with non-default images without the user specifying the location of the images AND without automatically masking the images by renaming them to match the expected images.

This sounds provider specific to me, and it will be a matter of the provider coordinating with the bootstrap provider to set the appropriate flags.

We should also outline which flags/options need to be set:

  1. The image repository.
  2. We need to enable the kubelet to use a different pause container; however, --pod-infra-container-image is a docker-only flag according to the flag help:

root@ip-172-31-42-251:~# ./kubelet --help | grep pause
--pod-infra-container-image string The image whose network/ipc namespaces containers in each pod will use. This docker-specific flag only works when container-runtime is set to docker. (default "k8s.gcr.io/pause:3.1")

Is there a way to get the kubelet to use a different image with a different CRI?

ncdc (Contributor) commented Dec 20, 2019

The desired outcome is to have CABPK work with non-default images without the user specifying the location of the images

Is this accurate? If so, I'm not sure it's possible.

Is setting the image repository and version sufficient?

detiber (Member) commented Dec 20, 2019

The desired outcome is to have CABPK work with non-default images without the user specifying the location of the images

Is this accurate? If so, I'm not sure it's possible.

I think the idea is that a management cluster administrator could somehow provide values to be used as additional defaults for clusters created by workload cluster administrators.

ncdc (Contributor) commented Dec 20, 2019

Ah, that makes sense. We could prototype a standalone POC with a defaulting webhook for KubeadmConfigs that is configurable, maybe via a ConfigMap.
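
A minimal sketch of what such a ConfigMap might contain (everything here, including the names and namespace, is hypothetical; no such component exists yet):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadmconfig-defaults
  namespace: capi-system
data:
  # values a defaulting webhook could merge into new KubeadmConfigs
  imageRepository: registry.example.com/k8s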

@ncdc ncdc added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jan 22, 2020
vincepri (Member) commented

/milestone Next

@k8s-ci-robot k8s-ci-robot modified the milestones: v0.3.0, Next Feb 12, 2020
vincepri (Member) commented

/area bootstrap
/priority backlog

@k8s-ci-robot k8s-ci-robot added area/bootstrap Issues or PRs related to bootstrap providers priority/backlog Higher priority than priority/awaiting-more-evidence. labels Apr 20, 2020
@vincepri vincepri removed the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 20, 2020
fejta-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2020
vincepri (Member) commented

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 20, 2020
vincepri (Member) commented

/close

On October 22nd we discussed this issue and came to the conclusion that it probably shouldn't be solved within Cluster API, but at a layer above. This is already possible today by using KCP's ImageRepository field and pre-baking the container images into a VM image.
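
For reference, an abridged sketch of setting that field on a KubeadmControlPlane (API version, names, and registry are illustrative assumptions; required fields such as the infrastructure template are omitted):

apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: example
spec:
  replicas: 3
  version: v1.18.2
  kubeadmConfigSpec:
    clusterConfiguration:
      imageRepository: registry.example.com/k8s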

k8s-ci-robot (Contributor) commented

@vincepri: Closing this issue.

In response to this:

/close

On October 22nd we discussed this issue and came to the conclusion that it probably shouldn't be solved within Cluster API, but at a layer above. This is already possible today by using KCP's ImageRepository field and pre-baking the container images into a VM image.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
