MCO-679: MCO-830: Managed boot images MVP #4083

Merged
merged 5 commits into openshift:master from the manage-boot-images-gcp branch
Jan 8, 2024

Conversation

djoshy
Contributor

@djoshy djoshy commented Dec 18, 2023

This implements openshift/enhancements#1496, for the GCP platform.

What does it do?

Adds a new subcontroller, machine_set_boot_image_controller, to the MCC, which:

  • Listens for machineset changes and coreos-bootimages configmap changes
  • Determines the cluster's architecture and infrastructure to index the correct boot image
  • Updates machineset(s) with newer boot images if there is a delta, once 1 master node has completed the upgrade
  • Patches the machine set via the machine client (roughly equivalent to the manual patch sketched below)
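
As a hedged illustration only, the patch the controller applies is roughly equivalent to the manual JSON patch below. It assumes the GCP providerSpec keeps the boot image at disks[0].image; <machineset> and <new-rhcos-image> are placeholders, not values from this PR.

# Sketch of an equivalent manual patch (the controller itself goes through the machine client)
oc patch machineset <machineset> -n openshift-machine-api --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/providerSpec/value/disks/0/image", "value": "projects/rhcos-cloud/global/images/<new-rhcos-image>"}]'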

How do I test this?

  • Launch a cluster on 4.15. Check the first few lines of the machine config daemon logs, and make a note of the node's aleph version. This is the original boot image.
I1218 20:35:06.196301    2320 rpm-ostree.go:308] Running captured: rpm-ostree status
I1218 20:35:06.240687    2320 daemon.go:1598] State: idle
Deployments:
* ostree-unverified-registry:registry.ci.openshift.org/ocp/4.16-2023-12-18-045249@sha256:a2900a99595cbf0cf3862d0066d37b2e95a95e032659469d59f616eb9752d3c0
                   Digest: sha256:a2900a99595cbf0cf3862d0066d37b2e95a95e032659469d59f616eb9752d3c0
                  Version: 416.92.202312150201-0 (2023-12-18T16:15:22Z)
I1218 20:35:06.241321    2320 coreos.go:53] CoreOS aleph version: mtime=2023-11-24 16:50:34.214 +0000 UTC
{
   "build": "415.92.202311241643-0",
   "imgid": "rhcos-415.92.202311241643-0-qemu.x86_64.qcow2",
   "ostree-commit": "3aff20eacec06af854303111319e74d9dc84c241af5c57dc8ae3330a8ae5b086",
   "ref": ""
}


  • Upgrade to this PR. This should create a version skew between the boot images and the current node image. During this upgrade, the coreos-bootimages configmap will be updated by a CVO manifest to the newest boot images.
  • Turn on the tech preview feature gate by applying this YAML:
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
  • This should trigger a machineset reconciliation loop. Observe the MCC logs with this command:
oc logs -n openshift-machine-config-operator -f "$(oc get pod -o name -l='k8s-app=machine-config-controller' -n openshift-machine-config-operator)" | grep "machine_set"

You should see all the machinesets being patched:

I1218 20:31:52.657798       1 machine_set_boot_image_controller.go:143] "FeatureGates changed" enabled=["AdminNetworkPolicy","AlibabaPlatform","AutomatedEtcdBackup","AzureWorkloadIdentity","BuildCSIVolumes","CSIDriverSharedResource","CloudDualStackNodeIPs","DNSNameResolver","DynamicResourceAllocation","ExternalCloudProvider","ExternalCloudProviderAzure","ExternalCloudProviderExternal","ExternalCloudProviderGCP","GCPClusterHostedDNS","GCPLabelsTags","GatewayAPI","InsightsConfigAPI","InstallAlternateInfrastructureAWS","KMSv1","MachineAPIProviderOpenStack","MachineConfigNodes","ManagedBootImages","MaxUnavailableStatefulSet","MetricsServer","MixedCPUsAllocation","NetworkLiveMigration","NodeSwap","OnClusterBuild","OpenShiftPodSecurityAdmission","PrivateHostedZoneAWS","RouteExternalCertificate","SignatureStores","SigstoreImageVerification","VSphereControlPlaneMachineSet","VSphereStaticIPs","ValidatingAdmissionPolicy"] disabled=["ClusterAPIInstall","DisableKubeletCloudCredentialProviders","EventedPLEG","MachineAPIOperatorDisableMachineHealthCheckController"]
I1218 20:31:52.657849       1 machine_set_boot_image_controller.go:156] Trigger a sync as this feature was turned on
I1218 20:31:52.657964       1 machine_set_boot_image_controller.go:560] Reconciling machineset djoshy10-rvsmr-worker-a on GCP, with arch x86_64
I1218 20:31:52.658487       1 machine_set_boot_image_controller.go:560] Reconciling machineset djoshy10-rvsmr-worker-b on GCP, with arch x86_64
I1218 20:31:52.709865       1 machine_set_boot_image_controller.go:585] New target boot image: projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
I1218 20:31:52.709888       1 machine_set_boot_image_controller.go:586] Current image: projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-gcp-x86-64
I1218 20:31:52.709961       1 machine_set_boot_image_controller.go:402] Patching machineset djoshy10-rvsmr-worker-a
I1218 20:31:52.711798       1 machine_set_boot_image_controller.go:405] No patching required for machineset djoshy10-rvsmr-worker-b

If this doesn't happen, or if the logs say that no patching was required for the machinesets, then no skew was found between the boot images and the last upgrade.

  • Scale up a new node on any of the machinesets
oc scale --replicas=2 machineset <machineset> -n openshift-machine-api
  • Once the node is up and in the worker pool, observe the MCD logs of the newly scaled-up node. At daemon startup, the aleph version should now indicate the new boot image (see the sketch below for locating that pod).
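
For that last step, the MCD pod running on the new node can be located and its aleph line checked with something like this (a sketch; <node> is the new node's name):

# Find the machine-config-daemon pod on the new node and grep its aleph version
MCD_POD=$(oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=<node> -o name)
oc logs -n openshift-machine-config-operator -c machine-config-daemon "$MCD_POD" | grep -A6 'CoreOS aleph version'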

Other notes:
Once the feature gate is removed, this reconciliation will take place once at least 1 master node has completed an upgrade (and no other upgrades are pending). This is by design, as one would expect boot image updates to take place during an upgrade.
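
Since that gate is per-node, one way to see which control-plane nodes have finished updating is to inspect the MCD state annotation on each node. This is a hedged sketch that assumes the standard machineconfiguration.openshift.io/state annotation ("Done" meaning the node has completed its update):

# Show each master node's MCD state annotation
oc get nodes -l node-role.kubernetes.io/master -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.machineconfiguration\.openshift\.io/state}{"\n"}{end}'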

This vendors in:
- informers and listers for the machine API objects
- stream-metadata-go for CoreOS stream objects
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 18, 2023
Contributor

openshift-ci bot commented Dec 18, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 18, 2023
@djoshy djoshy force-pushed the manage-boot-images-gcp branch from ae4735d to e6696b6 on December 19, 2023 15:07
@djoshy djoshy changed the title Managed boot images MVP MCO-679: MCO-830: Managed boot images MVP Dec 19, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 19, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Dec 19, 2023

@djoshy: This pull request references MCO-679 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this: (see the PR description above)

@djoshy djoshy marked this pull request as ready for review December 19, 2023 15:20
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 19, 2023
@openshift-ci openshift-ci bot requested review from jkyros and sinnykumari December 19, 2023 15:25
@djoshy djoshy force-pushed the manage-boot-images-gcp branch 4 times, most recently from 7f9e952 to 3408c6c on December 19, 2023 19:55
@djoshy djoshy force-pushed the manage-boot-images-gcp branch from 3408c6c to bf9193a on December 19, 2023 21:38
@rioliu-rh

/hold for QE verification

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 19, 2023
@rioliu-rh

rioliu-rh commented Dec 20, 2023

Verification
Install a cluster with build 4.15.0-0.nightly-2023-12-19-033450 on GCP.

$ oc get infrastructures/cluster -o jsonpath='{.status.platform}'
GCP

Check the boot image info in the MCD log:

$ oc logs -n openshift-machine-config-operator -c machine-config-daemon machine-config-daemon-wsqv5 | grep -A6 'CoreOS aleph version'
I1220 07:26:08.690917    2258 coreos.go:53] CoreOS aleph version: mtime=2023-11-24 16:50:34.214 +0000 UTC
{
   "build": "415.92.202311241643-0",
   "imgid": "rhcos-415.92.202311241643-0-qemu.x86_64.qcow2",
   "ostree-commit": "3aff20eacec06af854303111319e74d9dc84c241af5c57dc8ae3330a8ae5b086",
   "ref": ""
}

Take a pre-upgrade snapshot of configmap/coreos-bootimages:

oc get cm coreos-bootimages -n openshift-machine-config-operator -o yaml > /tmp/coreos-bootimages_pre_upgrade.yaml

Upgrade the cluster to the image built from this PR (registry.build05.ci.openshift.org/ci-ln-1byts4t/release:latest):

$ oc adm upgrade --to-image registry.build05.ci.openshift.org/ci-ln-1byts4t/release:latest --force --allow-explicit-upgrade
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Requested update to release image registry.build05.ci.openshift.org/ci-ln-1byts4t/release:latest


$ oc get clusterversion version -o yaml | yq '.status.history[]|[.version, .state]'
[
  "4.15.0-0.ci.test-2023-12-20-055728-ci-ln-1byts4t-latest",
  "Completed"
]
[
  "4.15.0-0.nightly-2023-12-19-033450",
  "Completed"
]

Take a post-upgrade snapshot of configmap/coreos-bootimages and diff it against the pre-upgrade snapshot:

$ oc get cm coreos-bootimages -n openshift-machine-config-operator -o yaml > /tmp/coreos-bootimages_post_upgrade.yaml

$ diff /tmp/coreos-bootimages_pre_upgrade.yaml /tmp/coreos-bootimages_post_upgrade.yaml
2a3,4
>   MCOReleaseImageVersion: 4.15.0-0.ci.test-2023-12-20-055728-ci-ln-1byts4t-latest
>   MCOVersionHash: 239869f9c3f7befd579a23f7b453e523a7084dd2
944c946
<   resourceVersion: "845"
---
>   resourceVersion: "64197"

Make sure the ManagedBootImages feature gate is disabled:

$ oc get featuregate/cluster -o jsonpath='{.status.featureGates[?(@.version=="4.15.0-0.ci.test-2023-12-20-055728-ci-ln-1byts4t-latest")].disabled[?(@.name=="ManagedBootImages")]}'
{"name":"ManagedBootImages"}

Enable the ManagedBootImages feature gate with featureSet TechPreviewNoUpgrade:

$ cat ~/mco_test/mc/featuregate_techpreview.yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade

$ oc apply -f ~/mco_test/mc/featuregate_techpreview.yaml
Warning: resource featuregates/cluster is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
featuregate.config.openshift.io/cluster configured

Monitor the MCC pod log and grep for machine_set_boot_image_controller:

$ oc logs -f -n openshift-machine-config-operator -c machine-config-controller machine-config-controller-85584647dc-dx5hv | grep machine_set_boot_image_controller.go
I1220 09:15:59.497802       1 machine_set_boot_image_controller.go:143] "FeatureGates changed" enabled=["AdminNetworkPolicy","AlibabaPlatform","AutomatedEtcdBackup","AzureWorkloadIdentity","BuildCSIVolumes","CSIDriverSharedResource","CloudDualStackNodeIPs","DNSNameResolver","DynamicResourceAllocation","ExternalCloudProvider","ExternalCloudProviderAzure","ExternalCloudProviderExternal","ExternalCloudProviderGCP","GCPClusterHostedDNS","GCPLabelsTags","GatewayAPI","InsightsConfigAPI","InstallAlternateInfrastructureAWS","MachineAPIProviderOpenStack","MachineConfigNodes","ManagedBootImages","MaxUnavailableStatefulSet","MetricsServer","MixedCPUsAllocation","NetworkLiveMigration","NodeSwap","OnClusterBuild","OpenShiftPodSecurityAdmission","PrivateHostedZoneAWS","RouteExternalCertificate","SignatureStores","SigstoreImageVerification","VSphereControlPlaneMachineSet","VSphereStaticIPs","ValidatingAdmissionPolicy"] disabled=["ClusterAPIInstall","DisableKubeletCloudCredentialProviders","EventedPLEG","MachineAPIOperatorDisableMachineHealthCheckController"]
I1220 09:15:59.497939       1 machine_set_boot_image_controller.go:156] Trigger a sync as this feature was turned on
I1220 09:15:59.719712       1 machine_set_boot_image_controller.go:272] configMap coreos-bootimages added, reconciling all machine sets
I1220 09:15:59.732187       1 machine_set_boot_image_controller.go:228] MachineSet rioliu-1220c-bz2gp-worker-a added
I1220 09:15:59.732206       1 machine_set_boot_image_controller.go:228] MachineSet rioliu-1220c-bz2gp-worker-b added
I1220 09:15:59.733350       1 machine_set_boot_image_controller.go:228] MachineSet rioliu-1220c-bz2gp-worker-c added
I1220 09:15:59.733379       1 machine_set_boot_image_controller.go:228] MachineSet rioliu-1220c-bz2gp-worker-f added
I1220 09:15:59.739239       1 machine_set_boot_image_controller.go:174] Starting MachineConfigController-MachineSetBootImageController
I1220 09:15:59.739355       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1220c-bz2gp-worker-a on GCP, with arch x86_64
I1220 09:15:59.741197       1 machine_set_boot_image_controller.go:405] No patching required for machineset rioliu-1220c-bz2gp-worker-a
I1220 09:15:59.741249       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1220c-bz2gp-worker-b on GCP, with arch x86_64
I1220 09:15:59.742440       1 machine_set_boot_image_controller.go:405] No patching required for machineset rioliu-1220c-bz2gp-worker-b
I1220 09:15:59.742482       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1220c-bz2gp-worker-c on GCP, with arch x86_64
I1220 09:15:59.743649       1 machine_set_boot_image_controller.go:405] No patching required for machineset rioliu-1220c-bz2gp-worker-c
I1220 09:15:59.743704       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1220c-bz2gp-worker-f: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1220 09:15:59.796857       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1220c-bz2gp-worker-f: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1220 09:15:59.807413       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1220c-bz2gp-worker-f: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1220 09:15:59.827610       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1220c-bz2gp-worker-f: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1220 09:15:59.882642       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1220c-bz2gp-worker-f: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
E1220 09:16:00.002887       1 machine_set_boot_image_controller.go:216] failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1220 09:16:00.003003       1 machine_set_boot_image_controller.go:217] Dropping machineset "openshift-machine-api/rioliu-1220c-bz2gp-worker-f" out of the work queue: failed to fetch architecture type of machineset rioliu-1220c-bz2gp-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)

Check the updated boot image info in the MCD log:

$ oc logs -n openshift-machine-config-operator -c machine-config-daemon machine-config-daemon-t6dnd | grep -A6 'CoreOS aleph version'
I1220 09:09:52.338206    1510 coreos.go:53] CoreOS aleph version: mtime=2023-11-24 16:50:34.214 +0000 UTC
{
   "build": "415.92.202311241643-0",
   "imgid": "rhcos-415.92.202311241643-0-qemu.x86_64.qcow2",
   "ostree-commit": "3aff20eacec06af854303111319e74d9dc84c241af5c57dc8ae3330a8ae5b086",
   "ref": ""
}

No changes found. What's wrong?

/cc @sergiordlr @djoshy

Maybe I need to find an older nightly to test.

@djoshy
Contributor Author

djoshy commented Dec 20, 2023

Thanks for the testing, Rio!

Yes, you are correct; you will need to use an older nightly. There are no configmap changes in this case. The diff on the configmap right now only shows some keys that the MCO sets up to avoid race conditions. Essentially, you need this file in the installer repo to differ between the original nightly and the new upgrade target:
https://github.com/openshift/installer/blob/master/data%2Fdata%2Fcoreos%2Frhcos.json
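
One way to check for such a skew is to inspect the GCP boot image recorded in the coreos-bootimages configmap and compare it with what the machinesets currently reference. A sketch, assuming the stream-metadata layout under the configmap's stream key:

# GCP boot image (project/name) as recorded in the CoreOS stream metadata carried by the configmap
oc get cm coreos-bootimages -n openshift-machine-config-operator -o jsonpath='{.data.stream}' | jq -r '.architectures.x86_64.images.gcp'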

Another interesting thing I noticed is that a machine couldn't be found for the machineset rioliu-1220c-bz2gp-worker-f. I use the machines to determine the architecture of the node. Did it not have any replicas? If that is the case, it is something I should check against; I forgot to do that.

@rioliu-rh

rioliu-rh commented Dec 20, 2023

Another interesting thing I noticed is that a machine couldn't be found for the machineset rioliu-1220c-bz2gp-worker-f. I use the machines to determine the architecture of the node. Did it not have any replicas? If that is the case, it is something I should check against; I forgot to do that.

@djoshy I think this error is expected, because no machine was provisioned by this machineset:

$ oc get machineset/rioliu-1220c-bz2gp-worker-f -n openshift-machine-api
NAME                          DESIRED   CURRENT   READY   AVAILABLE   AGE
rioliu-1220c-bz2gp-worker-f   0         0                             3h47m

Question: when the machineset is modified and scaled up, can we sync the boot image changes at that time?

Opened a bug to track this: OCPBUGS-25725

@djoshy
Contributor Author

djoshy commented Dec 20, 2023

It looks like this PR last updated the boot images:
openshift/installer#7770 (comment)
So if you start from a nightly before 4.15.0-0.nightly-2023-12-07-041003, we should have a skew in boot images.

Question: when the machineset is modified and scaled up, can we sync the boot image changes at that time?

Yes, any time the machineset is modified, the boot image sync will happen again. I do think I can clean up the error to be less noisy, though (it is not fatal right now).
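
In other words, any edit to the machineset object should re-queue it; for example, something as simple as this (the annotation key here is arbitrary, purely to generate an update event) should make the controller reconcile that machineset again:

# Touch the machineset with a throwaway annotation to generate an update event
oc -n openshift-machine-api annotate machineset <machineset> boot-image-sync-test="$(date +%s)" --overwrite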

@djoshy
Contributor Author

djoshy commented Dec 20, 2023

Per the latest comment on https://issues.redhat.com/browse/OCPBUGS-25725, we have run into an issue where a machineset gets the older boot image when scaled up from 0 replicas for the very first time. Jerry and I discussed this, and as this is under the feature gate, we thought it best to fix it in a separate PR against that bug after this merges. Is that OK with you, @sergiordlr @rioliu-rh?

Contributor

@yuqi-zhang yuqi-zhang left a comment

Did a first pass and the general architecture makes sense. Thanks for doing this, David!

// Any machine from this slice will be of the same architecture, so grab the first one
// Cycle through nodes, compare to annotations to find node with matching machine

nodes, err := ctrl.nodeLister.List(labels.Everything())
Contributor

We already chatted about this, and I think our plan is to read it off the label first, then fall back to either the nodes here or the control plane to match the autoscaler. But I wonder if we can skip the node check altogether and just follow the same steps outlined there?

Contributor Author

@djoshy djoshy Dec 20, 2023

We totally could, and I was debating that as I was reworking this. The labels were present in my regular 3+3 cluster today, so I think it may have been a newer change that I hadn't caught on to yet.

Contributor Author

Opened a PR which resolves this: #4088
This should also fix https://issues.redhat.com/browse/OCPBUGS-25725

// has completed an upgrade). This "stamp" is used by the machine set controller as a safety before
// it updates boot images.

func (optr *Operator) stampBootImagesCM(pool *mcfgv1.MachineConfigPool) error {
Contributor

This is basically the same logic as the MCS served config, right?

Contributor Author

@djoshy djoshy Dec 20, 2023

Yep, the only difference is that here we gate it on 1 master node completing the update, while the MCS waits for at least 1 node in the pool to complete the update before serving it to new nodes.

Here's the actual commit for reference:

4bd204d
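
For what it's worth, whether the stamp has been laid down can be checked directly on the configmap; the diff earlier in this thread suggests the keys live under .data, so a quick check would look something like:

# Show the version stamp keys written into the coreos-bootimages configmap
oc get cm coreos-bootimages -n openshift-machine-config-operator -o jsonpath='{.data.MCOReleaseImageVersion}{"\n"}{.data.MCOVersionHash}{"\n"}'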

@rioliu-rh

Per the latest comment on issues.redhat.com/browse/OCPBUGS-25725, we have run into an issue where a machineset gets the older boot image when scaled up from 0 replicas for the very first time. Jerry and I discussed this, and as this is under the feature gate, we thought it best to fix it in a separate PR against that bug after this merges. Is that OK with you, @sergiordlr @rioliu-rh?

I'm OK with that; we only need to verify the machineset patching function in this PR.

@rioliu-rh

rioliu-rh commented Dec 21, 2023

2nd round of testing
Set up a cluster with build 4.15.0-ec.2 on GCP.
Pre-upgrade check: boot image

$ oc logs -n openshift-machine-config-operator -c machine-config-daemon $(oc get pod -l k8s-app=machine-config-daemon -n openshift-machine-config-operator -ojsonpath='{.items[0].metadata.name}') | grep 'CoreOS aleph version'
I1221 04:58:14.425691    1476 coreos.go:54] CoreOS aleph version: mtime=2023-10-31 00:44:27.92 +0000 UTC build=415.92.202310310037-0 imgid=rhcos-415.92.202310310037-0-qemu.x86_64.qcow2

Take a pre-upgrade snapshot of cm/coreos-bootimages:

$ oc get cm/coreos-bootimages -n openshift-machine-config-operator -o yaml > /tmp/coreos-bootimages_pre_upgrade.yaml

Upgrade to the CI image:

$ oc adm upgrade --to-image registry.build05.ci.openshift.org/ci-ln-4xnxd1b/release:latest --force --allow-explicit-upgrade
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Requested update to release image registry.build05.ci.openshift.org/ci-ln-4xnxd1b/release:latest

$ oc get clusterversion version -o yaml | yq '.status.history[]|[.version, .state]'
[
  "4.15.0-0.ci.test-2023-12-21-012105-ci-ln-4xnxd1b-latest",
  "Completed"
]
[
  "4.15.0-ec.2",
  "Completed"
]

Take a post-upgrade snapshot of cm/coreos-bootimages and diff it with the pre-upgrade snapshot; the boot image is updated:

$ oc get cm/coreos-bootimages -n openshift-machine-config-operator -o yaml > /tmp/coreos-bootimages_post_upgrade.yaml

$ cat /tmp/coreos-bootimages_pre_upgrade.yaml | yq -r '.data.stream' | jq '.architectures.x86_64.artifacts.gcp'
{
  "release": "415.92.202310310037-0",
  "formats": {
    "tar.gz": {
      "disk": {
        "location": "https://rhcos.mirror.openshift.com/art/storage/prod/streams/4.15-9.2/builds/415.92.202310310037-0/x86_64/rhcos-415.92.202310310037-0-gcp.x86_64.tar.gz",
        "sha256": "bb79cc15d8404e464610edd05219edbfcb8ec578f24cbf2a7a62f172203e86ff",
        "uncompressed-sha256": "4f73296738173a253821696259ff388e8b16e2947cdce7db2e3b23e97ff14a4a"
      }
    }
  }
}

$ cat /tmp/coreos-bootimages_post_upgrade.yaml | yq -r '.data.stream' | jq '.architectures.x86_64.artifacts.gcp'
{
  "release": "415.92.202311241643-0",
  "formats": {
    "tar.gz": {
      "disk": {
        "location": "https://rhcos.mirror.openshift.com/art/storage/prod/streams/4.15-9.2/builds/415.92.202311241643-0/x86_64/rhcos-415.92.202311241643-0-gcp.x86_64.tar.gz",
        "sha256": "c539cbb353ec8ff1ea7def11f20f5a38f3b1333c0173a810dee80c8ede703463",
        "uncompressed-sha256": "4f58bbc44ea99af7047c77cfdd72b1d132f652aac0249b02aafcaeb5460dd27f"
      }
    }
  }
}

enable featureSet: TechPreviewNoUpgrade

$ oc get featuregate/cluster -o jsonpath='{.status.featureGates[?(@.version=="4.15.0-0.ci.test-2023-12-21-012105-ci-ln-4xnxd1b-latest")].enabled[?(@.name=="ManagedBootImages")]}'
{"name":"ManagedBootImages"}

Monitor the MCC pod log:

I1221 06:41:15.781240       1 machine_set_boot_image_controller.go:156] Trigger a sync as this feature was turned on
I1221 06:41:15.781351       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1221c-h72r5-worker-a on GCP, with arch x86_64
I1221 06:41:15.783093       1 event.go:298] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"openshift-machine-config-operator", Name:"openshift-machine-config-operator", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'FeatureGatesModified' FeatureGates updated to featuregates.Features{Enabled:[]v1.FeatureGateName{"AdminNetworkPolicy", "AlibabaPlatform", "AutomatedEtcdBackup", "AzureWorkloadIdentity", "BuildCSIVolumes", "CSIDriverSharedResource", "CloudDualStackNodeIPs", "DNSNameResolver", "DynamicResourceAllocation", "ExternalCloudProvider", "ExternalCloudProviderAzure", "ExternalCloudProviderExternal", "ExternalCloudProviderGCP", "GCPClusterHostedDNS", "GCPLabelsTags", "GatewayAPI", "InsightsConfigAPI", "InstallAlternateInfrastructureAWS", "MachineAPIProviderOpenStack", "MachineConfigNodes", "ManagedBootImages", "MaxUnavailableStatefulSet", "MetricsServer", "MixedCPUsAllocation", "NetworkLiveMigration", "NodeSwap", "OnClusterBuild", "OpenShiftPodSecurityAdmission", "PrivateHostedZoneAWS", "RouteExternalCertificate", "SignatureStores", "SigstoreImageVerification", "VSphereControlPlaneMachineSet", "VSphereStaticIPs", "ValidatingAdmissionPolicy"}, Disabled:[]v1.FeatureGateName{"ClusterAPIInstall", "DisableKubeletCloudCredentialProviders", "EventedPLEG", "MachineAPIOperatorDisableMachineHealthCheckController"}}
I1221 06:41:15.783234       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1221c-h72r5-worker-b on GCP, with arch x86_64
I1221 06:41:15.787995       1 machine_set_boot_image_controller.go:585] New target boot image: projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
I1221 06:41:15.788017       1 machine_set_boot_image_controller.go:586] Current image: projects/rhcos-cloud/global/images/rhcos-415-92-202310310037-0-gcp-x86-64
I1221 06:41:15.788226       1 machine_set_boot_image_controller.go:402] Patching machineset rioliu-1221c-h72r5-worker-b
I1221 06:41:15.805047       1 machine_set_boot_image_controller.go:585] New target boot image: projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
I1221 06:41:15.805072       1 machine_set_boot_image_controller.go:586] Current image: projects/rhcos-cloud/global/images/rhcos-415-92-202310310037-0-gcp-x86-64
I1221 06:41:15.820665       1 machine_set_boot_image_controller.go:402] Patching machineset rioliu-1221c-h72r5-worker-a
I1221 06:41:15.958282       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1221c-h72r5-worker-c on GCP, with arch x86_64
I1221 06:41:15.967055       1 machine_set_boot_image_controller.go:211] Error syncing machineset openshift-machine-api/rioliu-1221c-h72r5-worker-f: failed to fetch architecture type of machineset rioliu-1221c-h72r5-worker-f, err: could not find any machines linked to machineset, error: %!w(<nil>)
I1221 06:41:15.967131       1 machine_set_boot_image_controller.go:248] MachineSet rioliu-1221c-h72r5-worker-b updated, reconciling all machinesets
I1221 06:41:15.967146       1 machine_set_boot_image_controller.go:248] MachineSet rioliu-1221c-h72r5-worker-a updated, reconciling all machinesets
I1221 06:41:15.967197       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1221c-h72r5-worker-a on GCP, with arch x86_64
I1221 06:41:15.968450       1 machine_set_boot_image_controller.go:405] No patching required for machineset rioliu-1221c-h72r5-worker-a
I1221 06:41:15.968510       1 machine_set_boot_image_controller.go:560] Reconciling machineset rioliu-1221c-h72r5-worker-b on GCP, with arch x86_64
I1221 06:41:15.968863       1 machine_set_boot_image_controller.go:585] New target boot image: projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
I1221 06:41:15.968875       1 machine_set_boot_image_controller.go:586] Current image: projects/rhcos-cloud/global/images/rhcos-415-92-202310310037-0-gcp-x86-64
I1221 06:41:15.968914       1 machine_set_boot_image_controller.go:402] Patching machineset rioliu-1221c-h72r5-worker-c

Double-check that the machinesets are patched with the new boot image projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64:

machineset.machine.openshift.io/rioliu-1221c-h72r5-worker-a
projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
machineset.machine.openshift.io/rioliu-1221c-h72r5-worker-b
projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
machineset.machine.openshift.io/rioliu-1221c-h72r5-worker-c
projects/rhcos-cloud/global/images/rhcos-415-92-202311241643-0-gcp-x86-64
machineset.machine.openshift.io/rioliu-1221c-h72r5-worker-f -- no ready replica
projects/rhcos-cloud/global/images/rhcos-415-92-202310310037-0-gcp-x86-64
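
For reference, output like the above can be collected with a jsonpath query along these lines (a sketch that assumes the GCP providerSpec keeps the boot image at disks[0].image):

# List each machineset together with the boot image its providerSpec references
oc get machinesets -n openshift-machine-api -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{.spec.template.spec.providerSpec.value.disks[0].image}{"\n"}{end}'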

EDIT: used a new cluster for the following steps.
Scale up a machineset:

$ oc scale --replicas=2 machineset/rioliu-1221f-zs9zf-worker-a -n openshift-machine-api
machineset.machine.openshift.io/rioliu-1221f-zs9zf-worker-a scaled

Check the boot image info on the newly provisioned node:

$ oc logs -n openshift-machine-config-operator -c machine-config-daemon machine-config-daemon-lg8cm | grep -A6 'CoreOS aleph version'
I1221 12:08:08.708830    1504 coreos.go:53] CoreOS aleph version: mtime=2023-11-24 16:50:34.214 +0000 UTC
{
   "build": "415.92.202311241643-0",
   "imgid": "rhcos-415.92.202311241643-0-qemu.x86_64.qcow2",
   "ostree-commit": "3aff20eacec06af854303111319e74d9dc84c241af5c57dc8ae3330a8ae5b086",
   "ref": ""
}

@djoshy
Contributor Author

djoshy commented Dec 21, 2023

/test unit

@rioliu-rh

/hold cancel
/label qe-approved

@openshift-ci openshift-ci bot added qe-approved Signifies that QE has signed off on this PR and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Dec 21, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Dec 21, 2023

@djoshy: This pull request references MCO-679 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this: (see the PR description above)

@rioliu-rh

rioliu-rh commented Dec 22, 2023

It seems like Cluster-Bot does not support building arm64 images. Maybe we can only test this feature with an arm64 nightly post-merge.

Contributor

@yuqi-zhang yuqi-zhang left a comment

lgtm for an MVP. Should be safe behind the feature gate, and the remaining concerns are covered by the TODOs. Let's iterate on this.

Thanks for working on this David!

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 6, 2024
Contributor

openshift-ci bot commented Jan 6, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 854a866 and 2 for PR HEAD bf9193a in total

@djoshy
Contributor Author

djoshy commented Jan 8, 2024

/retest-required

Contributor

openshift-ci bot commented Jan 8, 2024

@djoshy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                    Commit   Required  Rerun command
ci/prow/e2e-gcp-op-layering  bf9193a  false     /test e2e-gcp-op-layering
ci/prow/e2e-openstack        bf9193a  false     /test e2e-openstack


@djoshy
Contributor Author

djoshy commented Jan 8, 2024

/override ci/prow/e2e-gcp-op-single-node

Overriding, as the failure is unrelated and this PR is behind a feature gate. The test is currently being investigated, as it has been broken for some time.

Contributor

openshift-ci bot commented Jan 8, 2024

@djoshy: Overrode contexts on behalf of djoshy: ci/prow/e2e-gcp-op-single-node

In response to this:

/override ci/prow/e2e-gcp-op-single-node

Overriding as failure is unrelated and this PR is under feature gate. The test is currently being investigated as it has been broken for some time

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-bot openshift-merge-bot bot merged commit 7dae64a into openshift:master Jan 8, 2024
13 of 15 checks passed
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build openshift-proxy-pull-test-container-v4.16.0-202401082351.p0.g7dae64a.assembly.stream for distgit openshift-proxy-pull-test.
All builds following this will include this PR.

@rioliu-rh

rioliu-rh commented Jan 9, 2024

@djoshy shall we backport this PR to the 4.15 branch? We already verified the case for arm64 with a 4.16 build. Thanks @sergiordlr

@djoshy
Contributor Author

djoshy commented Jan 9, 2024

@djoshy shall we backport this PR to the 4.15 branch? We already verified the case for arm64 with a 4.16 build. Thanks @sergiordlr

We don't typically backport features, and this feature is only targeted at 4.16 as of now, so I don't think we need to (:
