Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix image registry reconcile loop #1794

Merged
merged 1 commit into from
Oct 13, 2022

Conversation

davidvossel
Copy link
Contributor

@davidvossel davidvossel commented Oct 7, 2022

When setting the image registry storage to EmptyDir{}, a reconcile loop is occurring for the KubeVirt and None platforms.

The issue is that the the imageregistry.spec.storage struct has a managementState field that now receives an default value, but our HCCO reconcile loop clobbers that default value over and over again.

Example. HCCO sets

spec:                   
  httpSecret: f840c08af1fe8d5a9e18467f6f304afd39a83b282efa201a6a76c7b836c0938cbfdcb25f039606b7170bb7ae42a793d03170f1ada8740a823732c45f2d366fdb
  logLevel: Normal
  managementState: Managed
  observedConfig: null
  operatorLogLevel: Normal  
  proxy: {}
  replicas: 1
  requests: 
    read:
      maxWaitInQueue: 0s
    write:
      maxWaitInQueue: 0s                   
  storage:
    emptyDir: {}      

Then the image registry gets the spec.storage.managementState: Managed field defaulted.

spec:                   
  httpSecret: f840c08af1fe8d5a9e18467f6f304afd39a83b282efa201a6a76c7b836c0938cbfdcb25f039606b7170bb7ae42a793d03170f1ada8740a823732c45f2d366fdb
  logLevel: Normal
  managementState: Managed
  observedConfig: null
  operatorLogLevel: Normal  
  proxy: {}
  replicas: 1
  requests: 
    read:
      maxWaitInQueue: 0s
    write:
      maxWaitInQueue: 0s                   
  storage:
    emptyDir: {}      
    managementState: Managed      

After that, the HCCO then clobbers that storage field again and removes the managementState default, then the default gets added back, then HCCO clobbers it, etc... reconcile loop.

To fix this, we should only cobber the storage field if the EmptyDir{} storage type isn't already in use.

@enxebre
Copy link
Member

enxebre commented Oct 10, 2022

Thanks! @davidvossel the specs above in the PR desc seems identical, am I missing anything?

@sjenning shouldn't loopDetector have captured this?

@@ -25,7 +25,7 @@ func ReconcileRegistryConfig(cfg *imageregistryv1.Config, platform hyperv1.Platf
if cfg.Spec.HTTPSecret == "" {
cfg.Spec.HTTPSecret = generateImageRegistrySecret()
}
if platform == hyperv1.KubevirtPlatform || platform == hyperv1.NonePlatform {
if platform == hyperv1.KubevirtPlatform || platform == hyperv1.NonePlatform && cfg.Spec.Storage.EmptyDir == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean

(platform == hyperv1.KubevirtPlatform || platform == hyperv1.NonePlatform) && cfg.Spec.Storage.EmptyDir == nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch. yes i did

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, fixed!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 10, 2022
@sjenning
Copy link
Contributor

@enxebre we only have loop detector on clients to the management KAS, not the guest KAS

@sjenning
Copy link
Contributor

actually I'm wrong, we do have loop detector on the HCCO guest cluster connection too

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 13, 2022
@davidvossel
Copy link
Contributor Author

Thanks! @davidvossel the specs above in the PR desc seems identical, am I missing anything?

nope, that was a copy paste error. I was trying to show that the managementState field was defaulted in the storage in the second one. The description has been updated now

@enxebre
Copy link
Member

enxebre commented Oct 13, 2022

Thanks David!
/approve
/lgtm
/hold

actually I'm wrong, we do have loop detector on the HCCO guest cluster connection too

For @sjenning to have the chance to have another a look

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 13, 2022
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 13, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidvossel, enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 13, 2022
@sjenning
Copy link
Contributor

lgtm
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 13, 2022
@davidvossel
Copy link
Contributor Author

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 13, 2022

@davidvossel: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/capi-provider-agent-sanity 8b42849 link false /test capi-provider-agent-sanity
ci/prow/e2e-ibmcloud-roks 8b42849 link false /test e2e-ibmcloud-roks
ci/prow/e2e-ibmcloud-iks 8b42849 link false /test e2e-ibmcloud-iks

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 32eb1e2 into openshift:main Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants