🐛 [0.2] Exclude conversion-data annotation when MD reconciles #3010

Merged


vincepri
Member

@vincepri vincepri commented May 5, 2020

Signed-off-by: Vince Prignano [email protected]

What this PR does / why we need it:
This PR fixes an issue that was causing controllers to fight with each other when a management cluster had both v1alpha2 and v1alpha3 running.

The MachineDeployment controller has logic to sync annotations from a MachineDeployment to its linked MachineSets, as seen in:

annotationsUpdated := mdutil.SetNewMachineSetAnnotations(d, msCopy, newRevision, true, logger)

Despite its name, SetNewMachineSetAnnotations syncs annotations onto either a new MachineSet or an existing one. In particular, it calls

annotationChanged := copyDeploymentAnnotationsToMachineSet(deployment, newMS)

which copies the annotations and returns true if anything changed, in which case the MachineSet is ultimately patched.

All annotations are usually copied, except for those listed in:

var annotationsToSkip = map[string]bool{
	v1.LastAppliedConfigAnnotation:      true,
	clusterv1.RevisionAnnotation:        true,
	clusterv1.RevisionHistoryAnnotation: true,
	clusterv1.DesiredReplicasAnnotation: true,
	clusterv1.MaxReplicasAnnotation:     true,
}
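A minimal sketch of how a skip list like this can gate the copy, assuming plain string maps in place of the real MachineDeployment and MachineSet objects. The function name `copyAnnotations` and the `example.io/...` keys are hypothetical stand-ins, not the actual mdutil code or Cluster API constants.

```go
package main

import "fmt"

// Illustrative annotation keys (assumptions); the real constants live in
// the Cluster API packages.
const (
	revisionAnnotation       = "example.io/revision"
	conversionDataAnnotation = "example.io/conversion-data"
)

// annotationsToSkip lists keys that are never copied from the
// MachineDeployment to its MachineSets.
var annotationsToSkip = map[string]bool{
	revisionAnnotation:       true,
	conversionDataAnnotation: true,
}

// copyAnnotations copies every non-skipped annotation from src to dst and
// reports whether dst was modified.
func copyAnnotations(src, dst map[string]string) bool {
	changed := false
	for k, v := range src {
		if annotationsToSkip[k] {
			continue
		}
		if cur, ok := dst[k]; ok && cur == v {
			continue
		}
		dst[k] = v
		changed = true
	}
	return changed
}

func main() {
	md := map[string]string{"team": "infra", conversionDataAnnotation: "md-data"}
	ms := map[string]string{"team": "infra", conversionDataAnnotation: "ms-data"}

	// The only difference is the skipped key, so nothing is copied.
	fmt.Println(copyAnnotations(md, ms))      // false
	fmt.Println(ms[conversionDataAnnotation]) // ms-data
}
```

With the conversion-data key in the skip list, a MachineSet whose only difference is that annotation is reported as unchanged, so no patch would be issued.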

With the release of v1alpha3, we added the conversion-data annotation that is used internally to restore fields that would otherwise be lost during the conversion to an older version. This annotation caused the MD to always think that a change needs to be made:

  1. MachineDeployment sees that the MachineSet has the wrong conversion-data value (because it's set from the conversion webhook)
  2. MachineDeployment copies its own conversion-data (as shown above), increases the revision history with
    newMS.Annotations[clusterv1.RevisionHistoryAnnotation] = strings.Join(oldRevisions, ",")
    and issues a patch.
  3. Webhook conversion kicks in, but removes the conversion-data field
    delete(from.GetAnnotations(), DataAnnotation)
  4. Webhook saves the updated object in etcd.
  5. Webhook conversion kicks in and converts the object from v1alpha3 to v1alpha2, overwriting the conversion-data field.
  6. Go to step 1

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 5, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from chuckha and ncdc May 5, 2020 21:42
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 5, 2020
@vincepri
Member Author

vincepri commented May 5, 2020

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 5, 2020
@vincepri
Member Author

vincepri commented May 5, 2020

/milestone v0.2.x

@k8s-ci-robot k8s-ci-robot added this to the v0.2.x milestone May 5, 2020
@vincepri
Member Author

vincepri commented May 5, 2020

/assign @ncdc

@vincepri
Member Author

vincepri commented May 5, 2020

We'll need to forward-port this fix as well

@DheerajSShetty

Without this change, v1alpha2 MachineDeployments in a management cluster running capi-webhooks would fail to resize or update.
I used this fix to test creation and upgrade of a MachineDeployment, and it worked fine.

@ncdc
Contributor

ncdc commented May 6, 2020

@vincepri can you explain the details about the fighting? I also think a comment in the code about why we're including this annotation in the skip list would be helpful for future-us 😄

@vincepri
Member Author

vincepri commented May 6, 2020

@ncdc I updated the first post to explain my findings. I'll also add a comment to the code.

@vincepri vincepri force-pushed the bug-alpha2-annotatione branch from 1990ab5 to 688528e Compare May 6, 2020 15:31
@vincepri
Member Author

vincepri commented May 6, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 6, 2020
@ncdc
Contributor

ncdc commented May 6, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 6, 2020
@k8s-ci-robot k8s-ci-robot merged commit 2263f41 into kubernetes-sigs:release-0.2 May 6, 2020