
🏃Remove annotations on upgradeControlPlane #2887

Merged — 1 commit merged into kubernetes-sigs:master on Apr 16, 2020

Conversation


@benmoss benmoss commented Apr 9, 2020

What this PR does / why we need it:
Removes the use of annotations in the upgrade logic. We were duplicating the machine-selection logic in the scale-down code, and the cached state stored in the annotation can get out of sync with the actual cluster, causing bugs like #2430.

This logic depends on scaleUp / TargetClusterControlPlaneIsHealthy ensuring a 1:1 correspondence between nodes and machines, so we don't continue scaling up while new machines are still joining the cluster.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2702
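
For illustration, a minimal sketch of the 1:1 gate described above; the types and the helper name are stand-ins for this writeup, not the actual cluster-api code:

type NodeRef struct{ Name string }

type Machine struct {
	Name    string
	NodeRef *NodeRef
}

// controlPlaneHasOneNodePerMachine returns false while any machine lacks
// a matching node (i.e. it is still joining), which is what blocks
// scale-up from creating a second new machine before the first registers.
func controlPlaneHasOneNodePerMachine(machines []Machine, nodeNames []string) bool {
	nodes := make(map[string]bool, len(nodeNames))
	for _, n := range nodeNames {
		nodes[n] = true
	}
	for _, m := range machines {
		if m.NodeRef == nil || !nodes[m.NodeRef.Name] {
			return false
		}
	}
	return true
}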

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 9, 2020
@benmoss benmoss changed the title 🏃Remove annotations 🏃Remove annotations on upgradeControlPlane Apr 9, 2020
Contributor

@michaelgugino michaelgugino left a comment


The last line of this PR's description links to my PR rather than an issue, and it is going to close my PR when merged. Please update the first comment to point to the correct issue.

@detiber
Member

detiber commented Apr 9, 2020

The last line of this PR's description links to my PR rather than an issue, and it is going to close my PR when merged. Please update the first comment to point to the correct issue.

I updated the reference

@vincepri
Member

vincepri commented Apr 9, 2020

/assign @fabriziopandini @sedefsavas @yastij

for extra review 👀

@vincepri
Member

vincepri commented Apr 9, 2020

/milestone v0.3.4

@k8s-ci-robot k8s-ci-robot added this to the v0.3.4 milestone Apr 9, 2020
@benmoss
Author

benmoss commented Apr 9, 2020

Sorry about that, not sure how I messed that up

@benmoss benmoss force-pushed the remove-annotations branch from 7e9daef to 8b550ef on April 9, 2020 at 17:51
@benmoss
Author

benmoss commented Apr 9, 2020

/hold
I realize I'm not including all the logic that was in selectAndMarkMachine.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2020
@benmoss
Author

benmoss commented Apr 9, 2020

/hold cancel
Actually, never mind; I just can't remove that from scaleDownControlPlane 😌

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2020
Member

@vincepri vincepri left a comment


Chatted yesterday with @benmoss; this PR is only a first step in removing the annotations. We'll follow up with more changes, more specifically (a rough sketch of the resulting selection flow follows the list):

  • Add the concept of outdated machines in the filters.
  • Make scaleDownControlPlane take only the ownedMachines as input and decide on its own which Machine should be scaled down.
  • Picking a Machine for scale down will happen in selectAndMarkMachine (which should probably be renamed).
    • Modify the logic to first look at the outdated Machines. If there are any outdated Machines, use that list as the collection.
    • If there are no outdated Machines, pick the oldest one in the FailureDomain with the most machines.
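
A rough sketch of that proposed flow, assuming hypothetical helper names (selectMachineForScaleDown, oldestInFullestFailureDomain) rather than the final API:

import "time"

type Machine struct {
	Name          string
	FailureDomain string
	CreatedAt     time.Time
}

// Prefer outdated machines when any exist; otherwise consider all owned.
func selectMachineForScaleDown(owned, outdated []Machine) Machine {
	candidates := owned
	if len(outdated) > 0 {
		candidates = outdated
	}
	return oldestInFullestFailureDomain(candidates)
}

// Pick the oldest machine among those in the fullest failure domain.
// Assumes len(machines) > 0.
func oldestInFullestFailureDomain(machines []Machine) Machine {
	perFD := map[string]int{}
	for _, m := range machines {
		perFD[m.FailureDomain]++
	}
	best := machines[0]
	for _, m := range machines[1:] {
		if perFD[m.FailureDomain] > perFD[best.FailureDomain] ||
			(perFD[m.FailureDomain] == perFD[best.FailureDomain] && m.CreatedAt.Before(best.CreatedAt)) {
			best = m
		}
	}
	return best
}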

@benmoss
Author

benmoss commented Apr 10, 2020

Going to update this to remove all annotations; there's also a bug right now with selectedMachines / ownedMachines in scale down.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 10, 2020
@benmoss benmoss force-pushed the remove-annotations branch from c5d2e8e to 8ff8401 on April 15, 2020 at 13:27
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 15, 2020
@benmoss benmoss force-pushed the remove-annotations branch from 8ff8401 to a045cbc on April 15, 2020 at 13:29
Upgrade logic no longer uses machine annotations

Logic is now:
- Check number of nodes in workload cluster
- If node count <= replicas, create new upgrade machine
- If node count > replicas, scale down

Scale up logic ensures that we don't create additional machines if we
reconcile while waiting for an upgrade machine to appear in the node
list

Scale up should only consider machines needing upgrade

We never support upgrading only a subset of the cluster, but this ensures that we pick the FD with the most machines needing upgrade rather than just the FD with the most machines.

Also add a comment to explain why scale up will not cause more than 1
machine to be created

Scale down always scales down outdated machines first

This removes the need to pass through outdated machines
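
A compressed sketch of one reconcile pass under this commit message, with illustrative names (the real controller returns requeue results rather than strings):

// upgradeStep mirrors the flow above: node count vs. desired replicas
// decides between scale up and scale down, and scale down prefers
// outdated machines.
func upgradeStep(nodeCount, replicas, machinesNeedingUpgrade int) string {
	if nodeCount <= replicas {
		// Scale-up is separately gated on every machine having a NodeRef,
		// so we never create a second machine while one is still joining.
		return "scale up: create one machine at the new version"
	}
	if machinesNeedingUpgrade > 0 {
		return "scale down: delete an outdated machine"
	}
	return "scale down: delete the oldest machine in the fullest failure domain"
}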
@benmoss benmoss force-pushed the remove-annotations branch from a045cbc to dd02966 on April 15, 2020 at 13:33
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 15, 2020
@benmoss
Author

benmoss commented Apr 15, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 15, 2020
Member

@vincepri vincepri left a comment


/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benmoss, vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 16, 2020
Member

@fabriziopandini fabriziopandini left a comment


@benmoss looks great!
Only a few questions from my side to double-check some code steps.

return ctrl.Result{}, &capierrors.RequeueAfterError{RequeueAfter: healthCheckFailedRequeueAfter}

}
if err := workloadCluster.RemoveMachineFromKubeadmConfigMap(ctx, machineToDelete); err != nil {
Member


Just to double-check: if there is an error after removing the machine from the config map but before deleting it, machineToDelete will be the same when we re-enter this code, right?

Author


I don't think that can be guaranteed, since a user could encounter an error and then change their cluster such that the next reconciliation doesn't retry the scale down. At that point any number of changes can happen before the next scale down (like more scale up), so the partially removed machine may not be the same one that ends up being removed.
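
To make the re-entrancy concern concrete, a minimal sketch with stand-in types (not the actual reconciler): the selection is recomputed each pass, so the config-map removal has to be harmless to repeat.

import "context"

type Machine struct{ Name string }

type WorkloadCluster interface {
	RemoveMachineFromKubeadmConfigMap(ctx context.Context, m *Machine) error
}

func scaleDownOnce(ctx context.Context, w WorkloadCluster, machineToDelete *Machine,
	deleteMachine func(context.Context, *Machine) error) error {
	// Step 1: idempotent cleanup; removing an already-absent entry is a no-op.
	if err := w.RemoveMachineFromKubeadmConfigMap(ctx, machineToDelete); err != nil {
		return err
	}
	// Step 2: if this fails, the next reconcile re-selects a machine,
	// which may not be the one just removed from the config map.
	return deleteMachine(ctx, machineToDelete)
}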


func (r *KubeadmControlPlaneReconciler) selectAndMarkMachine(ctx context.Context, machines internal.FilterableMachineCollection, annotation string, controlPlane *internal.ControlPlane) (*clusterv1.Machine, error) {
selected, err := controlPlane.MachineInFailureDomainWithMostMachines(machines)
status, err := workloadCluster.ClusterStatus(ctx)
Member


I know this is not part of this PR, but ClusterStatus confused me. It is really the control plane status (workers are not considered)...

Author


No, I think this is absolutely true; I can't believe this didn't occur to me earlier.

Author


Oh, never mind! I thought your point was that I am looking at all nodes rather than just control plane nodes, and that this was a huge bug 🤦. But in fact your point is that this method covers only control plane nodes, even though its name suggests it should cover all nodes.
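
For context, a hedged sketch of what counting "only control plane nodes" looks like with client-go; kubeadm control plane nodes of this era carry the node-role.kubernetes.io/master label (the actual ClusterStatus implementation may differ):

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// countControlPlaneNodes counts only labeled control plane nodes,
// which is why workers never show up in this status.
func countControlPlaneNodes(ctx context.Context, c kubernetes.Interface) (int, error) {
	nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{
		LabelSelector: "node-role.kubernetes.io/master",
	})
	if err != nil {
		return 0, err
	}
	return len(nodes.Items), nil
}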

if needingUpgrade := controlPlane.MachinesNeedingUpgrade(); needingUpgrade.Len() > 0 {
machines = needingUpgrade
}
return controlPlane.MachineInFailureDomainWithMostMachines(machines)
Member


If I'm not wrong, in the upgrade case this considers only the machines to upgrade, not all the control-plane machines.
Is it possible this leads to an unexpected machine distribution? Probably this is compensated by scale-up always looking at all the machines for placement...

Author


I don't think it will change the machine distribution. We scale down in the FD with the most un-upgraded machines, but we scale back up into the smallest FD overall.

Here's a run-through of what I understand the algorithm to be: https://gist.github.com/benmoss/9914fb4e09e1e4fed4a651119e983298
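
That asymmetry in one place, as a sketch with hypothetical helpers: scale-down counts only un-upgraded machines per FD, while scale-up looks at all machines per FD.

// fdWithMost / fdWithFewest assume non-empty counts with non-"" FD names.
func fdWithMost(counts map[string]int) string {
	var best string
	for fd, n := range counts {
		if best == "" || n > counts[best] {
			best = fd
		}
	}
	return best
}

func fdWithFewest(counts map[string]int) string {
	var best string
	for fd, n := range counts {
		if best == "" || n < counts[best] {
			best = fd
		}
	}
	return best
}

// scaleDownFD := fdWithMost(unUpgradedPerFD)    // drain old machines fastest
// scaleUpFD   := fdWithFewest(allMachinesPerFD) // rebalance across all FDs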

}

return selected, nil
if status.Nodes <= *kcp.Spec.Replicas {
// scaleUp ensures that we don't continue scaling up while waiting for Machines to have NodeRefs
Member


Just for my knowledge, where is this logic? I can't find it...

Author

@benmoss benmoss Apr 16, 2020


@benmoss
Author

benmoss commented Apr 16, 2020

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 16, 2020
@benmoss
Author

benmoss commented Apr 16, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 16, 2020
@vincepri
Member

/test pull-cluster-api-apidiff

@k8s-ci-robot
Contributor

@benmoss: The following test failed, say /retest to rerun all failed tests:

Test name: pull-cluster-api-apidiff
Commit: dd02966
Rerun command: /test pull-cluster-api-apidiff

Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@vincepri
Member

API changes look good; it's all internal.

@fabriziopandini
Member

@benmoss thanks for your answers!
For #2887 (comment), we should consider having a separate reconciliation loop to ensure the list of endpoints gets back in sync, but that is out of scope for this PR.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 16, 2020
@k8s-ci-robot k8s-ci-robot merged commit 04d9db7 into kubernetes-sigs:master Apr 16, 2020
Labels
approved — Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes — Indicates the PR's author has signed the CNCF CLA.
lgtm — "Looks good to me", indicates that a PR is ready to be merged.
size/XL — Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KubeadmControlPlane controller shouldn't rely on annotations to operate
8 participants