✨ Compute ControlPlane version for a managed topology #5059
[APPROVALNOTIFIER] This PR is NOT APPROVED
5a8000d to b19e4c5
/retest
Looks good - just a couple of questions
b19e4c5 to c8caf3d
Signed-off-by: Stefan Büringer [email protected]
c8caf3d to 81e57c6
Rebased on top of current main.
/retest
/area topology
	return currentTopologyVersion, nil
}

currentControlPlaneVersion, err := getControlPlaneVersion(current.controlPlane.object)
Suggested change:
-currentControlPlaneVersion, err := getControlPlaneVersion(current.controlPlane.object)
+version, err := getControlPlaneVersion(current.controlPlane.object)
I dropped the current prefix everywhere. Now I have:
- topologyVersion
- controlPlaneVersion
- controlPlaneStatusVersion
I think in this func the prefixes are useful to keep the logic as easily readable as possible.
If you prefer, the following seems like a good alternative to me:
- topologyVersion
- version
- statusVersion
I guess the missing prefix in combination with the func name should provide enough context to infer version is the controlPlaneVersion.
WDYT?
// ControlPlane already has the currentTopologyVersion.
if currentControlPlaneVersion == currentTopologyVersion {
	return currentTopologyVersion, nil
}
This return is the exact same as the last one, can we simplify and merge them?
I can do that. I thought it might be easier to read and understand if I spell out the different cases explicitly (especially as there will be a few more).
I combined those two and added a comment which contains both cases.
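To illustrate the point of this thread, here is a minimal sketch. It is not the PR's actual code: `explicitCases` and `mergedCases` are hypothetical helpers that take plain version strings instead of the topology state. It shows that the early return yields the same value as the final return, so both cases can share one return with a comment that names them.

```go
package main

import "fmt"

// explicitCases spells out both cases separately (hypothetical helper,
// mirroring the shape of the code under review).
func explicitCases(topologyVersion, controlPlaneVersion string) string {
	// ControlPlane already has the topology version.
	if controlPlaneVersion == topologyVersion {
		return topologyVersion
	}
	// ControlPlane has a different version: move to the topology version.
	return topologyVersion
}

// mergedCases combines both cases into a single return (hypothetical helper).
// The ControlPlane version does not influence the result, so it is ignored.
func mergedCases(topologyVersion, _ string) string {
	// Covers both cases:
	// * ControlPlane already has the topology version.
	// * ControlPlane has a different version and should move to it.
	return topologyVersion
}

func main() {
	for _, cp := range []string{"v1.21.0", "v1.20.0"} {
		same := explicitCases("v1.21.0", cp) == mergedCases("v1.21.0", cp)
		fmt.Printf("controlPlane=%s same=%t\n", cp, same)
	}
}
```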
		return nil, errors.Wrap(err, "failed to set spec.version in the ControlPlane object")
	}

	return controlPlane, nil
}

// computeControlPlaneVersion computes the ControlPlane version based on the current clusterTopologyState.
// TODO: we also have to handle the following cases:
// * ControlPlane.spec.version != ControlPlane.status.version, i.e. ControlPlane rollout is already in progress.
We should probably handle this now within this function, is there any reason we're holding?
Talked to @fabriziopandini and we thought it might be a good idea to keep it simple for now, as all kinds of things can go wrong when the ClusterClass version is changed again during a rollout (either when the ControlPlane is not yet rolled out completely or when at least a subset of MachineDeployments is not rolled out yet).
Only adding the case in l.156 should be relatively simple.
I added the case from l.156.
If we also want to add l.157 I would prefer waiting for the MachineDeployment compute and reconcile PRs to be merged first. Then I would also implement computeMachineDeploymentVersions in this PR. This should make it easier to reason about the overall cluster state during upgrades and all edge cases.
// Do not trigger another rollout, while a rollout is already in progress. A rollout is still in progress if:
// * .status.version is not set
// * .status.version is set but not equal to .spec.version
if !statusVersionSet || controlPlaneVersion != controlPlaneStatusVersion {
	return controlPlaneVersion, nil
}
This is obviously based on the assumption that we don't want to trigger another rollout while one is in progress.
I'm not sure if that's what we want.
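For reference, the behavior under discussion can be sketched in isolation. This is a simplified illustration, not the PR's implementation: `computeVersion` is a hypothetical function that works on plain strings, and it assumes an empty status version means `.status.version` is not set.

```go
package main

import "fmt"

// computeVersion is a hypothetical, simplified stand-in for the guard quoted
// above. Assumption: an empty statusVersion means .status.version is unset.
func computeVersion(topologyVersion, specVersion, statusVersion string) string {
	statusVersionSet := statusVersion != ""

	// Do not trigger another rollout while one is in progress: keep the
	// version that is already in .spec.version.
	if !statusVersionSet || specVersion != statusVersion {
		return specVersion
	}

	// No rollout in progress: the ControlPlane may move to the topology version.
	return topologyVersion
}

func main() {
	// .status.version not set yet: keep the spec version.
	fmt.Println(computeVersion("v1.21.0", "v1.20.0", ""))
	// .status.version lags behind .spec.version: keep the spec version.
	fmt.Println(computeVersion("v1.21.0", "v1.20.0", "v1.19.0"))
	// Spec and status agree: pick up the topology version.
	fmt.Println(computeVersion("v1.21.0", "v1.20.0", "v1.20.0"))
}
```

Under this rule, a topology version change made during a rollout is only picked up once spec and status versions agree again, which is the assumption questioned in the comment above.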
/hold for now. No further review required, will be reopened with a slightly different scope.
@sbueringer: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Superseded by #5178
Signed-off-by: Stefan Büringer [email protected]
What this PR does / why we need it:
I intentionally did not implement the more complicated cases when a previous rollout is not completely done and the next upgrade is triggered. The idea behind this is to get the base case stable and tested (including e2e) before we expand the implementation to more advanced use cases.
Notes:
- computeMachineDeployments code has been implemented in "Implement generate MachineDeployments and their referenced objects for a managed topology" #4999.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Partially implements #5016