ClusterClass rebase should check variable compatibility #8154

Open
killianmuldoon opened this issue Feb 22, 2023 · 5 comments
Labels
area/clusterclass: Issues or PRs related to clusterclass
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/feature: Categorizes issue or PR as related to a new feature.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@killianmuldoon
Contributor

PR #8153 removes the variable compatibility checks for Clusters using a ClusterClass when that class is being updated in place. This change is a consequence of external variables.

During a ClusterClass rebase (moving a Cluster from one ClusterClass to another), similar variable checks could be performed, allowing users to safely move from one set of variables to another while changing ClusterClass. Currently, those variables are not validated during a rebase.

This validation should be added so that a rebase gets webhook-based validation of variable changes.

/area topology
/kind feature

@k8s-ci-robot k8s-ci-robot added area/topology kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 22, 2023
@sbueringer
Member

sbueringer commented Feb 23, 2023

I think we're very close to having that already.

Let's assume we have a Cluster update call which changes the ClusterClass.

Looking at this code:

	clusterClass, clusterClassPollErr := webhook.pollClusterClassForCluster(ctx, newCluster)
	if clusterClassPollErr != nil &&
		// If the error is anything other than "NotFound" or "NotReconciled" return all errors at this point.
		!(apierrors.IsNotFound(clusterClassPollErr) || errors.Is(clusterClassPollErr, errClusterClassNotReconciled)) {
		allErrs = append(
			allErrs, field.InternalError(
				fldPath.Child("class"),
				clusterClassPollErr))
		return allErrs
	}
	if clusterClassPollErr == nil {
		// If there's no error validate the Cluster based on the ClusterClass.
		allErrs = append(allErrs, ValidateClusterForClusterClass(newCluster, clusterClass, fldPath)...)
	}
	if oldCluster != nil { // On update
		// The ClusterClass must exist to proceed with update validation. Return an error if the ClusterClass was
		// not found.
		if apierrors.IsNotFound(clusterClassPollErr) {
			allErrs = append(
				allErrs, field.InternalError(
					fldPath.Child("class"),
					clusterClassPollErr))
			return allErrs
		}

As soon as oldCluster != nil, we currently don't accept NotFound errors. If we now change the code to

	if oldCluster != nil { // On update
		if clusterClassPollErr != nil {
			allErrs = append(
				allErrs, field.InternalError(
					fldPath.Child("class"),
					clusterClassPollErr))
			return allErrs
		}

we can ensure that either ValidateClusterForClusterClass runs in the update case, or we return an error because the new ClusterClass is either not found or not reconciled.
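To make the resulting control flow easier to follow, here is a minimal, self-contained sketch of it. It is not the webhook code: `errNotFound` and `errNotReconciled` are hypothetical stand-ins for `apierrors.IsNotFound` and `errClusterClassNotReconciled`, errors are plain strings instead of `field.ErrorList`, and `validate` stands in for `ValidateClusterForClusterClass`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the NotFound and NotReconciled conditions
// that the real webhook checks; they are not Cluster API identifiers.
var (
	errNotFound      = errors.New("ClusterClass not found")
	errNotReconciled = errors.New("ClusterClass not reconciled")
)

// validateClass mirrors the proposed flow:
//   - any poll error other than NotFound/NotReconciled is returned immediately;
//   - with no poll error, the Cluster is validated against the ClusterClass;
//   - on update, any remaining poll error blocks the request.
func validateClass(isUpdate bool, pollErr error, validate func() []string) []string {
	var allErrs []string

	tolerated := errors.Is(pollErr, errNotFound) || errors.Is(pollErr, errNotReconciled)
	if pollErr != nil && !tolerated {
		return append(allErrs, fmt.Sprintf("internal error: %v", pollErr))
	}
	if pollErr == nil {
		allErrs = append(allErrs, validate()...)
	}
	if isUpdate && pollErr != nil {
		// Proposed change: reject NotReconciled as well as NotFound on update.
		return append(allErrs, fmt.Sprintf("internal error: %v", pollErr))
	}
	return allErrs
}
```

In this sketch a create request with a missing ClusterClass still passes, while an update either runs `validate` or fails, which is the property described above.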

I think ValidateClusterForClusterClass already does all that we need: it ensures that the variables in the new Cluster are valid according to the new ClusterClass. I don't think we have to do any additional validation with the old ClusterClass.
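As a rough illustration of what "valid according to the new ClusterClass" could mean, the sketch below checks a Cluster's variable values against a set of definitions. The `variableDef` type, the `validateVariables` function, and the three supported type names are assumptions made for this example; this is not how ValidateClusterForClusterClass is implemented.

```go
package main

import (
	"fmt"
	"sort"
)

// variableDef is a hypothetical, heavily simplified variable definition.
type variableDef struct {
	Required bool
	Type     string // "string", "integer" or "boolean" in this sketch
}

// validateVariables reports required variables that are missing, values for
// variables the ClusterClass does not define, and values of the wrong type.
func validateVariables(values map[string]interface{}, defs map[string]variableDef) []string {
	var errs []string
	for name, def := range defs {
		if _, set := values[name]; !set && def.Required {
			errs = append(errs, fmt.Sprintf("variable %q is required", name))
		}
	}
	for name, v := range values {
		def, defined := defs[name]
		if !defined {
			errs = append(errs, fmt.Sprintf("variable %q is not defined in the ClusterClass", name))
			continue
		}
		if !matchesType(v, def.Type) {
			errs = append(errs, fmt.Sprintf("variable %q must be of type %s", name, def.Type))
		}
	}
	sort.Strings(errs) // map iteration order is random; sort for stable output
	return errs
}

// matchesType checks a value against one of the sketch's type names.
func matchesType(v interface{}, typ string) bool {
	switch typ {
	case "string":
		_, ok := v.(string)
		return ok
	case "integer":
		_, ok := v.(int)
		return ok
	case "boolean":
		_, ok := v.(bool)
		return ok
	}
	return false
}
```

Only the new ClusterClass's definitions are consulted here, which matches the point that no additional validation against the old ClusterClass is needed.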

I think this is also in line with our idea that, for a rebase, both the old and the new ClusterClass have to exist and be successfully reconciled.

@fabriziopandini
Member

/triage accepted

I think we should have strict validation for the rebase operation and block it if the ClusterClass is not ready or there are other issues. We should also check whether we are applying defaulting, so that new variables with defaults are added (this can also be a separate PR).
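The defaulting step mentioned here could look roughly like the following sketch: variables that the new ClusterClass defines with a default, and that the Cluster does not set, are added before validation. `defaultableVar` and `applyDefaults` are hypothetical names for illustration only, not part of Cluster API.

```go
package main

// defaultableVar is a hypothetical variable definition that may carry a default.
type defaultableVar struct {
	HasDefault bool
	Default    interface{}
}

// applyDefaults returns a copy of values in which every variable that has a
// default in defs and is not set on the Cluster receives that default.
// Values the user already set are never overwritten.
func applyDefaults(values map[string]interface{}, defs map[string]defaultableVar) map[string]interface{} {
	out := make(map[string]interface{}, len(values))
	for name, v := range values {
		out[name] = v
	}
	for name, def := range defs {
		if _, set := out[name]; !set && def.HasDefault {
			out[name] = def.Default
		}
	}
	return out
}
```

Returning a copy keeps the incoming object untouched, so the caller can compare the defaulted and original values if needed.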

The only issue that I see is when a ClusterClass introduces breaking changes to variables, because in that case a rebase won't be possible. But I think this is a bigger discussion, and sooner or later we have to start thinking about some sort of support for variable conversion during rebase.
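The thread does not define what counts as a breaking change. As one possible reading, the sketch below flags variables that were removed, variables whose type changed, and newly required variables without a default when comparing two sets of definitions. `varSchema` and `breakingChanges` are hypothetical and only illustrate the kind of comparison a future conversion mechanism might start from.

```go
package main

import (
	"fmt"
	"sort"
)

// varSchema is a hypothetical, simplified variable definition.
type varSchema struct {
	Type       string
	Required   bool
	HasDefault bool
}

// breakingChanges lists differences between the old and new ClusterClass
// variables that would make existing Cluster values invalid after a rebase.
func breakingChanges(oldVars, newVars map[string]varSchema) []string {
	var out []string
	for name, o := range oldVars {
		n, ok := newVars[name]
		if !ok {
			out = append(out, fmt.Sprintf("%s: removed", name))
			continue
		}
		if n.Type != o.Type {
			out = append(out, fmt.Sprintf("%s: type changed from %s to %s", name, o.Type, n.Type))
		}
	}
	for name, n := range newVars {
		if _, existed := oldVars[name]; !existed && n.Required && !n.HasDefault {
			out = append(out, fmt.Sprintf("%s: new required variable without default", name))
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}
```

A new required variable that has a default is not reported, since defaulting could fill it in during the rebase.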

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 23, 2023
@killianmuldoon killianmuldoon added the area/clusterclass Issues or PRs related to clusterclass label May 4, 2023
@killianmuldoon
Contributor Author

/help

@k8s-ci-robot
Contributor

@killianmuldoon:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Nov 7, 2023
@fabriziopandini
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 11, 2024

4 participants