[release-1.4] ✨ add support for concurrent MD upgrades in classy clusters #8528

Merged
4 changes: 4 additions & 0 deletions api/v1beta1/common_types.go
@@ -53,6 +53,10 @@ const (
// will not be completed until the annotation is removed and all MachineDeployments are upgraded.
ClusterTopologyDeferUpgradeAnnotation = "topology.cluster.x-k8s.io/defer-upgrade"

// ClusterTopologyUpgradeConcurrencyAnnotation can be set as top-level annotation on the Cluster object of
// a classy Cluster to define the maximum concurrency while upgrading MachineDeployments.
ClusterTopologyUpgradeConcurrencyAnnotation = "topology.cluster.x-k8s.io/upgrade-concurrency"

// ClusterTopologyUnsafeUpdateClassNameAnnotation can be used to disable the webhook check on
// update that disallows a pre-existing Cluster to be populated with Topology information and Class.
ClusterTopologyUnsafeUpdateClassNameAnnotation = "unsafe.topology.cluster.x-k8s.io/disable-update-class-name-check"
1 change: 1 addition & 0 deletions docs/book/src/reference/labels_and_annotations.md
@@ -40,6 +40,7 @@
| topology.cluster.x-k8s.io/defer-upgrade | It can be used to defer the Kubernetes upgrade of a single MachineDeployment topology. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology is deferred. It doesn't affect other MachineDeployment topologies. |
| topology.cluster.x-k8s.io/dry-run | It is an annotation that gets set on objects by the topology controller only during a server side dry run apply operation. It is used for validating update webhooks for objects which get updated by template rotation (e.g. InfrastructureMachineTemplate). When the annotation is set and the admission request is a dry run, the webhook should deny validation due to immutability. By that the request will succeed (without any changes to the actual object because it is a dry run) and the topology controller will receive the resulting object. |
| topology.cluster.x-k8s.io/hold-upgrade-sequence | It can be used to hold the entire MachineDeployment upgrade sequence. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology and all subsequent ones is deferred. |
| topology.cluster.x-k8s.io/upgrade-concurrency | It can be used to configure the maximum concurrency while upgrading MachineDeployments of a classy Cluster. It is set as a top-level annotation on the Cluster object. The value should be >= 1. If unspecified, the upgrade concurrency defaults to 1. |
| machine.cluster.x-k8s.io/certificates-expiry | It captures the expiry date of the machine certificates in RFC3339 format. It is used to trigger rollout of control plane machines before certificates expire. It can be set on BootstrapConfig and Machine objects. The value set on Machine object takes precedence. The annotation is only used by control plane machines. |
| machine.cluster.x-k8s.io/exclude-node-draining | It explicitly skips node draining if set. |
| machine.cluster.x-k8s.io/exclude-wait-for-node-volume-detach | It explicitly skips the waiting for node volume detaching if set. |
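
The `upgrade-concurrency` row above states two rules: the value should be >= 1, and a missing annotation means a concurrency of 1. As a rough illustration only, those rules could be sketched as follows. `parseUpgradeConcurrency` is a hypothetical helper written for this note, not the function Cluster API uses, and rejecting invalid values with an error is an assumption about how they might be handled.

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation key as added by this PR (ClusterTopologyUpgradeConcurrencyAnnotation).
const upgradeConcurrencyAnnotation = "topology.cluster.x-k8s.io/upgrade-concurrency"

// parseUpgradeConcurrency is a hypothetical helper, not part of Cluster API.
// It returns 1 when the annotation is absent and rejects values below 1.
func parseUpgradeConcurrency(annotations map[string]string) (int, error) {
	raw, ok := annotations[upgradeConcurrencyAnnotation]
	if !ok {
		// Documented default: an unspecified annotation means concurrency 1.
		return 1, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("annotation %s: %q is not an integer: %w", upgradeConcurrencyAnnotation, raw, err)
	}
	if n < 1 {
		return 0, fmt.Errorf("annotation %s: value must be >= 1, got %d", upgradeConcurrencyAnnotation, n)
	}
	return n, nil
}

func main() {
	for _, annotations := range []map[string]string{
		nil, // annotation not set
		{upgradeConcurrencyAnnotation: "3"},
		{upgradeConcurrencyAnnotation: "0"},
	} {
		n, err := parseUpgradeConcurrency(annotations)
		fmt.Println(n, err)
	}
}
```

Reading from a nil map is safe in Go, so a Cluster without any annotations falls through to the default.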
8 changes: 0 additions & 8 deletions internal/controllers/topology/cluster/desired_state.go
@@ -841,14 +841,6 @@ func computeMachineDeploymentVersion(s *scope.Scope, machineDeploymentTopology c
return currentVersion, nil
}

// At this point the control plane is stable (not scaling, not upgrading, not being upgraded).
// Checking to see if the machine deployments are also stable.
// If any of the MachineDeployments is rolling out, do not upgrade the machine deployment yet.
if s.Current.MachineDeployments.IsAnyRollingOut() {
s.UpgradeTracker.MachineDeployments.MarkPendingUpgrade(currentMDState.Object.Name)
return currentVersion, nil
}

// Control plane and machine deployments are stable.
// Ready to pick up the topology version.
s.UpgradeTracker.MachineDeployments.MarkRollingOut(currentMDState.Object.Name)
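
The hunk above drops the blanket "any MachineDeployment is rolling out" gate; the replacement check is not visible in this hunk. Judging from the test cases later in this PR, the new behavior appears to be a limit on how many MachineDeployments may roll out at once. The following is a minimal, self-contained sketch of that idea. The names `upgradeTracker`, `concurrencyReached`, and `pickVersion` are hypothetical and only loosely mirror the real `scope.UpgradeTracker`; treating a limit below 1 as 1 is also an assumption.

```go
package main

import "fmt"

// upgradeTracker is a hypothetical stand-in for the tracker used in the PR.
type upgradeTracker struct {
	maxConcurrency int
	rollingOut     map[string]struct{}
	pending        map[string]struct{}
}

// newUpgradeTracker treats any limit below 1 as 1 (an assumption).
func newUpgradeTracker(maxConcurrency int) *upgradeTracker {
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	return &upgradeTracker{
		maxConcurrency: maxConcurrency,
		rollingOut:     map[string]struct{}{},
		pending:        map[string]struct{}{},
	}
}

// markRollingOut records MachineDeployments that are currently rolling out.
func (t *upgradeTracker) markRollingOut(names ...string) {
	for _, name := range names {
		t.rollingOut[name] = struct{}{}
	}
}

// concurrencyReached reports whether no further rollout may start.
func (t *upgradeTracker) concurrencyReached() bool {
	return len(t.rollingOut) >= t.maxConcurrency
}

// pickVersion keeps the current version while the limit is reached and
// otherwise picks up the topology version and counts the new rollout.
func pickVersion(t *upgradeTracker, mdName, currentVersion, topologyVersion string) string {
	if t.concurrencyReached() {
		t.pending[mdName] = struct{}{}
		return currentVersion
	}
	t.markRollingOut(mdName)
	return topologyVersion
}

func main() {
	t := newUpgradeTracker(2)
	t.markRollingOut("md2") // one MachineDeployment is already rolling out

	fmt.Println(pickVersion(t, "md-current", "v1.2.2", "v1.2.3")) // v1.2.3: limit not reached
	fmt.Println(pickVersion(t, "md-next", "v1.2.2", "v1.2.3"))    // v1.2.2: limit reached
}
```

With a limit of 1 and one rollout in progress, `pickVersion` always returns the current version, which matches the behavior of the removed check.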
142 changes: 93 additions & 49 deletions internal/controllers/topology/cluster/desired_state_test.go
@@ -1635,23 +1635,25 @@ func TestComputeMachineDeployment(t *testing.T) {
}).
Build()

machineDeploymentStable := builder.MachineDeployment("test-namespace", "md-1").
machineDeploymentStable := builder.MachineDeployment("test-namespace", "md-stable").
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 2,
ReadyReplicas: 2,
UpdatedReplicas: 2,
AvailableReplicas: 2,
}).
Build()

machineDeploymentRollingOut := builder.MachineDeployment("test-namespace", "md-1").
machineDeploymentRollingOut := builder.MachineDeployment("test-namespace", "md-rolling").
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 1,
ReadyReplicas: 1,
UpdatedReplicas: 1,
AvailableReplicas: 1,
}).
@@ -1669,28 +1671,46 @@ func TestComputeMachineDeployment(t *testing.T) {
name string
machineDeploymentsState scope.MachineDeploymentsStateMap
currentMDVersion *string
upgradeConcurrency string
topologyVersion string
expectedVersion string
}{
{
name: "use cluster.spec.topology.version if creating a new machine deployment",
machineDeploymentsState: nil,
upgradeConcurrency: "1",
currentMDVersion: nil,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.3",
},
{
name: "use machine deployment's spec.template.spec.version if one of the machine deployments is rolling out",
name: "use machine deployment's spec.template.spec.version if one of the machine deployments is rolling out, concurrency limit reached",
machineDeploymentsState: machineDeploymentsStateRollingOut,
upgradeConcurrency: "1",
currentMDVersion: pointer.String("v1.2.2"),
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.2",
},
{
name: "use cluster.spec.topology.version if one of the machine deployments is rolling out, concurrency limit not reached",
machineDeploymentsState: machineDeploymentsStateRollingOut,
upgradeConcurrency: "2",
currentMDVersion: pointer.String("v1.2.2"),
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.3",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
g := NewWithT(t)
s := scope.New(cluster)

testCluster := cluster.DeepCopy()
if testCluster.Annotations == nil {
testCluster.Annotations = map[string]string{}
}
testCluster.Annotations[clusterv1.ClusterTopologyUpgradeConcurrencyAnnotation] = tt.upgradeConcurrency

s := scope.New(testCluster)
s.Blueprint = blueprint
s.Blueprint.Topology.Version = tt.topologyVersion
s.Blueprint.Topology.ControlPlane = clusterv1.ControlPlaneTopology{
@@ -1709,6 +1729,7 @@ func TestComputeMachineDeployment(t *testing.T) {
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 2,
ReadyReplicas: 2,
UpdatedReplicas: 2,
AvailableReplicas: 2,
}).
@@ -1722,6 +1743,7 @@ func TestComputeMachineDeployment(t *testing.T) {
s.Current.ControlPlane = &scope.ControlPlaneState{
Object: controlPlaneStable123,
}
s.UpgradeTracker.MachineDeployments.MarkRollingOut(s.Current.MachineDeployments.RollingOut()...)
desiredControlPlaneState := &scope.ControlPlaneState{
Object: controlPlaneStable123,
}
@@ -1828,45 +1850,55 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
//
// A machine deployment is considered upgrading if any of the above conditions
// is false.
machineDeploymentStable := builder.MachineDeployment("test-namespace", "md-1").
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 2,
UpdatedReplicas: 2,
AvailableReplicas: 2,
ReadyReplicas: 2,
UnavailableReplicas: 0,
}).
Build()
machineDeploymentRollingOut := builder.MachineDeployment("test-namespace", "md-2").
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 1,
UpdatedReplicas: 1,
AvailableReplicas: 1,
ReadyReplicas: 1,
UnavailableReplicas: 1,
}).
Build()
stableMachineDeployment := func(ns, name string) *clusterv1.MachineDeployment {
return builder.MachineDeployment(ns, name).
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 2,
UpdatedReplicas: 2,
AvailableReplicas: 2,
ReadyReplicas: 2,
UnavailableReplicas: 0,
}).
Build()
}

machineDeploymentsStateStable := scope.MachineDeploymentsStateMap{
"md1": &scope.MachineDeploymentState{Object: machineDeploymentStable},
"md2": &scope.MachineDeploymentState{Object: machineDeploymentStable},
rollingMachineDeployment := func(ns, name string) *clusterv1.MachineDeployment {
return builder.MachineDeployment(ns, name).
WithGeneration(1).
WithReplicas(2).
WithStatus(clusterv1.MachineDeploymentStatus{
ObservedGeneration: 2,
Replicas: 1,
UpdatedReplicas: 1,
AvailableReplicas: 1,
ReadyReplicas: 1,
UnavailableReplicas: 1,
}).
Build()
}

twoMachineDeploymentsStateStable := scope.MachineDeploymentsStateMap{
"md1": &scope.MachineDeploymentState{Object: stableMachineDeployment("test1", "md1")},
"md2": &scope.MachineDeploymentState{Object: stableMachineDeployment("test1", "md2")},
}
oneStableOneRollingMachineDeploymentState := scope.MachineDeploymentsStateMap{
"md1": &scope.MachineDeploymentState{Object: stableMachineDeployment("test1", "md1")},
"md2": &scope.MachineDeploymentState{Object: rollingMachineDeployment("test1", "md2")},
}
machineDeploymentsStateRollingOut := scope.MachineDeploymentsStateMap{
"md1": &scope.MachineDeploymentState{Object: machineDeploymentStable},
"md2": &scope.MachineDeploymentState{Object: machineDeploymentRollingOut},
twoRollingMachineDeploymentState := scope.MachineDeploymentsStateMap{
"md1": &scope.MachineDeploymentState{Object: rollingMachineDeployment("test1", "md1")},
"md2": &scope.MachineDeploymentState{Object: rollingMachineDeployment("test1", "md2")},
}

tests := []struct {
name string
machineDeploymentTopology clusterv1.MachineDeploymentTopology
currentMachineDeploymentState *scope.MachineDeploymentState
machineDeploymentsStateMap scope.MachineDeploymentsStateMap
upgradeConcurrency int
currentControlPlane *unstructured.Unstructured
desiredControlPlane *unstructured.Unstructured
topologyVersion string
@@ -1889,16 +1921,7 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
},
},
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateStable,
currentControlPlane: controlPlaneStable123,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.2",
},
{
name: "should return machine deployment's spec.template.spec.version if any one of the machine deployments is rolling out",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateRollingOut,
machineDeploymentsStateMap: twoMachineDeploymentsStateStable,
currentControlPlane: controlPlaneStable123,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
@@ -1908,7 +1931,7 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
// Control plane is considered upgrading if the control plane's spec.version and status.version is not equal.
name: "should return machine deployment's spec.template.spec.version if control plane is upgrading",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateStable,
machineDeploymentsStateMap: twoMachineDeploymentsStateStable,
currentControlPlane: controlPlaneUpgrading,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.2",
@@ -1917,7 +1940,7 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
// Control plane is considered ready to upgrade if spec.version of current and desired control planes are not equal.
name: "should return machine deployment's spec.template.spec.version if control plane is ready to upgrade",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateStable,
machineDeploymentsStateMap: twoMachineDeploymentsStateStable,
currentControlPlane: controlPlaneStable122,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
@@ -1927,20 +1950,40 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
// Control plane is considered scaling if its spec.replicas is not equal to any of status.replicas, status.readyReplicas or status.updatedReplicas.
name: "should return machine deployment's spec.template.spec.version if control plane is scaling",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateStable,
machineDeploymentsStateMap: twoMachineDeploymentsStateStable,
currentControlPlane: controlPlaneScaling,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.2",
},
{
name: "should return cluster.spec.topology.version if the control plane is not upgrading, not scaling, not ready to upgrade and none of the machine deployments are rolling out",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: machineDeploymentsStateStable,
machineDeploymentsStateMap: twoMachineDeploymentsStateStable,
currentControlPlane: controlPlaneStable123,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.3",
},
{
name: "should return cluster.spec.topology.version if control plane is stable, other machine deployments are rolling out, concurrency limit not reached",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: oneStableOneRollingMachineDeploymentState,
upgradeConcurrency: 2,
currentControlPlane: controlPlaneStable123,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.3",
},
{
name: "should return machine deployment's spec.template.spec.version if control plane is stable, other machine deployments are rolling out, concurrency limit reached",
currentMachineDeploymentState: &scope.MachineDeploymentState{Object: builder.MachineDeployment("test1", "md-current").WithVersion("v1.2.2").Build()},
machineDeploymentsStateMap: twoRollingMachineDeploymentState,
upgradeConcurrency: 2,
currentControlPlane: controlPlaneStable123,
desiredControlPlane: controlPlaneDesired,
topologyVersion: "v1.2.3",
expectedVersion: "v1.2.2",
},
}

for _, tt := range tests {
@@ -1959,9 +2002,10 @@ func TestComputeMachineDeploymentVersion(t *testing.T) {
ControlPlane: &scope.ControlPlaneState{Object: tt.currentControlPlane},
MachineDeployments: tt.machineDeploymentsStateMap,
},
UpgradeTracker: scope.NewUpgradeTracker(),
UpgradeTracker: scope.NewUpgradeTracker(scope.MaxMDUpgradeConcurrency(tt.upgradeConcurrency)),
}
desiredControlPlaneState := &scope.ControlPlaneState{Object: tt.desiredControlPlane}
s.UpgradeTracker.MachineDeployments.MarkRollingOut(s.Current.MachineDeployments.RollingOut()...)
version, err := computeMachineDeploymentVersion(s, tt.machineDeploymentTopology, desiredControlPlaneState, tt.currentMachineDeploymentState)
g.Expect(err).NotTo(HaveOccurred())
g.Expect(version).To(Equal(tt.expectedVersion))