
✨ Adds AvailableCondition and ReadyCondition Conditions to MachineDeployment #4625

Merged · 1 commit · Jul 15, 2021

Conversation

Arvinderpal
Contributor

What this PR does / why we need it:
Following the model of KCP and Machine Conditions, this PR adds an AvailableCondition to MachineDeployment. The condition is true when Nodes of the underlying MachineSet(s) are available to take on workloads. It also adds the summary ReadyCondition.

Tracking issue: #3486

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 15, 2021
@k8s-ci-robot
Contributor

Hi @Arvinderpal. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 15, 2021
return nil
}

// FIXME(awander): This func is removed after running `make generate`. I assume it's because there is no Conditions field in v1alpha3's MachineDeploymentStatus. What is the correct way to handle this?
Contributor Author


Can someone help me here. What is the correct way to handle this?

Member


If I got this right, you should ensure round-trip conversion by marshalling/unmarshalling the data to an annotation, like we already do for spec.Strategy in:

func (src *MachineDeployment) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1alpha4.MachineDeployment)
	if err := Convert_v1alpha3_MachineDeployment_To_v1alpha4_MachineDeployment(src, dst, nil); err != nil {
		return err
	}

	// Manually restore data.
	restored := &v1alpha4.MachineDeployment{}
	if ok, err := utilconversion.UnmarshalData(src, restored); err != nil || !ok {
		return err
	}

	if restored.Spec.Strategy != nil && restored.Spec.Strategy.RollingUpdate != nil {
		if dst.Spec.Strategy == nil {
			dst.Spec.Strategy = &v1alpha4.MachineDeploymentStrategy{}
		}
		if dst.Spec.Strategy.RollingUpdate == nil {
			dst.Spec.Strategy.RollingUpdate = &v1alpha4.MachineRollingUpdateDeployment{}
		}
		dst.Spec.Strategy.RollingUpdate.DeletePolicy = restored.Spec.Strategy.RollingUpdate.DeletePolicy
	}

	return nil
}

func (dst *MachineDeployment) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*v1alpha4.MachineDeployment)
	if err := Convert_v1alpha4_MachineDeployment_To_v1alpha3_MachineDeployment(src, dst, nil); err != nil {
		return err
	}

	// Preserve Hub data on down-conversion except for metadata.
	if err := utilconversion.MarshalData(src, dst); err != nil {
		return err
	}

	return nil
}
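Following that pattern, the Conditions added by this PR can be restored the same way on up-conversion. Below is a minimal, self-contained sketch with simplified stand-in types (the real change edits ConvertTo in the v1alpha3 conversion code, and the real Condition and MachineDeployment types live in the cluster-api packages):

```go
package main

import "fmt"

// Simplified stand-ins for the real Cluster API types.
type Condition struct {
	Type   string
	Status string
}

type MachineDeploymentStatus struct {
	Conditions []Condition
}

// restoreConditions mirrors the suggested fix: since v1alpha3 has no
// Conditions field, the generated conversion drops them, so after
// UnmarshalData recovers the preserved hub object, copy its Conditions
// back onto the destination.
func restoreConditions(dst, restored *MachineDeploymentStatus) {
	dst.Conditions = restored.Conditions
}

func main() {
	restored := &MachineDeploymentStatus{
		Conditions: []Condition{{Type: "Available", Status: "True"}},
	}
	dst := &MachineDeploymentStatus{}
	restoreConditions(dst, restored)
	fmt.Println(len(dst.Conditions), dst.Conditions[0].Type)
}
```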

Contributor Author


Ah okay. Made the change. PTAL.

@@ -349,6 +350,14 @@ func (r *MachineDeploymentReconciler) scale(ctx context.Context, deployment *clu
// syncDeploymentStatus checks if the status is up-to-date and sync it if necessary.
func (r *MachineDeploymentReconciler) syncDeploymentStatus(allMSs []*clusterv1.MachineSet, newMS *clusterv1.MachineSet, d *clusterv1.MachineDeployment) error {
d.Status = calculateStatus(allMSs, newMS, d)

Member


Maybe add a comment here noting that minReplicasNeeded will equal d.Spec.Replicas when the strategy is not RollingUpdateMachineDeploymentStrategyType.

Contributor Author


So the only other strategy I see implemented today is OnDeleteMachineDeploymentStrategyType. Would the comment become stale if we implement other strategies?

Member


	// Rolling update config params. Present only if
	// MachineDeploymentStrategyType = RollingUpdate.
	// +optional
	RollingUpdate *MachineRollingUpdateDeployment `json:"rollingUpdate,omitempty"`

As of today, maxUnavailable and maxSurge are only applicable when MachineDeploymentStrategyType = RollingUpdate.

I think reinforcing this would make it easier for new folks to ramp up and avoid confusion.
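The behavior being documented can be sketched as follows (illustrative, simplified logic; the real computation lives in the MachineDeployment controller and works from the resolved maxUnavailable value):

```go
package main

import "fmt"

type MachineDeploymentStrategyType string

const (
	RollingUpdateMachineDeploymentStrategyType MachineDeploymentStrategyType = "RollingUpdate"
	OnDeleteMachineDeploymentStrategyType      MachineDeploymentStrategyType = "OnDelete"
)

// minReplicasNeeded sketches the rule discussed above: only the
// RollingUpdate strategy tolerates maxUnavailable machines, so for any
// other strategy the availability floor is simply spec.Replicas.
func minReplicasNeeded(strategy MachineDeploymentStrategyType, replicas, maxUnavailable int32) int32 {
	if strategy == RollingUpdateMachineDeploymentStrategyType {
		return replicas - maxUnavailable
	}
	return replicas
}

func main() {
	fmt.Println(minReplicasNeeded(RollingUpdateMachineDeploymentStrategyType, 5, 1)) // 4
	fmt.Println(minReplicasNeeded(OnDeleteMachineDeploymentStrategyType, 5, 1))      // 5
}
```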

Contributor Author


Ok. Added the comment.

// machines required are up and running for at least minReadySeconds.
AvailableCondition ConditionType = "Available"

// MinimumMachinesAvailableReason reflects the minimum availability of machines for a machinedeployment.
Member

@enxebre enxebre May 17, 2021


I'm having a hard time reading this. "reflects the minimum availability of machines for a machinedeployment" doesn't seem right to me, since this constant is just meant to be the reason for Available=False. Also, shouldn't it be NotMinimumMachinesAvailable, as that's the reason for Available=False?

Contributor Author


Changed it to NotMinimumMachinesAvailableReason and updated the comment.
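The resulting constants look roughly like this (a sketch only; the exact names and doc comments live in api/v1alpha4/condition_consts.go, and the reason string value below is a placeholder, not the one merged):

```go
package main

import "fmt"

type ConditionType string

const (
	// AvailableCondition reports whether the minimum number of machines
	// required are up and running for at least minReadySeconds.
	AvailableCondition ConditionType = "Available"

	// NotMinimumMachinesAvailableReason is the reason used on
	// Available=False when the MachineDeployment does not have the
	// minimum number of available machines. (Placeholder value.)
	NotMinimumMachinesAvailableReason = "NotMinimumMachinesAvailable"
)

func main() {
	fmt.Println(AvailableCondition, NotMinimumMachinesAvailableReason)
}
```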

@enxebre
Member

enxebre commented May 17, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 17, 2021
Contributor Author

@Arvinderpal Arvinderpal left a comment


@enxebre Thank you for the quick feedback. I will work on adding a unit test for the negative case.

Contributor Author

@Arvinderpal Arvinderpal left a comment


@fabriziopandini I have addressed all the comments. PTAL

@Arvinderpal Arvinderpal force-pushed the md-avail-condition branch from 6d16a47 to bbbdcf4 Compare May 19, 2021 18:06
@Arvinderpal
Contributor Author

/retest

@Arvinderpal Arvinderpal force-pushed the md-avail-condition branch from bbbdcf4 to 5019ec0 Compare May 19, 2021 20:29
@Arvinderpal
Contributor Author

Can anyone help me with why pull-cluster-api-test-main is failing with:

FAIL	sigs.k8s.io/cluster-api/controllers [build failed]

Is it possibly related to this line in build logs? I did add a new test at line 393, but I don't quite understand why that would be the issue:

controllers/machinedeployment_controller_test.go:393:3: undefined: By

@fabriziopandini @enxebre

@enxebre
Member

enxebre commented May 20, 2021

Is it possibly related to this line in build logs? I did add a new test at line 393, but I don't quite understand why that would be the issue:
controllers/machinedeployment_controller_test.go:393:3: undefined: By

@Arvinderpal We are in the process of dropping ginkgo from the unit/integration tests. If you get the latest changes (i.e. git fetch this repo and git rebase your branch on top), you'll see ginkgo is no longer imported in that test package, and thus By is undefined.
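For anyone hitting the same error, the mechanical fix after rebasing is to replace ginkgo step markers with plain test logging. A hedged sketch (`by` is a hypothetical helper used here only for illustration; in a real *testing.T test you would call t.Log directly):

```go
package main

import "fmt"

// by stands in for ginkgo's By(): once ginkgo is dropped, a step marker
// is just a log line. In a *testing.T test this would be:
//   t.Log("STEP: " + step)
func by(step string) string {
	return "STEP: " + step
}

func main() {
	fmt.Println(by("Creating the MachineDeployment"))
}
```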

@Arvinderpal Arvinderpal force-pushed the md-avail-condition branch from 5019ec0 to b00bf38 Compare May 20, 2021 14:39
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 27, 2021
@Arvinderpal Arvinderpal force-pushed the md-avail-condition branch from 57eefbd to 53edcbc Compare May 27, 2021 15:47
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 27, 2021
@Arvinderpal
Contributor Author

@enxebre I have addressed the last comment.
Waiting for v1alpha4 sounds good to me.

@Arvinderpal
Contributor Author

Now that v1alpha4 is out, can we merge this?

@enxebre
Member

enxebre commented Jun 30, 2021

/hold cancel
/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jun 30, 2021
@Arvinderpal
Contributor Author

@fabriziopandini @srm09 This has been sitting for a while. If we can merge this, then I can follow up with additional Conditions. 🙏

@srm09
Contributor

srm09 commented Jul 13, 2021

/lgtm

Contributor

@CecileRobertMichon CecileRobertMichon left a comment


/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2021
@Arvinderpal
Contributor Author

/retest

@CecileRobertMichon
Contributor

FAIL sigs.k8s.io/cluster-api/controllers [build failed]

?

@Arvinderpal
Contributor Author

FAIL sigs.k8s.io/cluster-api/controllers [build failed]

@CecileRobertMichon I can't make out from the logs why this is happening. It could just be flaky, so I'll retest again; if that fails, I can try rebasing my branch, though that will remove all the lgtms and approves...

@Arvinderpal
Contributor Author

/retest-required

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2021
`AvailableCondition` and `ReadyCondition`.
@k8s-ci-robot
Contributor

k8s-ci-robot commented Jul 15, 2021

@Arvinderpal: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-cluster-api-apidiff-main
Commit: c0caa60
Rerun command: /test pull-cluster-api-apidiff-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Arvinderpal
Contributor Author

@CecileRobertMichon @enxebre @srm09 Please lgtm and approve again. I had to rebase with master in order for all CI jobs to pass. 🙏

@srm09
Contributor

srm09 commented Jul 15, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2021
@k8s-ci-robot k8s-ci-robot merged commit 109abc5 into kubernetes-sigs:master Jul 15, 2021
@k8s-ci-robot k8s-ci-robot added this to the v0.4 milestone Jul 15, 2021