🏃[e2e] add mhc test #3029

Merged May 11, 2020 (1 commit)

Conversation

sedefsavas

What this PR does / why we need it:
This PR adds machine health check e2e tests.

/assign @fabriziopandini

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 7, 2020
@k8s-ci-robot k8s-ci-robot requested review from benmoss and JoelSpeed May 7, 2020 20:51
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 7, 2020
@sedefsavas sedefsavas force-pushed the e2e-mhc-test branch 2 times, most recently from e293c6f to 230d1a1 Compare May 8, 2020 06:23
@sedefsavas
Author

Following the decision in #3006 not to support empty MachineHealthCheck labels, this test disallows an MHC without labels.
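
As a rough illustration of that constraint, here is a minimal sketch of a check that rejects a health check with no selector labels. The `MachineHealthCheck` struct and `validateSelector` function are hypothetical stand-ins invented for this example; they are not the Cluster API types or the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// MachineHealthCheck is a simplified, hypothetical stand-in for the real
// Cluster API type; only the field needed for this sketch is modeled.
type MachineHealthCheck struct {
	Name           string
	SelectorLabels map[string]string
}

// validateSelector rejects a health check whose selector has no labels,
// mirroring the "no empty labels" rule described in the comment above.
func validateSelector(mhc MachineHealthCheck) error {
	if len(mhc.SelectorLabels) == 0 {
		return errors.New("machine health check " + mhc.Name + " has an empty label selector")
	}
	return nil
}

func main() {
	withLabels := MachineHealthCheck{Name: "mhc-a", SelectorLabels: map[string]string{"pool": "workers"}}
	withoutLabels := MachineHealthCheck{Name: "mhc-b"}

	fmt.Println(validateSelector(withLabels))
	fmt.Println(validateSelector(withoutLabels))
}
```

A test helper could call a check like this before using a discovered MHC, so that a misconfigured template fails early with a clear message.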

@vincepri
Member

vincepri commented May 8, 2020

/assign @fabriziopandini @benmoss
/milestone v0.3.6

@k8s-ci-robot k8s-ci-robot added this to the v0.3.6 milestone May 8, 2020
Review threads (resolved) on:
test/framework/alltypes_helpers.go
test/framework/machine_helpers.go
test/framework/machinehealthcheck_helpers.go
test/e2e/mhc_remediations_test.go
Member

@fabriziopandini fabriziopandini left a comment


@sedefsavas
My main concern with this PR is that DiscoverMachineHealthChecksAndWaitForRemediation performs a complex sequence; I would like to break it down into smaller parts that can be reused in the MHC remediation test. What about:

  1. Avoiding label patching by setting the labels up front in the cluster template.
  2. Implementing the following sequence in the spec:
    • CreateWorkloadCluster (get cluster, mds)
    • DiscoverMHCs
    • PatchCondition (on mds[0], first machine)
    • Wait for remediation to happen
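
The sequence proposed above can be sketched as four small, separately reusable steps. This is only a toy in-memory model: every type and function name here (`Cluster`, `discoverMHCs`, `patchUnhealthyCondition`, `waitForRemediation`, and so on) is an assumption made for illustration, and the `reconcile` function merely imitates what a controller would do. None of it is the test framework's real API.

```go
package main

import (
	"errors"
	"fmt"
)

// All types and functions below are hypothetical stand-ins for illustration;
// they are not the Cluster API test framework's API.

type Machine struct {
	Name      string
	Labels    map[string]string
	Unhealthy bool // stands in for a patched node condition
	Deleted   bool // stands in for remediation having happened
}

type MachineHealthCheck struct {
	Name     string
	Selector map[string]string
}

type Cluster struct {
	Machines []*Machine
	MHCs     []MachineHealthCheck
}

// Step 1: create the workload cluster, with labels set up front in the template.
func createWorkloadCluster() *Cluster {
	labels := map[string]string{"mhc-test": "true"}
	return &Cluster{
		Machines: []*Machine{{Name: "md-0-a", Labels: labels}, {Name: "md-0-b", Labels: labels}},
		MHCs:     []MachineHealthCheck{{Name: "mhc-0", Selector: labels}},
	}
}

// Step 2: discover the health checks that belong to the cluster.
func discoverMHCs(c *Cluster) []MachineHealthCheck { return c.MHCs }

// Step 3: mark one machine unhealthy, as a patched node condition would.
func patchUnhealthyCondition(m *Machine) { m.Unhealthy = true }

// matches reports whether every selector label is present on the machine.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// reconcile plays the role of the MHC controller in this toy model: it
// remediates (here, deletes) unhealthy machines matched by the selector.
func reconcile(c *Cluster, mhc MachineHealthCheck) {
	for _, m := range c.Machines {
		if m.Unhealthy && matches(mhc.Selector, m.Labels) {
			m.Deleted = true
		}
	}
}

// Step 4: wait, with a bounded number of attempts, for remediation.
func waitForRemediation(c *Cluster, mhc MachineHealthCheck, m *Machine, attempts int) error {
	for i := 0; i < attempts; i++ {
		reconcile(c, mhc)
		if m.Deleted {
			return nil
		}
	}
	return errors.New("machine " + m.Name + " was not remediated")
}

func main() {
	cluster := createWorkloadCluster()
	mhcs := discoverMHCs(cluster)
	target := cluster.Machines[0] // first machine, as in the proposal
	patchUnhealthyCondition(target)
	fmt.Println(waitForRemediation(cluster, mhcs[0], target, 3))
}
```

Keeping each step as its own function means a spec can compose them in a different order, or swap in a different way of choosing the target machine, without touching the wait logic.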

Review threads (resolved) on:
test/e2e/data/infrastructure-docker/cluster-template.yaml
test/e2e/mhc_remediations.go
test/framework/machine_helpers.go
test/framework/machinehealthcheck_helpers.go
@sedefsavas
Author

/hold
The MachineHealthCheck controller detects the unhealthy node but then does not remediate it. Still investigating.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 8, 2020
@sedefsavas sedefsavas force-pushed the e2e-mhc-test branch 2 times, most recently from 809b231 to 3d9953c Compare May 11, 2020 05:41
@sedefsavas
Author

/hold cancel

This PR is ready. cc @fabriziopandini

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 11, 2020
Member

@fabriziopandini fabriziopandini left a comment


I'm still not 100% sold on the approach that goes back from MHC to machines, but I think this is acceptable for the first iteration, so lgtm from me after fixing two small nits.

Review threads (resolved) on:
test/framework/machinehealthcheck_helpers.go
@sedefsavas sedefsavas force-pushed the e2e-mhc-test branch 2 times, most recently from 4983f13 to 1741057 Compare May 11, 2020 17:54
@sedefsavas
Author

> I'm still not 100% sold on the approach that goes back from MHC to machines, but I think this is acceptable for the first iteration, so lgtm from me after fixing two small nits.

@fabriziopandini what do you have in mind as a better approach to trigger MHC, other than patching the node condition? Asking for future improvement.

@fabriziopandini
Member

/approve
/lgtm

@sedefsavas I'm ok with how we trigger remediation.
I'm less convinced about how we choose the machine to be remediated: we now start from the MHC and pick one machine in scope, so framework users have no way to specify which machine to remediate. For the sake of reusability, we should probably let the user specify the list of machines to be remediated.
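
One possible shape for that suggestion, sketched under assumptions: an input struct with an optional list of caller-chosen machines, falling back to the first machine in scope when the list is empty. `RemediationInput` and `selectTargets` are hypothetical names for this example only; the framework's actual input types may look quite different.

```go
package main

import "fmt"

// RemediationInput is a hypothetical input type, not the framework's own.
type RemediationInput struct {
	// MachinesInScope are the names of machines matched by the health check.
	MachinesInScope []string
	// MachinesToRemediate optionally lets the caller pick the targets.
	MachinesToRemediate []string
}

// selectTargets returns the caller's chosen machines when any are given,
// after checking each one is in scope. With no explicit choice it falls
// back to the first machine in scope.
func selectTargets(in RemediationInput) ([]string, error) {
	if len(in.MachinesToRemediate) == 0 {
		if len(in.MachinesInScope) == 0 {
			return nil, fmt.Errorf("no machines in scope")
		}
		return in.MachinesInScope[:1], nil
	}
	inScope := make(map[string]bool, len(in.MachinesInScope))
	for _, name := range in.MachinesInScope {
		inScope[name] = true
	}
	for _, name := range in.MachinesToRemediate {
		if !inScope[name] {
			return nil, fmt.Errorf("machine %q is not covered by the health check", name)
		}
	}
	return in.MachinesToRemediate, nil
}

func main() {
	in := RemediationInput{
		MachinesInScope:     []string{"md-0-a", "md-0-b", "md-0-c"},
		MachinesToRemediate: []string{"md-0-b"},
	}
	targets, err := selectTargets(in)
	fmt.Println(targets, err)
}
```

The fallback keeps the current "pick one machine in scope" behavior as the default, while callers that care about a specific machine can name it explicitly.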

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini, sedefsavas


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2020
@k8s-ci-robot k8s-ci-robot merged commit 6821939 into kubernetes-sigs:master May 11, 2020