kinder: add kubelet skew test jobs #2396

Merged

neolit123
Member

@neolit123 neolit123 commented Feb 24, 2021

on the SIG Node / Arch mailing list there is a discussion around the fact that the kubelet claims support for N-2 skew against the kube-apiserver, but currently there are no tests for that.

https://groups.google.com/d/msgid/kubernetes-sig-architecture/CAH1uJ6U4qftRnmWthjYKtBKAzRFFnX7XWLMkSbyVWHUsX8%3DBBg%40mail.gmail.com?utm_medium=email&utm_source=footer

on the side of kubeadm we always recommend that users match their kubelet version with the control plane version and the kubeadm version, but such skew tests can help SIG Node and feature upgrade testing in core.

these commits add tests for version skew testing with kubeadm / kinder:

skew-kubelet-1.17-on-1.18.yaml
skew-kubelet-1.17-on-1.19.yaml
skew-kubelet-1.18-on-1.19.yaml
skew-kubelet-1.18-on-1.20.yaml
skew-kubelet-1.19-on-1.20.yaml
skew-kubelet-1.19-on-latest.yaml
skew-kubelet-1.20-on-latest.yaml

NOTE: x-on-y jobs where y==x are already covered by the regular* jobs.

one caveat (as the note in kubeadm-periodic.tests.md explains) is that kubeadm doesn't really support N-2 kubelet skew (max is N-1), so in the future, if the kubelet changes in such a way that these tests are no longer possible with kubeadm directly, we need to re-evaluate.
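
The difference between the two skew limits mentioned above (kubelet N-2 against the kube-apiserver, kubeadm N-1) can be illustrated with a minimal sketch. The function and constant names below are hypothetical and are not part of kubeadm or kinder; the limits are taken from this description only.

```python
# Hypothetical helper, not part of kubeadm or kinder.
# Limits as stated in the PR description:
KUBELET_MAX_SKEW = 2  # kubelet claims support for N-2 against the kube-apiserver
KUBEADM_MAX_SKEW = 1  # kubeadm supports at most N-1 kubelet skew


def minor(version: str) -> int:
    """Extract the minor version from a string such as '1.19' or 'v1.19.3'."""
    return int(version.lstrip("v").split(".")[1])


def classify_skew(control_plane: str, kubelet: str) -> str:
    """Classify a control plane / kubelet pair by how many minors they differ."""
    skew = minor(control_plane) - minor(kubelet)
    if skew < 0:
        # A kubelet newer than the kube-apiserver is outside the skew policy.
        return "unsupported"
    if skew <= KUBEADM_MAX_SKEW:
        return "supported by kubeadm"
    if skew <= KUBELET_MAX_SKEW:
        return "kubelet skew only"
    return "unsupported"


if __name__ == "__main__":
    for cp, kl in [("1.20", "1.19"), ("1.20", "1.18"), ("1.20", "1.17")]:
        print(f"kubelet {kl} on {cp}: {classify_skew(cp, kl)}")
```

Pairs that land in the "kubelet skew only" bucket are the ones this caveat is about: the jobs exercise them, but kubeadm itself does not claim support for them.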

@neolit123 neolit123 added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. area/test kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node. area/kinder Issues to track work in the kinder tool labels Feb 24, 2021
@neolit123 neolit123 added this to the v1.21 milestone Feb 24, 2021
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 24, 2021
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 24, 2021
@neolit123
Member Author

/hold

@fabriziopandini leaving this up for discussion for now.
PTAL and LMK what you think.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2021
@neolit123 neolit123 changed the title from "1.21 add kubelet skew jobs" to "kinder: add kubelet skew test jobs" Feb 24, 2021
Member

@fabriziopandini fabriziopandini left a comment

I'm not opposed to having these tests in the kubeadm repo; however, there should be some kind of agreement on the responsibility for monitoring these jobs and triaging failures (I guess the first line here is SIG Node, with our team as support)

kinder/ci/workflows/skew-x-on-y-tasks.yaml (outdated, resolved)
kinder/ci/workflows/skew-kubelet-1.17-on-1.18.yaml (outdated, resolved)
kinder/ci/workflows/skew-x-on-y-tasks.yaml (outdated, resolved)
@neolit123
Member Author

> I'm not opposed to having these tests in the kubeadm repo; however, there should be some kind of agreement on the responsibility for monitoring these jobs and triaging failures (I guess the first line here is SIG Node, with our team as support)

i proposed that we duplicate / add the jobs to a SIG Node dashboard.

@neolit123 neolit123 force-pushed the 1.21-add-kubelet-skew-jobs branch from 85b5d88 to b0ac04e on February 24, 2021 23:17
@neolit123
Member Author

@fabriziopandini updated.

- add kubeletVersion as a variable
- add defaultIgnorePreflightErrors with the default list of checks to skip
- add ignorePreflightErrors

latest (tip of master) == n:
- n control plane + kubeadm against kubelet n-1 and n-2
- n-1 control plane + kubeadm against kubelet n-2 and n-3
- n-2 control plane + kubeadm against kubelet n-3 and n-4
- n-3 control plane + kubeadm against kubelet n-4
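
The matrix above can be expanded mechanically into the workflow file names listed in the PR description. The following is a minimal sketch, assuming that "latest" corresponds to 1.21 (inferred from the v1.21 milestone) and that 1.17 is the oldest kubelet under test; `skew_job_files` is a hypothetical helper and not part of kinder.

```python
def skew_job_files(latest_minor=21, oldest_kubelet_minor=17, max_skew=2):
    """Expand the skew matrix into workflow file names.

    Assumptions (not stated explicitly in the PR): 'latest' is 1.21 and
    1.17 is the oldest kubelet version that gets a job.
    """
    files = []
    # Walk control plane versions from n (latest) down to n-3.
    for cp in range(latest_minor, oldest_kubelet_minor, -1):
        cp_label = "latest" if cp == latest_minor else f"1.{cp}"
        # Pair each control plane with kubelets 1..max_skew minors older.
        for skew in range(1, max_skew + 1):
            kubelet = cp - skew
            if kubelet < oldest_kubelet_minor:
                continue  # e.g. n-3 control plane only gets kubelet n-4
            files.append(f"skew-kubelet-1.{kubelet}-on-{cp_label}.yaml")
    return sorted(files)


if __name__ == "__main__":
    for name in skew_job_files():
        print(name)
```

Under these assumptions the helper yields seven names, the same count and naming pattern as the files added by this PR.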

Update kubeadm-periodic.tests.md to include details about the new jobs.
@dims
Member

dims commented Mar 3, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, neolit123

@neolit123
Member Author

let's iterate if we find problems.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 4, 2021
@neolit123
Member Author

/retest

@neolit123
Member Author

/retest
