Investigate using kind for e2e tests on Prow #103
/cc @mattmoor @josephburnett FYI This should help with the question brought up today during Joe's presentation about autoscaling.
I'm wondering about the resource requirements we need and its ability to handle our workloads (e.g. provisioning GCP LoadBalancers).
I investigated using KIND in our e2e test flows by trying to make all e2e tests in knative/serving pass on KIND. After a couple of PRs (the ones associated with this issue), KIND works for most of the tests, as shown below:
To make KIND work, the following steps are needed:
Overall, KIND is not ready yet for running our e2e tests, considering that the scaling test duration is not comparable to GKE clusters, even weighed against the potential benefit it could bring us (shorter cluster creation time, from ~3 minutes down to ~10-20 seconds). Also, KIND is not in a stable state yet (the master branch is currently broken, kubernetes-sigs/kind#509), and there is a breaking change in the upcoming 0.3 release branch.
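For reference, a minimal sketch of the cluster lifecycle behind those timings (the node image tag is illustrative, and the `get kubeconfig-path` step matches the kind releases current at the time; newer releases set the kubectl context automatically on create):

```bash
# Create a local cluster; this is the step that takes ~10-20 seconds
# locally, versus ~3 minutes for a fresh GKE cluster.
kind create cluster --name knative-e2e --image kindest/node:v1.13.4

# Point kubectl at the new cluster before running the e2e suite
# (early kind releases exposed the kubeconfig path via a subcommand).
export KUBECONFIG="$(kind get kubeconfig-path --name knative-e2e)"
kubectl cluster-info

# Tear the cluster down once the tests finish.
kind delete cluster --name knative-e2e
```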
@adrcunha fyi
👋 kind dev here, would like to help if there's interest 🙃
Do you always need to run scale tests? These are necessarily going to be bound by the hardware they run on and might improve if run on bigger CI nodes; I'd guess your GKE clusters have more horsepower. Are you running scale tests in presubmit?
I'd be curious what those failed tests are doing, if you know :-)
What version do you need? (See kubernetes-sigs/kind#531.) Does knative not test with the current Kubernetes release at all? This seems surprising.
That should only be necessary if you need memory limits on your pods; unfortunately there aren't many options there.
To clarify, it is not broken. You must build with go modules or use the Makefile. https://github.com/kubernetes-sigs/kind/releases has precompiled binaries to make this easier. 0.3 will require new node images (but we're providing them) and changes some internal details of the node that are never supposed to be guaranteed (like which CRI we use). I wouldn't expect knative testing to depend on these details.
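For example, a rough sketch of the two build paths (the Makefile target name is an assumption; check the Makefile in your checkout):

```bash
# Option 1: build from source with Go modules explicitly enabled,
# which is what the master branch requires.
git clone https://github.com/kubernetes-sigs/kind
cd kind
GO111MODULE=on go build .

# Option 2: let the repo's Makefile set the module flags for you
# (target name assumed, not verified against a specific commit).
make build
```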
Hi @BenTheElder, thanks for the prompt response as well as the clarification. Yes, we do run the scale test in presubmit, and the difference is 240-300 seconds on KIND vs. 10-20 seconds on GKE for scaling up to 10. Our e2e tests run on clusters of 4 n1-standard-4 machines; each machine has 4 vCPUs and 15 GB of memory, which isn't much different from my Linux box (12 CPUs and 64 GB of memory).
At this point we're running them in presubmit, especially because there's active development in the autoscaling area.
No, we don't test against the latest k8s version for several reasons, including features and compatibility with other k8s providers (GCP, IBM, Azure, etc.). Running against the latest k8s isn't currently a concern.
Thanks @chaodaiG @adrcunha that makes sense to me.
So pod start time is the issue? That is good to know and worth looking into on my end...
I see, so if I want to run knative locally I need to match the cloud providers?
FWIW, it looks like IBM has 1.14.1 with the default at 1.13.6, which is pretty recent (the latest from the two most recent Kubernetes branches). AKS has 1.13 GA with 1.14 in preview. GKE supports 1.13.5, and of course customers on all of these clouds use unmanaged clusters to run more recent versions. kind defaults to 1.14.1 currently, but we run CI against all supported Kubernetes versions: https://testgrid.k8s.io/conformance-kind
Closing this investigation for now; we will re-evaluate in the future. FYI @BenTheElder, based on my observation, pod start time is the bottleneck. Thanks a lot for the help and the quick turnaround throughout my investigation. Also FYI @tcnghia, we decided not to integrate with KIND at this point. Thank you for helping me sweep out most of the failed tests. /close
@chaodaiG: Closing this issue. In response to this:
For stability, yes. The knative.dev website has instructions for several cloud providers, and also for minikube.
CI and presubmit run against the latest GKE.
Good to know, as it might be necessary in the future, thanks.
/reopen
@chizhg: Reopened this issue. In response to this:
You'll also need to consider incorporating it with kntest, I believe.
I'd love to see us try this because frankly it's going to be our most reliable way of tracking the set of upstream Kubernetes versions that we want to validate against. Technically we should be testing against 1.16-1.18 right now, but GKE only has 1.15 in the default channel. Getting some base set of tests (e.g. conformance) to run against kind across those versions would be pretty valuable IMO.
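As a rough sketch of what that version matrix could look like (the node image tags are illustrative and would need to be checked against the available kindest/node releases, and the test command is a placeholder for whatever base set we pick):

```bash
# Spin up one kind cluster per upstream minor version we want to
# validate, run the base test set against it, then tear it down.
# Recent kind releases set the current kubectl context on create,
# so each iteration tests the cluster it just brought up.
for version in v1.16.9 v1.17.5 v1.18.2; do
  name="e2e-${version//./-}"  # cluster names: avoid dots
  kind create cluster --name "${name}" --image "kindest/node:${version}"
  # Placeholder for the base test set (e.g. conformance).
  go test ./test/conformance/... || echo "failures on ${version}"
  kind delete cluster --name "${name}"
done
```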
Issues go stale after 90 days of inactivity. Send feedback to the Knative Productivity Slack channel or file an issue in knative/test-infra. /lifecycle stale
/remove-lifecycle stale
/assign @mattmoor
I'm not looking at this in Prow
Ah, glossed over the "on Prow" part, but I figured the "using KIND" was more important :)
Enabled in #2427
@chaodaiG: Closing this issue. In response to this:
kind (Kubernetes in Docker) might make e2e tests faster, since there's no need to create an external cluster. However, we need to take into consideration that we have to start a k8s cluster with a known, public version (e.g. 1.11.1).
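A minimal sketch of pinning the version (the exact image tag is illustrative; available tags depend on the kind release in use):

```bash
# kind selects the Kubernetes version via the node image tag, so tests
# can target a known, public release rather than whatever the host
# environment happens to provide.
kind create cluster --image kindest/node:v1.11.10
```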