Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Cluster API tests to work better with constrained resources #3441

Merged
merged 1 commit into from
Aug 19, 2020

Conversation

detiber
Copy link
Member

@detiber detiber commented Aug 19, 2020

This makes Cluster API provider tests for Cluster Autoscaler more resilient when running in resource constrained environments. Previously the tests were replicating operations on the informer cache stores and the fake clients, which led to duplicated resources being present in the informer cache store depending on racy behavior that manifests with increased parallelism or in resource constrained environments. With this change, rather than trying to replicate operations the test helpers poll the informer cache store for the changes to be propagated before continuing test assertions.

/assign @elmiko

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 19, 2020
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 19, 2020
@detiber
Copy link
Member Author

detiber commented Aug 19, 2020

Previously, it was easy to trigger failures by pre-compiling the tests and running them using stress with the default configuration.

With these changes, I was able to increase the parallelism of stress to 1024 without being able to trigger any failures (using a 32 core threadripper):

detiberloggerhead~srckubernetesautoscalercluster-autoscaler master2⬇2✎1+$  stress -p 1024 ./capi.test
0 runs so far, 0 failures
21 runs so far, 0 failures
191 runs so far, 0 failures
340 runs so far, 0 failures
418 runs so far, 0 failures
548 runs so far, 0 failures
637 runs so far, 0 failures
753 runs so far, 0 failures
867 runs so far, 0 failures
968 runs so far, 0 failures
1071 runs so far, 0 failures
1157 runs so far, 0 failures
1254 runs so far, 0 failures
1355 runs so far, 0 failures
1439 runs so far, 0 failures
1571 runs so far, 0 failures
1643 runs so far, 0 failures
1732 runs so far, 0 failures
1835 runs so far, 0 failures
1928 runs so far, 0 failures
2022 runs so far, 0 failures
2109 runs so far, 0 failures
2171 runs so far, 0 failures
2322 runs so far, 0 failures
2388 runs so far, 0 failures
2469 runs so far, 0 failures
2557 runs so far, 0 failures
2647 runs so far, 0 failures
2743 runs so far, 0 failures
2835 runs so far, 0 failures
2927 runs so far, 0 failures
2978 runs so far, 0 failures
3079 runs so far, 0 failures
3126 runs so far, 0 failures
3216 runs so far, 0 failures
3284 runs so far, 0 failures
3383 runs so far, 0 failures
3440 runs so far, 0 failures
3508 runs so far, 0 failures
3610 runs so far, 0 failures
3664 runs so far, 0 failures
3765 runs so far, 0 failures
3830 runs so far, 0 failures
3895 runs so far, 0 failures
3966 runs so far, 0 failures
4020 runs so far, 0 failures
4105 runs so far, 0 failures
4163 runs so far, 0 failures
4233 runs so far, 0 failures
4303 runs so far, 0 failures
4355 runs so far, 0 failures
4448 runs so far, 0 failures
4511 runs so far, 0 failures
4570 runs so far, 0 failures
4631 runs so far, 0 failures
4705 runs so far, 0 failures
4754 runs so far, 0 failures
4831 runs so far, 0 failures

@elmiko
Copy link
Contributor

elmiko commented Aug 19, 2020

thanks for the quick turnaround on this Jason! i tested this out locally and i am seeing similar results to your output.
/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 19, 2020
@detiber
Copy link
Member Author

detiber commented Aug 19, 2020

/assign @benmoss
Ben, ptal, this fixes an issue with the tests (and only the tests) that was uncovered after the unstructured changes merged.

@mwielgus
Copy link
Contributor

Since the issue seems to be more common (another pr affected: https://travis-ci.org/github/kubernetes/autoscaler/builds/719376991) and the PR is only tests I will take the liberty of merging the PR. Feel free to proceed with review and send a follow-up if needed.

Copy link
Contributor

@mwielgus mwielgus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko, mwielgus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 5c07277 into kubernetes:master Aug 19, 2020
ryaneorth pushed a commit to ryaneorth/autoscaler that referenced this pull request Oct 14, 2020
Improve Cluster API tests to work better with constrained resources
ryaneorth pushed a commit to ryaneorth/autoscaler that referenced this pull request Oct 14, 2020
Improve Cluster API tests to work better with constrained resources
k8s-ci-robot added a commit that referenced this pull request Oct 14, 2020
Merge pull request #3441 from detiber/fixCAPITests
k8s-ci-robot added a commit that referenced this pull request Oct 14, 2020
Merge pull request #3441 from detiber/fixCAPITests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants