Improve Cluster API tests to work better with constrained resources #3441

detiber · 2020-08-19T17:22:08Z

This makes Cluster API provider tests for Cluster Autoscaler more resilient when running in resource constrained environments. Previously the tests were replicating operations on the informer cache stores and the fake clients, which led to duplicated resources being present in the informer cache store depending on racy behavior that manifests with increased parallelism or in resource constrained environments. With this change, rather than trying to replicate operations the test helpers poll the informer cache store for the changes to be propagated before continuing test assertions.

/assign @elmiko

detiber · 2020-08-19T17:24:00Z

Previously, it was easy to trigger failures by pre-compiling the tests and running them using stress with the default configuration.

With these changes, I was able to increase the parallelism of stress to 1024 without being able to trigger any failures (using a 32 core threadripper):

detiberloggerhead~srckubernetesautoscalercluster-autoscaler master2⬇2✎1+$  stress -p 1024 ./capi.test
0 runs so far, 0 failures
21 runs so far, 0 failures
191 runs so far, 0 failures
340 runs so far, 0 failures
418 runs so far, 0 failures
548 runs so far, 0 failures
637 runs so far, 0 failures
753 runs so far, 0 failures
867 runs so far, 0 failures
968 runs so far, 0 failures
1071 runs so far, 0 failures
1157 runs so far, 0 failures
1254 runs so far, 0 failures
1355 runs so far, 0 failures
1439 runs so far, 0 failures
1571 runs so far, 0 failures
1643 runs so far, 0 failures
1732 runs so far, 0 failures
1835 runs so far, 0 failures
1928 runs so far, 0 failures
2022 runs so far, 0 failures
2109 runs so far, 0 failures
2171 runs so far, 0 failures
2322 runs so far, 0 failures
2388 runs so far, 0 failures
2469 runs so far, 0 failures
2557 runs so far, 0 failures
2647 runs so far, 0 failures
2743 runs so far, 0 failures
2835 runs so far, 0 failures
2927 runs so far, 0 failures
2978 runs so far, 0 failures
3079 runs so far, 0 failures
3126 runs so far, 0 failures
3216 runs so far, 0 failures
3284 runs so far, 0 failures
3383 runs so far, 0 failures
3440 runs so far, 0 failures
3508 runs so far, 0 failures
3610 runs so far, 0 failures
3664 runs so far, 0 failures
3765 runs so far, 0 failures
3830 runs so far, 0 failures
3895 runs so far, 0 failures
3966 runs so far, 0 failures
4020 runs so far, 0 failures
4105 runs so far, 0 failures
4163 runs so far, 0 failures
4233 runs so far, 0 failures
4303 runs so far, 0 failures
4355 runs so far, 0 failures
4448 runs so far, 0 failures
4511 runs so far, 0 failures
4570 runs so far, 0 failures
4631 runs so far, 0 failures
4705 runs so far, 0 failures
4754 runs so far, 0 failures
4831 runs so far, 0 failures

elmiko · 2020-08-19T17:52:37Z

thanks for the quick turnaround on this Jason! i tested this out locally and i am seeing similar results to your output.
/approve

detiber · 2020-08-19T17:53:30Z

/assign @benmoss
Ben, ptal, this fixes an issue with the tests (and only the tests) that was uncovered after the unstructured changes merged.

mwielgus · 2020-08-19T22:29:04Z

Since the issue seems to be more common (another pr affected: https://travis-ci.org/github/kubernetes/autoscaler/builds/719376991) and the PR is only tests I will take the liberty of merging the PR. Feel free to proceed with review and send a follow-up if needed.

mwielgus

/lgtm
/approve

k8s-ci-robot · 2020-08-19T22:30:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko, mwielgus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~cluster-autoscaler/cloudprovider/clusterapi/OWNERS~~ [elmiko,mwielgus]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Improve Cluster API tests to work better with constrained resources

Merge pull request #3441 from detiber/fixCAPITests

k8s-ci-robot assigned elmiko Aug 19, 2020

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 19, 2020

k8s-ci-robot requested review from enxebre and hardikdr August 19, 2020 17:22

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 19, 2020

Improve Cluster API tests to work better with constrained resources

63f9e40

detiber force-pushed the fixCAPITests branch from 0910446 to 63f9e40 Compare August 19, 2020 17:31

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 19, 2020

k8s-ci-robot assigned benmoss Aug 19, 2020

mwielgus approved these changes Aug 19, 2020

View reviewed changes

k8s-ci-robot assigned mwielgus Aug 19, 2020

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2020

k8s-ci-robot merged commit 5c07277 into kubernetes:master Aug 19, 2020

elmiko mentioned this pull request Oct 14, 2020

Merge pull request #3570 from towca/jtuznik/scale-down-after-delete-fix #3597

Merged

ryaneorth pushed a commit to ryaneorth/autoscaler that referenced this pull request Oct 14, 2020

Merge pull request kubernetes#3441 from detiber/fixCAPITests

48c9d68

Improve Cluster API tests to work better with constrained resources

ryaneorth pushed a commit to ryaneorth/autoscaler that referenced this pull request Oct 14, 2020

Merge pull request kubernetes#3441 from detiber/fixCAPITests

e110878

Improve Cluster API tests to work better with constrained resources

k8s-ci-robot added a commit that referenced this pull request Oct 14, 2020

Merge pull request #3613 from ryaneorth/cherry-pick-3441-1.19

fad6360

Merge pull request #3441 from detiber/fixCAPITests

k8s-ci-robot added a commit that referenced this pull request Oct 14, 2020

Merge pull request #3612 from ryaneorth/cherry-pick-3441-1.18

d282d74

Merge pull request #3441 from detiber/fixCAPITests

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve Cluster API tests to work better with constrained resources #3441

Improve Cluster API tests to work better with constrained resources #3441

detiber commented Aug 19, 2020

detiber commented Aug 19, 2020

elmiko commented Aug 19, 2020

detiber commented Aug 19, 2020

mwielgus commented Aug 19, 2020

mwielgus left a comment

k8s-ci-robot commented Aug 19, 2020

Improve Cluster API tests to work better with constrained resources #3441

Improve Cluster API tests to work better with constrained resources #3441

Conversation

detiber commented Aug 19, 2020

detiber commented Aug 19, 2020

elmiko commented Aug 19, 2020

detiber commented Aug 19, 2020

mwielgus commented Aug 19, 2020

mwielgus left a comment

Choose a reason for hiding this comment

k8s-ci-robot commented Aug 19, 2020