Improve Cluster API tests to work better with constrained resources #3441
Previously, it was easy to trigger failures by pre-compiling the tests and running them using stress. With these changes, I was able to increase the parallelism of stress to 1024 without triggering any failures (on a 32-core Threadripper):
$ stress -p 1024 ./capi.test
0 runs so far, 0 failures
21 runs so far, 0 failures
191 runs so far, 0 failures
340 runs so far, 0 failures
418 runs so far, 0 failures
548 runs so far, 0 failures
637 runs so far, 0 failures
753 runs so far, 0 failures
867 runs so far, 0 failures
968 runs so far, 0 failures
1071 runs so far, 0 failures
1157 runs so far, 0 failures
1254 runs so far, 0 failures
1355 runs so far, 0 failures
1439 runs so far, 0 failures
1571 runs so far, 0 failures
1643 runs so far, 0 failures
1732 runs so far, 0 failures
1835 runs so far, 0 failures
1928 runs so far, 0 failures
2022 runs so far, 0 failures
2109 runs so far, 0 failures
2171 runs so far, 0 failures
2322 runs so far, 0 failures
2388 runs so far, 0 failures
2469 runs so far, 0 failures
2557 runs so far, 0 failures
2647 runs so far, 0 failures
2743 runs so far, 0 failures
2835 runs so far, 0 failures
2927 runs so far, 0 failures
2978 runs so far, 0 failures
3079 runs so far, 0 failures
3126 runs so far, 0 failures
3216 runs so far, 0 failures
3284 runs so far, 0 failures
3383 runs so far, 0 failures
3440 runs so far, 0 failures
3508 runs so far, 0 failures
3610 runs so far, 0 failures
3664 runs so far, 0 failures
3765 runs so far, 0 failures
3830 runs so far, 0 failures
3895 runs so far, 0 failures
3966 runs so far, 0 failures
4020 runs so far, 0 failures
4105 runs so far, 0 failures
4163 runs so far, 0 failures
4233 runs so far, 0 failures
4303 runs so far, 0 failures
4355 runs so far, 0 failures
4448 runs so far, 0 failures
4511 runs so far, 0 failures
4570 runs so far, 0 failures
4631 runs so far, 0 failures
4705 runs so far, 0 failures
4754 runs so far, 0 failures
4831 runs so far, 0 failures
Thanks for the quick turnaround on this, Jason! I tested this out locally and I am seeing results similar to your output.
/assign @benmoss
Since the issue seems to be more common (another PR was affected: https://travis-ci.org/github/kubernetes/autoscaler/builds/719376991) and this PR only touches tests, I will take the liberty of merging it. Feel free to proceed with the review and send a follow-up if needed.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: elmiko, mwielgus
Improve Cluster API tests to work better with constrained resources
Merge pull request #3441 from detiber/fixCAPITests
This makes the Cluster API provider tests for Cluster Autoscaler more resilient when running in resource-constrained environments. Previously, the tests replicated each operation on both the informer cache stores and the fake clients. Depending on timing, this could leave duplicated resources in the informer cache store, a race that shows up under increased parallelism or in resource-constrained environments. With this change, the test helpers no longer replicate operations; instead they poll the informer cache store until the changes have propagated, and only then continue with test assertions.
/assign @elmiko