Refactor performance tests: capture memory usage, add README #2455

Merged
merged 1 commit into from
Jul 7, 2023

Conversation

jdn5126
Contributor

@jdn5126 jdn5126 commented Jul 3, 2023

What type of PR is this?
enhancement, testing

Which issue does this PR fix:
#2437

What does this PR do / Why do we need it:
This PR refactors and cleans up scripts/run-integration-tests.sh, primarily for performance testing. It captures memory usage of the aws-node pod during performance tests and uploads it to an S3 bucket, adds a README for running run-integration-tests.sh, and removes unused variables.

The memory statistics may be put to further use in the future, e.g. to compare against previous runs, but for now we only flag when memory usage exceeds our recommended value of 200Mi.
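The 200Mi check described above can be sketched as follows. This is a minimal illustration, not the code in scripts/run-integration-tests.sh: the function names `parse_mem_mi` and `check_memory` and the variable `MEM_THRESHOLD_MI` are hypothetical, and the input is assumed to be one line of `kubectl top pod --no-headers` output with memory reported in Mi.

```shell
# Hypothetical sketch; names are not taken from run-integration-tests.sh.
MEM_THRESHOLD_MI=200   # recommended value stated in this PR

# parse_mem_mi LINE: extract the integer Mi value from one line of
# `kubectl top pod --no-headers` output, e.g. "aws-node-abcde 3m 57Mi".
# Assumes the third column is memory and carries an "Mi" suffix.
parse_mem_mi() {
    printf '%s\n' "$1" | awk '{ sub(/Mi$/, "", $3); print $3 }'
}

# check_memory MEM_MI: print a status line and return 1 when MEM_MI
# exceeds the threshold, 0 otherwise.
check_memory() {
    if [ "$1" -gt "$MEM_THRESHOLD_MI" ]; then
        echo "WARNING: aws-node memory usage ${1}Mi exceeds ${MEM_THRESHOLD_MI}Mi"
        return 1
    fi
    echo "OK: aws-node memory usage ${1}Mi is within ${MEM_THRESHOLD_MI}Mi"
}
```

In a live cluster the input line might come from `kubectl top pod -n kube-system -l k8s-app=aws-node --no-headers`, assuming a metrics server is installed and the pods carry that label; the result of `parse_mem_mi` would then be passed to `check_memory`.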

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
N/A

Testing done on this change:
Manually running integration tests and evaluating performance statistics.

Automation added to e2e:
Added capturing of memory usage by aws-node pods to performance tests.

Will this PR introduce any new dependencies?:
No

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No, Yes

Does this change require updates to the CNI daemonset config files to work?:
No

Does this PR introduce any user-facing change?:
No


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jdn5126 jdn5126 requested a review from a team as a code owner July 3, 2023 18:56
Member

@orsenthil orsenthil left a comment

LGTM. Could you please paste the output of a test run after this refactor? It would be good to verify what to expect and to see a successful run.

Thank you.

        run: |
          ./scripts/run-integration-tests.sh
        if: always()
      - name: Run calico tests
        env:
          DISABLE_PROMPT: true
          S3_BUCKET_CREATE: false
Member

What's the motivation for these changes, especially removing the S3 bucket creation and setting RUN_INTEGRATION_DEFAULT_CNI to false in the weekly cron?

Contributor Author

S3_BUCKET_CREATE and S3_BUCKET_NAME were unused variables that appear to have been missed in a previous cleanup, so I removed them here.

For RUN_INTEGRATION_DEFAULT_CNI, we were running the CNI integration tests with every invocation of run-integration-tests. For the weekly tests, that means we were running them 4 times with no real benefit. Since they already run nightly as part of the nightly-cron-tests, I concluded they should be removed from the weekly tests. This should drop our weekly test runtime by about 1.5 hours, as well.
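As a rough illustration of how such a toggle can gate a test suite in a shell script, consider the sketch below. The function name `run_default_cni_tests` and the default value of `true` are assumptions made for the example; the actual handling in run-integration-tests.sh may differ.

```shell
# Hypothetical sketch of gating a test suite on an environment variable.
# Assumption: the tests run unless the caller opts out; the real default
# in run-integration-tests.sh may differ.
: "${RUN_INTEGRATION_DEFAULT_CNI:=true}"

run_default_cni_tests() {
    if [ "$RUN_INTEGRATION_DEFAULT_CNI" != "true" ]; then
        echo "Skipping default CNI integration tests"
        return 0
    fi
    echo "Running default CNI integration tests"
    # The actual test invocation would go here.
}
```

With this shape, a workflow that sets `RUN_INTEGRATION_DEFAULT_CNI: false` in its `env` block skips the suite, while callers that leave it unset keep the existing behavior.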

@jdn5126
Contributor Author

jdn5126 commented Jul 7, 2023

For test results, I kicked off https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/5488008758 to show the results from GitHub

@jdn5126
Contributor Author

jdn5126 commented Jul 7, 2023

For test results, I kicked off https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/5488008758 to show the results from GitHub

Oh, actually that won't work, as the run cannot happen with the new workflow file until this merges. For example output, I can show the latest S3 bucket results from my manual run:

Iteration, scaleUpTime (s), scaleDownTime (s), scaleUpMem (Mi), scaleDownMem (Mi)
1, 13, 25, 57, 59
2, 13, 26, 59, 59
3, 13, 25, 58, 59
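
A results file in the format above could be scanned for iterations that exceed the 200Mi recommendation with a small helper like the one below. This is a sketch only: the function name `flag_high_memory` is hypothetical, and it assumes the column order shown here (scale-up memory in column 4, scale-down memory in column 5) with a single header row.

```shell
# Hypothetical helper; assumes the CSV layout shown above.
# flag_high_memory FILE [THRESHOLD_MI]
# Prints one line per iteration whose peak memory exceeds the threshold
# (default 200) and returns 1 if any iteration was flagged.
flag_high_memory() {
    awk -F', *' -v limit="${2:-200}" '
        NR == 1 { next }    # skip the header row
        NF < 5  { next }    # ignore blank or malformed rows
        {
            up = $4 + 0; down = $5 + 0
            peak = (up > down) ? up : down
            if (peak > limit) {
                printf "iteration %s: peak %dMi exceeds %dMi\n", $1, peak, limit
                flagged = 1
            }
        }
        END { exit flagged + 0 }
    ' "$1"
}
```

Run against the three iterations above with the default threshold, the helper prints nothing and returns 0, since the highest recorded value is 59Mi.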

@jdn5126 jdn5126 merged commit 438dcb3 into aws:master Jul 7, 2023
@jdn5126 jdn5126 deleted the perf branch July 7, 2023 15:27
jdn5126 added a commit that referenced this pull request Jul 11, 2023
* refactor canary test to access images from AWS registries (#2398)

* upgrade client-go and controller-runtime modules (#2396)

* updates for v1.13.0 release (#2400)

* chore: Added dependabot (#2403)

* dependency updates (#2412)

* deprecate ENABLE_NFTABLES and set iptables mode using iptables-wrapper script (#2402)

* update networking test agent to go1.20 and latest sys module (#2413)

* skip delete test cluster to debug (#2414)

* Revert "skip delete test cluster to debug (#2414)" (#2415)

This reverts commit 7c30943.

* authenticate to test image registry (#2417)

* update test agent image (#2419)

* update test agent hash in go.mod (#2422)

* fix hard-coded nitro instances (#2428)

* move authentication step from test canary script (#2429)

* node initialization must come after primary ENI's security groups are synced to cache (#2427)

* Add 1.27 to Rec Version Table (#2404)

* revise rec version table

* make DOCKER_ARGS a passable var from CLI builds (#2434)

Signed-off-by: jonahjon <[email protected]>

* Update Kops cluster to latest and add parameter for kops version (#2435)

* Updates instance limits including c7gn (#2438)

* Update Kops cluster to latest and add parameter for kops version (#2440)

* update image tag to v1.13.2 (#2432)

* update docs and CNI logging (#2433)

* remove default canary test run from integration tests (#2443)

* Silences nightly cron jobs for forks (#2444)

* Silences weekly cron jobs for forks (#2459)

* refactor performance tests (#2455)

* add custom-networking test covering ENIConfig objects with no security groups (#2445)

* k8s clients only need to access corev1; add pod selector (#2463)

---------

Signed-off-by: jonahjon <[email protected]>
Co-authored-by: Olivia Song <[email protected]>
Co-authored-by: Ellis Tarn <[email protected]>
Co-authored-by: Geoffrey Cline <[email protected]>
Co-authored-by: Jonah Jones <[email protected]>
Co-authored-by: Jay Deokar <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Matt <[email protected]>