
Add benchmark testing framework #3599

Merged: 32 commits merged into envoyproxy:main on Jun 27, 2024
Conversation

@shawnh2 (Contributor, Author) commented Jun 12, 2024

What this PR does / why we need it:

Taking #2578 forward, but implementing the benchmark testing framework in code.

Similar to ConformanceSuite and ConformanceTest, I define a BenchmarkSuite and a BenchmarkTest (see the sketch below). Each benchmark test case will be run as a k8s Job, so that:

  • we can easily control the benchmark environment, such as the Pod's resource limits and requests
  • we can add more test cases in the future without any breaking changes
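
A minimal sketch of the shape this suggests, with names modeled on the conformance suite; the actual types, fields, and signatures in this PR may differ:

package benchmark

import (
	"testing"
)

// Hypothetical sketch only: BenchmarkTest mirrors ConformanceTest as described
// above; field names are assumptions, not necessarily the PR's actual code.
type BenchmarkTest struct {
	ShortName   string
	Description string
	Test        func(*testing.T, *BenchmarkTestSuite)
}

type BenchmarkTestSuite struct {
	// Options controlling the benchmark environment, e.g. the load profile
	// and the resource limits/requests applied to the k8s Job per test case.
	RPS         int
	Connections int
	Duration    int // seconds
}

// Run executes each benchmark test case in turn; in the real framework each
// case would create a k8s Job that drives the load client.
func (s *BenchmarkTestSuite) Run(t *testing.T, tests []BenchmarkTest) {
	for _, tc := range tests {
		t.Logf("Running benchmark test: %s", tc.ShortName)
		tc.Test(t, s)
	}
}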

Which issue(s) this PR fixes:

Fix #1365, Fix #2325, Close #2578

codecov bot commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.33%. Comparing base (0ebfae8) to head (3dc4472).
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3599      +/-   ##
==========================================
+ Coverage   68.25%   68.33%   +0.08%     
==========================================
  Files         170      170              
  Lines       20760    20780      +20     
==========================================
+ Hits        14170    14201      +31     
+ Misses       5568     5562       -6     
+ Partials     1022     1017       -5     

☔ View full report in Codecov by Sentry.

@@ -0,0 +1,40 @@
name: Benchmarking Tests at Scale
shawnh2 (Contributor, Author): question: should we schedule this CI as a cron job, or run it with every PR?

shawnh2 (Contributor, Author): or only run this if someone comments /benchmark?

Contributor: I'd vote to only make it run on pushes to main and release/v*.

Let's raise a follow-up issue to support running on PRs automatically (if it doesn't increase CI time) or using /benchmark.
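
For reference, a minimal sketch of that trigger in standard GitHub Actions syntax (illustrative only; the workflow in this PR may end up configured differently):

on:
  push:
    branches:
    - main
    - release/v*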

shawnh2 (Contributor, Author): sounds good!

@shawnh2 shawnh2 marked this pull request as ready for review June 16, 2024 09:10
@shawnh2 shawnh2 requested a review from a team as a code owner June 16, 2024 09:10
Signed-off-by: shawnh2 <[email protected]>
zirain (Member) commented Jun 17, 2024

where's the result?

shawnh2 (Contributor, Author) commented Jun 17, 2024

> where's the result?

In the CI stdout. Which way do we prefer to see the result? Comment back in the current thread?

zirain (Member) commented Jun 17, 2024

> > where's the result?
>
> In the CI stdout. Which way do we prefer to see the result? Comment back in the current thread?

that's an option.

arkodg (Contributor) commented Jun 17, 2024

thanks for building out this benchmarking suite!

I suggest configuring Envoy Proxy and Envoy Gateway memory and CPU limits before starting the test, and also making them top-level input args.

I like how this is a suite of individual tests, each test changing some parameter; the first one modifies HTTPRoute. A suggestion is to graph the values of the attributes below against changes in the input param (e.g. HTTPRoute), for each test within Benchmark:

  • Latency (P50, P90 and P99)
  • Error Rate (non-200)
  • Throughput
  • EG Memory usage
  • EG CPU usage
  • EnvoyProxy Memory usage
  • EnvoyProxy CPU usage
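
As a rough illustration of pinning the proxy's resources for a reproducible benchmark environment (field paths assume the EnvoyProxy API's Kubernetes provider settings, and the gateway controller's own limits would be set separately on its Deployment; verify against the version in use):

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: benchmark-proxy-config
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        container:
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi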

arkodg (Contributor) commented Jun 17, 2024

we are already using fortio in our repo today:

func runLoadAndWait(t *testing.T, timeoutConfig config.TimeoutConfig, done chan bool, aborter *periodic.Aborter, reqURL string) {

are there any benefits to using nighthawk here?
cc @guydc

shawnh2 (Contributor, Author) commented Jun 18, 2024

> A suggestion is to graph the values of the attributes below against changes in the input param (e.g. HTTPRoute)

Cool! These values can be easily retrieved from CP & DP metrics.

@@ -0,0 +1,53 @@
name: Benchmarking Tests at Scale
on:
pull_request:
shawnh2 (Contributor, Author):

change it to push once this PR is good to go

Member: Not sure if we need to run this on every push, or on a schedule.

shawnh2 (Contributor, Author): as suggested in #3599 (comment), we can run this with the /benchmark command.

Alice-Lilith (Member):

I think it is good to run it on pull/push. In general, I like all major testing/linting/etc. CI suites to run on every push even if there is no PR. Makes it easy to get your branches in order without having a PR that goes through a bunch of edits to get things working. I dislike the idea of only ever running it when users comment /benchmark. The general idea should be for CI to alert us when incoming changes degrade (or improve) performance.

spec:
serviceAccountName: default
containers:
- name: nighthawk-server
shawnh2 (Contributor, Author), Jun 18, 2024:

We can replace this test server with a much simpler one, like an echo server, in a follow-up PR.

@shawnh2 shawnh2 added the hold do not merge label Jun 18, 2024
guydc (Contributor) commented Jun 25, 2024

> are there any benefits to using nighthawk here?

Nighthawk:

  • Official benchmarking tool for Envoy: makes performance-related discussions with upstream Envoy easier, as it helps rule out the load-generation tech as a culprit for issues.
  • Uses the same configuration as envoy proxy (easier for us to define client and server behavior, supports very advanced settings)

Fortio:

  • Easier to interact with in golang (download, compile, call)
  • Very popular and maintained performance tool in the envoy ecosystem

Istio supports execution of their dataplane benchmark tests with either Fortio or Nighthawk, the latter being the more recent addition: istio/istio#21161. Istio load tests are currently executed with Fortio.

Overall, I'm +1 for using nighthawk as the data plane benchmark tool. The current use of fortio in our project is mostly for easy execution of load during e2e tests that require it (retries, shutdown, upgrade, ... ). I believe that we can implement our own simple load generator based on net/http and remove the fortio dependency in the future.
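
As a rough sketch of what such a net/http-based load generator could look like (illustrative only, not project code; the URL and parameters are placeholders):

package main

import (
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

// loadGen fires GET requests at url from a fixed number of workers for the
// given duration, counting total requests and error/non-2xx responses.
func loadGen(url string, workers int, d time.Duration) (total, errs int64) {
	deadline := time.Now().Add(d)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			client := &http.Client{Timeout: 5 * time.Second}
			for time.Now().Before(deadline) {
				resp, err := client.Get(url)
				atomic.AddInt64(&total, 1)
				if err != nil {
					atomic.AddInt64(&errs, 1)
					continue
				}
				if resp.StatusCode >= 300 {
					atomic.AddInt64(&errs, 1)
				}
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	return total, errs
}

func main() {
	total, errs := loadGen("http://www.benchmark.com/", 8, 10*time.Second)
	fmt.Printf("requests=%d errors=%d\n", total, errs)
}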

shawnh2 (Contributor, Author) commented Jun 25, 2024

Yes, it does not have the privilege to post the report as a comment; I shall upload a report along with this PR.

The report can be posted to a PR as a comment with the pull_request_target event, but that's still not working: the error shows Resource not accessible by integration, so I'll stick to uploading the report along with this PR.
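
For context, a sketch of the comment-posting approach being described (assumed workflow syntax using the real actions/github-script action; the PR ultimately ships the report as a committed file instead):

on:
  pull_request_target:
permissions:
  pull-requests: write
jobs:
  post-report:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/github-script@v7
      with:
        script: |
          await github.rest.issues.createComment({
            owner: context.repo.owner,
            repo: context.repo.repo,
            issue_number: context.issue.number,
            body: "Benchmark report: ...",
          });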

@shawnh2 shawnh2 requested review from arkodg and Xunzhuo June 25, 2024 14:59
@shawnh2 shawnh2 removed the hold do not merge label Jun 25, 2024
parentRefs:
- name: "{REF_GATEWAY_NAME}"
hostnames:
- "www.benchmark.com"
Contributor:

Can we also template out the hostname for this test so each HTTPRoute gets a unique hostname? Otherwise these routes won't reach Programmed and will bloat the Status.
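
One possible shape, following the existing "{REF_GATEWAY_NAME}" placeholder convention (the placeholder name here is illustrative, not necessarily what the PR adopts):

parentRefs:
- name: "{REF_GATEWAY_NAME}"
hostnames:
- "www.benchmark-{ROUTE_INDEX}.com"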

guydc (Contributor):

Maybe we can make this a bit more realistic and control num-routes-per-host? This way, we don't have one huge route table or many small ones... anyway, not critical for this time.

shawnh2 (Contributor, Author):

Makes sense, will do as a follow-up.

@@ -0,0 +1,925 @@
# Benchmark Report
Contributor: ❤️ thanks for generating this!

arkodg previously approved these changes Jun 25, 2024

arkodg (Contributor) left a comment: LGTM, thanks for adding this framework and the report!

@arkodg arkodg requested review from a team June 25, 2024 18:41
arkodg (Contributor) commented Jun 25, 2024

> [quotes the nighthawk question and @guydc's Nighthawk vs. Fortio comparison from above in full]

Non-blocking comment: my preference would be to run the client outside the k8s cluster, which is easier with fortio as a golang lib than with a containerized nighthawk client.
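
For illustration, driving fortio as a library from outside the cluster might look like this (a sketch assuming fortio's public Go API, fhttp.RunHTTPTest; verify field names against the fortio version pinned in go.mod, and the URL is a placeholder):

package main

import (
	"fmt"
	"time"

	"fortio.org/fortio/fhttp"
	"fortio.org/fortio/periodic"
)

func main() {
	opts := fhttp.HTTPRunnerOptions{
		RunnerOptions: periodic.RunnerOptions{
			QPS:        100,
			Duration:   30 * time.Second,
			NumThreads: 8,
		},
	}
	// Point at the Gateway's external address (placeholder).
	opts.URL = "http://gateway.example.com/"
	res, err := fhttp.RunHTTPTest(&opts)
	if err != nil {
		panic(err)
	}
	// RetCodes maps HTTP status code to count (assumed field name).
	fmt.Printf("return codes: %v\n", res.RetCodes)
}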

Alice-Lilith
Alice-Lilith previously approved these changes Jun 26, 2024
> [re-displays Alice-Lilith's earlier comment on running the workflow on every pull/push; see above]

guydc previously approved these changes Jun 26, 2024

guydc (Contributor) left a comment: LGTM! Looking forward to seeing this run in CI!

> [re-displays guydc's earlier comment on controlling num-routes-per-host; see above]

arkodg (Contributor) commented Jun 26, 2024

there's a lint error:

Error: ./test/benchmark/benchmark_report.md:814: ocurred ==> occurred

@shawnh2 shawnh2 dismissed stale reviews from guydc, Alice-Lilith, and arkodg via 3dc4472 June 27, 2024 00:52
shawnh2 (Contributor, Author) commented Jun 27, 2024

> there's a lint error:
>
> Error: ./test/benchmark/benchmark_report.md:814: ocurred ==> occurred

Seems like a typo from Nighthawk.

@arkodg arkodg requested review from guydc and Alice-Lilith June 27, 2024 00:57
shawnh2 (Contributor, Author) commented Jun 27, 2024

/retest

@arkodg arkodg merged commit 2a86997 into envoyproxy:main Jun 27, 2024
24 checks passed
@shawnh2 shawnh2 deleted the benchmark-ci branch June 28, 2024 01:33

Successfully merging this pull request may close these issues.

  • Add Performance test in CI
  • Envoy Gateway performance at scale
7 participants