
Add benchmark testing framework #3599

Merged: 32 commits merged into envoyproxy:main on Jun 27, 2024
Conversation

@shawnh2 (Contributor, Author) commented Jun 12, 2024

What this PR does / why we need it:

Taking #2578 forward, but implementing the benchmark testing framework in code.

Similar to ConformanceSuite and ConformanceTest, I define a BenchmarkSuite and a BenchmarkTest (see the sketch below). Each benchmark test case will be run as a k8s Job, so that:

  • we can easily control the benchmark environment, such as the Pod's resource limits and requests
  • we can add more test cases in the future without any breaking changes
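
A minimal sketch of the shape this suggests, with names modeled on the conformance suite; the actual types, fields, and signatures in this PR may differ:

package benchmark

import (
	"testing"
)

// Hypothetical sketch only: BenchmarkTest mirrors ConformanceTest as described
// above; field names are assumptions, not necessarily the PR's actual code.
type BenchmarkTest struct {
	ShortName   string
	Description string
	Test        func(*testing.T, *BenchmarkTestSuite)
}

type BenchmarkTestSuite struct {
	// Options controlling the benchmark environment, e.g. the load profile
	// and the resource limits/requests applied to the k8s Job per test case.
	RPS         int
	Connections int
	Duration    int // seconds
}

// Run executes each benchmark test case in turn; in the real framework each
// case would create a k8s Job that drives the load client.
func (s *BenchmarkTestSuite) Run(t *testing.T, tests []BenchmarkTest) {
	for _, tc := range tests {
		t.Logf("Running benchmark test: %s", tc.ShortName)
		tc.Test(t, s)
	}
}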

Which issue(s) this PR fixes:

Fix #1365, Fix #2325, Close #2578

codecov bot commented Jun 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.33%. Comparing base (0ebfae8) to head (3dc4472).
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3599      +/-   ##
==========================================
+ Coverage   68.25%   68.33%   +0.08%     
==========================================
  Files         170      170              
  Lines       20760    20780      +20     
==========================================
+ Hits        14170    14201      +31     
+ Misses       5568     5562       -6     
+ Partials     1022     1017       -5     

☔ View full report in Codecov by Sentry.

@@ -0,0 +1,40 @@
name: Benchmarking Tests at Scale
shawnh2 (Contributor, Author): question: should we schedule this CI as a cron job, or run it with every PR?

shawnh2 (Contributor, Author): or only run this if someone comments /benchmark?

Contributor: I'd vote to only make it run on pushes to main and release/v*.

Let's raise a follow-up issue to support running on PRs automatically (if it doesn't increase CI time) or using /benchmark.
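
For reference, a minimal sketch of that trigger in standard GitHub Actions syntax (illustrative only; the workflow in this PR may end up configured differently):

on:
  push:
    branches:
    - main
    - release/v*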

shawnh2 (Contributor, Author): sounds good!

@shawnh2 shawnh2 marked this pull request as ready for review June 16, 2024 09:10
@shawnh2 shawnh2 requested a review from a team as a code owner June 16, 2024 09:10
Signed-off-by: shawnh2 <[email protected]>
zirain (Member) commented Jun 17, 2024

where's the result?

shawnh2 (Contributor, Author) commented Jun 17, 2024

> where's the result?

In the CI stdout. Which way do we prefer to see the result? Comment back in the current thread?

zirain (Member) commented Jun 17, 2024

> > where's the result?
>
> In the CI stdout. Which way do we prefer to see the result? Comment back in the current thread?

that's an option.

arkodg (Contributor) commented Jun 17, 2024

thanks for building out this benchmarking suite!

I suggest configuring Envoy Proxy and Envoy Gateway memory and CPU limits before starting the test, and also making them top-level input args.

I like how this is a suite of individual tests, each test changing some parameter; the first one modifies HTTPRoute. A suggestion is to graph the values of the attributes below against changes in the input param (e.g. HTTPRoute), for each test within Benchmark:

  • Latency (P50, P90 and P99)
  • Error Rate (non-200)
  • Throughput
  • EG Memory usage
  • EG CPU usage
  • EnvoyProxy Memory usage
  • EnvoyProxy CPU usage
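
As a rough illustration of pinning the proxy's resources for a reproducible benchmark environment (field paths assume the EnvoyProxy API's Kubernetes provider settings, and the gateway controller's own limits would be set separately on its Deployment; verify against the version in use):

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: benchmark-proxy-config
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        container:
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
            limits:
              cpu: "1"
              memory: 1Gi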

arkodg (Contributor) commented Jun 17, 2024

we are already using fortio in our repo today:

func runLoadAndWait(t *testing.T, timeoutConfig config.TimeoutConfig, done chan bool, aborter *periodic.Aborter, reqURL string) {

are there any benefits to using nighthawk here?
cc @guydc

shawnh2 (Contributor, Author) commented Jun 18, 2024

> A suggestion is to graph the values of the attributes below against changes in the input param (e.g. HTTPRoute)

Cool! These values can be easily retrieved from CP & DP metrics.

@@ -0,0 +1,53 @@
name: Benchmarking Tests at Scale
on:
pull_request:
shawnh2 (Contributor, Author):

change it to push once this PR is good to go

Member: Not sure if we need to run this on every push, or on a schedule.

shawnh2 (Contributor, Author): as suggested in #3599 (comment), we can run this with the /benchmark command.

Alice-Lilith (Member):

I think it is good to run it on pull/push. In general, I like all major testing/linting/etc. CI suites to run on every push even if there is no PR. Makes it easy to get your branches in order without having a PR that goes through a bunch of edits to get things working. I dislike the idea of only ever running it when users comment /benchmark. The general idea should be for CI to alert us when incoming changes degrade (or improve) performance.

spec:
serviceAccountName: default
containers:
- name: nighthawk-server
shawnh2 (Contributor, Author), Jun 18, 2024:

We can replace this test server with a much simpler one, like an echo server, in a follow-up PR.

@shawnh2 shawnh2 added the hold do not merge label Jun 18, 2024
guydc (Contributor) commented Jun 25, 2024

> are there any benefits to using nighthawk here?

Nighthawk:

  • Official benchmarking tool for Envoy: makes performance-related discussions with upstream Envoy easier, as it helps rule out the load-generation tech as a culprit for issues.
  • Uses the same configuration as envoy proxy (easier for us to define client and server behavior, supports very advanced settings)

Fortio:

  • Easier to interact with in golang (download, compile, call)
  • Very popular and maintained performance tool in the envoy ecosystem

Istio supports execution of their dataplane benchmark tests with either Fortio or Nighthawk, the latter being the more recent addition: istio/istio#21161. Istio load tests are currently executed with Fortio.

Overall, I'm +1 for using nighthawk as the data plane benchmark tool. The current use of fortio in our project is mostly for easy execution of load during e2e tests that require it (retries, shutdown, upgrade, ... ). I believe that we can implement our own simple load generator based on net/http and remove the fortio dependency in the future.
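
As a rough sketch of what such a net/http-based load generator could look like (illustrative only, not project code; the URL and parameters are placeholders):

package main

import (
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

// loadGen fires GET requests at url from a fixed number of workers for the
// given duration, counting total requests and error/non-2xx responses.
func loadGen(url string, workers int, d time.Duration) (total, errs int64) {
	deadline := time.Now().Add(d)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			client := &http.Client{Timeout: 5 * time.Second}
			for time.Now().Before(deadline) {
				resp, err := client.Get(url)
				atomic.AddInt64(&total, 1)
				if err != nil {
					atomic.AddInt64(&errs, 1)
					continue
				}
				if resp.StatusCode >= 300 {
					atomic.AddInt64(&errs, 1)
				}
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	return total, errs
}

func main() {
	total, errs := loadGen("http://www.benchmark.com/", 8, 10*time.Second)
	fmt.Printf("requests=%d errors=%d\n", total, errs)
}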

shawnh2 (Contributor, Author) commented Jun 25, 2024

Yes, it does not have the privilege to post the report as a comment; I shall upload a report along with this PR.

The report can be posted to a PR as a comment with the pull_request_target event, but that's still not working: the error shows Resource not accessible by integration, so I'll stick to uploading the report along with this PR.
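
For context, a sketch of the comment-posting approach being described (assumed workflow syntax using the real actions/github-script action; the PR ultimately ships the report as a committed file instead):

on:
  pull_request_target:
permissions:
  pull-requests: write
jobs:
  post-report:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/github-script@v7
      with:
        script: |
          await github.rest.issues.createComment({
            owner: context.repo.owner,
            repo: context.repo.repo,
            issue_number: context.issue.number,
            body: "Benchmark report: ...",
          });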

@shawnh2 shawnh2 requested review from arkodg and Xunzhuo June 25, 2024 14:59
@shawnh2 shawnh2 removed the hold do not merge label Jun 25, 2024
parentRefs:
- name: "{REF_GATEWAY_NAME}"
hostnames:
- "www.benchmark.com"
Contributor:

Can we also template out the hostname for this test so each HTTPRoute gets a unique hostname? Otherwise these routes won't reach Programmed and will bloat the Status.
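
One possible shape, following the existing "{REF_GATEWAY_NAME}" placeholder convention (the placeholder name here is illustrative, not necessarily what the PR adopts):

parentRefs:
- name: "{REF_GATEWAY_NAME}"
hostnames:
- "www.benchmark-{ROUTE_INDEX}.com"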

guydc (Contributor):

Maybe we can make this a bit more realistic and control num-routes-per-host? This way, we don't have one huge route table or many small ones... anyway, not critical for this time.

shawnh2 (Contributor, Author):

Makes sense, will do as a follow-up.

@@ -0,0 +1,925 @@
# Benchmark Report
Contributor: ❤️ thanks for generating this!

arkodg previously approved these changes Jun 25, 2024

arkodg (Contributor) left a comment: LGTM, thanks for adding this framework and the report!

@arkodg arkodg requested review from a team June 25, 2024 18:41
arkodg (Contributor) commented Jun 25, 2024

> [quotes the nighthawk question and @guydc's Nighthawk vs. Fortio comparison from above in full]

Non-blocking comment: my preference would be to run the client outside the k8s cluster, which is easier with fortio as a golang lib than with a containerized nighthawk client.
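
For illustration, driving fortio as a library from outside the cluster might look like this (a sketch assuming fortio's public Go API, fhttp.RunHTTPTest; verify field names against the fortio version pinned in go.mod, and the URL is a placeholder):

package main

import (
	"fmt"
	"time"

	"fortio.org/fortio/fhttp"
	"fortio.org/fortio/periodic"
)

func main() {
	opts := fhttp.HTTPRunnerOptions{
		RunnerOptions: periodic.RunnerOptions{
			QPS:        100,
			Duration:   30 * time.Second,
			NumThreads: 8,
		},
	}
	// Point at the Gateway's external address (placeholder).
	opts.URL = "http://gateway.example.com/"
	res, err := fhttp.RunHTTPTest(&opts)
	if err != nil {
		panic(err)
	}
	// RetCodes maps HTTP status code to count (assumed field name).
	fmt.Printf("return codes: %v\n", res.RetCodes)
}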

Alice-Lilith
Alice-Lilith previously approved these changes Jun 26, 2024
> [re-displays Alice-Lilith's earlier comment on running the workflow on every pull/push; see above]

guydc previously approved these changes Jun 26, 2024

guydc (Contributor) left a comment: LGTM! Looking forward to seeing this run in CI!

> [re-displays guydc's earlier comment on controlling num-routes-per-host; see above]

arkodg (Contributor) commented Jun 26, 2024

there's a lint error:

Error: ./test/benchmark/benchmark_report.md:814: ocurred ==> occurred

@shawnh2 shawnh2 dismissed stale reviews from guydc, Alice-Lilith, and arkodg via 3dc4472 June 27, 2024 00:52
shawnh2 (Contributor, Author) commented Jun 27, 2024

> there's a lint error:
>
> Error: ./test/benchmark/benchmark_report.md:814: ocurred ==> occurred

Seems like a typo from Nighthawk.

@arkodg arkodg requested review from guydc and Alice-Lilith June 27, 2024 00:57
shawnh2 (Contributor, Author) commented Jun 27, 2024

/retest

@arkodg arkodg merged commit 2a86997 into envoyproxy:main Jun 27, 2024
24 checks passed
@shawnh2 shawnh2 deleted the benchmark-ci branch June 28, 2024 01:33

Successfully merging this pull request may close these issues.

  • Add Performance test in CI
  • Envoy Gateway performance at scale
7 participants