Regression test for firewall #783
base: main
Conversation
Force-pushed from 36d5a98 to 107acaf
Force-pushed from 107acaf to 6ec0272
if [ -z "$output" ]; then | ||
echo "No performance regression detected." | ||
else | ||
echo "Performance regression detected! Failing the build." |
This is run on the public GitHub-owned runner - we have no control over how busy the machine is at any point in time, so I'm not sure those numbers are meaningful.
Small differences might not show up, but a real slowdown should still be detected - something that would otherwise slip through entirely, since we don't check for regressions today.
Increasing the number of runs and taking the average, or self-hosting a runner, would be alternatives.
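If we go the averaging route, one lightweight option (a sketch only, not something this PR does) would be to lean on Criterion's own sampling flags instead of wrapping the job in a loop; the flag names are standard Criterion CLI options, but the values below are placeholders:

```sh
# Longer warm-up/measurement windows and more samples per benchmark average out
# run-to-run noise at the cost of CI time; the values here are illustrative only.
cargo bench --features test_utils --bench firewall_bench "64" -- \
  --warm-up-time 3 \
  --measurement-time 10 \
  --sample-size 200 \
  --save-baseline current
```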
Maybe, but it also means that this is prone to being flaky. IMO, without a runner on a physical machine that isn't used for anything else while the benchmarks are being run*, this shouldn't be merged in.
[*] This assumes the machine is also configured correctly to limit thermal throttling and other things that can impact benchmarks.
We should then think about how to do that, as the same problem would come up for any other performance tests, even with boringtun.
working-directory: crates/telio-firewall
run: |
  result=$(critcmp current baseline -t 5)
  output=$(echo "$result" | awk '{if (NR > 2) { if ($10 == 1.00 || $14 == 1.00) { if ($10 < $14) {exit 1} } else { if ($13 < $17) {exit 1}}}}')
nit: this is very unclear - I think a comment explaining the format would be a good addition
- name: Telio firewall benchmarks
  working-directory: crates/telio-firewall
  run: cargo bench --features test_utils --bench firewall_bench "64" -- --warm-up-time 1 --measurement-time 1
  run: cargo bench --features test_utils --bench firewall_bench "64" -- --warm-up-time 1 --measurement-time 1 --save-baseline current
I think that this benchmark run (and the other one) should be pinned to a single physical core (and only to that one) to avoid noise from scheduling (can be done with taskset), but this only makes sense when we run on a known machine where we can prevent other processes from being scheduled on our selected core.
Overall, I think this is a good source of ideas for running benchmarks in CI: https://manuel.bernhardt.io/posts/2023-11-16-core-pinning/ and/or https://easyperf.net/blog/2019/08/02/Perf-measurement-environment-on-Linux#7-disable-address-space-randomization
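For reference, a minimal sketch of what core pinning could look like on a dedicated Linux runner (the core index and the setarch call to disable ASLR are assumptions for illustration, not part of this PR):

```sh
# Pin the benchmark process to one core and disable address-space randomization
# to cut scheduling and memory-layout noise; only worthwhile on a dedicated machine.
setarch "$(uname -m)" --addr-no-randomize taskset -c 2 \
  cargo bench --features test_utils --bench firewall_bench "64" -- --save-baseline current
```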
Problem
We have benchmarks for the firewall, but we don't check future changes for regressions against a baseline.
Solution
Add a regression test for the firewall, using the critcmp crate with v5.0.0-rc3 as the baseline. critcmp lists benchmarks alphabetically, so a speed-up could also cause the check to fail; awk is therefore used to filter out speed-ups.
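For illustration, a rough sketch of how the pieces above could fit together as workflow steps (the step names, the tag checkout, and the critcmp install step are assumptions made here; the crate path, tag, features, and bench invocation come from this PR):

```yaml
- name: Build baseline benchmarks (v5.0.0-rc3)
  working-directory: crates/telio-firewall
  run: |
    git checkout v5.0.0-rc3
    cargo bench --features test_utils --bench firewall_bench "64" -- --save-baseline baseline
    git checkout -
- name: Telio firewall benchmarks
  working-directory: crates/telio-firewall
  run: cargo bench --features test_utils --bench firewall_bench "64" -- --save-baseline current
- name: Compare against baseline
  working-directory: crates/telio-firewall
  run: |
    cargo install critcmp
    critcmp current baseline -t 5  # only report differences above the 5% threshold
```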
☑️ Definition of Done checklist