Catch performance regressions in CI #1560
Comments
@at-wat @bshi does this sound like a good plan? Any mistakes I'm running headfirst into? :) @ashellunts @scorpionknifes, if you are interested, this could be a really fun task to take on!
I wrote this benchmark, but didn't commit it.
@Sean-Der I think it's better to make it a pointer, not for performance but to make it less confusing, as I wrote in #1540 (review).
I was trying out benchstat on GitHub Actions. The results are sometimes inconsistent. I'm currently testing using a reference. I don't know if this is a worker issue, since the test runs in two different jobs on GitHub Actions. The exact code I used is https://github.com/scorpionknifes/webrtc/blob/bench/.github/workflows/bench.yaml — the old and new times are measured on the same code.
All we need to add is detection of the % change. I don't think we need benchstat; we can compare the overall time taken. (Below are the overall timings, in pairs.)
I'm not sure about committing reports, as GitHub workers may produce differing benchmark results.
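The "detect % change" idea above can be sketched in a few lines of Go. The 5% threshold and the ns/op figures are illustrative, not agreed-upon values:

```go
package main

import "fmt"

// percentChange returns the relative change from old to new, in percent.
func percentChange(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// regressed reports whether the new timing exceeds the old one by more
// than threshold percent.
func regressed(oldNs, newNs, threshold float64) bool {
	return percentChange(oldNs, newNs) > threshold
}

func main() {
	// Hypothetical ns/op figures from two benchmark runs.
	fmt.Println(percentChange(1000, 1100)) // prints 10
	fmt.Println(regressed(1000, 1100, 5))  // true: over the 5% budget
	fmt.Println(regressed(1000, 1030, 5))  // false: within the budget
}
```

As the comment notes, noisy CI workers mean a single-run threshold like this will produce false positives; that is the trade-off benchstat's statistical comparison is designed to address.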
I think performance on the CI environment really depends on timing, since many CI jobs share a host machine. (Even if a virtual core is assigned to the job, memory and I/O bandwidth are limited by the host.)
@Sean-Der any thoughts about #1560 (comment)?
@at-wat sorry, I missed that! I am 100% in support of leaving it as a pointer; I agree with your points.
@scorpionknifes that is a very good point. Maybe we just test the current commit against |
I spent some time doing research today and found https://pythonspeed.com/articles/consistent-benchmarking-in-ci/. Cachegrind looks promising! This might be a 'master only' thing, depending on how much longer it makes the tests run.
I'm late to the party but have a couple of cents. The discussion is focused on how to track progress, with an emphasis on microbenchmarking, but I'd like to point out that perhaps this is premature, given we don't have much clarity on what we should be measuring (and, by extension, optimizing). The latter is, admittedly, a trickier problem, but my intuition is that we could be more effective if we could find a way to get good profiling data [a] from either real-world workloads (which would need to be volunteered) or synthetic ones (a.k.a. macrobenchmarks; these would require compute resources / $$$ that the project may not have access to). So it seems worth (1) polling users to gather data about the top N bottlenecks [b] that exist in practice. What we learn could then (2) feed into efforts to build the appropriate micro/macro-benchmarks that can reproduce said bottlenecks. And finally, after we have experience with what works well and what doesn't (to put it another way: what mix of micro and macro benchmarking is appropriate), we could build the appropriate tooling to track progress.

[a] In #1516, I volunteered data collected by adding a couple of lines of code to my synthetic load test application. So the friction in enabling data collection is IMO low, but is it low enough that we can actually convince interested parties to contribute?
I think Sean is merely speaking about catching regressions, not about general performance tuning. While microbenchmarks are misleading, they are good for catching regressions, and every performance regression should be investigated, even if it's only so that a human can decide that it doesn't matter in practice.
Currently we have nothing to catch performance regressions. We have a few benchmarks spread out across repos, but no one consistently runs them.
We should use benchstat and commit baseline reports in the `.github` folder. When we see a percentage change (5%?), we should fail the build. IPFS has already implemented it here; we just need to port it to GitHub Actions. This also looks relevant.
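A gate against a committed baseline report could be sketched like this in Go, parsing standard `go test -bench` output and flagging slowdowns past a threshold. The benchmark name, the inline sample output, and the 5% budget are assumptions for illustration:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseBench extracts name -> ns/op from `go test -bench` output lines
// of the form: BenchmarkName-8   1000000   1053 ns/op
func parseBench(out string) map[string]float64 {
	res := map[string]float64{}
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) >= 4 && strings.HasPrefix(f[0], "Benchmark") && f[3] == "ns/op" {
			if ns, err := strconv.ParseFloat(f[2], 64); err == nil {
				res[f[0]] = ns
			}
		}
	}
	return res
}

// compare returns the benchmarks that slowed down by more than threshold
// percent relative to the committed baseline report.
func compare(baseline, current map[string]float64, threshold float64) []string {
	var failed []string
	for name, oldNs := range baseline {
		if newNs, ok := current[name]; ok && (newNs-oldNs)/oldNs*100 > threshold {
			failed = append(failed, name)
		}
	}
	return failed
}

func main() {
	// In CI, baseline would be read from a file under .github and
	// current from the fresh benchmark run; inlined here for brevity.
	baseline := parseBench("BenchmarkWrite-8 1000000 1000 ns/op")
	current := parseBench("BenchmarkWrite-8 1000000 1100 ns/op")
	fmt.Println(compare(baseline, current, 5)) // the 10% slowdown trips the 5% gate
}
```

A CI step would run this after the benchmarks and exit non-zero when `compare` returns anything, which is the "fail the build" behavior described above; benchstat would replace the naive single-run comparison with a statistically sound one.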