Continuous performance benchmarking #234
Comments
This would be very useful. Is asv the right tool for this though? I always thought it was aimed at smaller benchmarks that run in a matter of seconds.
Matt has suggested that we could use the coiled/benchmarks repo to run cubed benchmarks. We would need to provide the API key for the AWS/GCP account that pays for the compute, though. Alternatively we might just look there for inspiration, rather than actually merging cubed benchmarks.
I was thinking that one good way to solve this problem in general would be to make a database, host it somewhere, and append to it whenever we run a new benchmark job... Then I looked closer at coiled/benchmarks and saw that that's pretty much exactly what they do.

I suggest we fork it into a new repo. Once that works locally we can maybe add a GitHub action to this repo that imports and runs the benchmarks defined in the benchmarks repo under certain conditions (e.g. ...).

A first step might be to decide what information we would want to record, and adjust the database schema accordingly. The new compute id in #382 gives us the unique run id at the top.

What do you think? Is that over-engineering it?
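Purely as an illustration of the "decide what to record" step, here is a minimal sketch of what such a results database might look like, using SQLite from the standard library. The table name and columns are assumptions drawn from this thread (run id from #382, executor, timings), not the actual coiled/benchmarks schema:

```python
import sqlite3

# Hypothetical schema: one row per benchmark run, keyed by the compute id
# from #382 acting as the unique run id.
conn = sqlite3.connect("benchmark_results.db")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS benchmark_runs (
        run_id TEXT,            -- unique compute/run id
        benchmark_name TEXT,    -- e.g. "quadratic_means_10GB" (illustrative)
        executor TEXT,          -- e.g. "lithops", "processes" (illustrative)
        cubed_version TEXT,
        start_time TEXT,        -- ISO 8601
        end_time TEXT,
        duration_s REAL,
        succeeded INTEGER       -- 1 if the run completed without error
    )
    """
)


def record_result(run_id, name, executor, version, start, end, ok):
    """Append one benchmark result to the database (start/end are datetimes)."""
    conn.execute(
        "INSERT INTO benchmark_runs VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (
            run_id, name, executor, version,
            start.isoformat(), end.isoformat(),
            (end - start).total_seconds(), int(ok),
        ),
    )
    conn.commit()
```

A CI job could then append one row per benchmark per run, and regressions could be detected by querying duration trends per benchmark name.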
Sounds great!
I noticed that the lithops release just now added this:

That sounds like it might be a useful thing to record. I was wondering in general how much information we want to store in the benchmark results database, i.e. just the total job execution time, or the start and end times of every single container?
That's interesting. I hadn't seen that. If it's easy to store fine-grained information for each task then go ahead, but I don't think it's needed at the moment.
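If we did want per-task granularity, a hypothetical companion table keyed on the same run id might look like the sketch below. The table and column names are assumptions for illustration; where the per-task timestamps would come from (e.g. lithops' per-call stats) is left open:

```python
import sqlite3

conn = sqlite3.connect("benchmark_results.db")

# Hypothetical per-task table complementing the run-level table above;
# one row per container/task, joined to benchmark_runs via run_id.
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS task_timings (
        run_id TEXT,              -- matches benchmark_runs.run_id
        task_id TEXT,
        operation_name TEXT,      -- which array/operation the task belonged to
        task_create_time TEXT,    -- ISO 8601
        function_start_time TEXT,
        function_end_time TEXT
    )
    """
)
conn.commit()
```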
It would be useful if Cubed had performance tests that could check for regressions. This would be especially useful to prevent complex optimizations (e.g. #221) from accidentally degrading performance or stability.
There are really three things to test here: scaling (can we even run a large workload?), stability (how likely is it to fail?), and performance (how fast does it complete?). This is therefore related to, but hopefully somewhat separable from, #7.
Xarray uses airspeed velocity ("asv") for performance regression testing. Once set up, it can be run on any PR by adding a `run-benchmark` label, or run locally. (I think there is supposed to be an HTML dashboard of results somewhere, but the link to it on the readme badge appears to be broken.)
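For concreteness, here is a minimal sketch of what an asv benchmark for cubed might look like; asv discovers classes in the configured benchmarks directory and times any method whose name starts with `time_`. The class name, array sizes, and `allowed_mem` value are illustrative assumptions, not an existing benchmark suite:

```python
# benchmarks/benchmarks.py -- hypothetical asv benchmark module
import cubed
import cubed.array_api as xp


class TimeElementwiseAdd:
    """Time a small end-to-end cubed computation.

    The array size here is deliberately modest so the benchmark runs in
    seconds; large-scale scaling runs would likely need a different
    harness than asv's default timing approach.
    """

    def setup(self):
        # allowed_mem limits per-task memory (assumed recent cubed Spec API)
        spec = cubed.Spec(allowed_mem="1GB")
        self.a = xp.ones((5000, 5000), chunks=(1000, 1000), spec=spec)
        self.b = xp.ones((5000, 5000), chunks=(1000, 1000), spec=spec)

    def time_add(self):
        c = xp.add(self.a, self.b)
        c.compute()
```

A label-triggered CI job could then run `asv continuous` against the base branch of a PR and fail (or comment) when a benchmark regresses beyond a chosen threshold.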