META: Benchmarks #2919
Comments
The options
Consider #2915
Yes, I'd rather we start off from having a small set of "core" benchmarks which are run on PRs. Comparisons with master using benchstat would be nice.
IMO this could also be just a list in the workflow YAML file. The non-essential benchmarks we can disable with build tags or flags, maybe? Rest LGTM. :)
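For illustration, here is a minimal sketch of the build-tag approach; the file, package, and benchmark names are hypothetical and not taken from the repository. With a constraint like this, a heavy benchmark is only compiled when the tag is passed explicitly, so the default PR run never executes it:

```go
//go:build heavy_benchmarks

// Hypothetical test file: compiled only when the "heavy_benchmarks"
// build tag is supplied, e.g. `go test -tags heavy_benchmarks -bench=.`.
// A plain `go test -bench=.` on a PR skips it entirely.
package store // placeholder package name

import (
	"strconv"
	"testing"
)

// BenchmarkHeavyStore stands in for an expensive benchmark such as the
// goleveldb ones; the body just writes to a map so the sketch stays
// self-contained and runnable.
func BenchmarkHeavyStore(b *testing.B) {
	m := make(map[string][]byte)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		m[strconv.Itoa(i)] = []byte("value")
	}
}
```

A dedicated flag or environment variable checked inside the benchmark would achieve the same effect without a separate build tag.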
That might be an extra step that not all devs will know about. That is how it was implemented before: you needed to add your benchmark to a YAML file. To reduce the hassle to a minimum, I had to:
To avoid repeating the same mistakes, I propose using a specific tag similar to [...]. @thehowl WDYT?
I believe we should establish two levels of benchmarks:
However, upon reviewing the PR history, it's unclear whether we can conclude that everyone ignored benchmarks; most PRs were simply unrelated to them (case 1). Ignoring benchmarks is definitely more problematic in case 2. The quick wins I suggest include:
Let's summarize all the requests:

Requests Summary
Benchmark Execution on Pull Requests
Ping me if I missed something, or if some point needs more clarification or the output I got is wrong.
Working on it right now. Let me get a PoC ready and then we'll figure out a specific rule to meet this need. :)
Just an update: the current benchmark flow only runs benchdata, since #3007. The execution time of the pipeline is now under a minute. I think we can go from here. If somebody has a use for more benchmarks, they can obviously add them. But I think it's good to start off from here and gradually add more when a need for a benchmark running on every PR arises, rather than trying to run all the small, meaningless microbenchmarks we have.
I'm not sure about this, I don't remember much of the discussion.
I think more than anything we can find some "crucial" paths whose performance we want to keep track of; the GnoVM benchmarks are one example of them. We don't need to run a benchmark of all the operations described, but we can benchmark, for instance, how long [...]. For running a node, I don't know what would be good benchmarks. I'll hand it over to the Node Commander-in-Chief @zivkovicmilos to figure out good examples to have in gno.land/tm2.
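As a rough sketch of what such a "crucial path" benchmark could look like (everything below is hypothetical: the package, the helper, and the workload are placeholders, not existing GnoVM code), the idea is simply to wrap one end-to-end operation per iteration:

```go
package gnovmbench // hypothetical package, not part of the repository

import "testing"

// runCrucialPath stands in for whichever end-to-end operation ends up
// being tracked (e.g. parsing and executing a small Gno program). It
// is stubbed with a trivial workload so the sketch compiles on its own.
func runCrucialPath() {
	sum := 0
	for i := 0; i < 1_000; i++ {
		sum += i * i
	}
	_ = sum
}

// BenchmarkCrucialPath measures one crucial-path operation per
// iteration and reports allocations, which is the kind of number a
// per-PR benchmark would track over time.
func BenchmarkCrucialPath(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		runCrucialPath()
	}
}
```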
I don't think this is necessary if we are conservative with the benchmarks we run.
We already have a subset and the execution time is well below 10 minutes. We could start by enabling them again on the PRs, and then adding [...]
Description
Benchmark Results
Related issues:
Now that we have a Benchmark MVP working, I'd like to enumerate the requests from several parties regarding the next steps:
Tasks
Execute Benchmarks on Pull Requests
We had to roll back this feature due to concerns from developers about the time it takes for benchmarks to complete. Here are some proposals to mitigate this issue:
Use the -short flag to execute a smaller set of benchmarks. While this won't completely prevent regressions in the master branch, it will reduce their frequency.
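A minimal sketch of how the -short proposal could look in code; the package and benchmark names below are placeholders, not existing benchmarks from the repository:

```go
package bench // placeholder package name

import (
	"strings"
	"testing"
)

// BenchmarkLargeInput stands in for a long-running benchmark. Under
// `go test -short -bench=.` it skips itself, so the PR pipeline can
// run only the cheap "core" benchmarks while scheduled full runs
// still exercise it.
func BenchmarkLargeInput(b *testing.B) {
	if testing.Short() {
		b.Skip("skipping long benchmark in -short mode")
	}
	input := strings.Repeat("x", 1<<20) // ~1 MiB of input data
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = strings.Count(input, "x")
	}
}
```

Results from a -short run on master and on the PR branch could then be compared with benchstat, as suggested earlier in the thread.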
Disable Non-Essential Benchmarks via Flags
Some benchmarks, like those testing goleveldb (which are testing the database itself), don't need to run on every PR. Instead, we could focus on specific benchmarks that test particular cases. Here are some examples:
- examples package
- Benchdata performance
Benchmark Tools
Currently, two GitHub Actions can run Go benchmarks:
If none of these tools meets all our requirements, we should consider developing a new action that fits our use cases.
This is a call for comments (cc @moul and @thehowl).