Benchmark downloader (script + CI job) #12818

cameel · 2022-03-18T22:14:30Z

~~Depends on #12804.~~ Merged.

The script makes several CirecleCI and Github API requests to get the JSON with benchmark results for a specific PR or branch. I think it will be very handy for manual use, though my primary goal here is to use it in c_ext_benchmarks job to get benchmarks from the base PR and provide the diff. The PR extends that job to generate such diffs and attach them as artifacts.

After the experiences with #12181 and ethereum/solc-bin#35 I decided to use Python to make this more maintainable. This way I'll be able to cover it with tests, and, compared to my earlier Bash scripts, it was easy to make a module with helpers that we'll be able to reuse in other situations that require API access. While it isn't that short as a whole, it's more concise compared to the previous Bash scripts - the main logic specific to this task is concentrated in one function and the rest are reusable helpers and command-line argument definitions (which is more like docs than real code).

I'm on purpose not trying to do too much error handling to keep things simple. I.e. it either works or you get an exception. Only the cases that would not just crash on unexpected response (like finding no matching results) are explicitly checked.

~~This is still work in progress. I still need to~~ Everything is done:

Add some unit tests.
Handle the case where the job has not finished yet.
Test some corner cases to make sure things fail properly on errors.
Getting artifacts for the specific revision of base branch that the requested PR/branch is based on. This would be helpful in cases where base branch has moved on since the PR/branch was created and has extra commits that affect benchmark results.

Usage of the script

To get benchmarks from your branch and from develop, you just run this:

scripts/externalTests/download_benchmarks.py --branch develop
scripts/externalTests/download_benchmarks.py

This stores summarized-benchmarks-<branch>-<commit hash>.json and all-benchmarks-<branch>-<commit hash>.json in the current directory. Now you can run the script from #12804 on the files.

If you want to compare someone else's PR you probably have the PR number handy - you can pass that to the script using the --pr option instead of the branch.

You can also use --base-of-pr option to get artifacts from the branch your PR is based on (instead of develop). By default the script finds a CI run matching not only the base branch name but also the exact commit. This guarantees that even if the base branch has moved on and has extra commits, your comparison will not be influenced by these commits.

Sometimes you might still want to get the benchmark from the latest state of the base branch, e.g. if there was no successful run on the commit you're based on and you know that any extra commits are not going to distort results all that much - in that case you can use --any-commit flag to bypass the commit check.

cameel · 2022-03-23T12:37:37Z

This is now done and ready for review.

cameel · 2022-03-23T12:40:45Z

.circleci/config.yml

+              "${scripts_dir}/externalTests/benchmark_diff.py" table \
+                --output-format markdown \
+                --style humanized \
+                base-branch/summarized-benchmarks-*.json \
+                summarized-benchmarks.json > diff/benchmark-diff-summarized-table-markdown-humanized.md
+              "${scripts_dir}/externalTests/benchmark_diff.py" table \
+                --output-format markdown \
+                --style absolute \
+                base-branch/summarized-benchmarks-*.json \
+                summarized-benchmarks.json > diff/benchmark-diff-summarized-table-markdown-absolute.md
+              "${scripts_dir}/externalTests/benchmark_diff.py" inplace \
+                --style absolute \
+                base-branch/summarized-benchmarks-*.json \
+                summarized-benchmarks.json > diff/benchmark-diff-summarized-inplace-absolute.md
+              "${scripts_dir}/externalTests/benchmark_diff.py" inplace \
+                --style absolute \
+                base-branch/all-benchmarks-*.json \
+                all-benchmarks.json > diff/benchmark-diff-all-table-inplace-absolute.md


We might not need all of these. benchmark-diff-summarized-table-markdown-humanized.md will probably be the most useful one in practice. It will soon be possible to include extra columns with absolute values in it as well.

On the other hand the overhead of having all of them is negligible and also this acts as an end to end test for the script so I included them here anyway.

cameel · 2022-04-04T12:45:05Z

Note, about the c_ext_benchmarks failure:

Looking for pipelines that ran on branch benchmark-differ, commit fd0ec2e30a8d9c60bfbb15a984cf90bb0f571908.
[ERROR] Job c_ext_benchmarks has failed or is still running. Current status: blocked.

This is a limitation of the script with regard to PRs that are based one on the other. The job tries to access artifacts from the same job on the base PR. If both were pushed at the same time, it's likely that that base job has not finished yet. In that case if just needs to be reran after the base job finishes.

test/scripts/test_externalTests_benchmark_downloader.py

ekpyron

This is a lot of infrastructure for just downloading some files :-). But probably not much one can do to avoid that...

I haven't reviewed the python scripts at all, but - as usual for these PRs - I'd say it's fine to merge based on them working properly in CI - and it being nice to have them, but them being non-critical.

If anyone is up for going through the python scripts once, though, it wouldn't hurt.

cameel · 2022-04-11T19:55:33Z

This is a lot of infrastructure for just downloading some files :-). But probably not much one can do to avoid that...

Unfortunately, dealing with APIs reliably is just verbose. This is not the first time I need to access their APIs and I always end up with a really ugly Bash script. We really needed some wrappers for that that we can later reuse as needed.

CircleCI actually has its own CLI utility called circleci and I was initially hoping I'd get away just using that instead of writing it in Python. But it does not really have the same use case. It does not have commands for fetching info from the API, only for controlling the pipelines from CLI.

Also, to be fair, it adds more value than just "downloading some files". It automates finding the exact runs that you need which is more tedious than it seems at first. E.g. if your branch is based on develop and something got merged in the meantime, you have to dig into history find the old commit ID, then locate the right run via this horrible list of pipelines in CircleCI's UI with infinite scroll. It's painful to use, especially If what you're looking for is not immediately on the first page. With the script you just give it a PR number and it either gives you the right benchmark or fails. Or you go to the artifacts in your PR and find the diff attached there. You don't have to think about it, which is how it should work.

…rcleCI

cameel added the testing 🔨 label Mar 18, 2022

cameel self-assigned this Mar 18, 2022

cameel force-pushed the benchmark-downloader branch 8 times, most recently from fa1ddc6 to f41e05c Compare March 23, 2022 12:34

cameel added the has dependencies The PR depends on other PRs that must be merged first label Mar 23, 2022

cameel marked this pull request as ready for review March 23, 2022 12:37

cameel changed the title ~~Benchmark downloader~~ Benchmark downloader (script + CI job) Mar 23, 2022

cameel commented Mar 23, 2022

View reviewed changes

cameel changed the base branch from develop to benchmark-differ March 23, 2022 12:42

cameel force-pushed the benchmark-downloader branch 2 times, most recently from ad76d0f to 838be6d Compare March 28, 2022 11:35

cameel force-pushed the benchmark-differ branch from 60b45d0 to b3e3669 Compare March 28, 2022 16:04

cameel force-pushed the benchmark-downloader branch from 838be6d to 9dfa818 Compare March 28, 2022 16:04

cameel force-pushed the benchmark-differ branch from b3e3669 to fce9efc Compare March 28, 2022 18:10

cameel force-pushed the benchmark-downloader branch from 9dfa818 to 9c91771 Compare March 28, 2022 18:10

This was referenced Mar 28, 2022

Benchmark differ extensions #12869

Closed

External test benchmarking extensions #12522

Closed

cameel force-pushed the benchmark-differ branch from fce9efc to fd0ec2e Compare April 4, 2022 12:08

cameel force-pushed the benchmark-downloader branch from 9c91771 to 0fea909 Compare April 4, 2022 12:08

cameel mentioned this pull request Apr 4, 2022

[CI] Automate preparation of release binaries for publishing (via artifacts) #12181

Closed

cameel force-pushed the benchmark-differ branch from fd0ec2e to 8c9856c Compare April 5, 2022 13:43

cameel force-pushed the benchmark-downloader branch from 0fea909 to 71d9f08 Compare April 5, 2022 13:45

Base automatically changed from benchmark-differ to develop April 7, 2022 07:24

cameel force-pushed the benchmark-downloader branch from 71d9f08 to 1cf3fec Compare April 7, 2022 11:04

cameel removed the has dependencies The PR depends on other PRs that must be merged first label Apr 7, 2022

ekpyron reviewed Apr 11, 2022

View reviewed changes

test/scripts/test_externalTests_benchmark_downloader.py Show resolved Hide resolved

ekpyron previously approved these changes Apr 11, 2022

View reviewed changes

cameel dismissed ekpyron’s stale review via 69781f8 April 13, 2022 14:06

cameel force-pushed the benchmark-downloader branch from 1cf3fec to 69781f8 Compare April 13, 2022 14:06

cameel added 2 commits May 23, 2022 14:18

download_benchmarks: Script for downloading benchmark results from Ci…

9618cb9

…rcleCI

c_ext_benchmarks: Generate and store benchmark diffs

45dffe5

cameel force-pushed the benchmark-downloader branch from 69781f8 to 45dffe5 Compare May 23, 2022 12:18

chriseth approved these changes May 23, 2022

View reviewed changes

chriseth merged commit a8dc762 into develop May 23, 2022

chriseth deleted the benchmark-downloader branch May 23, 2022 13:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmark downloader (script + CI job) #12818

Benchmark downloader (script + CI job) #12818

cameel commented Mar 18, 2022 •

edited

Loading

cameel commented Mar 23, 2022

cameel Mar 23, 2022 •

edited

Loading

cameel commented Apr 4, 2022

ekpyron left a comment

cameel commented Apr 11, 2022 •

edited

Loading

Benchmark downloader (script + CI job) #12818

Benchmark downloader (script + CI job) #12818

Conversation

cameel commented Mar 18, 2022 • edited Loading

Usage of the script

cameel commented Mar 23, 2022

cameel Mar 23, 2022 • edited Loading

Choose a reason for hiding this comment

cameel commented Apr 4, 2022

ekpyron left a comment

Choose a reason for hiding this comment

cameel commented Apr 11, 2022 • edited Loading

cameel commented Mar 18, 2022 •

edited

Loading

cameel Mar 23, 2022 •

edited

Loading

cameel commented Apr 11, 2022 •

edited

Loading