We added some checks to the git-annex build workflow to spot some cases which could lead to slow(er) standalone build operation, but overall we do not have a good way to detect when git-annex slows down. We only see a reflection of that when we try a new snapshot build while sweeping through our datalad tests, and then it becomes an archeological expedition to find which change brought the pessimization.
It would be nice to establish automated and consistent benchmarking of git-annex builds as pertinent to datalad.
Proposal:
take some release of datalad still compatible with current annex build (so we could take current release ATM IIRC)
use asv benchmarks of that datalad but for benchmarking git-annex (so whenever we improve our datalad benchmarks collection, it automagically helps to benchmark git-annex)
establish datalad/git-annex-benchmarking on github
git subtree benchmarks from datalad
Include git-annex's master branch (from git://git.kitenet.net/git-annex) as annex-master branch known to that repo
I think asv can benchmark commits in another branch, while benchmarks would be in the master. So asv configuration would do that
add pythonish setup to make standalone install the git-annex so asv could deploy any given version of git-annex ( I wonder if there is smth like ccache for haskell ;))
asv run on new commits in annex-master and then asv gh-pages && datalad save -m "ASV results update" .asv && git push origin
provide github actions worker on a dedicated box (I have some, consistent timing), probably within singularity (at least some isolation and again -- consistency)
make that github action run only on pushes, not PRs, so we do not compromise security in any way
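A minimal sketch of the asv configuration idea from the list above: commits to benchmark come from annex-master, while the benchmark code is read from the working tree on master. The keys are regular asv.conf.json settings; the helper names, the project name, and the `.asv/...` directories are assumptions chosen to match the `.asv` path saved in the proposal. How asv would actually build and install git-annex for each commit is not covered here.

```python
import json


def make_asv_config(annex_branch="annex-master"):
    """Return a dict for asv.conf.json that benchmarks commits of one branch.

    Hypothetical helper for illustration, not part of asv or datalad.
    """
    return {
        "version": 1,
        "project": "git-annex",
        # Assumption: git-annex history is available in this same repository.
        "repo": ".",
        # Commits to benchmark are taken from this branch ...
        "branches": [annex_branch],
        # ... while the benchmarks themselves live in the checked-out tree.
        "benchmark_dir": "benchmarks",
        "environment_type": "virtualenv",
        # Assumption: keep all asv state under .asv so it can be saved as one path.
        "env_dir": ".asv/env",
        "results_dir": ".asv/results",
        "html_dir": ".asv/html",
    }


def write_asv_config(path="asv.conf.json", **kwargs):
    """Write the configuration as JSON to the given path."""
    with open(path, "w") as f:
        json.dump(make_asv_config(**kwargs), f, indent=2)
        f.write("\n")
```

Calling `write_asv_config()` in the root of the benchmarking repository would produce the file; the custom build step for git-annex would still need to be added on top.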
There is git-annex benchmark, which does a good job of
benchmarking a git-annex command or sequence of commands you choose.
It can output to json or csv, which lets benchmarks be compared and a
regression be flagged. At least in theory.. I don't have anything doing
that. Output of git-annex benchmark whereis --csv foo.csv:
Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
whereis,5.076051109441738e-2,4.914089405704959e-2,5.4101610224266704e-2,4.234978773428508e-3,2.050397769413241e-3,6.8122220186021265e-3
(But does not include startup speed in the benchmark currently. Could
add an option to include that, or maybe better a mode that only
benchmarks the startup speed.)
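As a rough sketch of how two such CSV files could be compared to flag a regression: the column names come from the output shown above, while the function names, the 10% threshold, and the rule "new lower bound above old upper bound" are assumptions for illustration, not something git-annex provides.

```python
import csv
import io


def read_benchmark_csv(text):
    """Map benchmark name -> dict of float columns (Mean, MeanLB, MeanUB, ...)."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("Name")
        rows[name] = {key: float(value) for key, value in row.items()}
    return rows


def find_regressions(baseline, candidate, min_ratio=1.1):
    """Return sorted names whose candidate timing is clearly slower.

    A benchmark is flagged only if its mean grew by at least `min_ratio`
    and the candidate's lower bound lies above the baseline's upper bound,
    so overlapping intervals are not reported.
    """
    flagged = []
    for name, new in candidate.items():
        old = baseline.get(name)
        if old is None:
            continue  # benchmark not present in the baseline
        slower = new["Mean"] >= min_ratio * old["Mean"]
        separated = new["MeanLB"] > old["MeanUB"]
        if slower and separated:
            flagged.append(name)
    return sorted(flagged)
```

Usage would be along the lines of `find_regressions(read_benchmark_csv(open("old.csv").read()), read_benchmark_csv(open("new.csv").read()))`, with both files produced by the same benchmark command on different builds.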
Proposal:
establish datalad/git-annex-benchmarking on github
git subtree benchmarks from datalad
Include git-annex's master branch (from git://git.kitenet.net/git-annex) as annex-master branch known to that repo
I think asv can benchmark commits in another branch, while benchmarks would be in the master. So asv configuration would do that
add pythonish setup to make standalone install the git-annex so asv could deploy any given version of git-annex (I wonder if there is smth like ccache for haskell ;))
apt build-dep git-annex-standalone, and that would be container to run asv in -- it would have all build-dependencies etc, and we use this for "worker" env (see below) later
git checkout annex-master && git pull --ff-only && git push origin annex-master && git checkout master
asv run on new commits in annex-master and then asv gh-pages && datalad save -m "ASV results update" .asv && git push origin
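The update cycle described by the two command chains above could be scripted roughly as follows. This is only a sketch: the function names are made up, `asv run NEW` is assumed to be the intended way to select "new commits", and `git pull --ff-only` assumes annex-master tracks the upstream git-annex remote.

```python
import shlex
import subprocess


def update_cycle_commands(annex_branch="annex-master", main_branch="master"):
    """Return the commands for one benchmarking cycle, in order."""
    return [
        # Sync the mirrored git-annex branch and publish it.
        ["git", "checkout", annex_branch],
        ["git", "pull", "--ff-only"],
        ["git", "push", "origin", annex_branch],
        ["git", "checkout", main_branch],
        # Benchmark commits that have no results yet (assumed selection).
        ["asv", "run", "NEW"],
        # Publish the HTML report, then record the raw results.
        ["asv", "gh-pages"],
        ["datalad", "save", "-m", "ASV results update", ".asv"],
        ["git", "push", "origin"],
    ]


def run_update_cycle(dry_run=True, **kwargs):
    """Print the commands, or execute them and stop at the first failure."""
    for argv in update_cycle_commands(**kwargs):
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)
```

With `dry_run=True` (the default) nothing is executed, which makes it easy to review the sequence before pointing a worker at it.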
WDYT @mih @kyleam @bpoldrack @jwodder
FYI @joeyh