
Add benchmarking workflow to evaluate classification and OD runtimes #638

Merged — 30 commits into main from add_benchmark_utilities, Jul 3, 2024

Conversation

ntlind
Contributor

@ntlind ntlind commented Jun 27, 2024

Use Cases

  • We want to look at recent commits to identify which one caused a change in performance
  • We want to quickly test the performance of potential changes locally using standardized benchmarks

Changes

  • Delete stress-testing.yml and associated files since we never trigger these manual jobs
  • Create an automatic benchmarking job to track Valor performance for classification and object detection evaluations. The workflow will fail if any of the individual evaluation tests takes longer than 30 seconds.
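The 30-second failure rule described above could be enforced with a small timing guard. The following is a minimal sketch, not the PR's actual implementation; the helper name and interface are assumptions for illustration:

```python
import time


def run_with_time_limit(fn, *args, limit_seconds=30.0, **kwargs):
    """Run a benchmark callable and fail if it exceeds the time limit.

    Hypothetical helper illustrating the "fail if any evaluation takes
    longer than 30 seconds" rule; the workflow's real check may differ.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    if elapsed > limit_seconds:
        raise RuntimeError(
            f"{fn.__name__} took {elapsed:.1f}s, "
            f"exceeding the {limit_seconds:.0f}s limit"
        )
    return result
```

Raising an exception (rather than merely logging) is what makes the CI job itself fail, since a non-zero exit code fails the workflow step.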

Output

(two screenshots of the benchmark workflow's output, not reproduced here)

@ntlind ntlind self-assigned this Jun 27, 2024
@ntlind ntlind added the enhancement New feature or request label Jun 27, 2024
@ntlind ntlind marked this pull request as ready for review July 1, 2024 07:09
@ntlind ntlind requested review from czaloom and ekorman as code owners July 1, 2024 07:09
@ntlind ntlind changed the title Add benchmarking workflow to provide performance signals for each commit Add benchmarking workflow to evaluate classification and OD runtimes Jul 1, 2024
czaloom previously approved these changes Jul 1, 2024
@czaloom czaloom dismissed their stale review July 1, 2024 22:20

stale - need to review with Eric.


@ekorman
Contributor

ekorman commented Jul 3, 2024

We can host the data in a public S3 bucket; I'd prefer not having it in version control.
files:

https://pub-fae71003f78140bdaedf32a7c8d331d2.r2.dev/classification_data.json
https://pub-fae71003f78140bdaedf32a7c8d331d2.r2.dev/detection_data.json
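Fetching these hosted files at benchmark time, instead of committing them, could look like the sketch below. The URLs come from the comment above; the helper names and caching scheme are assumptions, not the PR's actual download step:

```python
import os
from urllib.request import urlretrieve

# Public benchmark data files (URLs from the PR discussion).
DATA_URLS = [
    "https://pub-fae71003f78140bdaedf32a7c8d331d2.r2.dev/classification_data.json",
    "https://pub-fae71003f78140bdaedf32a7c8d331d2.r2.dev/detection_data.json",
]


def filename_from_url(url: str) -> str:
    """Return the file-name component of a URL."""
    return url.rsplit("/", 1)[-1]


def fetch_benchmark_data(cache_dir: str = ".benchmark_data") -> list:
    """Download each benchmark file once, caching it locally.

    Sketch only; the workflow's real download logic may differ.
    """
    os.makedirs(cache_dir, exist_ok=True)
    paths = []
    for url in DATA_URLS:
        path = os.path.join(cache_dir, filename_from_url(url))
        if not os.path.exists(path):
            urlretrieve(url, path)  # network call; skipped when cached
        paths.append(path)
    return paths
```

Caching avoids re-downloading on local repeat runs while keeping the repository free of large data files.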

@ekorman ekorman left a comment

Can we make the benchmark-evaluations job fail if it takes too long?
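One common way to make a GitHub Actions job fail on a time budget is the built-in `timeout-minutes` setting. This is a sketch, not the PR's actual workflow file; the job name comes from the comment above, while the timeout value, runner, and steps are assumptions:

```yaml
jobs:
  benchmark-evaluations:
    runs-on: ubuntu-latest
    timeout-minutes: 10  # fail the whole job if it exceeds 10 minutes
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks
        run: python benchmarks/run_benchmarks.py  # hypothetical entry point
```

A job-level timeout caps total runtime, while a per-evaluation check inside the benchmark script (as described in the PR) can attribute the failure to a specific slow test.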

@ntlind ntlind merged commit 0694e28 into main Jul 3, 2024
12 checks passed
@ntlind ntlind deleted the add_benchmark_utilities branch July 3, 2024 20:20
@Striveworks Striveworks deleted 4 comments from czaloom Jul 11, 2024