Sync <- Mlperf inference #557

Merged: 224 commits, Nov 19, 2024

Commits
e7e915b
bug fix
anandhu-eng Nov 8, 2024
c7054a4
incorporated whoami to take the username
anandhu-eng Nov 8, 2024
74966b9
Merge pull request #508 from mlcommons/anandhu-eng-patch-1
arjunsuresh Nov 8, 2024
4461eba
Fix nvidia gptj model suffix
arjunsuresh Nov 8, 2024
ddd24af
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 8, 2024
db91160
Merge pull request #509 from GATEOverflow/mlperf-inference
arjunsuresh Nov 8, 2024
0f6306e
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 8, 2024
a24b7a8
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 8, 2024
7844605
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 9, 2024
0f7eda4
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 9, 2024
98e8c39
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 9, 2024
6883f0a
Update test-intel-mlperf-inference-implementations.yml
arjunsuresh Nov 9, 2024
33ef154
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 9, 2024
5cdef0d
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 9, 2024
15dc7d0
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 9, 2024
bd97501
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 9, 2024
495365e
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 9, 2024
c0499d0
privileged mode set to on default
anandhu-eng Nov 9, 2024
fac6dff
add privileged mode option to run command
anandhu-eng Nov 9, 2024
d47b8ca
Added docker deps for openimages for non nvidia implementations too
arjunsuresh Nov 9, 2024
4e03464
Removed generic-sys-util deps for openimages
arjunsuresh Nov 9, 2024
17c5430
Skip system package installs when sudo not available
arjunsuresh Nov 9, 2024
da6af8c
Merge pull request #512 from GATEOverflow/mlperf-inference
arjunsuresh Nov 9, 2024
159a1fc
Added an option to skip sudo passwd
arjunsuresh Nov 10, 2024
e7d4e2f
Bug fix for skipping deps - non root users
anandhu-eng Nov 10, 2024
859c805
fix for non interactive terminal
anandhu-eng Nov 10, 2024
12a710d
Update _cm.yaml
anandhu-eng Nov 10, 2024
ce71d8b
updation for docker privileged mode
anandhu-eng Nov 10, 2024
7d1f5bc
Merge pull request #514 from mlcommons/issue-#510
arjunsuresh Nov 10, 2024
e762efc
Merge pull request #511 from mlcommons/anandhu-eng-patch-1
arjunsuresh Nov 10, 2024
79856fd
Allow default version update from variations
arjunsuresh Nov 10, 2024
e9f8a17
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 10, 2024
8e81b5b
Merge pull request #515 from GATEOverflow/mlperf-inference
arjunsuresh Nov 10, 2024
1feab22
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 10, 2024
2c2f003
Update test-intel-mlperf-inference-implementations.yml
arjunsuresh Nov 10, 2024
3fba880
Added configs for IntelSPR.24c
arjunsuresh Nov 10, 2024
cea03d7
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 10, 2024
74ff3f7
Update test-scc24-sdxl.yaml
arjunsuresh Nov 10, 2024
ddcd25a
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 10, 2024
9384c34
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 10, 2024
dc9c828
Added enable_env_if_env, removed fixed branch for scc24
arjunsuresh Nov 10, 2024
03b59f1
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 10, 2024
7ba8a1f
Added enable_env_if_env, removed fixed branch for scc24
arjunsuresh Nov 10, 2024
e823583
Merge pull request #518 from GATEOverflow/mlperf-inference
arjunsuresh Nov 10, 2024
5195d30
Removed dev branch for SDXL
arjunsuresh Nov 10, 2024
78bd68d
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 10, 2024
63482f9
Merge pull request #519 from GATEOverflow/mlperf-inference
arjunsuresh Nov 10, 2024
98824e9
update_env_if_env -> update_meta_if_env
arjunsuresh Nov 10, 2024
91c16ef
Merge pull request #520 from GATEOverflow/mlperf-inference
arjunsuresh Nov 10, 2024
f8dd8d3
commit against issue https://github.com/mlcommons/cm4mlops/issues/522
anandhu-eng Nov 11, 2024
cca9f1d
fix typo
anandhu-eng Nov 11, 2024
1d2cf20
Added test case 8
anandhu-eng Nov 11, 2024
6517741
Added cuda dependency(conditional)
anandhu-eng Nov 11, 2024
7f63872
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 11, 2024
13dcd9e
pass cuda version also
anandhu-eng Nov 11, 2024
f527ca0
Add cuda version to sut meta
anandhu-eng Nov 11, 2024
22a1d23
reverted cuda dependency
anandhu-eng Nov 11, 2024
8dd6104
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 11, 2024
5fa577d
info update
anandhu-eng Nov 11, 2024
ad56488
Merge pull request #523 from mlcommons/submisison-generation-fix
arjunsuresh Nov 11, 2024
7799261
Merge branch 'mlperf-inference' into anandhu-eng-patch-1
arjunsuresh Nov 11, 2024
be89f5f
Merge branch 'mlperf-inference' into issue#525
anandhu-eng Nov 11, 2024
7ab3736
Merge pull request #524 from mlcommons/anandhu-eng-patch-1
arjunsuresh Nov 11, 2024
3fd9fee
Merge pull request #526 from mlcommons/issue#525
arjunsuresh Nov 11, 2024
e80d4ab
Enabled --total-sample-count option in performance run for llama2 and…
arjunsuresh Nov 11, 2024
07b7fe6
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 11, 2024
c437b1b
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 11, 2024
e0b6c3e
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 11, 2024
10eb033
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 11, 2024
aaddb2a
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 11, 2024
e783d4f
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 11, 2024
8cec2a6
Added github runner CM script
arjunsuresh Nov 11, 2024
899f9ab
Fix gh runner remove command
arjunsuresh Nov 11, 2024
1934189
Fix dockerfile_env
arjunsuresh Nov 11, 2024
28627a3
Merge pull request #527 from GATEOverflow/mlperf-inference
arjunsuresh Nov 11, 2024
aab34ac
Fix docker build_deps
arjunsuresh Nov 11, 2024
61c9b7d
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 11, 2024
ae38e99
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 11, 2024
e17ece4
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 12, 2024
94602ff
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 12, 2024
41eb252
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 12, 2024
eeba8ff
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 12, 2024
1de3d37
Added detect for nvidia-docker
arjunsuresh Nov 12, 2024
d2d3319
Fix scenario mapping call in mlperf utils
arjunsuresh Nov 12, 2024
9d2bf09
Fix MLPerf inference measurements readme update
arjunsuresh Nov 12, 2024
fbc74fa
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 12, 2024
d4d3cca
Merge pull request #529 from GATEOverflow/mlperf-inference
arjunsuresh Nov 12, 2024
dacb33b
Update requirements.txt
arjunsuresh Nov 12, 2024
ffc8b32
Update setup.py
arjunsuresh Nov 12, 2024
0f07093
Increment version to 0.3.26
arjunsuresh Nov 12, 2024
4dcd13a
Avoid creation of empty model_mapping.json
arjunsuresh Nov 12, 2024
70e0fe5
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 12, 2024
b8a91f1
Merge pull request #530 from GATEOverflow/mlperf-inference
arjunsuresh Nov 12, 2024
b5db377
enable user to submit a result for both closed and open
anandhu-eng Nov 13, 2024
0ff94e0
fix typo
anandhu-eng Nov 13, 2024
16bc227
test commit - enable open division
anandhu-eng Nov 13, 2024
3f63303
added open-closed option under division
anandhu-eng Nov 13, 2024
9a312f6
use the default sut folder name supplied if cm-sut-json is not there …
anandhu-eng Nov 13, 2024
800df7b
Fixes for --docker_pss_user_group
arjunsuresh Nov 13, 2024
d05f261
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 13, 2024
4be1139
set pytorch as default backend
anandhu-eng Nov 13, 2024
aaa7b5c
revert commit https://github.com/mlcommons/cm4mlops/pull/537/commits/…
anandhu-eng Nov 13, 2024
cf30e25
set pytorch as default framework
anandhu-eng Nov 13, 2024
f73eb6e
Enable pass_user_group option for docker run
arjunsuresh Nov 13, 2024
0e5e92d
By default make MLPerf inference results and submissions dir shared
arjunsuresh Nov 13, 2024
3fe15c0
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 13, 2024
bbdb49d
By default make MLPerf inference results and submissions dir shared
arjunsuresh Nov 13, 2024
7869cc3
Increment the default mlperf inference version in user conf
arjunsuresh Nov 13, 2024
eca729e
Merge pull request #539 from GATEOverflow/mlperf-inference
arjunsuresh Nov 13, 2024
cdffedb
Added default_version=master for get-mlperf-inference-src
arjunsuresh Nov 13, 2024
a240b9d
Fix mixtral starting weights filename, load measurements.json for mlp…
arjunsuresh Nov 13, 2024
b517987
Added mlperf-inference-reference-mixtral image name
arjunsuresh Nov 13, 2024
bd42b84
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 13, 2024
e64cfbc
Update test-mlperf-inference-mixtral.yml
arjunsuresh Nov 13, 2024
d5ad232
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 13, 2024
fa7c35a
Update test-mlperf-inference-sdxl.yaml
arjunsuresh Nov 14, 2024
d369373
Fix default_version update in variations
arjunsuresh Nov 14, 2024
f4948dd
Merge pull request #540 from GATEOverflow/mlperf-inference
arjunsuresh Nov 14, 2024
49f5099
Rearranged the logic of detect-sudo
arjunsuresh Nov 14, 2024
c45e3f4
Update test-mlperf-inference-sdxl.yaml
arjunsuresh Nov 14, 2024
5acaecb
Update test-scc24-sdxl.yaml
arjunsuresh Nov 14, 2024
555d6b8
Update test-scc24-sdxl.yaml
arjunsuresh Nov 14, 2024
4f98c07
Update test-scc24-sdxl.yaml
arjunsuresh Nov 14, 2024
fbac609
Update test-mlperf-inference-llama2.yml
arjunsuresh Nov 14, 2024
dfbf6c2
Update test-scc24-sdxl.yaml
arjunsuresh Nov 14, 2024
6fb71ff
Merge pull request #548 from GATEOverflow/mlperf-inference
arjunsuresh Nov 15, 2024
b50056e
updation for test data generation
anandhu-eng Nov 15, 2024
5a46d9a
add pandas deps
anandhu-eng Nov 15, 2024
4be8ceb
Add test data generated path
anandhu-eng Nov 15, 2024
88e416b
Create run.sh
anandhu-eng Nov 15, 2024
caa03ea
add python3 as deps
anandhu-eng Nov 15, 2024
a7e0bf9
add python script to extract test dataset
anandhu-eng Nov 15, 2024
4e63115
add samples input args
anandhu-eng Nov 15, 2024
2dc178e
set default output path
anandhu-eng Nov 15, 2024
010a539
included file name
anandhu-eng Nov 15, 2024
bb4464c
update default out path
anandhu-eng Nov 15, 2024
4be54b3
update env
anandhu-eng Nov 15, 2024
3e11dd1
Add base variation
anandhu-eng Nov 15, 2024
b31e885
os package imported
anandhu-eng Nov 15, 2024
4a0826a
fix typo
anandhu-eng Nov 15, 2024
0294ec3
Remove inference cache from mlperf inference docker
arjunsuresh Nov 15, 2024
82f18f4
Added an option to call preprocess script via the submission checker
arjunsuresh Nov 15, 2024
a30dea7
Nvidia gh action update
arjunsuresh Nov 15, 2024
fc53e03
Added RTX4090x1 configs
arjunsuresh Nov 15, 2024
636f49e
Merge pull request #550 from mlcommons/mixtral-test-dataset
arjunsuresh Nov 15, 2024
6da3acb
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 15, 2024
f260610
Merge pull request #551 from GATEOverflow/mlperf-inference
arjunsuresh Nov 15, 2024
fed53ef
Fix skip_if_any for llama2 and mixtral
arjunsuresh Nov 15, 2024
74c7fa0
change in variation name
anandhu-eng Nov 15, 2024
54025b0
Merge pull request #552 from mlcommons/mixtral-test-dataset
arjunsuresh Nov 15, 2024
b50fff2
Update test-intel-mlperf-inference-implementations.yml
arjunsuresh Nov 15, 2024
a3c8ae3
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 15, 2024
32d3e5d
Merge pull request #553 from GATEOverflow/mlperf-inference
arjunsuresh Nov 15, 2024
b6ba934
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 15, 2024
f81ffe1
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 15, 2024
5cd00e8
Update test-scc24-sdxl.yaml
arjunsuresh Nov 15, 2024
413c966
Support min_query_count and max_query_count for mlperf inference
arjunsuresh Nov 15, 2024
0067dcf
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 15, 2024
a602261
Merge pull request #554 from GATEOverflow/mlperf-inference
arjunsuresh Nov 15, 2024
d5132cd
Remove TEST05
arjunsuresh Nov 15, 2024
2f58688
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 15, 2024
11b3be8
Merge pull request #555 from GATEOverflow/mlperf-inference
arjunsuresh Nov 15, 2024
31eda56
Fix CM_ML_MODEL_PATH export for mixtral
arjunsuresh Nov 16, 2024
18b1d06
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 16, 2024
155cf61
Merge pull request #556 from GATEOverflow/mlperf-inference
arjunsuresh Nov 16, 2024
63d8288
Update test-scc24-sdxl.yaml
arjunsuresh Nov 16, 2024
259a159
Update test-scc24-sdxl.yaml
arjunsuresh Nov 16, 2024
571ae2b
Fix the saving of console logs
arjunsuresh Nov 16, 2024
4081528
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 16, 2024
e456b44
Merge pull request #558 from GATEOverflow/mlperf-inference
arjunsuresh Nov 16, 2024
8f47e15
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
b15df20
Update test-scc24-sdxl.yaml
arjunsuresh Nov 16, 2024
4b3d2ee
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
90c3454
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
5c6dd4e
Fix 3d-unet SS latency in configs
arjunsuresh Nov 16, 2024
17afcd8
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 16, 2024
79fe248
Merge pull request #559 from GATEOverflow/mlperf-inference
arjunsuresh Nov 16, 2024
10788d8
Added sympy dependency for nvidia mlperf inference gptj
arjunsuresh Nov 16, 2024
a51f736
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 16, 2024
7d34691
Merge pull request #560 from GATEOverflow/mlperf-inference
arjunsuresh Nov 16, 2024
58d9b01
Skip docker run command for AMD MLPerf inference gh action
arjunsuresh Nov 16, 2024
6c457dc
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
6d10ea6
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
a1492c2
Fixes rtx4090x1 configs
arjunsuresh Nov 16, 2024
7031377
Added docker image names for nvidia mlperf inference llm
arjunsuresh Nov 16, 2024
ea8ac4b
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 16, 2024
8e75c89
Merge pull request #562 from GATEOverflow/mlperf-inference
arjunsuresh Nov 16, 2024
fe4d73a
Remove TEST05 in mlperf inference
arjunsuresh Nov 16, 2024
f45728b
Fix retinanet nvidia mlperf inference config
arjunsuresh Nov 16, 2024
dfc46a9
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 16, 2024
f7b24d0
Build TRTLLM for Nvidia MLPerf inference LLM models
arjunsuresh Nov 17, 2024
a5cd676
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 17, 2024
ef16db8
Update test-amd-mlperf-inference-implementations.yml
arjunsuresh Nov 17, 2024
cc4ad57
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 17, 2024
a91cc3b
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 17, 2024
852b297
Merge pull request #563 from GATEOverflow/mlperf-inference
arjunsuresh Nov 17, 2024
6882ba3
Update default-config.yaml
arjunsuresh Nov 17, 2024
fb4339c
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 17, 2024
636343e
Merge pull request #564 from GATEOverflow/mlperf-inference
arjunsuresh Nov 17, 2024
885e65e
adjust version detection for multi digit version numbers
Submandarine Nov 17, 2024
03a6cea
adjust version detection for multi digit version numbers
Submandarine Nov 17, 2024
f72ffc1
Merge pull request #565 from FAU-cet/mlperf-inference
arjunsuresh Nov 17, 2024
a9ae411
Fixes #566, type error for query counts
arjunsuresh Nov 18, 2024
8b42ced
Merge branch 'mlperf-inference' into mlperf-inference
arjunsuresh Nov 18, 2024
88e9a0d
Fixes #566, type error for query counts
arjunsuresh Nov 18, 2024
9bdfff5
Merge pull request #567 from GATEOverflow/mlperf-inference
arjunsuresh Nov 18, 2024
47055ec
rename mixtral dataset download script
anandhu-eng Nov 18, 2024
6a81cd3
add gnn dataset download script
anandhu-eng Nov 18, 2024
80ee9fa
dlt readme
anandhu-eng Nov 18, 2024
d463753
Merge pull request #570 from mlcommons/gnn-dataset
arjunsuresh Nov 18, 2024
b32ded2
Merge pull request #569 from mlcommons/mixtral-test-dataset
arjunsuresh Nov 18, 2024
7c1cdde
Update test-nvidia-mlperf-inference-implementations.yml
arjunsuresh Nov 18, 2024
173648e
Added draw-graph-from-json-data CM script
arjunsuresh Nov 18, 2024
aae739a
Merge branch 'mlperf-inference' into issue-#536
anandhu-eng Nov 19, 2024
8ec55f5
closed-open added as an option
anandhu-eng Nov 19, 2024
d785cc0
changed to closed-open
anandhu-eng Nov 19, 2024
0b58485
Merge pull request #535 from mlcommons/issue-#403
arjunsuresh Nov 19, 2024
6d5b8dd
Merge pull request #537 from mlcommons/issue-#536
arjunsuresh Nov 19, 2024
cbcbb46
Merge branch 'mlcommons:mlperf-inference' into mlperf-inference
arjunsuresh Nov 19, 2024
e91542b
Fix diffusers version for Nvidia mlperf inference sdxl
arjunsuresh Nov 19, 2024
2d0281e
Fixes #171, added dependency graph including mermaid
arjunsuresh Nov 18, 2024
09f0cdb
Fixes #171, only dump the graph for now
arjunsuresh Nov 19, 2024
8312e33
Merge pull request #573 from GATEOverflow/mlperf-inference
arjunsuresh Nov 19, 2024
3a60122
Merge branch 'main' into mlperf-inference
arjunsuresh Nov 19, 2024
Files changed
.github/workflows/test-amd-mlperf-inference-implementations.yml
@@ -2,11 +2,11 @@ name: MLPerf Inference AMD implementations
 
 on:
   schedule:
-    - cron: "29 4 * * *" #to be adjusted
+    - cron: "46 11 * * *" #to be adjusted
 
 jobs:
-  build_nvidia:
-    if: github.repository_owner == 'gateoverflow'
+  run_amd:
+    if: github.repository_owner == 'gateoverflow_off'
     runs-on: [ self-hosted, linux, x64, GO-spr ]
     strategy:
       fail-fast: false
@@ -16,11 +16,11 @@ jobs:
     steps:
       - name: Test MLPerf Inference AMD (build only) ${{ matrix.model }}
         run: |
-          if [ -f "gh_action_conda/bin/deactivate" ]; then source gh_action_conda/bin/deactivate; fi
-          python3 -m venv gh_action_conda
-          source gh_action_conda/bin/activate
+          if [ -f "gh_action/bin/deactivate" ]; then source gh_action/bin/deactivate; fi
+          python3 -m venv gh_action
+          source gh_action/bin/activate
           export CM_REPOS=$HOME/GH_CM
           pip install --upgrade cm4mlops
           pip install tabulate
-          cm run script --tags=run-mlperf,inference,_all-scenarios,_full,_r4.1-dev --execution_mode=valid --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=IntelSPR.24c --implementation=amd --backend=pytorch --category=datacenter --division=open --scenario=Offline --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=rocm --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
+          cm pull repo
+          cm run script --tags=run-mlperf,inference,_all-scenarios,_full,_r4.1-dev --execution_mode=valid --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=IntelSPR.24c --implementation=amd --backend=pytorch --category=datacenter --division=open --scenario=Offline --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=rocm --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet --docker_skip_run_cmd=yes
           # cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_unofficial_submissions_v5.0 --repo_branch=main --commit_message="Results from GH action on SPR.24c" --quiet --submission_dir=$HOME/gh_action_submissions --hw_name=IntelSPR.24c
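Note: the schedule edits in this and the following workflow files only move the cron trigger around. For reference, a GitHub Actions cron expression has five fields (minute, hour, day-of-month, month, day-of-week) evaluated in UTC; a minimal sketch reusing two values from the hunks in this PR:

    on:
      schedule:
        # minute hour day-of-month month day-of-week (UTC)
        - cron: "46 11 * * *"  # every day at 11:46 UTC
        - cron: "29 1 * * 4"   # Thursdays at 01:29 UTC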
8 changes: 5 additions & 3 deletions .github/workflows/test-cm-based-submission-generation.yml
@@ -17,14 +17,13 @@ jobs:
       matrix:
         os: [ubuntu-latest, windows-latest, macos-latest]
         python-version: [ "3.12" ]
-        division: ["closed", "open"]
+        division: ["closed", "open", "closed-open"]
         category: ["datacenter", "edge"]
-        case: ["case-3", "case-7"]
+        case: ["case-3", "case-7", "case-8"]
         action: ["run", "docker"]
         exclude:
           - os: macos-latest
           - os: windows-latest
-          - division: "open"
           - category: "edge"
     steps:
       - uses: actions/checkout@v4
@@ -47,6 +46,9 @@ jobs:
           elif [ "${{ matrix.case }}" == "case-7" ]; then
             #results_dir="submission_generation_tests/case-7/"
             description="Submission generation (sut_info.json incomplete, SUT folder name in required format)"
+          elif [ "${{ matrix.case }}" == "case-8" ]; then
+            #results_dir="submission_generation_tests/case-8/"
+            description="Submission generation (system_meta.json not found in results folder)"
           fi
           # Dynamically set the log group to simulate a dynamic step name
           echo "::group::$description"
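Note: the case-8 branch added above feeds its description into a dynamically named, collapsible log group. "::group::" and "::endgroup::" are standard GitHub Actions workflow commands; a minimal sketch of the pattern (description text illustrative):

    description="Submission generation (case-8)"
    echo "::group::$description"
    # ... commands whose output collapses under that heading ...
    echo "::endgroup::"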
.github/workflows/test-intel-mlperf-inference-implementations.yml
@@ -2,10 +2,10 @@ name: MLPerf Inference Intel implementations
 
 on:
   schedule:
-    - cron: "29 1 * * *" #to be adjusted
+    - cron: "29 1 * * 4" #to be adjusted
 
 jobs:
-  build_nvidia:
+  run_intel:
     if: github.repository_owner == 'gateoverflow'
     runs-on: [ self-hosted, linux, x64, GO-spr ]
     strategy:
@@ -22,5 +22,5 @@ jobs:
           export CM_REPOS=$HOME/GH_CM
           pip install --upgrade cm4mlops
           pip install tabulate
-          cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --execution_mode=valid --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=IntelSPR.24c --implementation=intel --backend=pytorch --category=datacenter --division=open --scenario=Offline --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cpu --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
+          cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --execution_mode=valid --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=IntelSPR.24c --implementation=intel --backend=pytorch --category=datacenter --division=open --scenario=Offline --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cpu --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
           cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_unofficial_submissions_v5.0 --repo_branch=main --commit_message="Results from GH action on SPR.24c" --quiet --submission_dir=$HOME/gh_action_submissions --hw_name=IntelSPR.24c
11 changes: 6 additions & 5 deletions .github/workflows/test-mlperf-inference-llama2.yml
@@ -5,7 +5,7 @@ name: MLPerf inference LLAMA 2 70B
 
 on:
   schedule:
-    - cron: "30 2 * * 4"
+    - cron: "59 04 * * *"
 
 jobs:
   build_reference:
@@ -17,9 +17,10 @@ jobs:
         python-version: [ "3.12" ]
         backend: [ "pytorch" ]
         device: [ "cpu" ]
+        precision: [ "bfloat16" ]
 
     steps:
-      - name: Install dependencies
+      - name: Test MLPerf Inference LLAMA 2 70B reference implementation
         run: |
           source gh_action/bin/deactivate || python3 -m venv gh_action
           source gh_action/bin/activate
@@ -28,7 +29,7 @@ jobs:
           pip install tabulate
           cm pull repo
           pip install "huggingface_hub[cli]"
+          git config --global credential.helper store
+          huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
-      - name: Test MLPerf Inference LLAMA 2 70B reference implementation
-        run: |
-          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=llama2-70b-99 --implementation=reference --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=1 --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_LLAMA2_70B_DOWNLOAD_TO_HOST=yes --adr.inference-src.tags=_repo.https://github.com/anandhu-eng/inference.git --clean
+          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=llama2-70b-99 --implementation=reference --backend=${{ matrix.backend }} --precision=${{ matrix.precision }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=0.001 --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_LLAMA2_70B_DOWNLOAD_TO_HOST=yes --clean
           cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions" --quiet --submission_dir=$HOME/gh_action_submissions
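Note: the two added login lines set up non-interactive Hugging Face authentication for the gated Llama 2 weights; the repository secret supplies the token, and --add-to-git-credential persists it through git's credential helper so later git-over-HTTPS fetches also authenticate. Outside CI, the equivalent manual steps are (placeholder token shown):

    git config --global credential.helper store
    huggingface-cli login --token hf_xxxxxxxx --add-to-git-credential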
10 changes: 6 additions & 4 deletions .github/workflows/test-mlperf-inference-mixtral.yml
@@ -5,18 +5,19 @@ name: MLPerf inference MIXTRAL-8x7B
 
 on:
   schedule:
-    - cron: "45 10 * * *" # 30th minute and 20th hour => 20:30 UTC => 2 AM IST
+    - cron: "32 22 * * *" # 30th minute and 20th hour => 20:30 UTC => 2 AM IST
 
 jobs:
   build_reference:
     if: github.repository_owner == 'gateoverflow'
-    runs-on: [ self-hosted, GO-spr, linux, x64 ]
+    runs-on: [ self-hosted, phoenix, linux, x64 ]
     strategy:
       fail-fast: false
       matrix:
         python-version: [ "3.12" ]
         backend: [ "pytorch" ]
         device: [ "cpu" ]
+        precision: [ "float16" ]
 
     steps:
       - name: Test MLPerf Inference MIXTRAL-8X7B reference implementation
@@ -26,7 +27,8 @@ jobs:
           export CM_REPOS=$HOME/GH_CM
           pip install cm4mlops
           pip install "huggingface_hub[cli]"
           git config --global credential.helper store
+          huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
           cm pull repo
-          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=mixtral-8x7b --implementation=reference --batch_size=1 --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --docker --quiet --test_query_count=1 --target_qps=1 --clean --env.CM_MLPERF_MODEL_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --env.CM_MLPERF_DATASET_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes
-          cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions - GO-i9" --quiet --submission_dir=$HOME/gh_action_submissions
+          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --model=mixtral-8x7b --implementation=reference --batch_size=1 --precision=${{ matrix.precision }} --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --docker --quiet --test_query_count=1 --target_qps=0.001 --clean --env.CM_MLPERF_MODEL_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --env.CM_MLPERF_DATASET_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes
+          cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions - GO-phoenix" --quiet --submission_dir=$HOME/gh_action_submissions
4 changes: 2 additions & 2 deletions .github/workflows/test-mlperf-inference-sdxl.yaml
@@ -1,7 +1,7 @@
 name: MLPerf inference SDXL
 on:
   schedule:
-    - cron: "30 2 * * *"
+    - cron: "19 17 * * *"
 
 jobs:
   build_reference:
@@ -21,5 +21,5 @@ jobs:
           export CM_REPOS=$HOME/GH_CM
           python3 -m pip install cm4mlops
           cm pull repo
-          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --docker --model=sdxl --backend=${{ matrix.backend }} --device=cuda --scenario=Offline --test_query_count=1 --precision=${{ matrix.precision }} --target_qps=1 --quiet --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_SDXL_DOWNLOAD_TO_HOST=yes --clean
+          cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --docker --model=sdxl --backend=${{ matrix.backend }} --device=cuda --scenario=Offline --test_query_count=1 --precision=${{ matrix.precision }} --adr.mlperf-implementation.tags=_branch.dev --quiet --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_SDXL_DOWNLOAD_TO_HOST=yes --clean
           cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_test_submissions_v5.0 --repo_branch=main --commit_message="Results from self hosted Github actions - NVIDIARTX4090" --quiet --submission_dir=$HOME/gh_action_submissions
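Note: the replacement command drops the fixed --target_qps and instead pins the MLPerf implementation to its dev branch through CM's dependency-override syntax: --adr.<name>.tags=<tags> injects extra variation tags into the named dependency when the script graph is resolved (the same mechanism the old llama2 command used with --adr.inference-src.tags=_repo....). A hedged, generic sketch of the mechanism (script tags illustrative):

    cm run script --tags=run-mlperf,inference,_submission,_short --adr.mlperf-implementation.tags=_branch.dev --quiet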
.github/workflows/test-nvidia-mlperf-inference-implementations.yml
@@ -2,25 +2,43 @@ name: MLPerf Inference Nvidia implementations
 
 on:
   schedule:
-    - cron: "49 19 * * *" #to be adjusted
+    - cron: "54 22 * * *" #to be adjusted
 
 jobs:
-  build_nvidia:
+  run_nvidia:
     if: github.repository_owner == 'gateoverflow'
-    runs-on: [ self-hosted, linux, x64, GO-spr ]
+    runs-on:
+      - self-hosted
+      - linux
+      - x64
+      - cuda
+      - ${{ matrix.system }}
     strategy:
       fail-fast: false
       matrix:
+        system: [ "GO-spr", "phoenix", "i9" ]
        python-version: [ "3.12" ]
-        model: [ "resnet50", "retinanet", "bert-99", "bert-99.9", "gptj-99.9", "3d-unet-99.9" ]
+        model: [ "resnet50", "retinanet", "bert-99", "bert-99.9", "gptj-99.9", "3d-unet-99.9", "sdxl" ]
+        exclude:
+          - model: gptj-99.9
 
     steps:
       - name: Test MLPerf Inference NVIDIA ${{ matrix.model }}
         run: |
+          # Set hw_name based on matrix.system
+          if [ "${{ matrix.system }}" = "GO-spr" ]; then
+            hw_name="RTX4090x2"
+          else
+            hw_name="RTX4090x1"
+          fi
+
           if [ -f "gh_action/bin/deactivate" ]; then source gh_action/bin/deactivate; fi
           python3 -m venv gh_action
           source gh_action/bin/activate
           export CM_REPOS=$HOME/GH_CM
           pip install --upgrade cm4mlops
           pip install tabulate
-          cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --execution_mode=valid --gpu_name=rtx_4090 --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=RTX4090x2 --implementation=nvidia --backend=tensorrt --category=datacenter,edge --division=closed --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cuda --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
-          cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_unofficial_submissions_v5.0 --repo_branch=main --commit_message="Results from GH action on NVIDIA_RTX4090x2" --quiet --submission_dir=$HOME/gh_action_submissions --hw_name=RTX4090x2
+          cm pull repo
+
+          cm run script --tags=run-mlperf,inference,_all-scenarios,_submission,_full,_r4.1-dev --preprocess_submission=yes --adr.submission-checker-src.tags=_branch.dev --execution_mode=valid --gpu_name=rtx_4090 --pull_changes=yes --pull_inference_changes=yes --model=${{ matrix.model }} --submitter="MLCommons" --hw_name=$hw_name --implementation=nvidia --backend=tensorrt --category=datacenter,edge --division=closed --docker_dt=yes --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --adr.compiler.tags=gcc --device=cuda --use_dataset_from_host=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --clean --docker --quiet
+
+          cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/mlperf_inference_unofficial_submissions_v5.0 --repo_branch=main --commit_message="Results from GH action on NVIDIA_$hw_name" --quiet --submission_dir=$HOME/gh_action_submissions --hw_name=$hw_name
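Note: the restructured runs-on block is what lets this one workflow fan out over several self-hosted machines: ${{ matrix.system }} expands to a runner label, so each matrix entry is routed to the runner carrying that label, and the shell step then maps the system name to the matching --hw_name. A minimal standalone sketch of the pattern (job and label names illustrative):

    jobs:
      bench:
        runs-on:
          - self-hosted
          - ${{ matrix.system }}
        strategy:
          matrix:
            system: [ machine-a, machine-b ]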
4 changes: 2 additions & 2 deletions .github/workflows/test-scc24-sdxl.yaml
@@ -2,7 +2,7 @@ name: MLPerf inference SDXL (SCC)
 
 on:
   schedule:
-    - cron: "35 19 * * *"
+    - cron: "20 01 * * *"
 
 jobs:
   build_reference:
@@ -54,5 +54,5 @@ jobs:
           cm pull repo
           cm run script --tags=run-mlperf,inference,_find-performance,_r4.1-dev,_short,_scc24-base --pull_changes=yes --model=sdxl --implementation=nvidia --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --precision=${{ matrix.precision }} --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes --pull_changes --quiet --results_dir=$HOME/scc_gh_action_results --submission_dir=$HOME/scc_gh_action_submissions --env.CM_MLPERF_MODEL_SDXL_DOWNLOAD_TO_HOST=yes --hw_name=go-spr --custom_system_nvidia=yes --clean
           cm run script --tags=run-mlperf,inference,_r4.1-dev,_short,_scc24-base --model=sdxl --implementation=nvidia --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --precision=${{ matrix.precision }} --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes --quiet --results_dir=$HOME/scc_gh_action_results --submission_dir=$HOME/scc_gh_action_submissions --precision=float16 --env.CM_MLPERF_MODEL_SDXL_DOWNLOAD_TO_HOST=yes --clean
-          cm run script --tags=generate,inference,submission --clean --preprocess_submission=yes --run-checker --tar=yes --env.CM_TAR_OUTFILE=submission.tar.gz --division=open --category=datacenter --run_style=test --adr.submission-checker.tags=_short-run --quiet --submitter=MLCommons --submission_dir=$HOME/scc_gh_action_submissions --results_dir=$HOME/scc_gh_action_results/test_results
+          cm run script --tags=generate,inference,submission --clean --run-checker --tar=yes --env.CM_TAR_OUTFILE=submission.tar.gz --division=open --category=datacenter --run_style=test --adr.submission-checker.tags=_short-run --quiet --submitter=MLCommons --submission_dir=$HOME/scc_gh_action_submissions --results_dir=$HOME/scc_gh_action_results/test_results
           cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/gateoverflow/cm4mlperf-inference --repo_branch=mlperf-inference-results-scc24 --commit_message="Results from self hosted Github actions - NVIDIARTX4090" --quiet --submission_dir=$HOME/scc_gh_action_submissions
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.3.25
+0.3.26