
MLCommons™ AlgoPerf: Training Algorithms Benchmark



Paper (arXiv) | Call for Submissions | Getting Started | Competition Rules | Documentation | Contributing



AlgoPerf is a suite of benchmarks and competitions to measure neural network training speedups due to algorithmic improvements in both training algorithms and models. This is the repository for the AlgoPerf: Training Algorithms benchmark and its associated competition. It is developed by the MLCommons Algorithms Working Group. This repository holds the competition rules, the technical documentation of the benchmark, getting started guides, and the benchmark code. For a detailed description of the benchmark design, see our paper.


Important

The results of the inaugural AlgoPerf: Training Algorithms benchmark competition have been announced. See the MLCommons blog post for an overview and the results page for more details. We are currently preparing an in-depth analysis of the results in the form of a paper and are planning the next iteration of the benchmark competition.


Installation

Tip

If you have any questions about the benchmark competition or run into any issues, please contact us: file an issue, ask a question on our Discord, or join our weekly meetings.

You can install this package and its dependencies either in a Python virtual environment or inside a container. We recommend a Docker container (or, alternatively, a Singularity/Apptainer container) because it provides an environment similar to our scoring and testing environments. Both options are described in detail in the Getting Started document.

TL;DR to install the JAX version for GPU, run:

pip3 install -e '.[pytorch_cpu]'
pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
pip3 install -e '.[full]'

TL;DR to install the PyTorch version for GPU, run:

pip3 install -e '.[jax_cpu]'
pip3 install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/cu121'
pip3 install -e '.[full]'
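
After either set of install commands, a quick check can confirm which frameworks are visible to the Python interpreter. The following is a minimal sketch, not part of the AlgoPerf codebase: the helper name `installed_frameworks` is hypothetical, and it assumes the frameworks are importable under the top-level module names `jax` and `torch`. It only checks that the modules can be found; it does not verify GPU support.

```python
import importlib.util


def installed_frameworks(modules=("jax", "torch")):
    """Map each top-level module name to whether Python can locate it.

    Hypothetical helper for illustration; it does not import the modules
    and does not check for GPU support.
    """
    return {
        name: importlib.util.find_spec(name) is not None for name in modules
    }


if __name__ == "__main__":
    for name, found in installed_frameworks().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it with the same interpreter (and virtual environment or container) that you used for the `pip3 install` commands.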

Getting Started

For detailed instructions on developing and scoring your own algorithm in the benchmark see the Getting Started document.

TL;DR running a JAX workload:

python3 submission_runner.py \
    --framework=jax \
    --workload=mnist \
    --experiment_dir=$HOME/experiments \
    --experiment_name=my_first_experiment \
    --submission_path=reference_algorithms/paper_baselines/adamw/jax/submission.py \
    --tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json

TL;DR running a PyTorch workload:

python3 submission_runner.py \
    --framework=pytorch \
    --workload=mnist \
    --experiment_dir=$HOME/experiments \
    --experiment_name=my_first_experiment \
    --submission_path=reference_algorithms/paper_baselines/adamw/pytorch/submission.py \
    --tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json
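
The two invocations above differ only in the `--framework` value and the framework directory inside `--submission_path`. The sketch below makes that explicit by assembling the same argument list in Python. The function name `build_runner_command` and its defaults are hypothetical; the flags and paths are copied from the commands above, and the script is assumed to be run from the repository root.

```python
import os
import shlex


def build_runner_command(
    framework,
    workload="mnist",
    experiment_name="my_first_experiment",
    experiment_dir=None,
):
    """Assemble the submission_runner.py call shown above as an argv list.

    Hypothetical helper for illustration; it only uses the flags that
    appear in the README commands.
    """
    if framework not in ("jax", "pytorch"):
        raise ValueError(f"unsupported framework: {framework!r}")
    if experiment_dir is None:
        # Mirrors --experiment_dir=$HOME/experiments from the commands above.
        experiment_dir = os.path.join(os.path.expanduser("~"), "experiments")
    baseline = "reference_algorithms/paper_baselines/adamw"
    return [
        "python3",
        "submission_runner.py",
        f"--framework={framework}",
        f"--workload={workload}",
        f"--experiment_dir={experiment_dir}",
        f"--experiment_name={experiment_name}",
        f"--submission_path={baseline}/{framework}/submission.py",
        f"--tuning_search_space={baseline}/tuning_search_space.json",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_runner_command("jax")))
```

To launch the run instead of printing it, the returned list can be passed to `subprocess.run(cmd, check=True)`.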

Call for Submissions

The Call for Submissions announces the first iteration of the AlgoPerf: Training Algorithms competition, which is based on the benchmark of the same name. The document also contains the schedule and key dates for the competition.

Competition Rules

The competition rules for the AlgoPerf: Training Algorithms competition can be found in the separate Competition Rules document.

Technical Documentation of the Benchmark & FAQs

We provide additional technical documentation of the benchmark and answer frequently asked questions on a separate Documentation page. Suggestions, clarifications, and questions can be raised by opening a pull request, creating an issue, or sending an email to the working group.

Contributing

We invite everyone to look through our rules, documentation, and codebase and to submit issues and pull requests, e.g. for rule changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influencing the benchmark's design decisions, please join the weekly meetings and consider becoming a member of the working group.

Our Contributing document provides further MLCommons contributing guidelines and additional setup and workflow instructions.

License

The AlgoPerf codebase is licensed under the Apache License 2.0.

Paper and Citing the AlgoPerf Benchmark

In our paper "Benchmarking Neural Network Training Algorithms" we motivate, describe, and justify the AlgoPerf: Training Algorithms benchmark.

If you are using the AlgoPerf benchmark, its codebase, baselines, or workloads, please consider citing our paper:

Dahl, Schneider, Nado, et al.
Benchmarking Neural Network Training Algorithms
arXiv 2306.07179

@Misc{Dahl2023AlgoPerf,
  title         = {{Benchmarking Neural Network Training Algorithms}},
  author        = {Dahl, George E. and Schneider, Frank and Nado, Zachary and Agarwal, Naman and Sastry, Chandramouli Shama and Hennig, Philipp and Medapati, Sourabh and Eschenhagen, Runa and Kasimbeg, Priya and Suo, Daniel and Bae, Juhan and Gilmer, Justin and Peirson, Abel L. and Khan, Bilal and Anil, Rohan and Rabbat, Mike and Krishnan, Shankar and Snider, Daniel and Amid, Ehsan and Chen, Kongtao and Maddison, Chris J. and Vasudev, Rakshith and Badura, Michal and Garg, Ankush and Mattson, Peter},
  year          = {2023},
  archiveprefix = {arXiv},
  eprint        = {2306.07179},
}
