Commit

Specified 16GB V100 variant
fsschneider committed Feb 20, 2024
1 parent 099305e commit 0c0e620
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions DOCUMENTATION.md
@@ -431,7 +431,7 @@ The training time until the target performance on the test set was reached is no

All scored runs have to be performed on the benchmarking hardware to allow for a fair comparison of training times. The benchmarking hardware has to be chosen to be easily accessible via common cloud computing providers. The exact hardware specification will most likely change with each iteration of the benchmark. The specs of the benchmarking hardware for this iteration of the benchmark are:

-- 8xV100 GPUs
+- 8xV100 GPUs (16 GB VRAM each)
- 240 GB in RAM
- 2 TB in storage (for datasets).

@@ -518,7 +518,7 @@ To ensure that all submitters can develop their submissions based on the same co

#### My machine only has one GPU. How can I use this repo?

-You can run this repo on a machine with an arbitrary number of GPUs. However, the default batch sizes in our reference algorithms (e.g. `algorithmic-efficiency/prize_qualification_baselines` and `algorithmic-efficiency/reference_algorithms`) are tuned for a machine with 8 16GB V100 GPUs. You may run into OOMs if you run these algorithms with fewer than 8 GPUs. If you run into these issues because you are using a machine with less total GPU memory, please reduce the batch sizes for the submission. Note that your final submission must 'fit' on the benchmarking hardware, so if you are using fewer GPUs with higher per GPU memory, please monitor your memory usage to make sure it will fit on 8xV100 GPUs with 16GB of VRAM per card.
+You can run this repo on a machine with an arbitrary number of GPUs. However, the default batch sizes in our reference algorithms (e.g. `algorithmic-efficiency/prize_qualification_baselines` and `algorithmic-efficiency/reference_algorithms`) are tuned for a machine with 8xV100 GPUs (16 GB VRAM). You may run into OOMs if you run these algorithms with fewer than 8 GPUs. If you run into these issues because you are using a machine with less total GPU memory, please reduce the batch sizes for the submission. Note that your final submission must 'fit' on the benchmarking hardware, so if you are using fewer GPUs with higher per GPU memory, please monitor your memory usage to make sure it will fit on 8xV100 GPUs with 16GB of VRAM per card.
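The batch-size reduction suggested above can be sketched as a simple linear-scaling heuristic. This is illustrative only and not part of the repository: the function name and the proportional-scaling rule are assumptions, and any scaled value still needs to be validated against actual memory usage on your hardware.

```python
def scaled_batch_size(default_batch_size: int, num_gpus: int, tuned_for: int = 8) -> int:
    """Scale a batch size tuned for `tuned_for` GPUs down to `num_gpus` GPUs.

    Illustrative heuristic only: assumes memory use grows roughly linearly
    with the global batch size, which should be verified empirically.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return max(1, default_batch_size * num_gpus // tuned_for)

# A hypothetical batch size of 1024 tuned for 8xV100, run on a single 16 GB V100:
print(scaled_batch_size(1024, 1))  # 128
```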

#### How do I run this on my SLURM cluster?
2 changes: 1 addition & 1 deletion GETTING_STARTED.md
@@ -26,7 +26,7 @@ To get started you will have to make a few decisions and install the repository
1. Decide if you would like to develop your submission in either PyTorch or JAX.
2. Set up your workstation or VM. We recommend to use a setup similar to the [benchmarking hardware](/DOCUMENTATION.md#benchmarking-hardware).
The specs on the benchmarking machines are:
-- 8xV100 GPUs
+- 8xV100 GPUs (16 GB VRAM each)
- 240 GB in RAM
- 2 TB in storage (for datasets).
3. Install the algorithmic package and dependencies either in a [Python virtual environment](#python-virtual-environment) or use a [Docker](#docker) (recommended) or [Singularity/Apptainer container](#using-singularityapptainer-instead-of-docker).

