Simplify the SLURM hostfile setup #442

Open: wants to merge 1 commit into base: master

fsimonis (Member) commented Aug 9, 2024

This PR simplifies the SLURM explanation.
Hostfiles are now generated using slot notation.
This allows running MPI without -n, as the launch then scales to all available slots.

The hostfiles no longer contain repeated hosts, as repeating hosts could still lead to double allocation of CPUs on nodes that are used by multiple hostfiles.
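
To illustrate the idea, here is a minimal Python sketch, not the code from this PR. It assumes the Open MPI style `host slots=N` notation and a job whose slots are divided between several programs, each launched with its own hostfile. The function names `split_slots` and `format_hostfile` and the node names are hypothetical. Each host appears at most once per hostfile, and the slot counts handed out for a shared node never exceed what the node offers.

```python
def split_slots(nodes, demands):
    """Divide the slots of a job between several programs.

    nodes:   list of (hostname, slots) pairs, in allocation order
    demands: number of ranks each program needs
    Returns one list of (hostname, slots) pairs per program.
    """
    free = [[host, slots] for host, slots in nodes]
    result = []
    i = 0  # index of the first node that may still have free slots
    for need in demands:
        alloc = []
        while need > 0:
            if i >= len(free):
                raise ValueError("not enough slots in the job allocation")
            host, avail = free[i]
            if avail == 0:
                i += 1  # node is exhausted, move on to the next one
                continue
            take = min(need, avail)
            alloc.append((host, take))
            free[i][1] = avail - take
            need -= take
        result.append(alloc)
    return result


def format_hostfile(allocation):
    """Render (hostname, slots) pairs as 'host slots=N' lines."""
    seen = set()
    lines = []
    for host, slots in allocation:
        if host in seen:
            raise ValueError(f"host {host!r} is listed more than once")
        seen.add(host)
        lines.append(f"{host} slots={slots}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: two nodes with 4 slots each, two programs.
    parts = split_slots([("node1", 4), ("node2", 4)], [6, 2])
    for name, part in zip(["hosts.a", "hosts.b"], parts):
        with open(name, "w") as f:
            f.write(format_hostfile(part))
```

With hostfiles like these, something along the lines of `mpirun --hostfile hosts.a ./solver` (where `./solver` is a placeholder) should start one rank per listed slot under Open MPI. Other MPI implementations use a different hostfile syntax, so the format string would need adapting.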

@davidscn could you test these formats on the AMD cluster?

I also created a Python version of the script that generates hostfiles for an active SLURM session. Is this actually useful, or is it easier to code the generation directly into the bash script of each SLURM job? If it is useful, where should it go: in a gist, added to the website as a file, added as text, or added to a separate repo in the org?
https://gist.github.com/fsimonis/4e312c3875c276d96f358bb0ff8ce7a2
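
For context, a rough sketch of how the per-node slot counts could be read inside an active SLURM session. This is not the content of the gist above. It assumes the standard `SLURM_JOB_NODELIST` and `SLURM_JOB_CPUS_PER_NODE` environment variables and the `scontrol show hostnames` command, and it assumes that the CPU count per node is the slot count one wants, which may not hold with hyperthreading or multiple CPUs per task. The function names are hypothetical.

```python
import os
import re
import subprocess


def parse_cpus_per_node(spec):
    """Expand a SLURM_JOB_CPUS_PER_NODE value such as '4(x2),2'
    into one CPU count per node."""
    counts = []
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", part.strip())
        if match is None:
            raise ValueError(f"unexpected CPU specification: {part!r}")
        cpus = int(match.group(1))
        repeat = int(match.group(2) or 1)
        counts.extend([cpus] * repeat)
    return counts


def job_hosts():
    """Expand the compressed node list of the current job.
    Only works inside a SLURM allocation with scontrol available."""
    out = subprocess.run(
        ["scontrol", "show", "hostnames", os.environ["SLURM_JOB_NODELIST"]],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def job_allocation():
    """Return (hostname, cpus) pairs for the current job."""
    hosts = job_hosts()
    cpus = parse_cpus_per_node(os.environ["SLURM_JOB_CPUS_PER_NODE"])
    if len(hosts) != len(cpus):
        raise RuntimeError("node list and CPU counts do not match")
    return list(zip(hosts, cpus))
```

Keeping the parsing in a pure function makes it easy to check outside a cluster; only `job_hosts` and `job_allocation` need a running job.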

@fsimonis fsimonis self-assigned this Aug 9, 2024
@fsimonis fsimonis requested a review from davidscn August 9, 2024 11:30