HCCL Demo does not work for multi-node. #42
Comments
@veritas9872 thank you for reporting.
Hello. I have been testing with the version updated three weeks ago, where the
I will try with the latest version. However, I think that this is fundamentally because MPI does not allow running as root by default.
Hi @veritas9872,

These are MPI-related issues that should be handled by the Python wrapper (see the detailed usage explanation in the README). The correct way to use hccl_demo is to run it through the Python wrapper. If that doesn't solve your issues, please share your command.

Regarding your issues:

Issue 1: MPI Requires --allow-run-as-root Flag to Run as Root User

Issue 2: HCCL Demo Error: [find_interface] MPI requires --mca btl_tcp_if_include

Important note regarding container usage: The Python wrapper uses the

We hope these solutions resolve the issues you are experiencing. If you have any further questions or need additional assistance, please do not hesitate to contact us.
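For readers hitting the same two errors, here is a minimal sketch of how the two Open MPI options named in this thread, --allow-run-as-root and --mca btl_tcp_if_include, could be added to an mpirun command line. It is not the hccl_demo wrapper's actual code. The helper name build_mpirun_command, the program ./my_app, and the interface name eth0 are hypothetical placeholders.

```python
import os
import shlex


def build_mpirun_command(program, num_procs, interface=None, as_root=None):
    """Return an mpirun argument list (hypothetical helper, not part of hccl_demo).

    program   -- list of strings: the executable and its own arguments
    num_procs -- number of MPI processes to launch
    interface -- network interface for the TCP BTL, e.g. "eth0" (placeholder)
    as_root   -- force the root flag on or off; None means auto-detect
    """
    if as_root is None:
        # Containers often run as root; Open MPI refuses to start as root
        # unless --allow-run-as-root is passed.
        as_root = hasattr(os, "geteuid") and os.geteuid() == 0

    cmd = ["mpirun", "-np", str(num_procs)]
    if as_root:
        cmd.append("--allow-run-as-root")
    if interface:
        # Restrict Open MPI's TCP transport to one named interface.
        cmd += ["--mca", "btl_tcp_if_include", interface]
    cmd += list(program)
    return cmd


if __name__ == "__main__":
    # "./my_app" and "eth0" are placeholders; substitute your own values.
    example = build_mpirun_command(["./my_app"], 16, interface="eth0", as_root=True)
    print(shlex.join(example))
```

The function returns a list rather than a string so it can be passed directly to subprocess.run without shell quoting problems. The interface name must match one that actually exists on every node.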
Hello. I am trying to set up the HCCL demo for multiple nodes.
However, I get the following error message when trying to run the commands in a container.
--run-as-root flag to run as root user.
--mca btl_tcp_if_include <interface>.
I think that both issues should be addressed for demos.