Problem with intra-node communication when using openMPI in containerized environment #8958
Can you please elaborate - two processes talking to each other within the same container? Or two separate containers talking to each other?
2 processes, each inside its own separate container, talking to each other. In this case direct CMA is actually not possible, I guess ...
I ran UCX in debug mode:
Any idea what the ...
For isolated processes in a user namespace (container), the following code will detect CMA.
Just looking at the underlying ptrace system ...
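For context, one quick way to see why CMA may be refused between containerized ranks is to compare their namespaces and the YAMA ptrace setting. This is only a sketch; `PID_A` and `PID_B` are placeholders for the two rank PIDs on the node:

```bash
# CMA (process_vm_readv/writev) is subject to the same permission checks as ptrace,
# so the peers normally need to share a user namespace and YAMA must allow attaching.
# PID_A and PID_B are placeholders for the two MPI ranks on the same node.
cat /proc/sys/kernel/yama/ptrace_scope               # 0 = classic ptrace permissions
readlink /proc/$PID_A/ns/user /proc/$PID_B/ns/user   # should point to the same namespace
readlink /proc/$PID_A/ns/pid  /proc/$PID_B/ns/pid
```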
I came across the same error in an Apptainer container too, and although I am not sure whether it is relevant, I added the flag ...
You mean that adding the option ...
I run the container with something like ...
Interesting! Do you know what this option does?
From the docs, it says it runs the container in a new inter-process communication namespace, and I am not sure why it helps. Also I don't know if there is any performance impact.
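For reference, a minimal sketch of such a run, assuming the option being discussed is Apptainer's `-i/--ipc` flag (start the container in a new IPC namespace); the image and application names are placeholders:

```bash
# YOUR.sif and ./your_mpi_app are placeholders, not the poster's actual setup
mpirun -np 2 apptainer exec --ipc YOUR.sif ./your_mpi_app
```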
If CMA is actually used then one should see quite a performance boost ...
The solution seems to be accepted by the Apptainer developers, see:
Great, good to know that setting ...
Started related PR #9213
apptainer: more details on ...
Fixed in UCX master branch.
@tvegas1 I wrote a report related to this issue and the other issue ...
Your report is protected ...
@denisbertini My mistake. I updated the link above. The first one was a link to a staging server (^^;
Very good, I am trying to find a possible solution for this problem using apptainer instance. I think it is definitely the way to go.
The current workaround is to use apptainer instance; please see the very end of the report.
OK, but how can this work together with a scheduler, for example SLURM?
Is there a timeline for apptainer 1.3.0?
@denisbertini I also showed an example job script for SLURM in the report (please see the section "Multi-node test with Slurm").
@panda1100 In your multi-node test you used ssh ... this looks like a hack, and we do not allow ssh on compute nodes.
@denisbertini Yes, this is kind of an ugly solution right now. apptainer/apptainer#1583 will internally handle this without ssh.
Without ssh, a simple bash script along these lines (the instance check here is a sketch of the original pseudocode condition):

```bash
# run inside the instance if it is already started, otherwise start it first
if apptainer instance list | grep -q INSTANCE_NAME; then
    apptainer run instance://INSTANCE_NAME YOUR_APP
else
    apptainer instance start YOUR.sif INSTANCE_NAME
    apptainer run instance://INSTANCE_NAME YOUR_APP
fi
```
... does not work so far
May be linked to ...
Yes, I tried to add it at the end, so:
It works though, but with this kind of errors:
If I remove the ...
If your Slurm doesn't kill processes at the end of the job step, ...
Do you know a way to tell apptainer instance not to write output logs in the $HOME directory?
Please see the bind mount option: https://apptainer.org/docs/user/main/bind_paths_and_mounts.html#user-defined-bind-paths
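For reference, the user-defined bind path syntax from those docs looks like this (the paths, image, and application names are examples only):

```bash
# bind a host directory to a path inside the container
apptainer run --bind /data/on/host:/data/in/container YOUR.sif YOUR_APP
```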
You mean you can change the location of the ... by using the mount option?
@denisbertini To change it, ... This environment variable is for specifying the directory to use for per-user configuration. The default is ...
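A minimal sketch of using it, assuming the variable in question is `APPTAINER_CONFIGDIR` (the per-user configuration directory, which is also where instance logs normally end up); the scratch path is a placeholder:

```bash
# redirect Apptainer's per-user configuration directory away from $HOME
export APPTAINER_CONFIGDIR=/scratch/$USER/apptainer-config
mkdir -p "$APPTAINER_CONFIGDIR"
apptainer instance start YOUR.sif INSTANCE_NAME
```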
No, even setting this variable, apptainer still wants to write something in the $HOME/.apptainer directory ...
@denisbertini Oh, thank you for the pointer. I'll work with the team on that.
@denisbertini How about explicitly changing the HOME environment variable inside the Slurm job script? I haven't tested it yet though. This is an example,
and bind mount ...
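The example that followed is not captured here; a rough sketch of the idea, with all paths, SBATCH values, and application names as placeholders:

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=2

# point HOME at a writable scratch area so ~/.apptainer is created there instead,
# and bind the real home directory back into the container if the app needs it
REAL_HOME=$HOME
export HOME=/scratch/$USER/fake-home
mkdir -p "$HOME"

mpirun -np 2 apptainer exec --bind "$REAL_HOME" YOUR.sif ./your_mpi_app
```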
The problem I see, even using this approach, is that the instances are not visible anymore:
The problem is linked to ...
BTW, with the modified script, only the ...
Thank you @denisbertini, I also replicated the issue.
@denisbertini The fix (apptainer/apptainer#1666) has been merged and will be released as Apptainer v1.2.3.
Great! Thanks ... any timeline for the release of v1.2.3?
@denisbertini At least from the completion status at https://github.com/apptainer/apptainer/milestones, it looks like it will be soon ...
@panda1100 Instead of just scripts, could one imagine a better integration of apptainer instance via, for example, a SLURM SPANK plugin? What do you think?
Then only one instance would be created per compute node in the prolog phase and stopped in the epilog phase ...
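The same idea, written as plain Slurm prolog/epilog scripts rather than a SPANK plugin (a sketch; the image path and instance name are placeholders):

```bash
# prolog.sh -- runs once per allocated node before the job starts
apptainer instance start /path/to/YOUR.sif mpi_${SLURM_JOB_ID}

# epilog.sh -- runs once per node after the job ends
apptainer instance stop mpi_${SLURM_JOB_ID}
```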
This workaround is only a temporary solution until the permanent solution is released. The permanent solution we are planning is apptainer/apptainer#1583. This RFE will implement a runtime option something like ...
@panda1100 Is there already an apptainer patch available which includes the fix related to the proper behavior of CONFIGDIR for ...
I would like to test it on our cluster.
@denisbertini Not merged yet, but I tested with apptainer/apptainer#1672. These are the procedures I used on Rocky Linux 9.2:
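The exact procedure is not captured above; a rough sketch of building from the PR branch (package names and build steps follow the upstream install instructions and are assumptions, not the poster's exact commands):

```bash
# build dependencies (EL9-style package names; adjust as needed)
sudo dnf install -y golang gcc make libseccomp-devel squashfs-tools cryptsetup

# fetch the PR branch and build
git clone https://github.com/apptainer/apptainer.git
cd apptainer
git fetch origin pull/1672/head:pr-1672 && git checkout pr-1672
./mconfig
make -C builddir
sudo make -C builddir install
```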
This will not be necessary anymore, since we already have the release v1.2.3.
@panda1100 Do you know if there is a recommended version of the Linux kernel needed to properly run ...
@denisbertini Interesting. Please create an issue on the Apptainer repo; let's work on it there.
Done.
When using openMPI 4.1.5 together with UCX v1.14.0 installed inside a container (apptainer), it seems that communication via direct address space between processes is not permitted:
It looks like the optimized communication between processes within one node is not allowed because both processes are launched inside a container. Do you know a way around this limitation?
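Two things that can help while debugging this (a sketch, not from the original report; the image and application names are placeholders):

```bash
# 1) check which transports UCX actually detects inside the container
apptainer exec YOUR.sif ucx_info -d | grep -i cma

# 2) as a temporary workaround, exclude the cma transport so UCX falls back
#    to another intra-node path such as shared memory
export UCX_TLS=^cma
mpirun -np 2 apptainer exec YOUR.sif ./your_mpi_app
```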