
add OpenMPI flux orte module #923

Closed
garlick opened this issue Dec 13, 2016 · 14 comments

@garlick
Member

garlick commented Dec 13, 2016

OpenMPI requires modules for each launcher it supports. We've added one for Flux's PMI in rhc54/ompi#1. Also in that PR it was noted that ORTE (client side) also needs one to support direct launch, as opposed to launch via mpirun.

A skeletal implementation (based on the SLURM one I think) was added in rhc54/ompi@12bef7f

@grondo I was hoping maybe you could have a look at this and we could discuss it this morning before I attempt to work it over for Flux. There are a few different use cases evident there that I'm not sure we need in Flux, but I may be missing something, for example "running in a job step" versus not (but still running in an allocation).

@grondo
Contributor

grondo commented Dec 13, 2016

I know nothing about orte/schizo, but it appears what they are attempting to support is the difference between an ompi app run in a Flux session with a Flux parallel launcher (e.g. wreckrun ./my-app), direct launch (./my-app), and running under a Flux instance but launched with ompi's mpirun (?) (which I guess would require some other orte/flux plugin to work anyway?)

Sorry if that was no help. Happy to discuss further in the office.

@garlick
Member Author

garlick commented Dec 13, 2016

That's helpful.

Here's a summary of how I read the code and how I think it needs to change:

  • module is not loaded if not an app or FLUX_JOB_ID is not set, therefore we know on entry that we are an app running under wreckrun
  • module detects running under mpirun and skips to the end (this is not yet supported as it would require flux to launch orte(?) but could be in the future, so leave it)
  • module checks SLURM_NODELIST and if unset, skips to the end assuming "not in an allocation" (this code block can be deleted)
  • module checks SLURM_STEP_ID and if not set, runs as singleton (this code block can be deleted)
  • module checks SLURM_CPU_BIND_TYPE and SLURM_CPU_BIND_LIST and leaves hints for orte to set up binding or not (this code block should unconditionally tell orte not to attempt binding)

Does that make sense?

@grondo
Contributor

grondo commented Dec 13, 2016

module is not loaded if not an app or FLUX_JOB_ID is not set, therefore we know on entry that we are an app running under wreckrun

What if we're running flux under flux-wreckrun, isn't FLUX_JOB_ID still exported?

$ src/cmd/flux start -s 4 flux wreckrun -n8 flux start env | grep FLUX_JOB
FLUX_JOB_SIZE=8
FLUX_JOB_ID=1
FLUX_JOB_NNODES=4

@garlick
Member Author

garlick commented Dec 13, 2016

FLUX_JOB_ID is only set if running under flux-wreckrun.

@garlick
Member Author

garlick commented Dec 13, 2016

Oh I just realized what you are saying. Should we be purging that from the environment in a new flux instance?

@grondo
Contributor

grondo commented Dec 13, 2016

Yeah, I was thinking of the same thing, but I'm having trouble remembering the strategy for a child instance to connect to its parent, know if it is a child of Flux or something else, etc... I think it is probably ok, but it does get a little tricky to think about.

@garlick
Member Author

garlick commented Dec 13, 2016

The broker both opens the enclosing instance and caches the URI of the enclosing instance in a broker attribute, so I think that case is covered?

@grondo
Contributor

grondo commented Dec 13, 2016

I'd say clear that variable for now from flux-start/flux-broker, then.

@garlick
Member Author

garlick commented Dec 17, 2016

This was upstreamed in open-mpi/ompi#2597
Minor flux-core changes went in with #926 and #921

@trws if you have a chance to poke at this please use flux-core master and ompi master from the repo referenced above.

  1. configure ompi with --prefix (no flux config needed) and do a make install to populate prefix.
  2. build flux-core master as usual, no need to install
  3. build MPI test program with ompi's mpicc e.g. prefix/bin/mpicc (t/mpi/hello.c is available for this if you like)
  4. start a Flux session out of the flux-core source tree
  5. flux wreckrun [options] program

@trws
Member

trws commented Dec 18, 2016 via email

@trws
Member

trws commented Dec 19, 2016 via email

@trws
Member

trws commented Dec 19, 2016

Static also works as expected. I toyed with getting the system MPIs to load the component, but that doesn't fly, looks like we'll need new builds of OpenMPI to make use of this. Regardless, it works great.

@garlick
Member Author

garlick commented Dec 19, 2016

Thanks!

BTW it should work without pkg-config or any other configure options. That's only necessary in conjunction with the static build. By default it dlopens our libpmi.so, following the FLUX_PMI_LIBRARY_PATH environment variable (set by Flux) at runtime.

@garlick
Member Author

garlick commented Dec 28, 2016

Let's call this done.
