flux-mini run: full featured version for wreck parity #2150

Closed · garlick opened this issue May 7, 2019 · 15 comments

@garlick (Member) commented May 7, 2019

Following up on a discussion with @dongahn:

It may be useful to add a flux srun command to both flux 0.11 (wreck) and master (new exec system) that superficially mimics SLURM's srun. We could keep it stable over the 0.11 to 0.12 transition, while the "real" porcelain flux run is being developed so we can roll out 0.12 on corona and sierra without breaking the world.

It may also be useful for users who have hardwired srun commands in test suites, etc., easing their transition to Flux, which can run in more environments (thus making their test suites more portable).

The caveat is that srun has a ton of options and building a full replica is not viable; nor would the returns likely justify going beyond the simplest, most commonly used options and behaviors.
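
As a purely illustrative sketch of that scope (nothing here is a committed interface; the flux-side options are assumed from what flux mini run and flux wreckrun accept today):

```sh
# A Slurm invocation a user may already have hardwired in a test suite:
srun -N2 -n8 ./mpi_app

# Roughly equivalent call on master with the new exec system
# (option names assumed from the current flux-mini porcelain):
flux mini run -N2 -n8 ./mpi_app

# ...and on 0.11 the same wrapper could translate to wreck:
flux wreckrun -N2 -n8 ./mpi_app
```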

Let's collect some design requirements for this thing here.

@dongahn (Member) commented May 7, 2019

Thanks @garlick for opening this issue. I will soon be talking to some of the key SNL users on Sierra to collect their requirements.

@dongahn (Member) commented May 15, 2019

Starting to get some feedback from SNL users:

Anthony Agelastos at SNL is directly working with the SPARC (SNL ATDM code) team and helping them run their simulations on Sierra as part of their ATCC-7 campaign. He is interested in running SPARC under Flux to do some evaluations of Flux as a means for doing testing of multiple invocations within a single allocation.

@dongahn (Member) commented May 15, 2019

The point of contact for SNL's SPARTA code is Stan Moore:

On Trinity, his code uses the following srun options: --cpu_bind=core (or threads), -u, --exclude, -N, --ntasks-per-node. -u is used as it is helpful for debugging.
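
For concreteness, those options composed into one hypothetical command line (the counts, excluded host, executable, and input file are all made up):

```sh
# Illustrative only -- every value below is hypothetical
srun -u -N4 --ntasks-per-node=32 --cpu_bind=cores \
     --exclude=node042 ./spa_mpi < in.sparta
```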

@dongahn (Member) commented May 15, 2019

Stan also said he initially had issues with unexpected affinity behavior from jsrun, so we should pay attention to this problem with our solution too. Support, tools, and techniques to debug and visualize binding have helped SPARTA: the js_task_info script along with https://jsrunvisualizer.olcf.ornl.gov have been helpful for him.

@dongahn (Member) commented May 15, 2019

Ross Bartlett for SNL's SPARC code:

From a testing perspective, we want to be able to run multiple MPI jobs on the same allocated nodes at the same time. This is so that we can run lots of smaller tests (i.e. 4 MPI ranks and just a few threads per rank) on nodes that have lots of cores and multiple GPUs. That is not the “Production” usage of these machines, but we burn up a lot of wall-clock time using, say, only 8 cores at a time on nodes that have many times that number of cores and therefore have many idle cores.

Therefore, I think discussions about the usability of LSF vs. SLURM are less important for that use case. We just need to know the magical options for each system to accomplish what we want.

It seems his use case could directly benefit from the current capability even without flux srun.
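
For reference, a minimal sketch of that current capability, written with today's flux-mini commands (test names and sizes are hypothetical):

```sh
# Inside a Flux instance that owns the whole allocation, queue many small
# MPI test jobs; the scheduler packs them onto otherwise-idle cores.
for t in test_a test_b test_c; do
    flux mini submit -n4 -c2 ./run_test.sh "$t"
done
flux jobs           # watch the tests run side by side
flux queue drain    # block until all submitted jobs have completed
```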

@dongahn (Member) commented May 15, 2019

Also from the exchange with Ross (our reply to him, followed by his response):

This is actually what Flux can do very well and has already proven to be effective with other production use cases on Sierra. If you want, we can send you some examples on how you can achieve this with Flux on Sierra. Do you want me to create a separate discussion thread?

Yes, we would like to see that. And we can try these on our existing ATS-2-like systems on SNL like ‘ride’, ‘waterman’, and ‘vortex’ as well. That would make a huge impact on the utilization of these machines for running automated tests.

I still need to figure out whether getting a node allocation with LSF first and then flux-submitting those small tests into the Flux instance on that allocation meets their requirements.
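
That workflow might look roughly like the sketch below on an LSF-managed ATS-2 system (the bsub/jsrun flags and job sizes are illustrative and vary by site):

```sh
# 1. Get a node allocation from LSF (on CORAL systems -nnodes requests nodes).
bsub -nnodes 4 -W 60 -Is /bin/bash

# 2. Launch one Flux broker per node across the allocation with jsrun
#    (resource-set flags here are a site-dependent sketch).
jsrun -n 4 -a 1 -c ALL_CPUS -g ALL_GPUS --bind=none flux start

# 3. Inside the Flux instance, flux-submit the small automated tests;
#    Flux co-schedules them on the allocated nodes.
flux mini submit -n 4 -c 2 ./small_test.sh
flux queue drain    # wait for everything to finish
```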

@dongahn (Member) commented May 15, 2019

Rich Drake for SNL's SIERRA code:

Searching Sierra's launch scripts, I only see the use of these:

-n

--multi-prog

And the ":" delimiter for MPMD execution.

Of course, if binding needs to be specified, then we will use that.

Seems MPMD support is a gap... any idea how easy or difficult it would be to match this srun option?

@dongahn (Member) commented May 15, 2019

More from Rich:

We use --multi-prog on ATS-1 for all our MPMD-type coupled executions (I think all examples are two codes in a partitioned core set). If there were a way to run these without using --multi-prog, I'm sure we could make that work.

@SteVwonder (Member) commented:

> Seems MPMD support is a gap... any idea how easy or difficult it would be to match this srun option?

@garlick and I just chatted briefly about that. We could probably mimic the exact behavior with a wreck plugin, but we could also leverage nested instances to achieve the co-scheduling. In the nested-instance case, we (or the user) would just need to make sure that the total amount of work submitted to the sub-instance is not more than what the resources allocated to the instance can handle (i.e., ntasks cannot be greater than ncores). We could probably wrap that logic up in a flux multiprog command, or even add it as a flag to flux capacitor.
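
To make the comparison concrete, here is a sketch of the Slurm side next to the nested-instance style workaround described above (file and executable names are made up, and the flux commands reflect the current flux-mini interface rather than a committed design):

```sh
# Slurm MPMD today: --multi-prog takes a config file that maps task ranks
# to programs, e.g. multi.conf containing:
#   0-3  ./fluid_code
#   4-7  ./structure_code
srun -n8 --multi-prog multi.conf

# Possible Flux workaround: co-schedule the two codes as separate jobs in
# an instance sized to hold both.  Note this yields two independent MPI
# jobs rather than one shared MPI_COMM_WORLD, which may or may not be
# sufficient for tightly coupled codes.
flux mini submit -n4 ./fluid_code
flux mini submit -n4 ./structure_code
```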

@dongahn (Member) commented May 15, 2019

Yeah, it seems like Rich is hinting that he could make use of a general co-scheduling capability. For a one-off option like this, it would be wise to invite users like Rich to firm up our solution through co-design. Looks like we will have to divide and conquer across the different SNL teams a bit for effective communication going forward.

@garlick (Member, Author) commented May 16, 2019

Great info we are accumulating here. It is sort of difficult to decide which options to support; the goal is to provide a stable porting target, not an srun clone.

My suggestion is to start with the options that are supported in flux jobspec and make a super simple wrapper for the plumbing commands in master, and backport those to a 0.11 script that translates to wreckrun options.

Possibly this will help us identify some missing plumbing in master for synchronization, I/O, etc. that will be good short term work items.
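
A back-of-the-envelope sketch of what such a wrapper could do on master, chaining the existing plumbing commands (the command names exist today; everything else is illustrative):

```sh
# Generate jobspec from srun-like options, submit it, print the jobid,
# then attach so stdio and the exit status flow back to the user.
jobid=$(flux jobspec srun -N2 -n4 ./mpi_app | flux job submit)
echo "$jobid"
flux job attach "$jobid"
```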

@garlick changed the title from "idea: flux-srun command as transition from flux-wreckrun" to "flux-srun: full featured version for wreck partiy" on Sep 12, 2019
@garlick (Member, Author) commented Sep 12, 2019

Meeting discussion:

  • This will be our main user-facing run command for the wreck parity release.
  • It should probably be a Python program.
  • It would call a Python module factored out of flux jobspec srun rather than running that program.
  • It needs to manage its own options, not pass all of them through to flux jobspec srun.
  • It needs to be able to show the jobid.
  • It needs to be able to pass the verbose option, etc., to flux job attach.

@grondo (Contributor) commented Sep 23, 2019

We should move any requirements gathered here into #2379, then close.

@chu11 changed the title from "flux-srun: full featured version for wreck partiy" to "flux-srun: full featured version for wreck parity" on Sep 23, 2019
@grondo closed this as completed on Sep 26, 2019
@grondo reopened this on Sep 26, 2019
@grondo (Contributor) commented Sep 26, 2019

Sorry for the noise; I didn't mean to close this (stray mouse click).

@garlick changed the title from "flux-srun: full featured version for wreck parity" to "flux-mini run: full featured version for wreck parity" on Sep 26, 2019
@garlick (Member, Author) commented Sep 30, 2019

Opened a couple of issues to track outstanding items, and closing this one.

@garlick closed this as completed on Sep 30, 2019