simple job shell #1335

Closed
garlick opened this issue Feb 12, 2018 · 5 comments

Comments

@garlick
Member

garlick commented Feb 12, 2018

Implement a simple job shell that takes J and Rlocal as input and figures out which processes to run, setting the MPI rank, etc.

The role of the job shell is described in RFC 15. It runs as the user, and spawns potentially multiple user procs.

Implement PMI service for user processes.

Broadcast stdin and collect stdout, stderr back to KVS guest namespace.
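
A rough sketch of the shape described above, not the eventual job shell: spawn several local processes, give each one a rank through its environment, send the same stdin to all of them, and collect stdout/stderr per rank. Everything named here is an assumption made for illustration (`run_local_tasks`, the `DEMO_TASK_RANK` variable); it does not read J or Rlocal, it does not implement PMI, and the collected output is returned to the caller rather than written to a KVS namespace.

```python
import os
import subprocess
import sys


def run_local_tasks(argv, ntasks, first_rank=0, stdin_data=b""):
    """Spawn ntasks copies of argv and collect their output.

    Returns a dict mapping rank -> (returncode, stdout bytes, stderr bytes).
    """
    procs = {}
    for i in range(ntasks):
        rank = first_rank + i
        env = dict(os.environ)
        # Hypothetical variable name, not a Flux or PMI variable.
        env["DEMO_TASK_RANK"] = str(rank)
        procs[rank] = subprocess.Popen(
            argv,
            env=env,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )

    results = {}
    for rank, proc in procs.items():
        # "Broadcast" stdin: every task receives the same bytes.
        out, err = proc.communicate(stdin_data)
        results[rank] = (proc.returncode, out, err)
    return results


if __name__ == "__main__":
    child = [
        sys.executable,
        "-c",
        "import os, sys; "
        "sys.stdout.write(os.environ['DEMO_TASK_RANK'] + ':' + sys.stdin.read())",
    ]
    for rank, (rc, out, err) in sorted(run_local_tasks(child, 2, stdin_data=b"hello").items()):
        print(rank, rc, out, err)
```

A real shell would read output incrementally instead of buffering it all with `communicate()`, but the sketch keeps the rank assignment and the stdin/stdout plumbing visible.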

@SteVwonder
Member

One post-it note on the timeline added a requirement to the job shell: "stdout/err piped without KVS"

@grondo
Contributor

grondo commented Jul 10, 2018

> One post-it note on the timeline added a requirement to the job shell: "stdout/err piped without KVS"

Thanks. @trws may want to add some detail (I think that was his requirement), e.g. stdout/err piped where? What is an example use case (writing to a file without transiting the KVS)?

@trws
Member

trws commented Jul 10, 2018

I suppose this is two parts, but having the ability to support the equivalent of the current -o functionality without going through the KVS is probably the most immediately relevant part. The improvements in watch handling have definitely helped here, but going through the KVS still causes some serious overhead.

The other part is generically being able to do standard output/error handling without having all of it persist in the KVS. We've worked around the issue for pilot2 by just not using those streams, or rather not letting flux see any data on them, but, especially without KVS GC (#258), it's a potentially serious source of bloat in the KVS. I rate this as secondary because, at least from my perspective, this seems to be the harder part, and being able to use file-directed output cheaply would make it easy to work around wherever it would be an issue.

@grondo
Contributor

grondo commented Jul 11, 2018

Thanks @trws!

That description seems pretty clear. I think we were leaning toward a single writer for output in the kvs, so that single writer could redirect to a file instead of kvs fairly simply.
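
To illustrate the single-writer idea, here is a minimal sketch, assuming all task output funnels through one object that holds a pluggable sink. The class names and the key layout are hypothetical, and `DictSink` is only an in-memory stand-in for the KVS, not the Flux KVS API.

```python
import io


class DictSink:
    """Stand-in for a KVS: appends data under a per-rank, per-stream key."""

    def __init__(self):
        self.store = {}

    def write(self, rank, stream, data):
        key = f"output.{rank}.{stream}"  # hypothetical key layout
        self.store[key] = self.store.get(key, b"") + data


class FileSink:
    """Writes output straight to a binary file object; nothing is retained."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def write(self, rank, stream, data):
        self.fileobj.write(data)


class OutputWriter:
    """The single writer: every chunk of task output passes through here."""

    def __init__(self, sink):
        self.sink = sink

    def handle(self, rank, stream, data):
        self.sink.write(rank, stream, data)


if __name__ == "__main__":
    # Same call site, two destinations: only the sink changes.
    kvs_like = DictSink()
    OutputWriter(kvs_like).handle(0, "stdout", b"hello\n")

    buf = io.BytesIO()  # a real file opened in "wb" mode would work the same way
    OutputWriter(FileSink(buf)).handle(0, "stdout", b"hello\n")
    print(kvs_like.store, buf.getvalue())
```

Because the tasks never talk to the destination directly, switching from KVS-backed output to a file is a change to how the writer is constructed rather than to every producer.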

If the output is kept in the kvs, it could optionally be removed before the private namespace is 'linked in' to the main namespace during job completion/reaping. However, I don't think that is enough for a quick-and-dirty GC unless we also had a way to tell the KVS not to send certain data to the content store.

@grondo
Contributor

grondo commented Oct 1, 2019

Closing this stale issue since we've now got a job shell.

@grondo grondo closed this as completed Oct 1, 2019