
Atomic job submit and job eventing subscription and etc #1534

Closed
dongahn opened this issue May 23, 2018 · 12 comments

@dongahn
Member

dongahn commented May 23, 2018

@grondo suggested opening an issue from #1332 (comment)

As I was helping @koning, we encountered two issues (or perhaps just nuances) in his use of the job.submit RPC.

He is currently using the JSC interface similar to this example code.

The first thing he noticed was that he couldn't easily pass the jobid of interest (only one job) at the time jsc.notify_status was called, because the jobid only comes back with the response to job.submit.

As @grondo suggested, this can also be worked around by calling job.create first and then job.submit-nocreate. For simplicity, though, he currently stores the jobid as a member variable of a Python class, which happens before reactor_run() and thus before the first invocation of his job status callback.

The second thing: his use case is such that the job wrapper only needs to monitor the state changes of a single job, but this JSC method invokes the job status callback for each state change of every job, which would perturb job execution.

We could hide this by exposing a richer abstraction through JSC, but I was wondering if we should actually consider changing the topic string for a job event to include the jobid, so that one can subscribe to state events for a specific job.
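
For illustration, if the topic did include the jobid, a wrapper could subscribe to just its own job, something like this sketch (the job.state.&lt;jobid&gt; topic is the proposal here, not current behavior):

```c
#include <flux/core.h>
#include <inttypes.h>
#include <stdio.h>

/* Assuming an open handle h and a jobid obtained from the job.submit
 * response. The "job.state.<jobid>" topic is hypothetical. */
void subscribe_to_my_job (flux_t *h, int64_t jobid)
{
    char topic[64];
    snprintf (topic, sizeof (topic), "job.state.%" PRId64, jobid);
    if (flux_event_subscribe (h, topic) < 0)
        perror ("flux_event_subscribe");
}
```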

These are certainly not high priority, but I'm raising them so that we can factor them into our new execution system design.

@garlick
Member

garlick commented May 23, 2018

(thinking of the new interface) what about an RPC interface for submit where you can optionally get multiple responses, one for each state transition, until a terminal state is reached?

he currently stores the jobid as a member variable of a Python class, which happens before reactor_run() and thus before the first invocation of his job status callback.

When you subscribe to events but delay entering the reactor, those events are not lost; they are queued in the flux_t handle. I think as long as you subscribe, then submit the job, then enter the reactor and filter events based on the jobid, you should pick up all the state transitions.

The downside is you get a lot of events that you must ignore, but maybe OK for now?
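
A minimal sketch of that ordering, assuming an event per state change; the topic string and payload keys are illustrative rather than the exact wreck-era names, and error handling is mostly omitted:

```c
#include <flux/core.h>
#include <inttypes.h>
#include <stdio.h>

static int64_t jobid = -1;

/* Drop events for other jobs; only report transitions for ours. */
static void state_cb (flux_t *h, flux_msg_handler_t *mh,
                      const flux_msg_t *msg, void *arg)
{
    int64_t id;
    const char *state;
    if (flux_event_unpack (msg, NULL, "{s:I s:s}",
                           "jobid", &id, "state", &state) < 0
        || id != jobid)
        return;
    printf ("job %" PRId64 " -> %s\n", id, state);
}

int main (void)
{
    flux_t *h = flux_open (NULL, 0);
    struct flux_match match = FLUX_MATCH_EVENT;
    match.topic_glob = "job.state";               /* illustrative topic */
    flux_msg_handler_t *mh = flux_msg_handler_create (h, match,
                                                      state_cb, NULL);
    flux_msg_handler_start (mh);

    /* 1. Subscribe first: matching events now queue in the flux_t handle. */
    flux_event_subscribe (h, "job.state");

    /* 2. Submit, and learn the jobid from the response.
     * (Request payload elided; a real submit includes the job description.) */
    flux_future_t *f = flux_rpc (h, "job.submit", "{}", FLUX_NODEID_ANY, 0);
    flux_rpc_get_unpack (f, "{s:I}", "jobid", &jobid);
    flux_future_destroy (f);

    /* 3. Enter the reactor; queued and future events reach state_cb. */
    flux_reactor_run (flux_get_reactor (h), 0);
    flux_close (h);
    return 0;
}
```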

@garlick
Member

garlick commented May 23, 2018

what about an RPC interface for submit where you can optionally get multiple responses, one for each state transition, until a terminal state is reached?

Or maybe a better interface for this use case would be to mimic fork/exec(2) and wait(2): one API call to submit a job and obtain a jobid, and another to block until the jobid reaches a terminal state and then return exit code(s) or other errors. Do users need to track the intermediate states?
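
A possible shape, sketched as declarations only (both function names are invented for illustration; nothing like them exists yet):

```c
#include <flux/core.h>
#include <stdint.h>

/* Hypothetical two-call interface mirroring fork/exec(2) + wait(2). */

/* Submit a job; return its jobid right away (or -1 on error). */
int64_t flux_job_submit (flux_t *h, const char *jobspec);

/* Block until jobid reaches a terminal state; on success fill *status
 * with the exit code(s) and return 0, or return -1 on error. */
int flux_job_wait (flux_t *h, int64_t jobid, int *status);
```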

@grondo
Copy link
Contributor

grondo commented May 23, 2018

Or maybe a better interface for this use case would be to mimic fork/exec(2) and wait(2)

I kind of like this idea -- users or tools may also want to track a stopped state when starting a job for debugger attach, but wait(2) also traditionally handles that.

As a counterpoint, it does seem unnecessarily limiting to restrict the job state(s) you can reliably watch for. However, I'm not sure whether you were proposing that we limit the job states users can watch, or just that we provide a high-level wait(2)-style function that runs a callback on a subset of job states (complete, stopped) while still allowing low-level calls to watch any job state.

@garlick
Member

garlick commented May 23, 2018

I guess no reason to limit the states that can be waited for.

I should also note that, if implemented as proposed in RFC 16, status in the KVS will be a log, so calling wait on a state that has already been passed through can be handled reliably, without races.
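
A toy illustration of why the log makes this race-free (the string-array log representation here is invented; RFC 16 would define the real format):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Because state history is append-only, a wait request for a state the
 * job already passed through can be answered by scanning history,
 * instead of hoping to catch a future event. */
static bool state_already_reached (const char **log, size_t len,
                                   const char *state)
{
    for (size_t i = 0; i < len; i++)
        if (strcmp (log[i], state) == 0)
            return true;    /* fulfill the wait immediately */
    return false;           /* otherwise, watch for the next transition */
}
```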

There is a list of states in RFC 16 that probably needs to be amended, too (no stopped state, for example).

@garlick
Member

garlick commented May 23, 2018

However, I'm not sure whether you were proposing that we limit the job states users can watch, or just that we provide a high-level wait(2)-style function that runs a callback on a subset of job states (complete, stopped)?

I was thinking of the wait function as a regular RPC that blocks until a particular state (or one of a set of states) is reached, and then is fulfilled. Rather than implementing a new callback, it would return a future and offer a "get" function for retrieving the state and any other information associated with the state transition (like exit code or other failure).
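
On the caller side, that might look something like the following sketch, where the job.wait topic and its payload keys are invented for illustration:

```c
#include <flux/core.h>
#include <stdio.h>

/* Assuming an open handle h and a known jobid. "job.wait" is a
 * hypothetical RPC topic, as are the "state" and "exit_code" keys. */
int wait_for_complete (flux_t *h, int64_t jobid)
{
    flux_future_t *f = flux_rpc_pack (h, "job.wait", FLUX_NODEID_ANY, 0,
                                      "{s:I s:s}",
                                      "jobid", jobid,
                                      "state", "complete");
    const char *state;
    int exit_code;
    /* The synchronous get blocks until the future is fulfilled, i.e.
     * until the state is reached; an async caller would register a
     * continuation with flux_future_then() instead. */
    if (flux_rpc_get_unpack (f, "{s:s s:i}",
                             "state", &state, "exit_code", &exit_code) < 0) {
        flux_future_destroy (f);
        return -1;
    }
    printf ("reached %s, exit code %d\n", state, exit_code);
    flux_future_destroy (f);
    return 0;
}
```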

@grondo
Contributor

grondo commented May 23, 2018

I should also note that, if implemented as proposed in RFC 16, status in the KVS will be a log, so calling wait on a state that has already been passed through can be handled reliably, without races.

Yeah, I meant to say that above: we might be trying to fix a problem we'll no longer have after replacing the current system. If the low-level calls to watch for job states work reliably, then we can always build higher-level functionality from them. (For example, the submit-and-wait call you propose could be built from the submit() API coupled with wait-for-complete.)

One property of wait(2) we'd want to avoid is that only the parent can wait for a child. In this system multiple actors will need to be able to wait on job states and possibly even collect exit status.

There is a list of states in RFC 16 that probably needs to be amended, too (no stopped state, for example).

Is the KVS job schema RFC the right place to enumerate all job states?

This could be a bad idea, but I was wondering if we want to allow job states to be extensible anyway. For example, a different implementation of a job shell might have more or fewer states than what we propose in an RFC. Of course, there would need to be a list of required states, but allowing providers to offer new states might allow some flexibility and experimentation.

@grondo
Contributor

grondo commented May 23, 2018

I was thinking of the wait function as a regular RPC that blocks until a particular state (or one of a set of states) is reached, and then is fulfilled. Rather than implementing a new callback, it would return a future and offer a "get" function for retrieving the state and any other information associated with the state transition (like exit code or other failure).

Yes, that seems straightforward and is what I assumed. I was more asking whether you were proposing this as the only way to wait on job events, and whether the function would be limited to only the "complete" state, as in the wait system call.

Returning "other information associated with state transition" seems like it might be tricky, but I can see how it might be necessary in order to avoid other races.

@dongahn
Member Author

dongahn commented May 23, 2018

Mimicking fork/exec(2) and wait(2) is a good idea. We need to put good mechanisms and APIs in place to avoid races, though, in particular when multiple users are allowed to call wait.

We also need to think about whether wait should be limited to a single job or cover a group of jobs. I'd like the same or related interfaces to be used for the single and multiple cases. Unix wait also has variants like waitpid(2), or waiting for all descendant pids.

@grondo
Contributor

grondo commented May 23, 2018

If the future returned by the wait RPC has a wait_get_jobid accessor, or if the jobid is returned in the payload and can be fetched with unpack(), then it will be simple to call wait on multiple jobids and use the same continuation callback.

Also, it might be really useful to have filtering done on the job-manager side (presuming this is the service that responds to wait RPCs), so that a wait RPC can be initiated with not only jobid filtering but also other filters like username.
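
For example, something like this sketch (same caveat: job.wait and its payload keys are hypothetical):

```c
#include <flux/core.h>
#include <inttypes.h>
#include <stdio.h>

/* One continuation services every outstanding wait, because the jobid
 * comes back in each response payload. */
static void wait_cb (flux_future_t *f, void *arg)
{
    int64_t id;
    const char *state;
    if (flux_rpc_get_unpack (f, "{s:I s:s}",
                             "jobid", &id, "state", &state) == 0)
        printf ("job %" PRId64 " reached %s\n", id, state);
    flux_future_destroy (f);
}

void wait_on_all (flux_t *h, const int64_t *jobids, int njobs)
{
    for (int i = 0; i < njobs; i++) {
        flux_future_t *f = flux_rpc_pack (h, "job.wait", FLUX_NODEID_ANY, 0,
                                          "{s:I}", "jobid", jobids[i]);
        flux_future_then (f, -1., wait_cb, NULL);  /* same cb for all */
    }
}
```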

@garlick
Member

garlick commented May 23, 2018

Excellent comments!

Do we need an abstraction something like "process groups" to allow waiting/signaling groups of jobs?

@grondo
Contributor

grondo commented May 23, 2018

Do we need an abstraction something like "process groups" to allow waiting/signaling groups of jobs?

@dongahn had mentioned that same thought before, and my initial reaction was to use a sub-instance if you need to treat a group of jobs as an ensemble. However, after having time to think about it, there may be use cases that can't be handled this way (for example, backfilling a series of small jobs treated as a group might be more efficient than trying to run them in a sub-instance). Therefore, I do think adding process groups is a great idea.

grondo added a commit to grondo/flux-core that referenced this issue Feb 5, 2019
The wreck exec system is worthless, remove it along with associated
commands, tests, and support code.

Since libjsc doesn't work without wreck, it is removed as well.

Fixes flux-framework#1984

Closes flux-framework#1947
Closes flux-framework#1618
Closes flux-framework#1595
Closes flux-framework#1593
Closes flux-framework#1534
Closes flux-framework#1468
Closes flux-framework#1443
Closes flux-framework#1438
Closes flux-framework#1419
Closes flux-framework#1410
Closes flux-framework#1407
Closes flux-framework#1393
Closes flux-framework#915
Closes flux-framework#894
Closes flux-framework#866
Closes flux-framework#833
Closes flux-framework#774
Closes flux-framework#772
Closes flux-framework#335
Closes flux-framework#249
@grondo
Contributor

grondo commented Feb 13, 2019

closed by #1988

@grondo grondo closed this as completed Feb 13, 2019