Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wreck: use cmb.exec instead of direct fork/exec for wrexecd execution in job module #1508

Merged
merged 9 commits into from
May 8, 2018

Conversation

grondo
Copy link
Contributor

@grondo grondo commented May 5, 2018

This is a very early WIP, posted now for early comments and discussion if necessary.

The end goal of this work is to detect nonzero exit status from launched wrexecd and set jobs to "failed" state, as well as enable the potential to cancel/kill jobs in the starting/runrequest state.

Currently the main work here is to convert the job module from direct fork/exec to the cmb.exec service. At this point failed wrexecd is logged but there is no mechanism yet to cancel the rest of the job.

There is an active_pids hash added in addition to active_jobs here, but that might be removed as this PR moves forward. It was only strictly necessary before flux_future_reset was implemented. Right now it just serves as a way to stash wreck_job structures that have a second reference because they are tied to local wrexecds.

@coveralls
Copy link

coveralls commented May 5, 2018

Coverage Status

Coverage decreased (-0.2%) to 78.801% when pulling 19dd1d2 on grondo:wreck-cmb.exec into 6939f86 on flux-framework:master.

@garlick
Copy link
Member

garlick commented May 6, 2018

This change does seem pretty clean!

In cmb_exec_cb() one thing I noticed is that if flux_future_get() fails, the future is reset, not destroyed. But the rule we just added to RFC 6 states

Servers SHALL stop sending responses to a given request once a response with errnum set to a nonzero value has been sent.

so maybe that case should be fatal as long as the rule holds on the server side?

@grondo
Copy link
Contributor Author

grondo commented May 6, 2018

so maybe that case should be fatal as long as the rule holds on the server side?

Great thanks! I had forgotten about that case. In the current cmb.exec we don't expect an msgs with errnum set, but we should fix that! (and I don't know what purpose resetting the future would server here anyway.)

@grondo grondo force-pushed the wreck-cmb.exec branch from e1d55a3 to 4d76d12 Compare May 8, 2018 16:59
@grondo
Copy link
Contributor Author

grondo commented May 8, 2018

I've reworked this PR to simply replace direct exec from the job module with cmb.exec, making use of flux_future_reset() to accept multiple responses from the cmb.exec server.

The active_pids hash is no longer strictly used, so it was dropped.

Perhaps this simple step forward can be merged until we figure out the best way to recover from failed wrexecd launch.

@grondo grondo force-pushed the wreck-cmb.exec branch from 4d76d12 to 0b95759 Compare May 8, 2018 17:19
grondo added 5 commits May 8, 2018 10:25
Ensure that jobids in the job.kvspath rpc are encoded as integers.
Problem: wreck/io.lua uses a hand-coded rpc to the job.kvspath service
to lookup the kvs path for a jobid if needed, resulting in multiple
call sites to check for fixes and proper args to the rpc.

Use the wreck.id_to_path() function built-in to wreck.lua instead. This
not only consolidates the rpc to a single call site, but also makes
use of memoization of the wreck version of this function.
Remove the last bits of json-c and shortjson from the wreck "job"
module and fully convert to jansson instead.
Add a reference counter to struct wreck_job objects to allow
use in multiple job hashes.
Add tests for wreck_job_incref expected functionality to wreck_job
unit tests.
@grondo grondo force-pushed the wreck-cmb.exec branch 2 times, most recently from 0f41bd2 to 0371bd1 Compare May 8, 2018 17:29
Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I added a few minor review comments inline.

* nor even advisable. With the setsid above, the wrexecd
* process should be reparented to init.
*/
if (flux_future_get (f, &msg) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than unpacking msg piecemeal, would it make sense to call flux_rpc_unpack() once here with s? optional keys for the conditional status key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. I was possibly just preserving the existing structure, unnecessarily.

exit (255);
if (type && strcmp (type, "io") == 0)
spawn_io_cb (h, job, msg);
else if (strcmp (state, "Exited") == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State is optionally unpacked and so may be NULL here.

#endif
exec_handler (wrexecd_path, job);
n = snprintf (buf, sizeof (buf), "--kvs-path=%s", job->kvs_path);
if ((n >= sizeof (buf)) || (n < 0)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is snprintf cannot return < 0, because man page says that only happens on an "output error" which I assume only applies to the fprintf variants. I bring it up not so much to nitpick but to see if you agree because I generally don't check for < 0 result from snprintf and if I'm wrong that's more serious!

Copy link
Contributor Author

@grondo grondo May 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a portability habit I picked up from somewhere else. Though unlikely, snprintf can return -1 on ENOMEM and possibly other errors. For example, found this reproducer on the internet of webpages:

$ cat t.c
#include <stdio.h>
#include <string.h>
#include <errno.h>
int
main(int argc, char **argv)
{
  char buf[200];
  char *fmt = argv[1];
  if (argc < 2)
    return 1;
  int n = snprintf (buf, sizeof buf, fmt, 1);
  int saved_errno = errno;
  printf ("fmt: %s  retval=%d  errno=%s\n", fmt, n,
	   n < 0 ? strerror(saved_errno) : "");
  return 0;
}
$ gcc -o tm t.c
$ bash -c 'ulimit -v 5000; ./tm %$((5*2**20))d' 
fmt: %5242880d  retval=-1  errno=Cannot allocate memory

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, the atrocities I've committed.

}
json_array_append_new (o, json_string (buf));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json_string() can return NULL on error.

struct wreck_job *job = arg;
const char *state = NULL;
int rank;
pid_t pid;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pid is uninitialized, then optionally unpacked, then not used. Initialize or drop it entirely?

const char *wrexecd_path;
char buf [4096];
int n;

if (!(wrexecd_path = flux_attr_get (h, "wrexec.wrexecd_path", NULL))) {
flux_log_error (h, "spawn_exec_handler: flux_attr_get");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spawn_exec_handler -> wrexecd_cmdline_create? (or use __FUNCTION__)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dislike using __FUNCTION__ because you can't effectively search code based on the actual error message. I'll update the error text if that is required, but the error is during spawn_exec...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - reasonable. Maybe I should have said the error prefix in this message is different than the one right after it. And it's minor so not worth making a big deal over.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, the inconsistency is a good catch and it should be fixed.

}
if (!(o = json_array ())) {
flux_log_error (h, "wrexecd_cmdlin_create: json_array");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

misspelled "cmdlin"

@grondo
Copy link
Contributor Author

grondo commented May 8, 2018

Ok, I've pushed some fixes to address your comments @garlick. If they seem ok I'll squash. Sorry this PR's been so sloppy!

@garlick
Copy link
Member

garlick commented May 8, 2018

Not at all! LGTM.

grondo added 4 commits May 8, 2018 14:16
Problem: The job module spawns wrexecd on ranks involved in a job by
direct fork/exec. Not only does this duplicate code from the cmb.exec
service, but this is a fork from a threaded program, it may be
problematic to reap exit status of wrexecd since the job module
may not be successful in registering child watchers.

Switch from direct fork/exec in job module to and rpc to the `cmb.exec`
service on the local broker. Use the new flux_future_reset() facility
to accept multiple responses from the `cmb.exec` service. Finally
destroying the job and the future after the remote wrexecd has exited.

For now non-zero exit codes and failure of exec(2) for wrexecd are
simply logged, and no further action is taken.
Don't create a separate wreck_job object for run event handling,
instead pull the existing job from the active_jobs hash.
Now that all job allocation mechanisms (wreck internal, sched)
use R_lite for resources, drop support for rank.N dirs in
wreck job launch. Update tests accordingly.
Somehow the ngpus tests in t2000-wreck.t were prefixed with a '+'
and more concerning, this just caused the tests to be skipped.
Remove the stray character and fix one of the tests.
@grondo grondo force-pushed the wreck-cmb.exec branch from 4475e94 to 19dd1d2 Compare May 8, 2018 21:16
@grondo
Copy link
Contributor Author

grondo commented May 8, 2018

Ok, I've squashed the fixup commits

@garlick
Copy link
Member

garlick commented May 8, 2018

Restarted a stuck clang builder. I'll merge once that's successful.

@garlick garlick merged commit dc91642 into flux-framework:master May 8, 2018
@grondo grondo mentioned this pull request May 10, 2018
@grondo grondo deleted the wreck-cmb.exec branch May 18, 2018 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants