-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement per-job private KVS namespace #1460
Conversation
Problem: valgrind test fails when private KVS namespace is made part of job setup. Looks like some memory allocated in namespace_create() that was freed in the error path, was not freed in the non-error path. Fix that.
When a job is initiated, create a private KVS namespace for the job and link it into the job's primary namespace subdirectory, e.g. lwj.X.Y.<jobid>.guest -> ns:job42/. This implements the guest namespace described in RFC 16 for wreck, albeit in a KVS base path that does not conform to RFC 16.
When job enters a terminal state (complete or failed), replace symlink to per-job KVS namespace e.g. lwj.X.Y.<jobid>.guest -> ns:job42/. with a snapshot reference, then delete the namespace. Thus the final content of the private namespace is made part of the primary namespace.
Add --kvs-guest option to override the default value of FLUX_JOB_KVSPATH.
Arrange for the FLUX_JOB_KVSPATH environment variable to point to the guest KVS namespace in a launched program, rather than the KVS path in the primary namespace. This changes the KVS path used to look up a sub-instance's FLUX_URI from <job_kvs_dir>.flux.remote_uri to <job_kvs_dir>.guest.flux.remote_uri Update flux-wreck and t2003-recurse.t accordingly.
This is pretty nifty! I know this is an experiment meant to get us to RFC 16 guest namespace, but since the wreck prototype isn't multi-user capable anyway, I wonder if it would be interesting to have an option to replace the whole job kvs directory with a namespace. This would be interesting because
Maybe the third item above wouldn't really be worth pursuing at this point, but I do find it interesting that the list of current Other that that, my only other comment is that I'm confused by the |
Add a sharness test intended to test conformance to RFC 16: KVS job schema. Initially this just checks behavior of the "guest" subdirectory that is a private namespace during program execution.
Looks pretty good to me. One thought:
should there be the opposite test, to make sure |
Good idea. The |
Thanks for the thoughts guys! I'm especially intrigued by the idea of moving the whole job directory into the namespace. It gets us further along and will probably be a simpler change, even if it doesn't wind up looking quite like RFC 16. |
Closing pending rework. |
Now that we have @chu11's excellent work on private KVS namespaces merged, it seems useful to implement per-job "guest" namespaces as envisioned in RFC 16. This PR arranges for the job module to create a guest namespace named
job<jobid>
during job creation, and to "reap" it when the job reaches a terminal state.During execution, the job may access the the namespace via the FLUX_JOB_KVSPATH environment variable, which is set to the direct namespace path, e.g.
ns:job<jobid>/.
. The enclosing instance can access it via thelwj.X.Y.<jobid>.guest
symlink. After the job completes or fails, the namespace is "reaped", e.g. the symlink is replaced with a snapshot of the final content of the private namespace and the namespace is deleted.Migrating job data to a private namespace can potentially reduce the bottleneck on serialized commits in the primary KVS namespace and protects the instance against an unintentional KVS DoS by programs that use the KVS heavily.
In this PR, the only data placed in the private namespace is the
flux.local_uri
andflux.remote_uri
values recently added in support offlux wreck uri
. It is empty if the job is not a flux instance. In a future PR we may want to migrate stdio and other per-job info.