wreck: better integration of global lwj.environ environment for wreck jobs #1405
Excellent! Thanks @grondo. I am sure this in combination with
This is great! Will merge once travis finishes.
This may not make as much difference as expected, unless for some reason each In addition to
Had to force-push a fix for the new tests under
Do you mean have the commands stop writing to the KVS, and instead send their info to the job module to be put in (like request parameters, environment, etc?)
That's too big a change for this implementation I think. I think actually I retract my previous statement. I was at first thinking that if we precreate all directory entries for the job (e.g. rank
restarted clang build that hung here
I think this just needs a rebase? I can take care of it for you @grondo if you're already heading out of town.
wreck execution first reads a global lwj.environ table and uses that as the default environment for all jobs, overridden by any environment variables encoded in the per-job `environ` table. Currently, flux-wreckrun and flux-submit do not check lwj.environ and instead export the entire (filtered) current environment to each job, so lwj.environ is largely ignored. Fix this by first reading a "default environment" from lwj.environ in wreckrun/submit, and only pushing variables that differ from the default env to the job. This should cut down on KVS size for large numbers of jobs with largely the same, but slightly different, environments. (If all environment tables are the same, then this change does nothing, since all environ entries in the KVS will be squashed by CAS.)
Add an option -S, --skip-env to submit and wreckrun to skip the export of current environment to the job. This can be used to speed up submit/wreckrun when the global lwj.environ is sufficient, since that default table doesn't need to be fetched from the KVS before submitting the job to the instance.
Add flux-wreck setenv, getenv, unsetenv commands to set, get, and manipulate the global wreck environment under lwj.environ.
Add tests for the wreck global environment lwj.environ, and the corresponding commands to manipulate and use that environment. Note: the test is split from t2000-wreck.t but still prefixed with t2000 for easier test development, parallel tests, etc. Since this is short-lived code, it is probably ok for now.
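The default/override behavior described in the commit messages above can be sketched roughly as follows. This is only an illustration in Python, not the wreckrun/submit implementation: the function names, the `fetch_default` callable (standing in for the KVS read of lwj.environ), and the sample variables are all hypothetical. It also assumes that a variable present in the default table but absent from the current environment is simply left at its default, which the commit messages do not specify.

```python
def build_job_environ(current, fetch_default, skip_env=False):
    """Return the per-job environ table to store for a job.

    current       -- dict of the submitter's (already filtered) environment
    fetch_default -- hypothetical callable returning the global default table
    skip_env      -- mimics -S, --skip-env: export nothing, never fetch defaults
    """
    if skip_env:
        # The default table is not fetched at all in this case.
        return {}
    default = fetch_default()
    # Keep only variables whose value differs from (or is missing in) the default.
    return {k: v for k, v in current.items() if default.get(k) != v}


def effective_environ(default, per_job):
    """Environment seen by the job: defaults first, per-job entries override."""
    merged = dict(default)
    merged.update(per_job)
    return merged


# Hypothetical sample data for illustration only.
default_env = {"PATH": "/usr/bin", "LANG": "C"}
current_env = {"PATH": "/usr/bin", "LANG": "C", "OMP_NUM_THREADS": "4"}

delta = build_job_environ(current_env, lambda: default_env)
print(delta)
print(effective_environ(default_env, delta) == current_env)
```

With identical environments the per-job table comes out empty, and with `skip_env=True` the `fetch_default` callable is never invoked, which mirrors the saved KVS fetch that the `-S, --skip-env` commit message describes.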
Rebased. Thanks!
Restarted another build that hung in the same t4000-issues test as above.
Thanks, @garlick!
I was thinking that perhaps reducing content in the per-job environment `lwj.x.y.ID.environ` might reduce the size of the content store with many jobs. Turns out in most cases of submitting a long series of jobs, the environment in the KVS is identical and squashed by the content-store, but I thought I'd push up the work I did in support of better integration of the "global" job environment `lwj.environ`, which might be useful if each job has a slightly different environment, or just for better control of the environment.

This PR adds support to wreckrun/submit to first check the `lwj.environ` table by default and only submit any different variables as the per-job environment. A set of utility commands, `flux wreck setenv`, `getenv`, and `unsetenv`, is provided to set and manipulate this global environment. A new option `-S, --skip-env` is also added to `flux-wreckrun` and `submit` to skip exporting the environment completely, so that the global environment is always used.

To export the current environment to all future jobs, use `flux wreck setenv all`. If you want to run all jobs using this environment (`FLUX_URI` and other Flux env vars are set explicitly per-job by wrexecd), then pass `-S, --skip-env` in flux-submit or flux-wreckrun and avoid the extra KVS fetch of the global `lwj.environ`; otherwise any environment differences are exported to the job in the job-specific env table.

E.g.: