Merge pull request #5818 from garlick/housekeeping
job-manager: add support for housekeeping scripts with partial release of resources
Showing 37 changed files with 1,844 additions and 63 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -599,6 +599,8 @@ AC_CONFIG_FILES( \ | |
etc/flux-hostlist.pc \ | ||
etc/flux-taskmap.pc \ | ||
etc/flux.service \ | ||
etc/[email protected] \ | ||
src/cmd/flux-run-housekeeping \ | ||
doc/Makefile \ | ||
doc/test/Makefile \ | ||
t/Makefile \ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Experimental Flux features and interfaces are made available for evaluation | ||
only and may change or be removed without notice. | ||
|
||
Feedback is welcome. Please use the flux-core project GitHub issue tracker. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
Flux: http://flux-framework.org | ||
|
||
Flux RFC: https://flux-framework.readthedocs.io/projects/flux-rfc | ||
|
||
Issue Tracker: https://github.com/flux-framework/flux-core/issues |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
==================== | ||
flux-housekeeping(1) | ||
==================== | ||
|
||
|
||
SYNOPSIS | ||
======== | ||
|
||
| **flux** **housekeeping** **list** [*-n*] [*-o FORMAT*] | ||
| **flux** **housekeeping** **kill** [*--all*] [*-j JOBID*] [*-t HOSTS|RANKS*] [*-s SIGNUM*] | ||
|
||
DESCRIPTION | ||
=========== | ||
|
||
.. program:: flux housekeeping | ||
|
||
The `EXPERIMENTAL`_ housekeeping service provides similar functionality to | ||
a job epilog, with a few advantages: | ||
|
||
- Housekeeping runs after the job, which is then allowed to exit CLEANUP | ||
state and become inactive once resources are released. | ||
- While housekeeping is running, the scheduler still considers the resources | ||
allocated to the job and will not allocate them to other jobs. | ||
- Housekeeping supports partial release of resources back to the scheduler, | ||
such that a subset of stuck nodes do not hold up other nodes from | ||
being returned to service. | ||
|
||
The :program:`flux housekeeping` command is used to interact with the | ||
housekeeping service. It supports listing the resources currently executing | ||
housekeeping actions and a command to forcibly terminate actions on a per-job | ||
or per-node basis. | ||
|
||
|
||
COMMANDS | ||
======== | ||
|
||
list | ||
---- | ||
|
||
.. program:: flux housekeeping list | ||
|
||
:program:`flux housekeeping list` lists active housekeeping tasks by jobid. | ||
|
||
|
||
.. option:: -o, --format=FORMAT | ||
|
||
Customize the output format (see the `OUTPUT FORMAT`_ section below). | ||
|
||
.. option:: -n, --no-header | ||
|
||
Suppress header from output. | ||
|
||
kill | ||
---- | ||
|
||
.. program:: flux housekeeping kill | ||
|
||
:program:`flux housekeeping kill` can be used to terminate active housekeeping | ||
tasks. Housekeeping may be terminated by jobid, a set of targets such as | ||
broker ranks or hostnames, or all housekeeping may be terminated via the | ||
:option:`--all` option. | ||
|
||
.. option:: -s, --signal=SIGNUM | ||
|
||
Send signal SIGNUM instead of SIGTERM. | ||
|
||
.. option:: -t, --targets=RANKS|HOSTS | ||
|
||
Target a specific set of ranks or hosts. | ||
|
||
.. option:: -j, --jobid=JOBID | ||
|
||
Target a specific job by JOBID. Without ``--targets`` this will kill all | ||
housekeeping tasks for the specified job. | ||
|
||
.. option:: --all | ||
|
||
Target all housekeeping tasks for all jobs. | ||
|
||
OUTPUT FORMAT | ||
============= | ||
|
||
The :option:`--format` option can be used to specify an output format using | ||
Python's string format syntax or a defined format by name. For a list of | ||
built-in and configured formats use :option:`-o help`. | ||
|
||
The following field names can be specified for | ||
:command:`flux housekeeping list`: | ||
|
||
**id** | ||
The jobid that triggered housekeeping. | ||
|
||
**runtime** | ||
The time since this housekeeping task started. | ||
|
||
**nnodes** | ||
A synonym for **allocated.nnodes** | ||
|
||
**ranks** | ||
A synonym for **allocated.ranks** | ||
|
||
**nodelist** | ||
A synonym for **allocated.nodelist** | ||
|
||
**allocated.nnodes** | ||
The number of nodes still allocated to this housekeeping task. | ||
|
||
**allocated.ranks** | ||
The list of broker ranks still allocated to this housekeeping task. | ||
|
||
**allocated.nodelist** | ||
The list of nodes still allocated to this housekeeping task. | ||
|
||
**pending.nnodes** | ||
The number of nodes that still need to complete housekeeping. | ||
|
||
**pending.ranks** | ||
The list of broker ranks that still need to complete housekeeping. | ||
|
||
**pending.nodelist** | ||
The list of nodes that still need to complete housekeeping. | ||
|
||
EXPERIMENTAL | ||
============ | ||
|
||
.. include:: common/experimental.rst | ||
|
||
RESOURCES | ||
========= | ||
|
||
.. include:: common/resources.rst | ||
|
||
SEE ALSO | ||
======== | ||
|
||
:man5:`flux-config-job-manager` |
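The OUTPUT FORMAT section of the man page above says that ``--format`` accepts Python's string format syntax, with dotted field names such as **allocated.nnodes**. The sketch below shows only how that syntax resolves dotted names in plain Python. It is not the flux-core formatter, which may add extensions of its own. The record, its values, and the variable names are hypothetical and chosen for illustration.

```python
from types import SimpleNamespace

# Hypothetical record for one housekeeping task. The field names mirror the
# man page; the values (including the jobid) are made up for illustration.
task = {
    "id": "f2WxAbc",
    "runtime": 12.5,
    "allocated": SimpleNamespace(nnodes=4, ranks="0-3", nodelist="node[0-3]"),
    "pending": SimpleNamespace(nnodes=1, ranks="2", nodelist="node2"),
}

# In Python's format syntax, "{allocated.nnodes}" looks up the attribute
# "nnodes" on the argument named "allocated"; ":>6" right-aligns in 6 columns.
fmt = "{id:>10} {runtime:>8.1f} {allocated.nnodes:>6} {pending.nodelist}"

line = fmt.format(**task)
print(line)
```

A format string of the same shape is what one would pass to ``flux housekeeping list -o``, assuming the listed field names are available in the installed version.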
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Experimental Flux features and interfaces are made available for evaluation | ||
only and may change or be removed without notice. | ||
|
||
Feedback is welcome. Please use the flux-core project GitHub issue tracker. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
Flux: http://flux-framework.org | ||
|
||
Flux RFC: https://flux-framework.readthedocs.io/projects/flux-rfc | ||
|
||
Issue Tracker: https://github.com/flux-framework/flux-core/issues |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Experimental Flux features and interfaces are made available for evaluation | ||
only and may change or be removed without notice. | ||
|
||
Feedback is welcome. Please use the flux-core project GitHub issue tracker. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
Flux: http://flux-framework.org | ||
|
||
Flux RFC: https://flux-framework.readthedocs.io/projects/flux-rfc | ||
|
||
Issue Tracker: https://github.com/flux-framework/flux-core/issues |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Experimental Flux features and interfaces are made available for evaluation | ||
only and may change or be removed without notice. | ||
|
||
Feedback is welcome. Please use the flux-core project GitHub issue tracker. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
Flux: http://flux-framework.org | ||
|
||
Flux RFC: https://flux-framework.readthedocs.io/projects/flux-rfc | ||
|
||
Issue Tracker: https://github.com/flux-framework/flux-core/issues |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
#if HAVE_SYSTEMD | ||
-systemdsystemunit_DATA = flux.service | ||
+systemdsystemunit_DATA = \ | ||
+ flux.service \ | ||
+ [email protected] | ||
#endif | ||
|
||
tmpfilesdir = $(prefix)/lib/tmpfiles.d | ||
|