doc: add flux-housekeeping(1)
Problem: There is no manual for flux-housekeeping(1).

Add a simple man page for flux-housekeeping(1).
grondo committed Jun 28, 2024
1 parent 4c0a496 commit 505e080
Showing 3 changed files with 135 additions and 1 deletion.
3 changes: 2 additions & 1 deletion doc/Makefile.am
@@ -55,7 +55,8 @@ MAN1_FILES_PRIMARY = \
man1/flux-cancel.1 \
man1/flux-watch.1 \
man1/flux-update.1 \
man1/flux-hostlist.1
man1/flux-hostlist.1 \
man1/flux-housekeeping.1

# These files are generated as clones of a primary page.
# Sphinx handles this automatically if declared in the conf.py
132 changes: 132 additions & 0 deletions doc/man1/flux-housekeeping.rst
@@ -0,0 +1,132 @@
====================
flux-housekeeping(1)
====================


SYNOPSIS
========

| **flux** **housekeeping** **list** [*-n*] [*-o FORMAT*]
| **flux** **housekeeping** **kill** [*--all*] [*-j JOBID*] [*-t HOSTS|RANKS*] [*-s SIGNUM*]

DESCRIPTION
===========

.. program:: flux housekeeping

The housekeeping service is an experimental feature in Flux that provides
similar functionality to a job epilog, with a few advantages:

- Housekeeping runs after the job, which is then allowed to exit CLEANUP
state and become inactive once resources are released.
- While housekeeping is running, the scheduler still thinks resources are
allocated to the job, and will not allocate resources to other jobs.
- Housekeeping supports partial release of resources back to the scheduler,
such that a subset of stuck nodes does not prevent other nodes from
being returned to service.

The :program:`flux housekeeping` command is used to interact with the
housekeeping service. It supports listing the resources currently executing
housekeeping actions and a command to forcibly terminate actions on a per-job
or per-node basis.


COMMANDS
========

list
----

.. program:: flux housekeeping list

:program:`flux housekeeping list` lists active housekeeping tasks by jobid.


.. option:: -o, --format=FORMAT

Customize the output format (see the `OUTPUT FORMAT`_ section below).

.. option:: -n, --no-header

Suppress header from output.

kill
----

.. program:: flux housekeeping kill

:program:`flux housekeeping kill` can be used to terminate active housekeeping
tasks. Housekeeping may be terminated by jobid, a set of targets such as
broker ranks or hostnames, or all housekeeping may be terminated via the
:option:`--all` option.

.. option:: -s, --signal=SIGNUM

Send signal SIGNUM instead of SIGTERM.

.. option:: -t, --targets=HOSTS|RANKS

Target a specific set of ranks or hosts.

.. option:: -j, --jobid=JOBID

Target a specific job by JOBID. Without ``--targets``, this will kill all
housekeeping tasks for the specified job.

.. option:: --all

Target all housekeeping tasks for all jobs.
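
As an illustration of how the options above combine, here is a minimal Python sketch that assembles a ``flux housekeeping kill`` command line. The helper ``kill_argv`` and the example jobid and rank values are hypothetical and are not part of Flux. The sketch only uses the long options documented above and does not run the command.

```python
def kill_argv(jobid=None, targets=None, signum=None, all_tasks=False):
    """Build an argv list for `flux housekeeping kill` (hypothetical helper)."""
    argv = ["flux", "housekeeping", "kill"]
    if all_tasks:
        # --all targets all housekeeping tasks for all jobs
        argv.append("--all")
    if jobid is not None:
        # --jobid alone targets every housekeeping task of that job
        argv.append(f"--jobid={jobid}")
    if targets is not None:
        # --targets narrows the request to specific ranks or hosts
        argv.append(f"--targets={targets}")
    if signum is not None:
        # --signal overrides the default SIGTERM
        argv.append(f"--signal={signum}")
    return argv


# Example values are made up for illustration.
print(" ".join(kill_argv(jobid="f1234", targets="0-1", signum=9)))
```

The resulting list could be passed to ``subprocess.run`` on a system where Flux is installed.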

OUTPUT FORMAT
=============

The :option:`--format` option can be used to specify an output format using
Python's string format syntax or a defined format by name. For a list of
built-in and configured formats use :option:`-o help`.

The following field names can be specified for
:program:`flux housekeeping list`:

**id**
The jobid that triggered housekeeping.

**runtime**
The time since this housekeeping task started.

**nnodes**
A synonym for **allocated.nnodes**.

**ranks**
A synonym for **allocated.ranks**.

**nodelist**
A synonym for **allocated.nodelist**.

**allocated.nnodes**
The number of nodes still allocated to this housekeeping task.

**allocated.ranks**
The list of broker ranks still allocated to this housekeeping task.

**allocated.nodelist**
The list of nodes still allocated to this housekeeping task.

**pending.nnodes**
The number of nodes that still need to complete housekeeping.

**pending.ranks**
The list of broker ranks that still need to complete housekeeping.

**pending.nodelist**
The list of nodes that still need to complete housekeeping.
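
To show how dotted field names such as **allocated.nnodes** behave in Python's string format syntax, the following minimal sketch formats a single made-up record. The ``task`` object and all of its values are assumptions for illustration only. This is not how :program:`flux housekeeping list` is implemented, only a demonstration of the format syntax itself.

```python
from types import SimpleNamespace

# Hypothetical housekeeping record; all values are made up.
task = SimpleNamespace(
    id="f1234",
    runtime="2.5s",
    allocated=SimpleNamespace(nnodes=4, ranks="0-3"),
    pending=SimpleNamespace(nnodes=1, ranks="3"),
)

# Dotted names resolve through attribute access, and ">N" right-aligns
# the value in a column of width N.
fmt = "{id:>8} {runtime:>8} {allocated.nnodes:>6} {pending.ranks}"
line = fmt.format(**vars(task))
print(line)
```

A format string of the same shape could be passed to the ``--format`` option, using only the field names listed above.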

RESOURCES
=========

.. include:: common/resources.rst

SEE ALSO
========

:man5:`flux-config-job-manager`
1 change: 1 addition & 0 deletions doc/manpages.py
@@ -70,6 +70,7 @@
('man1/flux-watch', 'flux-watch', 'monitor one or more Flux jobs', [author], 1),
('man1/flux-update', 'flux-update', 'update active Flux jobs', [author], 1),
('man1/flux-hostlist', 'flux-hostlist', 'fetch, combine, and manipulate Flux hostlists', [author], 1),
('man1/flux-housekeeping', 'flux-housekeeping', 'list and terminate housekeeping tasks', [author], 1),
('man3/flux_attr_get', 'flux_attr_set', 'get/set Flux broker attributes', [author], 3),
('man3/flux_attr_get', 'flux_attr_get', 'get/set Flux broker attributes', [author], 3),
('man3/flux_aux_set', 'flux_aux_get', 'get/set auxiliary handle data', [author], 3),