diff --git a/doc/man5/flux-config-job-manager.rst b/doc/man5/flux-config-job-manager.rst
index ea0a75e8f643..4b29631ffae5 100644
--- a/doc/man5/flux-config-job-manager.rst
+++ b/doc/man5/flux-config-job-manager.rst
@@ -28,6 +28,20 @@ plugins
    Each directive follows the format defined in the :ref:`plugin_directive`
    section.

+housekeeping
+   (optional) Table of configuration for the job-manager housekeeping
+   service. The housekeeping service is an experimental alternative for
+   handling administrative job epilog workloads. If enabled, resources are
+   released by jobs to housekeeping, which runs a command or a systemd unit
+   and releases resources to the scheduler on completion. See configuration
+   details in the :ref:`housekeeping` section.
+
+   Note: The housekeeping script runs as the instance owner (e.g. "flux").
+   On a real system, "command" is configured to "imp run housekeeping",
+   and the IMP is configured to launch the flux-housekeeping systemd
+   service as root. (See :man5:`flux-config-security-imp` for details
+   on configuring :command:`flux imp run`).
+

 .. _plugin_directive:

@@ -50,6 +64,28 @@ conf
    (optional) An object, valid with ``load`` only, that defines a
    configuration table to pass to the loaded plugin.

+.. _housekeeping:
+
+HOUSEKEEPING
+============
+
+command
+   (optional) An array of strings specifying the housekeeping command. Either
+   ``command`` or ``use-systemd-unit`` must be specified.
+
+use-systemd-unit
+   (optional) A boolean value indicating whether to run the flux-housekeeping
+   systemd unit to handle housekeeping, rather than a specific command.
+   Either ``use-systemd-unit`` or ``command`` must be specified.
+
+release-after
+   (optional) A string specified in Flux Standard Duration (FSD). If unset,
+   resources for a given job are not released until all execution targets for
+   a given job have completed housekeeping. If set to ``0``, resources are
+   released as each target completes. Otherwise, a timer is started when the
+   first execution target for a given job completes, and all resources that
+   have completed housekeeping when the timer fires are released. Following
+   that, resources are released as each execution target completes.

 EXAMPLE
 =======
@@ -73,6 +109,10 @@ EXAMPLE
      }
    ]

+   [job-manager.housekeeping]
+   use-systemd-unit = true
+   release-after = "1m"
+

 RESOURCES
 =========
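
The EXAMPLE hunk only exercises ``use-systemd-unit``. For comparison, here is a minimal sketch of the ``command`` alternative described in the HOUSEKEEPING section. The argv split below is an assumption based on the note's wording ("imp run housekeeping" and :command:`flux imp run`); it is not taken from the patch, and the ``30s`` value is only illustrative.

```toml
# Hypothetical alternative to use-systemd-unit: run an explicit command.
# The argv elements are an assumption drawn from the note in this patch,
# not a confirmed invocation; adjust to the site's IMP setup.
[job-manager.housekeeping]
command = ["flux", "imp", "run", "housekeeping"]

# FSD string: 30 seconds after the first execution target finishes
# housekeeping, release every target that has finished so far; the
# remaining targets are then released one by one as they complete.
release-after = "30s"
```

Per the text above, leaving ``release-after`` unset would instead hold all of a job's resources until every execution target has completed housekeeping.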