reject jobs submitted to a named queue when none are configured #4627

Merged: 13 commits, Oct 3, 2022
52 changes: 34 additions & 18 deletions doc/man1/flux-queue.rst
@@ -8,50 +8,59 @@ flux-queue(1)
SYNOPSIS
========

**flux** **queue** **disable** *reason...*
**flux** **queue** **disable** [*--queue=NAME*] *reason...*

**flux** **queue** **enable**
**flux** **queue** **enable** [*--queue=NAME*]

**flux** **queue** **stop** [*--verbose*] [*--quiet*]
**flux** **queue** **status** [*--queue=NAME*]

**flux** **queue** **start** [*--verbose*] [*--quiet*]
**flux** **queue** **stop**

**flux** **queue** **status** [*--verbose*]
**flux** **queue** **start**

**flux** **queue** **drain** [*--timeout=DURATION*]

**flux** **queue** **idle** [*--quiet*] [*--timeout=DURATION*]
**flux** **queue** **idle** [*--timeout=DURATION*]

DESCRIPTION
===========

The ``flux-queue`` command controls the Flux job queue.
It has the following subcommands:
The ``flux-queue`` command controls Flux job queues.

Normally, Flux has a single anonymous queue, but when queues are configured,
all queues are named. At this time, only the *disable*, *enable*, and
*status* subcommands can be applied to a single, named queue. The rest affect
all queues.

``flux-queue`` has the following subcommands:

disable
Prevent jobs from being submitted to the queue, with `reason` that is
shown to submitting users.
Prevent jobs from being submitted to the queue, with a reason that is
shown to submitting users. If multiple queues are configured, either the
*--queue* or the *--all* option is required.

enable
Allow jobs to be submitted to the queue.
Allow jobs to be submitted to the queue. If multiple queues are configured,
either the *--queue* or the *--all* option is required.

status
Report the current queue status. If multiple queues are configured,
all queues are shown unless one is specified with *--queue*.

stop
Stop allocating resources to jobs. Pending jobs remain in the queue,
Stop allocating resources to jobs. Pending jobs remain enqueued,
and running jobs continue to run, but no new jobs are allocated resources.

start
Start allocating resources to jobs.

status
Report the current queue status.

drain
Block until the queue becomes empty. It is sometimes useful to run after
Block until all queues become empty. It is sometimes useful to run after
``flux queue disable``, to wait until the system is quiescent and can be
taken down for maintenance.

idle
Block until the queue becomes `idle` (no jobs in RUN or CLEANUP state,
Block until all queues become `idle` (no jobs in RUN or CLEANUP state,
and no outstanding alloc requests to the scheduler). It may be useful to run
after ``flux queue stop`` to wait until the scheduler and execution system
are quiescent before maintenance involving them.
@@ -62,12 +71,19 @@ OPTIONS
**-h, --help**
Summarize available options.

**-q, --queue**\ =\ *NAME*
Select a queue by name.

**-v, --verbose**
Be chatty.

**-q, --quiet**
**--quiet**
Be taciturn.

**-a, --all**
Use with *enable* or *disable* subcommands to signify intent to affect
all queues, when queues are configured but *--queue* is missing.

**--timeout** \ =\ *FSD*
Limit the time that ``drain`` or ``idle`` will block.
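
The option rules described in this man page hunk (*--queue* or *--all* is required for *enable* and *disable* once queues are configured) can be sketched as a small decision function. This is only an illustration, not the code from the PR: `enum queue_target` and `resolve_target` are hypothetical names, rejecting *--queue* when no queues are configured is an assumption, and letting *--queue* take precedence over *--all* is also an assumption. The sketch does not check that the name matches a configured queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type; not part of Flux. */
enum queue_target {
    TARGET_ERROR,      /* request cannot be resolved to a queue */
    TARGET_ANONYMOUS,  /* the single anonymous queue */
    TARGET_NAMED,      /* one queue selected with --queue=NAME */
    TARGET_ALL,        /* every configured queue (--all) */
};

/* Decide which queue(s) an enable/disable request affects.
 * nqueues: number of configured (named) queues, 0 if none
 * name:    value of --queue, or NULL if the option was not given
 * all:     true if --all was given
 */
enum queue_target resolve_target (size_t nqueues, const char *name, bool all)
{
    if (nqueues == 0) {
        /* Assumption: a queue name cannot match when none are configured.
         * --all is accepted and simply means the anonymous queue. */
        return name ? TARGET_ERROR : TARGET_ANONYMOUS;
    }
    if (name)
        return TARGET_NAMED;   /* assumption: --queue wins over --all */
    if (all)
        return TARGET_ALL;
    /* Queues are configured but neither option was given. */
    return TARGET_ERROR;
}
```

Under these assumptions, the `flux queue disable --all` call in `etc/rc1` resolves to the anonymous queue on an instance without configured queues and to every queue otherwise, which is why the script can pass *--all* unconditionally.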

2 changes: 1 addition & 1 deletion doc/man1/flux-uptime.rst
@@ -41,7 +41,7 @@ current Flux instance, on one or two lines:
- The number of offline nodes, if greater than zero. A node is offline if
its broker is not connected to the instance overlay network.

- A notice if job submission is disabled.
- A notice if job submission is disabled on all queues.

- A notice if scheduling is disabled.

1 change: 1 addition & 0 deletions doc/test/spell.en.pws
@@ -647,3 +647,4 @@ myformat
xdg
XDG
yaml
enqueued
3 changes: 2 additions & 1 deletion etc/rc1
@@ -70,7 +70,8 @@ if test $RANK -eq 0; then
if test "$(backing_module)" != "none"; then
if ! flux startlog --check --quiet; then
flux queue stop
flux queue disable "Flux is in safe mode due to an incomplete shutdown."
flux queue disable --all \
"Flux is in safe mode due to an incomplete shutdown."
fi
fi
fi
54 changes: 42 additions & 12 deletions src/cmd/builtin/uptime.c
@@ -59,27 +59,56 @@ static bool sched_disabled (flux_t *h)
"reason", ""))
|| flux_rpc_get_unpack (f, "{s:b}", "enable", &enable) < 0)
log_err_exit ("Error fetching alloc status");
flux_future_destroy (f);
return enable ? false : true;
}

static bool queue_is_enabled (flux_t *h, const char *name)
{
flux_future_t *f;
int enable;
const char *topic = "job-manager.queue-status";

if (name)
f = flux_rpc_pack (h, topic, 0, 0, "{s:s}", "name", name);
else
f = flux_rpc_pack (h, topic, 0, 0, "{}");
if (!f || flux_rpc_get_unpack (f, "{s:b}", "enable", &enable) < 0)
log_err_exit ("Error fetching queue status: %s",
future_strerror (f, errno));
flux_future_destroy (f);
return enable ? true : false;
}

/* Return true if job submission is disabled.
* If there are multiple queues, return true only if ALL queues are disabled.
*/
static bool submit_disabled (flux_t *h)
{
flux_future_t *f;
int enable;
bool disabled = true;
json_t *queues;
size_t index;
json_t *value;

if (!(f = flux_rpc_pack (h,
"job-manager.submit-admin",
0,
0,
"{s:b s:b s:s}",
"query_only", 1,
"enable", 0,
"reason", ""))
|| flux_rpc_get_unpack (f, "{s:b}", "enable", &enable) < 0)
log_err_exit ("Error fetching submit status");
return enable ? false : true;
f = flux_rpc (h, "job-manager.queue-list", NULL, 0, 0);
if (!f || flux_rpc_get_unpack (f, "{s:o}", "queues", &queues))
log_msg_exit ("queue-list: %s", future_strerror (f, errno));
if (json_array_size (queues) > 0) {
json_array_foreach (queues, index, value) {
if (queue_is_enabled (h, json_string_value (value))) {
disabled = false;
break;
}
}
}
else {
if (queue_is_enabled (h, NULL))
disabled = false;
}
flux_future_destroy (f);

return disabled;
}
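
The "disabled only if ALL queues are disabled" rule in `submit_disabled` can be isolated from the RPC plumbing. The following is a minimal sketch under stated assumptions, not the PR's code: `struct queue_state` and `all_queues_disabled` are hypothetical, and the per-queue state is passed in as an array instead of being fetched with `job-manager.queue-list` and `job-manager.queue-status`. The anonymous queue is modeled as a single entry with a NULL name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queue's submit state; not a Flux type. */
struct queue_state {
    const char *name;   /* NULL models the single anonymous queue */
    bool enabled;       /* true if job submission is enabled */
};

/* Return true only if every queue in the array is disabled.
 * A single enabled queue is enough to report "not disabled".
 * Note: an empty array returns true here; the function in the diff
 * never hits that case because it falls back to querying the
 * anonymous queue when the queue list is empty.
 */
bool all_queues_disabled (const struct queue_state *queues, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (queues[i].enabled)
            return false;   /* stop at the first enabled queue */
    }
    return true;
}
```

The early return mirrors the `break` in the diff: once one enabled queue is found, the remaining queues do not need to be checked.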

/* Each key in the drain object is an idset representing a group
@@ -153,6 +182,7 @@ static double attr_get_starttime (flux_t *h)
d = strtod (s, &endptr);
if (errno != 0 || *endptr != '\0')
log_msg_exit ("Error parsing %s", name);
flux_future_destroy (f);
return d;
}
