-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ingest: set configured queue constraints #4587
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM so far. I had just one thought while doing a first pass (sorry the comment was meant to be an individual comment, not a review, but I had already clicked "Start a Review")
} | ||
if (policy) { | ||
char key[1024]; | ||
snprintf (key, sizeof (key), "queues.%s.policy", name); | ||
if (validate_policy_json (policy, key, error) < 0) | ||
return -1; | ||
} | ||
if (requires) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing we might want to do here is validate that each configured property is valid according to RFC 20, i.e. does not contain whitespace or any characters from the set ! & ' " ^ `` | ( )
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was wrong! Actually ^
should be allowed since that is currently our shorthand for not
, e.g. a sysadmin should be able to configure a queue.NAME.requires
of [ "^batch" ]
which means that queue NAME
excludes all resources with the batch
property.
I'm sorry for messing that one up.
908c496
to
085a063
Compare
Force pushed with a new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I unfortunately gave bad advice in my previous comment (the ^
character at least should be allowed in requires
(sorry about that), and had one trivial suggestion for the flux-config-queues(5)
man page.
doc/man5/flux-config-queues.rst
Outdated
The ``queues`` table configures job queues, as described in RFC 33. | ||
Normally, Flux has a single anonymous queue, but when queues are configured, | ||
all queues are named, and a job submitted without a queue name is rejected, | ||
unless a default queue is configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very small suggestion, this might read better if the last comma is removed:
Normally, Flux has a single anonymous queue, but when queues are configured,
all queues are named, and a job submitted without a queue name is rejected
unless a default queue is configured.
} | ||
if (policy) { | ||
char key[1024]; | ||
snprintf (key, sizeof (key), "queues.%s.policy", name); | ||
if (validate_policy_json (policy, key, error) < 0) | ||
return -1; | ||
} | ||
if (requires) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was wrong! Actually ^
should be allowed since that is currently our shorthand for not
, e.g. a sysadmin should be able to configure a queue.NAME.requires
of [ "^batch" ]
which means that queue NAME
excludes all resources with the batch
property.
I'm sorry for messing that one up.
NP! I should have caught it - I did skim RFC 20 after your comment but missed it too. Any thoughts for further testing of the constraints plugin before we merge? |
Yeah, here's some ideas:
Also, I noticed that the constraints plugin I gave you allows for a non-list comma separated string to be specified for Edit: Also if you drive the constraints plugin from |
Since the config comes from the broker, and the broker now checks the config, this is not really possible. Just squashed the following changes
|
085a063
to
ad7525b
Compare
oops I need to try that again - I accidentally squashed the wrong pair of commits |
ad7525b
to
02d6ab8
Compare
Ah, yes, then I suppose the code to check validity of the config in the plugin is actually dead code and can be removed? |
Codecov Report
@@ Coverage Diff @@
## master #4587 +/- ##
==========================================
+ Coverage 83.35% 83.37% +0.02%
==========================================
Files 409 410 +1
Lines 68548 68599 +51
==========================================
+ Hits 57135 57196 +61
+ Misses 11413 11403 -10
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Problem: broker policy/queues config validator does not allow 'requires' queues subtable, but this is defined in RFC 33. Allow the following, per RFC 33: queues.NAME.requires = [ string array ]. Add coverage to t2241-policy-config.t.
Problem: a malformed queues config such as queues=42 is allowed by the broker policy/queues config validator. Check that queues is a table. Add test to t2241-policy-config.t.
Problem: configuration can specify a queue-specific default queue, or an unknown default queue. Validate the [policy] default queue. Forbid [queues] from specifying a queue-specific default queue. Update t2241-policy-config.t. Drop config tests from t2112-job-ingest-frobnicator.t. Update the now invalid example in flux-config-policy(5). drop frobnicator test on missing queue config
Problem: RFC 33 specifies that a queue may define resource constraints to identify partitioned resources, but that is not currently supported by the frobnicator. Add `constraints` plugin.
Problem: frobnicator constraints plugin has no test coverage. Add tests to t2112-job-ingest-frobnicator.t to cover the plugin.
Problem: there is no manual page for the [queues] config. Add man page.
02d6ab8
to
279348b
Compare
Well, that prompted me to look through what was being checked and discover a couple of cases that were missed by the broker check (default queue not in I also discovered that the man page example didn't have the default queue in However, the checks in the frobnicator plugins are pretty minimal and do serve as documentation of expectations, if nothing else (like an assertion). Maybe OK to leave them in? |
CAVEATS | ||
======= | ||
|
||
Queue resources should not overlap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was going to ask about this caveat. Why can't queue resources overlap (is this a Fluxion constraint?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fluxion allows it, but some issues were identified in flux-framework/flux-sched#939.
Some use cases might work with Fluxion as is, like opening up a stopped queue to select users before a DAT, then when the DAT comes around, stop/drain the normal queues and start the DAT queue. That particular one would require the job manager to support multiple queues though.
Maybe there are other cases, but I thought it best to not advertise it at this point since the near term goal is partitioned resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks, I had forgotten about that. I do know we were talking about allowing a queue with access to all resources (and we do have a pall
partition on many clusters), and this caveat prevents that option. I agree we can ignore it for now as long as the concept of non-overlapping resource sets isn't baked into the design.
Setting MWP, thanks! |
This adds an ingest "frobnicator" plugin (thanks @grondo) that works with the RFC 33
queues.NAME.requires
spec proposed in flux-framework/rfc#342. The broker policy/queue config validator is also modified to allowqueues.NAME.requires
.This was sufficient to get my test system queues working again with this config
Marked as WIP pending