Dws: use static rabbit layout mapping for JGF #204

jameshcorbett · 2024-08-31T15:42:49Z

Problem: as described in #193, JGF is too unwieldy to be stored in Ansible. On the other hand, Flux's
ability to start up and run jobs cannot depend on the responsiveness of Kubernetes, so generating
JGF from Kubernetes before starting Flux is not a good option.

The solution I've decided to go with, after some discussion, is to store some
static rabbit layout data in Ansible, generated by reading from Kubernetes. That data is then
read in to generate JGF. Unlike Kubernetes, the static file can be expected to always exist.

Fixes #193

Problem: admins prefer configuring clusters with a config file to doing so with R. However, the current design of flux-dws2jgf forces the use of R as input to generate JGF. When flux-framework/flux-core#6245 is implemented, Flux will be able to combine a config file with JGF. Add an option to use a config file as input to the JGF generation, so administrators will be able to use a config file + JGF instead of R + JGF.

Problem: as described in flux-framework#193, JGF is too unwieldy to be stored in Ansible. On the other hand, Flux's ability to start up and run jobs cannot be dependent on the responsiveness of Kubernetes, so generating JGF from Kubernetes before starting Flux is not an option. A solution would be to store some static rabbit data in Ansible, generated by reading from Kubernetes. This data could then be read in to generate JGF. Add a script that generates a JSON file describing which nodes map to which rabbits and what the capacity of each rabbit is.

Problem: as described in flux-framework#193, JGF is too unwieldy to be stored in Ansible. On the other hand, Flux's ability to start up and run jobs cannot be dependent on the responsiveness of Kubernetes, so generating JGF from Kubernetes before starting Flux is not an option. Change flux-dws2jgf to read from a static JSON file generated by the flux-rabbitmapping script, instead of from Kubernetes.
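
As a rough illustration of the idea in the two commit messages above, the following Python sketch builds a node-to-rabbit mapping, serializes it as JSON, and reads it back without contacting Kubernetes. The JSON layout, the function names (`build_mapping`, `rabbit_for`), and the sample host and rabbit names are all assumptions made for illustration; the actual output of flux-rabbitmapping may be structured differently.

```python
import json


def build_mapping(rabbits):
    """Build a static mapping from a dict of rabbit descriptions.

    `rabbits` maps a rabbit name to {"capacity": int, "computes": [hostnames]}.
    This layout is hypothetical, not the real flux-rabbitmapping format.
    """
    mapping = {"computes": {}, "rabbits": {}}
    for name, info in rabbits.items():
        # Record the capacity of each rabbit.
        mapping["rabbits"][name] = {"capacity": info["capacity"]}
        # Record which rabbit each compute node is attached to.
        for host in info["computes"]:
            mapping["computes"][host] = name
    return mapping


def rabbit_for(mapping, hostname):
    """Return the rabbit a compute node maps to, or None if it has none."""
    return mapping["computes"].get(hostname)


if __name__ == "__main__":
    # Sample data standing in for what would be read from Kubernetes once.
    sample = {
        "rabbit-a": {"capacity": 1000, "computes": ["compute-01", "compute-02"]},
    }
    text = json.dumps(build_mapping(sample), indent=2)
    # Later consumers only need the static JSON text, not a live cluster.
    print(rabbit_for(json.loads(text), "compute-01"))
```

The point of the design is that the mapping is generated once, stored as a plain file, and consumed later with no dependency on a running service.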

Problem: tests for the flux-dws2jgf script need to be updated now that the script reads from a mapping generated by flux-rabbitmapping instead of by polling Kubernetes. Change the tests and add a simple test for flux-rabbitmapping.

Problem: the error message for requesting too much lustre storage gives a float, but an integer would be better and more consistent with the other file system types. Provide an integer in the error message by doing integer division instead of normal float division.
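
The difference between the two kinds of division can be sketched in Python as follows. The function name, the message wording, and the GiB unit are hypothetical and not taken from the actual patch; only the switch from `/` to `//` reflects the commit message.

```python
GIB = 1024**3  # assumed unit, chosen only for this illustration


def capacity_error(fs_type, requested_bytes, max_bytes):
    """Return an error string that reports sizes as whole numbers of GiB."""
    # `max_bytes / GIB` would yield a float such as 5.5;
    # `//` performs floor division and keeps the result an int.
    requested_gib = requested_bytes // GIB
    max_gib = max_bytes // GIB
    return (
        f"requested {requested_gib} GiB of {fs_type} storage, "
        f"but at most {max_gib} GiB is available"
    )
```

For example, with a limit of five and a half GiB, `capacity_error("lustre", 10 * GIB, 5 * GIB + GIB // 2)` reports the limit as `5 GiB` rather than `5.5 GiB`.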

grondo

LGTM! Nice improvement. I just noticed one trivial issue in the tests.

grondo · 2024-09-04T20:37:31Z

t/t2000-dws2jgf.t

@@ -33,27 +33,32 @@ test_expect_success HAVE_JQ 'smoke test to ensure the storage resources are expe
 	test $(hostname) = compute-01
 '

+test_expect_success HAVE_JQ 'flux-rabbitmapping outputs expected mapping' '


This test has HAVE_JQ but I don't see any use of jq in the test itself.

However, also note that the flux-core tests, and thus flux-sharness.sh, assume jq is available, so at least in flux-core we've dropped the use of HAVE_JQ and just assume it's there (flux-sharness throws an error if not).

(Maybe for future cleanup, no need to address this now)

jameshcorbett · Sep 4, 2024

Thanks, I'll open another PR for that. Setting MWP (merge-when-passing).

jameshcorbett requested a review from grondo August 31, 2024 15:42

jameshcorbett added 5 commits September 4, 2024 13:30

test: update dws2jgf tests to use rabbit mapping

6921506

jameshcorbett force-pushed the rabbit-mapping branch from 8934cdd to bcf9be2 Compare September 4, 2024 20:30

grondo approved these changes Sep 4, 2024

jameshcorbett added the merge-when-passing label Sep 4, 2024

mergify bot merged commit 5a3d0a1 into flux-framework:master Sep 4, 2024
8 checks passed

jameshcorbett deleted the rabbit-mapping branch September 4, 2024 23:32

jameshcorbett mentioned this pull request Sep 4, 2024

testsuite: sharness sync #206

Merged
