Speed up final constraint enforcement #1286
Conversation
problem: resource pools and relations don't have move constructors or move assignment

solution: add them, defaulted, and remove the unnecessary explicit constructors
problem: we're accidentally hashing the unique_id twice in some cases, and using a very slow and not great hash for the pointer in the shared_ptr case

solution: specialize the rc storage to more efficiently hash shared_ptr, and only compute the unique_id hash once
Force-pushed from 2eec4cd to dad5626
problem: scoring_api does a lot of extra std::map lookups that aren't really necessary, and uses a map where an interned_key_vec would do at least as well

solution: swap the std::map for the interned_key_vec, and remove a whole lot of calls to handle new keys, which actually hadn't been necessary for a bit
problem: we have typedefs for common maps of subsystem to string or set of strings, but we're using std::map directly instead

solution: stop doing that
problem: we store a mapping of subsystems in every edge, but each edge is only ever in one subsystem, they're effectively an edge type, so it's a waste of time and space both in memory and in all our serialization formats that carry it like JGF

solution: swap the std::map<string,subsystem_t> for a single subsystem_t, fix the breakage in all the readers and writers and fix all the JGFs in the repository
problem: the files _start_ with json but then have other output after that, so putting them through jq causes errors

solution: use test_cmp instead
Force-pushed from dad5626 to cd63a6a
problem: In commit b96899b, as part of adding node exclusive, enforce_dfv was added to precisely mark unconstrained resources between the root of the containment tree and the constrained resources. This works fine, but it's a _full traversal of the graph_ that gets run even when selecting a single core. Usually this isn't terrible performance because if the Node type is constrained then all the node vertices get skipped after the constrained one is found. When the job is only constrained on a core or socket however, or if the node isn't found quickly, it can be as bad as a full traversal. Literally 90% of the time spent matching a small single core job in the perf test case in issue flux-framework#1171.

solution: The short version is: Remove enforce_dfv completely. The longer version is that instead of traversing the entire graph, we now re-use the iteration over constrained resources in dfu_impl_t::enforce and walk upward from each constrained vertex in the dominant subsystem until we find a vertex where we've already set best_k (since any above that must already have been set). That way we walk the minimum amount of the tree necessary. Honestly it's still doing more work than it should, because we're still iterating over resources that are children of exclusive resources unnecessarily, and we're having to iterate over `in_edges` rather than directly knowing the parent or subsetting on subsystem, but fixing either of those would be a meaningful refactor so I'm stopping here for now with the 15-20x speedup.

fixes flux-framework#1171
Force-pushed from 0d5065a to 85d61d8
problem: several files have extra _seconds_ of compile time due to over-including headers in headers

solution: include fewer things we do not need
Wow! Huge improvement here:
```
Submitting 100 jobs: │███████████████████████████████████████│100.0% 150.5 job/s

real 0m4.229s
user 0m0.063s
sys  0m0.008s

{
  "V": 1654785,
  "E": 1654784,
  "by_rank": {
    "[0-16383]": 101
  },
  "load-time": 75.918168234999996,
  "graph-uptime": 40,
  "time-since-reset": 116,
  "match": {
    "succeeded": {
      "njobs": 100,
      "njobs-reset": 100,
      "max-match-jobid": 1950452023297,
      "max-match-iters": 1,
      "stats": {
        "min": 0.022669928999999998,
        "max": 0.028772064,
        "avg": 0.027917891049999994,
        "variance": 1.0197218638951509e-6
      }
    },
    "failed": {
      "njobs": 0,
      "njobs-reset": 0,
      "max-match-jobid": 0,
      "max-match-iters": 0,
      "stats": {
        "min": 0.0,
        "max": 0.0,
        "avg": 0.0,
        "variance": 0.0
      }
    }
  }
}
```
One question: will this break existing JGF on clusters that have it, or if Fluxion is reloaded from an older version on some system?

Oh, and also, does `flux ion-R` need an update? (Sorry if I missed something obvious)
It's subtle, but all the JGF handling in there is done by that other file, come to find out. It actually took me a shockingly long time to find that. As to breaking existing JGF, it's possible if we're using JGF in something on those systems, which we might be in some cases. I don't see where, though; it looks like we're just storing R for the R key on tioga, for example, so I don't think there's anything to break there. This is part of why I asked about evolving JGF at the last meeting. If we need it, I can make the reader ingest either the new or old format and always turn it into this internal representation, but it will not be able to emit the old format without some loss.
Ok, sounds good. Perhaps @jameshcorbett could comment on whether any systems are actively using JGF (though I think he mentioned it shouldn't be an issue to upgrade).
Problem: flux-framework/flux-sched/pull/1286 changed the way edges are represented in JGF. Update the expected JGF output of flux-dws2jgf.
Codecov Report. Attention: Patch coverage is

Additional details and impacted files:

```diff
@@           Coverage Diff            @@
##           master   #1286     +/-  ##
=======================================
  Coverage    75.3%   75.3%
=======================================
  Files         111     110      -1
  Lines       15361   15236    -125
=======================================
- Hits        11575   11482     -93
+ Misses       3786    3754     -32
```
This includes fixes for various performance issues in the scoring API and resource data types (it should even speed up graph creation by making the resource types movable). The main item, though, is below. It takes the performance of the benchmark from #1171 from 10 jobs/s for constrained jobs and 15 jobs/s for unconstrained to 100-150 jobs/s for constrained and 150-200 jobs/s for unconstrained. For some reason this patch also consistently hits the issue in the progress bar where it will reach past the end of the style, so if that comes up make sure to be testing with flux-core master.
problem: In commit b96899b, as part of adding node exclusive,
enforce_dfv was added to precisely mark unconstrained resources between
the root of the containment tree and the constrained resources. This
works fine, but it's a full traversal of the graph that gets run even
when selecting a single core. Usually this isn't terrible performance
because if the Node type is constrained then all the node vertices get
skipped after the constrained one is found. When the job is only
constrained on a core or socket however, or if the node isn't found
quickly, it can be as bad as a full traversal. Literally 90% of the
time spent matching a small single core job in the perf test case in
issue #1171.
solution: The short version is: Remove enforce_dfv completely. The
longer version is that instead of traversing the entire graph, we now
re-use the iteration over constrained resources in dfu_impl_t::enforce
and walk upward from each constrained vertex in the dominant subsystem
until we find a vertex that we've already set best_k (since any above
that must already have been set). That way we walk the minimum amount
of the tree necessary. Honestly it's still doing more work than it
should, because we're still iterating over resources that are children
of exclusive resources unnecessarily, and we're having to iterate over
`in_edges` rather than directly knowing the parent or subsetting on
subsystem, but fixing either of those would be a meaningful refactor so
I'm stopping here for now with the 15-20x speedup.
fixes #1171