Add by_rank map to graph store #674

milroy · 2020-06-17T06:54:19Z

This PR addresses item II.3 from issue #662 by introducing a by_rank map to resource_graph_metadata_t. The map is keyed by the int64_t rank and takes a std::vector of vtx_t corresponding to the subgraph associated with the rank. The changes include populating the map in the GRUG, JGF, and hwloc readers and the addtion of testsuite checks for correct subgraph sizes.

This PR should be merged before #665 and #667 as those depend on the features created here.

codecov-commenter · 2020-06-17T07:15:11Z

Codecov Report

Merging #674 into master will increase coverage by 0.01%.
The diff coverage is 76.36%.

@@            Coverage Diff             @@
##           master     #674      +/-   ##
==========================================
+ Coverage   75.07%   75.09%   +0.01%     
==========================================
  Files          78       78              
  Lines        8166     8220      +54     
==========================================
+ Hits         6131     6173      +42     
- Misses       2035     2047      +12

Impacted Files	Coverage Δ
resource/store/resource_graph_store.hpp	`100.00% <ø> (ø)`
resource/modules/resource_match.cpp	`76.48% <72.91%> (-0.19%)`	⬇️
resource/readers/resource_reader_grug.cpp	`86.82% <100.00%> (+0.06%)`	⬆️
resource/readers/resource_reader_hwloc.cpp	`85.71% <100.00%> (+0.10%)`	⬆️
resource/readers/resource_reader_jgf.cpp	`65.09% <100.00%> (+0.15%)`	⬆️
resource/utilities/resource-query.cpp	`48.53% <100.00%> (+0.55%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2570646...65ac424. Read the comment docs.

milroy · 2020-06-17T07:17:48Z

All checks are green and Codecov looks good. @dongahn you can give this a pass over when you're able, or I can add @SteVwonder as a reviewer.

dongahn

@milroy: this looks great.

I have a couple of very minor in-line comments but I'm approving it without it.
The current testing of using resource-query is limited as all vertices will be only mapped to one rank (-1). My suggestion would be to add one more simple test by modifying sched-fluxion-resource's stat interface and flux ion-resource. But since a similar test can be added as we merge our PRs and you will be busy with other 2 PRs, I didn't know if that should be a part of this PR's scope. Please let me know your preference.

resource/store/resource_graph_store.hpp

t/t6000-graph-size.t

milroy · 2020-06-17T18:12:36Z

The current testing of using resource-query is limited as all vertices will be only mapped to one rank (-1). My suggestion would be to add one more simple test by modifying sched-fluxion-resource's stat interface and flux ion-resource.

I've been poking at this and am struggling to deal with the need to iterate over multiple ranks. In the stat interface I could send multiple responses (one per rank to vector size association). That approach seems ugly and complex. Another possibility would be to iterate over the map keys and flatten the key-value pairs to a single string, and transmit that string in the response. What do you think @dongahn?

dongahn · 2020-06-17T19:46:45Z

I've been poking at this and am struggling to deal with the need to iterate over multiple ranks. In the stat interface I could send multiple responses (one per rank to vector size association). That approach seems ugly and complex.

I agreed. stat is to get the overall statistics and multiple responses would defeat the purpose.

Another possibility would be to iterate over the map keys and flatten the key-value pairs to a single string, and transmit that string in the response. What do you think @dongahn?

This sounds good to me. Perhaps we can compactly flatten this using struct idset. For example, if the number of vertices are the same across all ranks, you can condense this to: "by_rank": { "[0-1021]" : 100 }. If there are two rank groups each with different vertex count, it would be something like: "by_rank": {"[0-511]" : 100, "[512-1021]": 200}.

Doc for idset is here.

milroy · 2020-06-17T21:52:11Z

If there are two rank groups each with different vertex count, it would be something like: "by_rank": {"[0-511]" : 100, "[512-1021]": 200}.

So the general way to do this is iterate through the ranks and populate an auxiliary map that's keyed by subgraph sizes, and takes vectors of ranks as values. Then I can iterate through the size keys and decode the vector of ranks with idset. Is that the idea?

milroy · 2020-06-17T21:54:12Z

A more fundamental question is: how do I create a resource_graph with multiple ranks in flux ion-resource?

dongahn · 2020-06-17T21:56:39Z

A more fundamental question is: how do I create a resource_graph with multiple ranks

You can use hwloc reader with sched-fluxoin-resource (I think this is the default). As each hwloc files are unpacked into the resource graph, a distinct rank will be assigned.

dongahn · 2020-06-17T22:20:10Z

So the general way to do this is iterate through the ranks and populate an auxiliary map that's keyed by subgraph sizes, and takes vectors of ranks as values. Then I can iterate through the size keys and decode the vector of ranks with idset. Is that the idea?

Yes. But instead of using vectors of ranks as values, you can use an "essentially" idsets as values to directly encode the rank set into idset during the initial iteration. I say "essentially" because using a c++ wrapper around struct idset will make memory management a bit easier.

class idset_wrap_t {
public:
    ~idset_wrap_t() {
        if (m_idset)
            idset_destroy (m_idset);
    }
    const get_idset () const {
        return m_idset;
    }
    void set_idset (struct idset *idset) {
        if (idset)
            m_idset = idset;
    }
private:
    struct idset *m_idset{nullptr};
};

std::map<size_t,  idset_wrap_t> s2r; // subgraph size to ranks

for (auto &kv : by_rank) {
    if (s2r.find (kv.second.size ()) == s2r.end ()) {
        idset_wrap_t iw;
        struct idset *ids = NULL;
        if ( !(ids = idset_create (0, IDSET_FLAG_AUTOGROW))) {
             error_check ()
        }
        iw.set_idset (ids);
        s2r[kv.second.size ()] = iw;
    }
    if (idset_set (s2r[kv.second.size ()].get_idset (), kv.first) < 0)
        error_check ();
}

// once the above iteration is done, each value of s2r
// has the fully encoded rank idset.

The code not tested at all, though.

dongahn · 2020-06-17T23:38:50Z

Yes. But instead of using vectors of ranks as values, you can use an "essentially" idsets as values to directly encode the rank set into idset during the initial iteration. I say "essentially" because using a c++ wrapper around struct idset will make memory management a bit easier.

@milroy: if you are busy working on other PRs, I can add this support. Let me know.

milroy · 2020-06-18T00:57:34Z

@dongahn unfortunately the implementation is taking me longer than I expected. Can you add the support?

Also, I think you will need to make some adjustments to idset_wrap_t:

class idset_wrap_t {
private:
    struct idset *m_idset{nullptr};
public:
    ~idset_wrap_t() {
        if (m_idset)
            idset_destroy (m_idset);
    }
    const idset get_idset () {
        return m_idset;
    }
    void set_idset (struct idset *idset) {
        if (idset)
            m_idset = idset;
    }
};

dongahn · 2020-06-18T01:12:22Z

@dongahn unfortunately the implementation is taking me longer than I expected. Can you add the support?

Sure thing. I will get to this soon so that you can focus on the other PRs!

milroy · 2020-06-18T11:47:34Z

I thoughtlessly pulled your new commits @dongahn and pushed some new ones of my own. That added me as the committer to commits authored by you. Is there some way to fix this?

grondo · 2020-06-18T14:25:41Z

@milroy, IMO it is fine if you are committer, this happens quite often. As long as authorship is preserved it is somewhat standard practice when cherry-picking as you are doing.

If you really need to reset the committer, I think you can amend the commit with GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL set in your environment. You may also have to add GIT_AUTHOR_NAME and GIT_AUTHOR_EMAIL, or use the --author option of git-commit.

dongahn · 2020-06-18T17:32:09Z

I thoughtlessly pulled your new commits @dongahn and pushed some new ones of my own. That added me as the committer to commits authored by you. Is there some way to fix this?

Don't worry about it. Also this didn't changed the author anyway.

dongahn · 2020-06-18T17:49:32Z

@milroy: your new changed LGTM. @SteVwonder: if you can take a quick look at this PR, this should be ready to go in.

- update resource status via depth first traversal starting at a specified subtree root. The functions find the subtree's parent vertex and mark it black to avoid upward traversal.

SteVwonder

LGTM! I did have one question below, but it is not a blocker to this getting merged. Feel free to set MWP after you squash the fixup commit.

resource/modules/resource_match.cpp

SteVwonder · 2020-06-18T20:17:14Z

t/t6002-graph-hwloc.t

+
+test_expect_success 'qmanager: loading resource and qmanager modules works' '
+    flux module remove sched-simple &&
+    load_resource prune-filters=ALL:core subsystems=containment policy=low


(Mainly a note to myself)

We may need to add an whitelist/allowlist here to make this test work under both hwloc1 and hwloc2, since the number of resources under each execution target may differ (e.g., numanode vs no numanode). Since this PR is on the critical path, I wouldn't worry about it right now. If it does turn out to be an issue, I can handle it in my upcoming hwloc2 PR.

Thanks and this makes sense. Yes, let's target this work when hwloc2 PR is posted. We can either solve this w/ allowlist or by calculating number of ranks x per rank vertices = overall vertex count.

milroy · 2020-06-18T21:31:32Z

@dongahn thank you very much for jumping in and creating the additional functionality. @SteVwonder thanks for the quick review!

I'm going to set MWP after flux-sched gets taggged.

grondo · 2020-06-18T23:02:59Z

@milroy, in case you missed the slack message, flux-sched v0.9.0 has been tagged.

add a map to graph store to keep track of the subgraph vertices associated with each rank. This map will be used to mark resources subgraphs with a new status.

enable population of by_rank map in readers while loading the resource graph

add console debug output to print the size of each vertex vector (subgraph) associated with each rank in the resource graph.

add tests to check the number of vertices associated with rank -1 (the default value) for GRUG, JGF, and hwloc test graphs.

Add by_rank support to stat request callback. "by_rank" key in the response is a JSON dictionary whereby each key is a rank idset that contains the same number of graph vertices. Make the idset lib a dependency of fluxion resource.

milroy force-pushed the by-rank-map branch from 1d552fd to 0b5d5ca Compare June 17, 2020 07:04

milroy changed the title ~~[WIP] Add by_rank map to graph store~~ Add by_rank map to graph store Jun 17, 2020

milroy requested a review from dongahn June 17, 2020 07:16

dongahn reviewed Jun 17, 2020

View reviewed changes

resource/store/resource_graph_store.hpp Show resolved Hide resolved

t/t6000-graph-size.t Show resolved Hide resolved

dongahn mentioned this pull request Jun 17, 2020

[WIP] Integrate Fluxion with flux-core's resource module #675

Closed

dongahn force-pushed the by-rank-map branch from e2aff67 to 7948395 Compare June 18, 2020 06:46

dongahn requested a review from SteVwonder June 18, 2020 06:48

milroy force-pushed the by-rank-map branch 3 times, most recently from d495453 to b22cfd9 Compare June 18, 2020 11:45

milroy mentioned this pull request Jun 18, 2020

Add resource status check for availability #667

Merged

SteVwonder approved these changes Jun 18, 2020

View reviewed changes

milroy force-pushed the by-rank-map branch from 698f8c5 to 7fe70f4 Compare June 18, 2020 21:26

milroy added the merge-when-passing mergify.io - merge PR automatically once CI passes label Jun 19, 2020

milroy and others added 8 commits June 19, 2020 00:37

resource: add by_rank map to graph store

089f7ec

add a map to graph store to keep track of the subgraph vertices associated with each rank. This map will be used to mark resources subgraphs with a new status.

resource: populate by_rank map in readers

54609d0

enable population of by_rank map in readers while loading the resource graph

resource: output by_rank size in resource-query

1c47be0

add console debug output to print the size of each vertex vector (subgraph) associated with each rank in the resource graph.

testsuite: add by_rank checks to graph-size tests

3bae387

add tests to check the number of vertices associated with rank -1 (the default value) for GRUG, JGF, and hwloc test graphs.

resource: pass by_rank info to stat response

b7318fa

Add by_rank support to stat request callback. "by_rank" key in the response is a JSON dictionary whereby each key is a rank idset that contains the same number of graph vertices. Make the idset lib a dependency of fluxion resource.

resource: add by_rank support to stat subcommand

1f8ca23

testsuite: add a test that verifies by_rank correctness

8c319b3

build: integrate t6002-graph-hwloc.t into build

65ac424

grondo force-pushed the by-rank-map branch from 7fe70f4 to 65ac424 Compare June 19, 2020 00:37

mergify bot merged commit 39e263f into flux-framework:master Jun 19, 2020

milroy deleted the by-rank-map branch June 19, 2020 01:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add by_rank map to graph store #674

Add by_rank map to graph store #674

milroy commented Jun 17, 2020

codecov-commenter commented Jun 17, 2020 •

edited

Loading

milroy commented Jun 17, 2020 •

edited

Loading

dongahn left a comment

milroy commented Jun 17, 2020 •

edited

Loading

dongahn commented Jun 17, 2020

milroy commented Jun 17, 2020

milroy commented Jun 17, 2020

dongahn commented Jun 17, 2020

dongahn commented Jun 17, 2020 •

edited

Loading

dongahn commented Jun 17, 2020

milroy commented Jun 18, 2020

dongahn commented Jun 18, 2020

milroy commented Jun 18, 2020

grondo commented Jun 18, 2020

dongahn commented Jun 18, 2020

dongahn commented Jun 18, 2020

SteVwonder left a comment

SteVwonder Jun 18, 2020

dongahn Jun 18, 2020

milroy commented Jun 18, 2020

grondo commented Jun 18, 2020

Add by_rank map to graph store #674

Add by_rank map to graph store #674

Conversation

milroy commented Jun 17, 2020

codecov-commenter commented Jun 17, 2020 • edited Loading

Codecov Report

milroy commented Jun 17, 2020 • edited Loading

dongahn left a comment

Choose a reason for hiding this comment

milroy commented Jun 17, 2020 • edited Loading

dongahn commented Jun 17, 2020

milroy commented Jun 17, 2020

milroy commented Jun 17, 2020

dongahn commented Jun 17, 2020

dongahn commented Jun 17, 2020 • edited Loading

dongahn commented Jun 17, 2020

milroy commented Jun 18, 2020

dongahn commented Jun 18, 2020

milroy commented Jun 18, 2020

grondo commented Jun 18, 2020

dongahn commented Jun 18, 2020

dongahn commented Jun 18, 2020

SteVwonder left a comment

Choose a reason for hiding this comment

SteVwonder Jun 18, 2020

Choose a reason for hiding this comment

dongahn Jun 18, 2020

Choose a reason for hiding this comment

milroy commented Jun 18, 2020

grondo commented Jun 18, 2020

codecov-commenter commented Jun 17, 2020 •

edited

Loading

milroy commented Jun 17, 2020 •

edited

Loading

milroy commented Jun 17, 2020 •

edited

Loading

dongahn commented Jun 17, 2020 •

edited

Loading