Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add by_rank map to graph store #674

Merged
merged 8 commits into from
Jun 19, 2020
Merged

Conversation

milroy
Copy link
Member

@milroy milroy commented Jun 17, 2020

This PR addresses item II.3 from issue #662 by introducing a by_rank map to resource_graph_metadata_t. The map is keyed by the int64_t rank and takes a std::vector of vtx_t corresponding to the subgraph associated with the rank. The changes include populating the map in the GRUG, JGF, and hwloc readers and the addtion of testsuite checks for correct subgraph sizes.

This PR should be merged before #665 and #667 as those depend on the features created here.

@codecov-commenter
Copy link

codecov-commenter commented Jun 17, 2020

Codecov Report

Merging #674 into master will increase coverage by 0.01%.
The diff coverage is 76.36%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #674      +/-   ##
==========================================
+ Coverage   75.07%   75.09%   +0.01%     
==========================================
  Files          78       78              
  Lines        8166     8220      +54     
==========================================
+ Hits         6131     6173      +42     
- Misses       2035     2047      +12     
Impacted Files Coverage Δ
resource/store/resource_graph_store.hpp 100.00% <ø> (ø)
resource/modules/resource_match.cpp 76.48% <72.91%> (-0.19%) ⬇️
resource/readers/resource_reader_grug.cpp 86.82% <100.00%> (+0.06%) ⬆️
resource/readers/resource_reader_hwloc.cpp 85.71% <100.00%> (+0.10%) ⬆️
resource/readers/resource_reader_jgf.cpp 65.09% <100.00%> (+0.15%) ⬆️
resource/utilities/resource-query.cpp 48.53% <100.00%> (+0.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2570646...65ac424. Read the comment docs.

@milroy milroy changed the title [WIP] Add by_rank map to graph store Add by_rank map to graph store Jun 17, 2020
@milroy milroy requested a review from dongahn June 17, 2020 07:16
@milroy
Copy link
Member Author

milroy commented Jun 17, 2020

All checks are green and Codecov looks good. @dongahn you can give this a pass over when you're able, or I can add @SteVwonder as a reviewer.

Copy link
Member

@dongahn dongahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@milroy: this looks great.

  • I have a couple of very minor in-line comments but I'm approving it without it.
  • The current testing of using resource-query is limited as all vertices will be only mapped to one rank (-1). My suggestion would be to add one more simple test by modifying sched-fluxion-resource's stat interface and flux ion-resource. But since a similar test can be added as we merge our PRs and you will be busy with other 2 PRs, I didn't know if that should be a part of this PR's scope. Please let me know your preference.

resource/store/resource_graph_store.hpp Show resolved Hide resolved
t/t6000-graph-size.t Show resolved Hide resolved
@milroy
Copy link
Member Author

milroy commented Jun 17, 2020

The current testing of using resource-query is limited as all vertices will be only mapped to one rank (-1). My suggestion would be to add one more simple test by modifying sched-fluxion-resource's stat interface and flux ion-resource.

I've been poking at this and am struggling to deal with the need to iterate over multiple ranks. In the stat interface I could send multiple responses (one per rank to vector size association). That approach seems ugly and complex. Another possibility would be to iterate over the map keys and flatten the key-value pairs to a single string, and transmit that string in the response. What do you think @dongahn?

@dongahn
Copy link
Member

dongahn commented Jun 17, 2020

I've been poking at this and am struggling to deal with the need to iterate over multiple ranks. In the stat interface I could send multiple responses (one per rank to vector size association). That approach seems ugly and complex.

I agreed. stat is to get the overall statistics and multiple responses would defeat the purpose.

Another possibility would be to iterate over the map keys and flatten the key-value pairs to a single string, and transmit that string in the response. What do you think @dongahn?

This sounds good to me. Perhaps we can compactly flatten this using struct idset. For example, if the number of vertices are the same across all ranks, you can condense this to: "by_rank": { "[0-1021]" : 100 }. If there are two rank groups each with different vertex count, it would be something like: "by_rank": {"[0-511]" : 100, "[512-1021]": 200}.

Doc for idset is here.

@milroy
Copy link
Member Author

milroy commented Jun 17, 2020

If there are two rank groups each with different vertex count, it would be something like: "by_rank": {"[0-511]" : 100, "[512-1021]": 200}.

So the general way to do this is iterate through the ranks and populate an auxiliary map that's keyed by subgraph sizes, and takes vectors of ranks as values. Then I can iterate through the size keys and decode the vector of ranks with idset. Is that the idea?

@milroy
Copy link
Member Author

milroy commented Jun 17, 2020

A more fundamental question is: how do I create a resource_graph with multiple ranks in flux ion-resource?

@dongahn
Copy link
Member

dongahn commented Jun 17, 2020

A more fundamental question is: how do I create a resource_graph with multiple ranks

You can use hwloc reader with sched-fluxoin-resource (I think this is the default). As each hwloc files are unpacked into the resource graph, a distinct rank will be assigned.

@dongahn
Copy link
Member

dongahn commented Jun 17, 2020

So the general way to do this is iterate through the ranks and populate an auxiliary map that's keyed by subgraph sizes, and takes vectors of ranks as values. Then I can iterate through the size keys and decode the vector of ranks with idset. Is that the idea?

Yes. But instead of using vectors of ranks as values, you can use an "essentially" idsets as values to directly encode the rank set into idset during the initial iteration. I say "essentially" because using a c++ wrapper around struct idset will make memory management a bit easier.

class idset_wrap_t {
public:
    ~idset_wrap_t() {
        if (m_idset)
            idset_destroy (m_idset);
    }
    const get_idset () const {
        return m_idset;
    }
    void set_idset (struct idset *idset) {
        if (idset)
            m_idset = idset;
    }
private:
    struct idset *m_idset{nullptr};
};

std::map<size_t,  idset_wrap_t> s2r; // subgraph size to ranks

for (auto &kv : by_rank) {
    if (s2r.find (kv.second.size ()) == s2r.end ()) {
        idset_wrap_t iw;
        struct idset *ids = NULL;
        if ( !(ids = idset_create (0, IDSET_FLAG_AUTOGROW))) {
             error_check ()
        }
        iw.set_idset (ids);
        s2r[kv.second.size ()] = iw;
    }
    if (idset_set (s2r[kv.second.size ()].get_idset (), kv.first) < 0)
        error_check ();
}

// once the above iteration is done, each value of s2r
// has the fully encoded rank idset.

The code not tested at all, though.

@dongahn
Copy link
Member

dongahn commented Jun 17, 2020

Yes. But instead of using vectors of ranks as values, you can use an "essentially" idsets as values to directly encode the rank set into idset during the initial iteration. I say "essentially" because using a c++ wrapper around struct idset will make memory management a bit easier.

@milroy: if you are busy working on other PRs, I can add this support. Let me know.

@milroy
Copy link
Member Author

milroy commented Jun 18, 2020

@dongahn unfortunately the implementation is taking me longer than I expected. Can you add the support?

Also, I think you will need to make some adjustments to idset_wrap_t:

class idset_wrap_t {
private:
    struct idset *m_idset{nullptr};
public:
    ~idset_wrap_t() {
        if (m_idset)
            idset_destroy (m_idset);
    }
    const idset get_idset () {
        return m_idset;
    }
    void set_idset (struct idset *idset) {
        if (idset)
            m_idset = idset;
    }
};

@dongahn
Copy link
Member

dongahn commented Jun 18, 2020

@dongahn unfortunately the implementation is taking me longer than I expected. Can you add the support?

Sure thing. I will get to this soon so that you can focus on the other PRs!

@dongahn dongahn requested a review from SteVwonder June 18, 2020 06:48
@milroy milroy force-pushed the by-rank-map branch 3 times, most recently from d495453 to b22cfd9 Compare June 18, 2020 11:45
@milroy
Copy link
Member Author

milroy commented Jun 18, 2020

I thoughtlessly pulled your new commits @dongahn and pushed some new ones of my own. That added me as the committer to commits authored by you. Is there some way to fix this?

@grondo
Copy link
Contributor

grondo commented Jun 18, 2020

@milroy, IMO it is fine if you are committer, this happens quite often. As long as authorship is preserved it is somewhat standard practice when cherry-picking as you are doing.

If you really need to reset the committer, I think you can amend the commit with GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL set in your environment. You may also have to add GIT_AUTHOR_NAME and GIT_AUTHOR_EMAIL, or use the --author option of git-commit.

@dongahn
Copy link
Member

dongahn commented Jun 18, 2020

I thoughtlessly pulled your new commits @dongahn and pushed some new ones of my own. That added me as the committer to commits authored by you. Is there some way to fix this?

Don't worry about it. Also this didn't changed the author anyway.

@dongahn
Copy link
Member

dongahn commented Jun 18, 2020

@milroy: your new changed LGTM. @SteVwonder: if you can take a quick look at this PR, this should be ready to go in.

milroy referenced this pull request in milroy/flux-sched Jun 18, 2020
- update resource status via depth first traversal
starting at a specified subtree root.
The functions find the subtree's parent vertex
and mark it black to avoid upward traversal.
Copy link
Member

@SteVwonder SteVwonder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I did have one question below, but it is not a blocker to this getting merged. Feel free to set MWP after you squash the fixup commit.

resource/modules/resource_match.cpp Show resolved Hide resolved

test_expect_success 'qmanager: loading resource and qmanager modules works' '
flux module remove sched-simple &&
load_resource prune-filters=ALL:core subsystems=containment policy=low
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Mainly a note to myself)

We may need to add an whitelist/allowlist here to make this test work under both hwloc1 and hwloc2, since the number of resources under each execution target may differ (e.g., numanode vs no numanode). Since this PR is on the critical path, I wouldn't worry about it right now. If it does turn out to be an issue, I can handle it in my upcoming hwloc2 PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and this makes sense. Yes, let's target this work when hwloc2 PR is posted. We can either solve this w/ allowlist or by calculating number of ranks x per rank vertices = overall vertex count.

@milroy
Copy link
Member Author

milroy commented Jun 18, 2020

@dongahn thank you very much for jumping in and creating the additional functionality. @SteVwonder thanks for the quick review!

I'm going to set MWP after flux-sched gets taggged.

@grondo
Copy link
Contributor

grondo commented Jun 18, 2020

@milroy, in case you missed the slack message, flux-sched v0.9.0 has been tagged.

@milroy milroy added the merge-when-passing mergify.io - merge PR automatically once CI passes label Jun 19, 2020
milroy and others added 8 commits June 19, 2020 00:37
add a map to graph store to keep track of
the subgraph vertices associated with each
rank.  This map will be used to mark
resources subgraphs with a new status.
enable population of by_rank map in readers
while loading the resource graph
add console debug output to print the size of
each vertex vector (subgraph) associated
with each rank in the resource graph.
add tests to check the number of vertices associated
with rank -1 (the default value) for GRUG, JGF, and
hwloc test graphs.
Add by_rank support to stat request callback.
"by_rank" key in the response is a JSON
dictionary whereby each key is a rank
idset that contains the same number of
graph vertices.

Make the idset lib a dependency
of fluxion resource.
@mergify mergify bot merged commit 39e263f into flux-framework:master Jun 19, 2020
@milroy milroy deleted the by-rank-map branch June 19, 2020 01:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-passing mergify.io - merge PR automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants