Add R-mat generator #1411

seunghwak · 2021-02-16T23:24:00Z

Close #1329 (with #1401)

seunghwak · 2021-02-16T23:28:54Z

Related to PR #1401

codecov-io · 2021-02-17T02:04:32Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-0.19@a8311a9).
The diff coverage is n/a.

@@              Coverage Diff               @@
##             branch-0.19    #1411   +/-   ##
==============================================
  Coverage               ?   60.75%           
==============================================
  Files                  ?       70           
  Lines                  ?     3134           
  Branches               ?        0           
==============================================
  Hits                   ?     1904           
  Misses                 ?     1230           
  Partials               ?        0

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a8311a9...b72af3e.

afender

Great! It is really concise and nicely done. Did you get some info on performance/scalability while you were developing it?

I think next we want bindings for this so @jnke2016 / @rlratzel can leverage it for the Python test suite and potentially get rid of the mtx/csv files. We should open a separate issue that captures the requirements from the test suite perspective.

cpp/include/experimental/graph_generator.hpp

afender · 2021-02-18T17:32:15Z

+ * for additional details). a, b, c, d should be non-negative and a + b + c should be no larger
+ * than 1.0.
+ * @param seed Seed value for the random number generator.
+ * @param clip_and_flip Flag controlling whether to generate edges only in the lower triangular part


As I understand it, when clip_and_flip is false this returns a directed graph. When it is true, the output can be seen as an undirected graph, but not in the format we use in cuGraph.

I think we would benefit from exposing an option that generates the undirected graph inputs that we expect in cuGraph.

Yes, that is correct: if clip_and_flip is set to true, all the edges are in the lower triangular part of the graph adjacency matrix, and they need to be symmetrized for cuGraph use. My plan is to symmetrize the output after this R-mat generator, to keep the R-mat generator's behavior close to Graph 500. Any thoughts?

I am fine with doing the symmetrize step separately. I just recall that it was done at the Python level with cudf (concat + group_by) before. Do we have that feature at the C++ level now? If not, we would need to add it before we can use this for C++ testing, for instance.

I don't think we have the feature yet. I think symmetrize, transpose, and triangle counting are related problems and can be addressed together.
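To make the symmetrize step discussed above concrete, here is a minimal host-side sketch in plain C++. It is not a cuGraph API: the `symmetrize` helper and the `edge_t` alias are hypothetical names for illustration. It assumes the edge list fits in host memory and that duplicate edges should be collapsed, loosely mirroring the concat + group_by approach mentioned for the Python level.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (source, destination) pair; hypothetical alias for this sketch.
using edge_t = std::pair<std::int64_t, std::int64_t>;

// Hypothetical helper, not part of cuGraph: returns an edge list that
// contains (dst, src) for every (src, dst), with duplicates removed.
std::vector<edge_t> symmetrize(std::vector<edge_t> edges)
{
  std::size_t const original_size = edges.size();
  edges.reserve(2 * original_size);

  // "concat": append the reversed copy of every non-self-loop edge.
  for (std::size_t i = 0; i < original_size; ++i) {
    auto const src = edges[i].first;
    auto const dst = edges[i].second;
    if (src != dst) { edges.emplace_back(dst, src); }
  }

  // "group_by": sort, then drop repeated (src, dst) pairs.
  std::sort(edges.begin(), edges.end());
  edges.erase(std::unique(edges.begin(), edges.end()), edges.end());
  return edges;
}
```

Collapsing duplicates is a design choice of this sketch. Since the generator allows multi-edges and self-loops, a version that preserves multiplicity would skip the sort and unique step.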

seunghwak · 2021-02-18T18:10:31Z

> Great! It is really concise and nicely done. Did you get some info on performance/scalability while you were developing it?

It took 3-4 seconds to fill my 32 GB GPU memory on a GV100. There may still be room for performance tuning, but my guess is that this is fast enough for most practical cases.

I haven't tested this on multiple GPUs, but since every GPU generates edges independently, I assume this will scale very well (the actual scaling issue will come later, when we generate a graph from the edge list).
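As a rough illustration of why edge generation needs no communication, the following host-side C++ sketch produces R-mat edges from a seed alone. It is not the code in this PR: the function name, signature, and use of `std::mt19937_64` are assumptions made for the example. The parameters `a`, `b`, `c`, `seed`, and `clip_and_flip` follow the documentation quoted earlier, with d taken as 1 - (a + b + c).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// (source, destination) pair; hypothetical alias for this sketch.
using edge_t = std::pair<std::uint64_t, std::uint64_t>;

// Hypothetical host-side sketch of R-mat edge generation. Each edge is built
// by descending `scale` levels of the adjacency matrix and picking one of
// four quadrants per level with probabilities a, b, c, and d.
std::vector<edge_t> generate_rmat_edges_sketch(std::size_t scale,
                                               std::size_t num_edges,
                                               double a,
                                               double b,
                                               double c,
                                               std::uint64_t seed,
                                               bool clip_and_flip)
{
  assert(scale < 64);
  assert(a >= 0.0 && b >= 0.0 && c >= 0.0 && a + b + c <= 1.0);

  std::mt19937_64 rng(seed);
  std::uniform_real_distribution<double> dist(0.0, 1.0);

  std::vector<edge_t> edges;
  edges.reserve(num_edges);
  for (std::size_t e = 0; e < num_edges; ++e) {
    std::uint64_t src = 0;
    std::uint64_t dst = 0;
    for (std::size_t level = 0; level < scale; ++level) {
      double const r = dist(rng);
      // Quadrants: a = (0,0), b = (0,1), c = (1,0), d = (1,1).
      std::uint64_t const src_bit = (r >= a + b) ? 1 : 0;
      std::uint64_t const dst_bit = ((r >= a && r < a + b) || r >= a + b + c) ? 1 : 0;
      src = (src << 1) | src_bit;
      dst = (dst << 1) | dst_bit;
    }
    // Keep only the lower triangular part (src >= dst) by flipping edges
    // that land above the diagonal.
    if (clip_and_flip && src < dst) { std::swap(src, dst); }
    edges.emplace_back(src, dst);
  }
  return edges;
}
```

Because the output depends only on the arguments, independent workers could each call such a function with a distinct seed and never exchange data, which is the property the scaling estimate above relies on.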

@seunghwak

#1411 added code (to address #1329) that follows the BOOST 1.0 license, and this PR adds the BOOST 1.0 license to the cuGraph codebase.

Authors:
- Seunghwa Kang (@seunghwak)

Approvers:
- Brad Rees (@BradReesWork)

URL: #1401

aschaffer · 2021-03-01T18:57:13Z

cpp/src/experimental/scramble.cuh

+  constexpr uint64_t scramble_value0{606610977102444280};    // randomly generated
+  constexpr uint64_t scramble_value1{11680327234415193037};  // randomly generated
+
+  auto v = static_cast<uint64_t>(value);


The common part of these 3 specializations could be factored out into a template function; e.g.,

template<typename precision_t, typename brev_wrapper_t>
__device__ decltype(auto) generate_v(precision_t const& value,
                                     precision_t const& scramble0,
                                     precision_t const& scramble1,
                                     precision_t const& mask0,
                                     precision_t const& mask1,
                                     size_t lgN,
                                     brev_wrapper_t brev)
{
  auto v = static_cast<precision_t>(value);
  v += scramble0 + scramble1;
  v *= (scramble0 | mask0);
  v = brev(v) >> (8 * sizeof(precision_t) - lgN);
  v *= (scramble1 | mask1);
  v = brev(v) >> (8 * sizeof(precision_t) - lgN);
  return v;
}

Then in each specialization initialize the scrambles, masks, and the appropriate brev lambda; and pass all to generate_v(...). Something like that. That way the underlying logic is factored out and only moving parts are kept separate.

And brev can be constructed in-place as a __device__ generic lambda (if possible... I think there might be some issues with nvcc compiling generic device lambdas). But, if possible, something like:

auto brev = [] __device__ (auto v) { return __brevll(v); };
//...
auto v = generate_v(..., brev);

Obviously, different specializations of scramble(...) would instantiate different versions of brev, accordingly, etc.

OK, thanks for pointing this out. I refactored the code to remove the redundancy.
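For readers who want to try the factoring suggested in this thread without a GPU, here is a host-only C++ sketch. It is not the refactored code that was merged. `bit_reverse` is a portable stand-in for the `__brev` / `__brevll` device intrinsics, `scramble64` is a hypothetical wrapper name, and the mask value of 1 is a placeholder chosen only to make both multipliers odd. The two scramble constants are the ones shown in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable host stand-in for the device bit-reversal intrinsics.
template <typename T>
T bit_reverse(T v)
{
  T r = 0;
  for (std::size_t i = 0; i < 8 * sizeof(T); ++i) {
    r = static_cast<T>((r << 1) | (v & 1));
    v >>= 1;
  }
  return r;
}

// Shared logic: add, multiply, bit-reverse, and shift, applied twice.
template <typename precision_t, typename brev_t>
precision_t generate_v(precision_t value,
                       precision_t scramble0,
                       precision_t scramble1,
                       precision_t mask0,
                       precision_t mask1,
                       std::size_t lgN,
                       brev_t brev)
{
  constexpr std::size_t num_bits = 8 * sizeof(precision_t);
  assert(lgN >= 1 && lgN <= num_bits);  // a shift by num_bits would be undefined

  precision_t v = value;
  v += scramble0 + scramble1;
  v *= (scramble0 | mask0);
  v = brev(v) >> (num_bits - lgN);
  v *= (scramble1 | mask1);
  v = brev(v) >> (num_bits - lgN);
  return v;
}

// Hypothetical 64-bit wrapper: only the constants and the bit-reversal
// callable differ between specializations.
inline std::uint64_t scramble64(std::uint64_t value, std::size_t lgN)
{
  constexpr std::uint64_t scramble_value0{606610977102444280ULL};
  constexpr std::uint64_t scramble_value1{11680327234415193037ULL};
  constexpr std::uint64_t mask{1};  // placeholder, NOT the mask used in the PR
  return generate_v<std::uint64_t>(value, scramble_value0, scramble_value1, mask, mask, lgN,
                                   [](std::uint64_t v) { return bit_reverse(v); });
}
```

With odd multipliers, each multiply-and-reverse step maps the low `lgN` bits one-to-one, so in this sketch the inputs 0 through 2^lgN - 1 produce distinct outputs in the same range. A 32-bit variant would pass 32-bit constants and a 32-bit bit-reversal callable to the same `generate_v`.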

BradReesWork · 2021-03-02T20:23:59Z

@gpucibot merge

seunghwak added 6 commits February 9, 2021 12:53

add rmat generator code

ac24a8d

add rmat generator test code

3c89ba3

move is_valid_vertex to test_utilities.hpp

f74c3ff

Merge branch 'branch-0.19' of github.com:rapidsai/cugraph into fea_rm…

f238f2f

…at_gen

Merge branch 'branch-0.19' of github.com:rapidsai/cugraph into fea_rm…

279187c

…at_gen

Merge branch 'branch-0.19' of github.com:rapidsai/cugraph into fea_rm…

afad41e

…at_gen

seunghwak requested review from a team as code owners February 16, 2021 23:24

seunghwak added the 3 - Ready for Review, feature request, and non-breaking labels Feb 16, 2021

clang-format

b72af3e

afender reviewed Feb 18, 2021

updated R-mat generator documentation specifying the function allows …

8c10e51

…multi-edges and self-loops

BradReesWork added this to the 0.19 milestone Feb 22, 2021

seunghwak added 2 commits February 23, 2021 00:12

fix a typo

1b6b201

Merge branch 'branch-0.19' of github.com:rapidsai/cugraph into fea_rm…

0a29f2c

…at_gen

This was referenced Feb 26, 2021

Add boost 1.0 license file. #1401

Merged

[FEA] Large scale RMAT graph generator #1329

Closed

BradReesWork requested a review from aschaffer March 1, 2021 14:59

BradReesWork approved these changes Mar 1, 2021

afender approved these changes Mar 1, 2021

aschaffer requested changes Mar 1, 2021

seunghwak added 2 commits March 2, 2021 13:19

Merge branch 'branch-0.19' of github.com:rapidsai/cugraph into fea_rm…

3331999

…at_gen

refactor scramble

2526918

fix merge conflicts in C++ testing

48fdcd3

aschaffer approved these changes Mar 2, 2021

rapids-bot bot merged commit 07f3d71 into rapidsai:branch-0.19 Mar 2, 2021

afender mentioned this pull request Mar 24, 2021

[ENH] Bindings for R-MAT #1473

Closed

seunghwak deleted the fea_rmat_gen branch June 24, 2021 19:03
