-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add R-mat generator #1411
Add R-mat generator #1411
Conversation
Related to PR #1401 |
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #1411 +/- ##
==============================================
Coverage ? 60.75%
==============================================
Files ? 70
Lines ? 3134
Branches ? 0
==============================================
Hits ? 1904
Misses ? 1230
Partials ? 0 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! It is really concise and nicely done. Did you get some info on performance/scalability while you were developing it?
I think next we want to have bindings for this so @jnke2016 / @rlratzel can leverage this for the python test suite and potentially get rid of the mtx/csv files. We should open a separate issue for this that captures the requirements from the test suite perspective.
* for additional details). a, b, c, d should be non-negative and a + b + c should be no larger | ||
* than 1.0. | ||
* @param seed Seed value for the random number generator. | ||
* @param clip_and_flip Flag controlling whether to generate edges only in the lower triangular part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I understand this is that when clip_and_flip
is false this returns a directed graph. When it is true it can be seen as an undirected one but not the kind of format we use in cugraph.
I think we would benefit from exposing an option that generates the undirected graph inputs that we expect in cugraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct (if clip-and-flip is set to true, all the edges are in the lower triangular part of the graph adjacency matrix, they need to be symmetrized for cuGraph use), and my plan is to symmetrize the output after this R-mat generator; so to keep the R-mat generator's behavior close to Graph 500. Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with doing symmetrize separately, I just recall that the symmetrize step was done at the python level with cudf (concat + group_by) before. Do we have that feature at C++ level now? If not, we would need to add it before we can use this for C++ testing for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have the feature yet. I think symmetrize, transpose, and triangle counting are related problems and can be addressed together.
It took 3-4 seconds to fill my 32 GB GPU memory on GV100; the code may still have some room to performance tune... but my guess is that this is fast enough for most practical cases. I haven't tested this on multi-GPUs, but basically every GPU independently generates edges, I assume this will scale very well (actual scaling issue will happen later when we generate a graph from edge list). |
…multi-edges and self-loops
#1411 added code (to address #1329) that follows the BOOST 1.0 license and this PR adds the BOOST 1.0 license to cuGraph codebase. Authors: - Seunghwa Kang (@seunghwak) Approvers: - Brad Rees (@BradReesWork) URL: #1401
cpp/src/experimental/scramble.cuh
Outdated
constexpr uint64_t scramble_value0{606610977102444280}; // randomly generated | ||
constexpr uint64_t scramble_value1{11680327234415193037}; // randomly generated | ||
|
||
auto v = static_cast<uint64_t>(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The common part of these 3 specialization could be factored out in a template function; e.g.,
template<typename precision_t, typename brev_wrapper_t>
__device__ decltype(auto) generate_v(precision_t const& value,
precision_t const& scramble0,
precision_t const& scramble1,
precision_t const& mask0,
precision_t const& mask1,
size_t lgN,
brev_wrapper_t brev)
{
auto v = static_cast<precision_t>(value);
v += scramble 0 + scramble 1;
v *= (scramble0 | mask0);
v = brev(v) >> (8*sizeof(precision_t) - lgN);
v *= (scramble1| mask1);
v = brev(v) >> (8*sizeof(precision_t) - lgN);
return v;
}
Then in each specialization initialize the scrambles, masks, and the appropriate brev
lambda; and pass all to generate_v(...)
. Something like that. That way the underlying logic is factored out and only moving parts are kept separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And brev
can be constructed in-place as a __device__
generic lambda (if possible... I think there might be some issues with nvcc
compiling generic device lambdas). But, if possible, something like:
auto brev = [] __device__ (auto v){
return __brevll(v);
};
//...
auto v = generate_v(..., brev);
Obviously, different specializations of scramble(...)
would instantiate different versions of brev
, accordingly, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for pointing this out, and I refactored the code to remove redundancy.
@gpucibot merge |
Close #1329 (with #1401)