Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Implement C/CUDA RandomWalks functionality #1439

Merged
merged 81 commits into from
Mar 30, 2021

Conversation

aschaffer
Copy link
Collaborator

This PR tracks work on issue: #1380.

@rlratzel rlratzel added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change feature request New feature or request and removed improvement Improvement / enhancement to an existing function labels Mar 5, 2021
@BradReesWork BradReesWork added this to the 0.19 milestone Mar 9, 2021
@codecov-io
Copy link

codecov-io commented Mar 10, 2021

Codecov Report

Merging #1439 (d1d4b2f) into branch-0.19 (369beee) will decrease coverage by 2.45%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.19    NVIDIA/thrust#1439      +/-   ##
===============================================
- Coverage        60.72%   58.26%   -2.46%     
===============================================
  Files               70       71       +1     
  Lines             3132     3283     +151     
===============================================
+ Hits              1902     1913      +11     
- Misses            1230     1370     +140     
Impacted Files Coverage Δ
python/cugraph/utilities/utils.py 70.76% <0.00%> (-0.89%) ⬇️
python/cugraph/dask/common/part_utils.py 20.66% <0.00%> (-0.18%) ⬇️
python/cugraph/__init__.py 100.00% <0.00%> (ø)
python/cugraph/community/egonet.py 91.42% <0.00%> (ø)
python/cugraph/traversal/__init__.py 100.00% <0.00%> (ø)
python/cugraph/dask/community/louvain.py 33.33% <0.00%> (ø)
python/cugraph/tree/minimum_spanning_tree.py 85.36% <0.00%> (ø)
python/cugraph/dask/structure/renumber.py
python/cugraph/structure/new_number_map.py 0.00% <0.00%> (ø)
python/cugraph/traversal/ms_bfs.py 11.11% <0.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c1047ed...d1d4b2f. Read the comment docs.

@aschaffer aschaffer requested a review from a team as a code owner March 10, 2021 19:03
@aschaffer aschaffer requested a review from a team as a code owner March 11, 2021 17:09
cpp/tests/experimental/rw_low_level_test.cu Show resolved Hide resolved
cpp/tests/experimental/rw_low_level_test.cu Show resolved Hide resolved
Comment on lines +244 to +246
EXPECT_EQ(v_ro, v_ro_expected);
EXPECT_EQ(v_ci, v_ci_expected);
EXPECT_EQ(v_vs, v_vs_expected);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention here to test make_graph? If so, maybe that would be better off as a separate test to 1) clean up this test, 2) make the test name more accurate, and 3) better communicate that a failure here isn't a failure in RW.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The intention is to factor out make_graph() to be called in several tests to make small graphs with predictable (and observable) topology.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to test make_graph() separately, then move all the repetitive inline calls to make_graph() to a SetUp().

Copy link
Collaborator Author

@aschaffer aschaffer Mar 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not testing make_graph(). Is not doing enough work to warrant separate testing. It's just providing a reproducible graph to further test RW on.

make_graph() is just calling the graph_t constructor and returns a graph_t instance. This path was extensively tested when graph_t was added to our code base. Re-testing old / existing functionality doesn't bring any value, it just bloats our code base and makes tests harder to understand and isolate functionality.

cpp/tests/experimental/rw_low_level_test.cu Show resolved Hide resolved
cpp/tests/experimental/rw_low_level_test.cu Show resolved Hide resolved
cpp/tests/experimental/rw_low_level_test.cu Outdated Show resolved Hide resolved
cpp/tests/experimental/rw_low_level_test.cu Show resolved Hide resolved
cpp/tests/experimental/rw_low_level_test.cu Outdated Show resolved Hide resolved
python/cugraph/sampling/random_walks.pxd Outdated Show resolved Hide resolved
cpp/include/algorithms.hpp Show resolved Hide resolved
rmm::device_uvector<typename graph_t::weight_type>,
rmm::device_uvector<index_t>>
random_walks(raft::handle_t const &handle,
graph_t const &graph,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable. The louvain example (which uses graph_t - changed to graph_view_t in the latest PR) that Andrei may have copied this from does this because the same API supports both experimental and legacy graph classes. Things that only support the experimental graph classes should follow the explicit paradigm, I think.

cpp/include/algorithms.hpp Outdated Show resolved Hide resolved
graph_t const &graph,
typename graph_t::vertex_type const *ptr_d_start,
index_t num_paths,
index_t max_depth);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_paths is only used to convert the raw pointer/size back into a vector in your implementation. That should probably be a size_t.

Do we expect the max_depth to be a larger integer? Would hardcoding that to an int32_t be sufficient? The returned results is going to be O(num_paths * max_depth). That will limit the value that you can pass in to max_depth anyway.

@aschaffer
Copy link
Collaborator Author

aschaffer commented Mar 29, 2021

Perhaps max_depth could just be int32_t, but making it a different type than num_paths would complicate the interface and potentially make it a bit confusing. Perhaps the more consistent / cleaner interface we have would pay off more in the future.

@BradReesWork BradReesWork changed the title [REVIEW] Implement RandomWalks functionality [REVIEW] Implement C/CUDA RandomWalks functionality Mar 30, 2021
Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!!!

@aschaffer
Copy link
Collaborator Author

Thanks!!!

Thank you!

@BradReesWork
Copy link
Member

@gpucibot merge

@rapids-bot rapids-bot bot merged commit f2e5a87 into rapidsai:branch-0.19 Mar 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] RandomWalk analytic - C/CUDA Code
7 participants