Define API for MG random walk #2407
Codecov Report
@@ Coverage Diff @@
## branch-22.08 #2407 +/- ##
================================================
+ Coverage 60.11% 60.39% +0.27%
================================================
Files 102 102
Lines 5155 5244 +89
================================================
+ Hits 3099 3167 +68
- Misses 2056 2077 +21
cpp/include/cugraph/algorithms.hpp
Outdated
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view graph view to operate on
* @param start_span Device span defining the starting vertices
I think `span` in the variable name is redundant (as the type is `device_span`). `start_vertices` might be more informative.
I agree, start_vertices would be more clear to developers higher up the stack
We had talked about that and I forgot. Updated in next push.
cpp/include/cugraph/algorithms.hpp
Outdated
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view graph view to operate on
* @param start_span Device span defining the starting vertices
* @param max_depth maximum length of random walk
Would `max_length` be better? (Depth is more relevant to sampling a tree, but I think length makes more sense for random walks.)
Updated in next push.
cpp/include/cugraph/algorithms.hpp
Outdated
* vertices in the random walk. If a path terminates before max_depth,
* the vertices will be populated with invalid_vertex_id
* (-1 for signed vertex_t, std::numeric_limits<vertex_t>::max() for an
* unsigned vertex_t * type)<br>
an unsigned vertex_t * type => unsigned vertex_t
Fixed in next push
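For readers following the padding rule in the doc comment under review, here is a minimal host-side sketch of it. The helper names `invalid_vertex_id()` and `pad_walk()` and the `padded_size` parameter are hypothetical and use `std::vector` instead of device memory; this is not the cuGraph implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical helper (not the cuGraph API): the sentinel described in the
// doc comment, -1 for signed vertex types and the maximum value otherwise.
template <typename vertex_t>
constexpr vertex_t invalid_vertex_id()
{
  if constexpr (std::is_signed_v<vertex_t>) {
    return static_cast<vertex_t>(-1);
  } else {
    return std::numeric_limits<vertex_t>::max();
  }
}

// Hypothetical helper: pad a walk that terminated early so that it holds
// padded_size entries. A walk that is already long enough is left unchanged.
template <typename vertex_t>
std::vector<vertex_t> pad_walk(std::vector<vertex_t> walk, std::size_t padded_size)
{
  if (walk.size() < padded_size) { walk.resize(padded_size, invalid_vertex_id<vertex_t>()); }
  return walk;
}
```

With a signed 32-bit vertex type, a walk that stopped after three vertices and is padded to five entries ends in two `-1` values.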
cpp/include/cugraph/algorithms.hpp
Outdated
graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
raft::device_span<vertex_t const> start_span,
size_t max_depth,
uint64_t seed = 0);
How does this function work for unweighted graphs? Still return rmm::device_uvector<weight_t>?
I don't know that we've explored support for an unweighted graph very much, especially as it relates to return values.
I could wrap the return in `std::optional` and return `std::nullopt` if the graph is unweighted. `biased_*` would fail on an unweighted graph. `node2vec_*` could assume a weight of 1 on an unweighted graph.
Gotcha, and I think it is better to specify how we handle unweighted graphs in the documentation.
Updated in next push (check all 3 APIs)
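The `std::optional` proposal in this thread can be illustrated with a small host-side sketch. The `toy_graph_view` struct and `gather_walk_weights()` function are hypothetical stand-ins using `std::vector`; they are not the cuGraph types or the code in this PR, and only show the "return `std::nullopt` when the graph is unweighted" convention.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a graph view: edge weights are present only
// when the graph is weighted.
template <typename weight_t>
struct toy_graph_view {
  std::optional<std::vector<weight_t>> edge_weights{std::nullopt};
  bool is_weighted() const { return edge_weights.has_value(); }
};

// Hypothetical helper: given the indices of the edges traversed by a walk,
// return their weights, or std::nullopt if the graph carries no weights.
template <typename weight_t>
std::optional<std::vector<weight_t>> gather_walk_weights(
  toy_graph_view<weight_t> const& graph, std::vector<std::size_t> const& walked_edges)
{
  if (!graph.is_weighted()) { return std::nullopt; }

  std::vector<weight_t> weights;
  weights.reserve(walked_edges.size());
  for (auto edge : walked_edges) {
    weights.push_back(graph.edge_weights->at(edge));  // at() bounds-checks the index
  }
  return weights;
}
```

A caller checks `has_value()` on the result before reading weights, which is the behavior the reviewers asked to have spelled out in the documentation.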
cpp/src/sampling/random_walks_sg.cu
Outdated
raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
raft::device_span<vertex_t const> start_span,
size_t max_depth,
`max_length`?
###################################################################################################
# - RANDOM_WALKS tests ----------------------------------------------------------------------------
ConfigureTest(RANDOM_WALKS_TEST sampling/random_walks_test.cu)
ConfigureTest(RANDOM_WALKS_TEST sampling/sg_random_walks_test.cu)
Should we name SG tests with `sg_`? Our convention has been omitting `sg_` for SG tests. If we decide to use `sg_` for SG tests, we need to apply this for all SG tests.
This is also temporary. There is an existing random walks test for the original implementation (I hesitate to use the word legacy, since we usually reserve it for implementations that use the legacy graph objects). I can't delete the original test until I have a replacement working.
I could rename the existing test to legacy if you think that would be cleaner.
OK, no complaint if temporary, just don't forget to fix this later.
Added a FIXME to remind me
cpp/tests/CMakeLists.txt
Outdated
@@ -631,6 +639,7 @@ if(BUILD_CUGRAPH_MG_TESTS)
ConfigureCTestMG(MG_CAPI_EIGENVECTOR_CENTRALITY c_api/mg_eigenvector_centrality_test.c c_api/mg_test_utils.cpp)
ConfigureCTestMG(MG_CAPI_HITS c_api/mg_hits_test.c c_api/mg_test_utils.cpp)
ConfigureCTestMG(MG_CAPI_UNIFORM_NEIGHBOR_SAMPLE c_api/mg_uniform_neighbor_sample_test.c c_api/mg_test_utils.cpp)
ConfigureCTestMG(MG_CAPI_RANDOM_WALKS c_api/mg_random_walks_test.c c_api/mg_test_utils.cpp)
Better align indentation.
Fixed in next push
Looks good, just the one change to use "start_vertices" or "source_vertices" to be consistent with the rest of the C API.
rerun tests
* set to weight_t{0}.
*/
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>
This does not accept unweighted graphs, so should the returned weight vector here be `std::optional`? Can this ever be `std::nullopt`?
I was keeping the signature the same for consistency. But you are correct, this function would never return `std::nullopt`.
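The point raised here can be sketched as a precondition check: if a function rejects unweighted graphs up front, its result never needs to be optional. The helper `require_edge_weights()` below is hypothetical, and `std::invalid_argument` is only a stand-in for whatever error mechanism the library uses; this is not code from the PR.

```cpp
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reject unweighted input before doing any work, so the
// caller can rely on weights being present and return them unconditionally.
template <typename weight_t>
std::vector<weight_t> const& require_edge_weights(
  std::optional<std::vector<weight_t>> const& edge_weights)
{
  if (!edge_weights.has_value()) {
    throw std::invalid_argument("this random walk variant requires a weighted graph");
  }
  return *edge_weights;
}
```

Keeping the `std::optional` return type anyway, as the author did, trades a slightly looser signature for consistency across the three APIs.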
cugraph_error_code_t cugraph_uniform_random_walks(
const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
const cugraph_type_erased_device_array_view_t* sources,
`sources` == `start_vertices` in the code above? If yes, better use the same variable name?
Same for the functions below.
Just pushed an update for this. Also updated in implementation and test files.
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
This file should be renamed to random_walks_test.cu in the future. Just a reminder.
cpp/src/sampling/random_walks_mg.cu
Outdated
std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>
uniform_random_walks(raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
raft::device_span<vertex_t const> start_span,
`start_span` here should be `start_vertices`; better to search for `start_span` and replace every occurrence.
cpp/src/sampling/random_walks_sg.cu
Outdated
std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>
uniform_random_walks(raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
raft::device_span<vertex_t const> start_span,
`start_span` here as well.
Looks good to me
👍
@gpucibot merge
This PR defines the API for MG random walk in the C API and the C++ API.
C and C++ tests are defined, although some of the code is ifdef'ed out since there is not a working implementation here.