Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random vertex sampling utility function for C++ tests #3347

Merged

Conversation

seunghwak
Copy link
Contributor

@seunghwak seunghwak commented Mar 18, 2023

Code clean-up and use a common vertex sampling function throughout C++ tests.

Additional bug fixes

  • C++ induced subgraph test (the reference CPU code assumes that vertex sets are sorted within a set, but this is not required in the cuGraph induced subgraph function).
    aa25afb
  • divide by 0 error when creating a graph with 0 edges (or # edges / # vertices is close to 0) in multi-GPU. Note that static_cast<size_t>(inf) is (surprisingly) 0.
    290751f

@seunghwak seunghwak changed the title random vertex sampling utility function for C++ tests [skip-ci] random vertex sampling utility function for C++ tests Mar 18, 2023
@seunghwak seunghwak changed the title [skip-ci] random vertex sampling utility function for C++ tests Random vertex sampling utility function for C++ tests Mar 22, 2023
@seunghwak seunghwak self-assigned this Mar 22, 2023
@seunghwak seunghwak added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 22, 2023
@seunghwak seunghwak marked this pull request as ready for review March 22, 2023 18:00
@seunghwak seunghwak requested a review from a team as a code owner March 22, 2023 18:00
@seunghwak seunghwak added this to the 23.04 milestone Mar 22, 2023
size_t sample_size,
bool with_replacement,
bool sort_samples)
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we perhaps use

rmm::device_uvector<vertex_t> select_random_vertices(
, or combine these by adding the sort_samples parameter to that function and perhaps using this implementation? Seems like the functionality should be the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, you're right. I forgot about the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@seunghwak seunghwak requested a review from a team as a code owner March 24, 2023 18:10
Comment on lines +133 to +141
if (min_buffer_size <= select_count / comm_size) {
auto new_sizes = std::vector<size_t>(comm_size, min_buffer_size);
auto num_deficits = select_count - min_buffer_size * comm_size;
for (int i = 0; i < comm_size; ++i) {
auto delta = std::min(num_deficits, buffer_sizes[i] - min_buffer_size);
new_sizes[i] += delta;
num_deficits -= delta;
}
this_gpu_select_count = new_sizes[comm_rank];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder when would this be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say we need to select 798 vertices out of 800 vertices.

And after shuffling 800 vertices, GPU0 has 380 vertices and GPU 1 has 420 vertices.

We can't select 798/2=399 vertices in GPU0, so we need to select 380 in GPU0 and 418 in GPU1.

@ChuckHastings
Copy link
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit f4f09ba into rapidsai:branch-23.04 Mar 28, 2023
@seunghwak seunghwak deleted the enh_mg_test_random_vertex_set branch May 5, 2023 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants