Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fea renumbering test #1742

Merged
merged 18 commits into from
Sep 19, 2021

Conversation

ChuckHastings
Copy link
Collaborator

@ChuckHastings ChuckHastings commented Jul 28, 2021

Adds a C++ renumbering test for integer column renumbering.

Also refactored some of the test utilities (some tech debt from the original development of the test utilities). Test graph construction now relies upon the edgelist construction calls instead of replicating functionality.

@ChuckHastings ChuckHastings requested review from a team as code owners July 28, 2021 19:49
@ChuckHastings ChuckHastings self-assigned this Jul 28, 2021
@ChuckHastings ChuckHastings added 3 - Ready for Review non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Jul 28, 2021
@BradReesWork BradReesWork added this to the 21.10 milestone Jul 28, 2021
@codecov-commenter
Copy link

codecov-commenter commented Jul 29, 2021

Codecov Report

Merging #1742 (4507c8c) into branch-21.10 (bf64c2c) will increase coverage by 9.72%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.10    #1742      +/-   ##
================================================
+ Coverage         59.85%   69.57%   +9.72%     
================================================
  Files                77      139      +62     
  Lines              3547     8645    +5098     
================================================
+ Hits               2123     6015    +3892     
- Misses             1424     2630    +1206     
Impacted Files Coverage Δ
python/cugraph/community/triangle_count.py
python/cugraph/dask/components/connectivity.py
...ython/cugraph/centrality/betweenness_centrality.py
...ph/structure/graph_implementation/npartiteGraph.py
python/cugraph/centrality/katz_centrality.py
python/cugraph/dask/link_analysis/pagerank.py
python/cugraph/dask/common/read_utils.py
python/cugraph/dask/community/louvain.py
...raph/structure/graph_implementation/simpleGraph.py
python/cugraph/dask/community/__init__.py
... and 206 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44be72a...4507c8c. Read the comment docs.

cpp/tests/experimental/renumbering_test.cpp Outdated Show resolved Hide resolved
cpp/tests/experimental/renumbering_test.cpp Outdated Show resolved Hide resolved
dst_v.size(),
renumber_map_labels_v.data(),
0,
static_cast<vertex_t>(renumber_map_labels_v.size()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't matter in SG, but if we consider MG extension, unrenumber_local_int_vertices works only with vertices in the range of [graph_view.get_local_vertex_first(), graph_view.get_local_vertex_last()); so unrenumber_int_vertices (https://github.com/rapidsai/cugraph/blob/branch-21.10/cpp/include/cugraph/experimental/graph_functions.hpp#L212) could be more extensible.

Also note the comment in https://github.com/rapidsai/cugraph/blob/branch-21.10/cpp/include/cugraph/experimental/graph_functions.hpp#L185. If we need to unrenumber local edges in the performance critical path, we need to add addition unrenumber functions for better performance.

// TODO: Tease through generate_graph_from_rmat_params
// to extract the edgelist part
// Call cugraph::translate_vertex_ids(handle, d_src_v, d_dst_v, base_vertex_id_);
constexpr bool symmetric{true};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always symmetrize R-mat graphs? This is necessary for WCC or Louvain, but there are many other algorithms that work with asymmetric R-mat graphs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always symmetrize R-mat graphs? This is necessary for WCC or Louvain, but there are many other algorithms that work with asymmetric R-mat graphs.

@ChuckHastings I'm waiting for your answer for this to approve this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the intention of my FIXME comment. Right now the symmetrization flag is inside the test graph classes. I think the goal would be to set the symmetrization flag to false for all of the test graph classes (except file which would read it from the MM file and return the intended value) and add a test graph wrapper that will symmetrize any test graph so that Louvain and WCC could use that wrapper around rmat and other generators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the FIXME comment?

Rmat_Usecase has private member variable bool undirected_{};

So, this should be used instead of redefining symmetric here and ignoring the member variable. I assume users will be surprised if they set Rmat_Usecase.undirected_ = false and get symmetric graphs.

And it seems like construct_graph is still calling

return generate_graph_from_rmat_params<vertex_t, edge_t, weight_t, store_transposed, multi_gpu>(
      handle,
      scale_,
      edge_factor_,
      a_,
      b_,
      c_,
      seed_,
      undirected_,
      scramble_vertex_ids_,
      test_weighted,
      renumber,
      partition_ids,
      comm_size);

So, construct_edgelist here is actually unused. Are you planning to replace generate_graph_from_rmat_params but haven't done the job, yet? Why not replace in this PR if it's the intention?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original implementation work here was truncated and a partially implemented capability was pushed many months ago. This is unfinished tech debt. I will look into making that change as part of this PR.

@ChuckHastings
Copy link
Collaborator Author

rerun tests

1 similar comment
@rlratzel
Copy link
Contributor

rerun tests

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me except for few minor complaints.

// do the perf measurements
// enabled by command line parameter s'--perf'
//
static int PERF = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're no longer using PERF, this gets replaced with cugraph::test::g_perf

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what happens when you start something and then set it aside :-)

Fixed in next push.

}

if (PERF) {
handle.get_stream_view().synchronize(); // for consistent performance measurement
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we're using CUDA_TRY(cudaDeviceSynchronize()) elsewhere.
This will be lower overhead and proper if every operation is executed on stream (this should be true... but may not be always true...) but we may better not mix the two in our tests. So, do you think we'd better use stream synchronize in our tests or we'd better continue using cudaDeviceSynchronize?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched to CUDA_TRY(cudaDeviceSynchronize()) in the next push.

I'm not particularly concerned about overhead in tests. We can consider that change as a general possibility later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... me, neither. Just wasn't happy about mixing cudaDeviceSynchronize() and stream_view.synchronize(). Either is fine for me as long as we're consistent.

::testing::Values(cugraph::test::Rmat_Usecase(10, 16, 0.57, 0.19, 0.19, 0, false, false))));

INSTANTIATE_TEST_SUITE_P(
rmat_large_tests,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/rapidsai/cugraph/blob/branch-21.10/cpp/tests/prims/mg_count_if_v.cu#L225
We renamed rmat_large_tests to rmat_benchmark_test to use this in C++ benchmarking.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in latest push.

@ChuckHastings
Copy link
Collaborator Author

rerun tests

@ChuckHastings
Copy link
Collaborator Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 5604446 into rapidsai:branch-21.10 Sep 19, 2021
@ChuckHastings ChuckHastings deleted the fea_renumbering_test branch February 1, 2022 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants