Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize K-Truss #4375

Merged
merged 105 commits into from
Jul 30, 2024
Merged

Conversation

jnke2016
Copy link
Contributor

@jnke2016 jnke2016 commented Apr 25, 2024

This PR leverages our existing edge triangle count to implement both SG and MG K-Truss as our initial version.
This PR also:

  • Exposes the rx count in several as our shuffling functions
  • Add C and C++ tests for MG K-Truss

Closes #4500

@github-actions github-actions bot added the CMake label May 4, 2024
Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much simpler. A few cleanup comments, otherwise I think it's looking pretty good.

@@ -136,7 +136,7 @@ edge_property_t<graph_view_t<vertex_t, edge_t, false, multi_gpu>, edge_t> edge_t
auto edge_first = thrust::make_zip_iterator(edgelist_srcs.begin(), edgelist_dsts.begin());

size_t edges_to_intersect_per_iteration =
static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 17);
static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 13);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for 13? Should this be some sort of named constant so it's easier to change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea to expose it in order to easily control the chunk size. I temporarily set a lower value so that I can process a larger graph when running at scale

@@ -0,0 +1,41 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 2024

@@ -0,0 +1,41 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 2024

cpp/src/community/k_truss_mg_v64_e64.cu Outdated Show resolved Hide resolved

std::tie(edgelist_srcs, edgelist_dsts, edgelist_wgts, num_triangles, std::ignore) =
decompress_to_edgelist(
std::chrono::seconds s (0); // 1 second
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two thoughts...

  1. Have you looked at https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/include/cugraph/utilities/high_res_timer.hpp which I think you could just include and use rather than reinventing. There are features to manage multiple timers inside that as well
  2. Probably should remove these before we merge the PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only wanted to time the triangle count part and I got a compiler error when copying the high_res_timer timing code from our tests so I didn't spend much time understanding what was wrong. But I will look into it again.

Probably should remove these before we merge the PR

I only have them for benchmarking on Draco. I will remove them in my next commit today along with all the unused functors

cpp/tests/CMakeLists.txt Show resolved Hide resolved
struct KTruss_Usecase {
int32_t k_{3};
bool test_weighted_{false};
// FIXME: test edge mask
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we are skipping this? (edge mask)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I am not. I just forgot to remove this outdated comment

mg_graph_view.vertex_partition_range_lasts());

auto global_d_cugraph_srcs = cugraph::test::device_gatherv(
*handle_, raft::device_span<vertex_t const>(d_cugraph_srcs.data(), d_cugraph_srcs.size()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for additional nitpicking, but it might be better to follow naming convention of K-core tests (https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/tests/cores/mg_k_core_test.cpp#L136). Later, if this pattern occurs frequent enough, we can create another utility funciton.

@@ -497,7 +132,7 @@ k_truss(raft::handle_t const& handle,
exclude_self_loop_t<vertex_t>{});

if constexpr (multi_gpu) {
std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore) =
std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore, std::ignore) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can defer this to the next MG optimization PR, but you can use masking here instead of creating a new graph for excluding self-loops (see the example here. https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/src/community/triangle_count_impl.cuh#L351).

@@ -656,265 +291,53 @@ k_truss(raft::handle_t const& handle,
*vertex_partition_range_lasts);
}
renumber_map = std::move(tmp_renumber_map);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use masking instead of creating a graph except for the final DODG graph creation (this DODG graph will be used again & again so better create a new graph than using masking as masking has its own overhead).

while (true) {

auto edge_triangle_counts =
edge_triangle_count<vertex_t, edge_t, multi_gpu>(handle, cur_graph_view);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a FIXME statement here.

This approach will miserably fail if we need to go through many iterations and only a small number of edges are invalidated in each iteration. No need to address this if this doesn't happen for any practical graphs but if we find one, we can work on optimization focusing on that graph.

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jnke2016 jnke2016 force-pushed the branch-24.06_optimize-k-truss branch from d96081c to 3743208 Compare July 29, 2024 21:07
@ChuckHastings
Copy link
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit a1f8a65 into rapidsai:branch-24.08 Jul 30, 2024
132 checks passed
@bdice bdice removed request for a team July 30, 2024 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cuGraph improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test MNMG Code Path for k-truss
6 participants