Update Louvain to use new graph primitives and pattern accelerators #1423

Merged
22 commits
97e7d83
Refactor Louvain with new graph primitives
ChuckHastings Feb 23, 2021
e1581d7
clean up include files, delete obsolete header
ChuckHastings Feb 23, 2021
9b6fa0c
Merge branch 'branch-0.19' into fea_louvain_use_new_primitives
ChuckHastings Feb 26, 2021
11aeffa
fix weight use in copy_v_transform_reduce_key_aggregated_out_nbr
ChuckHastings Feb 26, 2021
e8ab13a
use compute_out_weight_sums, add MG unit test
ChuckHastings Feb 26, 2021
f9fb0cd
Merge branch 'branch-0.19' into fea_louvain_use_new_primitives
ChuckHastings Mar 15, 2021
9e4421e
MNMG Louvain working correctly
ChuckHastings Mar 15, 2021
7cb4c13
respond to PR comments
ChuckHastings Mar 16, 2021
d660d4c
rename louvain mg test file
ChuckHastings Mar 16, 2021
c90f608
change Testparams to Usecase
ChuckHastings Mar 16, 2021
0673a52
Address PR review items, improved testing
ChuckHastings Mar 25, 2021
9bd82c2
Merge branch 'branch-0.19' into fea_louvain_use_new_primitives
ChuckHastings Mar 25, 2021
b9cdf40
clean up serial louvain tests
ChuckHastings Mar 26, 2021
ab2fb70
Merge branch 'branch-0.19' into fea_louvain_use_new_primitives
ChuckHastings Mar 26, 2021
b64e65b
revert generate_graph_from_edgelist changes, not necessary with new test
ChuckHastings Mar 26, 2021
bf522a4
update format and copyright information
ChuckHastings Mar 26, 2021
0810213
missed a format issue
ChuckHastings Mar 26, 2021
9e753ca
add cuco/static_map guards for a few primitives that didn't have them
ChuckHastings Mar 26, 2021
d2d1dfe
add guard for cuco static map usage in collect_values_for_keys
ChuckHastings Mar 26, 2021
4a26686
update louvain parameters and documentation to reference a graph_view
ChuckHastings Mar 26, 2021
38dc0e5
fix clang formatting issue
ChuckHastings Mar 29, 2021
a8c91c2
legacy louvain path now uses cuco also
ChuckHastings Mar 29, 2021
64 changes: 64 additions & 0 deletions cpp/include/algorithms.hpp
@@ -14,10 +14,14 @@
* limitations under the License.
*/
#pragma once

#include <dendrogram.hpp>
Review comment from afender (Member), Mar 25, 2021:
(question) For the Python binding, are you planning on having a Louvain feature (maybe in cython.cu) that encapsulates the dendrogram aspects? Or will we add bindings for this class?

Review comment from Collaborator Author:
I was hoping to add bindings for a louvain that returns a Dendrogram, eventually. I had to make the C++ change to support MNMG testing which needed the Dendrogram exposed.

Review comment from Member:
I like returning a dendrogram too (probably optionally).

Did you take a look at SLHC btw? It should return a dendrogram (might be in RAFT already). Better be consistent as we consider adding SLHC to cuGraph also.

Review comment from Collaborator Author:
Corey and I talked about the dendrogram and potentially moving it to raft. We thought waiting until the Louvain work was done in cugraph would be prudent.

#include <experimental/graph.hpp>
#include <experimental/graph_view.hpp>

#include <graph.hpp>
#include <internals.hpp>

#include <raft/handle.hpp>

namespace cugraph {
@@ -637,6 +641,66 @@ std::pair<size_t, typename graph_t::weight_type> louvain(
size_t max_level = 100,
typename graph_t::weight_type resolution = typename graph_t::weight_type{1});

/**
* @brief Louvain implementation, returning dendrogram
*
* Compute a clustering of the graph by maximizing modularity
*
* Computed using the Louvain method described in:
*
* VD Blondel, J-L Guillaume, R Lambiotte and E Lefebvre: Fast unfolding of
* community hierarchies in large networks, J Stat Mech P10008 (2008),
* http://arxiv.org/abs/0803.0476
*
* @throws cugraph::logic_error when an error occurs.
*
* @tparam graph_t Type of graph
*
* @param[in] handle Library handle (RAFT). If a communicator is set in the handle,
Review comment from Member:
"If a communicator is set in the handle, ..." missing the end of the sentence?

Review comment from Collaborator Author:
Removed extraneous text from copy/paste.

* @param[in] graph input graph object (CSR)
Review comment from Member:
We should start clarifying supported graphs (directed, distributed, self loops, multi edges)

* @param[in] max_level (optional) maximum number of levels to run (default 100)
* @param[in] resolution (optional) The value of the resolution parameter to use.
* Called gamma in the modularity formula, this changes the size
* of the communities. Higher resolutions lead to more smaller
* communities, lower resolutions lead to fewer larger
* communities. (default 1)
*
* @return a pair containing:
* 1) unique pointer to dendrogram
* 2) modularity of the returned clustering
*
*/
template <typename graph_t>
std::pair<std::unique_ptr<Dendrogram<typename graph_t::vertex_type>>, typename graph_t::weight_type>
louvain(raft::handle_t const &handle,
graph_t const &graph,
size_t max_level = 100,
typename graph_t::weight_type resolution = typename graph_t::weight_type{1});

/**
* @brief Flatten a Dendrogram at a particular level
*
* A Dendrogram represents a hierarchical clustering/partitioning of
* a graph. This function will flatten the hierarchical clustering into
* a label for each vertex representing the final cluster/partition to
* which it is assigned
*
* @throws cugraph::logic_error when an error occurs.
*
* @tparam graph_t Type of graph
*
* @param[in] handle Library handle (RAFT). If a communicator is set in the handle,
* @param[in] graph input graph object
Review comment from Contributor:
It looks like the actual parameter name is graph_view. Does the doc need to match (or should the parameter be named graph, since it's not a view object)?

Review comment from Collaborator Author:
It's a bit more pervasive. They are all graph views. In fact all of our algorithms are using graph views and calling them graphs.

I'm testing a change that makes the louvain calls consistently name this parameter graph_view. I believe we should do this with all algorithms (I'll try to make this change as I rework things to use the new graph primitives).

* @param[in] dendrogram input dendrogram object
* @param[out] clustering Pointer to device array where the clustering should be stored
*
*/
template <typename graph_t>
void flatten_dendrogram(raft::handle_t const &handle,
afender marked this conversation as resolved.
graph_t const &graph_view,
Dendrogram<typename graph_t::vertex_type> const &dendrogram,
typename graph_t::vertex_type *clustering);

/**
* @brief Leiden implementation
*
33 changes: 21 additions & 12 deletions cpp/src/community/dendrogram.cuh → cpp/include/dendrogram.hpp
@@ -16,6 +16,7 @@
#pragma once

#include <rmm/device_buffer.hpp>
#include <rmm/device_uvector.hpp>

#include <memory>
#include <vector>
@@ -25,30 +26,33 @@ namespace cugraph {
template <typename vertex_t>
class Dendrogram {
public:
void add_level(vertex_t num_verts,
void add_level(vertex_t first_index,
vertex_t num_verts,
cudaStream_t stream = 0,
rmm::mr::device_memory_resource *mr = rmm::mr::get_current_device_resource())
{
level_ptr_.push_back(
std::make_unique<rmm::device_buffer>(num_verts * sizeof(vertex_t), stream, mr));
level_size_.push_back(num_verts);
level_ptr_.push_back(std::make_unique<rmm::device_uvector<vertex_t>>(num_verts, stream, mr));
level_first_index_.push_back(first_index);
}

size_t current_level() const { return level_size_.size() - 1; }
size_t current_level() const { return level_ptr_.size() - 1; }

size_t num_levels() const { return level_size_.size(); }
size_t num_levels() const { return level_ptr_.size(); }

vertex_t const *get_level_ptr_nocheck(size_t level) const
{
return static_cast<vertex_t const *>(level_ptr_[level]->data());
//return static_cast<vertex_t const *>(level_ptr_[level]->data());
return level_ptr_[level]->data();
}

vertex_t *get_level_ptr_nocheck(size_t level)
{
return static_cast<vertex_t *>(level_ptr_[level]->data());
return level_ptr_[level]->data();
}

vertex_t get_level_size_nocheck(size_t level) const { return level_size_[level]; }
size_t get_level_size_nocheck(size_t level) const { return level_ptr_[level]->size(); }

vertex_t get_level_first_index_nocheck(size_t level) const { return level_first_index_[level]; }

vertex_t const *current_level_begin() const { return get_level_ptr_nocheck(current_level()); }

@@ -58,11 +62,16 @@ class Dendrogram {

vertex_t *current_level_end() { return current_level_begin() + current_level_size(); }

vertex_t current_level_size() const { return get_level_size_nocheck(current_level()); }
size_t current_level_size() const { return get_level_size_nocheck(current_level()); }

vertex_t current_level_first_index() const
{
return get_level_first_index_nocheck(current_level());
}

private:
std::vector<vertex_t> level_size_;
std::vector<std::unique_ptr<rmm::device_buffer>> level_ptr_;
std::vector<vertex_t> level_first_index_;
std::vector<std::unique_ptr<rmm::device_uvector<vertex_t>>> level_ptr_;
};

} // namespace cugraph
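
The net effect of this diff is easier to see without the interleaved old and new lines. The sketch below is a host-only stand-in, assuming `std::vector` in place of `rmm::device_uvector` and dropping the stream and memory-resource arguments. The class name `HostDendrogram` is hypothetical. It shows the two changes: level sizes are now read from the container itself, and each level records the first vertex index it covers.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical host-only stand-in for cugraph::Dendrogram (not cuGraph code).
template <typename vertex_t>
class HostDendrogram {
 public:
  // first_index: id of the vertex stored at position 0 of this level.
  void add_level(vertex_t first_index, vertex_t num_verts)
  {
    level_ptr_.push_back(std::make_unique<std::vector<vertex_t>>(num_verts));
    level_first_index_.push_back(first_index);
  }

  std::size_t num_levels() const { return level_ptr_.size(); }
  std::size_t current_level() const { return level_ptr_.size() - 1; }

  // The size comes from the container, so no separate size vector is kept.
  std::size_t get_level_size_nocheck(std::size_t level) const
  {
    return level_ptr_[level]->size();
  }

  vertex_t get_level_first_index_nocheck(std::size_t level) const
  {
    return level_first_index_[level];
  }

  vertex_t *get_level_ptr_nocheck(std::size_t level) { return level_ptr_[level]->data(); }
  vertex_t const *get_level_ptr_nocheck(std::size_t level) const
  {
    return level_ptr_[level]->data();
  }

 private:
  std::vector<vertex_t> level_first_index_;
  std::vector<std::unique_ptr<std::vector<vertex_t>>> level_ptr_;
};
```

A call such as `add_level(4, 3)` models a process that owns three vertices starting at id 4. The single-GPU callers in this PR (ECG, Leiden) pass 0 as the first index.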
8 changes: 8 additions & 0 deletions cpp/include/graph.hpp
@@ -69,6 +69,14 @@ class GraphViewBase {
edge_t *local_edges;
vertex_t *local_offsets;

vertex_t get_number_of_vertices() const {
return number_of_vertices;
}

vertex_t get_local_vertex_first() const {
return vertex_t{0};
}

/**
* @brief Fill the identifiers array with the vertex identifiers.
*
12 changes: 7 additions & 5 deletions cpp/include/utilities/device_comm.cuh
@@ -238,10 +238,12 @@ template <typename InputIterator, typename OutputIterator, size_t I>
struct device_sendrecv_tuple_iterator_element_impl<InputIterator, OutputIterator, I, I> {
void run(raft::comms::comms_t const& comm,
InputIterator input_first,
size_t count,
size_t tx_count,
int dst,
int base_tag,
raft::comms::request_t* requests) const
OutputIterator output_first,
size_t rx_count,
int src,
cudaStream_t stream) const
{
}
};
@@ -460,7 +462,7 @@ struct device_reduce_tuple_iterator_element_impl {
op,
root,
stream);
device_reduce_tuple_iterator_element_impl<InputIterator, OutputIterator, I + 1, N>(
device_reduce_tuple_iterator_element_impl<InputIterator, OutputIterator, I + 1, N>().run(
comm, input_first, output_first, count, op, root, stream);
}
};
@@ -890,7 +892,7 @@ device_reduce(raft::comms::comms_t const& comm,
thrust::tuple_size<typename thrust::iterator_traits<InputIterator>::value_type>::value;

detail::
device_reduce_tuple_iterator_element_impl<InputIterator, OutputIterator, size_t{0}, tuple_size>(
device_reduce_tuple_iterator_element_impl<InputIterator, OutputIterator, size_t{0}, tuple_size>().run(
comm, input_first, output_first, count, op, root, stream);
}
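
The two `device_reduce` hunks above both change `impl<...>(args)` to `impl<...>().run(args)`. The first form tries to construct a temporary from the arguments and never calls `run()`. The second constructs the functor and then invokes it. The sketch below reproduces the same compile-time recursion over tuple elements in plain C++. The names `sum_tuple_element_impl` and `sum_tuple` are hypothetical, and a running sum stands in for the per-element collective call.

```cpp
#include <cstddef>
#include <tuple>

// Hypothetical stand-in for the per-element recursion in device_comm.cuh.
// Processes element I, then recurses to element I + 1 until I == N.
template <typename Tuple, std::size_t I, std::size_t N>
struct sum_tuple_element_impl {
  void run(Tuple const &t, double &acc) const
  {
    acc += static_cast<double>(std::get<I>(t));
    // Construct the next functor, then call run() on it. Omitting "().run"
    // would request construction from (t, acc) instead of a call.
    sum_tuple_element_impl<Tuple, I + 1, N>().run(t, acc);
  }
};

// Terminating specialization: I == N, nothing left to process.
template <typename Tuple, std::size_t I>
struct sum_tuple_element_impl<Tuple, I, I> {
  void run(Tuple const &, double &) const {}
};

// Entry point: start the recursion at element 0.
template <typename... Ts>
double sum_tuple(std::tuple<Ts...> const &t)
{
  double acc = 0.0;
  sum_tuple_element_impl<std::tuple<Ts...>, std::size_t{0}, sizeof...(Ts)>().run(t, acc);
  return acc;
}
```

Because the functor type depends on template parameters, a mistake of this kind is only diagnosed when the template is instantiated. That may be why it surfaced in this PR, although the page does not say so.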

2 changes: 1 addition & 1 deletion cpp/src/community/ecg.cu
@@ -117,7 +117,7 @@ class EcgLouvain : public cugraph::Louvain<graph_type> {

void initialize_dendrogram_level(vertex_t num_vertices) override
{
this->dendrogram_->add_level(num_vertices);
this->dendrogram_->add_level(0, num_vertices);

get_permutation_vector(
num_vertices, seed_, this->dendrogram_->current_level_begin(), this->stream_);
39 changes: 22 additions & 17 deletions cpp/src/community/flatten_dendrogram.cuh
@@ -15,7 +15,7 @@
*/
#pragma once

#include <community/dendrogram.cuh>
#include <dendrogram.hpp>
#include <experimental/graph_functions.hpp>

#include <rmm/thrust_rmm_allocator.h>
@@ -31,23 +31,28 @@ void partition_at_level(raft::handle_t const &handle,
size_t level)
{
vertex_t local_num_verts = dendrogram.get_level_size_nocheck(0);
rmm::device_uvector<vertex_t> local_vertex_ids_v(local_num_verts, handle.get_stream());

thrust::copy(rmm::exec_policy(handle.get_stream())->on(handle.get_stream()),
d_vertex_ids,
d_vertex_ids + local_num_verts,
d_partition);

std::for_each(thrust::make_counting_iterator<size_t>(0),
thrust::make_counting_iterator<size_t>(level),
[&handle, &dendrogram, d_vertex_ids, &d_partition, local_num_verts](size_t l) {
cugraph::experimental::relabel<vertex_t, multi_gpu>(
handle,
std::tuple<vertex_t const *, vertex_t const *>(
d_vertex_ids, dendrogram.get_level_ptr_nocheck(l)),
dendrogram.get_level_size_nocheck(l),
d_partition,
local_num_verts);
});
raft::copy(d_partition, d_vertex_ids, local_num_verts, handle.get_stream());

std::for_each(
thrust::make_counting_iterator<size_t>(0),
thrust::make_counting_iterator<size_t>(level),
[&handle, &dendrogram, &local_vertex_ids_v, d_vertex_ids, &d_partition, local_num_verts](
size_t l) {
thrust::sequence(rmm::exec_policy(handle.get_stream())->on(handle.get_stream()),
local_vertex_ids_v.begin(),
local_vertex_ids_v.begin() + dendrogram.get_level_size_nocheck(l),
dendrogram.get_level_first_index_nocheck(l));

cugraph::experimental::relabel<vertex_t, multi_gpu>(
handle,
std::tuple<vertex_t const *, vertex_t const *>(local_vertex_ids_v.data(),
dendrogram.get_level_ptr_nocheck(l)),
dendrogram.get_level_size_nocheck(l),
d_partition,
local_num_verts);
});
}

} // namespace cugraph
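
The new version of `partition_at_level` can be modelled on the host as follows. This is a sketch under stated assumptions: `HostLevel` is a hypothetical struct standing in for one dendrogram level, an `std::unordered_map` stands in for `cugraph::experimental::relabel`, and labels with no entry in the map are left unchanged (the behaviour of the real primitive in that case is not shown on this page). The old labels for level `l` are the sequence starting at that level's first index, matching the `thrust::sequence` call in the diff.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical host-only model of one dendrogram level (not cuGraph code):
// labels[i] is the cluster assigned to the vertex with id first_index + i.
struct HostLevel {
  int first_index;
  std::vector<int> labels;
};

// Start with every vertex labelled by its own id, then apply one relabel
// pass for each dendrogram level below `level`.
std::vector<int> partition_at_level(std::vector<HostLevel> const &dendrogram,
                                    std::vector<int> const &vertex_ids,
                                    std::size_t level)
{
  std::vector<int> partition(vertex_ids);
  for (std::size_t l = 0; l < level; ++l) {
    // Old labels are first_index, first_index + 1, ...; new labels are the
    // cluster assignments stored in this level.
    std::unordered_map<int, int> relabel_map;
    for (std::size_t i = 0; i < dendrogram[l].labels.size(); ++i) {
      relabel_map[dendrogram[l].first_index + static_cast<int>(i)] = dendrogram[l].labels[i];
    }
    for (int &label : partition) {
      auto it = relabel_map.find(label);
      if (it != relabel_map.end()) label = it->second;  // assumption: keep unmatched labels
    }
  }
  return partition;
}
```

With a level 0 of `{0, 0, 1, 1, 2, 2}` and a level 1 of `{0, 0, 1}`, both starting at index 0, flattening at level 1 returns the level-0 labels and flattening at level 2 returns `{0, 0, 0, 0, 1, 1}`.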
2 changes: 1 addition & 1 deletion cpp/src/community/leiden.cuh
@@ -132,7 +132,7 @@ class Leiden : public Louvain<graph_type> {
//
// Initialize every cluster to reference each vertex to itself
//
this->dendrogram_->add_level(current_graph.number_of_vertices);
this->dendrogram_->add_level(0, current_graph.number_of_vertices);

thrust::sequence(rmm::exec_policy(this->stream_)->on(this->stream_),
this->dendrogram_->current_level_begin(),