Remove legacy sampling implementation, no longer used (rapidsai#3252)
Release 23.02 refactored the uniform neighborhood sampling code, which required changing the API. We left the old API in place so that the Python code could be migrated. The old C/C++ API is no longer used; this PR removes all of the code that supports the obsolete API.

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3252
ChuckHastings authored Feb 11, 2023
1 parent 99f0b78 commit 110cfb4
Showing 15 changed files with 60 additions and 1,318 deletions.
48 changes: 0 additions & 48 deletions cpp/include/cugraph/algorithms.hpp
@@ -1705,54 +1705,6 @@ k_core(raft::handle_t const& handle,
std::optional<raft::device_span<edge_t const>> core_numbers,
bool do_expensive_check = false);

/**
* @brief Uniform Neighborhood Sampling.
*
* @deprecated This function should be replaced with uniform_neighbor_sample. Input of the
* new function adds an optional parameter, output has a number of extra fields.
*
* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects from these outgoing neighbors to extract a subgraph.
*
* Output from this function a set of tuples (src, dst, weight, count), identifying the randomly
* selected edges. src is the source vertex, dst is the destination vertex, weight is the weight
* of the edge and count identifies the number of times this edge was encountered during the
* sampling of this graph (so it is >= 1).
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view Graph View object to generate NBR Sampling on.
* @param edge_weight_view Optional view object holding edge weights for @p graph_view.
* @param starting_vertices Device span of starting vertex IDs for the NBR Sampling.
* @param fan_out Host span defining branching out (fan-out) degree per source vertex for each
* level
* @param with_replacement boolean flag specifying if random sampling is done with replacement
* (true); or, without replacement (false); default = true;
* @param seed A seed to initialize the random number generator
* @return tuple device vectors (vertex_t source_vertex, vertex_t destination_vertex, weight_t
* weight, edge_t count)
*/
template <typename vertex_t,
typename edge_t,
typename weight_t,
bool store_transposed,
bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
rmm::device_uvector<weight_t>,
rmm::device_uvector<edge_t>>
uniform_nbr_sample(raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
raft::device_span<vertex_t> starting_vertices,
raft::host_span<const int> fan_out,
bool with_replacement = true,
uint64_t seed = 0);

/**
* @brief Uniform Neighborhood Sampling.
*
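For readers who want a concrete picture of what the removed `uniform_nbr_sample` declaration documented, the following is a minimal host-side sketch of uniform neighborhood sampling with a per-level fan-out. It is not the cuGraph implementation: `HostCsr`, `SampledEdge`, and `uniform_neighbor_sample_host` are hypothetical names, the graph lives in plain `std::vector`s rather than device memory, and two behaviors are assumptions made for illustration (sampling without replacement takes at most `min(fan_out, degree)` edges, and the next frontier is not deduplicated).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical host-side CSR graph; not a cuGraph type.
struct HostCsr {
  std::vector<std::size_t> offsets;  // size = num_vertices + 1
  std::vector<int32_t> indices;      // destination of each outgoing edge
  std::vector<float> weights;        // weight of each outgoing edge
};

using SampledEdge = std::tuple<int32_t, int32_t, float>;  // (src, dst, weight)

// For each level, sample outgoing edges of every frontier vertex uniformly at random.
std::vector<SampledEdge> uniform_neighbor_sample_host(HostCsr const& g,
                                                      std::vector<int32_t> frontier,
                                                      std::vector<int> const& fan_out,
                                                      bool with_replacement,
                                                      uint64_t seed)
{
  assert(g.indices.size() == g.weights.size());
  std::mt19937_64 rng(seed);
  std::vector<SampledEdge> result;

  for (int k : fan_out) {
    std::vector<int32_t> next_frontier;
    for (int32_t src : frontier) {
      std::size_t const first  = g.offsets[src];
      std::size_t const degree = g.offsets[src + 1] - first;
      if (degree == 0 || k <= 0) continue;

      std::vector<std::size_t> picks;
      if (with_replacement) {
        // Draw k independent positions; the same edge may appear more than once.
        std::uniform_int_distribution<std::size_t> dist(0, degree - 1);
        for (int i = 0; i < k; ++i) picks.push_back(dist(rng));
      } else {
        // Partial Fisher-Yates shuffle: the first n slots become a uniform
        // sample of n distinct positions (assumption: n = min(k, degree)).
        std::vector<std::size_t> perm(degree);
        std::iota(perm.begin(), perm.end(), std::size_t{0});
        std::size_t const n = std::min(static_cast<std::size_t>(k), degree);
        for (std::size_t i = 0; i < n; ++i) {
          std::uniform_int_distribution<std::size_t> dist(i, degree - 1);
          std::swap(perm[i], perm[dist(rng)]);
        }
        picks.assign(perm.begin(), perm.begin() + static_cast<std::ptrdiff_t>(n));
      }

      for (std::size_t p : picks) {
        int32_t const dst = g.indices[first + p];
        result.emplace_back(src, dst, g.weights[first + p]);
        next_frontier.push_back(dst);  // sampled destinations seed the next level
      }
    }
    frontier = std::move(next_frontier);
  }
  return result;
}
```

Calling it with a start list of `{0}` and a fan-out of `{2, 1}` samples up to two edges out of vertex 0 and then up to one edge out of each destination reached. The parameter names mirror the removed declaration (`fan_out`, `with_replacement`, `seed`) only to make the correspondence easy to follow.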
39 changes: 0 additions & 39 deletions cpp/include/cugraph_c/sampling_algorithms.h
@@ -187,35 +187,6 @@ typedef struct {
int32_t align_;
} cugraph_sample_result_t;

/**
* @brief Uniform Neighborhood Sampling
* @deprecated This call should be replaced with cugraph_uniform_neighborhood_sampling
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] start Device array of start vertices for the sampling
* @param [in] fanout Host array defining the fan out at each step in the sampling algorithm
* @param [in] with_replacement
* Boolean value. If true selection of edges is done with
* replacement. If false selection is done without replacement.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [in] result Output from the uniform_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_uniform_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
const cugraph_type_erased_device_array_view_t* start,
const cugraph_type_erased_host_array_view_t* fan_out,
bool_t with_replacement,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Uniform Neighborhood Sampling
*
@@ -331,16 +302,6 @@ cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_hop(
cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_index(
const cugraph_sample_result_t* result);

/**
* @brief Get the transaction counts from the sampling algorithm result
*
* @param [in] result The result from a sampling algorithm
* @return type erased host array pointing to the counts
*/
// FIXME: This will be obsolete when the older mechanism is removed
cugraph_type_erased_host_array_view_t* cugraph_sample_result_get_counts(
const cugraph_sample_result_t* result);

/**
* @brief Free a sampling result
*
155 changes: 2 additions & 153 deletions cpp/src/c_api/uniform_neighbor_sampling.cpp
@@ -40,126 +40,13 @@ struct cugraph_sample_result_t {
cugraph_type_erased_device_array_t* wgt_{nullptr};
cugraph_type_erased_device_array_t* hop_{nullptr};
cugraph_type_erased_device_array_t* label_{nullptr};
// FIXME: Will be deleted once experimental replaces current
cugraph_type_erased_host_array_t* count_{nullptr};
};

} // namespace c_api
} // namespace cugraph

namespace {

struct uniform_neighbor_sampling_functor_deprecate : public cugraph::c_api::abstract_functor {
raft::handle_t const& handle_;
cugraph::c_api::cugraph_graph_t* graph_{nullptr};
cugraph::c_api::cugraph_type_erased_device_array_view_t const* start_{nullptr};
cugraph::c_api::cugraph_type_erased_host_array_view_t const* fan_out_{nullptr};
bool with_replacement_{false};
bool do_expensive_check_{false};
cugraph::c_api::cugraph_sample_result_t* result_{nullptr};

uniform_neighbor_sampling_functor_deprecate(cugraph_resource_handle_t const* handle,
cugraph_graph_t* graph,
cugraph_type_erased_device_array_view_t const* start,
cugraph_type_erased_host_array_view_t const* fan_out,
bool with_replacement,
bool do_expensive_check)
: abstract_functor(),
handle_(*reinterpret_cast<cugraph::c_api::cugraph_resource_handle_t const*>(handle)->handle_),
graph_(reinterpret_cast<cugraph::c_api::cugraph_graph_t*>(graph)),
start_(
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_view_t const*>(start)),
fan_out_(
reinterpret_cast<cugraph::c_api::cugraph_type_erased_host_array_view_t const*>(fan_out)),
with_replacement_(with_replacement),
do_expensive_check_(do_expensive_check)
{
}

template <typename vertex_t,
typename edge_t,
typename weight_t,
typename edge_type_t,
bool store_transposed,
bool multi_gpu>
void operator()()
{
// FIXME: Think about how to handle SG vice MG
if constexpr (!cugraph::is_candidate<vertex_t, edge_t, weight_t>::value) {
unsupported();
} else {
// uniform_nbr_sample expects store_transposed == false
if constexpr (store_transposed) {
error_code_ = cugraph::c_api::
transpose_storage<vertex_t, edge_t, weight_t, store_transposed, multi_gpu>(
handle_, graph_, error_.get());
if (error_code_ != CUGRAPH_SUCCESS) return;
}

auto graph =
reinterpret_cast<cugraph::graph_t<vertex_t, edge_t, false, multi_gpu>*>(graph_->graph_);

auto graph_view = graph->view();

auto edge_weights = reinterpret_cast<
cugraph::edge_property_t<cugraph::graph_view_t<vertex_t, edge_t, false, multi_gpu>,
weight_t>*>(graph_->edge_weights_);

auto number_map = reinterpret_cast<rmm::device_uvector<vertex_t>*>(graph_->number_map_);

rmm::device_uvector<vertex_t> start(start_->size_, handle_.get_stream());
raft::copy(start.data(), start_->as_type<vertex_t>(), start.size(), handle_.get_stream());

//
// Need to renumber sources
//
cugraph::renumber_ext_vertices<vertex_t, multi_gpu>(
handle_,
start.data(),
start.size(),
number_map->data(),
graph_view.local_vertex_partition_range_first(),
graph_view.local_vertex_partition_range_last(),
false);

auto&& [srcs, dsts, weights, counts] = cugraph::uniform_nbr_sample(
handle_,
graph_view,
(edge_weights != nullptr) ? std::make_optional(edge_weights->view()) : std::nullopt,
raft::device_span<vertex_t>(start.data(), start.size()),
raft::host_span<const int>(fan_out_->as_type<const int>(), fan_out_->size_),
with_replacement_);

std::vector<vertex_t> vertex_partition_lasts = graph_view.vertex_partition_range_lasts();

cugraph::unrenumber_int_vertices<vertex_t, multi_gpu>(handle_,
srcs.data(),
srcs.size(),
number_map->data(),
vertex_partition_lasts,
do_expensive_check_);

cugraph::unrenumber_int_vertices<vertex_t, multi_gpu>(handle_,
dsts.data(),
dsts.size(),
number_map->data(),
vertex_partition_lasts,
do_expensive_check_);

result_ = new cugraph::c_api::cugraph_sample_result_t{
new cugraph::c_api::cugraph_type_erased_device_array_t(srcs, graph_->vertex_type_),
new cugraph::c_api::cugraph_type_erased_device_array_t(dsts, graph_->vertex_type_),
new cugraph::c_api::cugraph_type_erased_device_array_t(
weights, graph_->weight_type_), // needs to be edge id...
nullptr,
nullptr,
nullptr,
nullptr,
nullptr};
}
}
};

struct uniform_neighbor_sampling_functor : public cugraph::c_api::abstract_functor {
raft::handle_t const& handle_;
cugraph::c_api::cugraph_graph_t* graph_{nullptr};
@@ -302,36 +189,6 @@ struct uniform_neighbor_sampling_functor : public cugraph::c_api::abstract_funct

} // namespace

extern "C" cugraph_error_code_t cugraph_uniform_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
const cugraph_type_erased_device_array_view_t* start,
const cugraph_type_erased_host_array_view_t* fan_out,
bool_t with_replacement,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error)
{
CAPI_EXPECTS(
reinterpret_cast<cugraph::c_api::cugraph_graph_t*>(graph)->vertex_type_ ==
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_view_t const*>(start)
->type_,
CUGRAPH_INVALID_INPUT,
"vertex type of graph and start must match",
*error);

CAPI_EXPECTS(
reinterpret_cast<cugraph::c_api::cugraph_type_erased_host_array_view_t const*>(fan_out)
->type_ == INT32,
CUGRAPH_INVALID_INPUT,
"fan_out should be of type int",
*error);

uniform_neighbor_sampling_functor_deprecate functor{
handle, graph, start, fan_out, with_replacement, do_expensive_check};
return cugraph::c_api::run_algorithm(graph, functor, result, error);
}

extern "C" cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_sources(
const cugraph_sample_result_t* result)
{
@@ -404,13 +261,6 @@ extern "C" cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_in
internal_pointer->edge_id_->view());
}

extern "C" cugraph_type_erased_host_array_view_t* cugraph_sample_result_get_counts(
const cugraph_sample_result_t* result)
{
auto internal_pointer = reinterpret_cast<cugraph::c_api::cugraph_sample_result_t const*>(result);
return reinterpret_cast<cugraph_type_erased_host_array_view_t*>(internal_pointer->count_->view());
}

extern "C" cugraph_error_code_t cugraph_test_uniform_neighborhood_sample_result_create(
const cugraph_resource_handle_t* handle,
const cugraph_type_erased_device_array_view_t* srcs,
@@ -639,8 +489,8 @@ extern "C" cugraph_error_code_t cugraph_test_sample_result_create(
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_t*>(new_device_wgt.release()),
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_t*>(
new_device_label.release()),
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_t*>(new_device_hop.release()),
nullptr});
reinterpret_cast<cugraph::c_api::cugraph_type_erased_device_array_t*>(
new_device_hop.release())});

return CUGRAPH_SUCCESS;
}
@@ -655,7 +505,6 @@ extern "C" void cugraph_sample_result_free(cugraph_sample_result_t* result)
delete internal_pointer->wgt_;
delete internal_pointer->hop_;
delete internal_pointer->label_;
delete internal_pointer->count_;
delete internal_pointer;
}

79 changes: 0 additions & 79 deletions cpp/src/sampling/detail/graph_functions.hpp
@@ -25,58 +25,6 @@ namespace detail {
// in implementation, naming and documentation. We should review these and
// consider updating things to support an arbitrary value for store_transposed

/**
* @brief Gather active majors across gpus in a column communicator
*
* Collect all the vertex ids and client gpu ids to be processed by every gpu in
* the column communicator and sort the list.
*
* @tparam vertex_t Type of vertex indices.
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param d_in Device vector containing vertices local to this GPU
* @return Device vector containing all the vertices that are to be processed by every gpu
* in the column communicator
*/
template <typename vertex_t>
rmm::device_uvector<vertex_t> allgather_active_majors(raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& d_in);

// FIXME: Need docs if this function survives
template <typename vertex_t, typename edge_t, typename weight_t>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
rmm::device_uvector<weight_t>,
rmm::device_uvector<edge_t>>
count_and_remove_duplicates(raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& src,
rmm::device_uvector<vertex_t>&& dst,
rmm::device_uvector<weight_t>&& wgt);

/**
* @brief Gather edge list for specified vertices
*
* Collect all the edges that are present in the adjacency lists on the current gpu
*
* @tparam GraphViewType Type of the passed non-owning graph object.
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view Non-owning graph object.
* @param active_majors Device vector containing all the vertex id that are processed by
* gpus in the column communicator
* @return A tuple of device vector containing the majors, minors and weights gathered locally
*/
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>>
gather_one_hop_edgelist(
raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, false, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
const rmm::device_uvector<vertex_t>& active_majors,
bool do_expensive_check = false);

/**
* @brief Gather edge list for specified vertices
*
@@ -114,33 +62,6 @@ gather_one_hop_edgelist(
std::optional<rmm::device_uvector<int32_t>> const& active_major_labels,
bool do_expensive_check = false);

/**
* @brief Randomly sample edges from the adjacency list of specified vertices
*
* @tparam GraphViewType Type of the passed non-owning graph object.
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param rng_state Random number generator state
* @param graph_view Non-owning graph object.
* @param active_majors Device vector containing all the vertex id that are processed by
* gpus in the column communicator
* @param fanout How many edges to sample for each vertex
* @param with_replacement If true sample with replacement, otherwise sample without replacement
* @param invalid_vertex_id Value to use for an invalid vertex
* @return A tuple of device vector containing the majors, minors and weights gathered locally
*/
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>>
sample_edges(raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, false, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
raft::random::RngState& rng_state,
rmm::device_uvector<vertex_t> const& active_majors,
size_t fanout,
bool with_replacement);

/**
* @brief Randomly sample edges from the adjacency list of specified vertices
*
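The removed `count_and_remove_duplicates` declaration carried no documentation, but its signature together with the legacy `(src, dst, weight, count)` output suggests it collapsed repeated sampled edges into one entry each, with an occurrence count. A minimal host-side sketch of that idea follows. It is an illustration under stated assumptions, not the deleted device code: `CountedEdges` and `count_and_remove_duplicates_host` are hypothetical names, the weight is assumed to be part of the duplicate key, and the output order (sorted by key) is a side effect of using `std::map`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical result type: parallel arrays, one entry per distinct edge.
struct CountedEdges {
  std::vector<int32_t> src;
  std::vector<int32_t> dst;
  std::vector<float> wgt;
  std::vector<int64_t> count;  // occurrences of each edge in the input (>= 1)
};

// Collapse repeated (src, dst, wgt) triples and record how often each occurred.
CountedEdges count_and_remove_duplicates_host(std::vector<int32_t> const& src,
                                              std::vector<int32_t> const& dst,
                                              std::vector<float> const& wgt)
{
  assert(src.size() == dst.size() && dst.size() == wgt.size());

  // Assumption: two samples are duplicates only if src, dst and weight all match.
  std::map<std::tuple<int32_t, int32_t, float>, int64_t> counts;
  for (std::size_t i = 0; i < src.size(); ++i) {
    ++counts[std::make_tuple(src[i], dst[i], wgt[i])];
  }

  CountedEdges out;
  for (auto const& kv : counts) {
    out.src.push_back(std::get<0>(kv.first));
    out.dst.push_back(std::get<1>(kv.first));
    out.wgt.push_back(std::get<2>(kv.first));
    out.count.push_back(kv.second);
  }
  return out;
}
```

A device implementation would more likely sort the edge tuples and reduce by key; the map here only keeps the sketch short.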