[FIX] Rename cugraph-ops symbols (refactoring) and update GHA workflows to call pytest via python -m pytest #3688

Merged: 5 commits, merged on Jul 5, 2023
4 changes: 2 additions & 2 deletions .github/workflows/pr.yaml
@@ -94,7 +94,7 @@ jobs:
with:
build_type: pull-request
package-name: pylibcugraph
test-unittest: "RAPIDS_DATASET_ROOT_DIR=./datasets pytest ./python/pylibcugraph/pylibcugraph/tests"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=./datasets python -m pytest ./python/pylibcugraph/pylibcugraph/tests"
test-smoketest: "python ci/wheel_smoke_test_pylibcugraph.py"
wheel-build-cugraph:
needs: wheel-tests-pylibcugraph
@@ -120,5 +120,5 @@ jobs:
test-before-amd64: "cd ./datasets && bash ./get_test_data.sh && cd - && RAPIDS_PY_WHEEL_NAME=pylibcugraph_${{ '${PIP_CU_VERSION}' }} rapids-download-wheels-from-s3 ./local-pylibcugraph-dep && pip install --no-deps ./local-pylibcugraph-dep/*.whl && pip install git+https://github.com/dask/dask.git@main git+https://github.com/dask/distributed.git@main git+https://github.com/rapidsai/[email protected]"
# Skip dataset downloads on arm to save CI time -- arm only runs smoke tests.
test-before-arm64: "RAPIDS_PY_WHEEL_NAME=pylibcugraph_${{ '${PIP_CU_VERSION}' }} rapids-download-wheels-from-s3 ./local-pylibcugraph-dep && pip install --no-deps ./local-pylibcugraph-dep/*.whl && pip install git+https://github.com/dask/dask.git@main git+https://github.com/dask/distributed.git@main git+https://github.com/rapidsai/[email protected]"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=/__w/cugraph/cugraph/datasets pytest -m sg ./python/cugraph/cugraph/tests"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=/__w/cugraph/cugraph/datasets python -m pytest -m sg ./python/cugraph/cugraph/tests"
test-smoketest: "python ci/wheel_smoke_test_cugraph.py"
4 changes: 2 additions & 2 deletions .github/workflows/test.yaml
@@ -39,7 +39,7 @@ jobs:
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
package-name: pylibcugraph
test-unittest: "RAPIDS_DATASET_ROOT_DIR=./datasets pytest ./python/pylibcugraph/pylibcugraph/tests"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=./datasets python -m pytest ./python/pylibcugraph/pylibcugraph/tests"
wheel-tests-cugraph:
secrets: inherit
uses: rapidsai/shared-action-workflows/.github/workflows/[email protected]
@@ -52,4 +52,4 @@ jobs:
# Always want to test against latest dask/distributed.
test-before-amd64: "cd ./datasets && bash ./get_test_data.sh && cd - && pip install git+https://github.com/dask/dask.git@main git+https://github.com/dask/distributed.git@main git+https://github.com/rapidsai/[email protected]"
test-before-arm64: "cd ./datasets && bash ./get_test_data.sh && cd - && pip install git+https://github.com/dask/dask.git@main git+https://github.com/dask/distributed.git@main git+https://github.com/rapidsai/[email protected]"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=/__w/cugraph/cugraph/datasets pytest -m sg ./python/cugraph/cugraph/tests"
test-unittest: "RAPIDS_DATASET_ROOT_DIR=/__w/cugraph/cugraph/datasets python -m pytest -m sg ./python/cugraph/cugraph/tests"
22 changes: 17 additions & 5 deletions cpp/src/sampling/neighborhood.cu
@@ -22,6 +22,8 @@

#include <raft/random/rng_state.hpp>

#include <type_traits>

namespace cugraph {

template <typename vertex_t, typename edge_t>
@@ -34,14 +36,19 @@ sample_neighbors_adjacency_list(raft::handle_t const& handle,
size_t sampling_size,
ops::graph::SamplingAlgoT sampling_algo)
{
const auto [ops_graph, max_degree] = detail::get_graph_and_max_degree(graph_view);
return ops::graph::uniform_sample_csr(rng_state,
using base_vertex_t = std::decay_t<vertex_t>;
using base_edge_t = std::decay_t<edge_t>;
Comment on lines +39 to +40

Contributor: If we require decay here, shouldn't we apply the decay at the caller site instead? In our code base, vertex_t implies a non-const, non-volatile value type, and I think we should maintain that convention.

Contributor: That makes sense. I don't know the cugraph conventions, so the decay would not hurt in case someone does end up calling this with a reference type, or with const or volatile qualifiers; it is mainly used for the type checks anyway. Do you want me to remove it?

Contributor: I see; in that case I don't mind whether we apply the decay here or not. (I initially thought the caller was also our code, but if the caller can be a user, I am not sure whether removing the decay is good or bad.)
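
A minimal sketch (illustrative only, not code from this PR) of what the std::decay_t in question does: it strips references and cv-qualifiers before the comparison, so the static_assert below fires only when the underlying value types genuinely differ.

```cpp
// Sketch: std::decay_t normalizes the template arguments before the type check.
#include <cstdint>
#include <type_traits>

template <typename vertex_t, typename edge_t>
constexpr void check_same_base_type()
{
  using base_vertex_t = std::decay_t<vertex_t>;
  using base_edge_t   = std::decay_t<edge_t>;
  static_assert(std::is_same_v<base_vertex_t, base_edge_t>,
                "vertex and edge types must share the same base type");
}

int main()
{
  check_same_base_type<int32_t, int32_t const&>();  // passes: decay strips const&
  check_same_base_type<int64_t, int64_t>();         // passes: already value types
  // check_same_base_type<int32_t, int64_t>();      // would fail the static_assert
  return 0;
}
```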

static_assert(std::is_same_v<base_vertex_t, base_edge_t>,
"cugraph-ops sampling not yet implemented for different node and edge types");

const auto ops_graph = detail::get_graph(graph_view);
return ops::graph::uniform_sample_csc(rng_state,
Contributor: It was uniform_sample_csr in the old code; now it is uniform_sample_csc. Is this intentional?

Contributor: Yes, see below; this is part of the refactoring.

ops_graph,
ptr_d_start,
num_start_vertices,
sampling_size,
sampling_algo,
max_degree,
ops_graph.dst_max_in_degree,
handle.get_stream());
}

@@ -55,14 +62,19 @@ std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> sample_
size_t sampling_size,
ops::graph::SamplingAlgoT sampling_algo)
{
const auto [ops_graph, max_degree] = detail::get_graph_and_max_degree(graph_view);
using base_vertex_t = std::decay_t<vertex_t>;
using base_edge_t = std::decay_t<edge_t>;
Comment on lines +65 to +66

Contributor: Same here; if the decay is necessary, I think we should apply it at the caller site.

static_assert(std::is_same_v<base_vertex_t, base_edge_t>,
"cugraph-ops sampling not yet implemented for different node and edge types");

const auto ops_graph = detail::get_graph(graph_view);
return ops::graph::uniform_sample_coo(rng_state,
ops_graph,
ptr_d_start,
num_start_vertices,
sampling_size,
sampling_algo,
max_degree,
ops_graph.dst_max_in_degree,
handle.get_stream());
}

24 changes: 8 additions & 16 deletions cpp/src/utilities/cugraph_ops_utils.hpp
@@ -20,34 +20,26 @@

#include <cugraph-ops/graph/format.hpp>

#include <tuple>

namespace cugraph {
namespace detail {

template <typename NodeTypeT, typename EdgeTypeT>
ops::graph::fg_csr<EdgeTypeT> get_graph(
ops::graph::csc<EdgeTypeT, NodeTypeT> get_graph(
graph_view_t<NodeTypeT, EdgeTypeT, false, false> const& gview)
{
ops::graph::fg_csr<EdgeTypeT> graph;
graph.n_nodes = gview.number_of_vertices();
graph.n_indices = gview.number_of_edges();
ops::graph::csc<EdgeTypeT, NodeTypeT> graph;
Contributor: Just to double-check: the change from csr to csc here is intentional, right?

Contributor: Yes indeed; part of the refactoring was to align more closely with general naming conventions.
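
For readers following the rename, a minimal sketch (illustrative only, not cugraph-ops code) of the convention the names encode: CSR offsets are indexed by source node, CSC offsets by destination node, so a destination's in-degree falls directly out of the CSC offsets.

```cpp
// Sketch: the same 3-node graph (edges 0->1, 0->2, 1->2) in CSR and CSC form.
#include <vector>

int main()
{
  // CSR: offsets indexed by source node; indices hold destination nodes.
  std::vector<int> csr_offsets{0, 2, 3, 3};  // src 0 -> [0,2), src 1 -> [2,3), src 2 -> none
  std::vector<int> csr_indices{1, 2, 2};

  // CSC: offsets indexed by destination node; indices hold source nodes.
  std::vector<int> csc_offsets{0, 0, 1, 3};  // dst 0 -> none, dst 1 -> [0,1), dst 2 -> [1,3)
  std::vector<int> csc_indices{0, 0, 1};
  return 0;
}
```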

graph.n_src_nodes = gview.number_of_vertices();
graph.n_dst_nodes = gview.number_of_vertices();
graph.n_indices = gview.number_of_edges();
// FIXME this is sufficient for now, but if there is a fast (cached) way
// of getting max degree, use that instead
Comment on lines +34 to +35

Contributor: Just out of curiosity, is there any performance benefit to providing a tighter bound for max_in_degree?

Contributor: For neighborhood sampling, no. For other operations in cugraph-ops, yes.
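
A sketch of what the FIXME above asks for (illustrative only, not part of the PR): assuming CSC offsets indexed by destination node, a tight dst_max_in_degree is simply the largest adjacent difference in the offsets array, instead of the std::numeric_limits bound set just below. In the real code the offsets live in device memory, so this would be a device-side reduction; the host loop is only for clarity.

```cpp
// Sketch: an exact dst_max_in_degree from CSC offsets (host-side, for clarity).
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

int64_t max_in_degree(std::vector<int64_t> const& csc_offsets)
{
  int64_t max_deg = 0;
  // In CSC, offsets[i+1] - offsets[i] is the in-degree of destination node i.
  for (std::size_t i = 0; i + 1 < csc_offsets.size(); ++i) {
    max_deg = std::max(max_deg, csc_offsets[i + 1] - csc_offsets[i]);
  }
  return max_deg;
}
```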

graph.dst_max_in_degree = std::numeric_limits<EdgeTypeT>::max();
// FIXME: this is evil and is just temporary until we have a matching type in cugraph-ops
// or we change the type accepted by the functions calling into cugraph-ops
graph.offsets = const_cast<EdgeTypeT*>(gview.local_edge_partition_view().offsets().data());
graph.indices = const_cast<EdgeTypeT*>(gview.local_edge_partition_view().indices().data());
return graph;
}

template <typename NodeTypeT, typename EdgeTypeT>
std::tuple<ops::graph::fg_csr<EdgeTypeT>, NodeTypeT> get_graph_and_max_degree(
graph_view_t<NodeTypeT, EdgeTypeT, false, false> const& gview)
{
// FIXME this is sufficient for now, but if there is a fast (cached) way
// of getting max degree, use that instead
auto max_degree = std::numeric_limits<NodeTypeT>::max();
return std::make_tuple(get_graph(gview), max_degree);
}

} // namespace detail
} // namespace cugraph
71 changes: 24 additions & 47 deletions python/cugraph-dgl/cugraph_dgl/nn/conv/gatconv.py
@@ -19,8 +19,8 @@
from cugraph_dgl.nn.conv.base import BaseConv
from cugraph.utilities.utils import import_optional

from pylibcugraphops.pytorch import BipartiteCSC, SampledCSC, StaticCSC
from pylibcugraphops.pytorch.operators import mha_gat_n2n, mha_gat_n2n_bipartite
from pylibcugraphops.pytorch import CSC
from pylibcugraphops.pytorch.operators import mha_gat_n2n

dgl = import_optional("dgl")
torch = import_optional("torch")
@@ -173,9 +173,20 @@ def forward(
:math:`H` is the number of heads, and :math:`D_{out}` is size of
output feature.
"""
if max_in_degree is None:
max_in_degree = -1

bipartite = not isinstance(nfeat, torch.Tensor)
offsets, indices, _ = g.adj_tensors("csc")

graph = CSC(
offsets=offsets,
indices=indices,
num_src_nodes=g.num_src_nodes(),
dst_max_in_degree=max_in_degree,
is_bipartite=bipartite,
)

if efeat is not None:
if self.fc_edge is None:
raise RuntimeError(
@@ -191,60 +202,26 @@
f"integers to allow bipartite node features, but got "
f"{self.in_feats}."
)
_graph = BipartiteCSC(
offsets=offsets, indices=indices, num_src_nodes=g.num_src_nodes()
)
nfeat_src = self.fc_src(nfeat[0])
nfeat_dst = self.fc_dst(nfeat[1])

out = mha_gat_n2n_bipartite(
src_feat=nfeat_src,
dst_feat=nfeat_dst,
attn_weights=self.attn_weights,
graph=_graph,
num_heads=self.num_heads,
activation="LeakyReLU",
negative_slope=self.negative_slope,
concat_heads=self.concat,
edge_feat=efeat,
)
else:
if not hasattr(self, "fc"):
raise RuntimeError(
f"{self.__class__.__name__}.in_feats is expected to be an "
f"integer, but got {self.in_feats}."
)
nfeat = self.fc(nfeat)
# Sampled primitive does not support edge features
if g.is_block and efeat is None:
if max_in_degree is None:
max_in_degree = g.in_degrees().max().item()

if max_in_degree < self.MAX_IN_DEGREE_MFG:
_graph = SampledCSC(
offsets=offsets,
indices=indices,
max_num_neighbors=max_in_degree,
num_src_nodes=g.num_src_nodes(),
)
else:
offsets = self.pad_offsets(offsets, g.num_src_nodes() + 1)
_graph = StaticCSC(offsets=offsets, indices=indices)
else:
if g.is_block:
offsets = self.pad_offsets(offsets, g.num_src_nodes() + 1)
_graph = StaticCSC(offsets=offsets, indices=indices)

out = mha_gat_n2n(
feat=nfeat,
attn_weights=self.attn_weights,
graph=_graph,
num_heads=self.num_heads,
activation="LeakyReLU",
negative_slope=self.negative_slope,
concat_heads=self.concat,
edge_feat=efeat,
)[: g.num_dst_nodes()]

out = mha_gat_n2n(
(nfeat_src, nfeat_dst) if bipartite else nfeat,
self.attn_weights,
graph,
num_heads=self.num_heads,
activation="LeakyReLU",
negative_slope=self.negative_slope,
concat_heads=self.concat,
edge_feat=efeat,
)[: g.num_dst_nodes()]

if self.concat:
out = out.view(-1, self.num_heads, self.out_feats)
43 changes: 23 additions & 20 deletions python/cugraph-dgl/cugraph_dgl/nn/conv/transformerconv.py
@@ -15,7 +15,7 @@
from cugraph_dgl.nn.conv.base import BaseConv
from cugraph.utilities.utils import import_optional

from pylibcugraphops.pytorch import BipartiteCSC, StaticCSC
from pylibcugraphops.pytorch import CSC
from pylibcugraphops.pytorch.operators import mha_simple_n2n

dgl = import_optional("dgl")
@@ -132,31 +132,34 @@ def forward(
efeat: torch.Tensor, optional
Edge feature tensor. Default: ``None``.
"""
bipartite = not isinstance(nfeat, torch.Tensor)
offsets, indices, _ = g.adj_tensors("csc")

if bipartite:
src_feats, dst_feats = nfeat
_graph = BipartiteCSC(
offsets=offsets, indices=indices, num_src_nodes=g.num_src_nodes()
)
else:
src_feats = dst_feats = nfeat
if g.is_block:
offsets = self.pad_offsets(offsets, g.num_src_nodes() + 1)
_graph = StaticCSC(offsets=offsets, indices=indices)

query = self.lin_query(dst_feats)
key = self.lin_key(src_feats)
value = self.lin_value(src_feats)
if self.lin_edge is not None:
graph = CSC(
offsets=offsets,
indices=indices,
num_src_nodes=g.num_src_nodes(),
is_bipartite=True,
)

if isinstance(nfeat, torch.Tensor):
nfeat = (nfeat, nfeat)

query = self.lin_query(nfeat[1][: g.num_dst_nodes()])
key = self.lin_key(nfeat[0])
value = self.lin_value(nfeat[0])

if efeat is not None:
if self.lin_edge is None:
raise RuntimeError(
f"{self.__class__.__name__}.edge_feats must be set to allow "
f"edge features."
)
efeat = self.lin_edge(efeat)

out = mha_simple_n2n(
key_emb=key,
query_emb=query,
value_emb=value,
graph=_graph,
graph=graph,
num_heads=self.num_heads,
concat_heads=self.concat,
edge_emb=efeat,
@@ -165,7 +168,7 @@ def forward(
)[: g.num_dst_nodes()]

if self.root_weight:
res = self.lin_skip(dst_feats[: g.num_dst_nodes()])
res = self.lin_skip(nfeat[1][: g.num_dst_nodes()])
if self.lin_beta is not None:
beta = self.lin_beta(torch.cat([out, res, out - res], dim=-1))
beta = beta.sigmoid()
6 changes: 3 additions & 3 deletions python/cugraph-dgl/tests/nn/test_transformerconv.py
@@ -26,14 +26,14 @@


@pytest.mark.parametrize("beta", [False, True])
@pytest.mark.parametrize("bipartite", [False, True])
@pytest.mark.parametrize("bipartite_node_feats", [False, True])
@pytest.mark.parametrize("concat", [False, True])
@pytest.mark.parametrize("idtype_int", [False, True])
@pytest.mark.parametrize("num_heads", [1, 2, 3, 4])
@pytest.mark.parametrize("to_block", [False, True])
@pytest.mark.parametrize("use_edge_feats", [False, True])
def test_TransformerConv(
beta, bipartite, concat, idtype_int, num_heads, to_block, use_edge_feats
beta, bipartite_node_feats, concat, idtype_int, num_heads, to_block, use_edge_feats
):
device = "cuda"
g = create_graph1().to(device)
@@ -44,7 +44,7 @@ def test_TransformerConv(
if to_block:
g = dgl.to_block(g)

if bipartite:
if bipartite_node_feats:
in_node_feats = (5, 3)
nfeat = (
torch.rand(g.num_src_nodes(), in_node_feats[0], device=device),