Add many graph generators to nx-cugraph #3954

eriknw · 2023-10-24T03:42:27Z

Also, better handle dtypes for edge values passed to pylibcugraph, which only takes float32 and float64 atm.

I also defined index_dtype (currently int32) to globally control the dtype of indices.

Currently, node values aren't used in any computation; they are only used when converting to and from networkx, which we do just fine.

eriknw · 2023-10-25T02:10:50Z

I just updated this PR to allow node values to be numpy arrays (not just cupy arrays) so we can handle str and object dtypes. The only thing we do with node values is convert to/from networkx, which we do just fine.

eriknw · 2023-10-25T19:24:42Z

Switched to Draft, b/c some work is needed to pass networkx tests. This is 95% finished and can probably be (mostly) reviewed.

eriknw · 2023-10-27T13:24:04Z

This is now passing CI :)

eriknw

Work on this PR also resulted in the following NetworkX PRs:

and to make our tests pass using dev version of NetworkX, we need one of these PRs:

Be forgiving of iteration order in test_write_network_text_circular_ladder_graph networkx/networkx#7063
Compare graphs for generator functions when running tests with backend networkx/networkx#7066

eriknw · 2023-10-27T23:25:30Z

python/nx-cugraph/nx_cugraph/__init__.py

-# from . import convert_matrix
-# from .convert_matrix import *
+from . import convert_matrix
+from .convert_matrix import *

-# from . import generators
-# from .generators import *
+from . import generators
+from .generators import *


Told you this would be coming soon ;)
#3848 (comment)

eriknw · 2023-10-27T23:28:33Z

python/nx-cugraph/nx_cugraph/classes/graph.py

-    node_values: dict[AttrKey, cp.ndarray[NodeValue]]
-    node_masks: dict[AttrKey, cp.ndarray[bool]]
+    node_values: dict[AttrKey, any_ndarray[NodeValue]]
+    node_masks: dict[AttrKey, any_ndarray[bool]]


I wonder how disruptive it would be to allow numpy arrays for edge values too (in a new PR). It may not be bad at all, and we can raise if it has an incompatible dtype.

It could certainly be handy, but supporting this more broadly raises many other questions--how do we let users control whether data is numpy or cupy (for example, when converting from networkx)?

eriknw · 2023-10-27T23:32:20Z

python/nx-cugraph/nx_cugraph/classes/graph.py

+    def add_nodes_from(self, nodes_for_adding: Iterable[NodeKey], **attr) -> None:
+        if self._N != 0:
+            raise NotImplementedError(
+                "add_nodes_from is not implemented for graph that already has nodes."
+            )


This method was added to be compatible with a networkx decorator, which is why it's only partially implemented.

eriknw · 2023-10-27T23:34:45Z

python/nx-cugraph/nx_cugraph/classes/graph.py

+                    ...
+                # Should we warn?


Should we warn if integer values are being (safely, if no operations are done) converted to float?

My first thought would be "no", but I don't have a strong opinion. I'm guessing this is possibly a warning many users would see?

Perhaps we can warn if we ever have any algorithms that could result in different results. Right now we don't, so let's not warn.

I added a code comment based on my previous comment.
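For reference, a minimal NumPy-only sketch of the cast discussed in this thread. The helper name `coerce_edge_values` and the choice of float64 as the target dtype are assumptions for illustration, not the code in this PR. It also shows where the conversion stops being lossless: integers above 2**53 cannot all be represented exactly in float64.

```python
import numpy as np


def coerce_edge_values(values):
    """Hypothetical helper: return edge values as float32 or float64."""
    values = np.asarray(values)
    if values.dtype in (np.float32, np.float64):
        return values  # already a dtype pylibcugraph accepts
    # Assumption: everything else (e.g. int32, int64) is cast to float64.
    return values.astype(np.float64)


weights = coerce_edge_values(np.array([1, 2, 3], dtype=np.int64))
print(weights.dtype)  # float64

# Integers are exactly representable in float64 only up to 2**53,
# so these two distinct int64 values collapse to the same float.
big = coerce_edge_values(np.array([2**53, 2**53 + 1], dtype=np.int64))
print(big[0] == big[1])  # True
```

Small integer weights round-trip exactly, which is why warning on every cast would mostly be noise.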

eriknw · 2023-10-27T23:36:57Z

python/nx-cugraph/nx_cugraph/classes/graph.py

@@ -541,12 +625,60 @@ def _get_plc_graph(
            do_expensive_check=False,
        )

+    def _sort_edge_indices(self, primary="src"):


This came in handy twice when working on this PR:

to generate the small and classic graph datasets

when testing generators in networkx

It's optionally used by nxcg.to_networkx.
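To illustrate what sorting edge indices by a primary array means, here is a sketch on plain NumPy arrays. The free function `sort_edge_indices` and its use of `np.lexsort` are assumptions for illustration; the actual `_sort_edge_indices` method in this PR works on the graph's own arrays and may be implemented differently.

```python
import numpy as np


def sort_edge_indices(src, dst, primary="src"):
    """Hypothetical stand-in: return (src, dst) sorted by the primary array."""
    if primary == "src":
        order = np.lexsort((dst, src))  # the last key is the primary sort key
    elif primary == "dst":
        order = np.lexsort((src, dst))
    else:
        raise ValueError(f'primary must be "src" or "dst"; got {primary!r}')
    return src[order], dst[order]


src = np.array([2, 0, 1, 0])
dst = np.array([0, 2, 2, 1])
sorted_src, sorted_dst = sort_edge_indices(src, dst)
print(sorted_src.tolist(), sorted_dst.tolist())  # [0, 0, 1, 2] [1, 2, 2, 0]
```

A deterministic edge order like this makes generated graphs easy to compare against expected edge lists.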

eriknw · 2023-10-27T23:43:36Z

python/nx-cugraph/nx_cugraph/convert_matrix.py

+        # We need to renumber indices--np.searchsorted to the rescue!
+        kwargs["id_to_key"] = nodes.tolist()
+        src_indices = cp.array(np.searchsorted(nodes, src_array), index_dtype)
+        dst_indices = cp.array(np.searchsorted(nodes, dst_array), index_dtype)


np.searchsorted is magical here to do renumbering for us (after doing np.unique)!
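A small worked example of the renumbering trick, kept on NumPy for simplicity (the diff above additionally copies the results to cupy). How `nodes` is built here, with `np.unique` over the concatenated endpoints, and the use of int32 for the index dtype are assumptions based on the comments in this PR.

```python
import numpy as np

src_array = np.array([10, 30, 10, 50])
dst_array = np.array([30, 50, 50, 10])

# np.unique returns the sorted, de-duplicated node ids.
nodes = np.unique(np.concatenate([src_array, dst_array]))
id_to_key = nodes.tolist()

# Because nodes is sorted and contains every id, searchsorted returns each
# id's position in nodes, which is its new 0-based index.
src_indices = np.searchsorted(nodes, src_array).astype(np.int32)
dst_indices = np.searchsorted(nodes, dst_array).astype(np.int32)

print(id_to_key)             # [10, 30, 50]
print(src_indices.tolist())  # [0, 1, 0, 2]
print(dst_indices.tolist())  # [1, 2, 2, 0]
```

Indexing `nodes` with the new indices recovers the original ids, so `id_to_key` is enough to map results back.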

eriknw · 2023-10-27T23:44:39Z

python/nx-cugraph/nx_cugraph/convert_matrix.py

+            }
+        kwargs["edge_values"] = edge_values
+
+        if graph_class.is_multigraph() and edge_key is not None:


FYI, the multigraph tests in networkx all seem to use string edge values, which we don't support, so this isn't well covered (yet).

eriknw · 2023-10-27T23:46:48Z

python/nx-cugraph/nx_cugraph/generators/_utils.py

+_IS_NX32_OR_LESS = nx.__version__[:3] <= "3.2" and (
+    len(nx.__version__) <= 3 or not nx.__version__[3].isdigit()
+)


This isn't pretty... I wonder whether we should just depend on packaging (already a test dependency) to make this easier if we expect to do different things w.r.t. networkx versions.

I'm leaning towards just keeping this in place over adding a dependency.
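To make the behavior of that expression easier to review, here is the same logic wrapped in a function and applied to a few version strings. The function name `is_nx32_or_less` and the sample versions are illustrative only.

```python
def is_nx32_or_less(version):
    """Same expression as the diff, applied to an arbitrary version string."""
    return version[:3] <= "3.2" and (
        len(version) <= 3 or not version[3].isdigit()
    )


results = {
    version: is_nx32_or_less(version)
    for version in ["2.8.8", "3.1", "3.2", "3.2.1", "3.2rc0", "3.3", "3.10"]
}
for version, result in results.items():
    print(version, result)
```

The digit check on the fourth character is what keeps a two-digit minor version such as "3.10" from being mistaken for "3.1".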

eriknw · 2023-10-27T23:48:33Z

python/nx-cugraph/nx_cugraph/generators/_utils.py

+def _ensure_int(n):
+    """Ensure n is integral."""
+    return op.index(n)


I use this trick elsewhere in the code; I'll probably add a better error message here and use this in more places, since it captures intent better than op.index(n) (where I always add a comment to capture intent).
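A sketch of what the improved version might look like. The `name` parameter and the error wording are assumptions, not what was merged. `operator.index` accepts anything implementing `__index__` (Python ints, NumPy integer scalars) and rejects floats, even integral ones like `5.0`.

```python
import operator as op


def ensure_int(n, name="n"):
    """Hypothetical variant of _ensure_int with a more descriptive error."""
    try:
        return op.index(n)
    except TypeError:
        raise TypeError(
            f"{name} must be an integer; got {type(n).__name__}"
        ) from None


print(ensure_int(5))  # 5
try:
    ensure_int(5.0)
except TypeError as exc:
    print(exc)  # n must be an integer; got float
```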

eriknw · 2023-10-27T23:49:47Z

python/nx-cugraph/nx_cugraph/generators/_utils.py

+    else:
+        graph_class = G.__class__


I'm still not sure if it's a good idea for us to support instances of nxcg.Graph as create_using= with networkx semantics... but they are networkx semantics.

rlratzel

Sorry for the delay in reviewing. Looks good, I just had a few comments/questions. I think the theme of my review is mainly related to testing.

python/nx-cugraph/lint.yaml

python/nx-cugraph/nx_cugraph/algorithms/bipartite/generators.py

python/nx-cugraph/nx_cugraph/classes/graph.py

rlratzel · 2023-10-30T22:18:17Z

python/nx-cugraph/nx_cugraph/classes/graph.py

+                    ...
+                # Should we warn?


My first thought would be "no", but I don't have a strong opinion. I'm guessing this is possibly a warning many users would see?

rlratzel · 2023-10-30T22:33:58Z

python/nx-cugraph/nx_cugraph/classes/graph.py

+                edge_dtype = np.dtype(edge_dtype)
+                if edge_array.dtype != edge_dtype:
+                    edge_array = edge_array.astype(edge_dtype)
+            # PLC doesn't handle int edge weights right now, so cast int to float


Are there any tests for these new conditions too? It might be nice to see the size limits verified with test code that looks like what a user would try to use for creating a graph.

No direct tests yet, but iirc this might get exercised by networkx tests. Wouldn't it be nice to have coverage reports that could easily answer this question 😉?

In general, we are largely blazing a path ahead, implementing fast, and trying to write good code, and tests are slow to catch up. You're welcome to write some if you'd like ;) . I expect proactive testing will catch up more next year, since maybe we rely on networkx tests too heavily. For now, "best effort" + networkx tests + add regression tests for regressions is my strategy. Convenient coverage reports would be handy to reveal completely untested code.

python/nx-cugraph/nx_cugraph/convert.py

rlratzel · 2023-10-30T22:55:41Z

python/nx-cugraph/nx_cugraph/generators/_utils.py

+_IS_NX32_OR_LESS = nx.__version__[:3] <= "3.2" and (
+    len(nx.__version__) <= 3 or not nx.__version__[3].isdigit()
+)


I'm leaning towards just keeping this in place over adding a dependency.

rlratzel · 2023-10-30T23:04:41Z

python/nx-cugraph/nx_cugraph/convert_matrix.py

+    create_using=None,
+    edge_key=None,
+):
+    """cudf.DataFrame inputs also supported."""


I added this since I think this would be good coverage, especially for cudf and "pam"

eriknw · 2023-10-31T03:47:14Z

Ooh, networkx recently added tadpole_graph, which looks easy enough to do: networkx/networkx#6999

rlratzel · 2023-10-31T18:01:54Z

/merge

Add a few (mostly "classic") graph generators to nx-cugraph

b4fb8df

Also, better handle dtypes for edge values passed to pylibcugraph, which only takes float32 and float64 atm.

eriknw requested a review from a team as a code owner October 24, 2023 03:42

eriknw mentioned this pull request Oct 24, 2023

nx-cugraph: add k_truss and degree centralities #3945

Merged

eriknw added 3 commits October 24, 2023 14:51

Add 24 more generators

5365700

Add complete_bipartite_graph

03623a9

Allow node values to be numpy arrays or cupy arrays

13c8a6c

Currently, node values aren't used for any values, the only thing they are used for is converting to and from networkx, which we do just fine.

eriknw marked this pull request as draft October 25, 2023 19:23

BradReesWork requested review from rlratzel and BradReesWork October 25, 2023 19:24

eriknw added 3 commits October 25, 2023 21:20

Add and fix a couple generators

98e87a7

Merge branch 'branch-23.12' into generators

e657ec4

Fix some row->src, col->dst names

9e45109

eriknw marked this pull request as ready for review October 26, 2023 02:39

eriknw added 5 commits October 26, 2023 21:35

Also allow multigraphs to be sorted by edge indices

a12e92a

mark tests xfail for networkx 3.2

0305b35

Merge branch 'branch-23.12' into generators

d28edde

Better, I think

f9bd011

Better commented

33f19ee

eriknw added the "improvement" (Improvement / enhancement to an existing function) and "non-breaking" (Non-breaking change) labels Oct 27, 2023

eriknw added 5 commits October 27, 2023 09:10

Add from_scipy_sparse_array

1166b5d

Merge branch 'branch-23.12' into generators

834fd5b

Add from_pandas_edgelist

49f0839

Better way to get integer dtype

07dd4e7

Merge branch 'branch-23.12' into generators

36ffccf

eriknw changed the title ~~Add a few (mostly "classic") graph generators to nx-cugraph~~ Add many graph generators to nx-cugraph Oct 27, 2023

eriknw commented Oct 28, 2023

nv-rliu added 2 commits October 30, 2023 13:12

Merge branch 'branch-23.12' into generators

bfed89f

Merge branch 'branch-23.12' into generators

a43d5e7

rlratzel reviewed Oct 30, 2023

Update docstring and comment based on feedback

f410ba5

Merge branch 'branch-23.12' into generators

299c415

rlratzel mentioned this pull request Oct 31, 2023

update cugraph/python linting tools #3967

Open

eriknw added 2 commits October 31, 2023 11:02

Fix to support networkx 3.2.1

1916489

Add tadpole_graph generator

85e2cd8

eriknw mentioned this pull request Oct 31, 2023

nx-cugraph: add CC for undirected graphs to fix k-truss #3965

Merged

rlratzel approved these changes Oct 31, 2023

rapids-bot bot merged commit d6c7fa1 into rapidsai:branch-23.12 Oct 31, 2023
73 checks passed
