Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new cugraph-nx package (networkx backend using pylibcugraph) #3614

Merged
merged 36 commits into from
Aug 16, 2023

Conversation

eriknw
Copy link
Contributor

@eriknw eriknw commented May 25, 2023

cugraph-nx is working with two algorithm (betweenness_centrality, edge_betweenness_centrality).

It can efficiently (i.e., faster than cugraph in my benchmarks) convert from (and to) networkx graphs, and runs (and passes) networkx tests using networkx dispatching machinery.

It renumbers vertex labels, which we handle with python dicts.

It only depends on networkx, cupy, and pylibcugraph. cupy is a transitive dependency from pylibcugraph, so we don't actually need to list it as a dependency.

I think it's nice the way we depend on networkx, which allows us to use its exceptions and utilities.

There is still plenty to do (and I have a few questions), but I think it's a good start and gets cugraph in on the NetworkX dispatching party.

I was planning on including cugraph-pg with this PR, but now I think it's probably better to have that in a new PR.

@eriknw eriknw requested a review from a team as a code owner May 25, 2023 17:48
@eriknw eriknw requested review from a team as code owners May 26, 2023 01:58
@eriknw eriknw changed the base branch from branch-23.06 to branch-23.08 May 26, 2023 01:58
@BradReesWork BradReesWork added this to the 23.08 milestone May 26, 2023
@rlratzel rlratzel added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jun 22, 2023
def is_strongly_connected(G):
G = to_directed_graph(G)
N = len(G)
if N == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not check graph size on the Nx graph before converting it to GPU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being more intellegent when to convert graphs is a good idea, but this logic likely belongs elsewhere (TBD). On the other hand, small graphs should be quick regardless.

python/cugraph-nx/cugraph_nx/classes/graph.py Show resolved Hide resolved
return n in container
except TypeError:
return False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many of these function are truly used within an algorithms? There is value to those functions, but it seems like Nx could just leverage their internal representation first for those calls.

Now if the thought that Nx would totally off load all functions and just be a broker?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think methods on graphs are "nice to have", but most of them aren't strictly necessary for the example workloads we're after. I only added the necessary ones and the "easy to implement" ones. Long term, having methods on graphs will let users copy/paste examples from documentation, docstrings, etc. and update to use cugraph-nx. It gives users a familiar experience that may be useful when interacting with graphs in a notebook. I don't expect to expend significant effort in the near term to implement the missing methods.

@BradReesWork
Copy link
Member

Would be nice to have some performance numbers on how fast the conversion code is across a range of graph sizes. That is the spot that will damp the benefit from using cuGraph

…coming out in nx 3.2

This implements a property graph model where edges and nodes may (or may not) have missing data.
Multigraphs are not yet handled.
@BradReesWork BradReesWork modified the milestones: 23.08, 23.10 Jul 24, 2023
- Rename `from_networkx_propertygraph` to `from_networkx` and reorder arguments
- Allow `from_networkx` to accept a single attribute or dtype for convenience (normally dict)
- `pytest --bench` runs _only_ benchmarks; `pytest` does not run _any_ benchmarks
- Add `from_coo`, `from_csr`, `from_csc`, `from_dcsr`, and `from_dcsc` methods to Graph
- Add `@networkx_api` decorator to copy docstring from NetworkX
- Remove cugraph-pg from build.sh
- Implement more Graph methods
- Add better annotations
- Faster conversion from networkx when only one edge or node attribute has no default
- Add linting (via pre-commit); run `$ make lint`
@eriknw
Copy link
Contributor Author

eriknw commented Aug 7, 2023

@rlratzel should we add anything to dependencies.yaml?

Also, @betochimas and I last week discovered we need pytest-mpl installed to run the networkx tests, so we should probably add this to the conda environment files.

Also, update `@networkx_api` decorators to `@networkx_algorithm` and `@networkx_class`
Because it's not really a dispatcher; networkx is the dispatcher.
We are a target that is dispatched too. Is that dispatchee?
Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for the attention to detail.

python/cugraph-nx/conftest.py Outdated Show resolved Hide resolved
python/cugraph-nx/README.md Outdated Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/classes/graph.py Outdated Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/classes/graph.py Outdated Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/classes/graph.py Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/convert.py Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/convert.py Outdated Show resolved Hide resolved
python/cugraph-nx/cugraph_nx/convert.py Outdated Show resolved Hide resolved
python/cugraph-nx/run_nx_tests.sh Outdated Show resolved Hide resolved
@eriknw eriknw changed the base branch from branch-23.08 to branch-23.10 August 10, 2023 17:20
- Rename `self = object.__new__(cls)` to use `new_graph` or `instance` instead of `self`.
- Finish updating to 23.10
@eriknw
Copy link
Contributor Author

eriknw commented Aug 10, 2023

@rlratzel, the last commit hopefully addresses all current comments. Shall we aim to merge soon?

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, eager to get this merged, thanks!

@raydouglass raydouglass removed the request for review from a team August 14, 2023 17:47
@raydouglass
Copy link
Member

Removing ops-codeowners from the required reviews since it doesn't seem there are any file changes that we're responsible for. Feel free to add us back if necessary.

@rlratzel
Copy link
Contributor

@rlratzel should we add anything to dependencies.yaml?

Also, @betochimas and I last week discovered we need pytest-mpl installed to run the networkx tests, so we should probably add this to the conda environment files.

Thanks, I'll include those updates in #3793

…s CI and be merged. A followup PR to re-enable and fix the C++ bug will be created.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have the license header

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the careful review. I'm not sure an empty file needs the license header. If the license header belongs here, then it also belongs in these files:

./python/cugraph-service/server/cugraph_service_server/testing/__init__.py
./python/cugraph-service/tests/__init__.py
./python/cugraph/cugraph/dask/link_analysis/__init__.py
./python/cugraph/cugraph/dask/link_prediction/__init__.py
./python/cugraph/cugraph/dask/components/__init__.py
./python/cugraph/cugraph/dask/comms/__init__.py
./python/cugraph/cugraph/dask/structure/__init__.py
./python/cugraph/cugraph/dask/common/__init__.py
./python/cugraph/cugraph/dask/traversal/__init__.py
./python/cugraph/cugraph/experimental/compat/__init__.py
./python/cugraph/cugraph/experimental/link_prediction/__init__.py
./python/cugraph/cugraph/experimental/components/__init__.py
./python/cugraph/cugraph/experimental/datasets/metadata/__init__.py
./python/cugraph/cugraph/experimental/structure/__init__.py
./python/cugraph/cugraph/datasets/metadata/__init__.py
./python/cugraph-dgl/cugraph_dgl/utils/__init__.py
./python/pylibcugraph/pylibcugraph/components/__init__.py
./python/pylibcugraph/pylibcugraph/utilities/__init__.py
./python/pylibcugraph/pylibcugraph/internal_types/__init__.py
./python/pylibcugraph/pylibcugraph/_cugraph_c/__init__.py
./python/pylibcugraph/pylibcugraph/structure/__init__.py
./benchmarks/shared/python/cugraph_benchmarking/__init__.py

I was following established practices.

Copy link
Contributor

@rlratzel rlratzel Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BradReesWork if it's okay, I'll open a separate PR to address the __init__.py file in this PR as well as all other empty ones in the entire repo. That will make us more consistent with other RAPIDS repos (I just checked cudf) which include at least a copyright line. That will make us compliant with our contributing doc.
I'll follow up here with the PR once it's up.

Copy link
Contributor Author

@eriknw eriknw Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files identified via:

find . -type f -name '__init__.py' -print | xargs wc | grep -v skbuild | grep -v cpp.build | grep ' 0  *0  *0'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the followup PR: #3799

@rlratzel
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 28fa98f into rapidsai:branch-23.10 Aug 16, 2023
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for betweenness_centrality and edge_betweenness_centrality to cugraph-nx
4 participants