nx-cugraph: indicate which plc algorithms are used and version_added #4069

eriknw · 2023-12-21T23:30:53Z

Pretty simple PR. I would like for us to use this metadata when creating tables of supported algorithms.

eriknw · 2023-12-22T10:03:31Z

I'm thinking something like this:

import re

import networkx as nx
from networkx.utils.backends import _registered_algorithms as algos

from _nx_cugraph import get_info
from nx_cugraph.interface import BackendInterface


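def get_funcpath(func):
    """Return the fully qualified dotted path of ``func``."""
    return f"{func.__module__}.{func.__name__}"


def add_branch(G, funcpath, plc):
    """Add edges from the root package down to ``funcpath``, one per module level."""
    branch = funcpath.split(".")
    prev = branch[0]
    for i in range(2, len(branch)):
        cur = ".".join(branch[:i])
        G.add_edge(prev, cur)
        prev = cur
    # Append the PLC function names (if any) to the leaf label.
    if plc is not None:
        funcpath += " (" + ", ".join(sorted(plc)) + ")"
    G.add_edge(prev, funcpath)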


path_to_name = {
    get_funcpath(algos[funcname]): funcname
    for funcname in get_info()["functions"].keys() & algos.keys()
}
G = nx.DiGraph()
for funcpath in sorted(path_to_name):
    funcname = path_to_name[funcpath]
    add_branch(G, funcpath, getattr(BackendInterface, funcname)._plc_names)

# Strip the dotted module prefix from each node label, keeping the last component.
print(re.sub(r"[A-Za-z_\.]*\.", "", "\n".join(nx.generate_network_text(G))))

which creates

╙── networkx
    ├─╼ algorithms
    │   ├─╼ bipartite
    │   │   └─╼ generators
    │   │       └─╼ complete_bipartite_graph
    │   ├─╼ centrality
    │   │   ├─╼ betweenness
    │   │   │   ├─╼ betweenness_centrality (betweenness_centrality)
    │   │   │   └─╼ edge_betweenness_centrality (edge_betweenness_centrality)
    │   │   ├─╼ degree_alg
    │   │   │   ├─╼ degree_centrality
    │   │   │   ├─╼ in_degree_centrality
    │   │   │   └─╼ out_degree_centrality
    │   │   ├─╼ eigenvector
    │   │   │   └─╼ eigenvector_centrality (eigenvector_centrality)
    │   │   └─╼ katz
    │   │       └─╼ katz_centrality (katz_centrality)
    │   ├─╼ community
    │   │   └─╼ louvain
    │   │       └─╼ louvain_communities (louvain)
    │   ├─╼ components
    │   │   └─╼ connected
    │   │       ├─╼ connected_components (weakly_connected_components)
    │   │       ├─╼ is_connected (weakly_connected_components)
    │   │       ├─╼ node_connected_component (weakly_connected_components)
    │   │       └─╼ number_connected_components (weakly_connected_components)
    │   ├─╼ core
    │   │   └─╼ k_truss (k_truss_subgraph)
    │   ├─╼ dag
    │   │   ├─╼ ancestors (bfs)
    │   │   └─╼ descendants (bfs)
    │   ├─╼ isolate
    │   │   ├─╼ is_isolate
    │   │   ├─╼ isolates
    │   │   └─╼ number_of_isolates
    │   ├─╼ link_analysis
    │   │   ├─╼ hits_alg
    │   │   │   └─╼ hits (hits)
    │   │   └─╼ pagerank_alg
    │   │       └─╼ pagerank (pagerank, personalized_pagerank)
    │   ├─╼ shortest_paths
    │   │   └─╼ unweighted
    │   │       ├─╼ single_source_shortest_path_length (bfs)
    │   │       └─╼ single_target_shortest_path_length (bfs)
    │   └─╼ traversal
    │       └─╼ breadth_first_search
    │           ├─╼ bfs_edges (bfs)
    │           ├─╼ bfs_layers (bfs)
    │           ├─╼ bfs_predecessors (bfs)
    │           ├─╼ bfs_successors (bfs)
    │           ├─╼ bfs_tree (bfs)
    │           ├─╼ descendants_at_distance (bfs)
    │           └─╼ generic_bfs_edges (bfs)
    ├─╼ convert_matrix
    │   ├─╼ from_pandas_edgelist
    │   └─╼ from_scipy_sparse_array
    └─╼ generators
        ├─╼ classic
        │   ├─╼ barbell_graph
        │   ├─╼ circular_ladder_graph
        │   ├─╼ complete_graph
        │   ├─╼ complete_multipartite_graph
        │   ├─╼ cycle_graph
        │   ├─╼ empty_graph
        │   ├─╼ ladder_graph
        │   ├─╼ lollipop_graph
        │   ├─╼ null_graph
        │   ├─╼ path_graph
        │   ├─╼ star_graph
        │   ├─╼ tadpole_graph
        │   ├─╼ trivial_graph
        │   ├─╼ turan_graph
        │   └─╼ wheel_graph
        ├─╼ community
        │   └─╼ caveman_graph
        ├─╼ small
        │   ├─╼ bull_graph
        │   ├─╼ chvatal_graph
        │   ├─╼ cubical_graph
        │   ├─╼ desargues_graph
        │   ├─╼ diamond_graph
        │   ├─╼ dodecahedral_graph
        │   ├─╼ frucht_graph
        │   ├─╼ heawood_graph
        │   ├─╼ house_graph
        │   ├─╼ house_x_graph
        │   ├─╼ icosahedral_graph
        │   ├─╼ krackhardt_kite_graph
        │   ├─╼ moebius_kantor_graph
        │   ├─╼ octahedral_graph
        │   ├─╼ pappus_graph
        │   ├─╼ petersen_graph
        │   ├─╼ sedgewick_maze_graph
        │   ├─╼ tetrahedral_graph
        │   ├─╼ truncated_cube_graph
        │   ├─╼ truncated_tetrahedron_graph
        │   └─╼ tutte_graph
        └─╼ social
            ├─╼ davis_southern_women_graph
            ├─╼ florentine_families_graph
            ├─╼ karate_club_graph
            └─╼ les_miserables_graph

@rlratzel what do you think and what would you find helpful? I'm not sure if showing PLC usage is helpful to users, but I think it is to us.

It may be better to show the dispatch name in parentheses if it's different from the networkx name, and then maybe show PLC usage in a different diagram.

eriknw · 2023-12-22T10:29:08Z

Here's an example that shows how PLC functions are used:

╟── plc.betweenness_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ betweenness
╎               └─╼ betweenness_centrality
╟── plc.bfs
╎   └─╼ algorithms
╎       ├─╼ dag
╎       │   ├─╼ ancestors
╎       │   └─╼ descendants
╎       ├─╼ shortest_paths
╎       │   └─╼ unweighted
╎       │       ├─╼ single_source_shortest_path_length
╎       │       └─╼ single_target_shortest_path_length
╎       └─╼ traversal
╎           └─╼ breadth_first_search
╎               ├─╼ bfs_edges
╎               ├─╼ bfs_layers
╎               ├─╼ bfs_predecessors
╎               ├─╼ bfs_successors
╎               ├─╼ bfs_tree
╎               ├─╼ descendants_at_distance
╎               └─╼ generic_bfs_edges
╟── plc.edge_betweenness_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ betweenness
╎               └─╼ edge_betweenness_centrality
╟── plc.eigenvector_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ eigenvector
╎               └─╼ eigenvector_centrality
╟── plc.hits
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ hits_alg
╎               └─╼ hits
╟── plc.k_truss_subgraph
╎   └─╼ algorithms
╎       └─╼ core
╎           └─╼ k_truss
╟── plc.katz_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ katz
╎               └─╼ katz_centrality
╟── plc.louvain
╎   └─╼ algorithms
╎       └─╼ community
╎           └─╼ louvain
╎               └─╼ louvain_communities
╟── plc.pagerank
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ pagerank_alg
╎               └─╼ pagerank
╟── plc.personalized_pagerank
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ pagerank_alg
╎               └─╼ pagerank
╙── plc.weakly_connected_components
    └─╼ algorithms
        └─╼ components
            └─╼ connected
                ├─╼ connected_components
                ├─╼ is_connected
                ├─╼ node_connected_component
                └─╼ number_connected_components
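
The grouping above amounts to inverting the function-to-PLC mapping. A minimal sketch of that inversion, assuming a plain dict as input: `FUNC_TO_PLC` and `invert_plc_usage` are hypothetical names for illustration, and the sample rows are copied from a few entries of the first tree rather than read from nx-cugraph.

```python
from collections import defaultdict

# Hypothetical sample: function path -> PLC names (None when no PLC function is used).
# The rows mirror a few entries from the first tree in this thread.
FUNC_TO_PLC = {
    "networkx.algorithms.dag.ancestors": {"bfs"},
    "networkx.algorithms.dag.descendants": {"bfs"},
    "networkx.algorithms.link_analysis.pagerank_alg.pagerank": {
        "pagerank",
        "personalized_pagerank",
    },
    "networkx.algorithms.core.k_truss": {"k_truss_subgraph"},
    "networkx.algorithms.isolate.isolates": None,
}


def invert_plc_usage(func_to_plc):
    """Map each PLC function name to the sorted function paths that use it."""
    plc_to_funcs = defaultdict(list)
    for funcpath, plc_names in func_to_plc.items():
        # Functions without PLC usage contribute nothing to the inverted view.
        for plc_name in plc_names or ():
            plc_to_funcs[f"plc.{plc_name}"].append(funcpath)
    return {name: sorted(paths) for name, paths in sorted(plc_to_funcs.items())}


usage = invert_plc_usage(FUNC_TO_PLC)
for plc_name, funcpaths in usage.items():
    print(plc_name)
    for funcpath in funcpaths:
        print("    " + funcpath)
```

Each inverted entry could then be fed to something like `add_branch` with the PLC name as the root, though that wiring is not shown here.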

eriknw · 2023-12-28T10:35:12Z

Other metadata that might be nice to add to @networkx_algorithm functions: whether they fully implement the NetworkX function (maybe use incomplete=True or is_complete=False), and whether the results match NetworkX within floating point tolerance (maybe use results_different=True or is_identical=False). Any other ideas for metadata, or better names for these?
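
To make the idea concrete, here is a rough sketch of a decorator that only records this kind of metadata as function attributes. `networkx_algorithm_sketch` is a hypothetical stand-in, not the real `@networkx_algorithm`. The flag names `is_incomplete` and `is_different` follow a later commit on this PR, and the decorated function and its values are placeholders.

```python
def networkx_algorithm_sketch(
    func=None, *, plc=None, version_added=None, is_incomplete=False, is_different=False
):
    """Hypothetical stand-in for @networkx_algorithm that only records metadata."""

    def decorate(f):
        # Normalize plc to a set of names, or None when no PLC function is used.
        if plc is None:
            f._plc_names = None
        elif isinstance(plc, str):
            f._plc_names = {plc}
        else:
            f._plc_names = set(plc)
        f.version_added = version_added
        f.is_incomplete = is_incomplete
        f.is_different = is_different
        return f

    # Support both @networkx_algorithm_sketch and @networkx_algorithm_sketch(...).
    return decorate if func is None else decorate(func)


# Placeholder function and values, for illustration only.
@networkx_algorithm_sketch(plc="bfs", version_added="24.02", is_incomplete=True)
def example_algorithm(G, source):
    raise NotImplementedError


@networkx_algorithm_sketch
def example_plain(G):
    raise NotImplementedError


print(example_algorithm._plc_names, example_algorithm.is_incomplete)
```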

eriknw · 2024-01-02T20:12:57Z

I just added the ability to see which functions are "incomplete" (for example, unsupported parts of the API) and "different" (for example, RNG behavior). It's a little sobering, but it can be a good guide for identifying potential cugraph-core work.

It might be nice to be able to filter by complete/incomplete/etc.
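
A filter along those lines could look roughly like this. It is only a sketch over hypothetical records: `FuncInfo`, `filter_functions`, and the sample entries are made up for illustration and do not reflect how the scripts in this PR store their data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuncInfo:
    """Hypothetical record holding the two flags for one function."""

    name: str
    is_incomplete: bool = False
    is_different: bool = False


def filter_functions(infos, *, incomplete=None, different=None):
    """Keep records whose flags match; None means "do not filter on this flag"."""
    return [
        info
        for info in infos
        if (incomplete is None or info.is_incomplete == incomplete)
        and (different is None or info.is_different == different)
    ]


# Made-up entries, for illustration only.
INFOS = [
    FuncInfo("example_complete"),
    FuncInfo("example_incomplete", is_incomplete=True),
    FuncInfo("example_rng_dependent", is_incomplete=True, is_different=True),
]

complete = [info.name for info in filter_functions(INFOS, incomplete=False)]
different = [info.name for info in filter_functions(INFOS, different=True)]
print(complete, different)
```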

Also, the regex used by print_tree.py is a little fragile, so if we want to extend this to do much more, we may want to consider reimplementing it instead of piling hack on top of hack.
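
One possible direction for a regex-free version is to nest the dotted paths into dicts and label each node by its last path component while rendering. This is a sketch under that assumption, not what print_tree.py does. `build_tree` and `render_tree` are hypothetical helpers, and the box-drawing style is simplified compared to `nx.generate_network_text`.

```python
def build_tree(funcpaths):
    """Nest dotted paths into dicts, e.g. {"networkx": {"algorithms": {...}}}."""
    root = {}
    for funcpath in sorted(funcpaths):
        node = root
        for part in funcpath.split("."):
            node = node.setdefault(part, {})
    return root


def render_tree(tree, prefix=""):
    """Return display lines for the nested dict, one line per node."""
    lines = []
    items = list(tree.items())
    for i, (name, children) in enumerate(items):
        last = i == len(items) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        # Children of the last sibling get blank padding; others continue the bar.
        lines.extend(render_tree(children, prefix + ("    " if last else "│   ")))
    return lines


# Sample paths taken from the first tree in this thread.
PATHS = [
    "networkx.algorithms.dag.ancestors",
    "networkx.algorithms.dag.descendants",
    "networkx.algorithms.core.k_truss",
]
lines = render_tree(build_tree(PATHS))
print("\n".join(lines))
```

Because labels are never full dotted paths, nothing needs to be stripped from the output afterwards.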

copy-pr-bot · 2024-01-03T15:01:36Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

eriknw · 2024-01-03T15:06:51Z

/ok to test

eriknw · 2024-01-03T15:08:05Z

python/nx-cugraph/pyproject.toml

+# "backend" used in nx version >= 3.2
+[project.entry-points."networkx.backends"]
+cugraph = "nx_cugraph.interface:BackendInterface"
+
+[project.entry-points."networkx.backend_info"]
+cugraph = "_nx_cugraph:get_info"
+


btw, this is needed to work with the dev version of NetworkX.
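
For anyone who wants to check what is registered locally, here is a small sketch that lists the entry points in these two groups using only the standard library. `list_entry_points` is a hypothetical helper. The group names come from the diff above, and the result depends on which packages are installed in the environment.

```python
from importlib.metadata import entry_points


def list_entry_points(group):
    """Return {name: "module:attr"} for entry points installed under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


backends = list_entry_points("networkx.backends")
backend_info = list_entry_points("networkx.backend_info")
print(backends)
print(backend_info)
```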

rlratzel

Very neat, thanks! I'm approving but I also had some suggestions I hope you consider. Of all of them, the single metadata dictionary is my biggest preference.

python/nx-cugraph/nx_cugraph/algorithms/centrality/betweenness.py

eriknw · 2024-01-11T17:39:11Z

/ok to test

rlratzel · 2024-01-11T19:34:14Z

/merge

nx-cugraph: indicate which plc algorithms are used

799c864

eriknw requested a review from a team as a code owner December 21, 2023 23:30

github-actions bot added the python label Dec 21, 2023

eriknw added 2 commits December 22, 2023 04:49

missed one

a089fcd

Also add version_added metadata

440b852

eriknw changed the title ~~nx-cugraph: indicate which plc algorithms are used~~ nx-cugraph: indicate which plc algorithms are used and version_added Dec 22, 2023

Add scripts to display information about functions in nx-cugraph

b051bae

eriknw added 3 commits January 2, 2024 20:25

Also add is_incomplete and is_different

8ed2e17

Update copyright years

29692ef

Update copyright for new files

36fc6dd

eriknw added the improvement Improvement / enhancement to an existing function label Jan 2, 2024

Oops!

d9db4b3

eriknw commented Jan 3, 2024

rlratzel approved these changes Jan 8, 2024

python/nx-cugraph/nx_cugraph/algorithms/centrality/betweenness.py

rlratzel added the non-breaking Non-breaking change label Jan 8, 2024

rlratzel linked an issue Jan 8, 2024 that may be closed by this pull request

Update nx-cugraph README/docs for 24.02 #4079

Closed

rlratzel mentioned this pull request Jan 8, 2024

Update nx-cugraph README/docs for 24.02 #4079

Closed

rlratzel added the DO NOT MERGE Hold off on merging; see PR for details label Jan 9, 2024

eriknw added 5 commits January 10, 2024 13:38

Merge branch 'branch-24.02' into nx_cugraph_plc

9163eec

Add clarifying code comments (almost like a docstring!)

d17608a

Merge branch 'branch-24.02' into nx_cugraph_plc

cadd465

Make sure differences are noted in extra docstrings

f5d1252

Merge branch 'branch-24.02' into nx_cugraph_plc

e0f6edc

rlratzel approved these changes Jan 11, 2024

rlratzel removed the DO NOT MERGE Hold off on merging; see PR for details label Jan 11, 2024

rapids-bot bot merged commit 88c3884 into rapidsai:branch-24.02 Jan 11, 2024
97 checks passed
