nx-cugraph: indicate which plc algorithms are used and version_added #4069

Merged
rapids-bot merged 13 commits into rapidsai:branch-24.02 on Jan 11, 2024

Conversation

eriknw
Contributor

@eriknw eriknw commented Dec 21, 2023

Pretty simple PR. I would like for us to use this metadata when creating tables of supported algorithms.

@eriknw eriknw requested a review from a team as a code owner December 21, 2023 23:30
Contributor Author

eriknw commented Dec 22, 2023

I'm thinking something like this:

import re

import networkx as nx
from networkx.utils.backends import _registered_algorithms as algos

from _nx_cugraph import get_info
from nx_cugraph.interface import BackendInterface


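def get_funcpath(func):
    # Fully qualified dotted path, e.g. "networkx.algorithms.dag.ancestors"
    return f"{func.__module__}.{func.__name__}"


def add_branch(G, funcpath, plc):
    # Add one edge per module level so the function hangs off its module path
    branch = funcpath.split(".")
    prev = branch[0]
    for i in range(2, len(branch)):
        cur = ".".join(branch[:i])
        G.add_edge(prev, cur)
        prev = cur
    # Annotate the leaf with the PLC algorithms it uses, if any
    if plc is not None:
        funcpath += " (" + ", ".join(sorted(plc)) + ")"
    G.add_edge(prev, funcpath)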


path_to_name = {
    get_funcpath(algos[funcname]): funcname
    for funcname in get_info()["functions"].keys() & algos.keys()
}
G = nx.DiGraph()
for funcpath in sorted(path_to_name):
    funcname = path_to_name[funcpath]
    add_branch(G, funcpath, getattr(BackendInterface, funcname)._plc_names)

print(re.sub(r"[A-Za-z_\.]*\.", "", ("\n".join(nx.generate_network_text(G)))))

which creates

╙── networkx
    ├─╼ algorithms
    │   ├─╼ bipartite
    │   │   └─╼ generators
    │   │       └─╼ complete_bipartite_graph
    │   ├─╼ centrality
    │   │   ├─╼ betweenness
    │   │   │   ├─╼ betweenness_centrality (betweenness_centrality)
    │   │   │   └─╼ edge_betweenness_centrality (edge_betweenness_centrality)
    │   │   ├─╼ degree_alg
    │   │   │   ├─╼ degree_centrality
    │   │   │   ├─╼ in_degree_centrality
    │   │   │   └─╼ out_degree_centrality
    │   │   ├─╼ eigenvector
    │   │   │   └─╼ eigenvector_centrality (eigenvector_centrality)
    │   │   └─╼ katz
    │   │       └─╼ katz_centrality (katz_centrality)
    │   ├─╼ community
    │   │   └─╼ louvain
    │   │       └─╼ louvain_communities (louvain)
    │   ├─╼ components
    │   │   └─╼ connected
    │   │       ├─╼ connected_components (weakly_connected_components)
    │   │       ├─╼ is_connected (weakly_connected_components)
    │   │       ├─╼ node_connected_component (weakly_connected_components)
    │   │       └─╼ number_connected_components (weakly_connected_components)
    │   ├─╼ core
    │   │   └─╼ k_truss (k_truss_subgraph)
    │   ├─╼ dag
    │   │   ├─╼ ancestors (bfs)
    │   │   └─╼ descendants (bfs)
    │   ├─╼ isolate
    │   │   ├─╼ is_isolate
    │   │   ├─╼ isolates
    │   │   └─╼ number_of_isolates
    │   ├─╼ link_analysis
    │   │   ├─╼ hits_alg
    │   │   │   └─╼ hits (hits)
    │   │   └─╼ pagerank_alg
    │   │       └─╼ pagerank (pagerank, personalized_pagerank)
    │   ├─╼ shortest_paths
    │   │   └─╼ unweighted
    │   │       ├─╼ single_source_shortest_path_length (bfs)
    │   │       └─╼ single_target_shortest_path_length (bfs)
    │   └─╼ traversal
    │       └─╼ breadth_first_search
    │           ├─╼ bfs_edges (bfs)
    │           ├─╼ bfs_layers (bfs)
    │           ├─╼ bfs_predecessors (bfs)
    │           ├─╼ bfs_successors (bfs)
    │           ├─╼ bfs_tree (bfs)
    │           ├─╼ descendants_at_distance (bfs)
    │           └─╼ generic_bfs_edges (bfs)
    ├─╼ convert_matrix
    │   ├─╼ from_pandas_edgelist
    │   └─╼ from_scipy_sparse_array
    └─╼ generators
        ├─╼ classic
        │   ├─╼ barbell_graph
        │   ├─╼ circular_ladder_graph
        │   ├─╼ complete_graph
        │   ├─╼ complete_multipartite_graph
        │   ├─╼ cycle_graph
        │   ├─╼ empty_graph
        │   ├─╼ ladder_graph
        │   ├─╼ lollipop_graph
        │   ├─╼ null_graph
        │   ├─╼ path_graph
        │   ├─╼ star_graph
        │   ├─╼ tadpole_graph
        │   ├─╼ trivial_graph
        │   ├─╼ turan_graph
        │   └─╼ wheel_graph
        ├─╼ community
        │   └─╼ caveman_graph
        ├─╼ small
        │   ├─╼ bull_graph
        │   ├─╼ chvatal_graph
        │   ├─╼ cubical_graph
        │   ├─╼ desargues_graph
        │   ├─╼ diamond_graph
        │   ├─╼ dodecahedral_graph
        │   ├─╼ frucht_graph
        │   ├─╼ heawood_graph
        │   ├─╼ house_graph
        │   ├─╼ house_x_graph
        │   ├─╼ icosahedral_graph
        │   ├─╼ krackhardt_kite_graph
        │   ├─╼ moebius_kantor_graph
        │   ├─╼ octahedral_graph
        │   ├─╼ pappus_graph
        │   ├─╼ petersen_graph
        │   ├─╼ sedgewick_maze_graph
        │   ├─╼ tetrahedral_graph
        │   ├─╼ truncated_cube_graph
        │   ├─╼ truncated_tetrahedron_graph
        │   └─╼ tutte_graph
        └─╼ social
            ├─╼ davis_southern_women_graph
            ├─╼ florentine_families_graph
            ├─╼ karate_club_graph
            └─╼ les_miserables_graph

@rlratzel what do you think and what would you find helpful? I'm not sure if showing PLC usage is helpful to users, but I think it is to us.

It may be better to show the dispatch name in parentheses if it's different from the networkx name, and then maybe show PLC usage in a different diagram.

Contributor Author

eriknw commented Dec 22, 2023

Here's an example that shows how PLC functions are used:

╟── plc.betweenness_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ betweenness
╎               └─╼ betweenness_centrality
╟── plc.bfs
╎   └─╼ algorithms
╎       ├─╼ dag
╎       │   ├─╼ ancestors
╎       │   └─╼ descendants
╎       ├─╼ shortest_paths
╎       │   └─╼ unweighted
╎       │       ├─╼ single_source_shortest_path_length
╎       │       └─╼ single_target_shortest_path_length
╎       └─╼ traversal
╎           └─╼ breadth_first_search
╎               ├─╼ bfs_edges
╎               ├─╼ bfs_layers
╎               ├─╼ bfs_predecessors
╎               ├─╼ bfs_successors
╎               ├─╼ bfs_tree
╎               ├─╼ descendants_at_distance
╎               └─╼ generic_bfs_edges
╟── plc.edge_betweenness_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ betweenness
╎               └─╼ edge_betweenness_centrality
╟── plc.eigenvector_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ eigenvector
╎               └─╼ eigenvector_centrality
╟── plc.hits
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ hits_alg
╎               └─╼ hits
╟── plc.k_truss_subgraph
╎   └─╼ algorithms
╎       └─╼ core
╎           └─╼ k_truss
╟── plc.katz_centrality
╎   └─╼ algorithms
╎       └─╼ centrality
╎           └─╼ katz
╎               └─╼ katz_centrality
╟── plc.louvain
╎   └─╼ algorithms
╎       └─╼ community
╎           └─╼ louvain
╎               └─╼ louvain_communities
╟── plc.pagerank
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ pagerank_alg
╎               └─╼ pagerank
╟── plc.personalized_pagerank
╎   └─╼ algorithms
╎       └─╼ link_analysis
╎           └─╼ pagerank_alg
╎               └─╼ pagerank
╙── plc.weakly_connected_components
    └─╼ algorithms
        └─╼ components
            └─╼ connected
                ├─╼ connected_components
                ├─╼ is_connected
                ├─╼ node_connected_component
                └─╼ number_connected_components
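
The inverted view above (PLC function first, NetworkX functions underneath) can be sketched as a simple mapping inversion. This is a minimal illustration, not the script that produced the output: the `plc_by_func` dictionary is a hand-copied sample of a few entries from the trees in this thread, `invert` and `render` are hypothetical helper names, and plain dictionaries are used instead of `networkx` so the sketch runs without dependencies.

```python
from collections import defaultdict

# Hypothetical sample: NetworkX function path -> PLC names it uses
# (a few entries copied by hand from the trees in this thread).
plc_by_func = {
    "networkx.algorithms.dag.ancestors": {"bfs"},
    "networkx.algorithms.dag.descendants": {"bfs"},
    "networkx.algorithms.link_analysis.pagerank_alg.pagerank": {
        "pagerank",
        "personalized_pagerank",
    },
    "networkx.generators.classic.path_graph": set(),  # no PLC call
}


def invert(plc_by_func):
    """Return {plc_name: sorted list of function paths that use it}."""
    funcs_by_plc = defaultdict(list)
    for funcpath, plc_names in plc_by_func.items():
        for plc_name in plc_names:
            funcs_by_plc[plc_name].append(funcpath)
    return {name: sorted(paths) for name, paths in sorted(funcs_by_plc.items())}


def render(funcs_by_plc):
    """Render one 'plc.<name>' root per PLC function, users indented below."""
    lines = []
    for plc_name, funcpaths in funcs_by_plc.items():
        lines.append(f"plc.{plc_name}")
        for funcpath in funcpaths:
            # Drop the leading "networkx." to mirror the output above
            lines.append("    " + funcpath.split(".", 1)[1])
    return "\n".join(lines)


funcs_by_plc = invert(plc_by_func)
print(render(funcs_by_plc))
```

Functions that use no PLC algorithm (such as the generators) simply drop out of the inverted view, which matches the output above.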

@eriknw eriknw changed the title from "nx-cugraph: indicate which plc algorithms are used" to "nx-cugraph: indicate which plc algorithms are used and version_added" on Dec 22, 2023
Contributor Author

eriknw commented Dec 28, 2023

Other metadata that might be nice to add to @networkx_algorithm functions is whether or not they fully implement the NetworkX function (maybe use incomplete=True or is_complete=False) and whether the results are the same within floating point tolerance (maybe use results_different=True or is_identical=False). Any other ideas for metadata or better names for these?
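
As a rough illustration of the idea, metadata like this could be attached as plain function attributes by a decorator. The sketch below is an assumption for discussion only: `algorithm_metadata`, its keyword names, and `example_algorithm` are hypothetical and are not the actual `@networkx_algorithm` implementation, and the values passed are placeholders.

```python
def algorithm_metadata(
    *, version_added, plc=None, is_incomplete=False, is_different=False
):
    """Hypothetical decorator that records metadata as function attributes."""

    def decorator(func):
        # Normalize `plc` to a frozenset so one or several names look the same
        if plc is None:
            names = frozenset()
        elif isinstance(plc, str):
            names = frozenset({plc})
        else:
            names = frozenset(plc)
        func.version_added = version_added
        func.plc_names = names
        func.is_incomplete = is_incomplete
        func.is_different = is_different
        return func

    return decorator


# Placeholder values for illustration only
@algorithm_metadata(version_added="24.02", plc="bfs", is_incomplete=True)
def example_algorithm(G):
    raise NotImplementedError("placeholder body")


print(example_algorithm.version_added, sorted(example_algorithm.plc_names))
print(example_algorithm.is_incomplete, example_algorithm.is_different)
```

Defaulting both flags to `False` keeps the common case (complete and identical results) free of extra arguments.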

@eriknw eriknw added the "improvement" label (Improvement / enhancement to an existing function) on Jan 2, 2024
Contributor Author

eriknw commented Jan 2, 2024

I just added the ability to see what functions are "incomplete" (such as unsupported API) and "different" (such as behavior of RNG). It's a little sobering, but can be a good guide for identifying potential cugraph-core work.

It might be nice to be able to filter by complete/incomplete/etc.
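
A filter along these lines might look like the following sketch. The record layout, the `select` helper, and the `algo_*` names are all hypothetical; `None` means "don't care" for a flag.

```python
# Hypothetical records; names and flag values are made up for illustration
records = [
    {"name": "algo_a", "is_incomplete": False, "is_different": False},
    {"name": "algo_b", "is_incomplete": True, "is_different": False},
    {"name": "algo_c", "is_incomplete": True, "is_different": True},
]


def select(records, *, is_incomplete=None, is_different=None):
    """Return names of records matching every flag that is not None."""
    wanted = {"is_incomplete": is_incomplete, "is_different": is_different}
    return [
        rec["name"]
        for rec in records
        if all(val is None or rec[key] == val for key, val in wanted.items())
    ]


print(select(records, is_incomplete=True))
print(select(records, is_incomplete=False, is_different=False))
```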

Also, the regex used by print_tree.py is a little fragile, so if we want to extend this to do much more, we may want to consider reimplementing instead of adding hack onto hack.
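
To make the fragility concrete, here is a small sketch using the pattern from the snippet earlier in this thread (which may differ from what `print_tree.py` currently uses). The second input line and the `ANCHORED` pattern are assumptions for illustration: the original pattern matches any lone `.`, so digits in a module name or a dotted version string get mangled, while a pattern that only strips whole dotted identifier prefixes leaves them alone.

```python
import re

# Pattern from the snippet posted earlier in this thread
ORIGINAL = r"[A-Za-z_\.]*\."
# Assumed alternative: strip only whole "identifier." prefixes
ANCHORED = r"\b(?:[A-Za-z_]\w*\.)+(?=[A-Za-z_])"


def strip_prefix(pattern, text):
    """Remove dotted module prefixes from `text` using `pattern`."""
    return re.sub(pattern, "", text)


tree_line = "└─╼ networkx.algorithms.core.k_truss (k_truss_subgraph)"
# Hypothetical input with a digit in a module name and a version string
tricky_line = "pkg.mod2.func (added in 23.10)"

for line in (tree_line, tricky_line):
    print(strip_prefix(ORIGINAL, line))
    print(strip_prefix(ANCHORED, line))
```

Both patterns agree on the tree line, but on the second line the original yields `mod2func (added in 2310)` while the anchored one yields `func (added in 23.10)`. Labeling nodes with short names before rendering would avoid post-processing with a regex altogether.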

copy-pr-bot bot commented Jan 3, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor Author

eriknw commented Jan 3, 2024

/ok to test

Comment on lines +62 to +68
# "backend" used in nx version >= 3.2
[project.entry-points."networkx.backends"]
cugraph = "nx_cugraph.interface:BackendInterface"

[project.entry-points."networkx.backend_info"]
cugraph = "_nx_cugraph:get_info"

Contributor Author

btw, this is needed to work with the dev version of NetworkX.
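
For context, entry-point groups like the two above can be inspected with the standard library. The sketch below is only an illustration of how such registrations are discovered in general, not how NetworkX itself loads backends; `backend_entry_points` is a hypothetical helper name. With nx-cugraph installed, a `cugraph` entry would be expected under each group.

```python
from importlib.metadata import entry_points


def backend_entry_points(group="networkx.backends"):
    """Return {name: "module:attr"} for entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+
        selected = eps.select(group=group)
    else:  # Python 3.8/3.9 return a plain dict keyed by group
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


for group in ("networkx.backends", "networkx.backend_info"):
    print(group, backend_entry_points(group))
```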

Contributor

@rlratzel rlratzel left a comment

Very neat, thanks! I'm approving but I also had some suggestions I hope you consider. Of all of them, the single metadata dictionary is my biggest preference.

@rlratzel rlratzel added the "non-breaking" label (Non-breaking change) on Jan 8, 2024
@rlratzel rlratzel linked an issue Jan 8, 2024 that may be closed by this pull request
@rlratzel rlratzel added the "DO NOT MERGE" label (Hold off on merging; see PR for details) on Jan 9, 2024
@eriknw
Copy link
Contributor Author

eriknw commented Jan 11, 2024

/ok to test

@rlratzel rlratzel removed the "DO NOT MERGE" label (Hold off on merging; see PR for details) on Jan 11, 2024
@rlratzel
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 88c3884 into rapidsai:branch-24.02 Jan 11, 2024
97 checks passed
Labels
improvement, non-breaking, python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update nx-cugraph README/docs for 24.02
2 participants