Fetching from 22.04 after random walk bug was merged #15

Conversation

betochimas
Owner

No description provided.

betochimas and others added 5 commits February 19, 2022 02:40
Increases consistency between the API docs and the namespace from which each method can be called, reducing potential confusion. Example: cugraph.centrality.betweenness_centrality.betweenness_centrality can now be found as cugraph.betweenness_centrality. Note: both are valid ways of calling the same method.

Includes methods and attributes of a graph object once data is loaded (from SimpleGraphImpl), such as number_of_vertices, has_edge, get_two_hops_neighbors, etc.

Ready for review.

Authors:
  - https://github.com/betochimas

Approvers:
  - Brad Rees (https://github.com/BradReesWork)

URL: #2086
This pull request adds neighborhood sampling, as needed by GNN frameworks (DGL, PyTorch-Geometric).

Since I did not hear back on most of the other issues that need to be addressed before this, I am continuing with my plan of first opening a PR with just the API. Once we agree on the final API, and once a minimal version of cugraph-ops is integrated, we can add the implementation of this API.

In particular, for now I am suggesting that the sampling type be exposed in the public API (it does not exist yet in cugraph-ops, since that has not been integrated yet). The sampling type must be decided ahead of sampling for best performance (either by the end user or by some automatic heuristic on the original graph), which is why it makes sense to have it as a separate parameter in this API.

EDIT: link to issue #1978

Authors:
  - Matt Joux (https://github.com/MatthiasKohl)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Robert Maynard (https://github.com/robertmaynard)
  - Andrei Schaffer (https://github.com/aschaffer)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #1982
Previously, graph creation required roughly edge data size * 2 + alpha when the input edge data can be destroyed and its memory reclaimed (* 3 when the input edge data cannot be destroyed). This PR cuts this to edge data size * 1.5 + alpha.

This is a breaking PR for projects using functions under https://github.com/rapidsai/cugraph/blob/branch-22.04/cpp/include/cugraph/utilities/shuffle_comm.cuh (@aschaffer You may be affected).

There will be a follow-up PR further optimizing memory footprint. Wrapping up this PR to avoid creating a giant PR.
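To make the memory figures above concrete, here is a small back-of-the-envelope sketch. The function name `peak_graph_creation_bytes`, the 16 GiB edge list, and the default `alpha_bytes=0` are illustrative assumptions; the PR does not quantify alpha, and this is not code from cuGraph.

```python
GIB = 1024 ** 3

def peak_graph_creation_bytes(edge_bytes, factor, alpha_bytes=0):
    """Rough peak-memory estimate: factor * edge data size + alpha.

    `alpha_bytes` stands in for the unspecified "alpha" overhead in the PR
    description; it defaults to 0 here purely for illustration.
    """
    return factor * edge_bytes + alpha_bytes

edge_bytes = 16 * GIB  # hypothetical edge list size

before_destroyable = peak_graph_creation_bytes(edge_bytes, 2)    # input can be freed
before_preserved = peak_graph_creation_bytes(edge_bytes, 3)      # input must be kept
after_this_pr = peak_graph_creation_bytes(edge_bytes, 1.5)

for label, value in [
    ("before (input destroyable)", before_destroyable),
    ("before (input preserved)", before_preserved),
    ("after this PR", after_this_pr),
]:
    print(f"{label}: {value / GIB:.1f} GiB + alpha")
```

Under these assumptions, a 16 GiB edge list would drop from roughly 32 GiB (or 48 GiB) to roughly 24 GiB of peak usage, plus alpha in each case.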

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2070
…#2080)

If you coarsen a symmetric (i.e. undirected) graph, the output graph should be symmetric as well.

However, due to limited floating-point resolution, edge weights can be slightly asymmetric after coarsening. For example, for (src, dst, weight) triplets, we may see (1, 2, 1.0) and its reverse edge (2, 1, 1.0 + 1e-7); this is only approximately symmetric, not strictly symmetric.

This PR fixes this by coarsening using only the lower triangular part (including the diagonal) after relabeling and reconstructing a symmetric graph from the lower triangular part (if the input graph is symmetric).
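The idea of coarsening only the lower triangular part and then mirroring it can be sketched in plain Python as follows. This is a minimal illustration, not the cuGraph implementation: the function `coarsen_symmetric`, the `labels` mapping, and the toy edge list are all hypothetical, and the input is assumed to contain both directions of every undirected edge.

```python
from collections import defaultdict

def coarsen_symmetric(edges, labels):
    """Coarsen a symmetric graph given as directed (src, dst, weight) triplets.

    `labels` maps each vertex to its cluster id. After relabeling, only edges
    in the lower triangle (src >= dst, diagonal included) are accumulated;
    the upper triangle is then copied from the lower one, so a weight and its
    reverse are bit-for-bit identical.
    """
    lower = defaultdict(float)
    for src, dst, w in edges:
        s, d = labels[src], labels[dst]
        if s >= d:  # keep the lower triangle and the diagonal only
            lower[(s, d)] += w

    coarse = {}
    for (s, d), w in lower.items():
        coarse[(s, d)] = w
        coarse[(d, s)] = w  # mirror; a no-op on the diagonal
    return coarse

# Toy symmetric graph: vertices 0, 1 form cluster 0; vertices 2, 3 form cluster 1.
labels = {0: 0, 1: 0, 2: 1, 3: 1}
edges = [
    (0, 2, 0.1), (2, 0, 0.1),
    (1, 3, 0.2), (3, 1, 0.2),
    (1, 2, 0.3), (2, 1, 0.3),
    (0, 1, 1.0), (1, 0, 1.0),
]
coarse = coarsen_symmetric(edges, labels)
```

Because each off-diagonal weight is summed once and then copied, the two directions cannot drift apart through different floating-point summation orders.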

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2080
The random walk implementation returns a list of paths (vertex ids) and a list of edge weights for the edges on those paths.

@betochimas discovered that the returned arrays have the same size, even though their valid contents differ in length. In his example, if the result contains one path, `[0, 1, 3, 5]`, it returns a weights array with 4 elements, `[0.1, 2.1, 7.2, ???]`, even though only the first three elements are valid: the weights are edge weights, and the path `[0, 1, 3, 5]` has only 3 edges. The first 3 elements in the returned array are correct, and interpreting the results as intended will never reference the fourth value. However, the sizes should be correct.

This PR fixes the initialization of the arrays to the correct size, which seems to correct the bug.
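The size relationship described above (a path with n vertices has n - 1 edges, so n - 1 weights) can be illustrated with a short pure-Python sketch. The helper `walk_edge_weights` and the `edge_weight` lookup table are hypothetical and exist only for this example; they are not part of the cuGraph API.

```python
def walk_edge_weights(path, edge_weight):
    """Return the weights of the edges traversed by `path`.

    `path` is a list of vertex ids; consecutive pairs form the edges, so the
    result has len(path) - 1 entries (or 0 for an empty or single-vertex path).
    """
    return [edge_weight[(u, v)] for u, v in zip(path, path[1:])]

# Edge weights taken from the example in the PR description.
edge_weight = {(0, 1): 0.1, (1, 3): 2.1, (3, 5): 7.2}
path = [0, 1, 3, 5]

weights = walk_edge_weights(path, edge_weight)

# Four vertices, three edges: the weights array is one element shorter.
assert len(weights) == len(path) - 1
```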

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - https://github.com/betochimas
  - Seunghwa Kang (https://github.com/seunghwak)

URL: #2089
betochimas merged commit 53d309a into betochimas:branch-22.04-fea-node2vec-pylibcugraph on Feb 24, 2022
4 participants