
[Model] Update CuGraphRelGraphConv to use pylibcugraphops=23.02 #5217

Merged: 8 commits merged on Feb 15, 2023

Conversation

tingyu66 (Contributor)

Description

This PR updates the CuGraphRelGraphConv module to use pylibcugraphops 23.02.

With pylibcugraphops now offering autograd functions for aggregation, we tidy up the source file to include only the nn.Module and make a few improvements.

Detailed changes include:

  • support an apply_norm option that enables normalized aggregation
  • fuse the self-loop weight into the weight matrix W for better performance
  • move max_in_degree to forward(), since it is a property of the graph, not the model
  • support full-graph input (previously only sampled graphs were supported)
  • improve the tests and update the example
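
The apply_norm option corresponds to mean aggregation: each incoming message is scaled by the reciprocal in-degree of its destination node, as in the RGCN paper. A minimal pure-Python sketch of that per-edge normalization (illustrative only; DGL computes it with dgl.norm_by_dst, and the fused kernels live in pylibcugraphops):

```python
def dst_norm(edges, num_nodes):
    """Per-edge 1/in-degree normalization for destination nodes.

    Illustrative sketch of the normalization that ``apply_norm=True``
    enables; the real implementation runs inside pylibcugraphops kernels.
    """
    in_deg = [0] * num_nodes
    for _, dst in edges:
        in_deg[dst] += 1
    # Scale each edge by the reciprocal in-degree of its destination.
    return [1.0 / in_deg[dst] for _, dst in edges]

# Three edges into a 3-node graph: node 2 has in-degree 2, node 1 has 1.
print(dst_norm([(0, 2), (1, 2), (0, 1)], num_nodes=3))  # [0.5, 0.5, 1.0]
```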

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small. Read the Google eng practice (CL equals PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code changes small (examples, tests, and documentation can be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • The related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot (Collaborator)

dgl-bot commented Jan 20, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Jan 20, 2023

Commit ID: 3369346191274395137e71a3933415ed50e99df9

Build ID: 1

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot (Collaborator)

dgl-bot commented Jan 26, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Jan 26, 2023

Commit ID: 6e409d7efddb7e4027993c318fd6212e5c659992

Build ID: 2

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@tingyu66 tingyu66 changed the title [Do not merge][Model] Update CuGraphRelGraphConv to use pylibcugraphops=23.02 [Model] Update CuGraphRelGraphConv to use pylibcugraphops=23.02 Feb 2, 2023
@dgl-bot (Collaborator)

dgl-bot commented Feb 2, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 2, 2023

Commit ID: 59708d281a9f2c96ede8bc7208088b725e1f19cf

Build ID: 3

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@rudongyu rudongyu requested a review from mufeili February 2, 2023 03:01
@mufeili (Member)

mufeili commented Feb 2, 2023

@dgl-bot

@mufeili (Member)

mufeili commented Feb 2, 2023

Are there breaking changes?

@@ -214,87 +87,68 @@ def __init__(
regularizer=None,
num_bases=None,
bias=True,
activation=None,
Member

Is this a breaking change?

Contributor Author

Yes. I saw this comment in RelGraphConv:

# TODO(minjie): consider remove those options in the future to make
# the module only about graph convolution.

I also prefer to apply activation outside of the conv layer.
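
Composing the activation outside the layer keeps the module focused on graph convolution. A schematic, framework-free sketch of the resulting call pattern (the layer and activation functions here are illustrative stand-ins, not DGL code):

```python
def apply_layers(x, layers, activation):
    """Run a stack of conv-like layers, applying the activation *between*
    layers rather than inside them (no activation after the last layer)."""
    for i, layer in enumerate(layers):
        x = layer(x)
        if i < len(layers) - 1:
            x = activation(x)
    return x

# Stand-in "layers" and activation, just to show the composition.
double = lambda v: v * 2
shift = lambda v: v - 10
relu = lambda v: max(v, 0.0)
print(apply_layers(3.0, [double, shift], relu))  # double -> relu -> shift = -4.0
```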

):
if has_pylibcugraphops is False:
raise ModuleNotFoundError(
"dgl.nn.CuGraphRelGraphConv requires pylibcugraphops "
"dgl.nn.CuGraphRelGraphConv requires pylibcugraphops >= 23.02 "
Member

Is there a way to check the version number of the package?

Contributor Author

https://github.com/rapidsai/dgl/blob/785e294b6e5a596df9c669eeb6ca56672a23d002/python/dgl/nn/pytorch/conv/cugraph_relgraphconv.py#L11-L13

Any version <23.02 does not have the pylibcugraphops.torch.autograd API and thus will throw an exception here.
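
As a sketch, an explicit version gate could make the failure mode clearer than relying on the missing-module import error. The CalVer comparison helper below is hypothetical, not DGL's actual code:

```python
def _calver_tuple(version):
    # "23.02.00" -> (23, 2); only the CalVer year.month parts matter here.
    return tuple(int(part) for part in version.split(".")[:2])

def meets_minimum(installed, minimum="23.02"):
    """Return True if an installed CalVer string satisfies the minimum.

    Hypothetical helper: DGL itself simply try-imports
    pylibcugraphops.torch.autograd, which only exists in >= 23.02.
    """
    return _calver_tuple(installed) >= _calver_tuple(minimum)

print(meets_minimum("23.04"))  # True
print(meets_minimum("22.12"))  # False
```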

@dgl-bot (Collaborator)

dgl-bot commented Feb 2, 2023

Commit ID: 0a3f04d268015ce04085ce6782fe477077ded4bb

Build ID: 4

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

self_loop : bool, optional
True to include self loop message. Default: ``True``.
dropout : float, optional
Dropout rate. Default: ``0.0``.
layer_norm : bool, optional
Member

Is this a breaking change?

Contributor Author

Yes, same as the "activation" argument above.

@@ -309,57 +163,58 @@ def forward(self, g, feat, etypes, norm=None):
so any input of other integer types will be casted into int32,
thus introducing some overhead. Pass in int32 tensors directly
for best performance.
norm : torch.Tensor, optional
Member

Is this replaced by apply_norm=True?

Contributor Author

Yes. This is an additional feature from CuGraphRelGraphConv: supporting the normalized aggregation presented in the RGCN paper.

Hence, users no longer need to compute the norm during training:

for block in blocks:
    block.edata['norm'] = dgl.norm_by_dst(block).unsqueeze(1)

A 1D tensor of edge norm value. Shape: :math:`(|E|,)`.
max_in_degree : int, optional
Maximum in-degree of destination nodes. It is only effective when
:attr:`g` is a :class:`DGLBlock`, i.e., bipartite graph. When
Member

Is this still only valid for DGLBlock?

Contributor Author

Yes

import dgl
from dgl.nn import CuGraphRelGraphConv
from dgl.nn import RelGraphConv
from dgl.nn import CuGraphRelGraphConv, RelGraphConv

# TODO(tingyu66): Re-enable the following tests after updating cuGraph CI image.
Member

Is this now re-enabled?

Contributor Author

Not yet, I will create a new image after our 23.02 release and update these pytest markers in a separate PR.

@@ -8,19 +8,20 @@
code changes from the current `entity_sample.py` example.
Member

Did you see similar performance numbers after running this script?

Contributor Author

Yes, performance is the same as before.

@mufeili (Member) left a comment

done a pass

@dgl-bot (Collaborator)

dgl-bot commented Feb 10, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 10, 2023

Commit ID: 514aacf4df8715b75e2c064c94549a68061a2de0

Build ID: 5

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@tingyu66 (Contributor, Author)

> Are there breaking changes?

Apologies for the late reply. The norm option is now replaced by apply_norm. Other breaking changes include the removal of the activation and layer_norm options.

@dgl-bot (Collaborator)

dgl-bot commented Feb 13, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 13, 2023

Commit ID: a03d0b3

Build ID: 6

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Commit ID: 474f8d2

Build ID: 7

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@mufeili (Member)

mufeili commented Feb 15, 2023

@dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Commit ID: 474f8d2

Build ID: 8

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

  • @dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Commit ID: 865a0ca

Build ID: 9

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

@mufeili (Member)

mufeili commented Feb 15, 2023

@dgl-bot

@dgl-bot (Collaborator)

dgl-bot commented Feb 15, 2023

Commit ID: 865a0ca

Build ID: 10

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@mufeili mufeili merged commit 19b3cea into dmlc:master Feb 15, 2023
@tingyu66 tingyu66 deleted the update-cugraph-relgraphconv branch February 15, 2023 14:02
paoxiaode pushed a commit to paoxiaode/dgl that referenced this pull request Mar 24, 2023
…mlc#5217)

* update cugraph_relgraphconv

* update equality test

* update cugraph rgcn example

* update RelGraphConvAgg based on latest API changes

* enable fallback option to fg when fanout is large

---------

Co-authored-by: Mufei Li <[email protected]>
DominikaJedynak pushed a commit to DominikaJedynak/dgl that referenced this pull request Mar 12, 2024
…mlc#5217)
