Link-level NeighborLoader
#4026
Comments
How is the progress on the LinkLevelNeighborLoader? |
We will post here once we make progress. Sorry for the delay. |
Hey @rusty1s want a hand with this one? |
Help is always good, thank you! Let me know how we want to proceed with this. @RexYing might have further thoughts. |
If I'm picking it up, I would plan to start with the proposed API at the top of this issue and see how it would look for regular and heterogeneous graphs. I need to play around a bit to understand the requirements. With a working example (even if a bit of a hack), we can align on the rest of the implementation details. What do you think? |
We could follow along in a similar fashion as in your label-masked prop PR, and discuss as we go :) |
Sounds good to me :-)
|
Hey @rusty1s, I'm trying to understand something in the existing implementation. As for this change: after reading the code, reading the issue, and playing around, here is my rough plan (in order of steps):
There are a couple of random questions in my mind which I may not worry about too much right now but feel free to comment on:
What do you think? |
Sorry one more question: The hack in |
Yes, this is intentional. There rarely exist use cases where we perform node classification across different node types. The reason we currently restrict it is more due to implementation details, though, as we somehow need to map the indices produced by the underlying PyTorch sampler.

I think your roadmap is super useful. Thanks a lot for setting this up. Regarding your questions:
I think the user still needs to specify the links to compute embeddings for. That's what I originally meant by the
Yes, I think so. In the end, we simply sample
Can you explain what you mean? |
My understanding from https://github.com/snap-stanford/ogb/blob/master/examples/linkproppred/citation2/sampler.py#L17-L41 is that we take the nodes from the start and end of each edge in the batch and then do neighbourhood expansion. But there may be duplicate nodes in the result of this |
I agree. I think the implementation is easier if we do not merge duplicated nodes, and the gains in efficiency may be negligible. This also aligns with the intuition that each example in a batch is isolated from the others. |
Okay agreed. Thanks for the thoughts! |
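As a concrete sketch of the point settled above (my own illustration, not code from the eventual implementation): the seed nodes for neighborhood expansion are simply the concatenated endpoints of the batch edges, duplicates included.

```python
import torch

# A mini-batch of three target edges (top row: sources, bottom row:
# destinations). Nodes 0 and 1 each appear in two different edges.
edge_label_index = torch.tensor([[0, 1, 0],
                                 [1, 2, 3]])

# Seed nodes for neighborhood expansion: the endpoints of every edge in
# the batch, taken as-is. Duplicates are kept, so each example in the
# batch stays isolated from the others.
seed_nodes = edge_label_index.view(-1)
print(seed_nodes.tolist())           # [0, 1, 0, 1, 2, 3]
print(seed_nodes.unique().tolist())  # [0, 1, 2, 3]
```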
Hello, thanks for adding this feature! I have a small question. If I want to use the result of RandomLinkSplit to generate batches with LinkNeighborLoader, it produces an IndexError. I think it may be due to the edge_label_index attribute generated by RandomLinkSplit. The shape of edge_label_index is [2, num_edges]. But in the LinkNeighborLoader, if the split key is an edge attribute, indices are only selected from dimension zero, which causes an IndexError. Do you have some suggestions for this case? Thank you very much! |
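A minimal reconstruction of the reported shape mismatch (hypothetical tensors, just to illustrate the shapes involved):

```python
import torch

# RandomLinkSplit stores the supervision edges as `edge_label_index`
# with shape [2, num_edges]: two ROWS, not num_edges rows.
edge_label_index = torch.tensor([[0, 1, 1, 2, 0, 1],
                                 [1, 0, 2, 1, 3, 3]])

sampled = torch.tensor([0, 2, 4])  # three sampled edge positions

# Indexing along dimension 0, as done for ordinary per-edge attributes
# of shape [num_edges, ...], fails as soon as an index >= 2 appears:
# edge_label_index[sampled]  # -> IndexError
# A [2, num_edges] attribute must instead be sliced along dimension 1:
selected = edge_label_index[:, sampled]
print(selected.tolist())  # [[0, 1, 0], [1, 2, 3]]
```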
Hi @shishixuezi thanks for the report - could you provide a short example of what you're doing exactly and I can take a look to see. |
Hello, @Padarn Thank you for your reply. I created a toy case, please check. Thank you very much!
|
Thanks for the example, I'll take a look as soon as I get a chance.
…On Wed, 11 May 2022, 12:41 pm ssxz wrote:
Hello, @Padarn! Thank you for your reply! I created a toy case, please check! Thank you very much!
```python
import torch
from torch_geometric.data import Data
from torch_geometric.loader import LinkNeighborLoader
import torch_geometric.transforms as T


def main():
    edge_index = torch.tensor([[0, 1, 1, 2, 0, 1, 2],
                               [1, 0, 2, 1, 3, 3, 3]], dtype=torch.long)
    x = torch.tensor([[-1], [0], [1], [4]], dtype=torch.float)
    edge_attr = torch.tensor([[1.0], [2.0], [1.0], [1.0], [1.0], [1.0],
                              [1.0]], dtype=torch.float)
    data = Data(x=x, edge_index=edge_index, edge_attr=edge_attr)
    transform = T.Compose([
        T.NormalizeFeatures(),
        T.ToDevice('cuda' if torch.cuda.is_available() else 'cpu'),
        T.RandomLinkSplit(num_val=0.1, num_test=0.05, is_undirected=False,
                          add_negative_train_samples=False,
                          neg_sampling_ratio=0.0, key='edge_attr'),
    ])
    train_data, val_data, test_data = transform(data)
    # No problem:
    # loader = LinkNeighborLoader(data, num_neighbors=[2] * 2)
    # Causes the IndexError:
    loader = LinkNeighborLoader(train_data, num_neighbors=[2] * 2)
    print(next(iter(loader)))


if __name__ == '__main__':
    main()
```
|
So I see the problem, but I can't yet think of a clean fix. A workaround you could use for now:
I'll raise an MR with a potential fix. |
Fixed via #4629. |
Hi, I'm trying to import the LinkNeighborLoader in JupyterLab but I'm getting this error: ImportError: cannot import name 'LinkNeighborLoader' from 'torch_geometric.loader' (/Users/cbrumar/.local/share/virtualenvs/gnn-TG0lFQrB/lib/python3.9/site-packages/torch_geometric/loader/__init__.py). I'm using PyTorch Geometric version 2.0.4 and I installed it using pip. Edit: I am using PyTorch version 1.11.0. |
You need to install PyG master or from nightly. |
Thank you, @rusty1s! |
Hey! I'm currently working on an edge classification problem in an environment in which I can't use the LinkNeighborLoader due to unrelated constraints. I didn't fully understand the previous workaround. Thanks in advance. |
If you cannot use the new |
Thanks for your response. How can I use the 'PositiveLinkNeighborSampler' in the OGB example to sample a subgraph for specific edges, similarly to 'edge_label_index' in 'LinkNeighborLoader'? |
You can save it in the constructor, initialize |
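The truncated suggestion above can be sketched as follows (my own illustration; `edge_seed_nodes` is a hypothetical helper, not part of the OGB example): for the edges you care about, feed their endpoints as the seed index for neighborhood sampling.

```python
import torch

def edge_seed_nodes(edge_label_index: torch.Tensor,
                    batch: torch.Tensor) -> torch.Tensor:
    # For the edges selected by `batch`, return their endpoints
    # (sources first, then destinations) as the seed nodes for
    # neighborhood sampling, mirroring the ogbl-citation2 trick.
    row, col = edge_label_index[:, batch]
    return torch.cat([row, col], dim=0)

edge_label_index = torch.tensor([[0, 1, 2],
                                 [3, 4, 5]])
batch = torch.tensor([0, 2])  # sample a subgraph for edges 0 and 2
print(edge_seed_nodes(edge_label_index, batch).tolist())  # [0, 2, 3, 5]
```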
@rusty1s When I apply the LinkNeighborLoader on training data with negative samples, it returns target labels 0, 1, and 2. I'm trying to understand how target label 2 shows up, given that there are only two class labels for a link prediction task? Thanks. |
When using |
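The reply above is truncated; my understanding (an assumption about the loader's behavior, not recovered from this page) is that with negative sampling enabled, label 0 is reserved for the sampled negatives, so user-provided labels are shifted up by one:

```python
import torch

# Positive edges carry user-provided binary labels {0, 1}:
edge_label = torch.tensor([0, 1, 1, 0])

# With negative sampling enabled, label 0 is reserved for the sampled
# negatives, so all existing labels are shifted up by one ...
shifted = edge_label + 1  # positives now carry labels {1, 2}

# ... and the sampled negatives are appended with label 0:
negatives = torch.zeros(2, dtype=torch.long)
batch_label = torch.cat([shifted, negatives])
print(batch_label.tolist())  # [1, 2, 2, 1, 0, 0]
```

This would explain seeing the three labels {0, 1, 2} in a batch.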
Hi, While using
it throws me this error: reference code:
torch 2.0.1. What could be the issue? |
What does

```python
import torch
import pyg_lib
print(pyg_lib.__version__)
print(torch.ops.pyg.neighbor_sample)
```

return? |
🚀 The feature, motivation and pitch
Currently, `NeighborLoader` is designed to be applied in node-level tasks, and there exists no option for mini-batching in link-level tasks.

To achieve this, users currently rely on a simple but hacky workaround, first utilized in `ogbl-citation2` in this example. The idea is straightforward and simple: for `input_nodes`, we pass in both the source and destination nodes of every link we want to do link prediction on (both positive and negative). NOTE: This workaround currently only works for homogeneous graphs!

Nonetheless, PyG should provide a dedicated class to perform mini-batching on link-level tasks, re-using functionality from `NeighborLoader` under the hood. An API could look like:
@RexYing @JiaxuanYou