sample doesn't work correctly when nodes are repeated in input #208
Comments
Can you explain a bit more on the first issue? Not yet sure I understand the issue. I think the second issue is not an issue at all since you need to take the remapping of node indices into account. That is node 0 stays node 0, but node 1 becomes node 2 (as indicated by the first output |
ohh I see, the row/col are relative to the nodes in the output? If so, then agreed. And if this is the case.. then actually the first issue is also a non-issue. Sorry for the false alarm! |
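To make the remapping concrete, here is a minimal sketch in plain Python of how local (row, col) indices can be translated back to original node ids. The helper name to_global_edges and the values below are hypothetical and chosen only for illustration; they are not the actual output from this issue.

```python
def to_global_edges(node_ids, row, col):
    """Translate local (row, col) indices into original node ids.

    node_ids[i] is assumed to hold the original id of the node that the
    sampler placed at local position i (hypothetical layout).
    """
    return [(node_ids[r], node_ids[c]) for r, c in zip(row, col)]


# Hypothetical sampler output: original node 1 ends up at local position 2.
node_ids = [0, 2, 1]
row = [2, 0]
col = [0, 2]

# Local edges (2, 0) and (0, 2) refer to original nodes (1, 0) and (0, 1).
global_edges = to_global_edges(node_ids, row, col)
print(global_edges)
```

Read this way, a local pair such as (2, 0) does not have to appear in the original adjacency matrix; only its translated pair does.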
Here is an example of what I am seeing
returns a data object with
I'm not convinced this is correct. I see that the edges are correct in the sense that no new connectivity is added, but this subgraph has a higher 'in degree' for node Thoughts? |
Interesting. I think this is correct in a sense that there is no bug in the code. Notably, the in-degree statistics are also correct which ensures that GNN ops should work correctly as well. It is nonetheless indeed a bit weird that we share data across duplicated input nodes (and we may want to eventually fix that), but it shouldn't block us from implementing a link-level neighbor loader. |
True, the in degree is unaffected. Though it may result in some strange
behavior if anyone's model is adding self loops or making the graph
undirected (not sure if anyone would), it also makes it a tad ambiguous
which edge you are making a prediction for and would affect the loss
somewhat.
But agreed, maybe they can move forward in parallel at least. I think I can
see why this is happening in torch_sparse and if you don't mind I can try
to fix it.
…On Tue, 15 Mar 2022, 11:52 pm Matthias Fey, ***@***.***> wrote:
Interesting. I think this is correct in a sense that there is no bug in
the code. Notably, the in-degree statistics are also correct which ensures
that GNN ops should work correctly as well.
It is nonetheless indeed a bit weird that we share data across duplicated
input nodes (and we may want to eventually fix that), but it shouldn't
block us from implementing a link-level neighbor loader.
|
Sounds good to me. Thanks for being so careful! Currently, the |
Good thought. I'll probably not have time until the weekend for the changes on this repo, will take a look then. |
Do you think it's fair to say that |
Yes, that is correct. If we think about the message passing flow of a GNN, we actually start sampling from our destination nodes and sample new source nodes. |
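The direction described here, where seeds act as destinations and the sampler picks sources for them, can be sketched with a toy one-hop sampler in plain Python. This is an illustrative assumption, not the torch_sparse implementation: the function sample_one_hop, the in_neighbors dictionary layout, and the example graph are all made up for the sketch.

```python
import random


def sample_one_hop(in_neighbors, seeds, num_neighbors, rng):
    """For each seed, treated as a destination, pick up to
    `num_neighbors` of its in-neighbors as sources.

    Returns a list of (src, dst) pairs.
    """
    edges = []
    for dst in seeds:
        candidates = in_neighbors.get(dst, [])
        k = min(num_neighbors, len(candidates))
        for src in rng.sample(candidates, k):
            edges.append((src, dst))
    return edges


# Toy graph: in_neighbors[v] lists the nodes that send messages to v.
in_neighbors = {0: [1, 2], 1: [2], 2: []}
edges = sample_one_hop(in_neighbors, seeds=[0, 1], num_neighbors=2,
                       rng=random.Random(0))
print(edges)
```

Every returned edge ends in a seed node, which matches the message passing flow: messages travel from the newly sampled sources into the destinations we started from.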
Yep makes sense. I figured that out after a lot of head scratching but just
wanted to make sure. Cheers. A
…On Sat, 19 Mar 2022, 11:17 pm Matthias Fey, ***@***.***> wrote:
Yes, that is correct. If we think about the message passing flow of a GNN,
we actually start sampling from our destination nodes and sample new source
nodes.
|
Closing this one as we've agreed not to do anything about it in pytorch_sparse, thanks @rusty1s |
pytorch_sparse/csrc/cpu/neighbor_sample_cpu.cpp
Line 108 in 3ec1eac
Working on pyg-team/pytorch_geometric#4026 I discovered that this sampling function does not work properly when the input_node array has duplicates. In this case the number of samples can be > number of edges and the indexing doesn't do what you'd expect it to.
Another strange behaviour is when using the function as
The returned values are
but the (row, col) combination (2, 0) and (0, 2) don't make sense, they're not part of the original adjacency matrix.
I'm happy to dive in a bit deeper and figure out a fix (for at least the first issue; I am not yet sure whether the second issue is actually a problem), but thought I'd check first if this is known or maybe my understanding is incorrect.
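As a caller-side illustration of the duplicate problem, one possible workaround (an assumption, not something agreed on in this thread) is to deduplicate the seed nodes before sampling and keep an inverse index so that results can be mapped back to every original position. The helper dedup_seeds below is hypothetical and written in plain Python.

```python
def dedup_seeds(input_node):
    """Return the unique seeds in first-seen order, plus, for each
    original position, the index of its seed in the unique list."""
    first_index = {}
    unique = []
    inverse = []
    for n in input_node:
        if n not in first_index:
            first_index[n] = len(unique)
            unique.append(n)
        inverse.append(first_index[n])
    return unique, inverse


unique, inverse = dedup_seeds([0, 1, 0, 3, 1])
print(unique)   # [0, 1, 3]
print(inverse)  # [0, 1, 0, 2, 1]
```

The sampler would then only see unique, and per-seed outputs can be expanded back with inverse, so repeated seeds read the same sampled neighbourhood explicitly rather than implicitly.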