Integrate renumbering and compression to cugraph-dgl to accelerate MFG creation #3887
Conversation
This has no C/C++ updates, so approving without review.
Using batch_size: 128, batches_per_partition: 50. [benchmark table comparing COO vs. CSC]
Some minor changes but mostly looks great. Thanks again @tingyu66 for the great work here.
batch_size: int = 1024,
drop_last: bool = False,
@tingyu66, can we also change seeds_per_call to 100_000 to make a better default, based on your testing?
Should we change it to the default value of BulkSampler, 200_000? After our call the other day, I tested a wide range of seeds_per_call values and none of the runs threw an OOM error.
Interesting. If it just works, we can probably set the default to None and let upstream handle it? What do you think? Any default that is reasonable and just works is fine by me.
I updated the value to 200_000 in 45f93f2 to align with BulkSampler. I did not set it to None to avoid the extra step of handling the None case.
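As context for the discussion above: my understanding is that seeds_per_call caps how many seed nodes are handed to a single sampling call, which is why larger values risk OOM. Below is a minimal, self-contained sketch of that chunking idea; it illustrates the concept only and is not cugraph-dgl's actual implementation.

```python
# Illustration of what a seeds_per_call cap means: seed nodes are split into
# chunks of at most `seeds_per_call` per sampling call, trading peak memory
# for a larger number of calls. Conceptual sketch only, not cugraph-dgl code.
import torch

def chunk_seeds(seeds: torch.Tensor, seeds_per_call: int = 200_000):
    """Yield successive slices of `seeds`, each at most `seeds_per_call` long."""
    for start in range(0, len(seeds), seeds_per_call):
        yield seeds[start : start + seeds_per_call]

seeds = torch.arange(450_000)
print([len(c) for c in chunk_seeds(seeds)])  # [200000, 200000, 50000]
```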
major_offsets = df.major_offsets.dropna().values
label_hop_offsets = df.label_hop_offsets.dropna().values
renumber_map_offsets = df.renumber_map_offsets.dropna().values
renumber_map = df.map.dropna().values
minors = df.minors.dropna().values
I think this assumes that the length of the renumber_map is smaller than that of major_offsets. I will check this again if possible.
I don't think we are making any assumptions here. renumber_map and major_offsets are simply two different tensors that happen to be stored in a single df.
Ahh, my bad. I just re-read the code below and I think we should be fine.
For each batch:
- The length of renumber_map = number of distinct nodes (i.e., the number of sources in hop 0)
- The length of major_offsets = the total number of destination nodes across all hops
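To make the point above concrete, here is a small self-contained sketch of the storage pattern under discussion, using pandas as a stand-in for cudf and made-up numbers: columns of different lengths share one dataframe, the shorter ones padded with nulls, and dropna() recovers each tensor at its own length.

```python
# Sketch of the storage pattern the reviewers are discussing: tensors of
# different lengths (e.g. renumber_map vs. major_offsets) live in one
# dataframe, with the shorter columns padded by NaN/nulls. pandas stands in
# for cudf here, and the numbers are made up for illustration.
import numpy as np
import pandas as pd

minors = np.array([3, 1, 4, 1, 5, 9, 2, 6])    # longest column sets the frame length
major_offsets = np.array([0, 2, 5, 8])          # per-major CSC offsets
renumber_map = np.array([10, 42, 7, 99, 13])    # distinct nodes in this batch

n = len(minors)
df = pd.DataFrame({
    "minors": minors.astype("float64"),
    "major_offsets": np.pad(major_offsets.astype("float64"),
                            (0, n - len(major_offsets)), constant_values=np.nan),
    "map": np.pad(renumber_map.astype("float64"),
                  (0, n - len(renumber_map)), constant_values=np.nan),
})

# dropna() recovers each tensor at its own length -- no assumption that one
# column is shorter than another, as noted in the conversation above.
assert len(df.major_offsets.dropna().values) == 4
assert len(df["map"].dropna().values) == 5
assert len(df.minors.dropna().values) == 8
```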
👍
Curious: what is the batch size being tested here, @tingyu66?
batch_size: 128, batches_per_partition: 50.
LGTM
/merge
Allow cugraph-dgl dataloader to consume sampled outputs from BulkSampler in CSC format.
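As background on the CSC format mentioned here: DGL can construct an MFG block directly from CSC-style (indptr, indices) tensors via dgl.create_block. The sketch below uses made-up tensors and is only meant to show the target API, not the conversion code added in this PR.

```python
# Sketch: building a single MFG block from CSC-style tensors with DGL.
# The numbers are made up; this shows the dgl.create_block API that the
# CSC-format sampler output ultimately feeds, not this PR's conversion code.
import torch
import dgl

# CSC layout: indptr has one entry per destination node plus one; indices
# hold the (renumbered) source node of each incoming edge.
indptr = torch.tensor([0, 2, 3, 5])         # 3 destination nodes
indices = torch.tensor([0, 4, 1, 2, 3])     # 5 edges, renumbered source IDs
eids = torch.tensor([], dtype=torch.int64)  # empty: let DGL assign edge IDs

block = dgl.create_block(
    ("csc", (indptr, indices, eids)),
    num_src_nodes=5,
    num_dst_nodes=3,
)
print(block.num_src_nodes(), block.num_dst_nodes(), block.num_edges())  # 5 3 5
```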