-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Enable using cugraph uniform sampling in multi client environments #3184
[FEA] Enable using cugraph uniform sampling in multi client environments #3184
Conversation
Codecov ReportBase: 55.22% // Head: 55.30% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## branch-23.02 #3184 +/- ##
================================================
+ Coverage 55.22% 55.30% +0.07%
================================================
Files 148 148
Lines 9543 9573 +30
================================================
+ Hits 5270 5294 +24
- Misses 4273 4279 +6
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
That bug is also fixed in 3148 |
Could you add some documentation? Is setting |
Just a code comment is probably sufficient, maybe link to dask documentation. |
Done, added both. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks. Just had a few questions below.
python/cugraph/cugraph/dask/sampling/uniform_neighbor_sample.py
Outdated
Show resolved
Hide resolved
python/cugraph/cugraph/dask/sampling/uniform_neighbor_sample.py
Outdated
Show resolved
Hide resolved
python/cugraph/cugraph/dask/sampling/uniform_neighbor_sample.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
marking as "request changes" to prevent accidental merging while past feedback is being addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/merge |
This PR makes `cugraph.structure.graph_classes` dask serializable and de-serializable. This means that we can cleanly use dask to transmit graph objects b/w different clients without much work and using all ready supported API. Depends on PR: ~#3184 (PR is IN) and closes #3193 See docs here: https://distributed.dask.org/en/stable/publish.html **On client 0** ```python3 print(type(g)) client.publish_dataset(mg_graph_for_sampling=g) ``` ``` <class 'cugraph.structure.graph_classes.MultiGraph'> ``` **On client 1** ```python3 g = client_worker_1.get_dataset('mg_graph_for_sampling') type(g) ``` ``` cugraph.structure.graph_classes.MultiGraph ``` CC: @rlratzel Authors: - Vibhu Jawa (https://github.com/VibhuJawa) Approvers: - Rick Ratzel (https://github.com/rlratzel) - Erik Welch (https://github.com/eriknw) URL: #3192
In multi client environments our sampling algorithms tend to hang, this PR adds a lock to prevent that from happening. This PR closes #3186
See notebooks here with MRE to reproduce.
Before PR (HANG):
After PR (NO HANG):
I have not added tests for this here because mocking dask with multi client environment is not possible in CI yet so we will do this as future work.
Also includes a bug fix in our sampling code path.