-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW]Add remote storage support #2859
[REVIEW]Add remote storage support #2859
Conversation
…to graph creation extensions, changed how unloading extensions work to unload by unique module name, make numpy and cupy optional modules in the client, updated Value and ValueWrapper to handle doubles and lists (recursive), added tests for new extension changes.
…one, added test for extension modules returning results to client GPUs (code not done yet).
…2-cugraph_service_extension_enhancements
…(), updated tests, added test for upcoming ability to load extensions via import paths.
…kage path (eg. import foo.bar.baz), updated test.
…ns, added tests for error conditions, added benchmark for call_extension test.
…2-cugraph_service_extension_enhancements
…raph into add_remore_storage_support
rerun tests |
Codecov ReportBase: 62.76% // Head: 60.75% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## branch-22.12 #2859 +/- ##
================================================
- Coverage 62.76% 60.75% -2.01%
================================================
Files 117 122 +5
Lines 6451 6824 +373
================================================
+ Hits 4049 4146 +97
- Misses 2402 2678 +276
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor things, looks fine to me otherwise.
python/cugraph/cugraph/gnn/dgl_extensions/cugraph_service_store.py
Outdated
Show resolved
Hide resolved
"are more than one node types." | ||
) | ||
) | ||
ntype = ntypes[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does ntype
get used after this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, Yup, this was a bug. I now pass it down here.
cugraph/python/cugraph/cugraph/gnn/dgl_extensions/feature_storage.py
Lines 146 to 151 in d24ab52
result = self.pg.get_vertex_data( | |
vertex_ids=indices, columns=self.columns, types=self.types_to_fetch | |
) | |
else: | |
result = self.pg.get_edge_data( | |
edge_ids=indices, columns=self.columns, types=self.types_to_fetch |
We previously had a bug in PG which got fixed by this #2523 but i missed linking it so this slipped through.
I have added tests here:
cugraph/python/cugraph/cugraph/tests/test_dgl_extension_graph_store.py
Lines 391 to 423 in d24ab52
def test_get_node_storage_ntypes(): | |
node_ser = cudf.Series([1, 2, 3]) | |
feat_ser = cudf.Series([1.0, 1.0, 1.0]) | |
df = cudf.DataFrame({"node_ids": node_ser, "feat": feat_ser}) | |
pg = PropertyGraph() | |
gs = CuGraphStore(pg, backend_lib="cupy") | |
gs.add_node_data(df, "node_ids", ntype="nt.a") | |
node_ser = cudf.Series([4, 5, 6]) | |
feat_ser = cudf.Series([2.0, 2.0, 2.0]) | |
df = cudf.DataFrame({"node_ids": node_ser, "feat": feat_ser}) | |
gs.add_node_data(df, "node_ids", ntype="nt.b") | |
# All indices from a single ntype | |
output_ar = gs.get_node_storage(key="feat", ntype="nt.a").fetch([1, 2, 3]) | |
cp.testing.assert_array_equal(cp.asarray([1, 1, 1], dtype=cp.float32), output_ar) | |
# Indices from other ntype are ignored | |
output_ar = gs.get_node_storage(key="feat", ntype="nt.b").fetch([1, 2, 5]) | |
cp.testing.assert_array_equal(cp.asarray([2.0], dtype=cp.float32), output_ar) | |
def test_get_edge_storage_gs(dataset1_CuGraphStore): | |
etype = "('user', 'relationship', 'user')" | |
fs = dataset1_CuGraphStore.get_edge_storage("relationships_k", etype) | |
relationship_t = fs.fetch([6, 7, 8], device="cuda") | |
relationships_df = create_df_from_dataset( | |
dataset1["relationships"][0], dataset1["relationships"][1] | |
) | |
cudf_ar = relationships_df["relationship_type"].iloc[[0, 1, 2]].values | |
assert cp.allclose(cudf_ar, relationship_t) |
) | ||
) | ||
|
||
etype = etypes[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does etype
get used after this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests here, similar to ntype.
cugraph/python/cugraph/cugraph/tests/test_dgl_extension_graph_store.py
Lines 413 to 423 in d24ab52
def test_get_edge_storage_gs(dataset1_CuGraphStore): | |
etype = "('user', 'relationship', 'user')" | |
fs = dataset1_CuGraphStore.get_edge_storage("relationships_k", etype) | |
relationship_t = fs.fetch([6, 7, 8], device="cuda") | |
relationships_df = create_df_from_dataset( | |
dataset1["relationships"][0], dataset1["relationships"][1] | |
) | |
cudf_ar = relationships_df["relationship_type"].iloc[[0, 1, 2]].values | |
assert cp.allclose(cudf_ar, relationship_t) |
python/cugraph/cugraph/gnn/dgl_extensions/cugraph_service_store.py
Outdated
Show resolved
Hide resolved
…e.py Co-authored-by: Rick Ratzel <[email protected]>
…Jawa/cugraph into add_remore_storage_support
@rlratzel , All reviews have been addressed. Please give this another look |
|
||
# TODO: Cant send dlpack or cupy arrays or numpys arrays | ||
# through extensions | ||
# Ask Rick |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should create an issue for this and mention it here, remove "ask Rick"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved with bf49417
edge_data = self.__client.get_graph_edge_data( | ||
id_or_ids=edge_ids or -1, | ||
id_or_ids=ids, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason for this change? There is no functional change here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This throws an error when you have array like objects . I dont like or
when being used with objects that can be array like.
import numpy as np
np.asarray([1,2]) or -1
Traceback (most recent call last):
File "main.py", line 2, in <module>
np.asarray([1,2]) or -1
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
python/cugraph/cugraph/tests/test_dgl_extension_remote_wrappers.py
Outdated
Show resolved
Hide resolved
…s.py Co-authored-by: Alex Barghi <[email protected]>
…Jawa/cugraph into add_remore_storage_support
@alexbarghi-nv , @rlratzel , Addressed all requested reviews, let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@gpucibot merge |
Add remote CuGraphRemoteStore support to enable using of graph service with DGL .
Tests:
This PR depends upon:
#2855
#2850