[performance] FeatGraph TVM kernels support #2136
Conversation
* change edge_ids behavior and C++ impl * fix unittests; remove utils.Index in edge_id * pass mx and th tests * pass tf test * add aten::Scatter_ * Add nonzero; impl CSRGetDataAndIndices/CSRSliceMatrix * CSRGetData and CSRGetDataAndIndices passed tests * CSRSliceMatrix basic tests * fix bug in empty slice * CUDA CSRHasDuplicate * has_node; has_edge_between * predecessors, successors * deprecate send/recv; fix send_and_recv * deprecate send/recv; fix send_and_recv * in_edges; out_edges; all_edges; apply_edges * in deg/out deg * subgraph/edge_subgraph * adj * in_subgraph/out_subgraph * sample neighbors * set/get_n/e_repr * wip: working on refactoring all idtypes * pass ndata/edata tests on gpu * fix * stash * workaround nonzero issue * stash * nx conversion * test_hetero_basics except update routines * test_update_routines * test_hetero_basics for pytorch * more fixes * WIP: flatten graph * wip: flatten * test_flatten * test_to_device * fix bug in to_homo * fix bug in CSRSliceMatrix * pass subgraph test * fix send_and_recv * fix filter * test_heterograph * passed all pytorch tests * fix mx unittest * fix pytorch test_nn * fix all unittests for PyTorch * passed all mxnet tests * lint * fix tf nn test * pass all tf tests * lint * lint * change deprecation * try fix compile * lint * update METIDS * fix utest * fix * fix utests * try debug * revert * small fix * fix utests
* upd * upd * upd * fix * upd * upd * upd * upd * upd * trigger * +1s
* mutation add_nodes and add_edges * Add support for remove_edges, remove_nodes, add_selfloop, remove_selfloop * Fix Co-authored-by: Ubuntu <[email protected]>
* add nodesy * All three * Fix * lint * Add some test case * Fix * Fix * Fix * Fix * Fix * Fix * fix * triger * Fix * fix Co-authored-by: Ubuntu <[email protected]>
update sparse, add cache for partitioned graphs, pass tests
Congratulations, and thanks for your contribution to this PR.
I've left some comments; please address these concerns.
I suggest preserving the old kernel code instead of deleting it. When users want to use TVM kernels, the gspmm call would be dispatched to gspmm_featgraph, and likewise for gsddmm.
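A minimal sketch of that dispatch idea, assuming a hypothetical opt-in flag and stubbed kernel bodies; none of the names below are DGL's actual API, they only illustrate how gspmm could route to gspmm_featgraph:

# Illustrative only: USE_TVM_KERNELS, _gspmm_builtin and the signatures here
# are placeholders, not real DGL symbols.
USE_TVM_KERNELS = False  # hypothetical user-facing switch for TVM kernels

def _gspmm_builtin(gidx, op, reduce_op, ufeat, efeat):
    # The existing built-in kernel would be invoked here.
    raise NotImplementedError

def gspmm_featgraph(gidx, op, reduce_op, ufeat, efeat):
    # The FeatGraph/TVM-compiled kernel would be invoked here.
    raise NotImplementedError

def gspmm(gidx, op, reduce_op, ufeat, efeat):
    """Dispatch g-SpMM to the FeatGraph kernel when the user opts in."""
    if USE_TVM_KERNELS:
        return gspmm_featgraph(gidx, op, reduce_op, ufeat, efeat)
    return _gspmm_builtin(gidx, op, reduce_op, ufeat, efeat)

# gsddmm would be dispatched to gsddmm_featgraph in the same way.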
# pass edge_mapping to tvm only when array packing will be used
use_idx = edge_shuffled and num_feat_partitions > 1 and not use_bcast and use_e
f_input = [indptr, indices]
key = (num_rows, num_cols, nnz, op, reduce_op, u_shp, e_shp, use_idx, \
Used for indexing, should find a more elegant way.
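For context, a key tuple like the one above typically serves to memoize compiled kernels; a hedged sketch of that pattern follows (the cache dict and helper name are hypothetical, not code from this PR):

# Hypothetical memoization helper; _COMPILED_KERNELS and get_or_compile are
# illustrative names, not symbols from python/dgl/sparse.py.
_COMPILED_KERNELS = {}

def get_or_compile(key, compile_fn):
    """Return the kernel cached under `key`, compiling it on the first miss."""
    if key not in _COMPILED_KERNELS:
        _COMPILED_KERNELS[key] = compile_fn()
    return _COMPILED_KERNELS[key]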
python/dgl/sparse.py
Outdated
# num_row_partitions, num_col_partitions = 2, 2
# num_feat_partitions = 2
if target == 'cuda' and (num_row_partitions > 1 or num_col_partitions > 1 or num_feat_partitions > 1):
    print('Partitioning not supported on GPU')
Replace this with logging.
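A sketch of what the suggested change could look like, using the standard-library logging module; the wrapper function and logger name are illustrative, not the actual code in this PR:

import logging

logger = logging.getLogger(__name__)

def _check_gpu_partitioning(target, num_row_partitions, num_col_partitions,
                            num_feat_partitions):
    # Warn through the logger, rather than print, when partitioning is
    # requested for a CUDA target.
    if target == 'cuda' and (num_row_partitions > 1 or num_col_partitions > 1
                             or num_feat_partitions > 1):
        logger.warning('Partitioning not supported on GPU')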
else:
    raise NotImplementedError
# check if used for sampling
generic_shape = nnz == 0 and num_rows == 0 and num_cols == 0
If the user would like to deal with a generic shape, do nnz, num_rows, and num_cols all need to be set to zero?
Right. We can use another flag, if you prefer.
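One possible shape of the "another flag" idea, sketched as a hypothetical explicit keyword argument instead of the all-zeros convention; the function and its return value are purely illustrative:

# Hypothetical alternative: an explicit generic_shape flag instead of
# encoding it as nnz == num_rows == num_cols == 0. Names are illustrative.
def get_kernel_spec(num_rows, num_cols, nnz, generic_shape=False):
    """Return (shape, is_generic) describing how the kernel should be built."""
    if generic_shape:
        # Symbolic shapes: one kernel reused across workloads (e.g. sampling).
        return None, True
    # Concrete shapes: kernel specialized to this exact workload.
    return (num_rows, num_cols, nnz), False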
fix some type bugs; eliminate reshape operator
make dimensions of the feature tensor have the same type; minor fixes
LGTM, @jermainewang I think this PR is ready to be merged.
Let's finish the remaining stuff in future PRs:
- support sampling.
- support fp16.
- support GE-SpMM.
Things to do before merging this PR.
After discussion, we decided to use an AOT solution (see #2367).
Description
This branch merges the FeatGraph TVM kernels and implements the functions necessary to use them. When complete, it should be able to compile optimal kernels for a specific workload just-in-time through TVM, performing partitioning when necessary.
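For readers unfamiliar with the just-in-time flow, here is a minimal, self-contained TVM example of compiling a kernel on demand; the toy elementwise kernel below is only an illustration, not one of the FeatGraph kernels added by this PR, and it assumes a TVM build exposing the te API (as was current when this PR was written):

import numpy as np
import tvm
from tvm import te

# Declare a tiny elementwise kernel symbolically ...
n = te.var('n')
a = te.placeholder((n,), name='a', dtype='float32')
b = te.placeholder((n,), name='b', dtype='float32')
c = te.compute((n,), lambda i: a[i] + b[i], name='c')

# ... and compile it just-in-time for a CPU target.
s = te.create_schedule(c.op)
kernel = tvm.build(s, [a, b, c], target='llvm')

x = tvm.nd.array(np.arange(4, dtype='float32'))
y = tvm.nd.array(np.ones(4, dtype='float32'))
z = tvm.nd.array(np.empty(4, dtype='float32'))
kernel(x, y, z)  # z now holds x + y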
Checklist