[RFC] Support for Sparse Computation #4332
Comments
Comments and discussion welcome! @cylinbao @yuluny2 @tmoreau89 @Huyuwei @yzh119 @tqchen
cc @sf-wind
Thanks @ZihengJiang for bringing up the RFC.
Thanks @ZihengJiang for bringing up the RFC, especially the in-depth thinking that went into bringing in the representation from TACO.
Another challenge we have to consider: depth-wise convolution is very useful in modern neural networks, but it is hard to introduce sparsity into depth-wise convolution operators, so sparse depth-wise convolution will be very challenging to support.
Those are all great questions @liangfu. Question 3 is an interesting one w.r.t. what kinds of scheduling primitives we'll need for sparse operators. One easy workaround is to apply vectorization along a dense tensor dimension if there is one. For many practical examples, tensors won't be sparse along all dimensions. But the question gets trickier when that assumption does not hold. And due to the dynamism of varying idx/val arrays, this can create an interesting discussion around how AutoTVM will be used on more dynamic operators. Right now TOPI is built around the assumption that shapes are known statically. This will be an interesting implementation challenge, which probably deserves its own RFC. Perhaps @ZihengJiang the
Thanks for the proposal. In scipy sparse, the csr_matrix has a flag indicating whether the column indices are sorted per row. Such a state may affect the performance of certain operations on the matrix. Will this information be stored in the proposed format?
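For reference, here is a small illustration of the scipy flag in question (the example matrix is arbitrary):

```python
import numpy as np
from scipy.sparse import csr_matrix

# csr_matrix tracks whether column indices are sorted within each row.
a = csr_matrix(np.array([[0, 2, 1], [3, 0, 0]], dtype="float32"))
print(a.has_sorted_indices)   # typically True for a freshly constructed matrix

# Some routines run faster when indices are sorted; sort_indices() restores
# the invariant (and sets the flag) if the index arrays were manipulated.
a.sort_indices()
```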
Hi there! What's the state of this issue? |
A significant driver of progress in deep learning has been advances in computational resources. Since those resources are often limited, there is a trend to replace dense computation in DNNs with sparse computation to speed up execution and save memory, enabling larger models; examples include neural network pruning and sparse transformers. Some new workloads like GNNs also rely on sparse support. It would be great if TVM could represent sparse computation workloads.
There already exists some sparse support in TVM. Overall, it uses dense tensors to describe sparse CSR/BSR tensors on top of the existing Tensor DSL, as in https://github.com/apache/incubator-tvm/blob/master/topi/python/topi/nn/sparse.py.
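As a rough illustration of that approach (not the exact topi code; the shapes and names below are made up, using the te API), the CSR operand is just three unrelated dense placeholders:

```python
import tvm
from tvm import te

m = 128                      # number of rows of the sparse operand
nnz = te.var("nnz")          # number of non-zeros, only known at runtime

w_data = te.placeholder((nnz,), dtype="float32", name="w_data")      # non-zero values
w_indices = te.placeholder((nnz,), dtype="int32", name="w_indices")  # column of each value
w_indptr = te.placeholder((m + 1,), dtype="int32", name="w_indptr")  # row start offsets

# Any sparse kernel has to hand-write the indptr/indices traversal;
# the CSR format itself is invisible to TVM's IR.
```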
However, this approach has some obvious drawbacks. This RFC would like to discuss how to add native sparse support in TVM.
Sparse Workloads
Here are some sparse workloads that we would like to keep in mind and take into consideration during the design.
From the above workloads, we can summarize some requirements that our sparse support needs to meet.
After some investigation, we found that the tree hierarchy representation used by TACO and ExTensor is a good candidate.
The Tree Hierarchy Representation
The tree hierarchy representation can represent tensors of any order by constructing formats from a bounded number of primitives, e.g., specifying whether each dimension is dense or sparse. (TACO also supports many other level types like range, hash, etc., but we can expand to those in the future depending on demand.) With this approach, a CSR matrix can be represented as `SparseTensor([Dense, Sparse])`, RSR as `SparseTensor([Sparse, Dense])`, and BSR as `SparseTensor([Dense, Dense, Sparse, Sparse])`.
We can see that, with the tree hierarchy representation, a general sparse tensor is actually composed of several dense arrays:
- `A_val` is used to represent the non-zero elements of tensor A.
- `Ai_size` is used to represent the size of tensor A's i-th dimension.
- `Ai_pos` and `Ai_idx` together form a segmented vector with one segment per entry in the previous dimension (parent node in the tree). The `Ai_idx` array stores all the non-zero indices in the dimension, while the `Ai_pos` array stores the location in the idx array where each segment begins.

Understanding the Representation with Examples
Here we use a 2D case to show how a sparse tensor is represented under different formats:
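As a concrete stand-in for the example, take an arbitrary 2x3 matrix; in the (Dense, Sparse) format, i.e. CSR, it decomposes into the arrays below (plain NumPy, names following the convention above):

```python
import numpy as np

# Example matrix (values chosen arbitrarily):
# A = [[6, 0, 9],
#      [0, 0, 8]]
A = np.array([[6, 0, 9],
              [0, 0, 8]])

# (Dense, Sparse), a.k.a. CSR: dimension 0 is dense, dimension 1 is sparse.
A0_size = 2                        # dense dim: only its size is stored
A1_pos  = np.array([0, 2, 3])      # row i's entries live in idx/val[pos[i]:pos[i+1]]
A1_idx  = np.array([0, 2, 2])      # column index of each stored entry
A_val   = np.array([6, 9, 8])      # the non-zero values

# Sanity check: rebuild the dense matrix from the arrays.
B = np.zeros((A0_size, 3), dtype=A.dtype)
for i in range(A0_size):
    for p in range(A1_pos[i], A1_pos[i + 1]):
        B[i, A1_idx[p]] = A_val[p]
assert (A == B).all()
```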
Implementation
Format Declaration
A tuple-like data structure can be introduced to declare a format by specifying the sparsity of each dimension:
SparseFormat([Dense, Sparse])
Sparse Tensor
As a counterpart of the original dense `Tensor`, a `SparseTensor` class is a symbolic representation for a sparse tensor, used during sparse code generation and composed of `pos_arrs`, `idx_arrs`, and `val_arr`.
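A minimal sketch of what such a container could look like; everything here is hypothetical, since the concrete class design is exactly what this RFC is asking about:

```python
from typing import List, Optional

# Hypothetical sketch only: none of these classes exist in TVM today.
Dense, Sparse = "dense", "sparse"

class SparseFormat:
    """Per-dimension sparsity, e.g. SparseFormat([Dense, Sparse]) for CSR."""
    def __init__(self, dim_types: List[str]):
        self.dim_types = dim_types

class SparseTensor:
    """Symbolic sparse tensor: a pos/idx array per sparse dimension plus one val array."""
    def __init__(self, shape, sformat: SparseFormat, dtype: str = "float32"):
        self.shape = shape
        self.sformat = sformat
        self.dtype = dtype
        # Entries stay None for dense dimensions; sparse dimensions get
        # symbolic 1-D arrays bound at build time.
        self.pos_arrs: List[Optional[object]] = [None] * len(shape)
        self.idx_arrs: List[Optional[object]] = [None] * len(shape)
        self.val_arr = None   # flat array of non-zero values
```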
DSL Enhancement
To enhance the existing DSL with the ability to declare sparse computation, here are some approaches we can try.
One approach adds new operators like `SparsePlaceholder` and `SparseComputeOp`, with an additional `sformat` field compared with the original operations.
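A hypothetical usage sketch (pseudocode: `SparsePlaceholder`, `sparse_compute`, and `SparseFormat` do not exist in TVM, and the shapes are made up) for a CSR-times-dense matmul:

```python
# Pseudocode sketch only -- these APIs are the proposal, not existing TVM.
from tvm import te

n, k, m = 128, 256, 64
csr = SparseFormat([Dense, Sparse])                    # hypothetical, see above

W = SparsePlaceholder((n, k), sformat=csr, name="W")   # hypothetical sparse operand
X = te.placeholder((k, m), name="X")                   # ordinary dense operand

r = te.reduce_axis((0, k), name="r")
# A SparseComputeOp would generate loops over W's stored entries instead of
# the full 0..k reduction, guided by W's sformat.
Y = sparse_compute((n, m), lambda i, j: te.sum(W[i, r] * X[r, j], axis=r))
```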
decl_buffer
We can also enhance `decl_buffer` so that the user can declare a sparse buffer and bind it to a tensor while building. One thing to note is that this approach separates the sparse property from the computation declaration. It sounds cool that we can represent dense and sparse computation with the same declaration, but it also means that we don't have that information until scheduling.
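For comparison, this is roughly what the existing dense `decl_buffer` + bind flow looks like with today's te API; under the proposal, the buffer declaration would additionally carry the sparse format:

```python
import tvm
from tvm import te

n = 16
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)

# Buffer-level properties are attached here and only bound at lowering time,
# which is why a sparse flag on the buffer would be invisible to the
# computation declaration itself.
Ab = tvm.tir.decl_buffer(A.shape, A.dtype, name="A_buf")
mod = tvm.lower(s, [A, B], binds={A: Ab})
```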
Extension for Scheduling
TODO