-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[sparse] slice for csr on two dimensions, cpu implementation #8331
Conversation
src/operator/tensor/matrix_op-inl.h
Outdated
CHECK_EQ(in_attrs->size(), 1); | ||
CHECK_EQ(out_attrs->size(), 1); | ||
const SliceAxisParam& param = nnvm::get<SliceAxisParam>(attrs.parsed); | ||
const auto& in_stype = in_attrs->at(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for & if in_stype is not changed.
src/operator/tensor/matrix_op-inl.h
Outdated
|
||
template<typename xpu> | ||
void SliceAxisEx(const nnvm::NodeAttrs& attrs, | ||
const OpContext& ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
src/operator/tensor/matrix_op-inl.h
Outdated
|
||
const SliceAxisParam& param = nnvm::get<SliceAxisParam>(attrs.parsed); | ||
auto in_stype = inputs[0].storage_type(); | ||
CHECK_NE(in_stype, kDefaultStorage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this check and print operator_info(ctx, ..)
in line 1060
src/operator/tensor/matrix_op-inl.h
Outdated
} else if (param.axis == 1) { | ||
SliceAxisOneCsrImpl<xpu>(param, ctx, inputs[0], req[0], outputs[0]); | ||
} else { | ||
LOG(FATAL) << "CSRNDArray is only for 2-D shape"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it fail with negative axis? I think GetSliceAxisParams already handles negative axis for you
@@ -127,9 +127,27 @@ def check_slice_nd_csr_fallback(shape): | |||
result_dense = mx.nd.slice(mx.nd.array(A2), begin=(start, shape[1] - 1), end=(end + 1, shape[1])) | |||
assert same(result_dense.asnumpy(), result.asnumpy()) | |||
|
|||
shape = (rnd.randint(2, 10), rnd.randint(1, 10)) | |||
def check_sparse_nd_csr_slice_axis(shape): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's also add some test cases for negative axis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
src/operator/tensor/matrix_op-inl.h
Outdated
CHECK_NE(req, kWriteInplace) << "kWriteInplace for SliceAxis on CSR input is not supported"; | ||
int axis, begin, end; | ||
GetSliceAxisParams(param, in.shape(), &axis, &begin, &end); | ||
int indptr_len = in.shape()[0] + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use nnvm::dim_t
(int64_t) instead because shape[i] is 64 bits
src/operator/tensor/matrix_op-inl.h
Outdated
|
||
template<typename xpu> | ||
void SliceAxisOneCsrImpl(const SliceAxisParam ¶m, const OpContext& ctx, | ||
const NDArray &in, OpReqType req, const NDArray &out) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
src/operator/tensor/matrix_op-inl.h
Outdated
RType *out_indptr = out.aux_data(kIndPtr).dptr<RType>(); | ||
int nnz = 0; | ||
out_indptr[0] = 0; | ||
for (int i=0; i < indptr_len - 1; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also use nnvm::dim_t for i and j
src/operator/tensor/matrix_op-inl.h
Outdated
for (int i=0; i < indptr_len - 1; i++) { | ||
out_indptr[i+1] = out_indptr[i]; | ||
for (int j=in_indptr[i]; j < in_indptr[i+1]; j++) { | ||
if (in_idx[j] >= begin && in_idx[j] < end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
continue if in_idx[j] >= end instead of scanning the rest, since indices are sorted per row?
src/operator/tensor/matrix_op-inl.h
Outdated
DType *out_data = out.data().dptr<DType>(); | ||
|
||
Stream<xpu> *s = ctx.get_stream<xpu>(); | ||
Kernel<SliceAxisOneCsrAssign, xpu>::Launch(s, indptr_len-1, out_idx, out_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work when nnz = 0? Is that tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. If nnz=0
, kernel launch will return immediately. Test for slice_axis(zeros, ...)
is added.
I think slice axis is deprecated, we are using slice now |
OK, I will change it to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments regarding docs. Also adding @anirudh2290 for review
src/operator/tensor/matrix_op.cc
Outdated
@@ -264,7 +264,7 @@ The resulting array's *k*-th dimension contains elements | |||
from the *k*-th dimension of the input array with the open range ``[b_k, e_k)``. | |||
|
|||
For an input array of non-default storage type(e.g. `csr` or `row_sparse`), it only supports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think row_sparse is not supported for slice. Let's remove this sentence in the doc.
The storage type of ``slice`` output depends on storage types of inputs:
- slice(csr) = csr
- slice(default) = default
src/operator/tensor/matrix_op-inl.h
Outdated
@@ -601,18 +590,127 @@ void SliceCsrImpl(const SliceParam ¶m, const OpContext& ctx, | |||
}); | |||
} | |||
|
|||
// slice a CSRNDArray for two dimensions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add more documentation for the kernels like this one https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/dot-inl.cuh#L40
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding this operator!
out.CheckAndAllocAuxData(kIndPtr, Shape1(indptr_len)); | ||
if (!in.storage_initialized()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens here if input is a CSR Array with all zeroes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments. If input is zeros, kernel launch will return immediately. Unittest for zeros input case is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that still true on GPU, when we add GPU support? This PR is dealing with some bugs for zero inputs for dot operator #8470
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For CSRNDArray, storage_initialized()
return aux_shape(0).Size() != 0
, I think it is always true for a valid CSRNDArray except for rank-0 array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to returning csr zeros immediately if nnz=0.
src/operator/tensor/matrix_op-inl.h
Outdated
if (!in.storage_initialized()) { | ||
out.set_aux_shape(kIndPtr, Shape1(0)); | ||
return; | ||
} | ||
// assume idx indptr share the same type | ||
MSHADOW_IDX_TYPE_SWITCH(in.aux_type(kIndPtr), RType, { | ||
MSHADOW_IDX_TYPE_SWITCH(in.aux_type(kIdx), IType, { | ||
MSHADOW_TYPE_SWITCH(in.dtype(), DType, { | ||
auto in_indptr = in.aux_data(kIndPtr).dptr<RType>(); | ||
auto out_indptr = out.aux_data(kIndPtr).dptr<RType>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid auto for in_indptr and out_indptr
src/operator/tensor/matrix_op-inl.h
Outdated
@@ -592,7 +581,7 @@ void SliceCsrImpl(const SliceParam ¶m, const OpContext& ctx, | |||
auto out_idx = out.aux_data(kIdx).dptr<IType>(); | |||
auto in_data = in.data().dptr<DType>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid auto here and use IType and DType
src/operator/tensor/matrix_op-inl.h
Outdated
out_indptr[i+1] = out_indptr[i]; | ||
for (RType j = in_indptr[i + begin_row]; | ||
j < in_indptr[i + begin_row + 1]; j++) { | ||
if (in_idx[j] >= end_col) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add one line comment for the if, else if logic here. Also why not just if (in_idx[j] >= begin_col && in_idx < end_col) ?
src/operator/tensor/matrix_op-inl.h
Outdated
const int begin, const int end) { | ||
RType ind = out_indptr[i]; | ||
for (RType j = in_indptr[i]; j < in_indptr[i+1]; j++) { | ||
if (in_idx[j] >= end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not just do if (in_idx[j] >= begin_col && in_idx < end_col)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indices of csr ndarray is in ascending order per row. So if indice >= end
, there is no need to continue the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was suggesting this change for readability. Also, you would be doing the checks for all in_idx[j] < begin_col which will be avoided with the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used this condition, in_idx[j] >= begin_col && in_idx < end_col, at the first time. But according to @eric-haibin-lin 's comments, this logic should be changed to a if/else logic which can jump out of the loop since indices are sorted per row.
…8331) * slice axis for csr (cpu impl) * fix indice bug and use kernel launch * small fix * misc updates to address comments * fix type * csr slice * unittest * fix lint * address comments * return csr zeros before kernel launch if nnz=0 * fix
…8331) * slice axis for csr (cpu impl) * fix indice bug and use kernel launch * small fix * misc updates to address comments * fix type * csr slice * unittest * fix lint * address comments * return csr zeros before kernel launch if nnz=0 * fix
…8331) * slice axis for csr (cpu impl) * fix indice bug and use kernel launch * small fix * misc updates to address comments * fix type * csr slice * unittest * fix lint * address comments * return csr zeros before kernel launch if nnz=0 * fix
Description
slice_axis
for csr, cpu implementation. This is used in cases like Wide & Deep model, e.g., slice the linear features to feed into wide model.As a feature request in #8168.
cc @eric-haibin-lin for review
Checklist
Essentials
make lint
)Changes
Comments