This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[sparse] slice for csr on two dimensions, cpu implementation #8331

Merged (16 commits) on Nov 8, 2017

@ZiyueHuang ZiyueHuang commented Oct 18, 2017

Description

CPU implementation of slice_axis for csr. This is useful for models such as Wide & Deep, e.g., to slice out the linear features that are fed into the wide model.

This was requested as a feature in #8168.

cc @eric-haibin-lin for review

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • For user-facing API changes, API doc string has been updated.
  • To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • slice_axis for csr, cpu implementation, add unittest

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

CHECK_EQ(in_attrs->size(), 1);
CHECK_EQ(out_attrs->size(), 1);
const SliceAxisParam& param = nnvm::get<SliceAxisParam>(attrs.parsed);
const auto& in_stype = in_attrs->at(0);
Member

No need for & if in_stype is not changed.


template<typename xpu>
void SliceAxisEx(const nnvm::NodeAttrs& attrs,
const OpContext& ctx,
Member

nit: indentation


const SliceAxisParam& param = nnvm::get<SliceAxisParam>(attrs.parsed);
auto in_stype = inputs[0].storage_type();
CHECK_NE(in_stype, kDefaultStorage)
Member

I think you can remove this check and print operator_info(ctx, ..) in line 1060

} else if (param.axis == 1) {
SliceAxisOneCsrImpl<xpu>(param, ctx, inputs[0], req[0], outputs[0]);
} else {
LOG(FATAL) << "CSRNDArray is only for 2-D shape";
Member

Does it fail with negative axis? I think GetSliceAxisParams already handles negative axis for you
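
To make "handles negative axis" concrete, here is a minimal pure-Python sketch of the usual normalization. The helper name `normalize_axis` is hypothetical and this is not the body of `GetSliceAxisParams`; it only illustrates the behaviour the reviewer expects.

```python
def normalize_axis(axis, ndim):
    """Map an axis in [-ndim, ndim) to its non-negative equivalent.

    Hypothetical helper for illustration; not MXNet's GetSliceAxisParams.
    """
    if not -ndim <= axis < ndim:
        raise ValueError("axis %d is out of range for a %d-D array" % (axis, ndim))
    return axis + ndim if axis < 0 else axis


# For a 2-D CSR array, axis=-1 refers to columns and axis=-2 to rows.
col_axis = normalize_axis(-1, 2)   # 1
row_axis = normalize_axis(-2, 2)   # 0
```

With this kind of normalization done up front, the dispatch only ever sees axis 0 or 1 for a 2-D array, so the fatal branch is reached only for a genuinely invalid axis.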

@@ -127,9 +127,27 @@ def check_slice_nd_csr_fallback(shape):
result_dense = mx.nd.slice(mx.nd.array(A2), begin=(start, shape[1] - 1), end=(end + 1, shape[1]))
assert same(result_dense.asnumpy(), result.asnumpy())

shape = (rnd.randint(2, 10), rnd.randint(1, 10))
def check_sparse_nd_csr_slice_axis(shape):
Member

let's also add some test cases for negative axis

Member Author

Added.

CHECK_NE(req, kWriteInplace) << "kWriteInplace for SliceAxis on CSR input is not supported";
int axis, begin, end;
GetSliceAxisParams(param, in.shape(), &axis, &begin, &end);
int indptr_len = in.shape()[0] + 1;
Member

Use nnvm::dim_t (int64_t) instead because shape[i] is 64 bits
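
A small sketch of why the width matters, assuming `int` is 32 bits (true on common platforms, but an assumption here). It uses Python's `ctypes` only to mimic fixed-width storage; the helper names are hypothetical.

```python
import ctypes


def as_int32(value):
    """Return value as a signed 32-bit integer would store it (wraps silently)."""
    return ctypes.c_int32(value).value


def as_int64(value):
    """Return value as a signed 64-bit integer would store it."""
    return ctypes.c_int64(value).value


rows = 2**31 - 1        # largest row count a signed 32-bit int can hold
indptr_len = rows + 1   # a CSR indptr has one more entry than there are rows

wrapped = as_int32(indptr_len)   # no longer equal to indptr_len
kept = as_int64(indptr_len)      # still equal to indptr_len
```

The 32-bit value wraps to a negative number, which would make a loop bound such as `i < indptr_len - 1` false on the first iteration, while the 64-bit value is preserved.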


template<typename xpu>
void SliceAxisOneCsrImpl(const SliceAxisParam &param, const OpContext& ctx,
const NDArray &in, OpReqType req, const NDArray &out) {
Member

nit: indentation

RType *out_indptr = out.aux_data(kIndPtr).dptr<RType>();
int nnz = 0;
out_indptr[0] = 0;
for (int i=0; i < indptr_len - 1; i++) {
Member

also use nnvm::dim_t for i and j

for (int i=0; i < indptr_len - 1; i++) {
out_indptr[i+1] = out_indptr[i];
for (int j=in_indptr[i]; j < in_indptr[i+1]; j++) {
if (in_idx[j] >= begin && in_idx[j] < end) {
Member

break if in_idx[j] >= end instead of scanning the rest, since indices are sorted per row?
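
The suggestion can be sketched in pure Python as follows. This is an illustration under the assumption that column indices are sorted in ascending order within each row; `slice_csr_columns` is a hypothetical name, not the kernel in this PR.

```python
def slice_csr_columns(indptr, indices, data, begin, end):
    """Keep columns in [begin, end) of a CSR matrix whose rows have sorted indices."""
    out_indptr, out_indices, out_data = [0], [], []
    for row in range(len(indptr) - 1):
        for j in range(indptr[row], indptr[row + 1]):
            col = indices[j]
            if col >= end:
                break  # sorted row: every later column is also >= end
            if col >= begin:
                out_indices.append(col - begin)  # re-base onto the sliced width
                out_data.append(data[j])
        out_indptr.append(len(out_indices))
    return out_indptr, out_indices, out_data


# Dense view of the input:
# [[1, 0, 2, 0],
#  [0, 0, 3, 4],
#  [0, 0, 0, 0]]
indptr, indices, data = [0, 2, 4, 4], [0, 2, 2, 3], [1, 2, 3, 4]
result = slice_csr_columns(indptr, indices, data, 1, 3)
# result == ([0, 1, 2, 2], [1, 1], [2, 3])
```

The early exit only saves work on the tail of each row; entries below `begin` are still visited one by one.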

DType *out_data = out.data().dptr<DType>();

Stream<xpu> *s = ctx.get_stream<xpu>();
Kernel<SliceAxisOneCsrAssign, xpu>::Launch(s, indptr_len-1, out_idx, out_data,
Member

Does it work when nnz = 0? Is that tested?

Member Author

Yes. If nnz=0, kernel launch will return immediately. Test for slice_axis(zeros, ...) is added.

@piiswrong
Contributor

I think slice_axis is deprecated; we are using slice now.

@ZiyueHuang
Member Author

OK, I will change it to slice.

@ZiyueHuang ZiyueHuang changed the title from "[sparse] slice_axis for csr, cpu implementation" to "[sparse] slice for csr on two dimensions, cpu implementation" Oct 29, 2017
Member

@eric-haibin-lin eric-haibin-lin left a comment

A few comments regarding docs. Also adding @anirudh2290 for review

@@ -264,7 +264,7 @@ The resulting array's *k*-th dimension contains elements
from the *k*-th dimension of the input array with the open range ``[b_k, e_k)``.

For an input array of non-default storage type(e.g. `csr` or `row_sparse`), it only supports
Member

I think row_sparse is not supported for slice. Let's remove this sentence in the doc.

The storage type of ``slice`` output depends on storage types of inputs:
   - slice(csr) = csr
   - slice(default) = default

@@ -601,18 +590,127 @@ void SliceCsrImpl(const SliceParam &param, const OpContext& ctx,
});
}

// slice a CSRNDArray for two dimensions

Member

@anirudh2290 anirudh2290 left a comment

Thank you for adding this operator!

out.CheckAndAllocAuxData(kIndPtr, Shape1(indptr_len));
if (!in.storage_initialized()) {
Member

What happens here if the input is a CSR array with all zeros?

Member Author

Thanks for your comments. If input is zeros, kernel launch will return immediately. Unittest for zeros input case is added.

Member

Is that still true on GPU, when we add GPU support? This PR is dealing with some bugs for zero inputs for dot operator #8470

Member Author

For CSRNDArray, storage_initialized() returns aux_shape(0).Size() != 0, which I think is always true for a valid CSRNDArray except for a rank-0 array.
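
The reasoning can be illustrated with the conventional CSR layout in pure Python. This is a generic sketch with a hypothetical `dense_to_csr` helper; whether MXNet's `storage_initialized()` behaves this way for every all-zero array is not confirmed here.

```python
def dense_to_csr(dense):
    """Convert a list-of-lists matrix to (indptr, indices, data) in CSR form."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                indices.append(col)
                data.append(value)
        indptr.append(len(indices))  # one entry per row, after the leading 0
    return indptr, indices, data


zeros = [[0] * 4 for _ in range(3)]
indptr, indices, data = dense_to_csr(zeros)

indptr_is_nonempty = len(indptr) != 0   # analogue of aux_shape(0).Size() != 0
nnz = indptr[-1]                        # number of stored non-zero values
```

In this layout the row-pointer array has `rows + 1` entries even when nothing is stored, so a check on its size cannot tell an all-zero matrix from a populated one; checking `nnz` can.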

Member Author

Changed to returning csr zeros immediately if nnz=0.
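
A minimal pure-Python sketch of the "return zeros before doing any work" idea, shown for a row slice. The function name `slice_csr_rows` is hypothetical, and the C++ code in this PR signals an all-zero output differently (by setting the aux shape), so treat this as an analogue rather than the implementation.

```python
def slice_csr_rows(indptr, indices, data, begin_row, end_row):
    """Return rows [begin_row, end_row) of a CSR matrix as (indptr, indices, data)."""
    n_out_rows = end_row - begin_row
    if indptr[-1] == 0:
        # nnz == 0: the result is all zeros, so skip the slicing work entirely
        return [0] * (n_out_rows + 1), [], []
    lo, hi = indptr[begin_row], indptr[end_row]
    # Shift the kept row pointers so that the first one is 0 again
    out_indptr = [p - lo for p in indptr[begin_row:end_row + 1]]
    return out_indptr, indices[lo:hi], data[lo:hi]


zero_slice = slice_csr_rows([0, 0, 0, 0], [], [], 1, 3)
# zero_slice == ([0, 0, 0], [], [])

dense_slice = slice_csr_rows([0, 2, 4, 4], [0, 2, 2, 3], [1, 2, 3, 4], 1, 3)
# dense_slice == ([0, 2, 2], [2, 3], [3, 4])
```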

if (!in.storage_initialized()) {
out.set_aux_shape(kIndPtr, Shape1(0));
return;
}
// assume idx indptr share the same type
MSHADOW_IDX_TYPE_SWITCH(in.aux_type(kIndPtr), RType, {
MSHADOW_IDX_TYPE_SWITCH(in.aux_type(kIdx), IType, {
MSHADOW_TYPE_SWITCH(in.dtype(), DType, {
auto in_indptr = in.aux_data(kIndPtr).dptr<RType>();
auto out_indptr = out.aux_data(kIndPtr).dptr<RType>();
Member

Can we avoid auto for in_indptr and out_indptr?

@@ -592,7 +581,7 @@ void SliceCsrImpl(const SliceParam &param, const OpContext& ctx,
auto out_idx = out.aux_data(kIdx).dptr<IType>();
auto in_data = in.data().dptr<DType>();
Member

Can we avoid auto here and use IType and DType?

out_indptr[i+1] = out_indptr[i];
for (RType j = in_indptr[i + begin_row];
j < in_indptr[i + begin_row + 1]; j++) {
if (in_idx[j] >= end_col) {
Member

Can we add a one-line comment for the if / else if logic here? Also, why not just if (in_idx[j] >= begin_col && in_idx[j] < end_col)?

const int begin, const int end) {
RType ind = out_indptr[i];
for (RType j = in_indptr[i]; j < in_indptr[i+1]; j++) {
if (in_idx[j] >= end) {
Member

Any reason not to just do if (in_idx[j] >= begin_col && in_idx[j] < end_col)?

Member Author

@ZiyueHuang ZiyueHuang Oct 31, 2017

Indices of a csr ndarray are in ascending order per row, so once an index is >= end there is no need to continue the loop.

Member

I was suggesting this change for readability. Also, you would be doing the checks for all in_idx[j] < begin_col which will be avoided with the change.

Member Author

I used this condition, in_idx[j] >= begin_col && in_idx[j] < end_col, at first. But following @eric-haibin-lin's comments, the logic was changed to an if/else that can jump out of the loop, since indices are sorted per row.
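
The trade-off between the two conditions can be sketched on a single row in pure Python. Both helpers are hypothetical and only count how many entries each strategy inspects, assuming the row's column indices are sorted in ascending order.

```python
def keep_with_range_test(cols, begin, end):
    """Test every entry against begin <= col < end; returns (kept, entries visited)."""
    kept = [c for c in cols if begin <= c < end]
    return kept, len(cols)


def keep_with_early_exit(cols, begin, end):
    """Stop at the first col >= end; returns (kept, entries visited)."""
    kept, visited = [], 0
    for c in cols:
        visited += 1
        if c >= end:
            break  # sorted row: nothing after this point can be in range
        if c >= begin:
            kept.append(c)
    return kept, visited


row_cols = [0, 3, 5, 8, 13, 21, 34]
full_scan = keep_with_range_test(row_cols, 3, 8)   # ([3, 5], 7)
early_exit = keep_with_early_exit(row_cols, 3, 8)  # ([3, 5], 4)
```

Both variants keep the same entries; the early-exit form simply stops visiting the row once it passes the end column, at the cost of a slightly less compact condition.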

@eric-haibin-lin eric-haibin-lin self-assigned this Nov 8, 2017
@piiswrong piiswrong merged commit bf2336c into apache:master Nov 8, 2017
cjolivier01 pushed a commit to cjolivier01/mxnet that referenced this pull request Nov 9, 2017
…8331)

* slice axis for csr (cpu impl)

* fix indice bug and use kernel launch

* small fix

* misc updates to address comments

* fix type

* csr slice

* unittest

* fix lint

* address comments

* return csr zeros before kernel launch if nnz=0

* fix
eric-haibin-lin pushed a commit to eric-haibin-lin/mxnet that referenced this pull request Dec 3, 2017
…8331)
@ZiyueHuang ZiyueHuang deleted the slice-axis-csr branch January 30, 2018 11:33
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…8331)
4 participants