[Relay] Bitserial ops #3844

Merged
merged 14 commits into from
Sep 1, 2019

Conversation

jwfromm
Contributor

@jwfromm jwfromm commented Aug 28, 2019

This PR adds Relay operators for the bitserial operations conv2d, dense, and bitpack. This lets Relay frontends use the existing TOPI bitserial ops, which enable very fast low-bit execution on ARM CPUs. Automatic shape inference is currently limited, largely because these ops need to support optional prepacking of weight bits. With prepacked weights, for example, it is difficult to infer shapes without explicitly knowing the number of channels ahead of time, so here we require that the channels attribute is always set.
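For readers unfamiliar with bitserial computation, here is a minimal NumPy sketch of the general idea: split low-bit values into bit planes, pack each plane into bytes, and compute a dot product with AND plus popcount. It is not the TOPI implementation; `bitpack`, `popcount`, and `bitserial_dot` are hypothetical helper names, and the sketch assumes unsigned inputs of at most 8 bits.

```python
import numpy as np

def bitpack(x, bits):
    """Split unsigned ints (< 2**bits, bits <= 8) into bit planes, 8 elements per byte.

    Returns a uint8 array of shape (bits, ceil(len(x) / 8)).
    Illustrative helper only, not a TVM API.
    """
    x = np.asarray(x, dtype=np.uint8)
    planes = [(x >> b) & 1 for b in range(bits)]       # one 0/1 vector per bit position
    return np.stack([np.packbits(p) for p in planes])  # zero-pads the last byte

def popcount(a):
    """Count the set bits in a uint8 array."""
    return int(np.unpackbits(a).sum())

def bitserial_dot(a, w, a_bits, w_bits):
    """Dot product of two unsigned low-bit vectors via AND + popcount."""
    a_packed = bitpack(a, a_bits)
    w_packed = bitpack(w, w_bits)
    total = 0
    for i in range(a_bits):
        for j in range(w_bits):
            # Bit plane i of `a` and plane j of `w` contribute with weight 2**(i + j).
            total += popcount(a_packed[i] & w_packed[j]) << (i + j)
    return total

a = [1, 2, 3, 0, 1, 3, 2, 1, 3]
w = [3, 1, 0, 2, 2, 1, 1, 3, 2]
assert bitserial_dot(a, w, 2, 2) == int(np.dot(a, w))
```

Note that in this sketch a 9-element vector and a 16-element vector both pack to two bytes per plane, so the logical length cannot be recovered from the packed shape alone. That may be one way to see why prepacked weights make shape inference hard without an explicit channel count.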

This PR also includes a schedule for NHWC convolution on arm_cpu that yields good results. To go with it, I've changed the conv2d legalize routine on ARM accordingly.

Contributor

@tmoreau89 tmoreau89 left a comment

Excellent work, @jwfromm! I've left a few comments for you to address.

@@ -119,7 +123,11 @@ def _callback(op):
if isinstance(kernel.op, tvm.tensor.ComputeOp) and "dilate" in kernel.op.tag:
    s[kernel].compute_inline()

_schedule_spatial_pack(cfg, s, data_vec, kernel_vec, conv, output, outs[0])
# TODO: move to schedule_nhwc later
if 'nhwc' in op.tag:
Contributor

This is in the schedule_conv2d_nchw_arm_cpu function; should 'nhwc' ever be in op.tag?

Contributor Author

Reverted the changes to the ARM NHWC schedule in favor of @jackwish's version.

@@ -810,25 +954,19 @@ def _conv2d_legalize(attrs, inputs, arg_types):
if attrs['data_layout'] == 'NHWC':
    data, kernel = inputs
    if attrs['kernel_layout'] == 'HWIO':
        # Handle HWIO layout. This is common in TF graph.
        kernel = relay.transpose(kernel, axes=(3, 2, 0, 1))
# HWIO layout is expected for NHWC input.
Contributor

Is the assumption here that we run bitserial_conv2d_legalize beforehand?

Contributor Author

@jwfromm jwfromm Aug 30, 2019

Although this specific check has now been reverted, a similar legalize routine is still in the bitserial convolution. The NHWC computation only works when the kernel is in HWIO format, so the legalize pass just converts the kernel if it is in a different format. Since most of TVM uses OIHW as the default format, this is a pretty handy check to have.
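The layout conversion described above amounts to a transpose of the kernel axes. A minimal NumPy sketch follows, assuming a hypothetical helper `to_hwio` that derives the permutation from the layout string; it is not the legalize routine from this PR, only an illustration of the axis bookkeeping.

```python
import numpy as np

def to_hwio(kernel, layout):
    """Transpose a 4-D kernel from `layout` (e.g. "OIHW") to HWIO.

    Illustrative helper only, not a TVM API.
    """
    if sorted(layout) != sorted("HWIO"):
        raise ValueError(f"unsupported kernel layout: {layout}")
    # For each target axis, look up where it sits in the source layout.
    axes = [layout.index(c) for c in "HWIO"]
    return np.transpose(kernel, axes)

# Hypothetical kernel: 8 output channels, 3 input channels, 5x4 window.
kernel_oihw = np.arange(8 * 3 * 5 * 4).reshape(8, 3, 5, 4)
kernel_hwio = to_hwio(kernel_oihw, "OIHW")   # axes (2, 3, 1, 0)

# The reverse direction, HWIO -> OIHW, uses axes (3, 2, 0, 1),
# the same permutation that appears in the diff excerpt above.
kernel_back = np.transpose(kernel_hwio, (3, 2, 0, 1))
```

A kernel that is already in HWIO passes through unchanged, since the computed permutation is then the identity.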

Contributor

@tmoreau89 tmoreau89 left a comment

Thanks for the new changes! I added a few more comments to be addressed.

@tmoreau89
Contributor

tmoreau89 commented Aug 30, 2019

You may also want to check with @jackwish who posted a draft (#3859) of NHWC conv2d templates for ARM. There's little overlap but I anticipate merge conflicts, so just make sure you agree on code organization.

@jwfromm
Contributor Author

jwfromm commented Aug 30, 2019

It looks like @jackwish has a more fleshed-out version of the NHWC schedules, so I think it makes sense to cut them from my PR and focus only on the bitserial ops. All additions to the ARM NHWC conv2d have now been reverted.

@tmoreau89
Contributor

Agreed, thanks for the changes. It will make it easier to integrate jackwish's changes in the future.

@jroesch jroesch merged commit d08c74c into apache:master Sep 1, 2019
@zhenhuaw-me
Contributor

Thank you @jwfromm and @tmoreau89, I think I borrowed insights from this work :)

wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
* Added arm_cpu NHWC schedules.

* Fixed kernel shape legalization.

* Added bitserial ops to relay.

* Snapshot and more missing files.

* Added dense testing.

* Added tests

* Added ASF header to new files.

* cc lint

* Pylint change.

* pylint fixes.

* Change arm legalize test.

* Added assert check to arm legalize.

* Added better documentation, fixed some bad style

* Reverted arm conv2d nhwc changes.
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 16, 2019
@jwfromm jwfromm deleted the bitserial_ops branch January 11, 2020 22:54