
[QNN] Lowering for Depthwise Convolution. #4351

Merged (1 commit, Nov 21, 2019)
Conversation

@anijain2305 (Contributor) commented Nov 15, 2019:

Adding the QNN lowering sequence for depthwise conv2d. This creates a depthwise conv2d with (u)int8 inputs, opening up a path to write schedules using Intel VNNI and ARM DOT instructions.

For older hardware that does not have fast int8 support, this lowering will not be called (the fallback was already merged upstream in a different PR).
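For context, a scalar sketch (not from the PR) of the u8 x i8 multiply-accumulate pattern that Intel VNNI (vpdpbusd) and ARM DOT (sdot/udot) instructions accelerate, which the int8 lowering makes reachable for schedules:

```cpp
#include <cstdint>

// Scalar model of the operation that VNNI / DOT instructions fuse:
// multiply 8-bit values and accumulate into a 32-bit register.
// Illustrative only; real schedules vectorize this pattern.
int32_t DotU8xI8(const uint8_t* a, const int8_t* b, int n) {
  int32_t acc = 0;
  for (int i = 0; i < n; ++i) {
    acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  }
  return acc;
}
```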

@FrozenGene @jackwish @tmoreau89 @yzhliu @zhiics

@anijain2305 force-pushed the qnn_dwc branch 2 times, most recently from 9d45f03 to 4e2ec58 on November 19, 2019.
@anijain2305 (Contributor, Author):

@FrozenGene @jackwish @tmoreau89 A gentle ping for review :) This PR is ready for review now.

@zhenhuaw-me (Contributor) left a comment:

Minor comments :)

```cpp
  if (param->kernel_layout == "OIHW") {
    out_channels = get_const_int(kernel_shape[0]);
    kernel_h = get_const_int(kernel_shape[2]);
    kernel_w = get_const_int(kernel_shape[3]);
    if (is_depthwise(param)) {
```
@zhenhuaw-me (Contributor):

What about using a variable to hold this?
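A sketch of what that suggestion might look like (hypothetical variable name, not the PR's code):

```cpp
// Hypothetical illustration of the reviewer's suggestion: evaluate the
// predicate once and reuse it, instead of calling is_depthwise(param)
// at each branch.
const bool depthwise = is_depthwise(param);
if (param->kernel_layout == "OIHW") {
  out_channels = get_const_int(kernel_shape[0]);
  kernel_h = get_const_int(kernel_shape[2]);
  kernel_w = get_const_int(kernel_shape[3]);
  if (depthwise) {
    // ... depthwise-specific shape handling ...
  }
}
```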

```cpp
 * For depthwise, we can similarly unroll the computation. The initial compute is as follows
 * where cm = channel_multiplier
 *
 * Qc(n, oc, oh, ow) = Sigma(r, s) (Qw(oc/cm, oc%cm, r, s) - zp_w)
```
@zhenhuaw-me (Contributor):

Suggesting to rewrite the related formulas in TeX, either block style or inline style.

@anijain2305 (Contributor, Author):

Thanks for the comments. I will send a separate PR for better comments; this will have to change in multiple files, I think.
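To illustrate the suggestion, one possible TeX rendering of the quoted compute (a reconstruction; the Q_a factor is completed from the standard depthwise form, with cm the channel multiplier and zp_w, zp_a the weight and activation zero points):

```latex
Q_c(n, oc, oh, ow) = \sum_{r, s}
    \left( Q_w\!\left(\lfloor oc/cm \rfloor,\; oc \bmod cm,\; r,\; s\right) - zp_w \right)
    \cdot
    \left( Q_a\!\left(n,\; \lfloor oc/cm \rfloor,\; oh + r,\; ow + s\right) - zp_a \right)
```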

```diff
   CHECK_EQ(param->dilation.size(), 2) << "qnn.conv2d only supports 2D dilation";
   auto dilation_h = get_const_int(param->dilation[0]);
   auto dilation_w = get_const_int(param->dilation[1]);
-  if (dilation_h != 1 || dilation_w != 1 || param->groups != 1) {
+  if (dilation_h != 1 || dilation_w != 1 || (param->groups != 1 && !is_depthwise(param))) {
```
@zhenhuaw-me (Contributor):

It seems is_depthwise already checks groups?

@anijain2305 (Contributor, Author):

This is to differentiate grouped convolutions (as in the old AlexNet) from depthwise convolutions. We want to fall back to the simpler lowering if it is a grouped convolution, but not if it is depthwise.
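A minimal sketch of the distinction being discussed (hypothetical helper names, not the PR's exact code):

```cpp
// A conv2d is depthwise when every input channel is its own group;
// AlexNet-style grouped convolution sits in between and should take
// the fallback lowering.
bool IsDepthwiseLike(int groups, int in_channels) {
  return groups == in_channels && groups != 1;
}

bool IsGroupedButNotDepthwise(int groups, int in_channels) {
  return groups != 1 && !IsDepthwiseLike(groups, in_channels);
}
```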

@zhenhuaw-me (Contributor):

Sorry for this, I thought I had removed this comment after taking a closer look :(

Comment on lines -435 to +594
```diff
-  auto term3 = Conv2DThirdTerm(weight, param, batch_size, out_channels);
-  auto term4 = Conv2DFourthTerm(param, batch_size, in_channels, kernel_h, kernel_w);
+  auto term3 = Conv2DThirdTerm(weight, param, out_channels);
+  auto term4 = Conv2DFourthTerm(param, in_channels, kernel_h, kernel_w);
```
@zhenhuaw-me (Contributor):

That is interesting, why remove the batch semantics?

@anijain2305 (Contributor, Author), Nov 19, 2019:

The batch_size was never needed for those two terms in the first place. I realized that while writing the lowering for depthwise.
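To spell out the reasoning (reconstructed from the standard four-term QNN conv2d expansion of (Q_w - zp_w)(Q_a - zp_a), not quoted from the PR): the third and fourth terms reduce only over the weights and constants,

```latex
\text{term3} = zp_a \sum_{c, r, s} Q_w(oc, c, r, s)
\qquad
\text{term4} = C \cdot R \cdot S \cdot zp_a \cdot zp_w
```

so neither references the batch index n: term3 broadcasts across the batch and term4 is a scalar, which is why batch_size drops out of both helpers.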

```cpp
  // We can reduce the H and W axis by using avg_pool2d. However, avg_pool2d averages the sum.
  // Since this is integer division (floor), we can first multiply the data by the pool_size and
  // then perform avg_pool2d. Reversing this causes inaccuracy due to floor division.
  auto scaled_hw_t2 = Multiply(casted_t2, MakeConstantScalar(Int(32), kernel_h * kernel_w));
```
@zhenhuaw-me (Contributor):

Is this unneeded (like L215) when kernel_h * kernel_w == 1?

@anijain2305 (Contributor, Author):

Nice observation! Yes, that will save a few microseconds :) I will make the change for the conv2d lowering as well.
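A worked example (not from the PR) of why the multiply must come before avg_pool2d: with a 2x2 kernel (pool_size = 4) and a window sum of 10, pooling first gives floor(10/4) = 2, and scaling back yields 2 * 4 = 8 instead of 10; scaling first gives floor((10 * 4)/4) = 10, which is exact. When kernel_h * kernel_w == 1 the scale factor is 1, so the multiply is a no-op and can be skipped, as the reviewer notes.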

@anijain2305 (Contributor, Author):

@jackwish Thanks! Addressed your comments :)

@zhenhuaw-me (Contributor) left a comment:

LGTM. Thank you @anijain2305

@anijain2305 (Contributor, Author):

@zhiics Can you please review?

@zhiics (Member) left a comment:

LGTM

@zhiics merged commit 464ebb1 into apache:master on Nov 21, 2019.
@zhiics (Member) commented Nov 21, 2019:

Thanks @anijain2305 @jackwish
