
[Relay] Option to select which convolution layers are quantized. #3173

Merged · 7 commits · May 16, 2019

Conversation

jwfromm (Contributor) commented May 10, 2019

The current QConfig only allows layers at the beginning of a network to be left at full precision. However, many architectures may also require layers near the output to remain at high precision. I've added an option to explicitly skip specific convolutions anywhere in the network.
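A minimal usage sketch of the new option (assuming the Relay quantization API as of this PR; mod and params are placeholders, and the layer indices are illustrative):

from tvm import relay

# mod, params = ...  # a Relay module and its parameter dict, e.g. from a
#                    # frontend importer (placeholders, not shown here).
# Assumption: skip_conv_layers lists indices of conv2d ops, counted in the
# order the quantizer visits them; the indices below are illustrative.
with relay.quantize.qconfig(skip_conv_layers=[0, 52]):
    qmod = relay.quantize.quantize(mod, params)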

jwfromm (Contributor, Author) commented May 11, 2019

Not sure why the Caffe2 tests are failing. They pass when run locally, and these changes shouldn't affect anything but quantization.

@jwfromm changed the title from "Option to select which convolution layers are quantized." to "[RELAY] Option to select which convolution layers are quantized." May 11, 2019
@jwfromm changed the title from "[RELAY] Option to select which convolution layers are quantized." to "[Relay] Option to select which convolution layers are quantized." May 11, 2019
@tqchen assigned ZihengJiang and vinx13 and unassigned ZihengJiang May 11, 2019
tqchen (Member) commented May 11, 2019

@vinx13 @ZihengJiang @kazum @eqy can you help review this?

@@ -105,7 +105,9 @@ def conv2d_cuda(cfg, data, kernel, strides, padding, dilation, layout='NCHW', ou
        return winograd_cuda(cfg, data, kernel, strides, padding, dilation, layout, out_dtype,
                             pre_computed=False)
    if cfg.template_key == 'int8':
        return conv2d_NCHWc_int8(cfg, data, kernel, strides, padding, dilation, layout, out_dtype)
    if (data.dtype == 'int8' or data.dtype == 'uint8'):
Member (inline review comment):

Does it work with uint8? conv2d compute and dp4a need some changes since they are hardcoded as int8 (although it should work with uint8)

jwfromm (Contributor, Author) replied:

You're right that it only supports int8 for now, but it could be extended to uint8. I added this check because I was doing some quantization without autotuning, which caused the incorrect convolution algorithm to be chosen in some cases. Since autotuning is the intended workflow, I think dropping this would be fine.

jroesch (Member) commented May 11, 2019

@tqchen Shouldn't we move this logic outside of quantization and instead rely on pre-passes to annotate regions for quantization? I thought that was the whole point of the discussion we had about VTA's quantization annotations.

vinx13 (Member) commented May 12, 2019

@jwfromm The CI error is unrelated. Please rebase and get CI green.

@jwfromm force-pushed the quantize_conv_select branch from ea4e99f to 3868fc5 on May 13, 2019 19:27
tqchen (Member) commented May 13, 2019

CI is now green. @vinx13, please manage the PR.

jwfromm (Contributor, Author) commented May 13, 2019

I also added a small check that prints a warning when annotating a layer whose input channels aren't divisible by 4, since such a layer may not be quantizable.

@jwfromm force-pushed the quantize_conv_select branch from e8c1a9b to f06693e on May 14, 2019 00:31

# Check if the kernel is suitable for quantization, post warning if not.
in_channels = new_args[1].data.shape[1]
if in_channels % 4 != 0:
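    # Hypothetical completion -- the body of this branch is truncated in the
    # diff view (and the check was later reverted in this PR). It presumably
    # logged something along these lines; `logging` is assumed imported:
    logging.warning("conv2d layer has %d input channels, not divisible by 4; "
                    "it may not be quantizable.", in_channels)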
Member (inline review comment):

Can you explain why the input channels need to be divisible by 4?

jwfromm (Contributor, Author) replied:

If the input channels aren't divisible by 4, Relay fails to convert from NCHW to NCHWc and throws an error. This is fine when not using dp4a but annoying otherwise. Printing a simple warning and count in these cases makes it easier to use the skip_conv_layers option when needed, and the warning is easy to ignore otherwise. (See the layout sketch below.)
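For context, a minimal numpy sketch of the layout constraint (illustrative only, not TVM code): packing NCHW into NCHW4c splits the channel axis into (C // 4, 4) and moves the factor-of-4 axis last, which is only possible when C % 4 == 0.

import numpy as np

# C = 8 packs cleanly: NCHW -> (N, C//4, 4, H, W) -> NCHW4c.
data = np.zeros((1, 8, 32, 32))
packed = data.reshape(1, 2, 4, 32, 32).transpose(0, 1, 3, 4, 2)
print(packed.shape)  # (1, 2, 32, 32, 4)

# C = 6 cannot: 1*6*32*32 elements do not fit a (1, 1, 4, 32, 32) buffer.
bad = np.zeros((1, 6, 32, 32))
# bad.reshape(1, 6 // 4, 4, 32, 32)  # raises ValueError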

vinx13 (Member) replied May 14, 2019:

I would suggest removing the warning here, since it is unrelated to quantization. The default conv2d implementation can work with different types, so if the input channels aren't divisible by 4 (or another factor) we should use the NCHW one. We have some assertions like this in the NCHWc conv2d. However, we don't expect any errors to be raised, since there is no way to obtain an AutoTVM log using an invalid configuration, so the NCHWc one shouldn't be chosen in this case.

jwfromm (Contributor, Author) replied:

That's fair, I've reverted the warning.

vinx13 (Member) commented May 14, 2019

@jroesch I'm gonna merge this if you have no objection

@jwfromm force-pushed the quantize_conv_select branch from f06693e to bbf128a on May 14, 2019 17:14
tqchen (Member) commented May 16, 2019

ping @vinx13

@vinx13 merged commit 36702a7 into apache:master May 16, 2019
vinx13 (Member) commented May 16, 2019

@jwfromm Thanks, this is merged

@jwfromm deleted the quantize_conv_select branch May 17, 2019 01:35
wweic pushed a commit to wweic/tvm that referenced this pull request Jun 26, 2019
[Relay] Option to select which convolution layers are quantized. (apache#3173)

* Stashing for later maybe.

* Added new option to leave specific layers unquantized.

* Better error checking.

* remove unneeded import

* tab to spaces

* pylint fixes

* more pylint fixes
wweic pushed a commit to neo-ai/tvm that referenced this pull request Jun 27, 2019
[Relay] Option to select which convolution layers are quantized. (apache#3173)