[Relay] Option to select which convolution layers are quantized. #3173
Conversation
Not sure why caffe2 tests are failing. They pass when run locally and these changes shouldn't affect anything but quantization.
@vinx13 @ZihengJiang @kazum @eqy can you help review this?
@@ -105,7 +105,9 @@ def conv2d_cuda(cfg, data, kernel, strides, padding, dilation, layout='NCHW', ou
         return winograd_cuda(cfg, data, kernel, strides, padding, dilation, layout, out_dtype,
                              pre_computed=False)
     if cfg.template_key == 'int8':
-        return conv2d_NCHWc_int8(cfg, data, kernel, strides, padding, dilation, layout, out_dtype)
+        if (data.dtype == 'int8' or data.dtype == 'uint8'):
Does it work with uint8? conv2d compute and dp4a need some changes since they are hardcoded as int8 (although it should work with uint8).
You're right that it only supports int8 now but could be extended to uint8. I added this check because I was doing some quantization without autotuning, which caused the incorrect convolution algorithm to be chosen in some cases. Since autotuning is the intended workflow I think dropping this would be fine.
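For illustration, here is a condensed sketch of what the guarded template selection amounts to. The helper name and string return values are hypothetical; this is not the actual TVM source, just a restatement of the dispatch logic in the diff above.

```python
# Hypothetical condensation of the dispatch logic shown in the diff above:
# the NCHWc int8 template is only taken when the data really is an 8-bit
# tensor, otherwise the default NCHW implementation is used.
def pick_conv2d_template(template_key, data_dtype):
    if template_key == 'int8' and data_dtype in ('int8', 'uint8'):
        return 'conv2d_NCHWc_int8'
    return 'conv2d_nchw'  # fall back to the default schedule

# Without autotuning, a float32 workload no longer lands on the int8 template.
assert pick_conv2d_template('int8', 'float32') == 'conv2d_nchw'
assert pick_conv2d_template('int8', 'int8') == 'conv2d_NCHWc_int8'
```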
@tqchen Shouldn't we move this logic outside of quantization and instead rely on pre-passes to annotate regions for quantization? I thought that was the whole point of the discussion we had about VTA's quantization annotations.
@jwfromm CI error is unrelated. Please rebase and get CI green.
Force-pushed from ea4e99f to 3868fc5.
CI is now green. @vinx13 please manage the PR.
Also added a small check that prints a warning when annotating a layer that doesn't have input channels divisible by 4 and so may not be quantizable.
Force-pushed from e8c1a9b to f06693e.
# Check if the kernel is suitable for quantization, post warning if not.
in_channels = new_args[1].data.shape[1]
if in_channels % 4 != 0:
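The diff above is cut off before the warning itself. Below is a minimal sketch of what the complete check could look like; the helper name, the use of Python's logging module, and the message text are my assumptions, not necessarily the PR's exact code.

```python
import logging

def warn_if_not_quantizable(kernel_shape, layer_index):
    """Warn when a conv layer's input channels are not divisible by 4 and
    therefore may not be usable with the NCHWc int8 / dp4a path."""
    in_channels = kernel_shape[1]  # assumes an OIHW kernel layout
    if in_channels % 4 != 0:
        logging.warning(
            "Conv layer %d has %d input channels (not divisible by 4); "
            "consider adding it to skip_conv_layers.", layer_index, in_channels)

warn_if_not_quantizable((16, 3, 3, 3), layer_index=0)  # triggers the warning
```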
Can you explain why the input channels need to be divisible by 4?
If the input channels aren't divisible by 4, Relay fails to convert from NCHW to NCHWc and throws an error. This is fine if not using dp4a but annoying otherwise. Printing a simple warning and count in these cases makes it easier to use the skip_conv_layers option if needed, and it is easy to ignore otherwise.
I would suggest removing the warning here since it is unrelated to quantization. The default conv2d implementation can work with different types, so if the input channels aren't divisible by 4 (or another factor) we should use the NCHW one. We have some assertions like this in the NCHWc conv2d. However, we don't expect any errors to be raised, since there is no way to obtain an AutoTVM log for an invalid template, so the NCHWc one shouldn't be chosen in this case.
That's fair, I've reverted the warning.
@jroesch I'm gonna merge this if you have no objection.
Force-pushed from f06693e to bbf128a.
ping @vinx13
@jwfromm Thanks, this is merged.
…che#3173) * Stashing for later maybe. * Added new option to leave specific layers unquantized. * Better error checking. * remove unneeded import * tab to spaces * pylint fixes * more pylint fixes
The current Qconfig only allows layers at the beginning of a network to be left at full precision. However, many architectures might require layers near the output to remain at high precision as well. I've added an option to explicitly skip specific convolutions anywhere in the network.
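A minimal usage sketch of the new option, assuming `mod` and `params` come from a Relay frontend importer; the skipped layer indices here are purely illustrative.

```python
from tvm import relay

# `mod` and `params` are assumed to come from a frontend importer,
# e.g. relay.frontend.from_mxnet(...).
# Leave selected convolutions (identified by their order in the network) at
# full precision while quantizing the rest of the model.
with relay.quantize.qconfig(skip_conv_layers=[0, 51]):
    quantized = relay.quantize.quantize(mod, params)
```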