[MXNET-359] fix checks on convolution parameters in MKLDNN. #10666
Conversation
Can you add a test case for symmetric padding with length of tuple as 4 ?
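A sketch of what such a test could check, using numpy to model the padding semantics rather than MXNet itself (hypothetical: `pad_hw`, `pad_tlbr`, and the helper arrays are illustrative names, not code from this PR). The idea is that a symmetric 4-element (top, left, bottom, right) spec should produce the same result as the 2-element (h, w) form:

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

# MXNet-style 2-element pad (h, w) = (1, 2), applied symmetrically per axis.
pad_hw = (1, 2)
mx_style = np.pad(x, ((pad_hw[0], pad_hw[0]), (pad_hw[1], pad_hw[1])))

# ONNX-style 4-element pad (top, left, bottom, right) = (1, 2, 1, 2);
# when the padding is symmetric, the first two elements describe it fully.
pad_tlbr = (1, 2, 1, 2)
onnx_style = np.pad(x, ((pad_tlbr[0], pad_tlbr[2]), (pad_tlbr[1], pad_tlbr[3])))

assert mx_style.shape == (6, 8)
assert (mx_style == onnx_style).all()
```

A real test would build two `mx.nd.Convolution` calls (one per padding form) and compare outputs, in the style of the script later in this thread.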
```cpp
@@ -88,26 +91,23 @@ static mkldnn::convolution_backward_data::primitive_desc GetConvBwdData(
   auto weight_md = GetWeightDesc(weights, param.num_group);
   auto out_md = GetMemDesc(output);
   auto engine = CpuEngine::Get()->get_engine();
   CHECK_GE(param.stride.ndim(), 2U);
```
should this be CHECK_EQ ?
After ONNX is fixed, this should be CHECK_EQ. I didn't know whether ONNX would be fixed when I submitted the PR.
Please disregard my comment. I think this change shouldn't depend on whether ONNX is fixed or not. CHECK_GE looks good to make it consistent with existing behavior.
```cpp
@@ -123,16 +123,15 @@ static mkldnn::convolution_backward_weights::primitive_desc GetConvBwdWeights(
   auto weight_md = GetWeightDesc(weights, param.num_group);
   auto out_md = GetMemDesc(output);
   auto engine = CpuEngine::Get()->get_engine();
   CHECK_GE(param.stride.ndim(), 2U);
```
should this be CHECK_EQ ?
```cpp
    strides[0] = param.stride[0];
    strides[1] = param.stride[1];
  } else if (param.stride.ndim() == 1) {
    strides[0] = param.stride[0];
```
Will `param.pad.ndim() == 1` no longer use MKLDNN?
MXNet always assumes two elements in the tuple. In Python, if the input is a single element, it is converted to a two-element tuple, so in practice we never get a stride with one element.
Python will extend a one-element value to a two-element tuple. What about other frontend languages, or someone calling the C APIs to build their model?
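The frontend behavior discussed above can be modeled with a small normalization helper. This is a hypothetical sketch (the function `normalize_tuple` is illustrative, not the actual MXNet frontend code):

```python
def normalize_tuple(value, ndim=2, name="stride"):
    """Extend a scalar or 1-element tuple to an ndim-element tuple,
    mimicking what the Python frontend does before calling the C API."""
    if isinstance(value, int):
        return (value,) * ndim
    value = tuple(value)
    if len(value) == 1:
        return value * ndim
    if len(value) != ndim:
        raise ValueError(f"{name} must have {ndim} elements, got {value}")
    return value

# Other frontends (or direct C API callers) that skip this normalization
# step are exactly the cases the reviewer is worried about.
print(normalize_tuple(1))       # (1, 1)
print(normalize_tuple((2,)))    # (2, 2)
print(normalize_tuple((1, 2)))  # (1, 2)
```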
```cpp
@@ -32,6 +32,12 @@
 namespace mxnet {
 namespace op {

 bool SupportMKLDNNConv(const DeconvolutionParam& params, const NDArray &input) {
   if (params.kernel.ndim() != 2)
```
Do we need to add checks for stride and dilate too?
I think we should have a check in the parameter parser of the MXNet conv operator, so we don't need to check it in the MKLDNN code.
If we are checking that the ndim of stride and dilate is greater than or equal to 2, can we fall back to the default implementation and return false here when the ndim of stride, pad, or dilate is less than 2?
@piiswrong @eric-haibin-lin @ashokei @pengzhao-intel @TaoLv
Do we still want this change if ONNX correctly handles padding?
@eric-haibin-lin yes, we still need this change to make the behavior consistent with or without MKLDNN enabled. We can add this back later, when we add support for raising an error in MXNet conv (without MKLDNN enabled).
I'm not sure about other frontends. What I see is that the MXNet conv operator always assumes a two-element tuple in this case. Ideally, we should fix the tuple when the parameters are parsed.
I ran the following simple script with the code pulled from your branch:

```python
import mxnet as mx

arr = mx.nd.random.uniform(shape=(10, 10, 32, 32))
weight1 = mx.nd.random.uniform(shape=(10, 10, 3, 3))
arr1 = mx.nd.Convolution(data=arr, weight=weight1, no_bias=True, kernel=(3, 3), stride=(1), num_filter=10)
arr2 = mx.nd.Convolution(data=arr, weight=weight1, no_bias=True, kernel=(3, 3), stride=(1, 1), num_filter=10)
print((arr1 == arr2).asnumpy().sum())
```

This outputs 2616.0, while we expect 3000 because the output shape is (10L, 10L, 30L, 1L).

MXNet conv seems to be reading the stride with ndim = 1 correctly here: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/convolution-inl.h#L400. To avoid this inconsistency for now, we can fall back to the default compute if any of pad, stride, or dilate has ndim < 2. Let me know what you think.
src/operator/nn/convolution.cc (outdated)

```cpp
@@ -363,6 +365,9 @@ static void ConvolutionParamParser(nnvm::NodeAttrs* attrs) {
     if (param_.dilate.ndim() == 0) param_.dilate = Shape3(1, 1, 1);
     if (param_.pad.ndim() == 0) param_.pad = Shape3(0, 0, 0);
   }
   CHECK_EQ(param_.kernel.ndim(), param_.stride.ndim());
```
These checks need to have error messages, e.g. "stride must have the same number of dimensions as kernel_size, but kernel_size is set to (x,x,x) while stride is (x,x)".
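A Python analogue of what the requested CHECK_EQ with messages could look like (hypothetical sketch: `check_conv_params` is an illustrative name, not code from this PR; the real checks are C++ `CHECK_EQ` calls in the parameter parser):

```python
def check_conv_params(kernel, stride, dilate, pad):
    """Validate that stride/dilate/pad match the kernel's dimensionality,
    raising an error whose message names both shapes."""
    for name, value in (("stride", stride), ("dilate", dilate), ("pad", pad)):
        if len(value) != len(kernel):
            raise ValueError(
                f"{name} must have the same number of dimensions as kernel_size, "
                f"but kernel_size is set to {kernel} while {name} is {value}")

check_conv_params(kernel=(3, 3), stride=(1, 1), dilate=(1, 1), pad=(0, 0))   # passes
# check_conv_params(kernel=(3, 3), stride=(1,), dilate=(1, 1), pad=(0, 0))   # raises ValueError
```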
…0666) * fix check on tuples of conv. * check params in (de)conv. * rename. * add messages.
Description
As I explained in #10663, there is a mismatch between MXNet and ONNX. This is a temporary fix on the MKLDNN side for the problem: MKLDNN conv follows the behavior of MXNet conv (it always uses the first two elements in the tuple as padding).
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.