-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MKLDNN conv2d kernel added #8451
Conversation
721cf36
to
6216666
Compare
paddle/fluid/framework/op_registry.h
Outdated
@@ -236,11 +236,11 @@ class OpKernelRegistrar : public Registrar { | |||
|
|||
#define USE_CUDA_ONLY_OP(op_type) \ | |||
USE_OP_ITSELF(op_type); \ | |||
USE_OP_DEVICE_KERNEL(op_type, CUDA) | |||
USE_OP_DEVICE_KERNEL(op_type, CUDA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not add ;
at the end, we want user use a macro like
SOME_MACRO();
file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(conv2d, CUDNN);\n") | ||
file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(pool2d, CUDNN);\n") | ||
file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(conv2d_transpose, CUDNN);\n") | ||
op_library(edit_distance_op SRCS edit_distance_op.cc edit_distance_op.cu DEPS math_function) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN)
automatically in #8590, in order to make operators/CMakelists.txt much cleaner.
Then, only one sentence op_library(pool_op DEPS pooling)
will pybind CPU/CUDA/CUDNN/MKLDNN all device kernel.
Thus:
- how about use
conv_mkldnn_op.cc
likesconv_cudnn_op.cu.cc
instead ofconv_op.mn.cc
? - refine these codes after pybind USE_OP_DEVICE_KERNEL(XXX, CUDNN) automatically #8590 finished?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If #8590 doesn't be finished before your small PRs, you can merge your changes at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#8590 is finished and merged now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, as synced with @pzelazko-intel, we will break this PR into some smaller ones.
As for current code, we also had a discussion.
The most important information is that the current implementation may not be the best efficient one, since the format is fixed as nchw
and the transform functions is still under developing.
If anything is missing, @pzelazko-intel please point out.
paddle/fluid/operators/conv_op.cc
Outdated
} else { | ||
library_ = framework::LibraryType::kPlain; | ||
} else if (CanMKLDNNBeUsed(ctx)) { | ||
library_ = framework::LibraryType::kMKLDNN; | ||
} | ||
|
||
std::string data_format = ctx.Attr<std::string>("data_format"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we only make MKLDNN library enabled.
As synced with @pzelazko-intel, we would enable MKLDNN layout next time.
Then we would considerate transform function as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add "TODO" in codes for reminding? Including:
- enable MKLDNN layout
- enable groups
- something more.
Besides, could you not cover the previous commit next time, since we could not find the difference after your updating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm going to add TODOs.
I've covered previous commits, because I wanted commit history to be clear.
Would I have opportunity to squash commits when merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can squash commits when merging your PR.
paddle/fluid/operators/conv_op.mn.cc
Outdated
std::vector<int> dilations = ctx.Attr<std::vector<int>>("dilations"); | ||
int groups = ctx.Attr<int>("groups"); | ||
|
||
PADDLE_ENFORCE(groups == 1, "MKLDNN doesn't support group convolution yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will enable groups later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paddle/fluid/operators/conv_op.mn.cc
Outdated
memory::dims conv_padding = {paddings[0], paddings[1]}; | ||
|
||
auto conv_src_md = memory::desc({conv_src_tz}, memory::data_type::f32, | ||
memory::format::nchw); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format is fixed as nchw
, should support more. We will come back later.
paddle/fluid/operators/conv_op.mn.cc
Outdated
template <typename T> | ||
class ConvOpMkldnnKernel : public paddle::framework::OpKernel<T> { | ||
public: | ||
void Compute(const paddle::framework::ExecutionContext& ctx) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Compute
function is too long. We can think about breaking the code to smaller functions in mkldnn_herlper like cudnn_helper did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look at the answer below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
paddle/fluid/operators/conv_op.mn.cc
Outdated
|
||
// push op to stream and wait MKLDNN until it's executed | ||
std::vector<primitive> pipeline{conv_prim}; | ||
stream(stream::kind::eager).submit(pipeline).wait(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only conv_pd
is saved to context, I think we can save more, like engine
, primitives
, stream
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to refactor it as soon as we know how to handle data transferring between forward and backward in parallel mode (ParallelDo OP).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactoring is done
paddle/fluid/operators/conv_op.mn.cc
Outdated
|
||
// Get an unique name from "argument" name of "Output" variable | ||
// This name will be used as key when saving info into device context | ||
const std::string key = ctx.op().Output("Output"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not so sure if this output name is unique, especially under the scope filed.
I had a discussion with @QiJune before and did not get a formal conclusion at that time.
Maybe Baidu friends can give a better answer. This is just my concern and reminder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that conv_cudnn_op.cu.cc
also use Output
.
Paddle/paddle/fluid/operators/conv_cudnn_op.cu.cc
Lines 35 to 42 in fc37482
class CUDNNConvOpKernel : public framework::OpKernel<T> { | |
public: | |
void Compute(const framework::ExecutionContext& ctx) const override { | |
PADDLE_ENFORCE(platform::is_gpu_place(ctx.GetPlace()), | |
"It must use CUDAPlace."); | |
auto* input = ctx.Input<Tensor>("Input"); | |
auto* filter = ctx.Input<Tensor>("Filter"); | |
auto* output = ctx.Output<Tensor>("Output"); |
Thus, why
conv_mkldnn_op.cc
should not use the same output name?
'use_cudnn': self.use_cudnn, | ||
'data_format': 'AnyLayout' # TODO(dzhwinter) : should be fix latter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my perspective, it's not appropriate we remove this TODO.
I suggest we remain it and let the owner to fix it, since we may miss some background information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not remove this line, it's been moved up.
7882699
to
edcf89a
Compare
Now in this PR I'm introducing only conv2d OP MKLDNN kernel. |
edcf89a
to
0cd6662
Compare
LGTM on following files:
@jacquesqiao Can you help review following files:
@QiJune Can you help review following files:
@tensor-tang Can you help review following files:
|
9a3ecb8
to
a4ab82d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My ARs for conv_mkldnn_op.cc and conv_op.cc
Just a reminder, as discussed before, the Compute
functions are too large. Please do not forget it, since it's pretty important.
// TODO(pzelazko-intel) enable group convolution | ||
PADDLE_ENFORCE(groups == 1, "MKLDNN doesn't support group convolution yet"); | ||
PADDLE_ENFORCE( | ||
dilations.size() == 2 && dilations[0] == 1 && dilations[1] == 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dilation is also supported with MKLDNN conv, you can add one more TODO later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
int groups = ctx.Attr<int>("groups"); | ||
|
||
// TODO(pzelazko-intel) enable group convolution | ||
PADDLE_ENFORCE(groups == 1, "MKLDNN doesn't support group convolution yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MKLDNN doesn't support dilation in convolution yet
I think this error message is not clear enough, as we know MKL-DNN itself supports groups.
This message would misleading Paddle team and users.
It's that we did not enable it on paddle yet.
As well as the below dilations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
a4ab82d
to
8090479
Compare
8090479
to
0e3f110
Compare
paddle/fluid/framework/operator.cc
Outdated
@@ -600,6 +600,23 @@ proto::VarType::Type OperatorWithKernel::IndicateDataType( | |||
return static_cast<proto::VarType::Type>(data_type); | |||
} | |||
|
|||
bool OperatorWithKernel::CanCUDNNBeUsed(const ExecutionContext& ctx) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to make CanCUDNNBeUsed and CanMKLDNNBeUsed two global functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacquesqiao where would you propose to place these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pzelazko-intel according to this document https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/mkl/mkldnn_fluid.md#mkldnn_helper, we can put the interface there and add a mkldnn_helper.cc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
fb95caa
to
dad7a02
Compare
Please answer reviewers’ every comment. If you are to follow the comment, please write “Done”; please give a reason otherwise. See code-review. |
@luotao1 refactoring has been completed. |
@luotao1 DeviceContext part looks good to me. |
framework part looks good to me, thanks! @pzelazko-intel |
device_contexts_.emplace(places[i], | ||
new platform::CPUDeviceContext( | ||
boost::get<platform::CPUPlace>(places[i]))); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I have a little question here.
When PADDLE_WITH_MKLDNN
is enabled, we will not have platform::CPUDeviceContext
anyway .
Is that fine with you @jacquesqiao @QiJune ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tensor-tang It seems that if PADDLE_WITH_GPU
is enabled, there will be both CPUDeviceContext
and CUDADeviceContext
.
So, I think MKLDNNDeviceContext and CPUDeviceContext should be coexist.
Could you provide an example that has a two FC in the network, which one FC is CPU, and the other FC is MKLDNN? Just mnist demo will be fine.
We can see if "MKLDNN" is compatible with "CPU".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, MKLDNNDeviceContext
is inherited from CPUDeviceContext
now
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/platform/device_context.h#L113. So functionally, it can pass the CI on this version.
But from my perspective, I thought CPUDeviceContext
should be always available no matter which third-party is added, not only MKLDNN here. Since we can not guarantee all the ops paddle supported are also supported by the third-party library. If this third-party context is not inherited from cpu context, it would be a problem.
So I just want to hear you voice. I am not sure is that proper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think so.
Since MKLDNNDeviceContext
is inherited from CPUDeviceContext
, there will be no problem.
I have not think out a better choice, we can move on first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, Thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for MKLDNN part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for @pzelazko-intel work, and thanks @jacquesqiao @QiJune @tensor-tang review.
MKLDNN conv2d and pool2d OP kernels can be enabled with
use_mkldnn
OP flag - just like currenly presentuse_cudnn
flag. It's set toTrue
by default.use_cudnn
flag has figher priority.Beside unit tests, we validated these kernels by running training and interference on MNIST dataset and comparing results with caffe library.