Update cudnn convolution kernel #10440
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
void Compute(user_op::KernelComputeContext* ctx, user_op::OpKernelState*,
             const user_op::OpKernelCache* cache) const override {
  // process context data
  auto input = ctx->Tensor4ArgNameAndIndex("in", 0);
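Immutable objects such as `in` should be declared with `const auto*`, and mutable objects such as `tmp_buffer` with `auto*`. Please apply the same change to all similar places below.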
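To illustrate the convention requested above, here is a minimal self-contained sketch. `MockTensor`, `MockContext`, and `DoubleAndSum` are hypothetical stand-ins invented for this example; they are not OneFlow types and do not reflect the real `user_op::Tensor` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor handle (assumption, not OneFlow's API).
struct MockTensor {
  std::vector<float> data;
  const float* dptr() const { return data.data(); }  // read-only access
  float* mut_dptr() { return data.data(); }           // writable access
};

// Hypothetical context that hands out raw tensor pointers.
struct MockContext {
  MockTensor in{{1.f, 2.f, 3.f}};
  MockTensor tmp_buffer{{0.f, 0.f, 0.f}};
  MockTensor* In() { return &in; }
  MockTensor* TmpBuffer() { return &tmp_buffer; }
};

// Writes 2 * in[i] into tmp_buffer[i] and returns the sum of the written values.
float DoubleAndSum(MockContext* ctx) {
  const auto* in = ctx->In();           // only read: the pointee is const
  auto* tmp_buffer = ctx->TmpBuffer();  // written below: stays mutable
  float sum = 0.f;
  for (std::size_t i = 0; i < in->data.size(); ++i) {
    tmp_buffer->mut_dptr()[i] = in->dptr()[i] * 2.f;
    sum += tmp_buffer->dptr()[i];
  }
  // in->mut_dptr();  // would not compile: `in` points to a const MockTensor
  return sum;
}
```

With `const auto*`, an accidental write to the input becomes a compile error instead of a silent bug, while `auto*` signals to the reader that the buffer is meant to be modified.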
.SetIsMatchedHob(user_op::HobDeviceType() == DeviceType::kCUDA \
                 && user_op::HobEnvBool("ONEFLOW_KERNEL_ENABLE_CUDNN_V8", false)) \
.SetInferTmpSizeFn([](user_op::InferContext* ctx) -> size_t { \
  auto& input = ctx->InputTensorDesc("in", 0); \
Add `const` here; the same applies below.
 private:
  void Compute(user_op::KernelComputeContext* ctx) const override {
    auto input = ctx->Tensor4ArgNameAndIndex("x", 0);
Same as above.
.SetIsMatchedHob(user_op::HobDeviceType() == DeviceType::kCUDA
                 && user_op::HobEnvBool("ONEFLOW_KERNEL_ENABLE_CUDNN_V8", false))
.SetInferTmpSizeFn([](user_op::InferContext* ctx) -> size_t {
  auto& input = ctx->InputTensorDesc("x", 0);
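Same as above.
In addition, `auto` is used in many other places. Please go through them and add `const` wherever it is possible. Parameters in function declarations that are never modified inside the function body should also be taken as `const&`.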
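As a rough sketch of the `const&` suggestion, assuming a simplified descriptor type: `MockTensorDesc`, `MockInferContext`, `ByteSize`, and `InferTmpSize` are hypothetical names made up for this example, and the size formula is illustrative only, not the workspace computation used in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tensor descriptor (assumption, not OneFlow's TensorDesc).
struct MockTensorDesc {
  std::vector<int64_t> dims;
  std::size_t elem_size;
};

// Hypothetical infer context exposing a read-only descriptor.
struct MockInferContext {
  MockTensorDesc in;
  const MockTensorDesc& InputTensorDesc() const { return in; }
};

// The descriptor is only read, so it is passed by const reference:
// no copy is made and the body cannot modify the caller's object.
std::size_t ByteSize(const MockTensorDesc& desc) {
  std::size_t count = 1;
  for (const auto d : desc.dims) { count *= static_cast<std::size_t>(d); }
  return count * desc.elem_size;
}

// Illustrative only: reserves as many bytes as the input occupies.
std::size_t InferTmpSize(const MockInferContext* ctx) {
  const auto& input = ctx->InputTensorDesc();  // explicit const documents intent
  return ByteSize(input);
}
```

When the accessor already returns a const reference, plain `auto&` deduces a const reference as well, so spelling out `const auto&` mainly documents intent and keeps the code correct if the accessor's return type ever changes.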
No description provided.