Use BroadcastKernel and ReduceKernel to optimize expand and expand_grad. #49419

Merged: 3 commits, Jan 3, 2023

@Xreki Xreki commented Dec 28, 2022

PR types

Performance optimization

PR changes

OPs

Describe

Rewrite the expand kernels using BroadcastKernel and ReduceKernel. The performance improvement measured in the OP Benchmark CI is as follows:

| OP configuration | Before | After | Time change | Performance gain (before/after - 1) |
|---|---|---|---|---|
| Forward, float32, x.shape=[16, 1785, 1], shape=[1785, 2] | 0.0037038 s | 0.0036177 s | -2.32464% | 2.38% |
| Forward, float32, x.shape=[16, 5, 1, 1], shape=[5, 128, 128] | 0.0226040 s | 0.0096950 s | -57.10936% | 1.33x |
| Forward, float32, x.shape=[32, 807, 1], shape=[807, 807] | 0.1061900 s | 0.0983360 s | -7.39618% | 7.99% |
| Forward + backward, float32, x.shape=[16, 1785, 1], shape=[1785, 2] | 0.0112115 s | 0.0114307 s | 1.955515% | -1.92% |
| Forward + backward, float32, x.shape=[16, 5, 1, 1], shape=[5, 128, 128] | 5.3964660 s | 0.0398334 s | -99.26186% | 134x |
| Forward + backward, float32, x.shape=[32, 807, 1], shape=[807, 807] | 0.9756090 s | 0.4280108 s | -56.12886% | 1.28x |
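
For context, expand is a broadcast, and its gradient is a sum over the broadcast positions, which is presumably why the forward pass maps onto BroadcastKernel and the backward pass onto ReduceKernel. A minimal pure-Python sketch of that relationship follows. It is not Paddle code: the function names and the flat row-major layout are assumptions made for illustration, and it assumes the target shape has at least as many dimensions as the input.

```python
from itertools import product
from math import prod


def _input_strides(x_shape, out_shape):
    """Row-major strides of x, left-padded to out's rank; 0 on expanded dims."""
    if len(out_shape) < len(x_shape):
        raise ValueError("out_shape must have at least as many dims as x_shape")
    padded = [1] * (len(out_shape) - len(x_shape)) + list(x_shape)
    for xd, od in zip(padded, out_shape):
        if xd != od and xd != 1:
            raise ValueError(f"cannot expand dim of size {xd} to {od}")
    strides, step = [], 1
    for xd in reversed(padded):
        # A size-1 dim is read with stride 0, so every output index reuses it.
        strides.append(0 if xd == 1 else step)
        step *= xd
    return strides[::-1]


def _offsets(x_shape, out_shape):
    """For each output element (row-major), yield its source offset in x."""
    strides = _input_strides(x_shape, out_shape)
    for idx in product(*(range(d) for d in out_shape)):
        yield sum(i * s for i, s in zip(idx, strides))


def expand(x, x_shape, out_shape):
    """Forward: a broadcast read of x into the larger output."""
    return [x[off] for off in _offsets(x_shape, out_shape)]


def expand_grad(dout, x_shape, out_shape):
    """Backward: sum the output gradient over all positions that read x[off]."""
    dx = [0.0] * prod(x_shape)
    for k, off in enumerate(_offsets(x_shape, out_shape)):
        dx[off] += dout[k]
    return dx


y = expand([1.0, 2.0], [2, 1], [2, 3])     # [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
dx = expand_grad([1.0] * 6, [2, 1], [2, 3])  # [3.0, 3.0]
```

Each input element receives the sum of the gradients of every output element copied from it, so the backward pass is a reduction along exactly the axes the forward pass expanded.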

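Other notes

  • ExpandKernel still contains the logic for computing the output shape. In principle, shape inference (InferShape) should be handled by the ExpandInferMeta function, but in static graph mode the output shape still contains -1 after ExpandInferMeta has run. I personally consider this a bug in the ExpandInferMeta implementation. Because the compile-time InferShape logic for static graphs is not yet fully understood, this PR leaves that part unchanged for now.
  • ExpandAsKernel can reuse the ExpandKernel implementation; that change will come in a follow-up PR.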
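
To illustrate what resolving the output shape involves, here is a hedged pure-Python sketch. It is not the logic in ExpandKernel or ExpandInferMeta: the function name `resolve_expand_shape` is hypothetical, and it assumes that -1 in `shape` means "keep the input dimension" and that `shape` has at least as many entries as the input has dimensions (the benchmark configurations above are written differently and are not modeled here).

```python
def resolve_expand_shape(x_shape, shape):
    """Return the concrete output shape, replacing -1 with the input dim."""
    extra = len(shape) - len(x_shape)
    if extra < 0:
        raise ValueError("shape must have at least as many dims as x_shape")
    # Left-pad the input shape with 1s so both shapes align on the right.
    padded = [1] * extra + list(x_shape)
    out = []
    for i, (xd, sd) in enumerate(zip(padded, shape)):
        if sd == -1:
            # Assumption: -1 is only meaningful where an input dim exists.
            if i < extra:
                raise ValueError("-1 is not allowed for newly added leading dims")
            out.append(xd)
        elif xd == sd or xd == 1:
            out.append(sd)
        else:
            raise ValueError(f"cannot expand dim {i} from {xd} to {sd}")
    return out


resolved = resolve_expand_shape([5, 1, 1], [16, -1, 128, 128])  # [16, 5, 128, 128]
```

If a function like this ran to completion during shape inference, no -1 would remain in the output shape, which is the behavior the first note expects from ExpandInferMeta.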

paddle-bot bot commented Dec 28, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@Xreki Xreki requested a review from JamesLim-sy January 3, 2023 06:01
@JamesLim-sy JamesLim-sy left a comment

LGTM

@Xreki Xreki merged commit c460402 into PaddlePaddle:develop Jan 3, 2023
@Xreki Xreki deleted the op/opt_expand branch January 3, 2023 06:19