optimize index computation #33909

SunNy820828449 · 2021-07-01T09:37:06Z

PR types

Performance optimization

PR changes

OPs

Describe

Optimize the index computation in RollCudaKernel (paddle/fluid/operators/roll_op.cu).

Paddle vs PyTorch

Case	Paddle (old)	PyTorch	Paddle (this PR)
shape=(1000,2000) axis=(1) shift=(5)	avg ~ 25.5 us	avg ~ 24 us	avg ~ 23.75 us
shape=(1000,2000) axis=(0) shift=(5)	avg ~ 24.9 us	avg ~ 24 us	avg ~ 22.8 us
shape=(1000,2000) axis=(0,1) shift=(5,5)	avg ~ 37.7 us	avg ~ 48 us	avg ~ 31.4 us

paddle-bot-old · 2021-07-01T09:37:10Z

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.


Xreki · 2021-07-05T05:58:01Z

paddle/fluid/operators/roll_op.cu

-    dim_idx = (idx / strides[i]) % sizes[i];
-    dim_idx_shift = (dim_idx + shifts[i]) % sizes[i];
-    output_idx = output_idx + (dim_idx_shift - dim_idx) * strides[i];
+    dim_idx = (idx / strides[i]) % sizes[i] + shifts[i];


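Variable names should reflect what they actually represent. This should be the original dim_idx_shift, and the temporary variable dim_idx is no longer needed and should be removed.

It is not dim_idx_shift; it is an estimate of the new dim_idx position.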
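To make the exchange above concrete, here is a minimal host-side C++ sketch (not the CUDA kernel itself) that compares the removed two-modulo form with one way the single-modulo form could be completed. Only the line `dim_idx = (idx / strides[i]) % sizes[i] + shifts[i];` comes from the diff; the compare-and-wrap branch, the function names, and the precondition `0 <= shifts[i] < sizes[i]` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Removed form from the diff: two modulo operations per dimension.
int64_t RollIndexTwoMod(int64_t idx, const std::vector<int64_t>& shifts,
                        const std::vector<int64_t>& strides,
                        const std::vector<int64_t>& sizes) {
  int64_t output_idx = idx;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    int64_t dim_idx = (idx / strides[i]) % sizes[i];
    int64_t dim_idx_shift = (dim_idx + shifts[i]) % sizes[i];
    output_idx += (dim_idx_shift - dim_idx) * strides[i];
  }
  return output_idx;
}

// Hypothetical completion of the new form: one modulo plus a comparison.
// Assumes each shift was already normalized into [0, sizes[i]), so the
// estimated position can overshoot the dimension by at most one wrap.
int64_t RollIndexOneMod(int64_t idx, const std::vector<int64_t>& shifts,
                        const std::vector<int64_t>& strides,
                        const std::vector<int64_t>& sizes) {
  int64_t output_idx = idx;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    assert(shifts[i] >= 0 && shifts[i] < sizes[i]);
    // Estimate of the new position along dimension i (may be >= sizes[i]).
    int64_t dim_idx = (idx / strides[i]) % sizes[i] + shifts[i];
    if (dim_idx >= sizes[i]) {
      // Wrapped past the end: move forward by shift, back by one full extent.
      output_idx += (shifts[i] - sizes[i]) * strides[i];
    } else {
      output_idx += shifts[i] * strides[i];
    }
  }
  return output_idx;
}
```

For the benchmark shape (1000, 2000) with row-major strides (2000, 1) and shifts (5, 5), both functions map input index 0 to output index 10005 and the last index 1999999 to 8004. The sketch only illustrates why the second modulo can be replaced by a comparison under the stated precondition; it says nothing about the measured timings.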

Xreki · 2021-07-05T05:59:28Z

paddle/fluid/operators/roll_op.cu

@@ -40,9 +40,12 @@ __global__ void RollCudaKernel(const T* input, T* output, int64_t N,

 #pragma unroll Rank


Wouldn't a plain #pragma unroll be enough here?

Done, thanks.


Xreki

LGTM

optimize index computation

673a2ae

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

08ddf3c

… roll_optimize_kernel

Xreki reviewed Jul 5, 2021


SunNy820828449 added 3 commits July 6, 2021 05:05

fix review

52e7ef1

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

3bbb80f

… roll_optimize_kernel

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

2332646

… roll_optimize_kernel

Xreki approved these changes Jul 7, 2021


Xreki merged commit d128c28 into PaddlePaddle:develop Jul 7, 2021
