optimize index computation #33909

Merged: 5 commits, Jul 7, 2021

Conversation

SunNy820828449
Contributor

PR types

Performance optimization

PR changes

OPs

Describe

Optimize the index computation in the roll op CUDA kernel (RollCudaKernel).

Paddle vs PyTorch

| Configuration | Paddle-Old | PyTorch | Paddle |
| --- | --- | --- | --- |
| shape=(1000,2000), axis=(1), shift=(5) | avg ~ 25.5 us | avg ~ 24 us | avg ~ 23.75 us |
| shape=(1000,2000), axis=(0), shift=(5) | avg ~ 24.9 us | avg ~ 24 us | avg ~ 22.8 us |
| shape=(1000,2000), axis=(0,1), shift=(5,5) | avg ~ 37.7 us | avg ~ 48 us | avg ~ 31.4 us |


paddle-bot-old bot commented Jul 1, 2021

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

- dim_idx = (idx / strides[i]) % sizes[i];
- dim_idx_shift = (dim_idx + shifts[i]) % sizes[i];
- output_idx = output_idx + (dim_idx_shift - dim_idx) * strides[i];
+ dim_idx = (idx / strides[i]) % sizes[i] + shifts[i];
Contributor

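Variable names should reflect what they actually represent. This one should be the original dim_idx_shift, and the temporary variable dim_idx is no longer needed and should be removed.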

Contributor Author

It is not dim_idx_shift; it is a preliminary estimate of the new dim_idx position.
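To make the exchange above concrete, here is a minimal host-side C++ sketch of the two forms of the index computation. `RollIndexOld` copies the three-statement form quoted above. `RollIndexNew` starts from the single-statement form, but the wrap-around branch that follows it is an assumption, since this excerpt does not show the rest of the kernel. Both function names are hypothetical, and the sketch assumes `0 <= shifts[i] < sizes[i]`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Three-statement form quoted in the review: two modulo operations per dimension.
int64_t RollIndexOld(int64_t idx, const std::vector<int64_t>& shifts,
                     const std::vector<int64_t>& strides,
                     const std::vector<int64_t>& sizes) {
  int64_t output_idx = idx;
  for (std::size_t i = 0; i < shifts.size(); ++i) {
    int64_t dim_idx = (idx / strides[i]) % sizes[i];
    int64_t dim_idx_shift = (dim_idx + shifts[i]) % sizes[i];
    output_idx = output_idx + (dim_idx_shift - dim_idx) * strides[i];
  }
  return output_idx;
}

// Single-statement form quoted in the review, followed by an ASSUMED branch.
// dim_idx here is the shifted position before wrapping (the "estimate" the
// author describes), so one comparison can stand in for the second modulo.
int64_t RollIndexNew(int64_t idx, const std::vector<int64_t>& shifts,
                     const std::vector<int64_t>& strides,
                     const std::vector<int64_t>& sizes) {
  int64_t output_idx = idx;
  for (std::size_t i = 0; i < shifts.size(); ++i) {
    int64_t dim_idx = (idx / strides[i]) % sizes[i] + shifts[i];
    if (dim_idx >= sizes[i]) {
      // Shifted past the end of this dimension: wrap around by sizes[i].
      output_idx += (shifts[i] - sizes[i]) * strides[i];
    } else {
      output_idx += shifts[i] * strides[i];
    }
  }
  return output_idx;
}
```

For a 4x5 row-major tensor (strides 5 and 1) rolled by 1 along axis 0 and by 2 along axis 1, both functions map flat index 0 to 7 and flat index 19 to 1. The design point under discussion is that the unwrapped sum only needs a comparison against `sizes[i]`, which is why the author describes it as an estimate of the new position and not as the old `dim_idx_shift`.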

@@ -40,9 +40,12 @@ __global__ void RollCudaKernel(const T* input, T* output, int64_t N,

#pragma unroll Rank
Contributor

Writing just #pragma unroll should be enough here, right?

Contributor Author

Done, thanks.

paddle/fluid/operators/roll_op.cu
Contributor

Xreki left a comment

LGTM

@Xreki Xreki merged commit d128c28 into PaddlePaddle:develop Jul 7, 2021