Add gru_unit_op #4443
Conversation
Force-pushed from 9e47ce0 to 2f66874
paddle/operators/gru_unit_op.h
Outdated
Eigen::array<int, 2> extents({{batch_size, frame_size}});
Eigen::array<int, 2> u_offsets({{0, 0}});
g.slice(u_offsets, extents).device(place) =
    g.slice(u_offsets, extents).sigmoid();
Many tutorials fix the Sigmoid activation for the update and reset gates and the Tanh activation for the memory in the GRU, but other activations such as ReLU can also be used. So the activation function should not be hard-coded here.
Done. Refined to support multiple activation types.
paddle/operators/gru_unit_op.h
Outdated
auto u = g.slice(u_offsets, extents);  // update gate
Eigen::array<int, 2> r_offsets({{0, frame_size}});
g.slice(r_offsets, extents).device(place) =
    g.slice(r_offsets, extents).sigmoid();
Same as above. If you want to fix it in a later PR, please add a TODO comment.
Done. Refined to support multiple activation types.
paddle/operators/gru_unit_op.h
Outdated
Eigen::array<int, 2> c_offsets({{0, frame_size * 2}});
g.slice(c_offsets, extents).device(place) =
    g.slice(c_offsets, extents).tanh();
Same as above, for the tanh activation.
Done. Refined to support multiple activation types.
paddle/operators/gru_unit_op.cc
Outdated
.AsIntermediate();
AddOutput("hidden",
          "(Tensor) The GRU hidden state of the current time step "
          "with shape [batch_size, frame_size].");
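For reference, the suggestion means capitalizing the input/output names passed to the maker, e.g. (illustrative fragment, not the merged code):

```cpp
AddOutput("Hidden",
          "(Tensor) The GRU hidden state of the current time step "
          "with shape [batch_size, frame_size].");
```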
Should follow the naming convention: CamelCase.
Done.
Force-pushed from 2f66874 to ae1b29a
Resolves #4213
Rewrites GruStepLayer with Eigen in the new framework, keeping the parameter format consistent with the original GruStepLayer; some optimizations such as AVX are left for later. https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/gserver/layers/GruStepLayer.cpp