
Add gru_unit_op #4443

Merged: 5 commits merged into PaddlePaddle:develop on Oct 13, 2017
Conversation

@guoshengCS (Contributor) commented on Sep 27, 2017:

Resolves #4213

Rewrites GruStepLayer (https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/gserver/layers/GruStepLayer.cpp) with Eigen in the new framework, keeping the parameter format consistent with the original GruStepLayer; some optimizations such as AVX are left to be done.
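For reference, the conventional GRU unit computation (Cho et al., 2014) that the gate slices in the snippets below correspond to; this is a sketch of the standard formulation with biases omitted, and the exact interpolation convention used by the op itself may differ:

```latex
% Standard GRU unit (Cho et al., 2014); biases omitted, x_* are the
% input projections. A reference sketch only -- the interpolation
% convention in the op itself may differ.
\begin{aligned}
u_t &= \sigma\!\left(x_u + U_u h_{t-1}\right)            && \text{update gate} \\
r_t &= \sigma\!\left(x_r + U_r h_{t-1}\right)            && \text{reset gate} \\
\tilde{h}_t &= \tanh\!\left(x_c + U_c\,(r_t \odot h_{t-1})\right) && \text{candidate} \\
h_t &= (1 - u_t) \odot h_{t-1} + u_t \odot \tilde{h}_t   && \text{output}
\end{aligned}
```

In the snippets below, the gate buffer g stores the update gate, reset gate, and candidate pre-activations concatenated along the feature dimension, frame_size columns each.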

```cpp
// The gate buffer g has shape [batch_size, 3 * frame_size]; the first
// frame_size columns hold the update gate, activated in place.
Eigen::array<int, 2> extents({{batch_size, frame_size}});
Eigen::array<int, 2> u_offsets({{0, 0}});
g.slice(u_offsets, extents).device(place) =
    g.slice(u_offsets, extents).sigmoid();
```
A reviewer (Contributor) commented:
Many tutorials fix the sigmoid activation for the update and reset gates and the tanh activation for the candidate memory in the GRU, but other activations such as ReLU can also be used. So the activation function should not be hardcoded here.

@guoshengCS (Contributor, Author) replied on Oct 12, 2017:

Done. Refined to support multiple activation types.
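For illustration, a minimal sketch of what runtime activation dispatch over an Eigen slice could look like; the ActType enum and ActCompute helper below are hypothetical names for this sketch, not the actual Paddle API:

```cpp
// Hypothetical sketch: select the activation at runtime instead of
// hardcoding sigmoid/tanh. ActType and ActCompute are illustrative
// names, not the actual Paddle API.
enum class ActType { kIdentity, kSigmoid, kTanh, kRelu };

template <typename Device, typename Expr>
void ActCompute(ActType act, const Device& place, Expr slice) {
  switch (act) {
    case ActType::kIdentity:
      break;  // leave the pre-activation values unchanged
    case ActType::kSigmoid:
      slice.device(place) = slice.sigmoid();
      break;
    case ActType::kTanh:
      slice.device(place) = slice.tanh();
      break;
    case ActType::kRelu:
      slice.device(place) = slice.cwiseMax(static_cast<float>(0));
      break;
  }
}
```

With such a helper, each hardcoded call like g.slice(u_offsets, extents).sigmoid() would become ActCompute(gate_act, place, g.slice(u_offsets, extents)).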

```cpp
auto u = g.slice(u_offsets, extents);  // update gate
// The reset gate occupies the second frame_size columns of g.
Eigen::array<int, 2> r_offsets({{0, frame_size}});
g.slice(r_offsets, extents).device(place) =
    g.slice(r_offsets, extents).sigmoid();
```
A reviewer (Contributor) commented:

Same as above; if you want to fix it in a follow-up PR, please add a TODO comment.

@guoshengCS (Contributor, Author) replied on Oct 12, 2017:

Done. Refined to support multiple activation types.


```cpp
// The candidate hidden state occupies the last frame_size columns of g.
Eigen::array<int, 2> c_offsets({{0, frame_size * 2}});
g.slice(c_offsets, extents).device(place) =
    g.slice(c_offsets, extents).tanh();
```
A reviewer (Contributor) commented:

Same as above, for the tanh activation.

@guoshengCS (Contributor, Author) replied:

Done. Refined to support multiple activation types.
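And a sketch of how the selectable activations might be surfaced in the op's proto maker; the attribute names and integer enum encoding below are assumptions based on this thread, not necessarily the merged code:

```cpp
// Hypothetical attribute declarations in the OpProtoAndCheckerMaker;
// names and the integer enum encoding are illustrative.
AddAttr<int>("gate_activation",
             "(int, default: sigmoid) The activation type of the update "
             "and reset gates.");
AddAttr<int>("activation",
             "(int, default: tanh) The activation type of the candidate "
             "hidden state.");
```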

```cpp
    .AsIntermediate();
AddOutput("hidden",
          "(Tensor) The GRU hidden state of the current time step "
          "with shape [batch_size, frame_size].");
```
@guoshengCS (Contributor, Author), replying to a review comment on this snippet:

Done.

@guoshengCS merged commit a0af1ee into PaddlePaddle:develop on Oct 13, 2017.