A complete rework of our recurrent layers, making them more similar to their PyTorch counterparts.
This is in line with the proposal in #1365 and should allow hooking into the cuDNN machinery (future PR).
Hopefully, this ends the infinite source of trouble that the recurrent layers have been.
`Recur` is no more. Mutating its internal state was a source of problems for AD (explicit differentiation for RNN gives wrong results, #2185).

`RNNCell` is exported and takes care of the minimal recursion step, i.e. a single time step: `cell(x, h)`, where
- `x` can be of size `in` or `in x batch_size`
- `h` can be of size `out` or `out x batch_size`
- returns `hnew` of size `out` or `out x batch_size`
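A single step under the proposed API might look like the sketch below. The `in => out` constructor form is assumed to carry over from the existing dense/convolutional layers; exact keyword arguments and initial-state defaults may differ.

```julia
using Flux

# hypothetical sizes for illustration
in, out, batch_size = 3, 5, 4

cell = RNNCell(in => out)            # minimal recursion step, one time step

x = rand(Float32, in, batch_size)    # input for a single time step
h = zeros(Float32, out, batch_size)  # hidden state

hnew = cell(x, h)                    # size (out, batch_size)
```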
`RNN` instead takes in a (batched) sequence and a (batched) hidden state and returns the hidden states for the whole sequence: `rnn(x, h)`, where
- `x` can be of size `in x len` or `in x len x batch_size`
- `h` can be of size `out` or `out x batch_size`
- returns `hnew` of size `out x len` or `out x len x batch_size`
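Processing a whole sequence could then look like this (a sketch under the same assumptions as above; the sequence length sits in the second dimension):

```julia
using Flux

# hypothetical sizes for illustration
in, out, len, batch_size = 3, 5, 7, 4

rnn = RNN(in => out)

x = rand(Float32, in, len, batch_size)  # whole batched sequence
h = zeros(Float32, out, batch_size)     # initial hidden state

hnew = rnn(x, h)                        # size (out, len, batch_size)
```

Since the layer no longer mutates internal state, the same `rnn` object can be applied to different sequences and hidden states without any `reset!` bookkeeping.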
`LSTM` and `GRU` are similarly changed.

Close #2185, close #2341, close #2258, close #1547, close #807, close #1329.
Related to #1678.
PR Checklist
- [ ] `LSTM` and `GRU`
- [ ] `reset!`
- [ ] cuDNN (future PR)
- [ ] `num_layers` argument for stacked RNNs (future PR)