Add unified RNN APIs #26588
Conversation
test=develop
Thanks for your contribution!

LGTM
python/paddle/nn/layer/rnn.py
Outdated
self.is_reverse = is_reverse
self.time_major = time_major

def forward(self, inputs, initial_states=None, sequence_length=None):
Should **kwargs be added here?

Yes, this can be added. Done.
outputs, final_states = F.birnn(self.cell_fw, self.cell_bw, inputs,
                                initial_states, sequence_length,
                                self.time_major)
return outputs, final_states
Does the final_states returned by F.birnn here need to concat the bidirectional contents the way outputs does?

F.birnn is a relatively low-level interface. It cannot assume that the two cells have the same number of layers or the same hidden_size, nor that the two cells are of the same type (for example, the forward cell could be a SimpleRNN while the backward one is an LSTM), so this cannot be done.
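To illustrate the point with a hypothetical sketch (pure Python, names illustrative, not Paddle's API): a generic bidirectional wrapper must hand back the two final states as a pair, because the cells may produce differently shaped state structures.

```python
# Hypothetical sketch (not Paddle code): final states of two mismatched cells.
# A SimpleRNN-style cell yields a single hidden vector h; an LSTM-style cell
# yields an (h, c) tuple, possibly with a different hidden size.
fw_state = [0.0] * 32                   # forward cell: h of size 32
bw_state = ([0.0] * 64, [0.0] * 64)     # backward cell: (h, c), size 64

# The only safe generic contract is to return the pair untouched;
# concatenating a bare h with an (h, c) tuple of another size is ill-defined.
final_states = (fw_state, bw_state)
assert len(final_states) == 2
assert isinstance(bw_state, tuple) and len(bw_state[0]) != len(fw_state)
```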
python/paddle/nn/layer/rnn.py
Outdated
outputs, final_states = F.birnn(self.cell_fw, self.cell_bw, inputs,
                                initial_states, sequence_length,
                                self.time_major)
Should **kwargs be added here as well?

Yes, this can be added. Done.
1. Please submit the Chinese and English documentation together for review.
2. Please attach preview screenshots.
python/paddle/fluid/layers/rnn.py
Outdated
**kwargs: Additional keyword arguments. Arguments passed to `cell.call`.

Returns:
    outputs (Tensor): A (possibly nested structure of) tensor variable[s],
"variable" can be deleted.

Deleted.
cell_bw = LSTMCell(16, 32)
inputs = paddle.rand((2, 23, 16))
outputs, final_states = paddle.nn.functional.birnn(cell_fw, cell_bw, inputs)
1. Example inputs generally should not be randomly generated; a concrete example is preferable.
2. Please add comments showing the exact output.

Even without random generation, the output cannot be guaranteed.

So for Layers that hold parameters, we never document concrete inputs and outputs.
Please refer to `Finding Structure in Time
<https://crl.ucsd.edu/~elman/Papers/fsit.pdf>`_ for more details.
The default parameter initialization method needs to be documented, because the RNN family's default initialization differs from ParamAttr's default. Same below.

How do they differ?

Do you mean the default Uniform(-std, std) initialization?

Yes. ParamAttr defaults to Xavier, I believe; a user who follows this documentation to the ParamAttr description would mistakenly assume the RNN is also Xavier-initialized.
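A sketch of the convention under discussion, taken as an assumption from the comments above rather than verified Paddle source: RNN-family weights default to Uniform(-std, std) with std = 1 / sqrt(hidden_size).

```python
import math
import random

# Assumption from the review thread (not verified against Paddle source):
# RNN-family layers default to Uniform(-std, std) with
# std = 1 / sqrt(hidden_size), unlike ParamAttr's Xavier default.
hidden_size = 32
std = 1.0 / math.sqrt(hidden_size)
weight_row = [random.uniform(-std, std) for _ in range(hidden_size)]
assert all(-std <= w <= std for w in weight_row)
```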
python/paddle/nn/layer/rnn.py
Outdated
def forward(self, inputs, states=None):
    r"""
    Given the input and previous atate, compute the output and update state.
typo: `state`

done
python/paddle/nn/layer/rnn.py
Outdated
hidden_size (int): The hidden size.
nonlinearity (str): The activation in the SimpleRNN cell. It can be
    `tanh` or `relu`. Defaults to `tanh`.
weight_ih_attr(ParamAttr, optional): The parameter attribute for
For these parameter descriptions, saying "input-to-hidden weights" and "hidden-to-hidden weights" directly would be easier to understand, since the documentation nowhere else explains weight_ih.

A Parameter section has been added to explain how the parameters correspond to the symbols in the formulas.
python/paddle/nn/layer/rnn.py
Outdated
`[time_steps, batch_size, ...]`. Defaults to False.

Inputs:
    inputs (Tensor): A (possibly nested structure of) tensor variable[s].
"possibly nested structure of" is somewhat vague here.
- In this version, is this simply a batch * length * input_size Tensor?
- Or, combined with the sequence_length argument of forward, is there another usage?

"possibly nested structure of" is how this was designed previously.
I believe both (T, B, C) and (B, T, C) input layouts are supported.
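The two layouts mentioned above can be sketched with a NumPy stand-in (shapes illustrative): time_major=False means (batch, time, channels), time_major=True means (time, batch, channels), and converting between them is a single transpose.

```python
import numpy as np

# time_major=False: (batch, time, channels); time_major=True: (time, batch,
# channels). Converting between the two layouts is a single transpose.
x_btc = np.zeros((2, 23, 16), dtype="float32")   # batch-major: (B, T, C)
x_tbc = np.transpose(x_btc, (1, 0, 2))           # time-major:  (T, B, C)
assert x_tbc.shape == (23, 2, 16)
```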
python/paddle/nn/layer/rnn.py
Outdated
in RNN. | ||
initial_states (Tensor|list|tuple, optional): A (possibly nested structure of) | ||
tensor[s], representing the initial state for the rnn cell. | ||
If not provided, `cell.get_initial_states` would be used to produce |
What is the default behavior of get_initial_states?

All-zeros initialization.
python/paddle/nn/layer/rnn.py
Outdated
class BiRNN(Layer): | ||
r""" | ||
Wrapper for bidirectional RNN. It assembles two RNN cells by performing | ||
forward and backward RNN separately, and concat outputs. |
The second English sentence seems hard for a newcomer to understand. (I haven't thought of a better wording either.)

Wrapper for bidirectional RNN. It takes two RNN cells as parameters and
builds a bidirectional RNN. A BiRNN applies the forward RNN and backward
RNN separately and concats the outputs along the last axis.
How about this?
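A NumPy stand-in for "concats the outputs along the last axis" (shapes illustrative): the forward and backward passes each produce (batch, time, hidden), and the wrapper joins them on the channel axis.

```python
import numpy as np

# Forward and backward outputs share (batch, time, hidden) shapes;
# the bidirectional wrapper concatenates them along the last axis.
fw_out = np.ones((2, 23, 32), dtype="float32")
bw_out = np.ones((2, 23, 32), dtype="float32")
outputs = np.concatenate([fw_out, bw_out], axis=-1)
assert outputs.shape == (2, 23, 64)
```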
python/paddle/nn/layer/rnn.py
Outdated
`tanh` or `relu`. Defaults to `tanh`. | ||
direction (str): The direction of the network. It can be "forward", | ||
"backward" and "bidirectional". Defaults to "forward". | ||
dropout (float): The droput probability. Dropout is applied to the |
default value?

defaults to "forward".
python/paddle/nn/layer/rnn.py
Outdated
"backward" and "bidirectional". Defaults to "forward". | ||
dropout (float): The droput probability. Dropout is applied to the | ||
input of each layer except for the first layer. | ||
time_major (bool): Whether the first dimension of the input means the |
default value

done
for i, rnn_layer in enumerate(self):
    if i > 0:
        inputs = F.dropout(
Please confirm: with the functional form of dropout used here, will Layer.train and Layer.evaluate be handled correctly?

I'll confirm the eval behavior of Layer and the clone behavior of Program shortly.

A training parameter has been added.
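A sketch of the concern resolved above (NumPy stand-in, not Paddle's F.dropout): a functional dropout has no access to the layer's train/eval mode, so the mode must be threaded through explicitly, for example via a training flag, to get identity behavior at eval time.

```python
import numpy as np

def dropout(x, p, training):
    """Inverted dropout; `training` must be passed in by the caller."""
    if not training or p == 0.0:
        return x                                      # identity in eval mode
    mask = (np.random.rand(*x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)                       # rescale kept units

x = np.ones((4, 8), dtype="float32")
# Eval mode: the input passes through untouched.
assert np.array_equal(dropout(x, 0.5, training=False), x)
# Train mode with p=0.5: each entry is either dropped (0.0) or scaled (2.0).
out = dropout(x, 0.5, training=True)
assert all(v in (0.0, 2.0) for v in np.unique(out))
```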
"RNN",
"BiRNN",
"RNNCellBase",
"RNNCellBase.get_initial_states"
Only the base class needs to be added here, right?

Something is wrong with the documentation-parsing tool; it reports "no sample code" for every entry.

LGTM
if nonlinearity == "tanh" \
    else F.relu

def forward(self, inputs, states=None):
Should inputs here be renamed to input? The plural suggests multiple Tensors.

That depends on interpretation; a batch is itself already plural.
python/paddle/nn/layer/rnn.py
Outdated
def __init__(self,
             input_size,
             hidden_size,
             nonlinearity="tanh",
For the activation function, suggest using activation="tanh".

Fixed.
python/paddle/nn/layer/rnn.py
Outdated
input_size, | ||
hidden_size, | ||
num_layers=1, | ||
nonlinearity="tanh", |
Suggest using activation.

Fixed.
and mostly used in RNN.
"""

def get_initial_states(self,
Does this API need to be exposed to developers? It looks like it could be an internal API.

It is an interface of the base class; users can call it too.
python/paddle/nn/layer/rnn.py
Outdated
prev_h = paddle.randn((4, 32))

cell = paddle.nn.LSTMCell(16, 32)
y, h = cell(x, prev_h)
The return values do not match.

Fixed.
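The mismatch discussed here concerns the cell's return structure: an LSTM-style cell returns its state as an (h, c) pair rather than a single tensor, so the result needs two-level unpacking. A NumPy stand-in (shapes illustrative, not Paddle's verified signature):

```python
import numpy as np

# An LSTM-style cell yields (outputs, (h, c)), not (outputs, h);
# unpacking must match that nesting. Shapes are illustrative.
batch, hidden = 4, 32
y = np.zeros((batch, hidden), dtype="float32")
h = np.zeros((batch, hidden), dtype="float32")
c = np.zeros((batch, hidden), dtype="float32")

outputs, (new_h, new_c) = y, (h, c)    # correct two-level unpacking
assert outputs.shape == (4, 32)
assert new_h.shape == (4, 32) and new_c.shape == (4, 32)
```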
lgtm
Some follow-ups:
- make the explanations of "nested structure of tensors", "padded sequence", and "initial states" clearer.
- only the base class should be added to the whitelist (for CI).
d0f9fba
lgtm

LGTM
PR types
New features
PR changes
APIs
Describe
Add unified RNN APIs