Add ScaleShiftLayer #3560
@@ -2007,6 +2007,21 @@ TEST(Layer, RowL2NormLayer) {
  }
}

TEST(Layer, ScaleShiftLayer) {
  const size_t batchSize = 128;
  const size_t size = 512;
Please make these two sizes smaller. This layer is not very sensitive to size, so there is no need to test such a large layer.
Done.
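To illustrate why small sizes are enough here, the following is a minimal NumPy sketch of a gradient check for an element-wise scale-and-shift operation. It is not the PaddlePaddle test from this PR: the function names, the sizes `batch_size=4` and `size=8`, and the values of `w` and `b` are all hypothetical choices for illustration.

```python
import numpy as np


def scale_shift_forward(x, w, b):
    # Element-wise y = w * x + b with scalar w and b.
    return w * x + b


def scale_shift_backward(x, w, grad_y):
    # Gradients of a scalar loss with respect to x, w and b.
    grad_x = w * grad_y
    grad_w = float(np.sum(grad_y * x))
    grad_b = float(np.sum(grad_y))
    return grad_x, grad_w, grad_b


def numeric_grads(x, w, b, grad_y, eps=1e-6):
    # Central differences on the surrogate loss L = sum(grad_y * y).
    def loss(w_, b_):
        return float(np.sum(grad_y * scale_shift_forward(x, w_, b_)))

    num_w = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
    num_b = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
    return num_w, num_b


rng = np.random.default_rng(0)
batch_size, size = 4, 8  # hypothetical small sizes, not the values in the PR
x = rng.standard_normal((batch_size, size))
grad_y = rng.standard_normal((batch_size, size))
w, b = 0.7, -0.3  # arbitrary illustrative parameter values

_, grad_w, grad_b = scale_shift_backward(x, w, grad_y)
num_w, num_b = numeric_grads(x, w, b, grad_y)
max_err = max(abs(grad_w - num_w), abs(grad_b - num_b))
```

Because `w` and `b` are scalars shared by every element, the same two sums are checked regardless of the input shape, which is consistent with the reviewer's point that a large layer adds little to the test.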
@@ -0,0 +1,11 @@
from paddle.trainer_config_helpers import *

settings(batch_size=1000, learning_rate=1e-5)
This line can be removed.
Done.
name=name,
type=LayerType.SCALE_SHIFT_LAYER,
inputs=Input(input.name, **param_attr.attr),
bias=ParamAttr.to_bias(bias_attr))
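One thing I would like to confirm: if the user does not specify the initialization std, mean, and strategy for the parameter w, how is w initialized? Could it unexpectedly be set to 0, in which case the rest of the network might simply stop working? Has the logic for the default values been checked?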
Done. I checked: with the current settings, w is initialized by default with mean=0 and std=1.0.
LGTM
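The initialization concern in this thread can be sketched in a few lines of NumPy. This is only an illustration, not PaddlePaddle code: `forward` and `grad_input` are hypothetical helpers, the bias value `b = 0.0` is an assumption (the thread does not state the bias default), and the only fact taken from the thread is that w defaults to a draw with mean=0 and std=1.0.

```python
import numpy as np

rng = np.random.default_rng(0)

# Default reported in the thread: w is drawn with mean=0 and std=1.0.
w = float(rng.normal(loc=0.0, scale=1.0))
b = 0.0  # assumption for illustration; the bias default is not stated

x = rng.standard_normal((4, 8))
grad_y = np.ones_like(x)  # stand-in for the gradient arriving from above


def forward(x, w, b):
    # Element-wise y = w * x + b with scalar w and b.
    return w * x + b


def grad_input(w, grad_y):
    # Gradient passed back to the layer's input.
    return w * grad_y


# Degenerate case the reviewer asked about: w is exactly 0.
y_zero = forward(x, 0.0, b)
dx_zero = grad_input(0.0, grad_y)

# Case with w sampled from the reported default distribution.
y_sampled = forward(x, w, b)
dx_sampled = grad_input(w, grad_y)
```

With w equal to 0 the output collapses to the constant b and no gradient reaches the input, which is the failure mode raised above. A continuous draw with std=1.0 makes an exact zero extremely unlikely, although it can still yield a small or negative scale.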
update the comments.
/**
 * A layer applies a slope and an intercept to the input element-wise for
 * scaling and shifting. Noting that this layer is trainable which differs
 * from the SlopeInterceptLayer.
A layer applies a linear transformation to each element in each row of the input matrix. For each element, the layer first rescales it and then adds a bias to it.
This layer is very similar to the SlopeInterceptLayer, except that the scale and bias are trainable.
Done. Thanks.
 * y = wx + b
 * \f]
 *
 * Here, w is scale and b is offset, which are scalars and trainable.
w is the scale and b is the bias. Both w and b are trainable scalars.
Done. Thanks.
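Since both w and b are described as trainable scalars, it may help to write out the gradients that training would need. The derivation below is a sketch that follows from y = wx + b by the chain rule; the loss symbol L and the element indices i, j (row and column of the input matrix) are notation introduced here, not taken from the PR.

```latex
\[
y_{ij} = w\,x_{ij} + b
\]
\[
\frac{\partial L}{\partial x_{ij}} = w\,\frac{\partial L}{\partial y_{ij}},
\qquad
\frac{\partial L}{\partial w} = \sum_{i,j} x_{ij}\,\frac{\partial L}{\partial y_{ij}},
\qquad
\frac{\partial L}{\partial b} = \sum_{i,j} \frac{\partial L}{\partial y_{ij}}
\]
```

The sums run over every element because the same scalar w and b are shared by the whole input.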
"""
A layer applies a slope and an intercept to the input element-wise for
scaling and shifting. Noting that this layer is trainable which differs
from the slope_intercept_layer.
A layer applies a linear transformation to each element in each row of the input matrix. For each element, the layer first rescales it and then adds a bias to it.
This layer is very similar to the SlopeInterceptLayer, except that the scale and bias are trainable.
Done. Thanks.
Add trainable ScaleShiftLayer to do scaling and shifting.