Add Factorization Machine Layer #4859
Conversation
const MatrixPtr& inputV = getInputValue(0);

size_t batchSize = inputV->getHeight();
size_t size = getSize();
What does getSize mean? I cannot validate this snippet of code without your comment.
getSize returns the output size of this layer.
Following @dzhwinter's comment above, please give the variable a more meaningful name.
Changed to outputSize.
some simple comments first.
factorSize_ = config_.factor_size();

/* initialize the latentVectors_ */
CHECK_EQ(inputLayers_.size(), 1UL);
Don't do lines 35 ~ 40 in init; move them into forward.
Done.
latentVectors_ =
    std::unique_ptr<Weight>(new Weight(height, factorSize_, parameters_[0]));

v2_ = Matrix::create(height, factorSize_, false, useGpu_);
The name v2_ is unreadable; please use a more meaningful and readable name.
Changed to latentVectorsSquare_.
const MatrixPtr& inputV = getInputValue(0);

size_t batchSize = inputV->getHeight();
size_t size = getSize();
Following @dzhwinter's comment above, please give the variable a more meaningful name.
Matrix::resizeOrCreate(tmpMul_, batchSize, factorSize_, false, useGpu_);
Matrix::resizeOrCreate(tmpOut_, batchSize, factorSize_, false, useGpu_);

REGISTER_TIMER_INFO("FwMulTimer", getName().c_str());
If you use REGISTER_TIMER_INFO, the first argument is the timer's name. This was copied from the FC layer, right? Please change the name.
Done.
outV->sumRows(*tmpOut_, -0.5, 1.0);

/* activation */ {
  REGISTER_TIMER_INFO("FwAtvTimer", getName().c_str());
Please change the timer's name.
Done.
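For context, the forward pass above appears to rely on the standard O(kn) reformulation of the pairwise interaction term; this is a textbook identity rather than anything specific to this PR, and the -0.5 scale in the sumRows call is consistent with it:

    \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j
        = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} x_i \right)^{2}
        - \sum_{i=1}^{n} v_{i,f}^{2} x_i^{2} \right]

Under this reading, tmpMul_ holds the per-sample inner sums \sum_i v_{i,f} x_i, so the second-order term costs O(kn) per sample instead of O(kn^2).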
void FactorizationMachineLayer::backward(const UpdateCallback& callback) {
  /* Do derivation */ {
    REGISTER_TIMER_INFO("BpAvtTimer", getName().c_str());
Please remember to change the timer's name here as well.
Done.
CpuSparseMatrix* x2_s = dynamic_cast<CpuSparseMatrix*>(x2_.get());
CpuSparseMatrix* tmpIn_s = dynamic_cast<CpuSparseMatrix*>(tmpIn.get());
tmpIn_s->copyFrom(*inputV_s);
tmpIn_s->rowScale(0, *inputV_s, *oGrad);
The naming styles of inputV_s, x2_s, and tmpIn_s are inconsistent; please unify them with the style used in the other layers. These names are also hard to read; please use meaningful names.
Changed to sparseInputV, sparseInputSquare, and sparseTmpInput.
latentVectors_->getWGrad()->mul(*tmpIn_s->getTranspose(), *tmpMul_, 1, 1);
tmpIn_s->rowScale(0, *x2_s, *oGrad);

MatrixPtr ones = Matrix::create(1, inputV->getHeight(), false, useGpu_);
Make the temporary variable ones a member variable.
Done.
if (inGrad != NULL) {
  MatrixPtr latentVectors_T = latentVectors_->getW()->getTranspose();
  inGrad->mul(*tmpMul_, *latentVectors_T, 1, 1);
  tmpSum_T->sumRows(*v2_, -1, 0);
Please fix the naming style of tmpSum_T.
Changed to tmpSumTrans.
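For reference, the gradient assembled in this backward pass appears to match the standard derivative of the pairwise term with respect to each latent factor (again a known identity, stated here only as a reading aid):

    \frac{\partial y}{\partial v_{i,f}} = x_i \sum_{j=1}^{n} v_{j,f} x_j - v_{i,f} x_i^{2}

which would explain the rowScale calls (scaling input rows by the output gradient) and the mul against tmpMul_.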
config.biasSize = 0;
config.inputDefs.push_back({type, "layer_0", 128, 1280});
config.layerConfig.add_inputs();
testLayerGrad(config, "factorization_machine", 16, false, useGpu, false);
Please add a unit test for the case where a SparseMatrix is used as input.
Added.
}

void FactorizationMachineLayer::forward(PassType passType) {
  Layer::forward(passType);
Running on GPU is not supported; please add a check and raise an error.
*
* \f[
* y = \sum_{i=1}^{n-1}\sum_{j=i+1}^n\langle v_i, v_j \rangle x_i x_j
* \f]
You can cite the reference paper here.
Added.
paddle/math/CpuSparseMatrix.cpp
for (size_t i = 0; i < height_; i++) {
  size_t start = getRowStartIdx(i);
  size_t end = getRowStartIdx(i + 1);
  CHECK(start == b.getRowStartIdx(i));
CHECK --> CHECK_EQ
Done.
The Factorization Machine can effectively capture feature interactions
especially when the input is sparse. In practice, usually order 2 feature
interactions are considered using Factorization Machine with the formula:
.. math::
Add a blank line before line 7166 and a blank line after line 7167; otherwise the formula will not render properly.
Added.
The Factorization Machine models pairwise feature interactions as inner
product of the learned latent vectors corresponding to each input feature.
The Factorization Machine can effectively capture feature interactions
especially when the input is sparse. In practice, usually order 2 feature
- usually order 2 feature --> this implementation only consider the 2-order feature interactions.
- Please add a citation in the comments to the original paper that this FM layer implementation is based on.
Added.
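For context, the full Factorization Machine model in Rendle's original paper ("Factorization Machines", ICDM 2010) also has a global bias and a first-order linear part,

    \hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i
        + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j

and the layer in this PR computes only the last (pairwise) term, which is why the reviewers ask below for a complete example that combines it with other layers.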
is the latent vector corresponding to each input dimension. The size of
each latent vector is k.
.. code-block:: python
factor_machine = factorization_machine(input=input_layer, factor_size=10)
Add a blank line before line 7172 and a blank line after line 7173.
Done.
proto/ModelConfig.proto
@@ -540,6 +540,9 @@ message LayerConfig {

// for switch order layer
optional ReshapeConfig reshape_conf = 59;

// for factorization machine layer
optional uint32 factor_size = 60;
Why can't Layer's size be reused, instead of defining this new field?
Layer's size is the output dimension, while this is the dimension of the latent factors used internally.
Reusing size would feel ambiguous.
Sorry, I misunderstood. Please ignore.
:param layer_attr: Extra Layer config.
:type layer_attr: ExtraLayerAttribute|None
:return: LayerOutput object.
:rtype: LayerOutput
Please note in the comment and the example code that this layer by itself is not a full FM; it only implements the second-order feature interaction part and must be used together with other layers. Give a complete example in the sample code section.
Added.
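A minimal sketch of what such a complete example could look like, assuming the usual config helpers available in this codebase (data_layer, fc_layer, addto_layer); the exact wiring here is illustrative, not taken from the PR:

    from paddle.trainer_config_helpers import *

    # hypothetical 128-dimensional input, matching the unit test above
    data = data_layer(name="features", size=128)
    # first-order term: a plain linear projection of the raw features
    first_order = fc_layer(input=data, size=1, act=LinearActivation())
    # second-order term: the pairwise interactions computed by this layer
    second_order = factorization_machine(input=data, factor_size=10)
    # a full FM sums both terms; bias_attr adds the global bias w_0
    prediction = addto_layer(input=[first_order, second_order],
                             act=LinearActivation(),
                             bias_attr=True)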
@wrap_name_default()
@wrap_act_default(act=LinearActivation())
Presumably only a linear activation can be used here. If a nonlinear activation cannot be used in principle, hard-code the activation instead of letting the user set it.
Nonlinear activations can be used here.
especially when the input is sparse.

This implementation only consider the 2-order feature interactions using
Factorization Machine with the formula:
- Besides this comment, add a complete config at line 7426, so that a user reading this layer's documentation can write out a complete FM model.
- Also document which input types are supported and which are not.
OK, I will add it.
:param input: The input layer.
:type input: LayerOutput
:param factor_size: The hyperparameter that defines the dimensionality of
                    the latent vector size
Add a period at the end of the sentence.
OK.
:param factor_size: The hyperparameter that defines the dimensionality of
                    the latent vector size
:type context_len: int
:param act: Activation Type. Default is linear activation.
In principle, can a nonlinear activation be used here? I don't think it can.
Yes, it can.
:param act: Activation Type. Default is linear activation.
:type act: BaseActivation
:param param_attr: The Parameter Attribute. If None, the latent vectors will
                   be initialized smartly. It's better to set it by
In the comment, please still explain what "be initialized smartly" actually means, i.e. how the initialization is done.
OK.
Matrix::resizeOrCreate(negOnes_, 1, inputV->getHeight(), false, useGpu_);
negOnes_->zeroMem();
negOnes_->add(-1);
tmpSum_->mul(*negOnes_, *sparseTmpInput, 1, 0);
// this = scaleAB*(a*b) + scaleT*this
mul(const Matrix& a, const Matrix& b, real scaleAB, real scaleT)

Why can't lines 125 ~ 127 be:

ones_->ones();
tmpSum_->mul(*ones_, *sparseTmpInput, -1, 0);
See Line 2944 in b28b2f1:

CHECK_EQ(scaleAB, static_cast<real>(1.0));

Because when b is sparse, mul only supports scaleAB == 1; other values are not supported.
/* activation */ {
  REGISTER_TIMER_INFO("FmFwAtvTimer", getName().c_str());
  forwardActivation();
Can the FM layer take a nonlinear activation? If it can't in principle (I recall it can't; please double-check), this can be removed. If it is allowed, keep it.
A nonlinear activation can be used.
What is computed here is only the second-order interaction term. Do you mean it is also fine to use nonlinear activation A on the second-order term and nonlinear activation B on the first-order term?
I haven't seen it used that way, but in theory it should be fine.
Resolve #4628 #3664 #3971