Context Projection Paddle Function -- follow comments #1080
Conversation
Force-pushed from a2b0c28 to 47b0416.
Merged the latest code, and the sequence data now uses SequenceArg as the argument type.
Force-pushed from 87c1b7d to 8ae158c.
* \param inputs[1] input weight.
* \param inputs[2] input sequence.
* Paddle Function for Context Projection Forward.
* Calculate the value for the output layer with context projection.
This description is incorrect.
ContextProjectionForwardFunc should take a sequences data as input and produce a sequences data as output, so the argument types of the calc interface should be adjusted accordingly.
Also, the code below seems to describe how a single sequence is computed (that should be noted in the comments as well).
Later on, renaming class SequenceArg to class SequencesArg would probably be more accurate.
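For instance, a rough sketch of the direction (a fragment only, reusing the BufferArgs/SequenceArg types from this PR; not the final code):

// Inside the forward Function's calc: both sides are sequence data.
void calc(const BufferArgs& inputs, const BufferArgs& outputs) override {
  const auto& in_seq = dynamic_cast<const SequenceArg&>(inputs[0]);
  const auto& out_seq = dynamic_cast<const SequenceArg&>(outputs[0]);
  CHECK(in_seq.data() && in_seq.getSequenceIds().data() && out_seq.data());
  // The per-sequence computation would then iterate over the ranges given by
  // in_seq.getSequenceIds() and write one output sequence at a time.
}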
Updated the interface; not sure if it is what you expect.
done
? typename Tensor<real, Device>::Matrix(nullptr, 0, 0)
: inputs[1].matrix<Device>();
auto seq_vec = inputs[2].vector<int, Device>();
const auto in_mat = inputs[0].matrix<Device>();
inputs and outputs should both be SequencesArg.
Also, both 2 == inputs.size() and 1 == inputs.size() are valid; the latter means there is no padding weight, so the (!inputs[1].data()) check can then be dropped, as sketched below.
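Roughly (a sketch reusing the lines quoted above; a single input means no padding weight):

// Check the argument count once; the extra data() test becomes unnecessary.
CHECK(1UL == inputs.size() || 2UL == inputs.size());
const auto w_mat = (1UL == inputs.size())
    ? typename Tensor<real, Device>::Matrix(nullptr, 0, 0)
    : inputs[1].matrix<Device>();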
@hedaoyuan Is my understanding correct? inputs[0] = SeqArg{input value, input sequence}, inputs[1] = SeqArg{input weight, input sequence}, outputs[0] = SeqArg{output value, input sequence}? If they all carry the sequence, will that add extra memory? And for the no-padding-weight case, is pad = 0 with the whole weight being nullptr?
done
? typename Tensor<real, Device>::Matrix(nullptr, 0, 0)
: inputs[1].matrix<Device>();
auto seq_vec = inputs[2].vector<int, Device>();
const auto in_mat = inputs[0].matrix<Device>();
inputs and outputs should both be SequenceArg.
done
@@ -350,13 +285,5 @@ REGISTER_TYPED_FUNC(ContextProjectionForward,
REGISTER_TYPED_FUNC(ContextProjectionBackward,
                    GPU,
                    ContextProjectionBackwardFunc);
#if 0
REGISTER_TYPED_FUNC(ContextProjectionBackwardData,
I still feel the two interfaces ContextProjectionBackwardData and ContextProjectionBackwardWeight are more reasonable, since these two Functions compute two unrelated (dependency-free) buffers. Earlier I commented this code out rather than deleting it, partly because of a build issue and partly because I meant to come back and revise it later; see the sketch below.
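That is, something along these lines instead of the #if 0 block (a sketch; the *Func class names are assumed from the commented-out code, not new implementations):

// Keep the two independent backward registrations rather than disabling them.
REGISTER_TYPED_FUNC(ContextProjectionBackwardData,
                    GPU,
                    ContextProjectionBackwardDataFunc);
REGISTER_TYPED_FUNC(ContextProjectionBackwardWeight,
                    GPU,
                    ContextProjectionBackwardWeightFunc);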
I'll just keep and update these interfaces. Do we need the output to be SequenceArg{output, sequence}?
done
Dims{gpu_seq->getSize()})},
{Tensor(gpu_out.getData(), Dims{batch_size, input_dim * context_length})},
{});
BufferArgs cpu_inputs;
Let's leave the test part like this for now; I'll revise it later.
OK.
Force-pushed from 8ae158c to df2c26f.
I haven't looked at the function code for a while; taking the chance to learn from it.
CHECK_EQ(inputs[0].shape().ndims(), (size_t)2);
CHECK_EQ(inputs[1].shape().ndims(), (size_t)2);
CHECK_EQ(inputs[2].shape().ndims(), (size_t)1);
const auto val_seqs = dynamic_cast<const SequenceArg&>(inputs[0]);
This can easily throw an exception...
If inputs[0] is not actually a SequenceArg, the dynamic_cast will just blow up, won't it?
Right, good point. I'll change this; we can first check that inputs[0] is of SequenceArg type and only then dynamic_cast.
Flesh out BufferArg::bufferType_ a bit, then bufferType() can be used for the check; see the sketch below.
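Something like the following, for illustration (a sketch; the buffer-type enum value name is an assumption):

// Fail with a readable CHECK instead of std::bad_cast when the argument is
// not sequence data.
CHECK_EQ(inputs[0].bufferType(), TENSOR_SEQUENCE_DATA);  // enum name assumed
const auto& val_seqs = dynamic_cast<const SequenceArg&>(inputs[0]);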
Sounds good, working on it.
done.
const auto w_seqs = inputs.size() <= 1
    ? nullptr
    : dynamic_cast<const SequenceArg*>(&inputs[1]);
auto out_seqs = dynamic_cast<const SequenceArg&>(outputs[0]);
Same as above, this can easily throw an exception.
done
: dynamic_cast<const SequenceArg*>(&inputs[1]);
auto out_seqs = dynamic_cast<const SequenceArg&>(outputs[0]);

CHECK(out_seqs.data() && val_seqs.data() &&
@hedaoyuan Since we ultimately want to stop using LOG(FATAL) and avoid crashing Paddle at runtime, could we consider adding error return values while we are refactoring this Function? Something like https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/lib/core/status.h
TensorFlow's error-handling code is fairly complex; logically it records every error and deals with them later. We can keep using glog here and switch everything over uniformly once a similar facility is implemented. Every place where CHECK and LOG(FATAL) need replacing is in the same situation, and defining the return value is not complicated either, so it can be left for later.
TensorFlow's approach is indeed fairly complex... I was just wondering whether we should first define a Paddle Status, so that newly added code returns a Status wherever possible.
Simply replacing CHECK may not get us there:
int* a = new int[10000];
LOG(FATAL) << "I will exit!";  // the process aborts here, so the skipped delete never matters
delete[] a;
If the glog call here is replaced, a will never be deleted:
int* a = new int[10000];
throw SomeError();  // or: return SomeError;
delete[] a;         // skipped as well, so a leaks unless the cleanup is restructured
So error handling is probably not quite the same as a plain glog replacement.
Still, it should indeed not be complicated, and doing it later is fine.
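For reference, a minimal sketch of what such a return type could look like (purely illustrative; the class name and interface are assumptions, not an existing Paddle API):

#include <string>
#include <utility>

// Hypothetical minimal Status type: a default-constructed Status means "ok",
// an error is reported by returning one instead of aborting with LOG(FATAL).
class Status {
public:
  Status() = default;
  explicit Status(std::string msg) : msg_(std::move(msg)) {}
  bool ok() const { return msg_.empty(); }
  const std::string& msg() const { return msg_; }

private:
  std::string msg_;
};

// The caller can clean up and decide what to do, so locally owned resources
// are not lost the way they are after a LOG(FATAL).
Status checkNumInputs(size_t numInputs) {
  if (numInputs != 1 && numInputs != 2) {
    return Status("ContextProjectionForward expects 1 or 2 inputs");
  }
  return Status();
}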
@@ -56,7 +56,7 @@ void ContextProjectionForward(
 */
 template <DeviceType DType>
 void ContextProjectionBackward(
-    typename Tensor<real, DType>::Matrix& out_grad,
+    const typename Tensor<real, DType>::Matrix& out_grad,
I don't quite understand why the output is a const reference while the input is a plain reference?
Mainly because out_grad is not modified, while in_grad is; see the sketch below.
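In other words (a sketch; parameters other than the two gradients are omitted):

// out_grad is only read by the backward pass, hence const;
// in_grad is what the backward pass writes into, hence non-const.
template <DeviceType DType>
void ContextProjectionBackward(
    const typename Tensor<real, DType>::Matrix& out_grad,
    typename Tensor<real, DType>::Matrix& in_grad /* , ... other arguments */);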
Force-pushed from df2c26f to e979421.
Approving for now.
Two further points: first, the comments probably still need strengthening. Second, for a given Function the shape relationship between Output and Input is fixed, so it could be written into the Function's attributes in init (or elsewhere); checking the input/output arguments against a derivation from the Input shape to the Output shape should let us drop some code. Wei Xu's earlier suggestion was also that a Function should describe this relationship, so that the arguments can be checked before the Function is even called. (The implementation in CrossMapNormalOp.cpp has the same problem; I will keep fixing it later. A rough sketch follows.)
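As a rough illustration of the second point (all helper names here are hypothetical, not part of the current Function interface):

// Hypothetical sketch: record the Input->Output shape relation once and reuse
// it to validate arguments before calc is invoked.
TensorShape expectedOutputShape(const TensorShape& in) const {
  // Context projection: same row count, columns scaled by context_length_.
  return TensorShape{in[0], in[1] * context_length_};
}

void checkShapes(const BufferArgs& inputs, const BufferArgs& outputs) const {
  const auto expect = expectedOutputShape(inputs[0].shape());
  CHECK_EQ(outputs[0].shape()[0], expect[0]);
  CHECK_EQ(outputs[0].shape()[1], expect[1]);
}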
val_seqs.getSequenceIds().data());
CHECK_EQ(out_seq.shape().ndims(), (size_t)2);
CHECK_EQ(val_seqs.shape().ndims(), (size_t)2);
CHECK_EQ(val_seqs.getSequenceIds().shape().ndims(), (size_t)1);
CHECK_EQ(val_seqs.getSequenceIds().shape().ndims(), (size_t)1);
This check doesn't seem necessary; Shape.ndims() == 1 should always hold for a SequenceIdArg.
auto out_seq = dynamic_cast<const SequenceArg&>(outputs[0]);
CHECK(in_seq.data() && in_seq.getSequenceIds().data());
CHECK_EQ(in_seq.shape().ndims(), (size_t)2);
CHECK_EQ(in_seq.getSequenceIds().shape().ndims(), (size_t)1);
Same as above; the same applies to the later checks on getSequenceIds.
Follow up on further comments from Wei Xu and Daoyuan on PR #1009.
todo: