A quick question about UniLM: in the paper, during pretraining of the seq2seq LM the [MASK] positions can be sampled from either the source or the target, right? And during fine-tuning only the target is masked?
What I'm really asking is: does UniLM's seq2seq fine-tuning also use [MASK] tokens? I had assumed it worked like a Transformer decoder with position-shifted (pos_shift) next-token prediction.
I have the same question. It doesn't look like conventional seq2seq training; it's still BERT-style masked-token training, except that the attention mask matrix is modified so the target portion becomes unidirectional.
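A minimal sketch of that attention mask (illustrative code, not the repo's actual implementation): source tokens attend bidirectionally within the source, target tokens attend to the whole source plus only the preceding target tokens, and source tokens never attend to the target.

```python
import torch

def build_seq2seq_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Return a (src_len+tgt_len, src_len+tgt_len) 0/1 mask.
    mask[i, j] == 1 means position i may attend to position j."""
    total = src_len + tgt_len
    mask = torch.zeros(total, total, dtype=torch.long)
    # Every position (source or target) may attend to the full source.
    mask[:, :src_len] = 1
    # Target-to-target attention is causal (lower triangular);
    # source rows keep 0 in the target columns, so the source never sees the target.
    mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.long))
    return mask

print(build_seq2seq_mask(src_len=3, tgt_len=2))
```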
That's my impression too. At first I assumed the seq2seq mask mode predicted the way GPT does, i.e. feed in the current token and predict the next one. In fact UniLM only has the NSP and cloze (masked-token) objectives. That actually makes sense: the cloze task predicts the token at the masked position itself, while a seq2seq task predicts the next token, so for tokens within the same segment the two objectives can't be trained at the same time.
Then how does inference work after seq2seq fine-tuning in UniLM? It doesn't predict the next token either.
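As I understand it, decoding appends a [MASK] to the current target prefix, predicts the token at that position, replaces the [MASK] with the prediction, and repeats. A rough greedy-decoding sketch (reusing build_seq2seq_mask from above; the model/tokenizer call signatures are assumptions, not the unilm repo's exact API):

```python
import torch

def greedy_decode(model, tokenizer, src_ids, max_len=32):
    mask_id = tokenizer.mask_token_id
    sep_id = tokenizer.sep_token_id
    tgt_ids = []
    for _ in range(max_len):
        # Append a [MASK] placeholder for the position to be generated.
        input_ids = torch.tensor([src_ids + tgt_ids + [mask_id]])
        attn = build_seq2seq_mask(len(src_ids), len(tgt_ids) + 1).unsqueeze(0)
        logits = model(input_ids, attention_mask=attn)  # (1, L, vocab), assumed output shape
        next_id = logits[0, -1].argmax().item()         # prediction at the [MASK] slot
        if next_id == sep_id:                           # [SEP] marks end of generation
            break
        tgt_ids.append(next_id)                         # [MASK] is replaced by the prediction
    return tgt_ids
```

So generation is still cloze-style prediction at a [MASK] position, just applied autoregressively one position at a time.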