On the difference between pretrain and finetune for the seq2seq task #24

Open
Tongjilibo opened this issue Aug 26, 2021 · 3 comments

Comments

@Tongjilibo

Let me ask a question: in UniLM, as I read the paper, during pretraining of the seq2seq LM the [MASK] positions can be drawn from either the source or the target, right? And during finetuning only the target is masked?

What I want to ask is: does UniLM's seq2seq finetuning also use [MASK] tokens? I had assumed it would work like the pos_shift approach of a Transformer decoder (shift the target and predict the next token).
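For concreteness, here is a rough sketch of what "mask only the target" could look like at finetune time. The function and the -100 ignore-index label convention are illustrative assumptions on my part, not code from this repo or the official UniLM release:

```python
import random
import torch

def mask_target_only(input_ids, src_len, mask_id, mask_prob=0.15):
    """Randomly [MASK] tokens in the target segment and build MLM labels.

    input_ids: token ids, source segment first, then the target segment.
    src_len:   length of the source segment (left untouched).
    Returns (masked_ids, labels); labels is -100 everywhere except at the
    masked target positions, so only those positions contribute to the loss.
    """
    masked_ids = list(input_ids)
    labels = [-100] * len(input_ids)
    for i in range(src_len, len(input_ids)):   # only touch target positions
        if random.random() < mask_prob:
            labels[i] = input_ids[i]           # supervise the original token
            masked_ids[i] = mask_id            # replace it with [MASK]
    return torch.tensor(masked_ids), torch.tensor(labels)
```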

@xiaoshengjun

I have the same question. It doesn't feel like conventional seq2seq training; it's still the BERT-style masked-token training, only the attention mask matrix is modified so that the target part becomes unidirectional.
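To make the "modified mask matrix" concrete, here is a minimal sketch (names and shapes are my own, not taken from a specific implementation): every position may attend to the whole source segment, while attention inside the target segment is lower-triangular, i.e. unidirectional.

```python
import torch

def unilm_seq2seq_attention_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """mask[i, j] = 1 means position i may attend to position j."""
    total = src_len + tgt_len
    mask = torch.zeros(total, total, dtype=torch.long)
    # Both source and target positions can see the entire source segment,
    # so the source stays bidirectional.
    mask[:, :src_len] = 1
    # Inside the target segment the mask is lower-triangular (causal):
    # each target token only sees earlier target tokens and itself.
    mask[src_len:, src_len:] = torch.tril(
        torch.ones(tgt_len, tgt_len, dtype=torch.long))
    return mask

print(unilm_seq2seq_attention_mask(3, 2))
# tensor([[1, 1, 1, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0],
#         [1, 1, 1, 1, 1]])
```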

@rookiebird

That's my impression as well. At first I thought the seq2seq mask mode worked like GPT-style prediction: feed in the current token, and the predicted token should be the next one. In fact UniLM only has the NSP and cloze (masked-token) objectives. On reflection that makes sense: the cloze task predicts the token itself, while a seq2seq task predicts the next token, so for inputs belonging to the same segment the two objectives cannot be carried out at the same time.

@NUAA-XSF

I'd like to ask: after seq2seq finetuning in UniLM, how is inference done? It doesn't predict the next token either.
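As far as I understand the UniLM paper, generation still goes through the [MASK] token: at each step a [MASK] is appended to the sequence, the model predicts the token at that position, the prediction takes the place of the [MASK], and the loop repeats until [SEP]. A rough greedy-decoding sketch under that reading; the `model(...)` / `tokenizer` interface here is hypothetical, and token_type_ids plus the seq2seq attention mask are omitted for brevity:

```python
import torch

@torch.no_grad()
def unilm_greedy_decode(model, tokenizer, source_ids, max_new_tokens=32):
    """Generate the target one token at a time by filling an appended [MASK]."""
    mask_id = tokenizer.mask_token_id
    sep_id = tokenizer.sep_token_id
    ids = list(source_ids)                        # source segment, ending with [SEP]
    for _ in range(max_new_tokens):
        input_ids = torch.tensor([ids + [mask_id]])
        logits = model(input_ids).logits          # assumed (1, seq_len, vocab) output
        next_id = int(logits[0, -1].argmax())     # prediction at the [MASK] position
        if next_id == sep_id:                     # [SEP] ends the target segment
            break
        ids.append(next_id)                       # the prediction replaces the [MASK]
    return ids[len(source_ids):]                  # newly generated target token ids
```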
