Question about data input #2

Open
maoyj0119 opened this issue Jul 26, 2020 · 6 comments

Comments

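@maoyj0119

Hello!
This is about the text summarization task.
My dataset consists of two files, train.src and train.tgt, which hold the source text and the summaries respectively. How should I wrap them into a dataset?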

@maoyj0119
Author

maoyj0119 commented Jul 26, 2020

image
Would this modification work?

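@YunwenTechnology
Owner

That looks like it should be fine. You can print the data just before it goes into the model and check that each source line matches its target line.
Alternatively, merge the two files into a single file in the expected format.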
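
Both suggestions above can be sketched in a few lines of Python. This is only an illustration, not the repository's own loader: the thread does not state the expected format, so the sketch assumes one JSON object per line, and the key names `src_text` and `tgt_text`, the helper names `merge_parallel_files` and `preview`, and the output path are all hypothetical.

```python
import json
from itertools import islice


def merge_parallel_files(src_path, tgt_path, out_path):
    """Write one JSON object per line, pairing src line i with tgt line i."""
    with open(src_path, encoding="utf-8") as src_f, \
            open(tgt_path, encoding="utf-8") as tgt_f:
        src_lines = [line.strip() for line in src_f]
        tgt_lines = [line.strip() for line in tgt_f]

    # A line-count mismatch means the pairs cannot all be aligned, so fail early.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} src vs {len(tgt_lines)} tgt"
        )

    with open(out_path, "w", encoding="utf-8") as out_f:
        for src, tgt in zip(src_lines, tgt_lines):
            # "src_text" / "tgt_text" are assumed key names, not confirmed here.
            record = {"src_text": src, "tgt_text": tgt}
            out_f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(src_lines)


def preview(path, n=3):
    """Return the first n merged records so the pairs can be checked by eye."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in islice(f, n)]
```

Calling `merge_parallel_files("train.src", "train.tgt", "train.json")` and then printing `preview("train.json")` gives a quick visual check that each text sits next to its own summary before anything reaches the model.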

@maoyj0119
Author

maoyj0119 commented Jul 28, 2020

image
Hello, this is the error I am getting. Something seems to go wrong during iteration.

@guijuzhejiang

What format does the model expect for one line of data? Could you give an example for reference? Thanks.

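@maoyj0119
Author

> What format does the model expect for one line of data? Could you give an example for reference? Thanks.

Hello, both src and tgt are in one-sentence-per-line form. Below are the modifications I made at the time.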
image
image
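
The screenshots are not reproduced here, so the actual modification is unknown. As a rough sketch of the one-sentence-per-line layout described above, a map-style dataset can pair line i of train.src with line i of train.tgt. The class name `ParallelTextDataset` is hypothetical, tokenization is left out, and this is not the code from the screenshots or from the repository.

```python
class ParallelTextDataset:
    """Pairs line i of a source file with line i of a target file."""

    def __init__(self, src_path, tgt_path):
        self.src = self._read_lines(src_path)
        self.tgt = self._read_lines(tgt_path)
        # Every text needs exactly one summary, so the counts must match.
        if len(self.src) != len(self.tgt):
            raise ValueError(
                f"line count mismatch: {len(self.src)} src vs {len(self.tgt)} tgt"
            )

    @staticmethod
    def _read_lines(path):
        # One example per line; drop only the trailing newline.
        with open(path, encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f]

    def __len__(self):
        return len(self.src)

    def __getitem__(self, idx):
        # Returns a (text, summary) pair of raw strings.
        return self.src[idx], self.tgt[idx]
```

Because the class implements `__len__` and `__getitem__`, it follows the map-style dataset protocol, so it should be usable with a loader such as `torch.utils.data.DataLoader` once a tokenizing collate function is supplied.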

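@lidongxing

> What format does the model expect for one line of data? Could you give an example for reference? Thanks.
> Hello, both src and tgt are in one-sentence-per-line form. Below are the modifications I made at the time.
> image
> image

Have you pretrained the unilm model successfully? Thanks.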
