Downstream task datasets
Zhaoxin edited this page Jan 8, 2021 · 26 revisions
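CLUE is a Chinese language understanding evaluation benchmark that covers classification and machine reading comprehension tasks. The CLUE datasets are distributed in JSON format. For the classification datasets, we convert the JSON files to TSV so that UER can load them directly. For machine reading comprehension, the original format is kept and the dataset preprocessing is included in the project.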
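The JSON-to-TSV conversion for a classification dataset can be sketched roughly as follows. This is a minimal illustration, not the script used to produce the files linked below: the function name `jsonl_to_tsv`, the input field names `sentence` and `label`, the one-JSON-object-per-line layout, and the output column names `label` and `text_a` are all assumptions and should be checked against the actual dataset and the UER loader.

```python
import io
import json


def jsonl_to_tsv(json_lines, out_file, text_key="sentence", label_key="label"):
    """Write one TSV row per JSON record and return the number of rows written.

    Assumptions (hypothetical, adjust per dataset): each input line is a single
    JSON object, and the text and label live under `text_key` and `label_key`.
    """
    # Header row; the column names here are an assumption about the target format.
    out_file.write("label\ttext_a\n")
    count = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Collapse tabs and newlines inside the text so each record stays on one row.
        text = " ".join(str(record[text_key]).split())
        out_file.write(f"{record[label_key]}\t{text}\n")
        count += 1
    return count


# Usage with two made-up records (not taken from any CLUE dataset).
sample = [
    '{"label": "102", "sentence": "example\\tsentence one"}',
    '{"label": "103", "sentence": "example sentence two"}',
]
buffer = io.StringIO()
rows = jsonl_to_tsv(sample, buffer)
tsv_text = buffer.getvalue()
```

Sentence-pair datasets would need a second text column; the same loop applies with one more field read from each record.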
Classification:
Dataset | Link |
TNEWS | https://share.weiyun.com/maExfIeO |
CSL | https://share.weiyun.com/LftIGlIT |
CMNLI | https://share.weiyun.com/hn3kTeKm |
OCNLI | https://share.weiyun.com/3DlKxB3q |
AFQMC | https://share.weiyun.com/CdlEKMON |
IFLYTEK | https://share.weiyun.com/ldiLjnZJ |
CLUEWSC2020 | https://share.weiyun.com/RLL1ShBi |
Machine reading comprehension:
Dataset | Link |
CMRC2018 | https://share.weiyun.com/p3Y9INyC |
C3 | in the project |
ChID | https://share.weiyun.com/Mix4q2ns |
Named entity recognition:
Dataset | Link |
CLUENER2020 | https://share.weiyun.com/smSMtLkn |
ERNIE provides 5 Chinese datasets in its first version and uses them to evaluate its performance:
Dataset | Link |
ChnSentiCorp | in the project |
LCQMC | https://share.weiyun.com/5Fmf2SZ |
XNLI | https://share.weiyun.com/mcd8EApl |
MSRA-NER | in the project |
NLPCC-DBQA | https://share.weiyun.com/5HJMbih |