
Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation


Alibaba-NLP/DAAT-CWS

DAAT-CWS

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Paper accepted by ACL 2020

Prerequisites

  • python == 2.7
  • tensorflow == 1.8.0
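
Because both versions are pinned exactly, it can help to check the environment before training. The following is a minimal sketch, not part of the repository: the helper name check_versions is hypothetical, and it only compares the interpreter version and the installed TensorFlow version string against the two requirements listed above.

```python
import sys

REQUIRED_PYTHON = (2, 7)
REQUIRED_TENSORFLOW = "1.8.0"


def check_versions(python_version, tensorflow_version):
    """Return a list of mismatch messages (empty if both versions match)."""
    problems = []
    major_minor = tuple(python_version[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append("python %d.%d found, 2.7 required" % major_minor)
    if tensorflow_version != REQUIRED_TENSORFLOW:
        problems.append("tensorflow %s found, %s required"
                        % (tensorflow_version, REQUIRED_TENSORFLOW))
    return problems


if __name__ == "__main__":
    try:
        import tensorflow as tf
        tf_version = tf.__version__
    except ImportError:
        tf_version = "not installed"
    # Print one line per mismatch; no output means the environment matches.
    for problem in check_versions(sys.version_info, tf_version):
        print(problem)
```

The check is kept as a pure function so it can be reused without importing TensorFlow.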

Dataset

The source-domain dataset (PKU) and five distantly annotated target-domain datasets are in the data/datasets directory.

Usage

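Run python train.py --tgt_train_path <tgt_train_path> --tgt_test_path <tgt_test_path>, where <tgt_train_path> is the target-domain training file and <tgt_test_path> is the target-domain test file.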
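
To repeat the command for several target domains, the invocation can be wrapped in a short script. The sketch below is an illustration only: the helpers build_train_command and run_training are hypothetical, and the example paths are placeholders, since the actual file names under data/datasets are not listed here. Only the script name and the two flags come from the usage line above.

```python
import subprocess


def build_train_command(tgt_train_path, tgt_test_path, python_bin="python"):
    """Build the argument list for train.py using the two documented flags."""
    return [
        python_bin, "train.py",
        "--tgt_train_path", tgt_train_path,
        "--tgt_test_path", tgt_test_path,
    ]


def run_training(tgt_train_path, tgt_test_path):
    """Run train.py from the repository root and return its exit code."""
    return subprocess.call(build_train_command(tgt_train_path, tgt_test_path))


# Placeholder paths (assumption); substitute real files from data/datasets.
example = build_train_command("data/datasets/<target>_train.txt",
                              "data/datasets/<target>_test.txt")
print(" ".join(example))
```

Passing the arguments as a list avoids shell quoting problems if a path contains spaces.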

Note:

This code is based on previous work by chqiwang, to whom we are grateful. The raw text of the datasets used in our paper can be found at CWS-NAACL2019.
