S3LG for Text to Gloss Translation

Official implementation of the [ACL 2024 paper] Semi-Supervised Spoken Language Glossification.

Setup

Installation

  • Install the environment
    pip install -r requirement.txt

Data Preparation

The data used in our experiments is placed in ./dataset. The structure is as follows:

  • ./dataset/phoenix/phoenix2014T/text: the plain-text version of the corpora for the PHOENIX2014T dataset.
  • ./dataset/phoenix/external_corpus.txt: the monolingual data.
  • ./dataset/phoenix/rule_pseudo.txt: the rule-based synthetic data. Run the following commands to generate rule-based pseudo glosses for the unlabeled data.
    cd ./dataset/phoenix
    python rule_based_preprocess.py
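
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the official code: the helper name `missing_paths` is hypothetical, and it only checks that each path listed above exists relative to the repository root, without inspecting file contents.

```python
from pathlib import Path

# Paths listed in this README, relative to the repository root.
EXPECTED_PATHS = (
    "dataset/phoenix/phoenix2014T/text",
    "dataset/phoenix/external_corpus.txt",
    "dataset/phoenix/rule_pseudo.txt",
)


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root.

    Hypothetical helper: it checks existence only, not file contents.
    """
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing data paths:")
        for p in missing:
            print("  " + p)
    else:
        print("All expected data paths are present.")
```

If rule_pseudo.txt is reported as missing, running rule_based_preprocess.py as shown above should generate it.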

Training

To train the proposed model, please run the following command.

   cd ./project/translation
   python G2T_train.py -t ./../../dataset/phoenix/phoenix2014T/ -ec ./../../dataset/phoenix/external_corpus.txt -ge 70 -g 0
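
The relative paths in the command above are resolved from ./project/translation, so `./../../dataset/...` points back to the ./dataset directory at the repository root. The sketch below illustrates this and launches the same command from the repository root. It is an assumption-laden convenience wrapper, not part of the official code: the names `TRAIN_ARGS`, `TRAIN_DIR`, and `resolved_data_paths` are hypothetical, and the arguments are copied verbatim from the command above without interpreting what each flag does.

```python
import subprocess
from pathlib import Path

# Arguments copied verbatim from the training command in this README.
TRAIN_ARGS = [
    "python", "G2T_train.py",
    "-t", "./../../dataset/phoenix/phoenix2014T/",
    "-ec", "./../../dataset/phoenix/external_corpus.txt",
    "-ge", "70",
    "-g", "0",
]
# Directory the README changes into before training (relative to repo root).
TRAIN_DIR = "project/translation"


def resolved_data_paths(repo_root="."):
    """Resolve the relative data paths as seen from the training directory."""
    workdir = Path(repo_root).resolve() / TRAIN_DIR
    return [(workdir / arg).resolve() for arg in TRAIN_ARGS if arg.startswith("./")]


if __name__ == "__main__":
    for path in resolved_data_paths():
        print("data path:", path)
    # Equivalent to `cd ./project/translation` followed by the command above.
    subprocess.run(TRAIN_ARGS, cwd=TRAIN_DIR, check=True)
```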

Citation

If you find this repo useful in your research, please consider citing:

@inproceedings{yao2024s3lg,
  title={Semi-Supervised Spoken Language Glossification},
  author={Yao, Huijie and Zhou, Wengang and Zhou, Hao and Li, Houqiang},
  booktitle={Proceedings of the Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  year={2024}
}

If you find this monolingual corpus useful in your research, please consider citing:

@inproceedings{zhou2021improving,
  title={Improving sign language translation with monolingual data by sign back-translation},
  author={Zhou, Hao and Zhou, Wengang and Qi, Weizhen and Pu, Junfu and Li, Houqiang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1316--1325},
  year={2021}
}

Please note that the corpora (PHOENIX2014T) have their own licenses; any use of them must comply with those licenses and include the appropriate citations.