This is the repository for the paper Controlled Language Generation for Language Learning Items, presented at the EMNLP 2022 Industry Track. The code is based heavily on Hugging Face's sequence-to-sequence Trainer examples.
Scripts were tested with Python 3.9 and transformers 4.6.1; no other dependencies should be required.
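As a minimal setup sketch (the environment name is illustrative and any environment manager works; torch is assumed as the transformers backend):

python -m venv c2s-env
source c2s-env/bin/activate
pip install torch transformers==4.6.1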
The data is provided as JSON Lines files, with each line containing the fields needed for concept-to-sequence generation with control. The data files are tracked with Git LFS, so Git LFS must be installed to fetch them.
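For example, after cloning you can pull the LFS-tracked files and inspect a single record (json.tool just pretty-prints the first line; the path matches the training file used below):

git lfs pull
head -n 1 data/concept2seq_train.jsonl | python -m json.tool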
To train, call the concept2seq.py script with --mode train, along with the required parameters, as in the example below. The "extras" parameter adds the control signal: it can be "srl", "wsd", or "cefr".
# Set a root directory
r=/home/nlp-text/dynamic/kstowe/github/concept-control-gen/
data_json=${r}/data/concept2seq_train.jsonl
# Substitute in your python
/home/conda/kstowe/envs/pretrain/bin/python $r/concept2seq.py \
--mode train \
--data_dir $data_json \
--output_dir $r/models/c2s_test \
--epochs 3 \
--batch_size 32 \
--model_path facebook/bart-base
# Add --extras srl (or wsd / cefr) to train with control
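To train with control enabled, pass --extras explicitly. A sketch with CEFR-level control, using the same parameters as above (the output directory name is illustrative; adjust paths to your environment):

/home/conda/kstowe/envs/pretrain/bin/python $r/concept2seq.py \
--mode train \
--data_dir $data_json \
--output_dir $r/models/c2s_cefr \
--epochs 3 \
--batch_size 32 \
--model_path facebook/bart-base \
--extras cefr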
Prediction works similarly: call the script with --mode test and the corresponding parameters.
# Set a nice root
r=/home/nlp-text/dynamic/kstowe/github/concept-control-gen/
/home/conda/kstowe/envs/pretrain/bin/python $r/concept2seq.py \
--mode test \
--output_path $r/outputs/test.txt \
--test_path ${r}/data/concept2seq_test.jsonl \
--model_path kevincstowe/concept2seq
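If the run succeeds, the generated sequences should be written to the file given by --output_path, and can be previewed directly (assuming plain-text output, one sequence per line):

head -n 5 $r/outputs/test.txt
wc -l $r/outputs/test.txt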