This is the code for the ICLR 2021 paper "Pre-training Text-to-Text Transformers for Concept-centric Common Sense". Check out our project website for details!
conda create -n calm python==3.7
conda activate calm
python setup.py install
cd CALM
Build the Wikipedia training and validation splits: the first command keeps lines 500,000–999,999 of wiki.doc as the training set, and the second keeps lines 1,000,000–1,099,999 as the validation set.
cat wiki.doc | tail -n +500000 | head -n 500000 > wiki/wiki.train.raw
cat wiki.doc | tail -n +1000000 | head -n 100000 > wiki/wiki.valid.raw
Dataset creation for concept-order-recovering (COR) and concept-to-sentence (C2S):
python dataset_utils/concept_deshuffling_data_generation.py
python dataset_utils/keyword_lm_data_generation.py
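For intuition, here is a minimal sketch of what COR and C2S instances look like. The field names and exact prompt format are assumptions for illustration, not the actual output of the scripts above.

```python
# Hypothetical COR / C2S training instances (illustrative only; the real
# serialization is produced by the dataset_utils scripts).
cor_example = {
    # COR: the concepts ("dog", "ball") are shuffled inside the sentence;
    # the model must recover the original sentence.
    "input": "The ball chased the dog.",
    "target": "The dog chased the ball.",
}
c2s_example = {
    # C2S: generate a plausible sentence from an unordered concept set.
    "input": "dog chase ball",
    "target": "The dog chased the ball.",
}
```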
Dataset creation for the generative question answering (QA) dataset:
python dataset_utils/generate_discriminative_dataset.py
There are three types of contrastive objectives (see Table 4(b) in the paper):
- Option 1: Multi-choice QA
- Option 2: Generative QA
- Option 3: True/False
For CALM, we use option 2, which is generative QA.
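For intuition, an Option 2 (generative QA) instance might pair the original sentence with a concept-swapped distractor and ask the model to generate the correct sentence. The prompt wording and field names below are assumptions, not the actual output of generate_discriminative_dataset.py.

```python
# Hypothetical generative-QA contrastive instance (illustrative only).
qa_example = {
    "input": (
        "Which sentence makes sense? "
        "sentence1: The dog chased the ball. "
        "sentence2: The ball chased the dog."
    ),
    "target": "The dog chased the ball.",
}
```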
python dataset_utils/mix_dataset.py
First, train on the mix dataset. The first command below uses T5-base with model parallelism across GPUs 4–7; the second uses T5-large without model parallelism.
python finetune.py \
--data_dir datasets/mix \
--output_dir outputs/calm_mix_base \
--model_name_or_path t5-base \
--tokenizer_name_or_path t5-base \
--max_seq_length 256 \
--learning_rate 5e-4 \
--num_train_epochs 2 \
--train_batch_size 8 \
--graident_accumulation_steps 4 \
--weight_decay 0.01 \
--warmup_steps 10000 \
--adam_epsilon 1e-6 \
--n_gpu 4 \
--gpu_nums 4,5,6,7 \
--model_parallel
python finetune.py \
--data_dir datasets/mix \
--output_dir outputs/calm_mix_large_dp \
--model_name_or_path t5-large \
--tokenizer_name_or_path t5-large \
--max_seq_length 256 \
--learning_rate 5e-4 \
--num_train_epochs 2 \
--train_batch_size 8 \
--graident_accumulation_steps 4 \
--weight_decay 0.01 \
--warmup_steps 10000 \
--adam_epsilon 1e-6
Then, train CALM starting from the mix-dataset checkpoint.
python finetune_generator_discriminator.py \
--data_dir datasets/option2 \
--checkpoint_dir outputs/calm_mix \
--output_dir outputs/calm \
--max_seq_length 256 \
--learning_rate 5e-7 \
--num_train_epochs 3 \
--train_batch_size 8 \
--graident_accumulation_steps 32 \
--fp_16 False \
--weight_decay 0.01 \
--warmup_steps 10000 \
--adam_epsilon 1e-6 \
--n_gpu 8 \
--gpu_nums 0,1,2,3,4,5,6,7
python finetune_generator_discriminator.py \
--data_dir datasets/option2 \
--checkpoint_dir outputs/calm_mix_base_dp \
--output_dir outputs/calm_base_dp \
--max_seq_length 256 \
--learning_rate 5e-7 \
--num_train_epochs 3 \
--train_batch_size 8 \
--graident_accumulation_steps 32 \
--fp_16 False \
--weight_decay 0.01 \
--warmup_steps 10000 \
--adam_epsilon 1e-6
Use the resulting checkpoint to fine-tune on the downstream tasks.
Our released models are listed below. You can load them with Hugging Face Transformers (see the example after the table).
Model | CSQA | OBQA | PIQA | aNLI | Description |
---|---|---|---|---|---|
danny911kr/calm-mix-base | 63.02 | 60.40 | 70.07 | 62.79 | Mix-Only |
danny911kr/calm-base | 63.32 | 60.90 | 71.01 | 63.20 | |
danny911kr/calm-mix-large | 70.26 | 62.50 | 73.70 | 75.99 | Mix-Only |
danny911kr/calm-large | 71.31 | 66.00 | 75.11 | 77.12 | |
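A minimal loading sketch, assuming the released checkpoints follow the standard T5 layout on the Hugging Face Hub (the input text is just a placeholder):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load a released CALM checkpoint from the Hugging Face Hub.
model_name = "danny911kr/calm-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# CALM is a text-to-text (T5-style) model, so it is used like any T5 checkpoint.
inputs = tokenizer("dog chase ball", return_tensors="pt")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```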