Code for the paper Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks [ACL 2023 Findings]
Instructions on downloading preprocessed datasets and preparing custom datasets can be found here
Download checkpoints from: https://uofi.box.com/s/wnt6cv7icuir4q3wb2a6viuyklme5dga. Put the checkpoint directories in checkpoints under zemi/output/p3_finetuning
Set up the conda environment with conda env create -f environment.yml.
Run accelerate config to configure the device.
Scripts for reproducing the main results in Table 1: performing (semi-)parametric multitask prompted training and zero-shot evaluation. Detailed instructions on the configurations can be found here.
All scripts should be run under zemi/. SETUP_ENV.sh will be called in the following scripts to set up environment variables. Modify the variables if you are not using the exact folder structure from the setup above.
- No Aug, base: bash ./training/no_aug_base.sh
- No Aug, large: bash ./training/no_aug_large.sh
- Concat, base: bash ./training/concat_base.sh
- Concat, large: bash ./training/concat_large.sh
- FiD, base: bash ./training/fid_base.sh
- FiD, large: bash ./training/fid_large.sh
- Zemi, base: bash ./training/zemi_base.sh
- Zemi, large: bash ./training/zemi_large.sh
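The eight training scripts listed above follow a method_size naming pattern, so they can be enumerated with a small loop. The following is only a sketch, not part of the repository: the helper names script_path and run_all and the RUN switch are hypothetical, and it assumes the current directory is zemi/. By default it only prints the commands; nothing is executed unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repo): enumerate the training
# scripts by method and model size. Assumes it is invoked from zemi/.

script_path() {
  # $1 = method (no_aug | concat | fid | zemi), $2 = size (base | large)
  printf './training/%s_%s.sh\n' "$1" "$2"
}

run_all() {
  local method size script
  for method in no_aug concat fid zemi; do
    for size in base large; do
      script=$(script_path "$method" "$size")
      if [ "${RUN:-0}" = "1" ]; then
        # Actually launch training; each script calls SETUP_ENV.sh itself.
        bash "$script"
      else
        # Dry run: only show the command that would be executed.
        echo "bash $script"
      fi
    done
  done
}

run_all
```

Running the sketch without RUN=1 is a cheap way to check which scripts would be launched before committing to all eight training runs in sequence.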
- code for the model architecture: zemi/modeling_t5.py from this line and zemi/modeling_xattn.py
- code for multitask training:
  - train No Aug and Concat baselines: zemi/multi_task_fine_tune_baseline.py
  - train FiD baseline and Zemi: zemi/multi_task_fine_tune_xattn.py
- code for zero-shot evaluation:
  - eval No Aug and Concat baselines: zemi/eval_original_task_only.py
  - eval FiD baseline and Zemi: zemi/eval_original_task_only_xattn.py
visualization/ contains examples of the retrieved documents for each task. We include the top 50 examples with the highest and lowest BM25 scores in visualization/top50_highest_score_retrieval_instances and visualization/top50_lowest_score_retrieval_instances, respectively. We also include the first 50 instances for each dataset without reordering in visualization/first50_retrieval_instances.
@article{wang2022zemi,
title={Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks},
author={Wang, Zhenhailong and Pan, Xiaoman and Yu, Dian and Yu, Dong and Chen, Jianshu and Ji, Heng},
journal={arXiv preprint arXiv:2210.00185},
year={2022}
}