Code for Factoid Question Answering With Distant Supervision.
I am still cleaning up the code for upload, and more description will be added.
- GPU and CUDA 8 are required
- python >=3.5
- pytorch 0.3.0
- pandas
- msgpack
- spacy 1.x
- cupy
- pynvrtc
- jieba
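Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (for example `torch` for pytorch) are assumptions about how each package is imported. It does not check versions, the GPU, or CUDA.

```python
import importlib.util

# Import names assumed for the packages listed above
# ("pytorch" is imported as "torch"); adjust if your setup differs.
REQUIRED_MODULES = ["torch", "pandas", "msgpack", "spacy", "cupy", "pynvrtc", "jieba"]


def missing_modules(names):
    """Return the names in `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

An empty result only means the modules can be found; the version constraints above (pytorch 0.3.0, spacy 1.x) still need to be checked by hand.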
Please download the data files from Google Drive and put them under the "dat" directory. Specifically, download these four files:
questions_dis_data_150htmls_using_abstext.txt
triple_weight_by_search.txt
new_mined_paraphrase0124.txt
WebQA.v1.0.tar.gz # is it proper to upload this dataset? If it is not permitted, tell me and I'll delete it.
Then unzip the WebQA data with tar -zxvf WebQA.v1.0.tar.gz.
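To confirm that the four downloads landed in the right place, a small check such as the one below can be run from the repository root. This is a sketch under assumptions: the helper name `missing_data_files` is hypothetical, and the check only looks for the four downloaded files inside "dat", not for whatever the archive extracts to.

```python
import os

# The four files named above.
DATA_FILES = [
    "questions_dis_data_150htmls_using_abstext.txt",
    "triple_weight_by_search.txt",
    "new_mined_paraphrase0124.txt",
    "WebQA.v1.0.tar.gz",
]


def missing_data_files(data_dir="dat"):
    """Return the expected file names that are not present in `data_dir`."""
    return [
        name
        for name in DATA_FILES
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from dat: " + ", ".join(missing))
    else:
        print("All four data files are in place.")
```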
Train the model by running:
cd DSRC
mkdir logs
python train_model.py
Please refer to parameters.py for configuration details, where train_idx corresponds to the different experimental configurations in the paper.
Author of SRU: Tao Lei.
Author of the Document Reader model: Danqi Chen.
Author of the original PyTorch implementation: Runqi Yang.
Most of the PyTorch model code is borrowed from Facebook/ParlAI under a BSD-3 license.