COVID-19 Chatbot

Please note that the health information generated by the chatbot is for general research purposes only. It is not a diagnostic tool, nor is it a substitute for medical advice or treatment for specific conditions.

Paper

Our work has been accepted at ACM-BCB. The paper is available here: https://dl.acm.org/doi/abs/10.1145/3388440.3412413.

Dataset

The dataset is the initial commercial-use subset of the COVID-19 Open Research Dataset (CORD-19) and consists of 9000 scholarly articles.

For training, we extracted the abstract and the main body of each article and merged them into a single text.

To (re)extract the data, run the command below.

python3 extract.py
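
The merging step described above can be sketched as follows. This is a minimal illustration, not the actual contents of extract.py: it assumes the CORD-19 JSON layout in which `abstract` and `body_text` are lists of paragraph objects with a `text` field, and the function names and paths are hypothetical.

```python
import json
from pathlib import Path


def article_to_text(article: dict) -> str:
    """Merge the abstract and body paragraphs of one CORD-19 article."""
    # Assumption: each section is a list of objects carrying a "text" field.
    sections = article.get("abstract", []) + article.get("body_text", [])
    paragraphs = [p["text"].strip() for p in sections if p.get("text", "").strip()]
    return "\n".join(paragraphs)


def extract_corpus(raw_dir: str, out_file: str) -> int:
    """Write the merged text of every article and return how many were written."""
    count = 0
    with open(out_file, "w", encoding="utf-8") as out:
        for path in sorted(Path(raw_dir).rglob("*.json")):
            with open(path, encoding="utf-8") as f:
                text = article_to_text(json.load(f))
            if text:  # skip articles with no usable text
                out.write(text + "\n\n")
                count += 1
    return count
```

Articles that have neither an abstract nor a body are skipped, so they do not add empty entries to the training corpus.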

Model

We retrained the GPT-2 774M model on the COVID-19 corpus.

The model was trained using the Adam optimizer with a learning rate of 0.0001. Training ran for 2500 iterations with a batch size of 8.
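
To put these numbers in perspective, the short calculation below estimates how much text the model saw during retraining. The 1024-token sample length is an assumption (GPT-2's maximum context size); the source does not state the sample length used.

```python
ITERATIONS = 2500
BATCH_SIZE = 8
CONTEXT_TOKENS = 1024  # assumption: one sample spans GPT-2's full context window

# Each iteration processes one batch of sequences.
sequences_seen = ITERATIONS * BATCH_SIZE
tokens_seen = sequences_seen * CONTEXT_TOKENS

print(sequences_seen)  # 20000
print(tokens_seen)     # 20480000
```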

The model is hosted and available at this Google Drive link.

Chatbot

Once you have downloaded the model, put it inside the models directory. Then, to run the chatbot, execute the commands below.

git clone https://github.com/oniani/covid-19-chatbot
cd covid-19-chatbot
python3 -m pip install -r requirements.txt
PYTHONPATH=src python3 -W ignore interact.py
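
If interact.py fails at startup, an incomplete model directory is a likely cause. The sketch below checks for the auxiliary files that ship with a stock GPT-2 model. The `models/774M` path and the file list are assumptions, and the helper is hypothetical rather than part of this repository; weight files are not checked because their names depend on the checkpoint.

```python
from pathlib import Path

# Assumption: auxiliary files distributed with the stock GPT-2 models.
REQUIRED_FILES = ("checkpoint", "encoder.json", "hparams.json", "vocab.bpe")


def missing_model_files(model_dir: str = "models/774M") -> list:
    """Return the required files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing from models/774M:", ", ".join(missing))
    else:
        print("Model directory looks complete.")
```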

Web Application

To run the web application, navigate to the web-app directory and run flask run. By default, Flask serves the application on port 5000.

Transfer Learning

It is also possible to run transfer learning with your own data.

Google Colaboratory (re)training example:

# Mount the drive
from google.colab import drive
drive.mount("/content/drive")

# Set up the repository
%cd "/content/drive/My Drive"
!mkdir -p COVID-19_CHATBOT
!rm -rf COVID-19_CHATBOT/gpt-2
!git clone https://github.com/oniani/gpt-2 "/content/drive/My Drive/COVID-19_CHATBOT/gpt-2/"
%cd COVID-19_CHATBOT/gpt-2/

# Install the dependencies and download the pretrained model
!python3 -m pip install -r requirements.txt
!python3 download_model.py 774M

# Install additional dependencies
!python3 -m pip install fire==0.2.1 \
                        tensorflow-gpu==1.14 \
                        tensorflow-hub==0.7.0 \
                        toposort==1.5

# Run the transfer learning training
#
# NOTE: You will need to upload the `data` folder from this repository and put it
# into the `COVID-19_CHATBOT` directory
!PYTHONPATH=src python3 train.py --dataset="/content/drive/My Drive/COVID-19_CHATBOT/data" \
                                 --model_name=774M \
                                 --batch_size=8 \
                                 --optimizer=adam \
                                 --learning_rate=0.0001 \
                                 --save_time=-1 \
                                 --sample_every=-1 \
                                 --save_every=500 \
                                 --init_tpu
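
To train on your own data, you first need a dataset folder to pass as `--dataset`. The sketch below writes a collection of documents as individual text files. It assumes that train.py accepts a directory of plain-text files, which you should verify against the training script; the function name and file naming scheme are hypothetical.

```python
from pathlib import Path


def write_dataset(documents, out_dir):
    """Write each non-empty document to its own UTF-8 .txt file in out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for index, text in enumerate(documents):
        cleaned = text.strip()
        if not cleaned:
            continue  # skip empty documents
        path = target / f"doc_{index:05d}.txt"
        path.write_text(cleaned + "\n", encoding="utf-8")
        written.append(path)
    return written
```

The resulting folder can then take the place of the `data` folder in the training command.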

Special thanks to @shawwn for making GPT-2 TPU-trainable on Google Colaboratory.

Results

The results are also available in a single file here.

Future Work

Future work could include retraining the model on a different dataset, tuning the hyperparameters, and applying the larger GPT-2 model (1.5B parameters).

Development

The project was developed by David Oniani and Dr. Yanshan Wang.
