
Improve Abstractive Summarization by using Question Answering Rewards

Figure: the training process for the summarization framework with QA rewards [paper]

This project implements the framework proposed in this paper, which provides a general methodology for training abstractive summarization models to address hallucination problems: generated summaries may miss key information from the source document (low recall) or contain facts that are inconsistent with it (low precision). The framework uses question-answering-based rewards to further train pre-trained summarization models in a Reinforcement Learning (RL) context.
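
To give an intuition for the reward, the sketch below computes a simple QA-based consistency score: each question is answered once against the source document and once against the generated summary, and the reward is the mean token-level F1 between the two answers. This is a minimal illustration, not this repo's implementation; the QA model name and the caller-supplied question list are assumptions (the actual framework generates questions automatically).

# Minimal sketch of a QA-based reward; NOT this repo's actual code.
from collections import Counter

from transformers import pipeline

# Assumed off-the-shelf extractive QA model from the Hugging Face Hub.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def token_f1(pred, gold):
    """SQuAD-style token-level F1 between two answer strings."""
    pred_tokens, gold_tokens = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def qa_reward(source, summary, questions):
    """Mean F1 between answers extracted from the source and the summary."""
    scores = []
    for question in questions:
        answer_from_source = qa(question=question, context=source)["answer"]
        answer_from_summary = qa(question=question, context=summary)["answer"]
        scores.append(token_f1(answer_from_summary, answer_from_source))
    return sum(scores) / len(scores) if scores else 0.0

A scalar reward of this kind can then be passed to an RL algorithm (e.g. policy gradient) as the return for each generated summary.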

Features

  • [DONE] Experiments with GPT-2 and PEGASUS on the XSUM dataset
  • [TODO] Training with BART
  • [TODO] Experiments with the SAMSUM dataset

How to run

  • The GPT-2 model trained with the question-answering-based reward can be downloaded from here
  • The pre-trained GPT-2 summarization model can be downloaded from here
  • The PEGASUS model trained with the question-answering-based reward can be downloaded from here
  • The pre-trained PEGASUS summarization model can be downloaded from here

Install

# requires Python 3.7
pip install --upgrade pip
pip install -r requirements.txt

Demo

python run_demo_server.py --port PORT --model_type TYPE --model_path PATH --model_ref_path PATH
  • PORT: port to run the server on (by default the server runs at http://localhost:8769)
  • model_type: type of pre-trained summarization model. Choose one of the following options:
    • gpt2
    • google_pegasus_xsum
  • model_path: path to the model trained with the question-answering-based reward
  • model_ref_path: path to the pre-trained summarization (reference) model
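
For example, to serve the GPT-2 models (the checkpoint paths below are placeholders; substitute the locations of the models you downloaded or trained):

python run_demo_server.py --port 8769 --model_type gpt2 --model_path ./checkpoint/gpt2 --model_ref_path ./pretrained/gpt2_summarizer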

Training

python training.py --pretrained_model_path PRETRAINED_PATH --summary_model_name MODEL_NAME
  • pretrained_model_path: path to the pre-trained summarization model
  • summary_model_name: type of pre-trained summarization model. Choose one of the following options:
    • gpt2
    • google_pegasus_xsum
  • The trained model will be saved to ./checkpoint/{summary_model_name}
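
For example (the pre-trained model path below is a placeholder for the summarization model downloaded above):

python training.py --pretrained_model_path ./pretrained/gpt2_summarizer --summary_model_name gpt2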

Eval

python eval.py --model_path MODEL_PATH --model_ref_path MODEL_REF_PATH --model_type MODEL_TYPE
  • model_path: path to the model trained with the question-answering-based reward
  • model_ref_path: path to the pre-trained summarization model
  • model_type: type of pre-trained summarization model. Choose one of the following options:
    • gpt2
    • google_pegasus_xsum
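
For example (paths are placeholders, as above):

python eval.py --model_path ./checkpoint/gpt2 --model_ref_path ./pretrained/gpt2_summarizer --model_type gpt2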
