This is the PyTorch implementation of the paper: UniConv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues. Hung Le, Doyen Sahoo, Chenghao Liu, Nancy F. Chen, Steven C.H. Hoi. EMNLP 2020 (arXiv, camera-ready version).
This code has been written using PyTorch 1.0.1. If you find the paper or the source code useful for your projects, please cite the following BibTeX:
@inproceedings{le-etal-2020-uniconv,
    title = "{U}ni{C}onv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues",
    author = "Le, Hung and Sahoo, Doyen and Liu, Chenghao and Chen, Nancy and Hoi, Steven C.H.",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-main.146",
    pages = "1860--1877"
}
Building an end-to-end conversational agent for multi-domain task-oriented dialogue has been an open challenge for two main reasons. First, tracking dialogue states of multiple domains is non-trivial, as the dialogue agent must obtain complete states from all relevant domains, some of which may share slots across domains as well as have slots unique to a single domain. Second, the dialogue agent must also process various types of information across domains, including dialogue context, dialogue states, and the database, to generate natural responses to users. Unlike existing approaches that are often designed to train each module separately, we propose "UniConv", a novel unified neural architecture for end-to-end conversational systems in multi-domain task-oriented dialogues, which is designed to jointly train (i) a Bi-level State Tracker, which tracks dialogue states by learning signals at both the slot and domain level independently, and (ii) a Joint Dialogue Act and Response Generator, which incorporates information from various input components and models dialogue acts and target responses simultaneously. We conduct comprehensive experiments in dialogue state tracking, context-to-text, and end-to-end settings on the MultiWOZ 2.1 benchmark, achieving superior performance over competitive baselines in all tasks.
Example of a multi-domain dialogue with two domains: restaurant and attraction
Our unified architecture has three components: (1) Encoders, which encode all text input into continuous representations; (2) a Bi-level State Tracker (BDST), which includes two modules for slot-level and domain-level representation learning; and (3) a Joint Dialogue Act and Response Generator (DARG), which obtains dependencies between the target response representations and the other dialogue components.
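For orientation, below is a minimal, schematic PyTorch sketch of how these three components could fit together. All class names, dimensions, and layer choices here are illustrative assumptions and do not correspond to the actual modules in this repo; see the source code for the real implementation.

```python
# Schematic sketch only: class names, shapes, and layers are illustrative assumptions,
# not the actual modules in this repo.
import torch
import torch.nn as nn
import torch.nn.functional as F


def attend(queries, keys):
    # Scaled dot-product attention: (batch, n_q, d) x (batch, n_k, d) -> (batch, n_q, d)
    scores = torch.bmm(queries, keys.transpose(1, 2)) / keys.size(-1) ** 0.5
    return torch.bmm(F.softmax(scores, dim=-1), keys)


class ContextEncoder(nn.Module):
    """(1) Encode dialogue context token indices into continuous representations."""
    def __init__(self, vocab_size, d):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.rnn = nn.GRU(d, d, batch_first=True)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        out, _ = self.rnn(self.embed(tokens))     # (batch, seq_len, d)
        return out


class BiLevelStateTracker(nn.Module):
    """(2) BDST: learn slot-level and domain-level representations of the context."""
    def __init__(self, d, num_domains, num_slots):
        super().__init__()
        self.domain_q = nn.Parameter(torch.randn(num_domains, d))
        self.slot_q = nn.Parameter(torch.randn(num_slots, d))

    def forward(self, context):                   # context: (batch, seq_len, d)
        b = context.size(0)
        domain_repr = attend(self.domain_q.unsqueeze(0).expand(b, -1, -1), context)
        slot_repr = attend(self.slot_q.unsqueeze(0).expand(b, -1, -1), context)
        return domain_repr, slot_repr


class ActResponseGenerator(nn.Module):
    """(3) DARG: condition response decoding on context and state representations."""
    def __init__(self, vocab_size, d):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.rnn = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(2 * d, vocab_size)

    def forward(self, response_tokens, memory):   # memory: context/state representations
        dec, _ = self.rnn(self.embed(response_tokens))
        attended = attend(dec, memory)            # dependencies on other dialogue components
        return self.out(torch.cat([dec, attended], dim=-1))
```

In the actual model, these components are trained jointly end-to-end, as described in the paper.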
Libraries required for this repo are listed in requirements.txt. An example script to install these libraries is provided in the setup.sh file.
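If you prefer to install the dependencies manually, a standard pip install -r requirements.txt inside a Python environment with PyTorch 1.0.1 should be sufficient; setup.sh remains the authoritative reference for the exact commands.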
We use the MultiWOZ benchmark, including both versions 2.0 and 2.1. Download the data here and unzip it into the root directory of the repo, e.g. UniConv/data2.0 and UniConv/data2.1.
The data includes the original and preprocessed MultiWOZ dialogues. The preprocessed data are in the multi-woz sub-folder of each version and include the following files:

- delex_data.pkl: delexicalized utterances, e.g. real restaurant names replaced by the placeholder token restaurant_name in all utterances
- slots.pkl: gathered domain and slot labels for the DST tasks
- dials.pkl: data divided into instances, each a dialogue turn with all annotations of states, acts, database pointers, etc.
- lang.pkl: collected unique tokens used to build the vocabulary sets
- encoded_data.pkl: data instances encoded as token indices
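As a quick sanity check after downloading, these files can be inspected with a short script like the following. This is a minimal sketch that assumes the files are standard Python pickles laid out as described above; the exact structure inside each file is defined by preprocess_data.py.

```python
# Minimal inspection sketch: assumes the preprocessed files are plain Python pickles
# located under data2.1/multi-woz (or data2.0/multi-woz), as described above.
import pickle
from pathlib import Path

data_dir = Path("data2.1/multi-woz")

for name in ["delex_data.pkl", "slots.pkl", "dials.pkl", "lang.pkl", "encoded_data.pkl"]:
    with open(data_dir / name, "rb") as f:
        obj = pickle.load(f)
    # Report the top-level type and size to get a feel for what each file holds.
    size = len(obj) if hasattr(obj, "__len__") else "n/a"
    print(f"{name}: {type(obj).__name__}, len={size}")
```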
The procedure for data preprocessing is detailed in the preprocess_data.py file.
We created run_exps.sh to train models, generate dialogue states and responses, and evaluate the generated states and responses with automatic metrics. You can run this file directly using the following syntax:

./run_exps.sh <setting> <stage>

where setting is a specific subtask, either dst (dialogue state tracking), c2t (context-to-text generation), or e2e (end-to-end system), and stage is from 1 to 3: stage 1 performs all steps (training, generating, and evaluating), stage 2 performs only generating and evaluating, and stage 3 performs only evaluating.
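For example, ./run_exps.sh dst 1 trains a dialogue state tracker and then generates and evaluates its predictions, while ./run_exps.sh c2t 3 only re-runs evaluation for a context-to-text model whose outputs have already been generated.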
During training, the model with the best validation performance is saved. The model output, parameters, vocabulary, and training and validation logs will be saved into the folder specified by the out_dir parameter. Other parameters, including data-related options, model parameters, and training and generation settings, are defined under the configs folder.
Examples of pretrained UniConv models trained with the sample script in run_exps.sh can be downloaded here. Unzip the downloaded file and update the out_dir parameter in the generating command (stage 2) of run_exps.sh to the corresponding unzipped directory, e.g. save/multiwoz2.1_c2t. Using the pretrained models, the test script produces the following results:
Model | Task | MultiWOZ version | Joint Acc | Slot Acc | Inform | Success | BLEU |
---|---|---|---|---|---|---|---|
multiwoz2.0_dst | state tracking | 2.0 | 46.35% | 97.14% | - | - | - |
multiwoz2.0_c2t | context-to-text | 2.0 | - | - | 84.7% | 76.3% | 19.83 |
multiwoz2.1_dst | state tracking | 2.1 | 48.85% | 97.24% | - | - | - |
multiwoz2.1_c2t | context-to-text | 2.1 | - | - | 84.5% | 74.2% | 19.97 |
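For example, to reproduce the multiwoz2.1_c2t row above, set out_dir in the stage-2 generating command to save/multiwoz2.1_c2t and run ./run_exps.sh c2t 2, which regenerates responses from the pretrained checkpoint and evaluates them.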