- Main task (mandatory): Customer service dialog using Twitter
  (*) Tools are provided to download the Twitter data and convert it to the dialog format.
  Task A: All or part of the training data will be used to train conversation models.
  Task B: Any open data, e.g. from the web, may be used as external knowledge to generate informative sentences, but it must not overlap with the training, validation, and test data provided by the organizers.
- Pilot task: Movie scenario dialog using OpenSubtitles

The tools and data sets are available to DSTC6 attendees. To register, please visit the following page:
http://workshop.colips.org/dstc6/index.html
After registration, we will provide an access token for downloading the tools and data sets.
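
Task B requires that external data not overlap with the official training, validation, and test data. The sketch below shows one crude way to screen an external corpus: it drops every line that also appears verbatim in any official file. The function name `filter_overlap` and all file names are hypothetical, and exact-line matching is only an assumption about what counts as overlap; the organizers may intend a stricter definition.

```shell
# Sketch only: exact-line matching is a crude overlap test, and every file
# name passed to this function is hypothetical.

# filter_overlap EXTERNAL OFFICIAL...
# Print the lines of EXTERNAL that do not occur verbatim as a line in any
# OFFICIAL file (training, validation, or test data).
filter_overlap() {
    external="$1"
    shift
    patterns=$(mktemp) || return 1
    cat "$@" > "$patterns"
    # -F: fixed strings, -x: whole-line match, -v: keep non-matching lines
    grep -Fxv -f "$patterns" "$external"
    status=$?
    rm -f "$patterns"
    # grep exits with 1 when it prints nothing; only >1 is a real error
    [ "$status" -le 1 ]
}
```

For example, `filter_overlap external.txt train.txt dev.txt test.txt > external_clean.txt` would keep only the external lines not found in the three official files (all four names are placeholders).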
- Prepare the data set using the collect_twitter_dialogs scripts
  (see collect_twitter_dialogs/README.md).
- Extract training and development sets from the stored Twitter dialog data:
  use make_trial_data.sh in tasks/twitter.
  Note: the extracted data are trial data at this moment.
- Run the baseline system (optional):
  copy the data files into ChatbotBaseline/egs/twitter and execute run.sh in that directory (see ChatbotBaseline/egs/twitter/README.md).
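
The Twitter steps above can be strung together with two small helpers, sketched here. The helper names `run_script` and `copy_data` are hypothetical, and the sketch assumes the scripts can be run with bash without arguments from inside their own directories; check the READMEs mentioned above before relying on that.

```shell
# Sketch only: assumes make_trial_data.sh and run.sh need no arguments and
# run under bash. Neither helper is part of the provided tools.

# run_script DIR SCRIPT: run SCRIPT from inside DIR, failing early if missing.
run_script() {
    dir="$1"
    script="$2"
    if [ ! -f "$dir/$script" ]; then
        echo "missing: $dir/$script" >&2
        return 1
    fi
    # Subshell so the caller's working directory is left unchanged
    (cd "$dir" && bash "./$script")
}

# copy_data DEST FILE...: copy the given data files into an existing DEST.
copy_data() {
    dest="$1"
    shift
    if [ ! -d "$dest" ]; then
        echo "missing directory: $dest" >&2
        return 1
    fi
    cp "$@" "$dest"/
}

# Example, run from the repository root (DATA_FILES is a placeholder for
# whatever files make_trial_data.sh produces):
#   run_script tasks/twitter make_trial_data.sh
#   copy_data ChatbotBaseline/egs/twitter $DATA_FILES
#   run_script ChatbotBaseline/egs/twitter run.sh
```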
- Download the OpenSubtitles2016 data:
  http://opus.lingfil.uu.se/download.php?f=OpenSubtitles2016/en.tar.gz
  and extract the XML files with
  $ tar zxvf en.tar.gz
- Extract training and development sets from the stored subtitle data:
  use make_trial_data.sh in tasks/opensubs.
  Note: the extracted data are trial data at this moment.
- Run the baseline system (optional):
  copy the data files into ChatbotBaseline/egs/opensubs and execute run.sh in that directory (see ChatbotBaseline/egs/opensubs/README.md).
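
Before running make_trial_data.sh on the subtitle data, it can help to confirm that the archive unpacked as expected. The sketch below counts the subtitle files under a directory. The function name `count_subtitles` is hypothetical, and because it is an assumption whether the archive stores plain `.xml` or compressed `.xml.gz` files, it counts both.

```shell
# Sketch only: the directory layout inside en.tar.gz is not assumed here;
# pass whatever directory the tar command created.

# count_subtitles DIR: print the number of *.xml and *.xml.gz files under DIR.
count_subtitles() {
    root="$1"
    if [ ! -d "$root" ]; then
        echo "not a directory: $root" >&2
        return 1
    fi
    # tr strips the padding that some wc implementations add
    find "$root" -type f \( -name '*.xml' -o -name '*.xml.gz' \) | wc -l | tr -d ' '
}
```

A count of 0 from `count_subtitles EXTRACTED_DIR` (a placeholder for the extracted directory) would suggest the extraction failed or that the files live somewhere else.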
- README.md : this file
- tasks : data preparation for each subtask
- collect_twitter_dialogs : scripts to collect twitter data
- ChatbotBaseline : a neural conversation model baseline system

You can get the latest updates and participate in discussions on the DSTC mailing list.
To join the mailing list, send an email to ([email protected]) with "subscribe DSTC" in the body of the message (without the quotes). To post a message, send it to ([email protected]).