We present a corpus of spoken dialogs collected to support research in the automatic detection of times of dissatisfaction. We collected 191 mock customer-merchant dialogs in two conditions: one where the scripts guided the participants to a satisfactory, mutually agreeable outcome, and one where agreement was precluded. Most dialogs were 1 to 5 minutes in length. The corpus and metadata are freely available for research purposes.
We have also created a bilingual speech corpus of parallel utterances, the Dialogs Re-enacted Across Languages (DRAL) Corpus, which is also freely available: https://www.cs.utep.edu/nigel/dral/
The corpus is available for research purposes from the authors. It includes:
annotations
– dissatisfaction labels for customer utterances (see annotation-guide.txt)
calls
– 191 English dialogs in WAV format
calls-non-English
– 3 Japanese dialogs in WAV format
call-log.xlsx
– a spreadsheet including the following metadata for each dialog: date, scenario ID, participant ID, confederate ID, and notes
train-dev-test-sets.txt
– a list of the dialogs belonging to the training, development, and test sets used in our experiments
report.pdf
– a PDF copy of this report
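As a rough illustration of how the audio in the release might be inspected, the following sketch computes the duration of each dialog and the fraction that falls in the 1 to 5 minute range mentioned above. It rests on assumptions not stated in this report: that the WAV files sit directly inside the calls directory, that they use a lowercase .wav extension, and that they are uncompressed PCM readable by Python's standard wave module. The function names are hypothetical and are not part of the corpus distribution.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize_durations(calls_dir):
    """Map each .wav file name in calls_dir to its duration in seconds.

    Assumes a flat directory layout (no subfolders); this is a guess
    about the release, not something the report specifies.
    """
    return {
        p.name: wav_duration_seconds(p)
        for p in sorted(Path(calls_dir).glob("*.wav"))
    }


def fraction_in_range(durations, low=60.0, high=300.0):
    """Fraction of dialogs whose duration lies in [low, high] seconds."""
    if not durations:
        return 0.0
    hits = sum(1 for d in durations.values() if low <= d <= high)
    return hits / len(durations)
```

With the corpus unpacked, calling summarize_durations("calls") and passing the result to fraction_in_range would give a quick check of the length distribution. If the files turn out to be nested or compressed, the glob pattern and the reader would need to change accordingly.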