SimpleDS

A Simple Deep Reinforcement Learning Dialogue System

DESCRIPTION

SimpleDS is a computational framework for training task-oriented dialogue systems with deep reinforcement learning. In contrast to other dialogue systems, this system selects dialogue actions directly from either raw (noisy) text or word embeddings of the last system and user responses; support for raw audio is in progress. The motivation is to train dialogue agents with as little human intervention as possible.

This system runs under a client-server architecture, where the learning agent (in JavaScript) acts as the "client" and the environment (in Java) acts as the "server". They communicate by exchanging messages: the client tells the server the action to execute, and the server tells the client the actions available, the environment state and the rewards observed. SimpleDS is a (spoken) dialogue system on top of ConvNetJS with support for multi-threaded and client-server processing, and fast learning via constrained search spaces.
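The message exchange described above can be illustrated with a minimal in-process sketch. It is not the SimpleDS protocol: the environment is a toy stand-in for the Java server, the message fields (`state`, `actions`, `reward`, `done`), the action names, and the helper names `makeToyEnvironment`, `chooseAction` and `runDialogue` are all assumptions made for illustration, and a uniformly random choice stands in for the ConvNetJS-based agent. The one idea it tries to show is that the client only ever picks among the actions the server reports as available.

```javascript
// Hypothetical toy "server": an in-process stand-in for the Java environment.
// Field names and action names are illustrative assumptions, not SimpleDS's own.
function makeToyEnvironment(maxTurns) {
  let turn = 0;
  return {
    // Server -> client: environment state, available actions, observed reward.
    observe(lastReward) {
      return {
        state: [turn / maxTurns],
        actions: turn === 0 ? ["greet"] : ["ask", "confirm", "close"],
        reward: lastReward,
        done: turn >= maxTurns,
      };
    },
    // Client -> server: the action to execute. Returns a toy reward.
    execute(action) {
      turn += 1;
      return action === "close" ? 1 : 0;
    },
  };
}

// Hypothetical "client" policy: pick uniformly among the allowed actions only.
// A trained agent would replace this random choice with its learnt policy.
function chooseAction(message, random = Math.random) {
  const index = Math.floor(random() * message.actions.length);
  return message.actions[index];
}

// Run one dialogue and return the sequence of (action, reward) pairs.
function runDialogue(env, random = Math.random) {
  const trace = [];
  let message = env.observe(0);
  while (!message.done) {
    const action = chooseAction(message, random);
    const reward = env.execute(action);
    trace.push({ action, reward });
    message = env.observe(reward);
  }
  return trace;
}
```

Calling `runDialogue(makeToyEnvironment(3))` returns three action/reward pairs. In the real system the two sides run as separate processes and exchange these messages over a connection rather than through function calls.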

This system has been tested with simulated and real dialogues using the Google Speech Recogniser. It has also been tested in three different languages: English, German and Spanish. SimpleDS is for experimental purposes, represents work in progress, and is therefore released without any guarantees.

SOFTWARE

This system was implemented and tested under Linux and Mac OS X with the following software, though it should run on other operating systems with minor modifications.

  • Ubuntu 14.10.4 / Mac OS X 10.10 / Windows 10
  • Java 1.8.0 or higher
  • Ant 1.9.3 or higher
  • Node 0.10.25 or higher
  • Octave 3.8.0 or higher
  • Android 4.4.3 (optional)

DOWNLOAD

You can download the system directly from the command line:

git clone https://github.com/cuayahuitl/SimpleDS.git

You can also download the system as a zip file using the following URL, and then unzip it in your path of preference. https://github.com/cuayahuitl/SimpleDS/archive/master.zip

You should download pre-trained word vectors if you want support for word embeddings, e.g. http://nlp.stanford.edu/data/glove.6B.zip and put the text file of your choice under YourPath/SimpleDS/resources/English. Apply the same procedure for other languages.
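For orientation, the GloVe text files store one word per line, followed by its vector components separated by single spaces. The sketch below parses that format into a lookup table. The function name `parseGlove` is hypothetical, and this is not necessarily how SimpleDS itself loads embeddings.

```javascript
// Hypothetical helper: parse GloVe-style text ("word v1 v2 ... vN" per line)
// into a Map from word to a Float64Array of its vector components.
function parseGlove(text) {
  const vectors = new Map();
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines, e.g. the trailing newline
    const parts = trimmed.split(" ");
    const word = parts[0];
    // Every remaining field is one vector component.
    vectors.set(word, Float64Array.from(parts.slice(1), Number));
  }
  return vectors;
}
```

With Node, this could be applied to a downloaded file via `parseGlove(require("fs").readFileSync(path, "utf8"))`, where `path` points at the text file you placed under the resources folder. The larger files in the archive may be too big to read in one go, in which case a line-by-line reader would be the safer choice.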

COMPILATION

cd YourPath/SimpleDS

ant

EXECUTION

You can run the system from two terminals:

Terminal1:YourPath/SimpleDS>ant SimpleDS

Terminal2:YourPath/SimpleDS/web/main>nodejs runclient.js (train|test) [num_dialogues] [-v|-nv]


For practical reasons, you can specify the number of dialogues and verbose mode from the command line. These values override the ones specified in the file config.txt.
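The override behaviour can be sketched as follows. This is an illustration rather than the code in runclient.js: the function name `resolveSettings` is hypothetical, the config keys `Dialogues` and `Verbose` are taken from the CONFIGURATION section, and reading `-v` as verbose and `-nv` as non-verbose is an assumption.

```javascript
// Hypothetical helper: command-line arguments take precedence over config values.
// argv is expected in the form: (train|test) [num_dialogues] [-v|-nv]
function resolveSettings(argv, config) {
  const [mode, ...rest] = argv;
  if (mode !== "train" && mode !== "test") {
    throw new Error("usage: runclient.js (train|test) [num_dialogues] [-v|-nv]");
  }
  // Start from the config file values, then apply any overrides.
  const settings = { mode, dialogues: config.Dialogues, verbose: config.Verbose };
  for (const arg of rest) {
    if (arg === "-v") {
      settings.verbose = true; // assumed meaning: verbose on
    } else if (arg === "-nv") {
      settings.verbose = false; // assumed meaning: verbose off
    } else if (/^\d+$/.test(arg)) {
      settings.dialogues = Number(arg);
    } else {
      throw new Error("unrecognised argument: " + arg);
    }
  }
  return settings;
}
```

For example, `resolveSettings(["train", "500", "-v"], { Dialogues: 2000, Verbose: false })` yields 500 dialogues in verbose mode, while `resolveSettings(["test"], { Dialogues: 1, Verbose: true })` keeps both config values.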

The outputs of the training phase consist of the learnt interaction policy (a json file under the folder 'results/language') and logged performance metrics (a txt file under the same folder). Depending on the config file, the metrics produce multiple rows with the following information: number of dialogues, average reward, epsilon value, number of actions per state, number of dialogues, and execution time (in hours). The outputs of the test phase are similar, except that no learnt policy is generated. In addition, executing the system in verbose mode prints out the training/test dialogues, according to the specified parameters.

PLOTTING

You can visualise a learning curve of the SimpleDS agent, with the number of learning steps on the x-axis and the average reward and learning time on the y-axis. Learning curves can be generated for newly trained or pre-trained policies in the currently supported languages (English, German and Spanish).

cd YourPath/SimpleDS

octave scripts/plotdata.m results/english/simpleds-output.txt

[From the command line, press the space bar key for termination]

or

cd YourPath/SimpleDS

octave scripts/plotdata.m results/english/simpleds-output.txt results/english/simpleds-output.png

[From the command line, press the space bar key for termination]

The latter generates an image of the plot in png (Portable Network Graphics) format. The file plotdata.m can also be used from Matlab if that software is preferred. The following learning curves (available from YourPath/results//.png) can be obtained with the default parameters for the supported languages: English, German and Spanish.

The following learning curve was generated from image-based supervised learning: spectrogram.

CONFIGURATION

The config file "YourPath/SimpleDialogueSystem/config.txt" has a number of parameters, such as the number of dialogues, verbose outputs and saving frequency. You may want to set Verbose=false during training and Verbose=true during testing. You may also want to set a high number of dialogues during training (e.g. Dialogues=2000) and a low one during tests (e.g. Dialogues=1). If you want different verbalisations, you can change the system/user responses; in that case you will also want to update the demonstration dialogues in the folder YourPath/SimpleDS/data/.
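As a rough illustration of how such settings could be read, the sketch below parses `Key=Value` lines into an object. The line format is inferred only from the examples above (Verbose=false, Dialogues=2000). The function name `parseConfig` and the handling of `#` comment lines are assumptions, so check them against the actual config.txt before relying on this.

```javascript
// Hypothetical helper: parse "Key=Value" lines into a plain object.
// Assumptions: one setting per line; blank lines and lines starting
// with "#" are ignored; lines without "=" are skipped.
function parseConfig(text) {
  const config = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (value === "true" || value === "false") {
      config[key] = value === "true"; // booleans such as Verbose
    } else if (/^-?\d+$/.test(value)) {
      config[key] = Number(value); // integers such as Dialogues
    } else {
      config[key] = value; // everything else stays a string
    }
  }
  return config;
}
```

For instance, `parseConfig("Dialogues=2000\nVerbose=false\n")` gives an object with `Dialogues` set to the number 2000 and `Verbose` set to the boolean false.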

REFERENCES

SimpleDS has been applied to spoken dialogue systems and interactive games. See the following references for further information.

See "How to apply SimpleDS to interactive systems" if you would like to use SimpleDS in your own system.

COMMENTS/QUESTIONS/COLLABORATIONS?

Contact: Heriberto Cuayahuitl

