add Element-Research mention and updated RMVA
nicholas-leonard committed Jul 26, 2016
1 parent a95d34f · commit 4e7626f
Showing 3 changed files with 11 additions and 5 deletions.
10 changes: 7 additions & 3 deletions blog/_posts/2015-09-21-rmva.md
@@ -366,9 +366,13 @@ Here are some results for the Translated MNIST dataset :
For this dataset, the images are of size `1x60x60` where each image contains a randomly placed `1x28x28` MNIST digit.
The `3x12x12` glimpse uses a depth of 3 scales
where each successive patch is twice the height and width of the previous one.
- Training with this dataset was started about 3 days prior to this blog post.
- For 7 glimpses, after 193 epochs, we get 1.223% error. Note that the model is still training.
- The paper gets 1.22% and 1.2% error for 6 and 8 glimpses, respectively.
+ After 683 epochs of training on the Translated MNIST dataset, using 7 glimpses, we obtain 0.92% error.
+ The paper reaches 1.22% and 1.2% error for 6 and 8 glimpses, respectively.
+ The exact command used to obtain those results:

```lua
th examples/recurrent-visual-attention.lua --cuda --dataset TranslatedMnist --unitPixels 26 --learningRate 0.001 --glimpseDepth 3 --maxTries 200 --stochastic --glimpsePatchSize 12
```
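As a rough sketch (not part of the original post), here is how the `--glimpsePatchSize 12` and `--glimpseDepth 3` settings in the command above translate into the three patch scales described earlier, each successive patch doubling the height and width of the previous one before being rescaled back to `12x12`:

```lua
-- Minimal illustration of the depth-3 glimpse geometry (assumed from the
-- post's description, not taken from the training script itself).
local glimpsePatchSize, glimpseDepth = 12, 3
for d = 1, glimpseDepth do
   local size = glimpsePatchSize * 2 ^ (d - 1)
   print(string.format("scale %d: %dx%d patch, rescaled to %dx%d",
      d, size, size, glimpsePatchSize, glimpsePatchSize))
end
-- prints 12x12, 24x24 and 48x48 patches, all rescaled to 12x12
```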

Note : you can evaluate your models with the [evaluation script](https://github.com/Element-Research/rnn/blob/master/scripts/evaluate-rva.lua).
It will generate a sample of glimpse sequences and print the confusion matrix results for the test set.
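For readers who want to reproduce that kind of report outside the script, here is a minimal standalone sketch using `optim.ConfusionMatrix` (the evaluation script's actual interface may differ; the scores and targets below are made up):

```lua
require 'optim'

-- Accumulate a 10-class confusion matrix from (score, target) pairs and
-- print it, similar in spirit to the test-set report mentioned above.
local cm = optim.ConfusionMatrix(10)
for i = 1, 100 do
   local scores = torch.randn(10)   -- stand-in for the model's class scores
   local target = torch.random(10)  -- stand-in for the ground-truth digit (1..10)
   cm:add(scores, target)
end
cm:updateValids()
print(cm)               -- per-class and global accuracies plus the matrix
print(cm.totalValid)    -- overall test accuracy
```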
6 changes: 4 additions & 2 deletions blog/_posts/2016-07-25-nce.md
@@ -4,7 +4,7 @@ title: Language modeling a billion words
comments: True
author: nicholas-leonard
excerpt: Noise contrastive estimation is used to train a multi-GPU recurrent neural network language model on the Google billion words dataset.
- picture: https://raw.githubusercontent.com/torch/torch.github.io/master/blog/_posts/images/rnnlm.png
+ picture: https://raw.githubusercontent.com/torch/torch.github.io/master/blog/_posts/images/rnnlm-small.png
---

<!---# Language modeling a billion words -->
@@ -18,10 +18,12 @@ picture: https://raw.githubusercontent.com/torch/torch.github.io/master/blog/_po
* [Future work](#nce.future)
* [References](#nce.ref)

+ In our last post, we presented a [recurrent model for visual attention](http://torch.ch/blog/2015/09/21/rmva.html)
+ which combined reinforcement learning with recurrent neural networks.
In this Torch blog post, we use noise contrastive estimation (NCE) [[2]](#nce.ref)
to train a multi-GPU recurrent neural network language model (RNNLM)
on the Google billion words (GBW) dataset [[7]](#nce.ref).
- The work presented here is the result of many months of on-and-off work.
+ The work presented here is the result of many months of on-and-off work at [Element-Research](https://www.discoverelement.com/research).
The enormity of the dataset caused us to contribute some novel open-source Torch modules, criteria and even a multi-GPU tensor.
We also provide scripts so that you can train and evaluate your own language models.
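
As a back-of-the-envelope illustration of the idea (the numbers below are made up, and this is not the rnn package implementation), NCE replaces the expensive softmax over the full vocabulary with a binary discrimination between each target word and `k` words sampled from a noise distribution:

```lua
-- Sketch of the NCE posterior for a single target word, under the usual
-- formulation with k noise samples per data word (illustrative values only).
local k = 25              -- noise samples drawn per target word
local p_model = 0.01      -- model's (unnormalized) probability for the word
local p_noise = 0.0002    -- noise (e.g. unigram) probability of the same word

-- probability that the word came from the data rather than the noise
local p_data = p_model / (p_model + k * p_noise)
print(p_data)

-- training maximizes log(p_data) for observed words and log(1 - p_data)
-- for the k sampled noise words, so no full-vocabulary softmax is needed.
```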

Binary file added blog/_posts/images/rnnlm-small.png
