# Adaptive Transformers in RL

Official implementation of [Adaptive Transformers in RL](http://arxiv.org/abs/2004.03761)

In this work we replicate several results from [Stabilizing Transformers for RL](https://arxiv.org/abs/1910.06764) on both [Pong](https://gym.openai.com/envs/Pong-v0/) and [rooms_select_nonmatching_object](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/dmlab30#select-non-matching-object) from DMLab30.
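
As background, the central change in Stabilizing Transformers (GTrXL) is to replace each residual connection with a GRU-style gating layer whose bias is initialized so the block starts near an identity map. Below is a minimal sketch of that gate, following the paper's equations rather than this repository's exact code:

```
import torch
import torch.nn as nn

class GRUGate(nn.Module):
    """GRU-style gating from "Stabilizing Transformers for RL" (GTrXL):
    replaces the residual connection y = x + sublayer(x)."""

    def __init__(self, dim: int, bias_init: float = 2.0):
        super().__init__()
        self.w_r = nn.Linear(dim, dim, bias=False)
        self.u_r = nn.Linear(dim, dim, bias=False)
        self.w_z = nn.Linear(dim, dim, bias=False)
        self.u_z = nn.Linear(dim, dim, bias=False)
        self.w_g = nn.Linear(dim, dim, bias=False)
        self.u_g = nn.Linear(dim, dim, bias=False)
        # Bias initialized > 0 so the gate starts close to the identity map,
        # which is what stabilizes early training.
        self.bias = nn.Parameter(torch.full((dim,), bias_init))

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # x: sublayer input (the residual stream), y: sublayer output.
        r = torch.sigmoid(self.w_r(y) + self.u_r(x))
        z = torch.sigmoid(self.w_z(y) + self.u_z(x) - self.bias)
        h = torch.tanh(self.w_g(y) + self.u_g(r * x))
        return (1.0 - z) * x + z * h
```

Here `x` is the sublayer input and `y` its output, so the gate is used as `h = gate(x, attention(layer_norm(x)))` in place of `x + attention(layer_norm(x))`.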

We also extend the Stable Transformer architecture with [Adaptive Attention Span](https://arxiv.org/abs/1905.07799) in a partially observable (POMDP) reinforcement-learning setting. To our knowledge, this is one of the first attempts to stabilize and explore adaptive attention spans in an RL domain.
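
For intuition, here is a minimal sketch of the soft masking function from Adaptive Attention Span, using the paper's notation (ramp length R and a learned span); it illustrates the technique and is not necessarily the exact module used in this repository:

```
import torch
import torch.nn as nn

class AdaptiveSpanMask(nn.Module):
    """Soft mask m_z(x) = clamp((R + z - x) / R, 0, 1) from
    "Adaptive Attention Span in Transformers", where x is the distance to
    an attended position, z the learned span, and R the ramp length."""

    def __init__(self, max_span: int, ramp: int = 32, init_frac: float = 0.5):
        super().__init__()
        self.max_span = max_span
        self.ramp = ramp
        # A single learned span for simplicity; the paper learns one per head.
        self.span_frac = nn.Parameter(torch.tensor(init_frac))

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (..., seq_len) attention weights over past positions,
        # ordered oldest to newest, so the last position has distance 0.
        seq_len = attn.size(-1)
        x = torch.arange(seq_len - 1, -1, -1, device=attn.device, dtype=attn.dtype)
        span = self.span_frac.clamp(0, 1) * self.max_span
        mask = torch.clamp((self.ramp + span - x) / self.ramp, min=0, max=1)
        attn = attn * mask
        return attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-8)
```

An L1 penalty on the learned span encourages each head to attend only as far back as it needs, which is what makes the span adaptive.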

### Steps to replicate our experiments on your own machine
1. Downloading DMLab:
```
python train.py --total_steps 20000000 \
--num_actors 32 --num_learner_threads 1 --sleep_length 20 \
--level_name rooms_select_nonmatching_object --mem_len 200
```
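
In the command above, `--mem_len 200` corresponds to the length of the Transformer-XL-style memory that the Stable Transformer attends over. Assuming that semantics (the flag is not documented in this excerpt), the memory update between segments is conceptually:

```
import torch

@torch.no_grad()  # the memory is cached activations; no gradients flow through it
def update_memory(mem: torch.Tensor, hidden: torch.Tensor, mem_len: int) -> torch.Tensor:
    """Keep the most recent `mem_len` hidden states (time-major tensors of
    shape (time, batch, dim)) as the memory the next segment attends over."""
    return torch.cat([mem, hidden], dim=0)[-mem_len:]
```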

### Reference
If you find this repository useful, please cite it with:
```
@article{kumar2020adaptive,
    title={Adaptive Transformers in RL},
    author={Shakti Kumar and Jerrod Parker and Panteha Naderian},
    year={2020},
    eprint={2004.03761},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```