A JAX/stax implementation of the NeurIPS 2020 paper *Discovering Reinforcement Learning Algorithms* [1].
The agent in `lpg/agent.py` implements the `bsuite.baselines.base.Agent` interface.
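For readers unfamiliar with that interface, here is a minimal sketch of its shape. The `TimeStep` class below is a hypothetical stand-in for `dm_env.TimeStep`, and `RandomAgent` is only an illustration of an agent conforming to the interface, not the LPG agent from this repo:

```python
import abc
import random
from typing import Any, NamedTuple


class TimeStep(NamedTuple):
    """Stand-in for dm_env.TimeStep (simplified for illustration)."""
    observation: Any
    reward: float


class Agent(abc.ABC):
    """Mirrors the shape of the bsuite.baselines.base.Agent interface."""

    @abc.abstractmethod
    def select_action(self, timestep: TimeStep) -> int:
        """Pick an action given the latest timestep."""

    @abc.abstractmethod
    def update(self, timestep: TimeStep, action: int,
               new_timestep: TimeStep) -> None:
        """Learn from a single (s, a, s') transition."""


class RandomAgent(Agent):
    """Trivial conforming agent: samples uniform random actions."""

    def __init__(self, num_actions: int, seed: int = 0):
        self._num_actions = num_actions
        self._rng = random.Random(seed)

    def select_action(self, timestep: TimeStep) -> int:
        return self._rng.randrange(self._num_actions)

    def update(self, timestep, action, new_timestep) -> None:
        pass  # a learning agent would update its parameters here
```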
The environments in `lpg/environments/*.py` interface with `dm_env.Environment`. We wrap the gym Atari suite with the `bsuite.utils.gym_wrapper.DMEnvFromGym` adapter into a `dqn.AtariEnv`, which implements observation history and action repeat.
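The observation-history and action-repeat mechanics can be sketched as follows. This is a hypothetical, dependency-free illustration of the idea, not the repo's `dqn.AtariEnv`; the wrapper and demo environment names are invented for the example:

```python
from collections import deque


class HistoryRepeatWrapper:
    """Illustrative wrapper: stacks the last `history` observations and
    repeats each action `repeat` times, summing intermediate rewards."""

    def __init__(self, env, history=4, repeat=4):
        self._env = env
        self._repeat = repeat
        self._frames = deque(maxlen=history)

    def reset(self):
        obs = self._env.reset()
        # fill the history buffer with the initial observation
        for _ in range(self._frames.maxlen):
            self._frames.append(obs)
        return tuple(self._frames)

    def step(self, action):
        total_reward, done = 0.0, False
        for _ in range(self._repeat):
            obs, reward, done = self._env.step(action)
            total_reward += reward
            if done:
                break
        self._frames.append(obs)
        return tuple(self._frames), total_reward, done


class CountEnv:
    """Toy environment for demonstration: observation is a step counter."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 10
```

With `history=2` and `repeat=3`, one call to `step` advances the toy environment three times, returns the summed reward, and appends only the final observation to the stacked history.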
To run the algorithm on a GPU, I suggest installing the GPU version of JAX [4]. You can then install this repo using Anaconda Python and pip:
```shell
conda env create -n lpg
conda activate lpg
pip install git+https://github.com/epignatelli/discovering-reinforcement-learning-algorithms
```
Pip-installing the GitHub link above will install all the requirements needed to run my student.ipynb.
- To reproduce my experiments, simply run the code cells in student.ipynb after pip-installing the repo as above. (Depending on the device, the visualizations/rendering might not work; they only work on one of my workstations.)
- a) The source code (all *.py files) is from https://github.com/epignatelli/discovering-reinforcement-learning-algorithms. b) I have not modified the source code, only read and applied it. c) As stated at the top of student.ipynb, every piece of code I wrote is in that file.
- Since this is an RL agent, I did not use "datasets"; instead, I used different environments, which are described extensively in my paper.