Minimal implementations of distributed deep reinforcement learning algorithms, with a focus on recurrent neural networks. Heavily inspired by CleanRL and CORL, this library provides high-quality, easy-to-follow, stand-alone implementations of several distributed RL algorithms.
Prerequisites:
- python >= 3.10 (tested with 3.10)
To install:

```bash
git clone [email protected]:Jjschwartz/miniDRL.git
cd miniDRL
pip install -e .
# or to install all dependencies
pip install -e .[all]
```
Run PPO on the gymnasium CartPole-v1 environment using four parallel workers (reduce the number of workers if you have fewer than four cores, or feel free to increase it if you have more):
```bash
python minidrl/ppo/run_gym.py \
    --env_id CartPole-v1 \
    --total_timesteps 1000000 \
    --num_workers 4

# open another terminal and run tensorboard from the repo root directory
tensorboard --logdir runs
```
To use experiment tracking with wandb, run:
```bash
wandb login  # only required the first time

python minidrl/ppo/run_gym.py \
    --env_id CartPole-v1 \
    --total_timesteps 1000000 \
    --num_workers 4 \
    --track_wandb \
    --wandb_project minidrltest
```
This repository contains standalone implementations of some of the main distributed RL algorithms that support recurrent neural networks, including:
- PPO - Single Machine
- PPO - Multi Machine
- IMPALA
- R2D2 - Multi Machine

(Figure: learning curves of PPO - Single Machine on Atari Pong with different numbers of parallel workers.)
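These algorithms share an actor-learner layout: parallel workers collect rollouts and send them to a central learner. As a rough illustration of that pattern only (not miniDRL's actual API; all names here are hypothetical), a toy version using just the Python standard library:

```python
import queue
import threading


def worker(worker_id: int, out_q: queue.Queue, num_steps: int) -> None:
    # Stand-in for an environment worker: collect a fixed-length rollout
    # of (worker_id, step) pairs and hand it to the learner via the queue.
    # A real worker would step an environment and record observations,
    # actions, and rewards instead.
    out_q.put([(worker_id, step) for step in range(num_steps)])


def collect_rollouts(num_workers: int = 4, num_steps: int = 8) -> list:
    # The "learner" side: spawn workers, then gather one rollout from each
    # before performing an update.
    out_q: queue.Queue = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(i, out_q, num_steps))
        for i in range(num_workers)
    ]
    for t in workers:
        t.start()
    rollouts = [out_q.get() for _ in workers]
    for t in workers:
        t.join()
    return rollouts


if __name__ == "__main__":
    rollouts = collect_rollouts()
    print(f"collected {len(rollouts)} rollouts of length {len(rollouts[0])}")
```

miniDRL itself uses separate processes (and, for the multi-machine variants, multiple hosts) rather than threads, but the collect-then-learn flow is the same.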