An implementation of the TD3 (Twin Delayed Deep Deterministic policy gradient) algorithm trained on the Roboschool HalfCheetah environment using PyTorch. The code is based on the original authors' implementation of TD3 (see the references below).
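The key ideas TD3 adds on top of DDPG are clipped double-Q learning, target policy smoothing, and delayed policy updates. The sketch below shows roughly how a single update step looks in PyTorch; the network sizes, function names, and default hyperparameters are illustrative assumptions, not the notebook's exact code.

```python
# Simplified sketch of a single TD3 update step (illustrative, not the notebook's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, state):
        return self.max_action * self.net(state)


class Critic(nn.Module):
    """Twin Q-networks: TD3 takes the minimum of the two target values."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.q1 = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
        self.q2 = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        sa = torch.cat([state, action], dim=1)
        return self.q1(sa), self.q2(sa)


def td3_update(actor, actor_target, critic, critic_target,
               actor_opt, critic_opt, batch, step, max_action,
               gamma=0.99, tau=0.005, policy_noise=0.2,
               noise_clip=0.5, policy_delay=2):
    state, action, reward, next_state, not_done = batch

    with torch.no_grad():
        # Target policy smoothing: add clipped noise to the target action.
        noise = (torch.randn_like(action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (actor_target(next_state) + noise).clamp(-max_action, max_action)

        # Clipped double-Q learning: use the smaller of the two target Q-values.
        target_q1, target_q2 = critic_target(next_state, next_action)
        target_q = reward + not_done * gamma * torch.min(target_q1, target_q2)

    # Update both critics towards the shared target.
    current_q1, current_q2 = critic(state, action)
    critic_loss = F.mse_loss(current_q1, target_q) + F.mse_loss(current_q2, target_q)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Delayed policy updates: update the actor and targets less frequently.
    if step % policy_delay == 0:
        actor_loss = -critic(state, actor(state))[0].mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()

        # Soft-update the target networks.
        for target, online in ((actor_target, actor), (critic_target, critic)):
            for tp, p in zip(target.parameters(), online.parameters()):
                tp.data.mul_(1 - tau).add_(tau * p.data)
```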
These instructions show how to set up a conda environment with all of the project's requirements.
conda create -n rl_dev python=3.6
conda activate rl_dev
git clone https://github.com/djbyrne/TD3.git
cd TD3
python setup.py install
jupyter notebook
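After installation, a quick sanity check can confirm that the Roboschool HalfCheetah environment loads. This is a hypothetical snippet, assuming gym and roboschool were installed by the setup step above.

```python
# Quick check that the Roboschool HalfCheetah environment is available.
import gym
import roboschool  # importing roboschool registers its environments with gym

env = gym.make("RoboschoolHalfCheetah-v1")
state = env.reset()
print("observation shape:", env.observation_space.shape)
print("action shape:", env.action_space.shape)
```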
The notebook uses the same hyperparameters and architecture described in the original paper. The agent was trained for 5 million timesteps and converged on a successful policy after roughly 500k timesteps. The results below show the agent's average score over the previous 100 episodes.
As you can see, the agent learned rapidly and then briefly fell into a local optimum, but it quickly recovered. I believe that with hyperparameter tuning and a larger sample of trained agents, the results could improve further.
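The reported score is a 100-episode moving average of episode returns. The sketch below is a generic illustration of that calculation, not the notebook's exact logging code.

```python
# Illustrative 100-episode moving average of episode returns.
from collections import deque

import numpy as np


def moving_average_scores(episode_returns, window=100):
    """Average of the most recent `window` episode returns, computed per episode."""
    recent = deque(maxlen=window)  # keeps only the last `window` returns
    averages = []
    for ret in episode_returns:
        recent.append(ret)
        averages.append(float(np.mean(recent)))
    return averages
```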
- Scott Fujimoto's TD3 implementation
- OpenAI Spinning Up
- OpenAI Baselines