GitHub - danlg/reinforcement-learning: Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Overview

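This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms. These are meant to serve as a learning tool to complement the theoretical materials from Sutton's book, Reinforcement Learning: An Introduction (2nd Edition), and David Silver's Reinforcement Learning course.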

Each folder corresponds to one or more chapters of the above textbook and/or course. In addition to exercises and solutions, each folder also contains a list of learning goals, a brief concept summary, and links to the relevant readings.

All code is written in Python 3 and uses RL environments from OpenAI Gym. The more advanced techniques use TensorFlow for neural network implementations.
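
As a rough illustration of the interaction pattern the exercises build on, the sketch below runs two simple policies against a tiny stand-in environment. `CorridorEnv` and `run_episode` are hypothetical names, not part of this repository or of Gym. The class only mimics the classic Gym interface from older Gym releases, in which `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`. With Gym installed, an environment created by `gym.make(...)` would play the same role.

```python
import random


class CorridorEnv:
    """Hypothetical 1-D corridor: start in cell 0, episode ends at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clip the move.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 0.0 if done else -1.0
        return self.state, reward, done, {}


def run_episode(env, policy, max_steps=1000):
    """Roll out one episode and return (total_reward, steps_taken)."""
    state = env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = policy(state)
        state, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            return total_reward, t + 1
    return total_reward, max_steps


random.seed(0)
env = CorridorEnv()
random_return, random_steps = run_episode(env, lambda s: random.choice([0, 1]))
right_return, right_steps = run_episode(env, lambda s: 1)
print("random policy:", random_return, random_steps)
print("always right: ", right_return, right_steps)
```

The always-right policy reaches the goal in four steps, which is the shortest possible episode here. The random policy can never do better, and that gap is what the control algorithms in later folders try to close.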

Table of Contents

Introduction to RL problems & OpenAI Gym
MDPs and Bellman Equations
Dynamic Programming: Model-Based RL, Policy Iteration and Value Iteration
Monte Carlo Model-Free Prediction & Control
Temporal Difference Model-Free Prediction & Control
Function Approximation
Deep Q Learning (WIP)
Policy Gradient Methods (WIP)
Learning and Planning (WIP)
Exploration and Exploitation (WIP)

List of Implemented Algorithms

[Dynamic Programming Policy Evaluation](DP/Policy Evaluation Solution.ipynb)
[Dynamic Programming Policy Iteration](DP/Policy Iteration Solution.ipynb)
[Dynamic Programming Value Iteration](DP/Value Iteration Solution.ipynb)
[Monte Carlo Prediction](MC/MC Prediction Solution.ipynb)
[Monte Carlo Control with Epsilon-Greedy Policies](MC/MC Control with Epsilon-Greedy Policies Solution.ipynb)
[Monte Carlo Off-Policy Control with Importance Sampling](MC/Off-Policy MC Control with Weighted Importance Sampling Solution.ipynb)
[SARSA (On Policy TD Learning)](TD/SARSA Solution.ipynb)
[Q-Learning (Off Policy TD Learning)](TD/Q-Learning Solution.ipynb)
[Q-Learning with Linear Function Approximation](FA/Q-Learning with Value Function Approximation Solution.ipynb)
[Deep Q-Learning for Atari Games](DQN/Deep Q Learning Solution.ipynb)
[Double Deep-Q Learning for Atari Games](DQN/Double DQN Solution.ipynb)
Deep Q-Learning with Prioritized Experience Replay (WIP)
[Policy Gradient: REINFORCE with Baseline](PolicyGradient/CliffWalk REINFORCE with Baseline Solution.ipynb)
[Policy Gradient: Actor Critic with Baseline](PolicyGradient/CliffWalk Actor Critic Solution.ipynb)
[Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces](PolicyGradient/Continuous MountainCar Actor Critic Solution.ipynb)
Deterministic Policy Gradients for Continuous Action Spaces (WIP)
Deep Deterministic Policy Gradients (DDPG) (WIP)
Asynchronous Advantage Actor Critic (A3C)
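
To give a flavor of one of the simpler entries in the list, here is a minimal sketch of value iteration on a tabular MDP. It is not the code from the notebook. The four-state chain and the names `value_iteration`, `N_STATES`, `N_ACTIONS` and `GOAL` are made up for illustration. The transition table follows the `P[s][a] = [(prob, next_state, reward, done), ...]` layout used by Gym's discrete toy-text environments, which is an assumption about how the solutions represent the model.

```python
def value_iteration(P, n_states, n_actions, gamma=0.9, theta=1e-8):
    """Return (V, policy) for an MDP given as P[s][a] = [(prob, next_s, reward, done), ...]."""
    V = [0.0] * n_states

    def q_value(s, a):
        # Expected one-step return of taking action a in state s.
        return sum(p * (r + (0.0 if done else gamma * V[s2]))
                   for p, s2, r, done in P[s][a])

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(s, a) for a in range(n_actions))
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place Bellman optimality backup
        if delta < theta:
            break

    # Greedy policy with respect to the converged value function.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical chain MDP: states 0..3, action 0 = left, action 1 = right,
# reward -1 per step, and state 3 is an absorbing goal.
N_STATES, N_ACTIONS = 4, 2
GOAL = N_STATES - 1
P = {}
for s in range(N_STATES):
    P[s] = {}
    for a in range(N_ACTIONS):
        if s == GOAL:
            P[s][a] = [(1.0, s, 0.0, True)]
        else:
            s2 = min(max(s + (1 if a == 1 else -1), 0), GOAL)
            P[s][a] = [(1.0, s2, -1.0, s2 == GOAL)]

V, policy = value_iteration(P, N_STATES, N_ACTIONS)
print(V)
print(policy)  # the first three entries are 1: always move right toward the goal
```

The in-place update inside the sweep is one common variant. A synchronous version that builds a new value table on each sweep converges to the same values.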

Resources

Textbooks:

Reinforcement Learning: An Introduction (2nd Edition)

Classes:

Talks/Tutorials:

Other Projects:

Selected Papers:
