Reinforcement Learning

All the algorithms I implemented (using Python 3 and NumPy) while reading Reinforcement Learning: An Introduction by Sutton and Barto.
Each topic has its own README.

High-level structure of the repo:

Bandits
- Epsilon-greedy
- Optimistic initial values
- Softmax exploration
Dynamic Programming methods
- Policy iteration
- Value iteration
Model-free methods
- Monte Carlo control
  - On-policy Monte Carlo
  - Off-policy Monte Carlo using Importance Sampling (incomplete)
- Temporal-difference methods
  - Q-Learning
  - SARSA
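
To give a flavour of the temporal-difference methods listed above, here is a minimal sketch of the tabular Q-learning and SARSA updates together with epsilon-greedy action selection. It is not taken from the repo's code: the function names, the `Q[state, action]` NumPy table layout, and the toy numbers are assumptions made for illustration, and terminal-state handling is left out to keep it short.

```python
import numpy as np


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy TD update: bootstrap from the best action in s_next."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])


# Toy table with 2 states and 2 actions (hypothetical values).
Q_q = np.array([[0.0, 0.0], [1.0, 2.0]])
Q_s = Q_q.copy()

# Same transition for both methods; SARSA additionally needs the next action.
q_learning_update(Q_q, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
sarsa_update(Q_s, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.9)
```

The only difference between the two updates is the bootstrap term: Q-learning uses the maximum over next actions, while SARSA uses the value of the action the behaviour policy actually picked (for example via `epsilon_greedy`).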

Resources

CS234 and David Silver's lectures often use different notation, so it is better to follow just one of them in the beginning (I prefer David Silver's lectures).

Check this out for more resources!

