RL

Reinforcement learning practice exercises.

Task 1

A handcrafted toy example of Policy Gradient in MXNet. It is based on Figure 17.1 of Artificial Intelligence: A Modern Approach, combined with Policy Gradient (Chapter 13) from Reinforcement Learning: An Introduction.

mkdir ../output
python chpt17.py

By default, the script uses mx.gpu(0).
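
To make the Task 1 idea concrete, here is a minimal sketch of a REINFORCE-style policy gradient on a small grid world. It is not the code in chpt17.py: it uses only the Python standard library instead of MXNet, and it assumes that the referenced figure is the 4x3 grid world (wall at (2, 2), terminal rewards +1 and -1, step reward -0.04, 80% intended move with 10% slips to each side). All names below (step, run_episode, reinforce, theta) are hypothetical and chosen for illustration.

```python
import math
import random

# Assumed 4x3 grid world; coordinates are (column, row), 1-based.
WALL = (2, 2)
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}
STEP_REWARD = -0.04
ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # up, down, left, right
# Perpendicular slip actions for each action index (10% each).
SLIPS = {0: (2, 3), 1: (2, 3), 2: (0, 1), 3: (0, 1)}
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) != WALL]


def step(state, action, rng):
    """Apply a noisy move and return (next_state, reward, done)."""
    roll = rng.random()
    if roll >= 0.8:
        action = SLIPS[action][0] if roll < 0.9 else SLIPS[action][1]
    dx, dy = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy)
    if nxt == WALL or not (1 <= nxt[0] <= 4 and 1 <= nxt[1] <= 3):
        nxt = state  # bumping into the wall or the border leaves the agent in place
    if nxt in TERMINALS:
        return nxt, TERMINALS[nxt], True
    return nxt, STEP_REWARD, False


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng, max_steps=100):
    """Sample one episode from (1, 1); return a list of (state, action, reward)."""
    state, trajectory = (1, 1), []
    for _ in range(max_steps):
        probs = softmax(theta[state])
        action = rng.choices(range(4), weights=probs)[0]
        nxt, reward, done = step(state, action, rng)
        trajectory.append((state, action, reward))
        state = nxt
        if done:
            break
    return trajectory


def reinforce(episodes=3000, alpha=0.1, seed=0):
    """Tabular softmax policy trained with undiscounted Monte Carlo returns."""
    rng = random.Random(seed)
    theta = {s: [0.0] * 4 for s in STATES}
    for _ in range(episodes):
        g = 0.0
        # Walk backwards so g is the return from each time step onward.
        for state, action, reward in reversed(run_episode(theta, rng)):
            g += reward
            probs = softmax(theta[state])
            for a in range(4):
                # Gradient of log softmax: indicator(a == action) - pi(a | s).
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += alpha * g * grad
    return theta
```

Calling `theta = reinforce()` and then `softmax(theta[(1, 1)])` gives the learned action probabilities (up, down, left, right) for the start state. The sketch uses no baseline and no discounting, so it is meant for reading rather than as a tuned implementation.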
