Reinforcement Learning, some practice
A handcrafted toy example of Policy Gradient in MXNet.
Refers to Figure 17.1 in Artificial Intelligence: A Modern Approach,
solved with Policy Gradient (Chapter 13 of Reinforcement Learning: An Introduction).
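For readers new to Policy Gradient, here is a minimal sketch of the REINFORCE update on a one-step, two-action task. It is not the code in chpt17.py and does not use MXNet; it is plain Python, and the names softmax and reinforce_bandit and the reward values are hypothetical, chosen only for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, episodes=2000, alpha=0.1, seed=0):
    """REINFORCE on a one-step task; rewards[a] is the return for action a.

    Hypothetical helper for illustration only, not part of chpt17.py.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one preference per action
    for _ in range(episodes):
        probs = softmax(theta)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        G = rewards[action]  # return of this one-step episode
        # For a softmax policy, d/d theta[b] of log pi(action)
        # equals 1[b == action] - probs[b].
        for b in range(len(theta)):
            grad_log = (1.0 if b == action else 0.0) - probs[b]
            theta[b] += alpha * G * grad_log
    return softmax(theta)


if __name__ == "__main__":
    # Action 1 pays 1.0 and action 0 pays nothing.
    print(reinforce_bandit([0.0, 1.0]))
```

The learned policy should put most of its probability on the rewarded action. A multi-step task would apply the same update at every step of the episode, with G replaced by the return observed from that step onward.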
mkdir ../output
python chpt17.py
By default, the script runs on mx.gpu(0).
- Task1 Jan 18, 2018