State-of-the-art deep reinforcement learning algorithm in TensorFlow and Python, intended for eventual application to non-gaming environments.
The performance above came after:
- Playing 1.75 million games
- Over roughly 2 days of training
- Using a 3 x 2D ConvNet, and
- A linearly annealed epsilon-greedy (EPS) policy
- Next step: convert the agent to work on the MktCap / GVA application
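
As a rough illustration of the linearly annealed epsilon-greedy policy named in the list, the sketch below lowers the exploration rate in a straight line and then holds it constant. The function names and the default values (`eps_start`, `eps_end`, `anneal_steps`) are assumptions made for this example and are not taken from this repository's code.

```python
import random


def linear_annealed_epsilon(step, eps_start=1.0, eps_end=0.1, anneal_steps=1_000_000):
    """Interpolate epsilon linearly from eps_start to eps_end, then hold it.

    All default values here are illustrative assumptions.
    """
    # Fraction of the annealing schedule completed, clamped to [0, 1].
    fraction = min(max(step, 0) / anneal_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        # Explore: pick any action index uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index of the largest estimated action value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

A possible use inside a training loop, with made-up action values:

```python
eps = linear_annealed_epsilon(step=500_000)
action = epsilon_greedy_action([0.1, 0.7, 0.2], eps)
```

Early in training the agent acts almost entirely at random. As `step` grows it relies more and more on the network's estimated action values.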