Asynchronous Advantage Actor-Critic (A3C)

Implementation following the paper: Asynchronous Methods for Deep Reinforcement Learning (https://arxiv.org/pdf/1602.01783.pdf)

CartPole-v0 result

$ python cartpole_a3c.py --device=cpu --episodes=1000 --workers=4 --log_dir=cartpole_logs

The following graph shows the episode rewards (# workers: 4, entropy loss: 0.2)

Tensorboard:

$ tensorboard --logdir=cartpole_logs/

[Figure: A3C training, CartPole-v0 episode rewards]
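
The "entropy loss" value quoted for each run is presumably the weight on the entropy bonus in the A3C objective; that reading is an assumption, not something this README states. As a rough sketch in plain Python (the scripts themselves are not shown here, and `a3c_loss`, `entropy_coef`, and `value_coef` are hypothetical names), the per-step loss could be combined like this:

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def a3c_loss(probs, action, advantage, value_error,
             entropy_coef=0.2, value_coef=0.5):
    """Illustrative per-step A3C loss (hypothetical helper, not the repo's code).

    probs        -- action probabilities from the policy head
    action       -- index of the action that was taken
    advantage    -- n-step return minus the critic's value estimate
    value_error  -- same difference, used for the critic's squared error
    entropy_coef -- assumed meaning of "entropy loss" in the run settings
    value_coef   -- assumed critic weight; 0.5 is a common choice
    """
    policy_loss = -math.log(probs[action]) * advantage
    value_loss = value_coef * value_error ** 2
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss + value_loss - entropy_coef * entropy(probs)
```

A larger coefficient, such as the 1.0 used for MountainCar-v0, pushes the policy toward uniform action probabilities for longer, which tends to help when rewards are sparse.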

Acrobot-v1 result

$ python acrobot_a3c.py --device=cpu --episodes=500 --workers=4 --log_dir=acrobot_logs

The following graph shows the episode rewards (# workers: 4, entropy loss: 0.2)

[Figure: A3C training, Acrobot-v1 episode rewards]

MountainCar-v0 result

$ python mountaincar_a3c.py --device=cpu --episodes=20000 --workers=8 --log_dir=mc_logs

The following graph shows the episode rewards (# workers: 8, entropy loss: 1.0, tmax: 5)

[Figure: A3C training, MountainCar-v0 episode rewards]
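
The tmax setting in the MountainCar-v0 run corresponds to the rollout length in the A3C paper: each worker collects at most tmax steps, then computes discounted returns backwards from a bootstrap value before applying gradients. A minimal sketch of that return computation follows, assuming a plain list of rewards; `n_step_returns` and its default `gamma` are hypothetical and not taken from the scripts above.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout of at most tmax steps.

    rewards         -- rewards collected during the rollout, oldest first
    bootstrap_value -- 0.0 if the rollout ended in a terminal state,
                       otherwise the critic's estimate V(s) of the last state
    gamma           -- discount factor (assumed value, not from the README)
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards: R <- r_i + gamma * R.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns
```

For a five-step MountainCar-v0 rollout that ends in a terminal state, with a reward of -1 per step and `gamma=1.0`, the call `n_step_returns([-1.0] * 5, 0.0, gamma=1.0)` gives `[-5.0, -4.0, -3.0, -2.0, -1.0]`.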

References