bishetheanswer/deep-q-learning-tfg

Study of deep reinforcement learning approach and its application to the programming of intelligent agents to play retro video games

Miguel Enrique Játiva Jiménez

Training

Inside the train folder you can find the following files:

  • train.py: code for training DQN agents. It takes the name of the environment as its argument.
  • model.py: the implementation of the DQN architecture.
  • wrappers.py: wrappers applied to the environment when training.
  • keys.py: file where the AWS keys are stored.

Names of the environments:

  • Columns-Genesis
  • Flicky-Genesis
  • BioHazardBattle-Genesis
  • StreetsOfRage2-Genesis
  • SonicTheHedgehog-Genesis
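
As a rough illustration of how a script could accept one of these names on the command line, here is a minimal sketch using argparse. The positional argument name `env` and the helper `parse_env_name` are assumptions made for this example; the actual interface of train.py may differ.

```python
import argparse

# Environment names listed in this README.
ENVIRONMENTS = [
    "Columns-Genesis",
    "Flicky-Genesis",
    "BioHazardBattle-Genesis",
    "StreetsOfRage2-Genesis",
    "SonicTheHedgehog-Genesis",
]


def parse_env_name(argv=None):
    """Parse and validate the environment name (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Train a DQN agent.")
    # 'env' is an assumed argument name, not necessarily the one train.py uses.
    parser.add_argument("env", choices=ENVIRONMENTS, help="name of the environment")
    return parser.parse_args(argv).env


# Passing the arguments explicitly instead of reading sys.argv.
print(parse_env_name(["Flicky-Genesis"]))
```

Restricting the argument to the known names means a typo fails immediately instead of after the emulator has started loading.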

Evaluation

Inside the evaluation folder you can find the following files:

  • eval.py: code for evaluating DQN agents. It takes as arguments the agent, the name of the environment, and whether to record the run.
  • random_eval.py: code for evaluating random agents.
  • model.py: the implementation of the DQN architecture.
  • wrappers_eval.py: wrappers applied to the environment when evaluating.
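
To make the difference between evaluating a trained agent and a random one concrete, here is a minimal sketch of an evaluation loop. It is not the code in eval.py or random_eval.py: `ToyEnv` is a hypothetical stand-in for a Gym Retro environment, assumed to follow the classic Gym interface in which `step` returns `(observation, reward, done, info)`, and the "trained" policy is simply a fixed function.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment with a Gym-style reset/step interface."""

    def __init__(self, episode_length=5, n_actions=3):
        self.episode_length = episode_length
        self.n_actions = n_actions
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 0 else 0.0  # action 0 is the only rewarded action
        done = self.t >= self.episode_length
        return self.t, reward, done, {}


def evaluate(env, policy, episodes=10):
    """Return the mean total reward of `policy` over `episodes` episodes."""
    totals = []
    for _ in range(episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        totals.append(total)
    return sum(totals) / len(totals)


env = ToyEnv()
rng = random.Random(0)  # seeded so the random baseline is repeatable

# A policy that always picks the rewarded action, standing in for a trained agent.
greedy_score = evaluate(env, lambda obs: 0)
# A baseline that picks actions uniformly at random.
random_score = evaluate(env, lambda obs: rng.randrange(env.n_actions))
print(greedy_score, random_score)
```

Running both policies through the same `evaluate` function keeps the comparison fair, since the only thing that changes is how actions are chosen.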

You can also find a folder containing the best DQN agent for each game. To see these agents in action, run them with eval.py. The videos of their progress during training can be seen here:

Here is a video of the best agents playing:

Tools

  • Python 3.7.9
  • PyTorch 1.6.0
  • Gym Retro 0.8.0
  • Kaggle was used at first for an informal search to select the final hyperparameters for training. The downside is that a notebook can only run for 9 hours straight, so the final training could not be performed on Kaggle.
  • I used the Notebooks API from Google Cloud as an alternative to Kaggle, but the notebooks stopped executing after 24 hours.
  • The final training was performed in a laboratory of the Escuela Superior de Ingeniería Informática in Albacete. To receive the training results on my computer, I used Amazon S3 (Simple Storage Service) from AWS, modifying the training algorithm so that it uploads the results to S3.
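
The upload step described in the last item could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in train.py: the helper names, the `results` key prefix, and the bucket name are all hypothetical. The only library call relied on is boto3's `upload_file(Filename, Bucket, Key)` on an S3 client, and boto3 is imported lazily so the sketch loads even where it is not installed.

```python
import os


def make_s3_client(access_key, secret_key):
    """Create an S3 client with boto3 (hypothetical helper; needs boto3 installed)."""
    import boto3  # imported here so the rest of the sketch works without boto3

    return boto3.client(
        "s3",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def upload_results(client, paths, bucket, prefix="results"):
    """Upload each existing file in `paths` to `bucket`; return the uploaded keys."""
    uploaded = []
    for path in paths:
        if not os.path.isfile(path):
            continue  # skip results that have not been written yet
        key = f"{prefix}/{os.path.basename(path)}"
        client.upload_file(path, bucket, key)  # boto3 S3 client call
        uploaded.append(key)
    return uploaded


# Example call (bucket name and file names are placeholders):
# client = make_s3_client(ACCESS_KEY, SECRET_KEY)
# upload_results(client, ["model.pt", "rewards.csv"], "my-training-bucket")
```

Taking the client as a parameter keeps the credentials out of the upload logic and makes it possible to try the function with a stand-in client before touching a real bucket.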

About

Undergraduate Dissertation
