agentMaze.py The agent: learns action values and generates actions for a given state.
envMaze.py The environment for this experiment: given a state and an action, it returns the next state.
expMaze.py The main file that runs the project; it interacts with the RL-Glue module and uses pygame to display progress.
rl_glue.py Contains the RL-Glue framework for reinforcement learning and the pygame maze implementation.
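These files fit together in the usual RL-Glue pattern: the experiment drives an episode loop in which the environment produces states and rewards and the agent returns actions. The sketch below only illustrates that loop; the actual class and method names in agentMaze.py, envMaze.py, and rl_glue.py may differ.

```python
def run_episode(agent, env, max_steps=10000):
    """Run one episode: the environment produces states, the agent chooses actions."""
    state = env.env_start()              # reset the maze, return the start state S
    action = agent.agent_start(state)    # agent picks its first action
    for step in range(max_steps):
        reward, state, terminal = env.env_step(action)    # apply the action
        if terminal:
            agent.agent_end(reward)      # final learning update of the episode
            return step + 1              # number of steps taken to reach the goal G
        action = agent.agent_step(reward, state)          # learn, then act again
    return max_steps
```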
The following instructions will get you a copy of the project up and running on your local machine.
- Clone this project:
git clone https://github.com/konantian/Dyna-Maze-Game.git
- Enter the project:
cd Dyna-Maze-Game/Codes
- Start the game:
python3 expMaze.py 5 (5 is the number of planning steps n; you can change this value)
You need to install the following software:
- Python3
- pygame
Install the required Python packages:
$ pip install -r requirements.txt
Consider the simple maze shown inset in Figure 8.2. In each of the 47 states there are four actions, up, down, right, and left, which take the agent deterministically to the corresponding neighboring states, except when movement is blocked by an obstacle or the edge of the maze, in which case the agent remains where it is. Reward is zero on all transitions, except those into the goal state, on which it is +1. After reaching the goal state (G), the agent returns to the start state (S) to begin a new episode. This is a discounted, episodic task with γ = 0.95.
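For reference, here is a minimal sketch of those dynamics, assuming one common (row, column) encoding of the Figure 8.2 maze: a 6×9 grid whose 54 cells minus 7 obstacle cells give the 47 states mentioned above. The obstacle coordinates and function names are illustrative, not necessarily those used in envMaze.py.

```python
# Illustrative encoding of the Dyna-Maze grid (row 0 at the top).
GOAL = (0, 8)
START = (2, 0)
OBSTACLES = {(1, 2), (2, 2), (3, 2), (4, 5), (0, 7), (1, 7), (2, 7)}
ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]   # up, down, right, left

def step(state, action_index, rows=6, cols=9):
    """Apply one action; blocked moves leave the agent where it is."""
    dr, dc = ACTIONS[action_index]
    nxt = (state[0] + dr, state[1] + dc)
    if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols) or nxt in OBSTACLES:
        nxt = state                       # hit a wall or an obstacle: stay put
    reward = 1.0 if nxt == GOAL else 0.0  # +1 only on transitions into G
    terminal = nxt == GOAL                # episode ends at G, restart at S
    return reward, nxt, terminal
```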
The main part of Figure 8.2 shows average learning curves from an experiment in which Dyna-Q agents were applied to the maze task. The initial action values were zero, the step-size parameter was α = 0.1, and the exploration parameter was ε = 0.1. When selecting greedily among actions, ties were broken randomly. The agents varied in the number of planning steps, n, they performed per real step. For each n, the curves show the number of steps taken by the agent to reach the goal in each episode, averaged over 30 repetitions of the experiment. In each repetition, the initial seed for the random number generator was held constant across algorithms. Because of this, the first episode was exactly the same (about 1700 steps) for all values of n, and its data are not shown in the figure. After the first episode, performance improved for all values of n, but much more rapidly for larger values. Recall that the n = 0 agent is a nonplanning agent, using only direct reinforcement learning (one-step tabular Q-learning). This was by far the slowest agent on this problem, despite the fact that the parameter values (α and ε) were optimized for it. The nonplanning agent took about 25 episodes to reach (ε-)optimal performance, whereas the n = 5 agent took about five episodes, and the n = 50 agent took only three episodes.
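The planning parameter n is the heart of Dyna-Q: after each real step the agent makes one direct Q-learning update, records the transition in its model, and then replays n simulated transitions sampled from that model. The following sketch illustrates this, assuming tabular action values and the parameter values from the experiment (α = 0.1, ε = 0.1, γ = 0.95); it is not necessarily the exact code in agentMaze.py.

```python
import random
from collections import defaultdict

Q = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])   # action values, initialized to zero
model = {}                                      # learned model: (s, a) -> (r, s', terminal)

def epsilon_greedy(state, epsilon=0.1):
    """Random action with probability epsilon; otherwise greedy, ties broken randomly."""
    if random.random() < epsilon:
        return random.randrange(4)
    values = Q[state]
    best = max(values)
    return random.choice([a for a, v in enumerate(values) if v == best])

def dyna_q_update(state, action, reward, next_state, terminal,
                  n=5, alpha=0.1, gamma=0.95):
    """One direct Q-learning update, then n planning updates from the model."""
    target = reward + (0.0 if terminal else gamma * max(Q[next_state]))
    Q[state][action] += alpha * (target - Q[state][action])         # direct RL
    model[(state, action)] = (reward, next_state, terminal)         # model learning
    for _ in range(n):                                               # planning
        (s, a), (r, s2, done) = random.choice(list(model.items()))  # replay a seen (s, a)
        t = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (t - Q[s][a])
```

With n = 0 the planning loop never runs and the agent reduces to one-step tabular Q-learning, which is why the n = 0 curve in Figure 8.2 learns so much more slowly than the n = 5 and n = 50 agents.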