Long Term planning with Deep Reinforcement Learning

About The Project

This is the graduation project of Dor Bitton and Yuval Goshen, two B.Sc. students in Computer Engineering at the Technion Institute, Haifa.

The Problem

The goal of the project is to solve a long-term planning problem. We built an environment in which a robot navigates a goal-conditioned maze from an arbitrary start point to an arbitrary goal. The agent must control the robot's joint motors to reach the goal. The task is challenging because the robot has to plan and navigate through the maze using only low-level motor control, which makes the planning horizon very long.

We used Deep Reinforcement Learning algorithms to solve the problem, dividing it into two sub-problems that are solved independently by hierarchical agents.
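To make the two-level split concrete, here is a minimal sketch of how a high-level agent and a low-level agent could interact in a single episode. It is not the project's code: `hierarchical_rollout`, `toy_navigator`, and `toy_stepper` are hypothetical names, the robot is reduced to a 2D point, and both agents are hand-written stand-ins for the learned policies (the Navigator and the Stepper described in the sections that follow).

```python
import math


def distance(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def toy_navigator(pos, goal, max_dist=1.0):
    """Hypothetical high-level agent: propose a subgoal at most max_dist away."""
    d = distance(pos, goal)
    if d <= max_dist:
        return goal
    scale = max_dist / d
    return (pos[0] + (goal[0] - pos[0]) * scale,
            pos[1] + (goal[1] - pos[1]) * scale)


def toy_stepper(pos, subgoal, step=0.05):
    """Hypothetical low-level agent: one control step toward the subgoal.

    Returns the next position, standing in for "apply motor commands and
    step the simulator".
    """
    d = distance(pos, subgoal)
    if d <= step:
        return subgoal
    scale = step / d
    return (pos[0] + (subgoal[0] - pos[0]) * scale,
            pos[1] + (subgoal[1] - pos[1]) * scale)


def hierarchical_rollout(navigator, stepper, start, goal,
                         max_subgoals=20, steps_per_subgoal=50, tol=0.1):
    """Alternate between picking a subgoal and walking toward it."""
    pos = start
    for _ in range(max_subgoals):
        if distance(pos, goal) < tol:
            return pos, True
        subgoal = navigator(pos, goal)       # long-horizon decision
        for _ in range(steps_per_subgoal):   # short-horizon control
            pos = stepper(pos, subgoal)
            if distance(pos, subgoal) < tol:
                break
    return pos, distance(pos, goal) < tol


final_pos, reached = hierarchical_rollout(
    toy_navigator, toy_stepper, start=(0.0, 0.0), goal=(3.0, 4.0))
```

The point of the structure is that the high-level agent makes at most `max_subgoals` decisions per episode, while the many low-level control steps are hidden inside the inner loop.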

The Stepper

- Trained in a separate environment to walk to a nearby subgoal (up to twice its body size away).
- No obstacles; it only learns the task of "walking".
- Dense reward that is independent of the robot type: the reward is a function of the distance from the goal plus an indicator that the goal was achieved.
- Trained with the DDPG algorithm.
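A reward of the shape described above might look like the following sketch. The exact function used in the project is not given here, so the negative Euclidean distance, the `goal_radius` threshold, and the `goal_bonus` value are all assumptions chosen for illustration, and `stepper_reward` is a hypothetical name. The reward uses only the robot's planar position, which is one way to keep it independent of the robot type.

```python
import math


def stepper_reward(robot_xy, subgoal_xy, goal_radius=0.25, goal_bonus=1.0):
    """Dense reward: negative distance to the subgoal plus a bonus on arrival.

    robot_xy and subgoal_xy are (x, y) positions. goal_radius and goal_bonus
    are illustrative values, not the project's settings.
    Returns (reward, reached).
    """
    dist = math.hypot(subgoal_xy[0] - robot_xy[0], subgoal_xy[1] - robot_xy[1])
    reached = dist <= goal_radius
    reward = -dist + (goal_bonus if reached else 0.0)
    return reward, reached


far_reward, far_reached = stepper_reward((0.0, 0.0), (3.0, 4.0))
near_reward, near_reached = stepper_reward((1.0, 1.0), (1.0, 1.0))
```

Because the distance term changes on every step, the agent receives a learning signal long before it first reaches a subgoal, and the indicator term marks actual success.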

The Navigator

- Trained to generate subgoals for the stepper, which makes the horizon much shorter for the navigation part of the task.
- This is still different from solving a point-robot maze, because the next state depends both on the subgoal provided to the stepper and on the stepper itself, which is not perfect.
- The robot state is not fully observable to the navigator.

We tried the following algorithms:
- TD3
- RRT planner on the maze map where the robot is a point robot
- RRT planner on the maze map, with extended walls to the robot size
- TD3-MP, similar to DDPG-MP, with demonstrations planned by RRT.
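The difference between the two RRT variants in the list above is the map the planner sees. One common way to account for robot size, sketched below, is to inflate every wall by the robot's radius and then plan for a point robot on the inflated map. This is an illustration under assumptions, not the project's implementation: the maze is assumed to be a grid of 0 (free) and 1 (wall), the radius is given in cells, and `inflate_walls` is a hypothetical helper.

```python
def inflate_walls(grid, radius_cells):
    """Return a copy of grid where walls are grown by radius_cells.

    grid is a list of rows containing 0 (free) or 1 (wall). A free cell
    becomes a wall if it lies within radius_cells (Euclidean, in cells)
    of any original wall cell. The input grid is not modified.
    """
    rows, cols = len(grid), len(grid[0])
    inflated = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dr in range(-radius_cells, radius_cells + 1):
                for dc in range(-radius_cells, radius_cells + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and dr * dr + dc * dc <= radius_cells ** 2:
                        inflated[rr][cc] = 1
    return inflated


maze = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
inflated_maze = inflate_walls(maze, radius_cells=1)
```

A path that is collision-free for a point on the inflated map keeps the robot's center at least one radius away from the original walls, which leaves some margin for an imperfect stepper.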


rl-project-dor-yuval/robomaze
