This is the final project of Dor Bitton and Yuval Goshen, B.Sc. students at the Technion, instructed by PhD candidates Or Krupnik and Tom Jurgenson.

rl-project-dor-yuval/robomaze

Long-Term Planning with Deep Reinforcement Learning


Dor Bitton, Yuval Goshen


Figures: example mazes (U, S, 20x20 maze, Spiral)

About The Project

The following is the graduation project of Dor Bitton and Yuval Goshen, two Computer Engineering B.Sc. students at the Technion Institute, Haifa.

The Problem

The goal of the project is to solve a long-term planning problem. We built an environment in which robots navigate a goal-conditioned maze from an arbitrary start point to an arbitrary goal. The agent has to control the robot's joint motors to move to the goal. This task is challenging because the robot has to plan and navigate through the maze using only motor control, which gives the problem a long planning horizon.

We used Deep Reinforcement Learning algorithms to solve the problem, dividing it into two sub-problems that are solved independently by hierarchical agents.
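To make the two-level split concrete, here is a minimal sketch of a hierarchical control loop. It is not the project's code: `run_hierarchy`, `toy_navigator`, and `toy_stepper` are hypothetical names, the robot is reduced to a 2D point, and the two "agents" are hand-written stand-ins for trained policies. The only thing it illustrates is the structure: a high-level agent proposes a nearby subgoal, and a low-level agent takes several steps toward it.

```python
import math

def run_hierarchy(navigator, stepper, state, goal,
                  max_subgoals=50, steps_per_subgoal=20, tol=0.1):
    """Alternate between a high-level navigator and a low-level stepper.

    All arguments are illustrative assumptions, not the project's API.
    """
    for _ in range(max_subgoals):
        if math.dist(state, goal) <= tol:
            return state, True
        subgoal = navigator(state, goal)        # short-horizon target
        for _ in range(steps_per_subgoal):
            state = stepper(state, subgoal)     # one low-level step
            if math.dist(state, subgoal) <= tol:
                break
    return state, math.dist(state, goal) <= tol

def toy_navigator(state, goal, max_reach=1.0):
    """Stand-in navigator: a subgoal at most max_reach away, toward the goal."""
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    d = math.hypot(dx, dy)
    if d <= max_reach:
        return goal
    return (state[0] + dx / d * max_reach, state[1] + dy / d * max_reach)

def toy_stepper(state, subgoal, step=0.25):
    """Stand-in stepper: move a fixed distance straight toward the subgoal."""
    dx, dy = subgoal[0] - state[0], subgoal[1] - state[1]
    d = math.hypot(dx, dy)
    if d <= step:
        return subgoal
    return (state[0] + dx / d * step, state[1] + dy / d * step)

final_state, reached = run_hierarchy(toy_navigator, toy_stepper,
                                     (0.0, 0.0), (3.0, 4.0))
```

In the real task the stepper would output joint motor commands and would not reach every subgoal exactly, so the navigator has to cope with an imperfect low level.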

The Stepper

Figures: Ant Stepper, Rex Stepper
  • Trained in a separate environment to walk to a nearby subgoal (up to two times its body size)
  • No obstacles; the agent only learns the task of "walking"
  • Dense reward that is independent of the robot type: a function of the distance from the goal plus an indicator that the goal was achieved
  • Trained with the DDPG algorithm
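The reward described above can be sketched as follows. This is an assumed form, not the project's actual reward: the negative Euclidean distance, the `goal_radius` threshold, and the `success_bonus` value are all hypothetical choices that merely fit the description "a function of distance from the goal plus an indicator that goal achieved".

```python
import math

def stepper_reward(robot_xy, subgoal_xy, goal_radius=0.25, success_bonus=1.0):
    """Hypothetical dense reward: negative distance plus a success indicator.

    Only the robot's planar position is used, so the same function could
    apply to any robot type.
    """
    distance = math.dist(robot_xy, subgoal_xy)
    reached = distance <= goal_radius           # indicator that the goal was achieved
    reward = -distance + (success_bonus if reached else 0.0)
    return reward, reached

far_reward, far_done = stepper_reward((0.0, 0.0), (3.0, 4.0))
near_reward, near_done = stepper_reward((0.0, 0.0), (0.0, 0.0))
```

Because the reward depends only on position relative to the subgoal, nothing in it needs to change when the robot model changes.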

The Navigator

Figures: Navigator, Navigator Plot
  • Trained to generate subgoals for the stepper, which makes the horizon much shorter for the navigation part of the task
  • This is still different from solving a point-robot maze, because the next state depends both on the provided subgoal and on the stepper, which is imperfect
  • The robot state is not fully observable to the navigator
  • We tried the following algorithms:
    • TD3
    • RRT planner on the maze map, where the robot is treated as a point robot
    • RRT planner on the maze map, with walls extended by the robot size
    • TD3-MP, similar to DDPG-MP, with demonstrations planned by RRT
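Extending the walls by the robot size lets a planner keep treating the robot as a point while still leaving clearance. A minimal sketch of that idea on an occupancy grid follows. The grid encoding (1 for wall, 0 for free), the function name `inflate_walls`, and the square (Chebyshev) inflation shape are assumptions made for illustration; the project's map format and inflation method may differ.

```python
def inflate_walls(grid, radius_cells):
    """Return a copy of grid with every wall grown by radius_cells cells.

    grid is a list of lists with 1 for wall and 0 for free space
    (an assumed encoding). Cells within radius_cells of a wall, measured
    in Chebyshev distance, become walls in the returned copy.
    """
    rows, cols = len(grid), len(grid[0])
    inflated = [row[:] for row in grid]         # leave the input untouched
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dr in range(-radius_cells, radius_cells + 1):
                for dc in range(-radius_cells, radius_cells + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        inflated[rr][cc] = 1
    return inflated

maze = [[0] * 5 for _ in range(5)]
maze[2][2] = 1                                  # a single wall cell in the middle
planning_map = inflate_walls(maze, 1)
```

A point-robot planner such as RRT can then run on `planning_map`, so any path it finds keeps at least `radius_cells` cells of clearance from the original walls.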

Tools we used for this project

Contact

Dor Bitton - Linkedin - [email protected]

Yuval Goshen - Linkedin - [email protected]

Acknowledgments

Our work is mainly based on the following papers
