This repository has been archived by the owner on Jun 30, 2021. It is now read-only.

Commit

Permalink
updated README, updated base reward func, updated startup/stop scripts, default track now reinvent_base
mattcamp committed Jun 18, 2020
1 parent bd8afbc commit a5ad548
Showing 7 changed files with 97 additions and 77 deletions.
82 changes: 53 additions & 29 deletions README.md
@@ -2,13 +2,14 @@

Heavily based off work by [Crr0004](https://github.com/crr0004), [AlexSchultz](https://github.com/alexschultz), [Richardfan1126](https://github.com/richardfan1126) and [LarsLL](https://github.com/larsll)

This is a very early upload of Matt's local training setup so that a few people can test. Lots of things probably won't work properly and lots of functionality is still missing.
## Prerequisites

Very rough guide for use (details to come):
This project is designed to run on a linux system, ideally with an nvidia GPU. CPU training is possible but will be very slow. AMD GPUs are not currently supported.
Ubuntu 18.04 has been extensively tested.

- install nvidia cuda drivers and tools.
- install docker and docker-compose
- set docker-nvidia2 as default runtime in your `/etc/docker/daemon.json`
1. install nvidia cuda drivers and tools.
2. install docker and docker-compose
3. set docker-nvidia2 as default runtime in your `/etc/docker/daemon.json`

{
"default-runtime": "nvidia",
@@ -20,19 +21,34 @@ Very rough guide for use (details to come):
}
}
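
A quick way to confirm that the default runtime setting took effect is to read the file back. The following is a minimal sketch, assuming Python 3 is available on the host. The helper name `default_runtime_is_nvidia` is made up for illustration, and it inspects only the `default-runtime` key shown above, not the rest of the file.

```python
import json
import tempfile
from pathlib import Path


def default_runtime_is_nvidia(daemon_json_path="/etc/docker/daemon.json"):
    """Return True if the Docker daemon config names "nvidia" as default runtime.

    Hypothetical helper: it checks the "default-runtime" key only and does
    not validate the accompanying runtime definition.
    """
    path = Path(daemon_json_path)
    if not path.is_file():
        return False
    try:
        config = json.loads(path.read_text())
    except json.JSONDecodeError:
        return False
    return isinstance(config, dict) and config.get("default-runtime") == "nvidia"


# Demo against a throwaway file so the sketch runs without touching /etc.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "daemon.json"
    sample.write_text(json.dumps({"default-runtime": "nvidia"}))
    print(default_runtime_is_nvidia(sample))  # prints True
```

On a real host, call `default_runtime_is_nvidia()` with no argument after restarting the Docker daemon.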

## Configure training session

- edit reward function and training params in `data/minio/bucket/custom_files`. Note that the track name MUST be the same in both files!
- tweak any other settings you want in `config.env`
- Modify `ENABLE_GPU_TRAINING` for SageMaker runtime: `true` (nvidia runtime) or `false` (CPU runtime). Default is GPU.
- If you do not have an nvidia GPU then you will also need to change the tag of the robomaker image inside `docker-compose.yml`
- Set `ENABLE_LOCAL_DESKTOP` to `true` if you have a local X-windows install (desktop machine) and want to automatically start the stream viewer and tail sagemaker logs.
- Install tmux (`sudo apt install tmux` on Ubuntu Linux) if you want robomaker + sagemaker logs automatically tailed in your terminal session.
- run `./start-training.sh` to start training
- view docker logs to see if it's working (automatic if `tmux` is installed)
- run `./stop-training.sh` to stop training.
- run `./delete_last_run.sh` to clear out the buckets for a fresh run. For a convenient version without a sudo prompt, check out `utilites/delete-last.c`.
- run `./local-copy.sh <model_backup_name>` to back up the current model files into a user-specified MODEL directory.
- run `./mk-model.sh <model_path>` to create a .tar.gz file from your model that can be uploaded to a physical car. (Will be removed in a future update once the file gets correctly generated after training)
1. Edit the reward function in `data/minio/bucket/custom_files/reward.py`
2. Edit the action space in `data/minio/bucket/custom_files/model_metadata.json`
3. Edit the training params in `config.env` and `data/minio/bucket/custom_files/training_params.yaml`. Note that the track name MUST be the same in both files!

Useful options include:

| option | description |
|--------|-------------|
|ENABLE_GPU_TRAINING|Enables GPU for SageMaker runtime: `true` (nvidia runtime) or `false` (CPU runtime). Default is GPU|
|ENABLE_LOCAL_DESKTOP|Set to `true` if you have a local X-windows install (desktop machine) and want to automatically start the stream viewer and tail sagemaker and robomaker logs.|
|ENABLE_TMUX|Enables tmux for automatic log tails in your existing terminal session (good for remote servers)|
|ENABLE_GUI|Enables gazebo client. Access via vnc|
|WORLD_NAME|The track name (excluding the `.world` suffix). Tracks are contained within the robomaker container image, built from the [deepracer-simapp community project](https://github.com/aws-deepracer-community/deepracer-simapp/tree/master/bundle/deepracer_simulation_environment/share/deepracer_simulation_environment/worlds).|

Many other options are available.

4. Edit the training hyperparameters in `src/rl_coach_2020_v2/hyperparams.json` (a shortcut link has been created in the root directory). Available options are exactly the same, except for the new option `pretrained`, which simplifies enabling pretrained mode.

More information on configuring local training can be found at https://wiki.deepracing.io/Customise_Local_Training
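
Since the track name must match in `config.env` and `training_params.yaml`, it can be worth checking before each run. The sketch below is one possible check, not part of this repository. The function names are hypothetical, and it assumes the `WORLD_NAME=...` and `WORLD_NAME: "..."` line formats used by those two files.

```python
import re
import tempfile
from pathlib import Path

# Matches `WORLD_NAME=reinvent_base` (env style) and
# `WORLD_NAME: "reinvent_base"` (YAML style); quotes are optional.
WORLD_NAME_PATTERN = re.compile(r"""^\s*WORLD_NAME\s*[=:]\s*["']?([^"'\s#]+)""")


def read_world_name(path):
    """Return the WORLD_NAME value found in the file, or None if absent."""
    for line in Path(path).read_text().splitlines():
        match = WORLD_NAME_PATTERN.match(line)
        if match:
            return match.group(1)
    return None


def world_names_match(env_path, yaml_path):
    """True only if both files define WORLD_NAME with the same value."""
    env_name = read_world_name(env_path)
    yaml_name = read_world_name(yaml_path)
    return env_name is not None and env_name == yaml_name


# Demo on throwaway copies so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / "config.env"
    yaml_file = Path(tmp) / "training_params.yaml"
    env_file.write_text("WORLD_NAME=reinvent_base\nENABLE_KINESIS=false\n")
    yaml_file.write_text('---\nWORLD_NAME: "reinvent_base"\n')
    print(world_names_match(env_file, yaml_file))  # prints True
```

Against a real checkout, the two arguments would be `config.env` and `data/minio/bucket/custom_files/training_params.yaml`.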

## Starting a training session
Run `./start-training.sh` to start training.

The current model data dir (defaults to data/minio/bucket/current) must be empty.

To use a pretrained model as a base for a new training session rename `data/minio/bucket/current` to `data/minio/bucket/rl-deepracer-pretrained` and set `"pretrained": "true"` in hyperparams.json
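
The rename and the `pretrained` flag can be applied together. What follows is a rough sketch under stated assumptions rather than a script shipped with this repository: `prepare_pretrained` is a hypothetical name, the hyperparameters file is assumed to be a flat JSON object, and depending on who owns the bucket files the rename may need elevated permissions.

```python
import json
import tempfile
from pathlib import Path


def prepare_pretrained(bucket_dir, hyperparams_path):
    """Move `current` to `rl-deepracer-pretrained` and enable pretrained mode."""
    bucket = Path(bucket_dir)
    current = bucket / "current"
    pretrained = bucket / "rl-deepracer-pretrained"
    if not current.is_dir():
        raise FileNotFoundError(f"no model directory at {current}")
    if pretrained.exists():
        raise FileExistsError(f"{pretrained} already exists; move it away first")
    current.rename(pretrained)

    params_file = Path(hyperparams_path)
    params = json.loads(params_file.read_text())
    params["pretrained"] = "true"  # stored as a string, as written in the README
    params_file.write_text(json.dumps(params, indent=2) + "\n")
    return pretrained


# Demo in a temporary directory; "batch_size" is placeholder content only.
with tempfile.TemporaryDirectory() as tmp:
    bucket = Path(tmp) / "bucket"
    (bucket / "current" / "model").mkdir(parents=True)
    hyperparams = Path(tmp) / "hyperparams.json"
    hyperparams.write_text(json.dumps({"batch_size": 64}))
    moved_to = prepare_pretrained(bucket, hyperparams)
    print(moved_to.name)  # prints rl-deepracer-pretrained
    print(json.loads(hyperparams.read_text())["pretrained"])  # prints true
```

The function refuses to overwrite an existing `rl-deepracer-pretrained` directory, so an older base model is never silently lost.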

The first run will likely take quite a while to start as it needs to pull over 10GB of docker images.
You can avoid this delay by pulling the images in advance:
@@ -41,19 +57,30 @@ You can avoid this delay by pulling the images in advance:
- `docker pull awsdeepracercommunity/deepracer-robomaker:<cpu or gpu>`
- `docker pull mattcamp/dr-coach`
- `docker pull minio/minio`

Note that different flavours of CPU image are available, see https://github.com/aws-deepracer-community/deepracer-simapp for details.
`cpu-avx2` is the default.

## Modifying parameters
Hyperparameters for training are loaded from `hyperparams.json` inside `src/rl_coach_2020_v2/hyperparams.json` - shortcut link has been created in the root directory. Available options are exactly the same except the new option `pretrained` that simplifies enabling pretrained mode.

## Video stream
## Monitoring training
- Docker logs should open automatically in new terminal tabs if running with `ENABLE_LOCAL_DESKTOP` enabled, or via tmux in your existing terminal session if `ENABLE_TMUX` is enabled.
- Logs can be manually viewed using `docker ps` and `docker logs robomaker` or `docker logs <sagemaker_container_id>`
- The web video stream is available by default on port 8888. If running in desktop mode a browser window should open automatically, otherwise you can try opening a url such as http://127.0.0.1:8888/stream_viewer?topic=/racecar/deepracer/kvs_stream
- Kinesis video stream can also be enabled. See below for more details, however usually the web video stream just works better.
- If `ENABLE_GUI` is enabled then you can connect a vncviewer on port 8080 to view the gazebo client directly.
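
When training runs on a remote server, the stream viewer URL needs that server's address instead of `127.0.0.1`. The snippet below is a small illustrative helper (the name `stream_viewer_url` is an assumption, not part of this project) that builds the URL from the port and topic quoted above.

```python
from urllib.parse import urlencode, urlunsplit


def stream_viewer_url(host="127.0.0.1", port=8888,
                      topic="/racecar/deepracer/kvs_stream"):
    """Build the web stream viewer URL for a given host, port and topic."""
    # Keep "/" unescaped so the topic appears exactly as in the README URL.
    query = urlencode({"topic": topic}, safe="/")
    return urlunsplit(("http", f"{host}:{port}", "/stream_viewer", query, ""))


print(stream_viewer_url())
# prints http://127.0.0.1:8888/stream_viewer?topic=/racecar/deepracer/kvs_stream
```

Passing `host="my-training-box"` (a placeholder hostname) yields the same URL pointed at a remote machine, provided port 8888 is reachable from your browser.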

The video stream is available either via a web stream or via Kinesis.
## Stopping training
Run `./stop-training.sh` to stop training.

### Web stream:
If running, sagemaker will be stopped first and then after a 20s delay the rest of the containers will be stopped. This allows Robomaker to create a model.tar.gz file in the current model dir, ready to be loaded onto a physical DeepRacer car.

**NOTE: Sagemaker should not be stopped during the policy training phase or things might get weird and corrupt. You should only stop training while the video stream status is "Training" and not "Evaluating" (or verify via sagemaker logs that policy training has completed for the current iteration)**

The web video stream is exposed on port 8888. If you're running a local browser then you should be able to browse directly to `http://127.0.0.1:8888/stream_viewer?topic=/racecar/deepracer/kvs_stream` once Robomaker has started.
## Model management
- run `./delete_last_run.sh` to clear out the buckets for a fresh run. For a convenient version without a sudo prompt, check out `utilites/delete-last.c`.
- run `./local-copy.sh <model_backup_name>` to back up the current model files into a user-specified MODEL directory.
- run `./mk-model.sh <model_path>` to create a .tar.gz file from your model that can be uploaded to a physical car. (Will be removed in a future update once the file gets correctly generated after training)

### Kinesis stream:
### Kinesis video stream:

Kinesis video currently only works via the real AWS Kinesis service and probably only makes sense if you are training on an EC2 instance.

@@ -67,14 +94,11 @@ Kinesis video is a stream of approx 1.5Mbps so beware the impact on your AWS cos

Once working the stream should be visible in the Kinesis console.

### VNC
You can enter the running environment using a vncviewer at localhost:8080.

## Known issues:
- Sometimes sagemaker won't start claiming that `/opt/ml/input/config/resourceconfig.json` is missing. Still trying to work out why.
- Stopping training at the wrong time seems to cause a problem where sagemaker will crash next time when trying to load the 'best' model which may not exist properly. This only happens if you start a new training session without clearing out the bucket first. Yet to be seen if this will cause a problem when trying to use pretrained models.
- `training_params.yaml` must exist in the target bucket or robomaker will not start. The start-training.sh script will copy it over from custom_files if necessary.
- Scripts not currently included to handle pretrained models or uploading to AWS Console or virtual league.
- Scripts not currently included to handle uploading to AWS Console or virtual league.
- Current sagemaker and robomaker GPU images are built for nvidia GPU only.
- The sagemaker and robomaker images are huge (~4.5GB)

4 changes: 2 additions & 2 deletions config.env
@@ -17,10 +17,10 @@ S3_ENDPOINT_URL=http://minio:9000
S3_YAML_NAME=training_params.yaml
SAGEMAKER_SHARED_S3_BUCKET=bucket
SAGEMAKER_SHARED_S3_PREFIX=current
WORLD_NAME=LGSWide
WORLD_NAME=reinvent_base
ENABLE_KINESIS=false
ENABLE_GUI=true
ENABLE_GPU_TRAINING=true
ENABLE_LOCAL_DESKTOP=false
ENABLE_LOCAL_DESKTOP=true
ENABLE_TMUX=false
MIN_EVAL_TRIALS=5
54 changes: 17 additions & 37 deletions data/minio/bucket/custom_files/reward.py
@@ -1,45 +1,25 @@
def reward_function(params):
    '''
    Example of rewarding the agent to stay inside two borders
    and penalizing getting too close to the objects in front
    Example of rewarding the agent to follow center line
    '''

    all_wheels_on_track = params['all_wheels_on_track']
    distance_from_center = params['distance_from_center']
    # Read input parameters
    track_width = params['track_width']
    objects_distance = params['objects_distance']
    _, next_object_index = params['closest_objects']
    objects_left_of_center = params['objects_left_of_center']
    is_left_of_center = params['is_left_of_center']

    # Initialize reward with a small number but not zero
    # because zero means off-track or crashed
    reward = 1e-3
    distance_from_center = params['distance_from_center']

    # Reward if the agent stays inside the two borders of the track
    if all_wheels_on_track and (0.5 * track_width - distance_from_center) >= 0.05:
        reward_lane = 1.0
    # Calculate 3 markers that are at varying distances away from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # Give higher reward if the car is closer to center line and vice versa
    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward_lane = 1e-3

    # Penalize if the agent is too close to the next object
    reward_avoid = 1.0

    # Distance to the next object
    distance_closest_object = objects_distance[next_object_index]
    # Decide if the agent and the next object is on the same lane
    is_same_lane = objects_left_of_center[next_object_index] == is_left_of_center

    if is_same_lane:
        if 0.5 <= distance_closest_object < 0.8:
            reward_avoid *= 0.5
        elif 0.3 <= distance_closest_object < 0.5:
            reward_avoid *= 0.2
        elif distance_closest_object < 0.3:
            reward_avoid = 1e-3 # Likely crashed

    # Calculate reward by putting different weights on
    # the two aspects above
    reward += 1.0 * reward_lane + 4.0 * reward_avoid
        reward = 1e-3 # likely crashed/ close to off track

    return reward
    return float(reward)
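
For readers who want to try the new center-line reward outside the simulator, here is a standalone sketch. The function body is assembled from the lines this commit keeps or adds in `reward.py`. The sample `params` dictionaries are invented test inputs, assuming a track width of 1.0, and contain only the two keys the function reads.

```python
def reward_function(params):
    '''
    Example of rewarding the agent to follow center line
    '''
    # Read input parameters
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Calculate 3 markers that are at varying distances away from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # Give higher reward if the car is closer to center line and vice versa
    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely crashed / close to off track

    return float(reward)


# Invented sample inputs: one distance inside each band, plus one off track.
for distance in (0.05, 0.2, 0.4, 0.6):
    sample = {'track_width': 1.0, 'distance_from_center': distance}
    print(distance, reward_function(sample))
# prints:
# 0.05 1.0
# 0.2 0.5
# 0.4 0.1
# 0.6 0.001
```

The bands are inclusive at their upper edge, so a car exactly half a track width from the center still earns 0.1 rather than the off-track value.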
4 changes: 2 additions & 2 deletions data/minio/bucket/custom_files/training_params.yaml
@@ -1,5 +1,5 @@
---
WORLD_NAME: "LGSWide"
WORLD_NAME: "reinvent_base"
RACE_TYPE: "OBJECT_AVOIDANCE"
SAGEMAKER_SHARED_S3_PREFIX: "current"
CHANGE_START_POSITION: "true"
@@ -19,6 +19,6 @@ MODEL_METADATA_FILE_S3_KEY: "custom_files/model_metadata.json"
METRIC_NAME: "TrainingRewardScore"
CAR_COLOR: "Purple"
TARGET_REWARD_SCORE: "None"
NUMBER_OF_OBSTACLES: "3"
NUMBER_OF_OBSTACLES: "0"
OBSTACLE_TYPE: "BOX"
RANDOMIZE_OBSTACLE_LOCATIONS: "false"
3 changes: 2 additions & 1 deletion mk-model.sh
@@ -1,5 +1,6 @@
#!/bin/bash
# create .tar.gz file uploadable to physical deepracer
# create .tar.gz file uploadable to physical deepracer.
# This should not be necessary if sagemaker is stopped before robomaker as the model.tar.gz will automatically be created.
# USAGE: ./mk-model.sh <model_path>
cd $1
echo $(pwd)
8 changes: 8 additions & 0 deletions start-training.sh
@@ -2,6 +2,13 @@

source config.env

if [ -e data/minio/bucket/current/model/deepracer_checkpoints.json ] ; then
    echo "WARNING: Files were found in the current model directory data/minio/bucket/current/"
    echo "Please run ./delete_last_run.sh or relocate the current model dir before starting a new training session."
    echo "You cannot currently restart training of an existing model, instead you should move the current model dir to rl-deepracer-pretrained and enable pretrained in hyperparams.json"
    exit 1
fi

if [ ! -e data/minio/bucket/current/training_params.yaml ]; then
    mkdir -p data/minio/bucket/current
    cp data/minio/bucket/custom_files/training_params.yaml data/minio/bucket/current
@@ -22,6 +29,7 @@ if [ "$ENABLE_LOCAL_DESKTOP" = true ] ; then
    echo 'Attempting to open stream viewer and logs...'
    gnome-terminal --tab -- sh -c "echo viewer;x-www-browser -new-window http://localhost:8888/stream_viewer?topic=/racecar/deepracer/kvs_stream;sleep 1;wmctrl -r kvs_stream -b remove,maximized_vert,maximized_horz;sleep 1;wmctrl -r kvs_stream -e 1,100,100,720,640"
    gnome-terminal --tab -- sh -c "docker logs -f $SAGEMAKER_ID"
    gnome-terminal --tab -- sh -c 'docker logs -f robomaker'
else
    echo "Started in headless server mode. Set ENABLE_LOCAL_DESKTOP to true in config.env for desktop mode."
    if [ "$ENABLE_TMUX" = true ] ; then
19 changes: 13 additions & 6 deletions stop-training.sh
@@ -3,18 +3,25 @@
source config.env

export ROBOMAKER_COMMAND=""
docker-compose -f ./docker-compose.yml down

docker stop $(docker ps | awk ' /sagemaker/ { print $1 }')
docker rm $(docker ps -a | awk ' /sagemaker/ { print $1 }')
SAGEMAKER_ID=$(docker ps | awk ' /sagemaker/ { print $1 }')
if [ ! -z "${SAGEMAKER_ID}" ]; then
    echo "Stopping sagemaker and waiting 20s while model.tar.gz is created"
    docker stop ${SAGEMAKER_ID}
    sleep 20
    docker rm ${SAGEMAKER_ID}
fi

docker-compose -f ./docker-compose.yml down

if [ "$ENABLE_LOCAL_DESKTOP" = true ] ; then
    wmctrl -c kvs_stream
    if [ -n "$(which wmctrl)" ] ; then
        wmctrl -c kvs_stream
    fi
fi

if [ ! -z "$(which tmux)" ]
then
if [ "$ENABLE_TMUX" = true ] ; then
    tmux kill-session
fi

