A simple 2D guidance environment for RL.
The agent's 2D position is (x, y), and θ is the angle between the agent's travel direction and the positive x-axis (a kinematics sketch is given below).
Target: Guide the agent to a designated point.
Action: { velocity | angular velocity }
State: { agent position | target position | relative position }
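A minimal sketch of the kinematics this description suggests, assuming a unicycle model; the time step `dt` and the update rule are illustrative assumptions, not taken from this repo:

```python
import math

def step_kinematics(x, y, theta, velocity, angular_velocity, dt=0.1):
    """Advance the agent one step: translate along the heading, then update the heading.

    The unicycle model and the dt value are assumptions for illustration only.
    """
    x += velocity * math.cos(theta) * dt
    y += velocity * math.sin(theta) * dt
    theta += angular_velocity * dt
    return x, y, theta
```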
Continuous PPO is used in this work, together with the tricks listed below (a short sketch of Tricks 1 and 2 follows the list).
Trick 1: Advantage Normalization.
Trick 2: State Normalization.
Trick 4: Reward Scaling.
Trick 5: Policy Entropy.
Trick 6: Learning Rate Decay.
Trick 7: Gradient Clip.
Trick 8: Orthogonal Initialization.
Trick 9: Adam Optimizer Epsilon Parameter.
Trick 10: Tanh Activation Function.
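For reference, here is a minimal numpy-only sketch of Tricks 1 and 2; the function and class names are illustrative, not the ones used in this repo:

```python
import numpy as np

def normalize_advantages(advantages, eps=1e-8):
    """Trick 1: rescale a batch of advantages to zero mean and unit std."""
    advantages = np.asarray(advantages, dtype=np.float64)
    return (advantages - advantages.mean()) / (advantages.std() + eps)

class RunningStateNormalizer:
    """Trick 2: normalize states with running mean/std estimates (Welford's update)."""

    def __init__(self, state_dim):
        self.count = 0
        self.mean = np.zeros(state_dim)
        self.m2 = np.zeros(state_dim)

    def update(self, state):
        # Update the running first and second moments with one new state.
        self.count += 1
        delta = state - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (state - self.mean)

    def normalize(self, state, eps=1e-8):
        std = np.sqrt(self.m2 / max(self.count, 1)) + eps
        return (state - self.mean) / std
```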
A Beta distribution is used instead of a Gaussian one, so that the agent does not sample too heavily at the edges of the action space.
A sample drawn from the Beta action distribution looks as follows:
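A minimal sketch of drawing one bounded action this way; the shape parameters and action bounds below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng()

def sample_beta_action(alpha, beta, low, high):
    """Sample u ~ Beta(alpha, beta) on (0, 1), then rescale to [low, high].

    Unlike an unbounded Gaussian, the Beta density lives on (0, 1), so the rescaled
    action never has to be clipped and probability mass does not pile up at the
    edges of the action space.
    """
    u = rng.beta(alpha, beta)          # alpha, beta > 1 gives a unimodal density
    return low + (high - low) * u      # affine map onto the real action interval

# Illustrative bounds: velocity in [0, 2], angular velocity in [-1, 1]
velocity = sample_beta_action(2.0, 2.0, 0.0, 2.0)
angular_velocity = sample_beta_action(2.0, 2.0, -1.0, 1.0)
```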
Two reward terms are used in this work: R = T + α * F (a sketch of this computation follows the list).
- F = -0.01 * (current distance - previous distance)
- T = 100 (terminal: target reached); -2 (out of the space); -1 (max episode length reached)
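A hedged sketch of how R = T + α * F could be computed; the function names and the default α are assumptions for illustration:

```python
import math

def shaping_reward(curr_pos, prev_pos, target_pos):
    """F: positive when the agent moved closer to the target, negative otherwise."""
    d_now = math.dist(curr_pos, target_pos)
    d_prev = math.dist(prev_pos, target_pos)
    return -0.01 * (d_now - d_prev)

def terminal_reward(reached_target, out_of_space, max_episode):
    """T: sparse reward given only on episode-ending events."""
    if reached_target:
        return 100.0
    if out_of_space:
        return -2.0
    if max_episode:
        return -1.0
    return 0.0

def total_reward(curr_pos, prev_pos, target_pos,
                 reached_target=False, out_of_space=False, max_episode=False,
                 alpha=1.0):
    """R = T + alpha * F (alpha = 1.0 is an illustrative default)."""
    return (terminal_reward(reached_target, out_of_space, max_episode)
            + alpha * shaping_reward(curr_pos, prev_pos, target_pos))
```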
The terminal reward affects the agent's behavior. Evaluation is run 10 times every 5e3 steps; the results below show the differences:
Terminal Reward = 50:
Terminal Reward = 80:
Terminal Reward = 90:
Terminal Reward = 100:
A set of 10 evaluations was run. The results are shown in /test_img
For example:
Besides the PPO arguments, the normalization arguments (mean and std) must also be used (see the sketch below).
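A minimal sketch of reusing the training-time normalization statistics at evaluation; the file name and the .npz keys are assumptions:

```python
import numpy as np

def load_state_normalizer(path="state_norm.npz"):
    """Load the mean/std collected by the running state normalizer during training."""
    data = np.load(path)
    return data["mean"], data["std"]

def normalize_state(state, mean, std, eps=1e-8):
    """Apply the same statistics at evaluation so the policy sees the same input scale."""
    return (np.asarray(state) - mean) / (std + eps)
```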
numpy
matplotlib
math
gym