line-world

A simple multi-modal continuous control RL environment.

The green dot is the agent's location and the two peaks show the multi-modal reward structure. The agent's action space is the range [-1, 1]; each action moves the agent stochastically along the horizontal line.
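To make the description above concrete, here is a minimal sketch of a line world with two reward peaks. It is a hypothetical stand-in, not the package's actual implementation: the class name `ToyLineWorld`, the peak locations, the Gaussian reward shape, the noise level, the step size and the [-1, 1] position bounds are all assumptions chosen for illustration.

```python
import math
import random


class ToyLineWorld:
    """Hypothetical toy line world; NOT the line_world package's code."""

    def __init__(self, peaks=(-0.5, 0.5), width=0.1, noise=0.05, seed=None):
        self.peaks = peaks      # assumed locations of the two reward peaks
        self.width = width      # assumed width of each peak
        self.noise = noise      # assumed std. dev. of the movement noise
        self.rng = random.Random(seed)
        self.position = 0.0

    def reward(self, x):
        # Multi-modal reward: a sum of Gaussian bumps, one per peak
        return sum(
            math.exp(-((x - p) ** 2) / (2 * self.width ** 2))
            for p in self.peaks
        )

    def reset(self):
        # Start every episode in the middle of the line
        self.position = 0.0
        return self.position

    def step(self, action, step_size=0.1):
        # Clip the action to the action space [-1, 1]
        action = max(-1.0, min(1.0, action))
        # Move stochastically: intended displacement plus Gaussian noise
        self.position += step_size * action + self.rng.gauss(0.0, self.noise)
        # Keep the agent on the (assumed) bounded line
        self.position = max(-1.0, min(1.0, self.position))
        return self.position, self.reward(self.position)
```

With symmetric peaks, moving left or right from the start is equally good, which is what makes the task multi-modal.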

Installation

This package is not distributed on PyPI, so you'll have to install it from source.

git clone https://github.com/aaronsnoswell/line-world.git
cd line-world
pip install -e .

To test the installation, run the bundled demo:

from line_world.envs import demo
demo()

Usage

Importing the package registers the environment with the gym environment registry.

import gym
import line_world
env = gym.make("LineWorld-v0")

# ... you can now use it like any other environment
env.render()

To train a stable_baselines agent:

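import gym
import line_world  # importing registers LineWorld-v0 with gym
from stable_baselines import PPO2

# Train PPO2 for 10000 timesteps; the environment is created from its ID
agent = PPO2('MlpPolicy', 'LineWorld-v0').learn(10000)

# Query the trained agent for an action in a fresh environment
env = gym.make("LineWorld-v0")
observation = env.reset()
action, states = agent.predict(observation)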
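The snippet above queries a single action. To run a whole episode, a small helper along the following lines may be useful. This is a sketch, not part of the package: the name `run_episode` and the `max_steps` cap are hypothetical, and it assumes the classic gym API that stable_baselines targets, where `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`.

```python
def run_episode(env, choose_action, max_steps=200):
    """Run one episode and return the total undiscounted reward.

    Assumes the classic gym API:
      reset() -> observation
      step(action) -> (observation, reward, done, info)
    """
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # choose_action maps an observation to an action
        action = choose_action(observation)
        observation, reward, done, _info = env.step(action)
        total_reward += float(reward)
        if done:
            break
    return total_reward
```

For a random baseline, pass `lambda obs: env.action_space.sample()`; for the trained agent, pass `lambda obs: agent.predict(obs)[0]`.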

Optimal policies

For symmetric versions of this task, the optimal policy can be queried from env._opt_pol(). The optimal policy for symmetric tasks is as follows:
