Hi @zachgk, what do you think about this proposal?
Hi @zachgk, thanks for getting back to me. I'll start by implementing a few of the spaces, which will act as the spec for both action and observation spaces, and then we'll see what the next steps are. I note your comment on JBox2D, so we can discuss further when we get to that part.
RL Multi Environment and Spaces Proposal
Currently, DJL's RL implementation provides a single-agent environment that requires the ability to get pre- and post-states. There is no definition of the observation space, and the action space is limited to the discrete space required by agents such as DQN.
This proposal seeks to improve on what has already been achieved in DJL by implementing spaces and environments based on some of the popular Python APIs.
Spaces
It is proposed to implement spaces (observation and action space definitions) based on the de facto standard OpenAI Gym, as sketched after the list below.
The current OpenAI Gym API supports the following space types:
- Box: a multidimensional array, optionally bounded by upper and lower limits.
- Discrete: a finite set of integers.
- MultiBinary: an n-shaped binary array space.
- MultiDiscrete: a Cartesian product of arbitrary Discrete spaces.
- Dict: an ordered dictionary of space instances.
- Tuple: a tuple (product) of space instances.
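To make this concrete, here is a minimal sketch of what a Gym-style space abstraction could look like on top of DJL's NDArray. The `Space` and `Box` names and method signatures below are illustrative assumptions for this proposal, not existing DJL API.

```java
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;

/** Hypothetical space interface mirroring Gym's Space; names are illustrative. */
interface Space {
    Shape getShape();

    /** Draws a random element of the space, e.g. for environment testing. */
    NDArray sample(NDManager manager);

    /** Returns true if the given value is a member of this space. */
    boolean contains(NDArray value);
}

/** A bounded box space: every element lies in [low, high] with the given shape. */
final class Box implements Space {
    private final float low;
    private final float high;
    private final Shape shape;

    Box(float low, float high, Shape shape) {
        this.low = low;
        this.high = high;
        this.shape = shape;
    }

    @Override
    public Shape getShape() {
        return shape;
    }

    @Override
    public NDArray sample(NDManager manager) {
        // Uniform sample within the bounds; an unbounded box would need a different distribution.
        return manager.randomUniform(low, high, shape);
    }

    @Override
    public boolean contains(NDArray value) {
        return value.getShape().equals(shape)
                && value.gte(low).all().getBoolean()
                && value.lte(high).all().getBoolean();
    }
}
```

A Discrete space would be the analogous wrapper around an integer range, and `contains()` gives agents and tests a cheap way to validate actions and observations.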
Environment Observation Space
Add an observation space (spec) to the existing environment. This will help in constructing policies programmatically, e.g. in a tuning framework.
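For example, again assuming the hypothetical `Space` interface sketched above, the existing environment interface could gain two accessors; `SpecifiedRlEnv` and both method names are proposed additions, not existing API:

```java
import ai.djl.modality.rl.env.RlEnv;

/** Hypothetical extension of DJL's RlEnv exposing its space specs. */
interface SpecifiedRlEnv extends RlEnv {
    Space getObservationSpace();

    Space getActionSpace();
}
```

A tuning framework could then read `getObservationSpace().getShape()` to size the input layer of a policy network without any environment-specific code.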
Environment Renderer
Add a Renderer to the existing environment so that agent training and evaluation can be viewed either on screen while running or later via a recorded video.
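A minimal sketch of such a hook, with all names illustrative:

```java
import ai.djl.modality.rl.env.RlEnv;

/** Hypothetical rendering hook: an on-screen implementation might paint to a
 *  Swing canvas, while a recording implementation writes frames to a video file. */
interface Renderer {
    /** Called once per environment step with the current environment state. */
    void render(RlEnv env);

    /** Releases resources, e.g. closes a window or finalizes a video file. */
    void close();
}
```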
Multi Agent Environment
An environment often contains more than one actor, e.g. in multi-player games; however, the OpenAI Gym environment supports only a single agent.
To enable support for multiple agents, the proposal is to implement multi-agent environments based on either Ray RLlib or PettingZoo, as sketched below.
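As a rough illustration of the RLlib-style shape, where observations, actions, rewards, and done flags are dictionaries keyed by agent id, a multi-agent environment interface might look like the following; all names here are assumptions for this proposal:

```java
import java.util.Map;

import ai.djl.ndarray.NDList;

/** Hypothetical multi-agent environment, loosely following RLlib's MultiAgentEnv. */
interface MultiAgentEnv {
    /** Resets the environment and returns each agent's initial observation. */
    Map<String, NDList> reset();

    /** Applies one action per agent and returns each agent's next observation. */
    Map<String, NDList> step(Map<String, NDList> actions);

    /** Rewards earned by each agent on the last step. */
    Map<String, Float> getRewards();

    /** Whether each agent's episode has terminated. */
    Map<String, Boolean> getDones();
}
```

PettingZoo's AEC model instead steps agents one at a time; which of the two models fits DJL better would be part of the discussion.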
Current DJL Environment Step and DQNAgent
The proposal here is to bring Step and DQNAgent in line with the popular APIs, e.g. Gym, TensorFlow Agents, and Ray RLlib.
The current Step object requires both a pre- and a post-state; this is not normally required by other APIs and adds overhead when creating an environment. In common implementations such as TensorFlow Agents or RLlib, the DQN agent takes the pre- and post-states from the replay buffer.
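A minimal sketch of that pattern, with all class names hypothetical: the environment records only the observation produced by each step, and the replay buffer reconstructs (preState, postState) pairs from consecutive entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import ai.djl.ndarray.NDList;

/** Hypothetical simplified step record: unlike the current Step, it stores only
 *  the observation seen after the action, not a pre/post pair. */
final class Transition {
    final NDList observation;
    final NDList action;
    final float reward;
    final boolean done;

    Transition(NDList observation, NDList action, float reward, boolean done) {
        this.observation = observation;
        this.action = action;
        this.reward = reward;
        this.done = done;
    }
}

/** Sketch of a replay buffer that recovers (preState, postState) by pairing
 *  consecutive transitions, as TensorFlow Agents and RLlib do. */
final class SimpleReplayBuffer {
    private final List<Transition> transitions = new ArrayList<>();
    private final Random random = new Random();

    void add(Transition transition) {
        transitions.add(transition);
    }

    /** Samples one (pre, post) pair; a real buffer would sample mini-batches
     *  and skip pairs that straddle an episode boundary (done == true). */
    NDList[] samplePrePost() {
        int i = random.nextInt(transitions.size() - 1);
        NDList preState = transitions.get(i).observation;
        NDList postState = transitions.get(i + 1).observation;
        return new NDList[] {preState, postState};
    }
}
```

This removes the requirement for environment authors to track both states themselves, which is the overhead the proposal aims to eliminate.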
JBox2D Environments
Once the above has been achieved, we can implement one or more JBox2D environments based on the many Python ones out there.