Repo to build RLlib policies.
A policy contains one or more networks. Each network processes a specified set of observations (e.g., images, graphs, poses).
To build a new policy:
- Inherit from NetworkBase to define a network in PyTorch.
- Inherit from RllibPolicy to define a policy.
See NatureCNNRNNActorCritic for an example.
To use the CNN-LSTM Actor-Critic policy defined here:
from ray.rllib.models import ModelCatalog
from rllib_policies.vision import NatureCNNRNNActorCritic

# Register the policy under a name that the RLlib model config can refer to.
ModelCatalog.register_custom_model("nature_cnn_rnn", NatureCNNRNNActorCritic)

model = {
    "custom_model": "nature_cnn_rnn",
    "max_seq_len": 200,
    # keyword arguments passed to the custom model
    "custom_model_config": {
        "rnn_type": "LSTM",
        "hidden_size": 512,
        # CNN-specific arguments
        "fields": ["RGB_LEFT", "DEPTH"],  # keys in the observation dictionary
        "cnn_shape_chw": [4, 192, 256],
    },
}
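
The channel count in `cnn_shape_chw` presumably has to agree with the listed `fields`. The sketch below checks that before training starts. It is not part of `rllib_policies`: the helper names and the per-field channel table are hypothetical, and it assumes the fields are stacked along the channel axis (3 channels for `RGB_LEFT`, 1 for `DEPTH`), which this README does not confirm.

```python
# Hypothetical helper, not part of rllib_policies: sanity-check a model config.
# ASSUMPTION: each field contributes a fixed number of channels and the CNN
# input stacks them along the channel axis (RGB = 3, depth = 1).
ASSUMED_CHANNELS = {"RGB_LEFT": 3, "DEPTH": 1}


def expected_channels(fields, channels_per_field=ASSUMED_CHANNELS):
    """Sum the assumed channel counts of the listed observation fields."""
    return sum(channels_per_field[name] for name in fields)


def check_cnn_shape(custom_model_config):
    """Return (C, H, W); raise ValueError if C does not match the fields."""
    channels, height, width = custom_model_config["cnn_shape_chw"]
    wanted = expected_channels(custom_model_config["fields"])
    if channels != wanted:
        raise ValueError(
            f"cnn_shape_chw has {channels} channels, "
            f"but the fields imply {wanted}"
        )
    return channels, height, width


custom_model_config = {
    "rnn_type": "LSTM",
    "hidden_size": 512,
    "fields": ["RGB_LEFT", "DEPTH"],
    "cnn_shape_chw": [4, 192, 256],
}
print(check_cnn_shape(custom_model_config))  # (4, 192, 256)
```

If the policy combines fields some other way, adjust `ASSUMED_CHANNELS` or drop the check.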
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.
This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.
(c) 2022 Massachusetts Institute of Technology.
MIT Proprietary, Subject to FAR52.227-11 Patent Rights - Ownership by the contractor (May 2014)
The software/firmware is provided to you on an As-Is basis
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.