BRAC+: Improved Behavior Regularized Offline Reinforcement Learning

Requirements

We highly recommend creating a new Python environment to test our code:

conda create -n bracp python=3.8

conda activate bracp

To install requirements:

pip install -r requirements.txt

pip install git+https://github.com/rail-berkeley/d4rl@master#egg=d4rl

pip install rlutils-python==0.0.3

To train on a D4RL dataset (here halfcheetah-medium-v0 with seed 110):

python d4rl_bracp.py train --env_name halfcheetah-medium-v0 --seed 110

The script first pretrains the behavior policy, then pretrains the initial policy to minimize its KL divergence from the behavior policy.
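To give a rough feel for what this pretraining stage does, here is a toy sketch. It is not the code in d4rl_bracp.py. It assumes state-independent diagonal Gaussian policies, fits the behavior policy by maximum likelihood, and takes the KL direction to be KL(initial || behavior). All function names (fit_behavior, gaussian_kl, pretrain_initial_policy) are hypothetical.

```python
import math
import statistics


def fit_behavior(actions, min_std=1e-3):
    """Maximum-likelihood fit of a diagonal Gaussian to logged actions.

    Toy assumption: the behavior policy ignores the state.
    """
    dims = list(zip(*actions))  # transpose to per-dimension columns
    mu = [statistics.fmean(d) for d in dims]
    std = [max(statistics.pstdev(d), min_std) for d in dims]
    return mu, std


def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """Closed-form KL(p || q) for diagonal Gaussians, summed over dimensions."""
    kl = 0.0
    for mp, sp, mq, sq in zip(mu_p, std_p, mu_q, std_q):
        kl += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2.0 * sq ** 2) - 0.5
    return kl


def pretrain_initial_policy(mu_b, std_b, steps=1000, lr=0.05):
    """Gradient descent on KL(pi || pi_b), starting from a standard normal pi.

    The step size is tuned for this toy example only.
    """
    mu = [0.0] * len(mu_b)
    log_std = [0.0] * len(mu_b)
    for _ in range(steps):
        for i in range(len(mu)):
            var = math.exp(2.0 * log_std[i])
            # Analytic gradients of the closed-form KL above.
            grad_mu = (mu[i] - mu_b[i]) / std_b[i] ** 2
            grad_log_std = var / std_b[i] ** 2 - 1.0
            mu[i] -= lr * grad_mu
            log_std[i] -= lr * grad_log_std
    return mu, [math.exp(s) for s in log_std]
```

Calling pretrain_initial_policy with the output of fit_behavior drives the KL toward zero, so the initial policy starts out close to the data-collecting policy. The actual script presumably uses state-conditioned neural network policies trained on minibatches.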

Logs are written to data/d4rl_results/
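If you want results over several seeds, one option is a small launcher like the sketch below. It only uses the train subcommand and the --env_name and --seed flags shown above. The helper names build_commands and run_all are hypothetical and not part of this repository, and any seed other than 110 is an arbitrary choice.

```python
import subprocess
import sys


def build_commands(env_name, seeds, script="d4rl_bracp.py"):
    """Build one training command per seed, mirroring the command shown above."""
    return [
        [sys.executable, script, "train", "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, run_all(build_commands("halfcheetah-medium-v0", [110, 111, 112])) would train three seeds sequentially from the repository root.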