This repository is the official implementation of BRAC+: Improved Behavior Regularized Actor Critic for Offline Reinforcement Learning.
We highly recommend creating a new Python environment to test our code:
conda create -n bracp python=3.8
To install requirements:
pip install -r requirements.txt
pip install git+https://github.com/rail-berkeley/d4rl@master#egg=d4rl
pip install rlutils-python==0.0.3
To train BRAC+ on a D4RL task, for example halfcheetah-medium-v0 with seed 110:
python d4rl_bracp.py train --env_name halfcheetah-medium-v0 --seed 110
The script first pretrains the behavior policy, then pretrains an initial policy that minimizes the KL divergence from the behavior policy.
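To make the KL-minimizing initialization concrete, here is a minimal sketch. It is not the code in d4rl_bracp.py. It assumes both policies are diagonal Gaussians at a single state, uses made-up values for the behavior policy (`mu_b`, `std_b`), and assumes the direction KL(policy || behavior). It runs plain gradient descent on the closed-form KL until the policy matches the behavior policy.

```python
import numpy as np


def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """Closed-form KL(p || q) for diagonal Gaussians, summed over dimensions."""
    return float(np.sum(
        np.log(std_q / std_p)
        + (std_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * std_q ** 2)
        - 0.5
    ))


# Hypothetical behavior policy at one state (illustrative values only).
mu_b = np.array([0.3, -0.2])
std_b = np.array([0.5, 0.8])

# The policy being initialized starts as a standard normal.
mu = np.zeros(2)
log_std = np.zeros(2)

kl_start = gaussian_kl(mu, np.exp(log_std), mu_b, std_b)

lr = 0.05
for _ in range(2000):
    std = np.exp(log_std)
    # Analytic gradients of KL(policy || behavior) w.r.t. mu and log_std.
    grad_mu = (mu - mu_b) / std_b ** 2
    grad_log_std = std ** 2 / std_b ** 2 - 1.0
    mu = mu - lr * grad_mu
    log_std = log_std - lr * grad_log_std

kl_end = gaussian_kl(mu, np.exp(log_std), mu_b, std_b)
print("KL before:", kl_start, "KL after:", kl_end)
```

In an actual training script the policies would be neural networks conditioned on the state and the KL would be averaged over dataset states. The sketch only shows the objective being driven to zero.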
The logs are written to data/d4rl_results/.
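To repeat the training command over several seeds, a small helper like the following sketch may be convenient. It only uses the `--env_name` and `--seed` flags shown above. The helper name `train_command` and the seed list are illustrative assumptions, not part of this repository.

```python
import shlex


def train_command(env_name, seed):
    """Build the training command shown above for one environment and seed."""
    return ["python", "d4rl_bracp.py", "train",
            "--env_name", env_name, "--seed", str(seed)]


seeds = [110, 111, 112]  # illustrative seeds, not prescribed by the repository
commands = [train_command("halfcheetah-medium-v0", s) for s in seeds]

for cmd in commands:
    # Print each command; to launch it instead, pass cmd to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them before starting long training runs.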