Source code of "Safety Filters for Black-Box Dynamical Systems by Learning Discriminating Hyperplanes"
Will Lavanakul*, Jason J. Choi*, Koushil Sreenath, Claire J. Tomlin
University of California, Berkeley
To create the conda environment, run `conda env create -f environment.yml`.
All code for SL-DH is in `SL-DH/`, and all code for RL-DH is in `RL-DH/`.
- `controllers` contains the different controllers, including the point-follow controller for the car and the QP safe controller.
- `envs` provides implementations of the cartpole, kinematic car, and inverted pendulum environments.
- `filters` provides the classes that supply the hyperplane constraint to the QP controller (see the sketch after this list).
- `inv_set` contains all control invariant sets used in the experiments.
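For context, with a single learned hyperplane the QP safety filter reduces to a closed-form projection of the nominal control onto the safe half-space. The sketch below is illustrative only and is not the repository's implementation; it assumes the learned filter provides `a(x)` and `b(x)` such that controls satisfying a(x)ᵀu ≤ b(x) are treated as safe, and all names are hypothetical.

```python
import numpy as np

def filter_control(u_nom, a, b):
    """Minimally modify the nominal control so that a^T u <= b.

    Solves  min_u ||u - u_nom||^2  s.t.  a^T u <= b.
    With a single affine constraint this QP has the closed-form solution of
    projecting u_nom onto the half-space {u : a^T u <= b}.
    """
    violation = a @ u_nom - b
    if violation <= 0.0:
        return u_nom                               # nominal control is already safe
    return u_nom - (violation / (a @ a)) * a       # project onto {u : a^T u = b}

# Hypothetical usage; in the repository a(x), b(x) come from the learned filter.
u_nom = np.array([1.5])
a, b = np.array([2.0]), 1.0
u_safe = filter_control(u_nom, a, b)               # -> array([0.5]), satisfies 2*u <= 1
```

In the repository this constraint is supplied by the classes in `filters` to the QP safe controller in `controllers`; the projection above only illustrates the role of the hyperplane constraint.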
The following commands run each SL-DH experiment from the paper. All hyperparameters are set to the values used in the paper:
- Inverted Pendulum: Assign all hyperparameters in the script `inv_ped_train.py`, then run `python inv_ped_train.py`.
- Kinematic Car: Assign all hyperparameters in the script `four_car_train.py`, then run `python four_car_train.py`. To visualize different lookahead times and the animation, set the plot values in `plot_four_car.py` and run `python plot_four_car.py`.
- Jet Engine: Assign all hyperparameters in the script `jet_train.py`, then run `python jet_train.py`.
`ppo.py` contains the code for training agents on CartPole and HalfCheetah. It currently supports PPO, PPO Lagrangian, PPO with SL-DH, and PPO with RL-DH. Note that PPO with SL-DH only works for CartPole, not HalfCheetah. This codebase is built on top of the following repository: https://github.com/akjayant/PPO_Lagrangian_PyTorch?tab=readme-ov-file.
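As a rough picture of how such a safety filter plugs into the RL training loop, the policy's action can be passed through the filter before it is applied to the environment. This is an illustrative sketch only, not the structure of `ppo.py`; the wrapper and the `safety_filter` interface are hypothetical, and the Gymnasium API is used for concreteness.

```python
import gymnasium as gym

class SafetyFilterWrapper(gym.Wrapper):
    """Applies a safety filter to every action before stepping the environment.

    `safety_filter(obs, action)` is assumed to return a minimally modified safe
    action, e.g. the hyperplane projection sketched above.
    """

    def __init__(self, env, safety_filter):
        super().__init__(env)
        self.safety_filter = safety_filter
        self._last_obs = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._last_obs = obs
        return obs, info

    def step(self, action):
        safe_action = self.safety_filter(self._last_obs, action)
        obs, reward, terminated, truncated, info = self.env.step(safe_action)
        self._last_obs = obs
        return obs, reward, terminated, truncated, info
```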
See the training scripts for how the SL-DH controllers are created; the main algorithm is the training loop in each script.
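The exact loss and labeling scheme are defined in those scripts and in the paper. Purely as an illustration of the general idea, a supervised training step for a network that outputs the hyperplane parameters (a(x), b(x)) might look like the sketch below. All names and the hinge-style loss are hypothetical; labels are assumed to mark whether a sampled control keeps the state inside the control invariant set.

```python
import torch
import torch.nn as nn

class HyperplaneNet(nn.Module):
    """Hypothetical network mapping a state x to hyperplane parameters a(x), b(x)."""

    def __init__(self, state_dim, control_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, control_dim + 1),
        )

    def forward(self, x):
        out = self.body(x)
        return out[..., :-1], out[..., -1]         # a(x), b(x)

def train_step(net, optimizer, x, u, label):
    """One supervised step on a batch of (state, control, label) samples.

    `label` is +1 if the control u should be allowed at x (e.g. it keeps the
    next state inside the control invariant set) and -1 otherwise.
    """
    a, b = net(x)
    margin = b - (a * u).sum(-1)                   # positive when a(x)^T u <= b(x)
    loss = torch.clamp(1.0 - label * margin, min=0.0).mean()   # hinge-style loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```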
The following commands run each RL-DH experiment from the paper. Refer to the paper for hyperparameters. For `{env}`, use `CartPole` or `HalfCheetah`.
- PPO: `python ppo.py --env {env} --exp_name ppo --steps {steps} --seed {seed}`
- PPO Lagrangian: `python ppo.py --env {env} --exp_name ppo_lag --steps {steps} --seed {seed}`
- PPO with SL-DH: `python ppo.py --env CartPole --exp_name ppo_nn_qp_ --steps {steps} --seed {seed}`
- PPO with RL-DH: `python ppo.py --env {env} --exp_name pret_ppo --seed {seed} --steps {steps} --pret_dir {pret_dir}`
Here `{pret_dir}` is the directory containing a pretrained RL-DH safety filter (see the pretraining step below); it is usually saved in `data/fppo/exp_name/pyt_save`. A loading sketch is given at the end of this section.
Pretraining the RL-DH safety filter:
- Run `python fppo.py --env {env} --seed {seed} --exp_name {exp} --steps {steps}`
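If the pretrained filter is stored as a standard PyTorch checkpoint under `pyt_save` (an assumption based on the directory name; the actual file name and the loading code inside `ppo.py` may differ), loading it for use with `--pret_dir` might look like:

```python
import os
import torch

# Hypothetical loading of a pretrained RL-DH safety filter checkpoint.
# The file name "model.pt" is an assumption; check the contents of pyt_save/.
pret_dir = "data/fppo/my_exp/pyt_save"
filter_model = torch.load(os.path.join(pret_dir, "model.pt"))
filter_model.eval()   # the pretrained filter is used for inference only
```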