Code for paper: Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

rll-research/rune

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (RUNE)

This repository contains the code for Reward Uncertainty for Exploration in Preference-based Reinforcement Learning, along with scripts to reproduce the experiments. The codebase is largely derived from B-Pref, with modifications.

Install

conda env create -f conda_env.yml
pip install -e .[docs,tests,extra]
cd custom_dmcontrol
pip install -e .
cd custom_dmc2gym
pip install -e .
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
pip install pybullet

Instructions

The RUNE algorithm is implemented in train_PEBBLE_explore.py (based on PEBBLE) and train_PrefPPO_explore.py (based on PrefPPO). The default hyperparameters used in the paper are included in the config files (config/) and the training scripts (scripts/).
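
For orientation, the core idea behind RUNE can be sketched in a few lines. This is an illustrative sketch only, not code taken from train_PEBBLE_explore.py or train_PrefPPO_explore.py. It assumes the formulation described in the paper as we understand it: the mean prediction of an ensemble of learned reward models serves as the extrinsic reward, the standard deviation across the ensemble serves as the intrinsic (exploration) reward, and the intrinsic weight decays exponentially over training. The function name rune_reward, the parameter names beta_0 and rho, and their default values are hypothetical.

```python
import numpy as np


def rune_reward(ensemble_rewards, step, beta_0=0.05, rho=0.001):
    """Combine ensemble reward predictions into an exploration-augmented reward.

    ensemble_rewards: array-like of shape (n_members, batch), one row of
        predicted rewards per ensemble member (hypothetical layout).
    step: current training step, used to decay the exploration weight.
    beta_0, rho: assumed initial weight and decay rate (illustrative values).
    """
    ensemble_rewards = np.asarray(ensemble_rewards, dtype=np.float64)

    # Extrinsic reward: mean prediction across ensemble members.
    extrinsic = ensemble_rewards.mean(axis=0)

    # Intrinsic reward: disagreement (population std) across ensemble members.
    intrinsic = ensemble_rewards.std(axis=0)

    # Exponentially decaying weight on the intrinsic term.
    beta_t = beta_0 * (1.0 - rho) ** step

    return extrinsic + beta_t * intrinsic


if __name__ == "__main__":
    # Two ensemble members, two transitions; the members disagree only on
    # the first transition, so only that one receives an exploration bonus.
    preds = [[1.0, 2.0], [3.0, 2.0]]
    print(rune_reward(preds, step=0, beta_0=0.5, rho=0.1))
```

Transitions on which the ensemble members agree receive no bonus, so the agent is steered toward regions where the learned reward is still uncertain. For the actual schedule and hyperparameter values, refer to the config files and training scripts.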

The experiments in Table 1 can be reproduced with the following example scripts:

PEBBLE + RUNE:

./scripts/[env_name]/[max_budget]/run_PEBBLE_rune.sh [date: yyyy-mm-dd]
./scripts/[env_name]/[max_budget]/run_PEBBLE.sh [date: yyyy-mm-dd]

PrefPPO + RUNE:

./scripts/[env_name]/[max_budget]/run_PrefPPO_rune.sh [date: yyyy-mm-dd]
./scripts/[env_name]/[max_budget]/run_PrefPPO.sh [date: yyyy-mm-dd]
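
When launching several runs, it can help to assemble these command lines programmatically. The helper below is a small sketch, not part of the repository: build_command is a hypothetical name, and the values passed for env_name and max_budget are placeholders that must be replaced with directory names that actually exist under scripts/. It only relies on the path pattern and the yyyy-mm-dd date format shown above.

```python
from datetime import date


def build_command(env_name, max_budget, script, run_date=None):
    """Return the command line for one of the run scripts.

    env_name, max_budget: directory names under scripts/ (placeholders here).
    script: script file name, e.g. "run_PEBBLE_rune.sh".
    run_date: datetime.date to pass as the date argument; defaults to today.
    """
    if run_date is None:
        run_date = date.today()
    # date.isoformat() yields the yyyy-mm-dd format the scripts expect.
    return f"./scripts/{env_name}/{max_budget}/{script} {run_date.isoformat()}"


if __name__ == "__main__":
    # "ENV" and "BUDGET" are stand-ins, not real directory names.
    for script in ("run_PEBBLE_rune.sh", "run_PEBBLE.sh"):
        print(build_command("ENV", "BUDGET", script, date(2022, 1, 31)))
```

The resulting strings can be passed to a shell or to subprocess.run with shell=True once real directory names are filled in.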
