Source code for Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning (CoLLAs 2022).
SkillHack is a repository for skill-based learning built on MiniHack and the NetHack Learning Environment (NLE). SkillHack consists of 16 simple skill acquisition environments and 8 complex task environments. The task environments are difficult to solve due to the large state-action space and sparsity of rewards, but can be made tractable by transferring knowledge gained from the simpler skill acquisition environments.
This repository depends on the MiniHack repo (https://github.com/facebookresearch/minihack). First, create a directory to store pretrained skills in and set the environment variable SKILL_TRANSFER_HOME to point to this directory. This directory must also include a file called skill_config.yaml. A default version of this file can be found in the same directory as this readme. Next, this directory must be populated with pretrained skill experts. These are obtained by running skill_transfer_polyhydra.py on a skill-specific environment.
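Before training, it can help to confirm that the directory is set up as described. The following is a minimal sketch and not part of the repository: the helper name `check_skill_home` is hypothetical, and it checks only what the text above states (the environment variable, the directory, and skill_config.yaml).

```python
import os
from pathlib import Path


def check_skill_home(environ=os.environ):
    """Return a list of setup problems; an empty list means the layout looks right.

    Hypothetical helper, not part of SkillHack.
    """
    home = environ.get("SKILL_TRANSFER_HOME")
    if not home:
        return ["SKILL_TRANSFER_HOME is not set"]

    problems = []
    home_dir = Path(home)
    config = home_dir / "skill_config.yaml"
    if not home_dir.is_dir():
        problems.append(f"{home_dir} is not a directory")
    elif not config.is_file():
        problems.append(f"{config} is missing")
    return problems


if __name__ == "__main__":
    # Print one line per problem; print nothing when the setup looks fine.
    for problem in check_skill_home():
        print("setup problem:", problem)
```

The function takes the environment as a parameter so it can be exercised with a plain dictionary instead of the real process environment.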
In this example we train the fight skill.
First, train the agent with the following command:
python -m agent.polybeast.skill_transfer_polyhydra model=baseline env=mini_skill_fight use_lstm=false total_steps=1e7
Once the agent is trained, the final weights are automatically copied to SKILL_TRANSFER_HOME and renamed after the skill environment. In this example, the skill would be saved at:
${SKILL_TRANSFER_HOME}/mini_skill_fight.tar
If a file with this name already exists in SKILL_TRANSFER_HOME (i.e. if you've already trained an agent on this skill), then the new skill expert is saved as:
${SKILL_TRANSFER_HOME}/${ENV_NAME}_${CURRENT_TIME_SECONDS}.tar
Tasks that make use of skills will use the path given in the first example (i.e. without the current time). If you want to use your newer agent, you need to delete the old file and rename the new file to remove the time from it.
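The delete-and-rename step can be scripted. The sketch below is not part of the repository: the helper name `promote_latest` is hypothetical, and it assumes the timestamp suffix in the file name is a plain integer number of seconds.

```python
import os
import re
from pathlib import Path


def promote_latest(skill_home, env_name):
    """Replace <env_name>.tar with the newest <env_name>_<seconds>.tar, if any.

    Hypothetical helper, not part of SkillHack. Returns the path that was
    promoted, or None when no timestamped file exists for this skill.
    """
    home = Path(skill_home)
    # fullmatch keeps e.g. mini_skill_nav_lava from matching
    # mini_skill_nav_lava_to_amulet_<seconds>.tar.
    pattern = re.compile(re.escape(env_name) + r"_(\d+)\.tar")

    candidates = []
    for path in home.iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None

    _, newest = max(candidates, key=lambda item: item[0])
    # os.replace overwrites the destination, so the old expert is removed
    # and the new one renamed in a single step.
    os.replace(newest, home / f"{env_name}.tar")
    return newest
```

Older timestamped files for the same skill are left in place, so they can still be inspected or removed by hand.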
Repeat this for all skills. Remember that all skills need to be trained with use_lstm=false.
The full list of skills to be trained is:
- mini_skill_apply_frost_horn
- mini_skill_eat
- mini_skill_fight
- mini_skill_nav_blind
- mini_skill_nav_lava
- mini_skill_nav_lava_to_amulet
- mini_skill_nav_water
- mini_skill_pick_up
- mini_skill_put_on
- mini_skill_take_off
- mini_skill_throw
- mini_skill_unlock
- mini_skill_wear
- mini_skill_wield
- mini_skill_zap_cold
- mini_skill_zap_death
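To keep track of which experts still need training, the list above can be compared against the contents of SKILL_TRANSFER_HOME. This is a sketch under stated assumptions, not part of the repository: the helper names are hypothetical, and total_steps=1e7 is simply carried over from the fight example and may not suit every skill.

```python
import os
from pathlib import Path

# The 16 skill environments listed above.
SKILLS = [
    "mini_skill_apply_frost_horn", "mini_skill_eat", "mini_skill_fight",
    "mini_skill_nav_blind", "mini_skill_nav_lava",
    "mini_skill_nav_lava_to_amulet", "mini_skill_nav_water",
    "mini_skill_pick_up", "mini_skill_put_on", "mini_skill_take_off",
    "mini_skill_throw", "mini_skill_unlock", "mini_skill_wear",
    "mini_skill_wield", "mini_skill_zap_cold", "mini_skill_zap_death",
]


def missing_skills(skill_home):
    """Return the skills that have no <skill>.tar file in skill_home."""
    home = Path(skill_home)
    return [s for s in SKILLS if not (home / f"{s}.tar").is_file()]


def training_command(skill, total_steps="1e7"):
    """Build the training command from the fight example for another skill."""
    return [
        "python", "-m", "agent.polybeast.skill_transfer_polyhydra",
        "model=baseline", f"env={skill}", "use_lstm=false",
        f"total_steps={total_steps}",
    ]


if __name__ == "__main__":
    home = os.environ.get("SKILL_TRANSFER_HOME")
    if home:
        # Print the command for each expert that is not yet present.
        for skill in missing_skills(home):
            print(" ".join(training_command(skill)))
```

The script only prints the commands rather than launching them, since each training run is long and is usually scheduled separately.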
If the relevant skills for an environment are not present in SKILL_TRANSFER_HOME, an error will be shown indicating which skill is missing.
The skill-transfer-specific models are:
- foc: Options Framework
- ks: Kickstarting
- hks: Hierarchical Kickstarting
The tasks created for skill transfer are:
- mini_simple_seq: Battle
- mini_simple_union: Over or Around
- mini_simple_intersection: Prepare for Battle
- mini_simple_random: Target Practice
- mini_lc_freeze: Frozen Lava Cross
- mini_medusa: Medusa
- mini_mimic: Identify Mimic
- mini_seamonsters: Sea Monsters
So, for example, to run hierarchical kickstarting on the Target Practice environment, one would call:
python -m agent.polybeast.skill_transfer_polyhydra model=hks env=mini_simple_random
All other parameters can be set in the same way as with polyhydra.py.
The runs from the paper can be repeated with the following command. If you don't want to run with wandb, set wandb=false.
python -m agent.polybeast.skill_transfer_polyhydra --multirun model=ks,foc,hks,baseline env=mini_simple_seq,mini_simple_intersection,mini_simple_union,mini_simple_random,mini_lc_freeze,mini_medusa,mini_mimic,mini_seamonsters name=1,2,3,4,5,6,7,8,9,10,11,12 total_steps=2.5e8 group=<YOUR_WANDB_GROUP> hks_max_uniform_weight=20 hks_min_uniform_prop=0 train_with_all_skills=false ks_min_lambda_prop=0.05 hks_max_uniform_time=2e7 entity=<YOUR_WANDB_ENTITY> project=<YOUR_WANDB_PROJECT>
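The --multirun flag with comma-separated values looks like Hydra's sweep syntax. Assuming the default Hydra sweeper, which launches one job per combination of the swept values, the size of this sweep can be estimated with a short sketch:

```python
from itertools import product

# Swept values copied from the command above.
MODELS = ["ks", "foc", "hks", "baseline"]
ENVS = [
    "mini_simple_seq", "mini_simple_intersection", "mini_simple_union",
    "mini_simple_random", "mini_lc_freeze", "mini_medusa",
    "mini_mimic", "mini_seamonsters",
]
NAMES = [str(i) for i in range(1, 13)]

# One job per (model, env, name) combination.
jobs = list(product(MODELS, ENVS, NAMES))
print(len(jobs))  # 4 models * 8 tasks * 12 names = 384
```

Under that assumption the command launches 384 runs of 2.5e8 steps each, so trimming the model, env, or name lists is the easiest way to reproduce a subset.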
If you want to train with the fixed version of nav_blind, open data/tasks/tasks.json and replace mini_skill_nav_blind with mini_skill_nav_blind_fixed.
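The replacement can also be scripted. The sketch below is not part of the repository and makes two assumptions: the layout of tasks.json is not described here, so the helper walks whatever JSON structure it finds, and it assumes skill names appear as string values rather than as object keys. The helper names are hypothetical.

```python
import json


def swap_skill(node, old="mini_skill_nav_blind", new="mini_skill_nav_blind_fixed"):
    """Return a copy of parsed JSON with string values equal to `old` replaced.

    Only exact matches are replaced, so running it twice never produces
    mini_skill_nav_blind_fixed_fixed. Object keys are left unchanged.
    """
    if isinstance(node, dict):
        return {key: swap_skill(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [swap_skill(item, old, new) for item in node]
    return new if node == old else node


def patch_tasks_file(path="data/tasks/tasks.json"):
    """Rewrite the tasks file in place with the fixed nav_blind skill."""
    with open(path) as f:
        data = json.load(f)
    with open(path, "w") as f:
        json.dump(swap_skill(data), f, indent=2)
```

Matching whole strings instead of doing a text search-and-replace matters here because the new name contains the old one. Note that rewriting the file with json.dump may change its whitespace.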
If you make use of this code in your own work, please cite our paper:
@misc{matthews2022hierarchical,
url = {https://arxiv.org/abs/2207.11584},
author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rockt{\"a}schel, Tim},
title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}