SkillHack

Source code for Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning (CoLLAs 2022).

SkillHack is a repository for skill-based learning built on MiniHack and the NetHack Learning Environment (NLE). It consists of 16 simple skill acquisition environments and 8 complex task environments. The task environments are difficult to solve because of their large state-action space and sparse rewards, but they become tractable when knowledge gained in the simpler skill acquisition environments is transferred to them.

Installation

This repository depends on MiniHack (https://github.com/facebookresearch/minihack), which must be installed first.

How to run a Skill Transfer experiment

First, create a directory to store pretrained skills and set the environment variable SKILL_TRANSFER_HOME to point to it. This directory must also contain a file called skill_config.yaml; a default version of this file can be found in the same directory as this readme.

Next, populate this directory with pretrained skill experts. These are obtained by running skill_transfer_polyhydra.py on a skill-specific environment.
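The setup described above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the helper name check_skill_home is hypothetical, and it only verifies the two requirements stated here (the variable points to a directory, and that directory contains skill_config.yaml).

```python
import os
from pathlib import Path


def check_skill_home(env=os.environ):
    """Return the skill directory if it looks usable, otherwise raise an error.

    Hypothetical helper: it checks only what the README states.
    """
    home = env.get("SKILL_TRANSFER_HOME")
    if not home:
        raise RuntimeError("SKILL_TRANSFER_HOME is not set")
    home_dir = Path(home)
    if not home_dir.is_dir():
        raise RuntimeError(f"{home_dir} is not a directory")
    # The config file must sit directly inside the skill directory.
    if not (home_dir / "skill_config.yaml").is_file():
        raise RuntimeError(f"skill_config.yaml is missing from {home_dir}")
    return home_dir
```

Calling check_skill_home() with no arguments reads the real environment; passing a plain dict makes the check easy to try without touching your shell.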

Full Skill Expert Training Walkthrough

In this example we train the fight skill.

First, train the agent with the following command:

python -m agent.polybeast.skill_transfer_polyhydra model=baseline env=mini_skill_fight use_lstm=false total_steps=1e7

Once the agent is trained, the final weights are automatically copied to SKILL_TRANSFER_HOME and renamed after the skill environment. In this example, the skill is saved at

${SKILL_TRANSFER_HOME}/mini_skill_fight.tar

If a file with this name already exists in SKILL_TRANSFER_HOME (i.e. you have already trained an agent on this skill), the new skill expert is instead saved as

${SKILL_TRANSFER_HOME}/${ENV_NAME}_${CURRENT_TIME_SECONDS}.tar

Tasks that make use of skills load the path given in the first example (i.e. without the current time). To use your newer agent, delete the old file and rename the new file to remove the time from its name.
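The naming rule and the manual swap can be sketched as follows. This is an illustration of the behaviour described above, not the repository's own code: both function names are hypothetical, and the sketch assumes that CURRENT_TIME_SECONDS is a whole number of seconds.

```python
import time
from pathlib import Path


def expert_save_path(home, env_name, now=None):
    """Mirror the naming rule described in this section (illustrative only)."""
    home = Path(home)
    default = home / f"{env_name}.tar"
    if not default.exists():
        return default
    # Assumption: the time suffix is an integer number of seconds.
    stamp = int(time.time() if now is None else now)
    return home / f"{env_name}_{stamp}.tar"


def promote_expert(home, env_name, newer_file):
    """Make a newer, time-suffixed checkpoint the one that tasks will load."""
    default = Path(home) / f"{env_name}.tar"
    # Path.replace overwrites the destination, covering "delete, then rename".
    Path(newer_file).replace(default)
    return default
```

For example, promote_expert(home, "mini_skill_fight", newer_file) overwrites mini_skill_fight.tar with the newer checkpoint, so keep a copy of the old file if you may want it back.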

Training other skills

Repeat this process for every skill.

Remember that all skills must be trained with use_lstm=false.

The full list of skills to be trained is:

mini_skill_apply_frost_horn
mini_skill_eat
mini_skill_fight
mini_skill_nav_blind
mini_skill_nav_lava
mini_skill_nav_lava_to_amulet
mini_skill_nav_water
mini_skill_pick_up
mini_skill_put_on
mini_skill_take_off
mini_skill_throw
mini_skill_unlock
mini_skill_wear
mini_skill_wield
mini_skill_zap_cold
mini_skill_zap_death
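To avoid typing sixteen commands by hand, the list above can be turned into training commands. This is a minimal sketch: it assumes that every skill uses the same model=baseline and total_steps=1e7 settings as the fight walkthrough, which this readme does not state explicitly, and the function name is hypothetical.

```python
SKILLS = [
    "mini_skill_apply_frost_horn", "mini_skill_eat", "mini_skill_fight",
    "mini_skill_nav_blind", "mini_skill_nav_lava",
    "mini_skill_nav_lava_to_amulet", "mini_skill_nav_water",
    "mini_skill_pick_up", "mini_skill_put_on", "mini_skill_take_off",
    "mini_skill_throw", "mini_skill_unlock", "mini_skill_wear",
    "mini_skill_wield", "mini_skill_zap_cold", "mini_skill_zap_death",
]


def skill_training_commands(skills=SKILLS, total_steps="1e7"):
    """Build one training command per skill, always with use_lstm=false."""
    return [
        "python -m agent.polybeast.skill_transfer_polyhydra "
        # Assumption: settings copied from the fight walkthrough.
        f"model=baseline env={skill} use_lstm=false total_steps={total_steps}"
        for skill in skills
    ]


if __name__ == "__main__":
    for command in skill_training_commands():
        print(command)
```

The printed lines can be pasted into a shell or a job scheduler one at a time.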

Training on Tasks

If the skills required by a task environment are not present in SKILL_TRANSFER_HOME, an error is shown indicating which skill is missing. The models specific to skill transfer are

foc: Options Framework
ks: Kickstarting
hks: Hierarchical Kickstarting

The tasks created for skill transfer are

mini_simple_seq: Battle
mini_simple_union: Over or Around
mini_simple_intersection: Prepare for Battle
mini_simple_random: Target Practice
mini_lc_freeze: Frozen Lava Cross
mini_medusa: Medusa
mini_mimic: Identify Mimic
mini_seamonsters: Sea Monsters

So, for example, to run hierarchical kickstarting on the Target Practice environment, one would call

python -m agent.polybeast.skill_transfer_polyhydra model=hks env=mini_simple_random

All other parameters can be set in the same way as with polyhydra.py.

Reproducing the Paper's Runs

The runs from the paper can be repeated with the following command. If you don't want to run with wandb, set wandb=false.

python -m agent.polybeast.skill_transfer_polyhydra --multirun model=ks,foc,hks,baseline env=mini_simple_seq,mini_simple_intersection,mini_simple_union,mini_simple_random,mini_lc_freeze,mini_medusa,mini_mimic,mini_seamonsters name=1,2,3,4,5,6,7,8,9,10,11,12 total_steps=2.5e8 group=<YOUR_WANDB_GROUP> hks_max_uniform_weight=20 hks_min_uniform_prop=0 train_with_all_skills=false ks_min_lambda_prop=0.05 hks_max_uniform_time=2e7 entity=<YOUR_WANDB_ENTITY> project=<YOUR_WANDB_PROJECT>
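It helps to know how many runs this command launches before starting it. With Hydra's default sweeper, comma-separated values under --multirun are expanded into their Cartesian product. The sketch below enumerates that product for the three swept parameters; it assumes the repository does not configure a different sweeper, and the order in which Hydra actually schedules the jobs may differ.

```python
from itertools import product

MODELS = ["ks", "foc", "hks", "baseline"]
ENVS = [
    "mini_simple_seq", "mini_simple_intersection", "mini_simple_union",
    "mini_simple_random", "mini_lc_freeze", "mini_medusa",
    "mini_mimic", "mini_seamonsters",
]
NAMES = [str(i) for i in range(1, 13)]  # name=1,...,12 in the command


def sweep_jobs():
    """List every (model, env, name) combination covered by the sweep."""
    return [
        {"model": model, "env": env, "name": name}
        for model, env, name in product(MODELS, ENVS, NAMES)
    ]


if __name__ == "__main__":
    print(len(sweep_jobs()))  # 4 * 8 * 12 = 384
```

Each of the 384 runs is configured for total_steps=2.5e8, so trimming the model, env, or name lists is the simplest way to reproduce only part of the sweep.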

Final Notes

If you want to train with the fixed version of nav_blind, open data/tasks/tasks.json and replace mini_skill_nav_blind with mini_skill_nav_blind_fixed.
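The replacement can also be scripted. The layout of tasks.json is not described in this readme, so the sketch below works on the raw file text rather than on parsed JSON; the function name is hypothetical. It skips names that already end in _fixed, so running it twice does no harm.

```python
import re

# Match the skill name only when it is not already followed by "_fixed".
_NAV_BLIND = re.compile(r"mini_skill_nav_blind(?!_fixed)")


def use_fixed_nav_blind(tasks_text):
    """Return the text with every nav_blind skill swapped for the fixed one."""
    return _NAV_BLIND.sub("mini_skill_nav_blind_fixed", tasks_text)
```

To apply it, read data/tasks/tasks.json into a string, pass it through use_fixed_nav_blind, and write the result back, keeping a backup of the original file.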

Citation

If you make use of this code in your own work, please cite our paper:

@misc{matthews2022hierarchical,
  url = {https://arxiv.org/abs/2207.11584},
  author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rockt{\"a}schel, Tim},
  title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
