TL;DR: An online imitation learning algorithm that achieves better convergence behavior than Generative Adversarial Imitation Learning through a simple modification.
Implementation based on the paper Non-Adversarial Imitation Learning and its Connections to Adversarial Methods, Arenz & Neumann, 2020. The official implementation can be found here.
The training loop follows Discriminator Actor-Critic (DAC) and uses Soft Actor-Critic (SAC) as the reinforcement learning algorithm.
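For orientation, below is a minimal sketch of the discriminator update and the discriminator-derived reward that DAC-style loops plug into SAC. The class and function names, network sizes, and the GAIL/AIRL-style log-ratio reward are illustrative assumptions, not this repository's API; how NAIL's modification changes this construction is described in the paper and the linked note.

```python
# Sketch only: a DAC-style discriminator and reward (PyTorch).
# All names, sizes, and the reward form are assumptions, not the repo's code.
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """Scores (state, action) pairs: demonstration (label 1) vs. policy (label 0)."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))  # raw logits


def discriminator_loss(disc, demo_obs, demo_act, pol_obs, pol_act):
    """Binary cross-entropy on demonstrations vs. replay-buffer samples."""
    bce = nn.BCEWithLogitsLoss()
    demo_logits = disc(demo_obs, demo_act)
    pol_logits = disc(pol_obs, pol_act)
    return (bce(demo_logits, torch.ones_like(demo_logits))
            + bce(pol_logits, torch.zeros_like(pol_logits)))


def discriminator_reward(disc, obs, act):
    """GAIL/AIRL-style reward log D - log(1 - D), which equals the raw logit."""
    with torch.no_grad():
        return disc(obs, act).squeeze(-1)


# Example with random stand-in batches (batch 32, obs_dim 2, act_dim 1):
disc = Discriminator(obs_dim=2, act_dim=1)
loss = discriminator_loss(disc, torch.randn(32, 2), torch.randn(32, 1),
                          torch.randn(32, 2), torch.randn(32, 1))
reward = discriminator_reward(disc, torch.randn(32, 2), torch.randn(32, 1))
```

In a DAC-style loop, the SAC critic is then trained on this learned reward in place of the environment reward.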
For a short note on the algorithm and implementation see here.
We perform experiments in the discrete-action mountain car environment. The demonstration policy is obtained by discretizing the state space and computing the optimal soft value function in closed form. Only a critic is needed in the imitation learning algorithm.
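As an illustration of this demonstration step, here is a hedged sketch of tabular soft value iteration on a discretized MountainCar-v0. The bin counts, temperature, discount, and iteration budget are assumptions and the repository's script may differ; the dynamics follow the standard gym MountainCar equations.

```python
# Sketch only: closed-form (tabular) soft value iteration on a discretized
# MountainCar-v0. Grid sizes, temperature, and discount are assumptions.
import numpy as np

N_POS, N_VEL, N_ACT = 120, 120, 3
GAMMA, ALPHA = 0.99, 1.0                  # discount and entropy temperature
pos_grid = np.linspace(-1.2, 0.6, N_POS)
vel_grid = np.linspace(-0.07, 0.07, N_VEL)


def step(pos, vel, act):
    """Deterministic MountainCar-v0 dynamics; reward is -1 per step."""
    vel = np.clip(vel + (act - 1) * 0.001 - 0.0025 * np.cos(3 * pos), -0.07, 0.07)
    pos = np.clip(pos + vel, -1.2, 0.6)
    if pos <= -1.2 and vel < 0:
        vel = 0.0
    return pos, vel, pos >= 0.5


def to_idx(pos, vel):
    return np.abs(pos_grid - pos).argmin(), np.abs(vel_grid - vel).argmin()


# Tabulate the (deterministic) transitions on the grid.
next_idx = np.zeros((N_POS, N_VEL, N_ACT, 2), dtype=int)
terminal = np.zeros((N_POS, N_VEL, N_ACT), dtype=bool)
for i, p in enumerate(pos_grid):
    for j, v in enumerate(vel_grid):
        for a in range(N_ACT):
            p2, v2, done = step(p, v, a)
            next_idx[i, j, a] = to_idx(p2, v2)
            terminal[i, j, a] = done

# Soft value iteration: V(s) = alpha * log sum_a exp(Q(s, a) / alpha).
V = np.zeros((N_POS, N_VEL))
for _ in range(2000):
    V_next = V[next_idx[..., 0], next_idx[..., 1]] * (~terminal)
    Q = -1.0 + GAMMA * V_next             # -1 reward until the goal is reached
    V = ALPHA * np.log(np.exp(Q / ALPHA).sum(axis=-1))

# The soft-optimal demonstration policy is the softmax over the soft Q-values.
demo_policy = np.exp((Q - V[..., None]) / ALPHA)
```

Sampling actions from `demo_policy` at the discretized states then yields demonstration trajectories for the imitation learner.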
To generate demonstrations, run:
```
python ./scripts/create_demonstrations.py
```
To train the NAIL policy, run:
```
sh ./scripts/train_agent.sh
```
You can modify the `-algo` argument in the `.sh` file to train an AIL policy instead.
To test trained agents, run:
```
python ./scripts/test_agent.py --exp_name "your_experiment_name"
```