# Implementation of Non-Adversarial Imitation Learning (NAIL)

**TL;DR:** An online imitation learning algorithm with better convergence behavior than Generative Adversarial Imitation Learning (GAIL), obtained through a simple modification.
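In our reading of Arenz & Neumann (2020), the modification lives in the reward optimized by the soft-RL inner loop: AIL optimizes the discriminator log-ratio alone, while NAIL adds the previous policy's log-probability to it, which turns each iteration into the maximization of a lower bound rather than one step of a saddle-point game. Sketched as equations (our summary, not a quote from the paper):

$$
r_{\text{AIL}}(s, a) = \log D(s, a) - \log\big(1 - D(s, a)\big), \qquad
r_{\text{NAIL}}(s, a) = r_{\text{AIL}}(s, a) + \log \pi_{\text{old}}(a \mid s)
$$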

This implementation is based on the paper *Non-Adversarial Imitation Learning and its Connections to Adversarial Methods* (Arenz & Neumann, 2020). The official implementation can be found here.

The training loop follows Discriminator Actor-Critic (DAC) and uses Soft Actor-Critic (SAC) as the reinforcement learning algorithm.
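Below is a minimal sketch of the reward computation inside one DAC-style update, assuming a PyTorch discriminator trained with binary cross-entropy on expert versus replay-buffer samples. All names here (`Discriminator`, `ail_reward`, `nail_reward`) are illustrative, not the repo's actual API:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Outputs a logit; sigmoid(logit) = D(s, a)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        # For discrete actions (as in mountain car), act would be one-hot.
        return self.net(torch.cat([obs, act], dim=-1))

def discriminator_loss(disc, expert_obs, expert_act, agent_obs, agent_act):
    # Standard GAIL/DAC objective: expert labeled 1, agent labeled 0.
    bce = nn.functional.binary_cross_entropy_with_logits
    e_logits = disc(expert_obs, expert_act)
    a_logits = disc(agent_obs, agent_act)
    return (bce(e_logits, torch.ones_like(e_logits))
            + bce(a_logits, torch.zeros_like(a_logits)))

def ail_reward(disc, obs, act):
    # log D - log(1 - D) is exactly the pre-sigmoid logit.
    return disc(obs, act).detach()

def nail_reward(disc, obs, act, old_logp):
    # NAIL's modification (our reading of the paper): add log pi_old(a|s),
    # evaluated under a frozen snapshot of the previous policy.
    return ail_reward(disc, obs, act) + old_logp.detach()
```

SAC then treats `nail_reward` (or `ail_reward`) as the environment reward when fitting its critic.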

For a short note on the algorithm and implementation, see here.

## Experiment

We run experiments in the discrete-action mountain car environment. The demonstration policy is obtained by discretizing the state space and computing the optimal soft value function in closed form; since the soft-optimal policy follows directly from the Q-function, only a critic is needed in the imitation learning algorithm.
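As a reference for how such demonstrations can be computed, here is a minimal tabular soft value iteration sketch. It assumes a discretized transition tensor `P[s, a, s']` and reward table `R[s, a]` are already built; the names and the discretization are illustrative, and the repo's `create_demonstrations.py` may do this differently:

```python
import numpy as np
from scipy.special import logsumexp

def soft_value_iteration(P, R, gamma=0.99, alpha=1.0, iters=1000):
    """Tabular soft value iteration.

    P: transition tensor of shape (S, A, S'), rows summing to 1.
    R: reward table of shape (S, A).
    alpha: entropy temperature of the soft-RL objective.
    """
    S, A = R.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        # Soft value: V(s) = alpha * log sum_a exp(Q(s, a) / alpha).
        V = alpha * logsumexp(Q / alpha, axis=1)
        # Soft Bellman backup: Q(s, a) = R(s, a) + gamma * E_{s'}[V(s')].
        Q = R + gamma * P @ V
    V = alpha * logsumexp(Q / alpha, axis=1)
    # The soft-optimal policy is the Boltzmann distribution over Q.
    pi = np.exp((Q - V[:, None]) / alpha)
    return Q, pi
```

Rolling out `pi` on the discretized MDP would yield expert trajectories to use as demonstrations.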

*Figure: NAIL learning curve.*

*Figure: AIL learning curve.*

## Usage

To generate demonstrations, run:

```bash
python ./scripts/create_demonstrations.py
```

To train the NAIL policy, run:

```bash
sh ./scripts/train_agent.sh
```

You can modify the `-algo` argument in the `.sh` file to train the AIL policy instead.

To test trained agents, run:

```bash
python ./scripts/test_agent.py --exp_name "your_experiment_name"
```
