A simple and easy to use implementation of the soft actor-critic algorithm.

young-geng/SimpleSAC

SimpleSAC

A simple and modular implementation of the Soft Actor Critic algorithm in PyTorch.

Codebase Deprecated

This library is deprecated in favor of JaxCQL and CQL, which include up-to-date SAC and CQL implementations in JAX and PyTorch. This codebase will no longer receive updates or bug fixes.

Installation

  1. Install and use the included Anaconda environment
$ conda env create -f environment.yml
$ source activate SimpleSAC

You'll need to get your own MuJoCo key if you want to use MuJoCo.

  2. Add this repo directory to your PYTHONPATH environment variable.
export PYTHONPATH="$PYTHONPATH:$(pwd)"
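
To confirm that the PYTHONPATH change took effect, a quick check along these lines may help. This is a minimal sketch using only the standard library. The helper name `is_importable` is hypothetical and not part of SimpleSAC, and the package name is assumed to be `SimpleSAC`, as in the commands in this README.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    # find_spec returns None when a top-level package is not on sys.path.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the package directory is named SimpleSAC.
    print("SimpleSAC importable:", is_importable("SimpleSAC"))
```

If this reports `False`, the export was probably run from a directory other than the one that contains the SimpleSAC package.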

Run Experiments

You can run SAC experiments using the following command:

python -m SimpleSAC.sac_main \
    --env 'HalfCheetah-v2' \
    --logging.output_dir './experiment_output' \
    --device='cuda'

To run on CPU only, omit the --device='cuda' flag. All available command-line options are defined in SimpleSAC/sac_main.py and SimpleSAC/sac.py.
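
When launching many runs from a script, it can be convenient to assemble the same command programmatically. The following is a minimal sketch, not part of SimpleSAC: the helper `build_sac_command` is hypothetical, and it uses only the module name and flags shown in the command above.

```python
import shlex
import sys
from typing import List, Optional


def build_sac_command(
    env: str, output_dir: str, device: Optional[str] = None
) -> List[str]:
    """Assemble the argument list for a SimpleSAC training run."""
    cmd = [
        sys.executable, "-m", "SimpleSAC.sac_main",
        "--env", env,
        "--logging.output_dir", output_dir,
    ]
    # Leaving out --device mirrors the CPU-only usage described above.
    if device is not None:
        cmd.append(f"--device={device}")
    return cmd


if __name__ == "__main__":
    cmd = build_sac_command("HalfCheetah-v2", "./experiment_output", device="cuda")
    # Print a shell-quoted version for inspection. To launch the run,
    # pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when environment names or paths contain spaces.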

Visualize Experiments

You can visualize the experiment metrics with viskit:

python -m viskit './experiment_output'

and simply navigate to http://localhost:5000/

Weights and Biases Online Visualization Integration

This codebase can also log to the W&B online visualization platform. To log to W&B, first set the W&B API key environment variable:

export WANDB_API_KEY='YOUR W&B API KEY HERE'

Then you can run experiments with W&B logging turned on:

python -m SimpleSAC.sac_main \
    --env 'HalfCheetah-v2' \
    --logging.output_dir './experiment_output' \
    --device='cuda' \
    --logging.online
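
Before starting a long run with online logging, it may be worth checking that the API key is actually visible to the process. The following is a minimal sketch using only the standard library. The helper `wandb_key_is_set` is hypothetical and not part of SimpleSAC or the W&B client. It checks only the WANDB_API_KEY variable named above.

```python
import os
from typing import Mapping, Optional


def wandb_key_is_set(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if WANDB_API_KEY is present and not blank."""
    # Default to the real process environment; a mapping can be passed for testing.
    environ = os.environ if environ is None else environ
    return bool(environ.get("WANDB_API_KEY", "").strip())


if __name__ == "__main__":
    if wandb_key_is_set():
        print("WANDB_API_KEY is set.")
    else:
        print("WANDB_API_KEY is missing; export it before using --logging.online.")
```

This checks only that the variable is set, not that the key is valid.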

Credits

The project organization is inspired by TD3. The SAC implementation is based on rlkit. The viskit visualization is taken from viskit, which is taken from rllab.
