Here, we compare Q(\sigma) learning, as presented by Sutton and Barto in [1], with Tree-Backup, n-step Expected Sarsa, and n-step Sarsa.
The main notebook file with our analysis is: VarianceAnalysisMultistepBootstrapping.ipynb
All experiments that generated the graphs in the notebook are included here.
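For orientation, the relationship between the compared methods can be sketched with the one-step Q(\sigma) target: \sigma = 1 gives the sampled (Sarsa) backup, and \sigma = 0 gives the full expectation, which at one step coincides with Expected Sarsa and Tree-Backup. The function below is a hypothetical helper written for illustration, not code taken from the notebook; its name and arguments are assumptions.

```python
def q_sigma_target(reward, gamma, q_next, pi_next, next_action, sigma):
    """One-step Q(sigma) target (illustrative helper, not from the notebook).

    q_next      -- action values Q(S', a) for every action a in the next state
    pi_next     -- target-policy probabilities pi(a | S') for the same actions
    next_action -- index of the action A' actually sampled in S'
    sigma       -- 1.0 for a pure sample backup, 0.0 for a pure expectation
    """
    # Sarsa-style backup: value of the action that was actually taken.
    sampled = q_next[next_action]
    # Expected backup: average of the action values under the target policy.
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    # Q(sigma) mixes the two backups linearly.
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


if __name__ == "__main__":
    q_next = [1.0, 3.0]
    pi_next = [0.5, 0.5]
    for sigma in (1.0, 0.5, 0.0):
        target = q_sigma_target(1.0, 0.5, q_next, pi_next, 1, sigma)
        print(f"sigma={sigma}: target={target}")
```

Intermediate values of \sigma interpolate between sampling and taking the expectation, which is the trade-off the comparison in the notebook is concerned with.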
Authors:
Peter Henderson, Wei-Di Chang