Commit

Update practice.pytorch
Fix error in latex
kventinel authored Apr 16, 2018
1 parent 0ef75c6 commit 083193d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion week7_pomdp/practice_pytorch.ipynb
@@ -412,7 +412,7 @@
 "\n",
 "__One more thing:__ since we train on T-step rollouts, we can use N-step formula for advantage for free:\n",
 " * At the last step, $A(s_t,a_t) = r(s_t, a_t) + \\gamma \\cdot V(s_{t+1}) - V(s) $\n",
-" * One step earlier, $A(s_t,a_t) = r(s_t, a_t) + \\gamma \\cdot r(s_{t+1}, a_{t+1}) + \\gamma ^ 2 \\cdot V(s_{t+1}) - V(s) $\n",
+" * One step earlier, $A(s_t,a_t) = r(s_t, a_t) + \\gamma \\cdot r(s_{t+1}, a_{t+1}) + \\gamma ^ 2 \\cdot V(s_{t+2}) - V(s) $\n",
 " * Et cetera, et cetera. This way agent starts training much faster since it's estimate of A(s,a) depends less on his (imperfect) value function and more on actual rewards. There's also a [nice generalization](https://arxiv.org/abs/1506.02438) of this.\n",
 "\n",
 "\n",
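For reference, the corrected N-step formula can be checked numerically. The sketch below is not code from the notebook. It is a minimal pure-Python illustration that reads `V(s)` in the formulas as `V(s_t)` and assumes a rollout of T steps with a bootstrap value for the state after the final step. The function name `n_step_advantages` and its parameters are hypothetical.

```python
def n_step_advantages(rewards, values, last_value, gamma=0.99):
    """Compute N-step advantages over one rollout (illustrative sketch).

    rewards[t]  -- r(s_t, a_t) for t = 0..T-1
    values[t]   -- V(s_t) for t = 0..T-1
    last_value  -- V(s_T), the critic's estimate for the state after the rollout
    """
    advantages = [0.0] * len(rewards)
    # Start from the bootstrap value and accumulate discounted rewards backwards.
    ret = last_value
    for t in reversed(range(len(rewards))):
        # ret is now r_t + gamma * r_{t+1} + ... + gamma^(T-t) * V(s_T)
        ret = rewards[t] + gamma * ret
        advantages[t] = ret - values[t]
    return advantages
```

With `rewards=[1.0, 2.0]`, `values=[0.5, 1.5]`, `last_value=4.0` and `gamma=0.5`, the last step gives `2 + 0.5 * 4 - 1.5 = 2.5`, and the step before it gives `1 + 0.5 * 2 + 0.25 * 4 - 0.5 = 2.5`. The second result matches the fixed line, where the bootstrap term is `V(s_{t+2})` rather than `V(s_{t+1})`.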
