From 4d14bf4801d20a1ac6083e617e144eda20d413e9 Mon Sep 17 00:00:00 2001
From: Bartol Karuza
Date: Sat, 5 Dec 2020 00:30:27 +0000
Subject: [PATCH] Fix a typo/copy paste error (#415)

---
 docs/source/reinforce_learn.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/reinforce_learn.rst b/docs/source/reinforce_learn.rst
index 4737b60764..ccb6e06392 100644
--- a/docs/source/reinforce_learn.rst
+++ b/docs/source/reinforce_learn.rst
@@ -306,7 +306,7 @@ steadily increases till convergence.
     :width: 800
     :alt: Noisy DQN Result
 
-**DQN vs Dueling DQN: Pong**
+**DQN vs Noisy DQN: Pong**
 
 In comparison to the base DQN, the Noisy DQN is more stable and is able to converge on an optimal policy much faster
 than the original. It seems that the replacement of the epsilon-greedy strategy with network noise provides a better