Nature '20 | A distributional code for value in dopamine-based reinforcement learning. #14
Related References
Markov Decision Process
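As a quick, textbook-level reminder (standard definitions, not specific to this paper): in an MDP with discount factor $\gamma$, classical TD learning maintains a single expected-return estimate $V(s)$ and updates it with the reward prediction error

$$
\delta_t = r_t + \gamma\, V(s_{t+1}) - V(s_t), \qquad V(s_t) \leftarrow V(s_t) + \alpha\, \delta_t ,
$$

which is the scalar signal that the classical dopamine-RPE hypothesis identifies with dopamine firing; the distributional account replaces this single predictor with a whole family of predictors.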
Why does it work?
In the second point above, each optimistic RPE is assumed to be permanently coupled to its corresponding value predictor, which is actually quite an implausible assumption. Perhaps the other value predictors also project onto it, just with very small weights? Or perhaps the weighted combination of value predictors ends up having the same effect as the single corresponding predictor? (A rough sketch of this one-to-one coupling follows below.)
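A minimal numerical sketch of that one-to-one coupling for a single-step task (my own illustration under assumed learning rates and an assumed reward distribution, not code from the paper):

```python
import numpy as np

# Sketch of distributional TD as described in Dabney et al. 2020:
# each value predictor i has its own asymmetric learning rates
# (alpha_plus[i], alpha_minus[i]) and is updated ONLY by the RPE computed
# against its own prediction -- the one-to-one coupling questioned above.

rng = np.random.default_rng(0)

n_channels = 10
alpha_plus = np.linspace(0.02, 0.18, n_channels)    # pessimistic -> optimistic
alpha_minus = alpha_plus[::-1]
V = np.zeros(n_channels)                             # one value predictor per channel

def sample_reward():
    # hypothetical bimodal reward distribution, purely for illustration
    return rng.normal(1.0, 0.2) if rng.random() < 0.3 else rng.normal(-0.5, 0.2)

for _ in range(50_000):
    r = sample_reward()
    rpe = r - V                                      # channel-specific RPEs (gamma = 0 here)
    V += np.where(rpe > 0, alpha_plus, alpha_minus) * rpe   # each channel uses only its own RPE

print(np.round(V, 3))   # roughly the expectiles of the reward distribution, in order
```

Each channel settles near the expectile determined by its asymmetry $\alpha^+/(\alpha^+ + \alpha^-)$; the question above is whether the update really has to use only a channel's own RPE, or whether adding small weighted projections from the other channels would leave the fixed points effectively unchanged.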
Of course, all of this first takes one assumption for granted: that the firing rate of dopamine neurons in the VTA corresponds to the Reward Prediction Error (RPE). This assumption still needs further testing; Gershman et al. 2020 proposed a unified experiment for checking whether dopamine firing represents RPE, and in that experiment dopamine in some sub-regions of the VTA did still represent RPE.
As for how much benefit the brain can actually derive from such a distributional representation, that is also a question worth exploring; the paper takes this question up in its discussion.
Other Population Code Schemes
As for Parametric Codes, I am personally not particularly convinced. The candidate distributions are all hand-crafted, and the scheme assumes that an entire brain region encodes a single parameter, that this parameter is then passed to downstream regions for further computation, and that those downstream regions know what the parameter means within the corresponding distribution. That said, #13 did model the distribution with an Approximate Kalman Filter, without worrying about biological plausibility. (A toy illustration of a parametric update is sketched below.)
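For contrast, a toy illustration (my own, assuming a Gaussian parameterization; not the model actually used in #13 or in the paper) of what a parametric code amounts to: the population carries only two numbers, a mean and a variance, updated by a scalar Kalman-filter step, and a downstream region has to know that those two numbers parameterize a Gaussian belief over value.

```python
# Toy parametric code: the "population" encodes just (mu, var) of a Gaussian
# belief over value, updated with a scalar Kalman-filter step. The noise
# variances below are arbitrary illustrative choices.

def kalman_value_update(mu, var, reward, obs_noise_var=1.0, drift_var=0.01):
    """Update the Gaussian belief N(mu, var) over value after observing one reward."""
    var = var + drift_var                 # belief diffuses between trials
    gain = var / (var + obs_noise_var)    # Kalman gain
    mu = mu + gain * (reward - mu)        # prediction-error-driven mean update
    var = (1.0 - gain) * var              # posterior variance shrinks
    return mu, var

mu, var = 0.0, 1.0
for r in [1.0, 0.8, 1.2, 0.9]:
    mu, var = kalman_value_update(mu, var, r)
print(round(mu, 3), round(var, 3))
```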
Dabney W, Kurth-Nelson Z, Uchida N, et al. A distributional code for value in dopamine-based reinforcement learning. Nature, 2020.