According to Appendix B.2.1 of the PER paper (http://arxiv.org/abs/1511.05952), the original PER implementation uses stratified sampling:
To sample a minibatch of size k, the range [0, p_total] is divided equally into k ranges. Next, a value is uniformly sampled from each range. Finally the transitions that correspond to each of these sampled values are retrieved from the tree.
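The quoted procedure can be sketched roughly as follows. This is only an illustration: it assumes NumPy and replaces the sum tree with a plain cumulative-sum array, and the function name `stratified_sample` is hypothetical, not taken from the paper or from PFRL.

```python
import numpy as np


def stratified_sample(priorities, k, rng):
    """Draw k indices, one from each of k equal-width strata of [0, p_total)."""
    priorities = np.asarray(priorities, dtype=np.float64)
    cumsum = np.cumsum(priorities)
    p_total = cumsum[-1]
    # Stratum i covers [i * p_total / k, (i + 1) * p_total / k).
    edges = np.linspace(0.0, p_total, k + 1)
    values = rng.uniform(edges[:-1], edges[1:])
    # Transition i owns the interval [cumsum[i - 1], cumsum[i]).
    indices = np.searchsorted(cumsum, values, side="right")
    # Guard against a value landing exactly on p_total due to rounding.
    return np.minimum(indices, len(priorities) - 1)


rng = np.random.default_rng(0)
# Priorities 1, 1, 6 give p_total = 8, so with k = 4 each stratum has width 2.
# The last three strata lie entirely inside transition 2's interval [2, 8).
print(stratified_sample([1.0, 1.0, 6.0], k=4, rng=rng))
```

Note that in this sketch a transition whose priority spans more than one stratum is necessarily returned more than once, so stratified sampling by itself does not rule out duplicates within a minibatch.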
This differs from what PFRL's PrioritizedReplayBuffer currently does, which is to sample proportionally to priority, without replacement, k times.
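For comparison, proportional sampling without replacement can be sketched as below. This is a simplified illustration assuming NumPy and a flat priority array; it is not the code from PFRL's prioritized.py, and the name `sample_without_replacement` is hypothetical.

```python
import numpy as np


def sample_without_replacement(priorities, k, rng):
    """Draw k distinct indices, each proportionally to the remaining priorities."""
    remaining = np.array(priorities, dtype=np.float64)  # copy; the input is not modified
    if k > np.count_nonzero(remaining):
        raise ValueError("k exceeds the number of transitions with nonzero priority")
    chosen = []
    for _ in range(k):
        probs = remaining / remaining.sum()
        i = int(rng.choice(len(remaining), p=probs))
        chosen.append(i)
        # Zero the priority so this transition cannot be drawn again.
        remaining[i] = 0.0
    return chosen


rng = np.random.default_rng(0)
# With three transitions and k = 3, every transition appears exactly once.
print(sorted(sample_without_replacement([1.0, 1.0, 6.0], k=3, rng=rng)))
```

Because each drawn transition is excluded from later draws, the returned indices are always distinct, which is the property the issue attributes to PFRL's current behaviour.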
It is not clear whether stratified sampling leads to better performance. In a sense, PFRL's approach could be better, since it strictly prevents a minibatch from containing duplicate transitions. However, the difference should be noted, and it would be good to support and evaluate stratified sampling as well.
The current sampling code referred to above: pfrl/pfrl/collections/prioritized.py, lines 262 to 268 in 322fa45.