Motivation
Since in the original PER paper the parameter beta changes its value during training, it would be desirable to be able to perform similar experiments with torch-rl's Prioritized Experience Replay.
Solution
Being able to manually change beta or both alpha and beta by calling a method on a replay buffer or modifying its properties.
Alternatives
OpenAI baselines' implementation does it by providing an additional parameter beta during sampling (while alpha is fixed). It would also be possible to implement this with a scheduler (similar to lr schedules, like in torch.optim).
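For illustration, here is a minimal sketch of the first alternative, assuming a buffer whose sample method accepts a beta keyword (the buffer class and the signature are hypothetical and only mirror the baselines design; this is not torch-rl's actual API):

```python
# Sketch only: `PrioritizedReplayBuffer` and `sample(batch_size, beta=...)` are
# assumed for illustration, following the OpenAI baselines design where alpha is
# fixed at construction and beta is supplied on every sample call.
buffer = PrioritizedReplayBuffer(capacity=100_000, alpha=0.6)

beta_start, beta_end, total_steps = 0.4, 1.0, 1_000_000

for step in range(total_steps):
    # ... collect a transition and add it to the buffer ...

    # Linear annealing of beta from beta_start to beta_end, as in the PER paper.
    frac = min(step / total_steps, 1.0)
    beta = beta_start + frac * (beta_end - beta_start)

    batch, weights, indices = buffer.sample(batch_size=256, beta=beta)
    # ... compute TD errors and update priorities for `indices` ...
```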
Checklist
I have checked that there is no similar issue in the repo (required)
Thanks for the suggestion.
I wonder if we should blend this into the sampler or create a separate scheduler class that handles it?
The first option would just be "give me an init and final value, and I will do the decay", whereas the second option is closer to the schedulers in PyTorch's optim module (which let you do any scheduling you want).
Of course, the first step would just be to do a linear annealing module like in the paper.
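A minimal sketch of what that first option could look like, assuming a small helper object owned by the sampler (the class name and attributes are illustrative, not an existing torch-rl class):

```python
class LinearBetaAnnealing:
    """Linearly anneals beta from `beta_init` to `beta_final` over `num_steps`
    calls, as in the PER paper. Illustrative sketch only; not part of torch-rl."""

    def __init__(self, beta_init=0.4, beta_final=1.0, num_steps=1_000_000):
        self.beta_init = beta_init
        self.beta_final = beta_final
        self.num_steps = num_steps
        self._step = 0

    def step(self):
        # Advance one step and return the beta to use for the importance weights.
        frac = min(self._step / self.num_steps, 1.0)
        self._step += 1
        return self.beta_init + frac * (self.beta_final - self.beta_init)
```

A sampler could then call step() once per sample to obtain the beta used when computing the importance-sampling weights, while the second option would leave that call (and the annealing policy) to the user, much like torch.optim lr schedulers.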
Sorry for being inactive, I have been busy with my master's thesis. The more I think about this, the more I lean towards making beta just an argument of the sampling method, because this parameter is only used for calculating the importance weights, which happens only after sampling. Since PyTorch is primarily a tool for machine learning, I believe the PER buffer should give the user as much freedom in adapting the parameters as possible. This approach satisfies that: the user can adapt the beta parameter at any given time, and it is the user's responsibility to provide the scheduling, which can be done in any way the user wants.
Alpha scheduling is another story, though. I don't think I've ever come across a paper that discusses alpha scheduling. I'm not yet familiar with the PER implementation, but I wouldn't be surprised if alpha scheduling were actually impossible or meaningless, since new priorities would undermine the sense of the previous ones.