-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Doc] More depth in VMAS docs #1802
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1802
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (6 Unrelated Failures)As of commit 847cc13 with merge base b632be9 (): FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for this! I left some comments
continuous_actions (bool, optional): Whether to use continuous actions. Defaults to ``True``. If ``False``, actions | ||
will be discrete. The number of actions and their size will depend on the scenario chosen. | ||
See the VMAS repositiory for more info. | ||
max_steps (int, optional): Horizon of the task. Defaults to ``None`` (infinite horizon). Each VMAS scenario can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make sure: even when rewards can be consumed there is no max_steps predefined right? All envs are not truncated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, max_steps is additional to the (eventually implemented) scenario "done" function
torchrl/envs/libs/vmas.py
Outdated
will be discrete. The number of actions and their size will depend on the scenario chosen. | ||
See the VMAS repositiory for more info. | ||
max_steps (int, optional): Horizon of the task. Defaults to ``None`` (infinite horizon). Each VMAS scenario can | ||
implement a ``done`` function that will define when the scenario is terminated. If ``max_steps`` is specified, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does it mean that an env can define a done funtion? Some envs are never done? Can I choose if they have a done function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, each scenario optionally implements "done" function. Some scenarios do not present that yes. Those scenarios would be eventually terminated only if you use max_steps.
You can choose to implement the scenario done function if you implement a scenario yourself, otherwise it is a property of the scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it
What is the done
function in practice?
How does it relate to the "done"
key?
I'm asking this to clarify if it's required to mention a function that we do not see appearing anywhere or if we can just say that env can be non-terminating unless max_steps is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also pro to not mention it. I just did cause I did not know how much depth is required here.
we can definitely just say what you proposed. Seems better to me to.
We can say something like: some scnearios are terminating and some not. In all cases max_steps provides an additional termination condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think users need to know the inner machinery but they should know precisely how changing a kwarg will impact what they get from the environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
Co-authored-by: Vincent Moens <[email protected]>
Co-authored-by: Vincent Moens <[email protected]>
Co-authored-by: Vincent Moens <[email protected]>
Co-authored-by: Vincent Moens <[email protected]>
torchrl/envs/libs/vmas.py
Outdated
max_steps (int, optional): Horizon of the task. Defaults to ``None`` (infinite horizon). Each VMAS scenario can | ||
implement a ``done`` function that will define when the scenario is terminated. If ``max_steps`` is specified, | ||
the scenario will also be terminated after this horizon has been reached. If instead of terminating the scenario | ||
you wish to truncate it, please use a :class:`~torchrl.envs.transforms.StepCounter` transform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think the current formulation is confusing, because even if we don't set the truncated
key this is still technically a truncation. What about
Unlike gym's `TimeLimit` transform or torchrl's :class:`~torchrl.envs.transforms.StepCounter`, this argument will not set a `"truncated"` entry in the tensordict.
Co-authored-by: Vincent Moens <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.