Transforms documentation clean-up #3200
Conversation
Thanks for the PR @voldemortX
Does the Sequence type represent only list and tuple in torchvision? I find that in Python other types (e.g. strings) can also count as a Sequence. Maybe we should clarify that in the docs, e.g. just say list or tuple.
Yes, a string is also a Sequence. I'd say here it is implicitly assumed that a string input, for example size="abc", is completely wrong... also isinstance(size, Sequence) is a bit shorter than isinstance(size, (list, tuple)).
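For context, a quick illustration of the point (my own snippet, not from the PR): a plain str passes an isinstance(size, Sequence) check, while the stricter (list, tuple) check rejects it.

```python
from collections.abc import Sequence

size = "abc"
print(isinstance(size, Sequence))       # True: str is a Sequence, so "abc" slips through
print(isinstance(size, (list, tuple)))  # False: the stricter check rejects strings
```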
Maybe we could delete the private function docs
I'd say yes
To add to the reply of @vfdev-5:
Yes, it's only supported in PIL when the image is uint8. In this case it's important to clarify to users that the tensor should be of data type uint8. Currently an image can be RGB but represented as floats. If that's the case, the transform won't work.
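To make the dtype caveat concrete, a minimal sketch (my own example, assuming convert_image_dtype and a uint8-only transform such as equalize): a float RGB tensor has to be converted to uint8 before such an operation works.

```python
import torch
from torchvision.transforms import functional as F

float_img = torch.rand(3, 32, 32)                          # RGB image stored as floats in [0, 1]
uint8_img = F.convert_image_dtype(float_img, torch.uint8)  # rescale to [0, 255] and cast to uint8
out = F.equalize(uint8_img)                                # works on uint8; the float tensor would be rejected
print(out.dtype)                                           # torch.uint8
```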
Not necessary, because supporting a transform in both backends is our default. We only call it out when something is not supported in one of the two backends.
Maybe we should remove that "supported in both backends" statement from the other docs as well?
Does that need to be unified in the docs, i.e. use sequence everywhere, or int and tuple everywhere? @vfdev-5
Yes, in the docs it is nice to unify the type description. In the code this may not be possible due to torchscript...
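To illustrate the TorchScript constraint mentioned here (a hypothetical toy function, not torchvision code): scripted functions need concrete container annotations such as List[int], so the code annotations can't always mirror a "sequence" wording in the docs.

```python
from typing import List

import torch


@torch.jit.script
def double_padding(padding: List[int]) -> List[int]:
    # TorchScript accepts List[int]; typing.Sequence is not among its supported
    # annotations, which is why the docstrings and the annotations can diverge.
    return [p * 2 for p in padding]


print(double_padding([1, 2, 3, 4]))  # [2, 4, 6, 8]
```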
Thanks a lot for the PR. These are much needed changes that hopefully will make the documentation clearer. :)
I left a couple of comments. Let me know what you think.
```diff
@@ -678,7 +678,7 @@ def pad(img: Tensor, padding: List[int], fill: int = 0, padding_mode: str = "con
         this is the padding for the left, top, right and bottom borders
         respectively. In torchscript mode padding as single int is not supported, use a tuple or
         list of length 1: ``[padding, ]``.
-    fill (int): Pixel fill value for constant fill. Default is 0.
+    fill (int, float): Pixel fill value for constant fill. Default is 0.
```
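As a usage sketch of the documented behaviour (example values are mine): padding given as a length-1 list pads all four borders, and fill takes a plain number.

```python
import torch
from torchvision.transforms import functional as F

img = torch.zeros(3, 8, 8, dtype=torch.uint8)
padded = F.pad(img, padding=[2], fill=0, padding_mode="constant")  # length-1 list, as the docstring suggests
print(padded.shape)  # torch.Size([3, 12, 12]) - 2 pixels added on each border
```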
As discussed in comments, let's go ahead and remove all docs from functional_tensor.py
and torchvision/transforms/functional_pil.py
torchvision/transforms/transforms.py (Outdated)
```diff
@@ -1707,6 +1713,7 @@ class RandomInvert(torch.nn.Module):
     The image can be a PIL Image or a torch Tensor, in which case it is expected
     to have [..., H, W] shape, where ... means an arbitrary number of leading
     dimensions.
+    For PIL images, only mode "L" and "RGB" is supported.
```
I think that's true for both PIL and Tensor.
Let me check if my understanding is correct: it should be that all methods that use _lut() support only L and RGB for PIL, and 1- or 3-channel images for tensors?
Both require that they are L or RGB. See the entry-point invert() functional method:
vision/torchvision/transforms/functional.py, lines 1177 to 1178 in 90645cc:

```python
def invert(img: Tensor) -> Tensor:
    """Invert the colors of an RGB/grayscale PIL Image or torch Tensor.
```
Should I write the related docs (invert, solarize, etc.) like this, for example:
"""Invert the colors of an RGB/grayscale image.
Args:
img (PIL Image or Tensor): L or RGB image to have its colors inverted.
If img is a Tensor, it is expected to be in [..., 1 or 3, H, W] format,
where ... means it can have an arbitrary number of leading dimensions.
Returns:
PIL Image or Tensor: Color inverted image.
"""
I mean, tensors don't really have the mode attribute.
EDIT: All concerned functions include adjust_brightness(), adjust_sharpness(), adjust_gamma(), invert(), posterize(), solarize(), autocontrast() and equalize(). I'm not sure whether they should all be documented as applying only to RGB or L images?
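For reference, a small sketch of the tensor-side constraint being discussed (my own example): these colour transforms accept 1- or 3-channel tensor images, mirroring the L/RGB restriction on the PIL side.

```python
import torch
from torchvision.transforms import functional as F

rgb = torch.randint(0, 256, (3, 4, 4), dtype=torch.uint8)   # "RGB": 3-channel tensor image
gray = torch.randint(0, 256, (1, 4, 4), dtype=torch.uint8)  # "L": 1-channel tensor image

print(F.invert(rgb).shape, F.invert(gray).shape)  # both accepted: 3 and 1 channels
# A 2-channel tensor such as torch.rand(2, 4, 4) is rejected by these transforms.
```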
Sounds great to me. I saw you've already done this in many places.
I agree. Torchscript might give you headaches. Perhaps ensure that our doc is consistent everywhere using Sequences in this PR, and follow up with the proposed changes in another one if TorchScript plays nice?
@voldemortX Apologies if I was not very clear. It's not our intention to mark the specific files with underscores at this point.
A very bad side-effect of doing so is that we lose the entire git-blame history, so such changes need to be justified. Please revert the name change and I'll check your PR again once it's done.
Sorry... Should I revert all commits after 08fba9b?
No worries mate. :) You can just undo the filename change in a new commit, or revert the specific commit, or however else you prefer. When we merge the PRs, the history is squashed, so there will be no side-effects on master.
@datumbox Is this new commit ok?
@voldemortX Awesome work! Thanks a lot for spending the time not just removing the duplicate docs, but finding out what is currently supported and what's not. This is a major improvement on our documentation.
After seeing issue #3206, we might need to also clarify the doc for erase(). I think it does support batch input (although with the same randomness across the whole batch), right?
Yes it does, but the input needs to be a tensor, not a PIL image.
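A small illustration of that point (my own example, values arbitrary): RandomErasing applied to a batched tensor erases the same randomly chosen region in every image of the batch.

```python
import torch
from torchvision import transforms

batch = torch.rand(8, 3, 32, 32)          # batched tensor input; PIL images are not accepted here
eraser = transforms.RandomErasing(p=1.0)  # p=1.0 so the erase is always applied
out = eraser(batch)
print(out.shape)  # torch.Size([8, 3, 32, 32]); one region, shared across the whole batch
```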
Summary:
* Initial doc clean-up
* Remove all private docs
* Rename files
* Highlight backend inconsistencies
* Sequence and number
* [Need checking] AutoAugment related doc change
* Revert name changes

Reviewed By: datumbox

Differential Revision: D25954563

fbshipit-source-id: 3b73d924ec4e23d58416a8d38b554b4e16e64059
A documentation clean-up that concerns transforms.py, functional.py, functional_pil.py and functional_tensor.py, for issue #3071.

@datumbox After some initial checking, there are some things I'm not certain about:

Does the Sequence type represent only list and tuple in torchvision? I find that in Python other types (e.g. strings) can also count as a Sequence. Maybe we should clarify that in the docs, e.g. just say list or tuple.

And there are some other things I found after cleaning the docs up; maybe they can be addressed in the future: