
Transforms documentation clean-up #3200

Merged: 8 commits, Dec 23, 2020

Conversation

voldemortX
Contributor

@voldemortX voldemortX commented Dec 22, 2020

A documentation clean-up concerning transforms.py, functional.py, functional_pil.py, and functional_tensor.py, for issue #3071.
@datumbox After some initial checking, there are some things I'm not certain about:

  1. Does the Sequence type represent only list and tuple in torchvision? I find that in Python other types (e.g. strings) also count as a Sequence. Maybe we should clear this up in the docs, e.g. just say list or tuple.
  2. Does the Posterize operation also support only uint8 images in PIL? I'm inclined to add "only supports L or RGB" to all related transforms, as I did with invert(). But maybe we can just let PIL throw the OSError.
  3. Should we also add "This transform supports both PIL and Tensor, ..." to the documentation for AutoAugment?
  4. Maybe we could delete the private function docs, as mentioned by @vfdev-5 here.

And there are some other things I found after cleaning up the docs; maybe they can be addressed in the future:

  1. In Pad (and RandomCrop, which also uses padding), it seems tensors only support a single fill value. Also, I'm not sure whether the str fill is really needed by PIL; what is the use case here?
  2. Some type checking also uses Sequence; maybe we should check explicitly for tuple and list. Or is this the intended implementation?

Collaborator

@vfdev-5 vfdev-5 left a comment


Thanks for the PR @voldemortX

Does the Sequence type represent only list and tuple in torchvision? I find that in Python other types (e.g. strings) also count as a Sequence. Maybe we should clear this up in the docs, e.g. just say list or tuple.

Yes, a string is also a sequence. I'd say it is implicitly assumed here that a string input, for example size="abc", is simply wrong... Also, isinstance(size, Sequence) is a bit shorter than isinstance(size, (list, tuple)).
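The string case can be checked directly with collections.abc.Sequence from the standard library; a plain-Python illustration, independent of torchvision internals:

```python
from collections.abc import Sequence

# A str is a Sequence, so isinstance(size, Sequence) alone does not
# reject accidental string input such as size="abc".
print(isinstance("abc", Sequence))       # True
print(isinstance([224, 224], Sequence))  # True
print(isinstance((224, 224), Sequence))  # True
print(isinstance(5, Sequence))           # False

# The explicit check rejects strings:
print(isinstance("abc", (list, tuple)))  # False
```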

Maybe we could delete the private function docs

I'd say yes

torchvision/transforms/functional.py (resolved review thread)
@datumbox
Contributor

datumbox commented Dec 22, 2020

To add to @vfdev-5's reply:

  1. Does the Posterize operation also support only uint8 images in PIL? I'm inclined to add "only supports L or RGB" to all related transforms, as I did with invert(). But maybe we can just let PIL throw the OSError.

Yes, it's only supported in PIL when the image is uint8. In this case it's important to clarify to users that the tensor should be of data type uint8. Currently an image can be RGB but represented as floats; in that case, the transform won't work.
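For intuition, posterize keeps only the top `bits` bits of each 8-bit channel value, which is why the operation is only well defined for uint8 data. A minimal pure-Python sketch of that masking (the function name `posterize_value` is illustrative, not torchvision's API):

```python
def posterize_value(value: int, bits: int) -> int:
    """Keep only the top `bits` bits of an 8-bit channel value.

    Illustrative sketch of the masking posterize performs per pixel;
    a float-valued image has no well-defined result under this mask.
    """
    if not 0 <= value <= 255:
        raise ValueError("expected a uint8 channel value in [0, 255]")
    # e.g. bits=2 -> mask 0b11000000: only the two highest bits survive
    mask = 0xFF & ~(2 ** (8 - bits) - 1)
    return value & mask

print(posterize_value(200, 2))  # 192 (0b11001000 & 0b11000000)
```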

  1. Should we also add "This transform supports both PIL and Tensor, ..." to the documentation for AutoAugment?

Not necessary, because supporting a transform in both backends is our default. We only specify when something is not supported in one of the two backends.

@voldemortX
Contributor Author

Not necessary, because supporting a transform in both backends is our default. We only specify when something is not supported in one of the two backends.

Maybe we should remove that "supports both backends" statement from the other docs as well?

@voldemortX
Contributor Author

Yes, a string is also a sequence. I'd say it is implicitly assumed here that a string input, for example size="abc", is simply wrong... Also, isinstance(size, Sequence) is a bit shorter than isinstance(size, (list, tuple)).

Does that need to be unified in the docs, i.e. use "sequence" everywhere or "int and tuple" everywhere? @vfdev-5

@vfdev-5
Collaborator

vfdev-5 commented Dec 22, 2020

Yes, a string is also a sequence. I'd say it is implicitly assumed here that a string input, for example size="abc", is simply wrong... Also, isinstance(size, Sequence) is a bit shorter than isinstance(size, (list, tuple)).

Does that need to be unified in the docs, i.e. use "sequence" everywhere or "int and tuple" everywhere? @vfdev-5

Yes, in the docs it is nice to unify the type descriptions. In the code this may not be possible due to torchscript...
I'd say most of the time the input type is "sequence or number", as it may be int or float or list or tuple...

Contributor

@datumbox datumbox left a comment


Thanks a lot for the PR. These are much needed changes that hopefully will make the documentation clearer. :)

I left a couple of comments. Let me know what you think.

torchvision/transforms/functional.py (resolved review thread)
@@ -678,7 +678,7 @@ def pad(img: Tensor, padding: List[int], fill: int = 0, padding_mode: str = "con
this is the padding for the left, top, right and bottom borders
respectively. In torchscript mode padding as single int is not supported, use a tuple or
list of length 1: ``[padding, ]``.
fill (int): Pixel fill value for constant fill. Default is 0.
fill (int, float): Pixel fill value for constant fill. Default is 0.
Contributor


As discussed in comments, let's go ahead and remove all docs from functional_tensor.py and torchvision/transforms/functional_pil.py

torchvision/transforms/transforms.py (three resolved review threads)
@@ -1707,6 +1713,7 @@ class RandomInvert(torch.nn.Module):
The image can be a PIL Image or a torch Tensor, in which case it is expected
to have [..., H, W] shape, where ... means an arbitrary number of leading
dimensions.
For PIL images, only mode "L" and "RGB" is supported.
Contributor


I think that's true for both PIL and Tensor.

Contributor Author

@voldemortX voldemortX Dec 22, 2020


Let me check if my understanding is correct: it should be that all methods that use _lut() support only L and RGB for PIL, and 1- or 3-channel images for tensors?

Contributor


Both require that they are L or RGB. See the docstring of the invert() functional method:

def invert(img: Tensor) -> Tensor:
"""Invert the colors of an RGB/grayscale PIL Image or torch Tensor.

Contributor Author

@voldemortX voldemortX Dec 22, 2020


Should I specify the related docs (invert, solarize, etc.) like, for example:

"""Invert the colors of an RGB/grayscale image.

Args:
    img (PIL Image or Tensor): L or RGB image to have its colors inverted.
        If img is a Tensor, it is expected to be in [..., 1 or 3, H, W] format,
        where ... means it can have an arbitrary number of leading dimensions.

Returns:
    PIL Image or Tensor: Color inverted image.

"""

I mean, tensors don't really have a mode attribute.

EDIT:
The concerned functions include adjust_brightness(), adjust_sharpness(), adjust_gamma(), invert(), posterize(), solarize(), autocontrast(), equalize(). I'm not sure whether they should all be restricted to RGB or L?

@datumbox
Contributor

datumbox commented Dec 22, 2020

Maybe we should remove that "supports both backends" statement from the other docs as well?

Sounds great to me. I saw you've already done this in many places.

Yes, in the docs it is nice to unify the type descriptions. In the code this may not be possible due to torchscript...

I agree. TorchScript might give you headaches. Perhaps ensure that our docs use "sequence" consistently everywhere in this PR, and follow up with the proposed code changes in another one if TorchScript plays nice?

@voldemortX
Contributor Author

I made some further changes based on the above reviews. Let me know if I forgot something or got something wrong!
@vfdev-5 @datumbox

Also, possible TODOs for future PRs:

  1. Try checking only for number and sequence in the code.
  2. Try to unify tensor and PIL Image behavior in pad().

Contributor

@datumbox datumbox left a comment


@voldemortX Apologies if I was not very clear. It's not our intention to mark the specific files with underscores at this point.

A very bad side-effect of doing so is that we lose the entire git-blame history, so such changes need to be justified. Please revert the name change and I'll review your PR again once it's done.

@voldemortX
Contributor Author

voldemortX commented Dec 22, 2020

@voldemortX Apologies if I was not very clear. It's not our intention to mark the specific files with underscores at this point.

A very bad side-effect of doing so is that we lose the entire git-blame history, so such changes need to be justified. Please revert the name change and I'll review your PR again once it's done.

Sorry... Should I revert all commits after 08fba9b?

@datumbox
Contributor

Sorry... Should I revert all commits after 08fba9b?

No worries mate. :) You can just undo the filename change in a new commit, or revert the specific commit, or however else you prefer. When we merge PRs the history is squashed, so there will be no side-effects on master.

@voldemortX
Contributor Author

@datumbox Is this new commit ok?

Contributor

@datumbox datumbox left a comment


@voldemortX Awesome work! Thanks a lot for spending the time not just removing the duplicate docs, but finding out what is currently supported and what's not. This is a major improvement on our documentation.

@datumbox datumbox changed the title [WIP] Transforms documentation clean-up Transforms documentation clean-up Dec 22, 2020
@voldemortX
Contributor Author

After seeing issue #3206, we might need to also clarify the doc for erase(). I think it does support batch input (although with the same randomness across the whole batch), right?

@datumbox datumbox merged commit 7b9d30e into pytorch:master Dec 23, 2020
@datumbox
Contributor

Yes it does, but the input needs to be a tensor, not a PIL Image.
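The batch behavior discussed here comes down to slice assignment over leading dimensions. A hypothetical pure-Python stand-in (nested lists in place of a [..., H, W] tensor; not torchvision's implementation) illustrating that the same region, i.e. the same randomness, is erased across the whole batch:

```python
def erase(batch, i, j, h, w, v):
    """Set the region [i:i+h, j:j+w] to value v in every image of the batch.

    `batch` is a list of 2-D lists (one single-channel image per element),
    standing in for a batched [..., H, W] tensor. Every image gets the
    same erased region, mirroring one random draw shared by the batch.
    """
    for img in batch:
        for row in range(i, i + h):
            for col in range(j, j + w):
                img[row][col] = v
    return batch

# Two 4x4 zero images; erase a 2x2 patch at (1, 1) in both.
batch = [[[0] * 4 for _ in range(4)] for _ in range(2)]
erase(batch, 1, 1, 2, 2, 9)
print(batch[0][1][1], batch[1][2][2])  # 9 9
```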

facebook-github-bot pushed a commit that referenced this pull request Jan 21, 2021
Summary:
* Initial doc clean-up

* Remove all private docs

* Rename files

* Highlight backend inconsistencies

* Sequence and number

* [Need checking] AutoAugment related doc change

* Revert name changes

Reviewed By: datumbox

Differential Revision: D25954563

fbshipit-source-id: 3b73d924ec4e23d58416a8d38b554b4e16e64059
@vfdev-5 vfdev-5 mentioned this pull request Jun 28, 2022