Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

16 bits png support and compatibility with current transforms #4731

Closed
NicolasHug opened this issue Oct 25, 2021 · 7 comments · Fixed by #8524
Closed

16 bits png support and compatibility with current transforms #4731

NicolasHug opened this issue Oct 25, 2021 · 7 comments · Fixed by #8524

Comments

@NicolasHug
Copy link
Member

NicolasHug commented Oct 25, 2021

We added support for decoding 16 bit pngs in #4657.

Pytorch doesn't support uint16 dtype, so we have to return a int32 tenor instead. However as @nairbv pointed out, this can lead to unexpected behaviour for the convert_image_dtype(), because this function relies on the max value of a given dtype to carry the conversions.

While the actual max value of a 16 bits image is 2**16 - 1, convert_image_dtype() will assume that it's (2**31 - 1) because the image is in a int32 tensor. This leads to e.g. a tensor full of zeros when converting to uint8.

Potential solutions are:

  • native support for uint16 in pytorch
  • add a new input_range=None parameter to convert_image_dtype() (not a fan of that)

For now, I'll just make this a private API (e.g. _decode_16bits_png() or something). It will still allow to continue the work on RAFT, which was the main reason I needed 16bits decoding in the first place.

@datumbox
Copy link
Contributor

FYI adding a new max_val=None parameter to convert_image_dtype() has been discussed before and it's an approach used by other libraries/frameworks (missing references, need to dig to find them). For BC reasons, when None we would infer it from the type.

@NicolasHug
Copy link
Member Author

Thanks for the feedback, I'd be very interested in reading those references - if we have enough good reasons to include a new param this could be a decent solution. For now I'm only aware of one reason (that issue), so it doesn't seem worth it, but if we have more it could be a serious candidate

@datumbox
Copy link
Contributor

@NicolasHug I'll try to have a look to get these examples. Another idea discussed on Github issues was also to store this information on the new Image classes that @pmeier is working on. This way we won't have to pass an extra parameter and we can read this info directly from the input tensor meta-data.

@h-vetinari
Copy link

Trying to package torchvision 0.12 for conda-forge, all the test failures I see are:

        if "16" in img_path:
            # 16 bits image decoding is supported, but only as a private API
            # FIXME: see https://github.com/pytorch/vision/issues/4731 for potential solutions to making it public
>           with pytest.raises(RuntimeError, match="At most 8-bit PNG images are supported"):
E           Failed: DID NOT RAISE <class 'RuntimeError'>

Is there like a secret switch to allow these to pass - because (from what I can tell) it seems the respective images are there in the sources...

@h-vetinari
Copy link

h-vetinari commented Mar 21, 2022

For context, in the so-called feedstock for conda-forge (latest PR), we're running the full torchvision test suite based on the github sources. So AFAIU, running into these "private" tests will be unavoidable, unless we skip/delete them, or there's a secret switch (I presume that's how the CI here does it, but haven't investigated yet...).

@NicolasHug
Copy link
Member Author

NicolasHug commented Mar 21, 2022

Thanks for the report @h-vetinari . The secret switch is just

https://github.com/pytorch/vision/blob/fc63f82890651bb19f27e33eefb52465aeb1c25d/torchvision/csrc/io/image/cpu/decode_png.h#L12:L12

and the offending test calls decode_image

https://github.com/pytorch/vision/blob/fc63f82890651bb19f27e33eefb52465aeb1c25d/torchvision/io/image.py#L227:L227

which itself will call decode_png() with the default value of allow_16_bits=False

https://github.com/pytorch/vision/blob/fc63f82890651bb19f27e33eefb52465aeb1c25d/torchvision/csrc/io/image/cpu/decode_image.cpp#L25:L25

I'm unable to reproduce the issue locally.

If I had to take a wild guess, I would bet that your CI pipeline is setup in such a way that 16 is part of the path to the images, even those that aren't 16bits images. From your logs the image paths look like img_path = '$SRC_DIR/test/assets/fakedata/logos/gray_pytorch.png'. Does $SRC_DIR contain 16?

If that's the case please let us know and we'll make that check more robust

@h-vetinari
Copy link

If I had to take a wild guess, I would bet that your CI pipeline is setup in such a way that 16 is part of the path to the images, even those that aren't 16bits images. From your logs the image paths look like img_path = '$SRC_DIR/test/assets/fakedata/logos/gray_pytorch.png'. Does $SRC_DIR contain 16?

That was a great guess:

## Package Plan ##

  environment location: /home/conda/feedstock_root/build_artifacts/torchvision-split_1647851964444/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol

If that's the case please let us know and we'll make that check more robust

That would be wonderful, thanks a lot! 🙃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants