
# WIP Allowing for grouped (random) transformations of tensors/tensor-like objects [proof-of-concept] #4267

Closed

## Conversation

@vmoens (Contributor) commented on Aug 10, 2021:

Implements a proof-of-concept solution to #1406 (+ #9). Some aspects are similar to #1315.
I propose a GroupTransform class that takes a tuple of inputs and applies the same transformation to each.
Between two queries of GroupTransform, the random transformations are 'frozen': their random parameters are stored as attributes so that they can be re-used as often as needed, and the same transform is applied to each input. This ensures that all the transformations match exactly. When GroupTransform is done, it resets the state of the transformations that were provided to it.

For example:

```python
import torch
from torchvision import transforms

t = transforms.GroupTransform(
    transforms.RandomOrder(
        [transforms.GaussianBlur(kernel_size=3, reset_auto=False),
         transforms.RandomCrop(size=10, reset_auto=False)],
        reset_auto=False,
    )
)
img = torch.arange(1024, dtype=torch.float).view(1, 32, 32).expand(3, 32, 32).contiguous()
mask = img[:1]
imgs = (img, mask)
imgs_out = t(imgs)
torch.testing.assert_close(imgs_out[0][0], imgs_out[1][0], rtol=1e-6, atol=1e-6, check_stride=False)
```

The `reset_auto=False` keyword of the transforms means that those objects won't be responsible for resetting their state after applying a transform; that responsibility is taken over by `transforms.GroupTransform` (when its own `reset_auto` is set to `True`).
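For illustration, here is a minimal self-contained sketch of that contract. The class and member names (`RandomShift`, `_params`, `reset`) are hypothetical and only meant to convey the mechanism, not the PR's exact code:

```python
import torch
import torch.nn as nn


class RandomShift(nn.Module):
    """Toy random transform following the reset_auto contract described above."""

    def __init__(self, reset_auto=True):
        super().__init__()
        self.reset_auto = reset_auto
        self._params = None  # frozen random parameters live here between calls

    def get_params(self):
        # Re-use the stored parameters if present; otherwise draw new ones.
        if self._params is None:
            self._params = torch.randint(0, 5, (1,)).item()
        return self._params

    def reset(self):
        self._params = None

    def forward(self, x):
        out = x + self.get_params()
        if self.reset_auto:
            # A standalone transform cleans up after itself; under
            # GroupTransform (reset_auto=False) the group resets it instead.
            self.reset()
        return out


t = RandomShift(reset_auto=False)
a, b = t(torch.zeros(3)), t(torch.zeros(3))
assert torch.equal(a, b)  # the same frozen parameter was applied to both calls
t.reset()  # what GroupTransform would do once the whole tuple is processed
```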

## Advantages

  • Low risk of BC issues, as far as I can tell.
  • Few, if any, code changes required from users in most cases; only those who need GroupTransform will have to care about `reset_auto`.

## Points of probable contention

  • I implemented a parent Transform class that inherits from nn.Module, and all transforms are now subclasses of it. This may not be ideal and is totally open to discussion.
  • Transforms are now 'stateful' and store the parameters of the transform between executions.
  • Many get_params methods that used to be static have lost that attribute (but the default behaviour, which is to use the stored parameters, can be overridden by passing other kwargs); a sketch of what this could look like follows below.
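To make the last point concrete, a non-static `get_params` could look roughly like this. The class is a hypothetical sketch (`RandomCropLike` is not the PR's actual implementation):

```python
import torch
import torch.nn as nn


class RandomCropLike(nn.Module):
    """Sketch: get_params is an instance method backed by stored state."""

    def __init__(self, size):
        super().__init__()
        self.size = size
        self._params = None

    def get_params(self, top=None, left=None):
        # Passing kwargs overrides the default behaviour, which is to
        # return the stored (frozen) random parameters.
        if top is not None and left is not None:
            return top, left
        if self._params is None:
            self._params = tuple(torch.randint(0, self.size, (2,)).tolist())
        return self._params


crop = RandomCropLike(size=10)
crop.get_params()      # frozen random parameters, stable across calls
crop.get_params(2, 3)  # explicit override: (2, 3)
```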

## TODOs

  1. Custom functions, such as those proposed in [RFC] Abstractions for segmentation / detection transforms #1406 for tensor-like objects, should still be implemented. They will be the predominant way of ensuring that the transforms are applied in a meaningful manner to each input depending on its type (e.g. interpolation mode for images vs. boxes); see the sketch after this list.
  2. This code relies heavily on the reset_auto attribute, which ensures that random transformations are applied identically to each input. One could set this attribute after creating the object via a .reset_auto_(mode=True) method, but for now I don't feel this is a priority, though it might make things easier to code in the future.
  3. For now, compatibility with torchscript has been overlooked.
  4. I have not changed the helpers yet, as I would prioritise the discussion about the feasibility of this solution.
  5. More tests may be necessary.
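Regarding the first TODO, the type-based dispatch could look roughly like the sketch below. The `Mask` marker type and the `resize` helper are hypothetical illustrations, not the API proposed in #1406:

```python
import torch
import torch.nn.functional as F


class Mask(torch.Tensor):
    """Hypothetical tensor subclass marking segmentation masks."""


def resize(inpt, size):
    # The same logical transform picks type-appropriate settings:
    # masks must keep discrete label values, images can be interpolated smoothly.
    mode = "nearest" if isinstance(inpt, Mask) else "bilinear"
    return F.interpolate(inpt.unsqueeze(0), size=size, mode=mode).squeeze(0)


img = torch.rand(3, 32, 32)
mask = torch.randint(0, 2, (1, 32, 32)).float().as_subclass(Mask)
out_img = resize(img, (16, 16))    # bilinear interpolation
out_mask = resize(mask, (16, 16))  # nearest-neighbour, labels preserved
```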

@fmassa (Member) left a comment:

Thanks for the interesting proposal!

This RFC seems like a good way of approaching this problem, and I had similar thoughts to what you proposed here.

Before we dive into the specific implementation details (which includes BC considerations, torchscript, etc), I think it would be good if you could have a look into how different libraries solved this problem in the past, so that we can benefit from past experiences.

Some libraries that come to mind (non-exhaustive):

Inline review comment from a Member on the following diff context:

```python
    return output


class Compose(Transform):
```

We've had a lot of discussions with @vfdev-5 about making Compose inherit from nn.Module, and in the end we decided against it, because enforcing torchscriptability would break a lot of people's code. IIRC the problem was that patterns where the user passes an arbitrary callable (which is not an nn.Module) wouldn't work anymore, and this is something that is done often.
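For context, this is the kind of (currently legal) pattern that mandatory nn.Module children would break; `pipeline` is just an illustrative name:

```python
from torchvision import transforms

# An arbitrary callable mixed into Compose: accepted today, but it is not
# an nn.Module, so a scriptability requirement would reject such pipelines.
pipeline = transforms.Compose([
    transforms.ToTensor(),
    lambda x: x.clamp(0.0, 1.0),  # plain function, not an nn.Module
])
```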

Comment on lines +104 to +105:

```python
"All transforms should be of type torchvision.transforms.Transform. "
"Custom typed transforms will be forbidden in future releases."
```
Member:

I'm not sure this is something that we can actually commit to forbidding. cc @vfdev-5, who has done some research on how people were using transforms during his torchscript-compatibility work.

Collaborator:

Here is a PR where I tried to make Compose scriptable: #2645. The open discussions there show the blockers we had.

@datumbox (Contributor) commented:

Is this still WIP, or has it been superseded by the work at pmeier/torchvision-datasets-rework#1?

@vmoens (Contributor, Author) commented on Sep 27, 2021:

> Is this still WIP, or has it been superseded by the work at pmeier/torchvision-datasets-rework#1?

Let's suspend this for now!

@vmoens closed this on Sep 27, 2021.