
Add Auto-augment input transformation #3050

Closed · 9 tasks done

datumbox opened this issue Nov 26, 2020 · 0 comments · Fixed by #3123

datumbox commented Nov 26, 2020

🚀 Feature

Provide an implementation of the AutoAugment policies for ImageNet, CIFAR10 and SVHN in TorchVision.

Motivation

AutoAugment is a common data augmentation technique that, when applied, typically increases the accuracy of models on specific datasets. In the original paper, the authors used RL to search for better data augmentation policies that increase the accuracy on 3 datasets: ImageNet, CIFAR10 and SVHN. Though the discovered policies are tied to the dataset they were trained on, the paper shows that the ImageNet policies provide significant improvements when applied to other datasets such as Oxford Flowers, Caltech-101, Oxford-IIIT Pets, FGVC Aircraft and Stanford Cars. Similarly, the CIFAR10 policies can be used to boost the performance on CIFAR100.

Hence, providing an out-of-the-box implementation of the aforementioned policies would be beneficial to the users of TorchVision.

Pitch

Provide implementations of the following 3 policies (see the sketch after this list):

  • ImageNet AutoAugment Policy
  • CIFAR10 AutoAugment Policy
  • SVHN AutoAugment Policy
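
To make the mechanic concrete, here is a minimal sketch of how an AutoAugment policy is applied: each sub-policy is a pair of (operation, probability, magnitude) entries, one sub-policy is sampled uniformly per image, and its two operations are applied in order, each with its own probability. The sub-policies, operations and magnitudes below are illustrative placeholders, not the learned policies from the paper:

```python
import random

import torchvision.transforms.functional as F

# Illustrative sub-policies only; the learned policies in the paper
# contain many more pairs with tuned probabilities and magnitudes.
SUBPOLICIES = [
    [("rotate", 0.8, 30.0), ("contrast", 0.6, 1.5)],
    [("rotate", 0.4, -15.0), ("brightness", 0.9, 1.2)],
]

OPS = {
    "rotate": lambda img, mag: F.rotate(img, mag),
    "contrast": lambda img, mag: F.adjust_contrast(img, mag),
    "brightness": lambda img, mag: F.adjust_brightness(img, mag),
}

def auto_augment(img):
    # Sample one sub-policy and apply its two ops, each with its probability.
    subpolicy = random.choice(SUBPOLICIES)
    for name, prob, magnitude in subpolicy:
        if random.random() < prob:
            img = OPS[name](img, magnitude)
    return img
```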

There are already community implementations of the above, but they all use PIL as the backend. Therefore, to add this to TorchVision we need to add support for the following missing Tensor Functional transforms (a rough sketch of a few of them follows the list):

  • posterize
  • solarize
  • equalize
  • sharpness
  • autocontrast
  • invert
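
As an illustration of what the tensor versions could look like, below is a minimal sketch of invert, solarize and posterize for uint8 tensor images with values in [0, 255]; the eventual TorchVision implementations would additionally need to handle float inputs, batched tensors and torchscript:

```python
import torch

def invert(img: torch.Tensor) -> torch.Tensor:
    # Flip every pixel: 255 becomes 0, 0 becomes 255.
    return 255 - img

def solarize(img: torch.Tensor, threshold: int) -> torch.Tensor:
    # Invert only the pixels at or above the threshold, mirroring
    # PIL.ImageOps.solarize.
    return torch.where(img >= threshold, 255 - img, img)

def posterize(img: torch.Tensor, bits: int) -> torch.Tensor:
    # Keep only the `bits` most significant bits of each channel,
    # mirroring PIL.ImageOps.posterize (e.g. bits=4 -> mask 0b11110000).
    mask = 256 - 2 ** (8 - bits)
    return img & mask
```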

Note that the original paper also experimented with the following augmentation techniques:

  1. Cutout: Very similar to RandomErasing, but it does not apply thresholds on ratios/scales, it can produce multiple rectangles, and it fills the erased region with the average per-channel pixel value of the dataset. TorchVision already provides an implementation of RandomErasing which closely approximates Cutout (see the sketch after this list), so it's probably not worth reimplementing. Moreover, though Cutout was investigated as a candidate operation during AutoAugment's RL search, it was not selected for any of the datasets; it was, however, always applied at the end in the CIFAR10 experiments.
  2. Sample Pairing: Similarly, though Sample Pairing was investigated during the RL search, it was not selected by any of the dataset policies, and thus we don't need to implement it here.
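
Since RandomErasing already covers the Cutout use-case, here is a hedged sketch of how one could approximate Cutout for CIFAR10 with the existing transform, by pinning the scale and aspect ratio to produce a fixed-size square patch and filling it with the per-channel dataset mean. The mean values below are the commonly used CIFAR10 statistics and are an assumption here, not taken from the paper:

```python
import torchvision.transforms as T

# Commonly used CIFAR10 per-channel means (assumed, not from the paper).
cifar10_mean = (0.4914, 0.4822, 0.4465)

cutout_like = T.Compose([
    T.ToTensor(),                # RandomErasing operates on tensors
    T.RandomErasing(
        p=1.0,                   # Cutout is applied unconditionally
        scale=(0.25, 0.25),      # a 16x16 patch on a 32x32 image
        ratio=(1.0, 1.0),        # square patch, as in Cutout
        value=cifar10_mean,      # fill with the per-channel mean
    ),
])
```

One remaining difference is that Cutout allows the patch to extend beyond the image border, while RandomErasing keeps it fully inside the image, so the approximation is close but not exact.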

Lastly, it's important to note that the aforementioned transforms are worth supporting even if we decide not to implement AutoAugment, because they will enable the easier addition of newer data augmentation techniques such as FixMatch (official implementation, @vfdev-5's implementation), ReMixMatch, RandAugment, etc.

cc @vfdev-5 @fmassa
