transforms: add Random Erasing for image augmentation #909
Conversation
Codecov Report
@@ Coverage Diff @@
## master #909 +/- ##
==========================================
+ Coverage 60.03% 62.85% +2.82%
==========================================
Files 64 65 +1
Lines 5054 5140 +86
Branches 754 773 +19
==========================================
+ Hits 3034 3231 +197
+ Misses 1817 1683 -134
- Partials 203 226 +23
Continue to review full report at Codecov.
@alykhantejani Could you take a look at this method and advise on adding it to the transforms? I think random erasing is a useful augmentation method that will often be used in vision tasks.
This transform can be handy during self-supervised training.
I've used random erasing successfully in metric-loss (triplet) training and larger-image (224x224+, ImageNet-like) training with good results, so I think it would be a worthwhile addition. @zhunzhong07, I found the per-pixel version quite useful for those problems, but you didn't include it in your GitHub impl or this PR? In my experiments, normally distributed pixel values after image normalization worked well; a uniform distribution caused convergence issues later in training. I perform the RE operation once the tensors are on the GPU, as part of a GPU prefetching loader/collate/normalize step. Feel free to copy any ideas: https://github.com/rwightman/pytorch-image-models/blob/master/data/random_erasing.py
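For reference, a minimal sketch of the per-pixel variant described above, applied to an already-normalized NCHW batch as in a GPU prefetch loop; the function name and default parameters are illustrative, not copied from the linked implementation or from this PR.

```python
import math
import random

import torch


def per_pixel_random_erase(batch, probability=0.5, sl=0.02, sh=0.4, r1=0.3):
    """Erase one random rectangle in each image of a normalized NCHW batch
    with per-pixel values drawn from a standard normal distribution."""
    n, c, img_h, img_w = batch.size()
    area = img_h * img_w
    for idx in range(n):
        if random.random() > probability:
            continue
        for _ in range(10):  # retry a few times if the sampled box does not fit
            target_area = random.uniform(sl, sh) * area
            aspect_ratio = random.uniform(r1, 1.0 / r1)
            h = int(round(math.sqrt(target_area * aspect_ratio)))
            w = int(round(math.sqrt(target_area / aspect_ratio)))
            if h < img_h and w < img_w:
                top = random.randint(0, img_h - h)
                left = random.randint(0, img_w - w)
                # normal noise matches the post-Normalize statistics (mean 0, std 1)
                batch[idx, :, top:top + h, left:left + w] = torch.randn(
                    c, h, w, dtype=batch.dtype, device=batch.device)
                break
    return batch
```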
@rwightman Thanks for your advice. In this PR, I have included the per-pixel mode. One request: I don't have enough GPUs to train a model on ImageNet right now, so if you already have the results, could you also provide some results of training with and without random erasing on ImageNet? Thank you!
@zhunzhong07 I'll run some ImageNet trainings to support this. I don't think I have two historical runs with all the hyper-params and results recorded that didn't have some sort of change in library versions, other hyper-params, machines, etc., so I'll let you know how it goes.
The following proposed changes would be appreciated.
@@ -1317,6 +1317,23 @@ def test_random_grayscale(self):
        # Checking if RandomGrayscale can be printed as string
        trans3.__repr__()

    def test_random_erasing(self):
Can you make the test a bit stronger by checking that the region around the erased patch is equal to the original image?
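One way to phrase such a check, sketched against the erase helper in torchvision.transforms.functional as it exists in released torchvision; the fixed box and test name are illustrative.

```python
import torch
import torchvision.transforms.functional as F


def test_random_erasing_preserves_surroundings():
    img = torch.rand(3, 8, 8)
    i, j, h, w = 2, 3, 4, 2  # a fixed box, so the test knows exactly what was erased
    erased = F.erase(img.clone(), i, j, h, w, v=torch.zeros(3, h, w))
    # The erased rectangle holds the fill value...
    assert torch.all(erased[:, i:i + h, j:j + w] == 0)
    # ...and every pixel outside that rectangle matches the original image.
    mask = torch.ones_like(img, dtype=torch.bool)
    mask[:, i:i + h, j:j + w] = False
    assert torch.equal(erased[mask], img[mask])
```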
Addressed by #1060.
torchvision/transforms/transforms.py
    def __call__(self, img):
        """
        Args:
            img (Tensor): Image to be erased.
IMO, having the input and output as PIL Images would be good and consistent with the other transforms, e.g. so that it can be applied with RandomOrder together with a list of other transforms.
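A short sketch of the consistency argument: with PIL input/output, the new transform could be dropped into RandomOrder next to the existing PIL-based transforms. The composition below is illustrative only, and the commented-out line is hypothetical.

```python
from torchvision import transforms

# PIL-based transforms can be shuffled together freely...
augment = transforms.RandomOrder([
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2),
    # transforms.RandomErasing(...),  # only possible if it accepted PIL Images
])
pipeline = transforms.Compose([augment, transforms.ToTensor()])
```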
If you don't do this operation after Normalization (which is a Tensor based transform), you have to duplicate argument passing and pass your dataset stats to both Normalize and the RandomErasing transform. Post norm, you can assume a mean of 0 and consistent std dev. Also, if you do it before norm, mixed up with other transforms, it's much easier to skew the statistics of your data and cause divergence between train and validation.
In my experience, using it on a few projects now, it's generally cleaner, less fussy, and more efficient (integrated with moving data to the GPU) if done after normalization as tensor ops.
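In other words, the ordering being argued for looks roughly like the sketch below, written with the RandomErasing keyword arguments as they ended up in released torchvision (which may differ from this revision of the PR).

```python
from torchvision import transforms

# RandomErasing placed after Normalize, as argued above: post-norm, the erased
# values can assume zero mean / unit std without passing dataset stats twice.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
    transforms.RandomErasing(p=0.5, value='random'),
])
```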
torchvision/transforms/transforms.py
        w = int(round(math.sqrt(target_area / aspect_ratio)))

        if w < img.size()[2] and h < img.size()[1]:
            x = random.randint(0, img.size()[1] - h)
@zhunzhong07, just a minor optimisation: can't we store values like img.size()[1] in local variables instead of computing them again and again?
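In sketch form, the suggestion amounts to something like the following; the function name and the example call are illustrative, not taken from the PR.

```python
import math
import random

import torch


def sample_erase_box(img, target_area, aspect_ratio):
    """Read img.size() once and reuse the cached height/width
    instead of calling it repeatedly."""
    img_c, img_h, img_w = img.size()  # cached once
    h = int(round(math.sqrt(target_area * aspect_ratio)))
    w = int(round(math.sqrt(target_area / aspect_ratio)))
    if h < img_h and w < img_w:
        x = random.randint(0, img_h - h)
        y = random.randint(0, img_w - w)
        return x, y, h, w
    return None


box = sample_erase_box(torch.rand(3, 224, 224), target_area=2000.0, aspect_ratio=0.5)
```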
Addressed by #1087.
FWIW, I finished several training sessions with different random-erasing settings. I ran no RE, constant (0) RE, and normally distributed (0 mean, 1 std) per-pixel RE. I did not do a random color (solid) run. I'm still running some other tests in this series to validate some other impl and hyper-params for personal interest....
ImageNet 1K validation
ImagenetV2-matched-frequency validation (https://github.com/modestyachts/ImageNetV2)
@rwightman Thanks for your experimental results. It is great to see that random erasing can improve performance on ImageNet. Did you run these results with your impl https://github.com/rwightman/pytorch-image-models? If so, could you also provide the run commands (i.e., the parameters of distributed_train.sh), so that we can accurately reproduce the results? Thank you for your time and effort in producing these results.
Yeah, using the train script in image-models. I was only running a single GPU for these runs and did them in parallel. I had a local mod experimenting with the warmup and changing its overlap behaviour with the main schedule, but the difference is minor; I extended the epochs here by 5 to compensate. These should reproduce the results closely enough:
No RE:
RE constant:
RE per-pixel normal:
@rwightman thanks a lot for the feedback w.r.t. the usefulness of RandomErasing. I'll have a closer look at the implementation today.
@rwightman Thanks! With your provided scripts, I have obtained similar results for ResNet34. @fmassa Thank you for your attention. I also ran RandomErasing for ResNet50 and ResNet101, and achieved an improvement (+0.7 in Prec@1 for ResNet50 and +0.55 in Prec@1 for ResNet101).
Results on ImageNet 1K validation
Thanks for the PR!
I have a few comments, let me know what you think.
Also, this is the transform that you used to obtain the better results, with value='random', is that right?
Can you also add an entry to the documentation in https://github.com/pytorch/vision/blob/master/docs/source/transforms.rst?
@fmassa Thank you for your helpful comments. I have modified the PR and added an entry to the documentation, according to your suggestions. Yes, I used the value='random' mode to obtain the better results.
Hi @fmassa, I've modified the PR according to your comments. I have also added the results for ResNet101 above; a consistent improvement is obtained with RandomErasing.
Thanks a lot!
@zhunzhong07 Hi, have you tried it for detection?
@Zhaoyi-Yan Yes. Random Erasing can improve the results of Fast R-CNN on VOC07. Please refer to our paper.
Random Erasing randomly selects a rectangular region in an image and erases its pixels with random values. It can reduce the risk of overfitting and improves CNN baselines in image classification, object detection, and person re-identification.
I found that this augmentation method has been widely used in image classification (CIFAR-10, CIFAR-100) and person re-identification.
Also, it could achieve improvements on ImageNet: +0.7% in Prec@1 for ResNet-50, +0.33% in Prec@1 for ResNet-34.
Therefore, I think it would be valuable to users.
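As a quick, hedged illustration of the transform in use (the constructor signature follows RandomErasing as released in torchvision, which may differ slightly from this revision of the PR):

```python
import torch
from torchvision import transforms

# Apply random erasing to a single tensor image and count how many pixel
# positions were overwritten.
erase = transforms.RandomErasing(p=1.0, scale=(0.02, 0.33),
                                 ratio=(0.3, 3.3), value='random')
img = torch.rand(3, 224, 224)
out = erase(img)
changed = (out != img).any(dim=0).sum().item()
print(f"{changed} pixel positions were erased")
```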
'Random Erasing Data Augmentation' by Zhong et al.: https://arxiv.org/pdf/1708.04896.pdf
A parallel work is "Improved Regularization of Convolutional Neural Networks with Cutout" by DeVries and Taylor: https://arxiv.org/pdf/1708.04552.pdf
Previous pull requests and issues: #335, #226, #420