Tiny-ImageNet Classifier using PyTorch
Property | Value |
---|---|
Number of classes | 200 |
Training images (per class) | 500 |
Validation images (per class) | 50 |
Test images (per class) | 50 |
Image size | 64x64x3 |
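As a minimal data-loading sketch (an assumption, not necessarily the exact pipeline used here): it presumes the archive has been rearranged so that both train/ and val/ follow the one-subdirectory-per-class layout that ImageFolder expects, and it uses the standard ImageNet normalization statistics.

```python
import torch
from torchvision import datasets, transforms

# Plain 64x64 pipeline; mean/std are the standard ImageNet statistics
data_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Assumes val/ has been restructured into per-class folders
train_set = datasets.ImageFolder('./tiny-imagenet-200/train', data_transforms)
val_set = datasets.ImageFolder('./tiny-imagenet-200/val', data_transforms)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=64, shuffle=False)
```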
We will use a ResNet-18 model as our baseline; its architecture with a 224x224 input is shown below.
Layer Name | Output Size (Input 224x224x3) | ResNet-18 |
---|---|---|
conv1 | 112x112x64 | 7x7, 64, stride=2, pad=3 |
max pool | 56x56x64 | 3x3, stride=2, pad=1 |
layer1 | 56x56x64 | [3x3, 64] x 2, stride = 1 |
layer2 | 28x28x128 | [3x3, 128] x2, stride = 2 |
layer3 | 14x14x256 | [3x3, 256] x2, stride = 2 |
layer4 | 7x7x512 | [3x3, 512] x2, stride = 2 |
average pool | 1x1x512 | Adaptive Average Pooling(1) |
fully connected | 1000 | 512x1000 |
softmax | 1000 | - |
Since ResNet-18 is trained on 224x224 images and outputs 1000 classes, we have to modify the architecture to fit 64x64 images and output 200 classes.
```python
import torch.nn as nn
from torchvision import models

# Load ResNet-18
model_ft = models.resnet18()

# Adjust the final layers for the Tiny-ImageNet input and 200 classes
model_ft.avgpool = nn.AdaptiveAvgPool2d(1)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 200)
```
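A quick sanity check (illustrative) confirms that the modified head maps a 64x64 input to 200 logits:

```python
import torch

# A random 64x64 RGB batch should produce 200 class logits
x = torch.randn(1, 3, 64, 64)
print(model_ft(x).shape)  # torch.Size([1, 200])
```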
The following loss function and optimizer were used for the baseline model.
```python
import torch.optim as optim
from torch.optim import lr_scheduler

# Loss function
criterion = nn.CrossEntropyLoss()

# All parameters are optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay the LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
```
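These pieces plug into the standard PyTorch training loop; a condensed sketch (the full script is in the referenced code, `train_loader` comes from the data-loading step, and `num_epochs` is illustrative):

```python
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_ft = model_ft.to(device)
num_epochs = 20  # illustrative

for epoch in range(num_epochs):
    model_ft.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer_ft.zero_grad()
        loss = criterion(model_ft(inputs), labels)
        loss.backward()
        optimizer_ft.step()
    exp_lr_scheduler.step()  # StepLR decays the LR every 7 epochs
```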
The following table shows the training and validation results.

Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|
ResNet-18 | None | 64x64 | 25.9% |
```python
# Load ResNet-18 with ImageNet pretrained weights
model_ft = models.resnet18(pretrained=True)

# Adjust the final layers for the Tiny-ImageNet input and 200 classes
model_ft.avgpool = nn.AdaptiveAvgPool2d(1)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 200)
```
The same loss function and optimizer were used. The following table shows the training and validation results.

Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|
ResNet-18 | ImageNet | 64x64 | 56.9% |
Refer to Baseline for the detailed Python code.
Validation accuracy increased from 25.9% to 56.9% by using the pretrained ImageNet weights, confirming their value even at a 64x64 input size. As the next step, we want to observe the efficacy of the pretrained weights when the model is trained with 224x224 images. The images therefore have to be upsampled from 64x64 to 224x224; we used bicubic interpolation to improve the quality of the low-resolution images as they are enlarged.
```python
import cv2

def resize_img(image_path, size):
    # Upsample the image in place to (size, size) with bicubic interpolation
    img = cv2.imread(image_path)
    img = cv2.resize(img, (size, size), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(image_path, img)
```
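Usage is one in-place call per file; for example (the path is illustrative):

```python
# Overwrite a training image on disk with its 224x224 bicubic upsample
resize_img('./tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG', 224)
```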
The following table shows the training and validation results.

Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|
ResNet-18 | ImageNet | 224x224 | 73.1% |
Refer to 224 for the detailed Python code.
We achieved a classifier with a validation accuracy of 73.1%. However, if we evaluate the 64x64 validation images with this model, the accuracy drops to 15.3%; the drop is caused by the mismatch in input image size.
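The accuracy figures above come from a standard top-1 evaluation; a minimal sketch (assuming the `val_loader` from the data-loading step):

```python
def evaluate(model, loader, device):
    # Top-1 accuracy over a validation loader
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total
```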
In order to use 64x64 images directly, we have to retrain the model on 64x64 inputs. We initialize it with the weights of the previous (224x224-trained) model.
```python
import torch

# Load ResNet-18
model_ft = models.resnet18()

# Adjust the final layers for the Tiny-ImageNet input and 200 classes
model_ft.avgpool = nn.AdaptiveAvgPool2d(1)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 200)

# Initialize with the weights of the 224x224-trained model
model_ft.load_state_dict(torch.load('./models/resnet18_224_w.pt'))
```
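The checkpoint loaded above is presumably the state dict saved at the end of the 224x224 run:

```python
# Assumed save step at the end of the 224x224 training run
torch.save(model_ft.state_dict(), './models/resnet18_224_w.pt')
```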
The following table shows the training and validation results.

Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|
ResNet-18 | 224x224 model | 64x64 | 54.5% |
The validation accuracy of this model was not as high as expected; it is even lower than that of the model trained from the ImageNet pretrained weights. If we compare the output size of each stage, we can see that the feature maps for a 64x64 input are much smaller than those for a 224x224 input.
Layer Name | Output Size (Input 224x224x3) | Output Size (Input 64x64x3) | ResNet-18 |
---|---|---|---|
conv1 | 112x112x64 | 32x32x64 | 7x7, 64, stride=2, pad=3 |
max pool | 56x56x64 | 16x16x64 | 3x3, stride=2, pad=1 |
layer1 | 56x56x64 | 16x16x64 | [3x3, 64] x 2, stride = 1 |
layer2 | 28x28x128 | 8x8x128 | [3x3, 128] x2, stride = 2 |
layer3 | 14x14x256 | 4x4x256 | [3x3, 256] x2, stride = 2 |
layer4 | 7x7x512 | 2x2x512 | [3x3, 512] x2, stride = 2 |
average pool | 1x1x512 | 1x1x512 | Adaptive Average Pooling(1) |
The first layer of ResNet-18 uses a stride of 2 and is followed by a max-pool layer, also with stride 2. Together they shrink the feature maps by 4x and discard spatial information at the very start of the network. To fine-tune the architecture, we reduce the conv1 kernel to 3x3 with stride 1 and padding 1, and remove the max-pool layer to preserve the feature-map size.
Layer Name | Output Size (Input 64x64x3) | ResNet-18 FineTune |
---|---|---|
conv1 | 64x64x64 | (3x3, 64, stride=1, pad=1)* |
max pool | -------------- | (Removed)* |
layer1 | 64x64x64 | [3x3, 64] x 2, stride = 1 |
layer2 | 32x32x128 | [3x3, 128] x2, stride = 2 |
layer3 | 16x16x256 | [3x3, 256] x2, stride = 2 |
layer4 | 8x8x512 | [3x3, 512] x2, stride = 2 |
average pool | 1x1x512 | Adaptive Average Pooling(1) |
After fine-tuning the architecture, we train the model with 64x64 images.
```python
# Load ResNet-18 (the 224x224 weights are loaded further down)
model_ft = models.resnet18()
# Adjust the final layers for 200 classes
model_ft.avgpool = nn.AdaptiveAvgPool2d(1)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 200)
# Replace conv1 with a 3x3/stride-1 convolution and drop the max pool
model_ft.conv1 = nn.Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
model_ft.maxpool = nn.Sequential()
# Load weights from the 224x224 model, keeping only entries whose shapes
# still match (assumed completion; the original snippet stopped after torch.load)
pretrained_dict = torch.load('./models/resnet18_224_w.pt')
model_dict = model_ft.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items()
                   if k in model_dict and v.shape == model_dict[k].shape}
model_dict.update(pretrained_dict)
model_ft.load_state_dict(model_dict)
```
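Filtering by shape matters here: the new 3x3 conv1 cannot reuse the saved 7x7 kernel, so those weights are discarded and conv1 is trained from scratch, while every other layer starts from the 224x224 weights.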
The following table shows the training and validation results.

Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|
ResNet-18-FineTune | 224x224 model | 64x64 | 72.3% |
Refer to FineTune for the detailed Python code. The table below summarizes all five experiments.
# | Model | Pretrained Weights | Input Size | Validation Accuracy |
---|---|---|---|---|
1 | ResNet-18 | None | 64x64 | 25.9% |
2 | ResNet-18 | ImageNet | 64x64 | 56.9% |
3 | ResNet-18 | ImageNet | 224x224 | 73.1% |
4 | ResNet-18 | From 3 | 64x64 | 54.5% |
5 | ResNet-18-FineTune | From 3 | 64x64 | 72.3% |
The architectural difference between the baseline and the fine-tuned model:

Layer Name | ResNet-18 | ResNet-18 FineTune |
---|---|---|
conv1 | 7x7, 64, stride=2, pad=3 | (3x3, 64, stride=1, pad=1)* |
max pool | 3x3, stride=2, pad=1 | (Removed)* |
layer1 | [3x3, 64] x 2, stride = 1 | [3x3, 64] x 2, stride = 1 |
layer2 | [3x3, 128] x2, stride = 2 | [3x3, 128] x2, stride = 2 |
layer3 | [3x3, 256] x2, stride = 2 | [3x3, 256] x2, stride = 2 |
layer4 | [3x3, 512] x2, stride = 2 | [3x3, 512] x2, stride = 2 |
average pool | Adaptive Average Pooling(1) | Adaptive Average Pooling(1) |
The corresponding output sizes at each stage:

Layer Name | ResNet-18 (224x224x3) | ResNet-18 (64x64x3) | ResNet-18-FineTune (64x64x3) |
---|---|---|---|
conv1 | 112x112x64 | 32x32x64 | 64x64x64 |
max pool | 56x56x64 | 16x16x64 | --------------- |
layer1 | 56x56x64 | 16x16x64 | 64x64x64 |
layer2 | 28x28x128 | 8x8x128 | 32x32x128 |
layer3 | 14x14x256 | 4x4x256 | 16x16x256 |
layer4 | 7x7x512 | 2x2x512 | 8x8x512 |
average pool | 1x1x512 | 1x1x512 | 1x1x512 |
Through this project, we observed the efficacy of transfer learning, both by reusing pretrained weights and by fine-tuning the architecture: the fine-tuned 64x64 model (72.3%) nearly matches the 224x224 model (73.1%).