Image Classification using AWS SageMaker

Use AWS SageMaker to fine-tune a pretrained model for image classification, applying SageMaker profiling, debugging, hyperparameter tuning and other good ML engineering practices. This can be done on either the provided dog breed classification dataset or one of your choice.

Project Set Up and Installation

Enter AWS through the gateway in the course and open SageMaker Studio. Download the starter files. Download/Make the dataset available.

Dataset

The provided dataset is the dog breed classification dataset, which can be found in the classroom. The project is designed to be dataset independent, so if there is a dataset that is more interesting or relevant to your work, you are welcome to use it to complete the project.

Access

Upload the data to an S3 bucket through the AWS Gateway so that SageMaker has access to the data.
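A minimal sketch of the upload step, assuming the SageMaker Python SDK is installed and the code runs inside SageMaker Studio with an execution role. The local folder name `dogImages` and the key prefix `dog-breed-data` are hypothetical placeholders, not names taken from this project.

```python
def s3_uri(bucket: str, key_prefix: str) -> str:
    """Build the S3 URI that a training job would read the data from."""
    return f"s3://{bucket}/{key_prefix.strip('/')}"


def upload_dataset(local_dir: str = "dogImages",
                   key_prefix: str = "dog-breed-data") -> str:
    """Upload a local folder to the session's default bucket.

    Both default argument values are hypothetical placeholders.
    Returns the S3 URI of the uploaded data.
    """
    # Imported lazily so that s3_uri() works without the SDK installed.
    import sagemaker

    session = sagemaker.Session()
    # upload_data copies the directory tree under s3://<bucket>/<key_prefix>/
    return session.upload_data(
        path=local_dir,
        bucket=session.default_bucket(),
        key_prefix=key_prefix,
    )
```

The returned URI can then be passed as the training input when an estimator or tuner is fitted.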

Hyperparameter Tuning

For this project, I chose to apply transfer learning to the pretrained ResNet50 model provided by the torchvision library. ResNet50 is a convolutional neural network with a total of 50 convolutional and fully connected layers and about 25 million trainable parameters. The provided model was trained on the ImageNet dataset, so it has already learned general visual features from a large number of images. Transfer learning reuses that knowledge for the task of dog breed classification, so the model can be adapted without needing too much data.
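The transfer learning setup described above might look roughly like the following sketch. It assumes torchvision 0.13 or newer (older versions use `pretrained=True` instead of `weights=`), and it assumes the backbone is frozen so that only the new final layer is trained; the text does not say whether the project froze any layers. The helper names are hypothetical.

```python
NUM_CLASSES = 133          # dog breeds in the provided dataset
RESNET50_FC_INPUTS = 2048  # input width of ResNet50's final fully connected layer


def head_parameter_count(in_features: int = RESNET50_FC_INPUTS,
                         num_classes: int = NUM_CLASSES) -> int:
    """Weights plus biases of the replacement classification layer."""
    return in_features * num_classes + num_classes


def build_model(num_classes: int = NUM_CLASSES):
    """Return an ImageNet-pretrained ResNet50 with a new classification head."""
    # Imported lazily so the arithmetic helper above runs without PyTorch.
    import torch.nn as nn
    from torchvision import models

    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

    # Assumption: freeze the pretrained backbone so only the head is updated.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the 1000-class ImageNet head with one sized for our classes.
    # Newly created layers have requires_grad=True by default.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```

With the backbone frozen, only `head_parameter_count()` parameters are trained, a small fraction of the roughly 25 million in the full network.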

Hyperparameter    Type           Range
Learning Rate     Continuous     Interval: [0.001, 0.1]
Batch Size        Categorical    Values: [32, 64, 128]
Epochs            Categorical    Values: [1, 2]
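As a sketch, the search space above could be expressed with the SageMaker Python SDK's tuner parameter classes. The hyperparameter names `lr`, `batch_size` and `epochs` are assumptions; they must match whatever argument names the training script actually parses.

```python
# Search space from the table above; the key names are hypothetical.
SEARCH_SPACE = {
    "lr": ("continuous", 0.001, 0.1),
    "batch_size": ("categorical", [32, 64, 128]),
    "epochs": ("categorical", [1, 2]),
}


def to_sagemaker_ranges(space=SEARCH_SPACE):
    """Convert the plain description into SageMaker tuner parameter objects."""
    # Imported lazily so SEARCH_SPACE can be inspected without the SDK.
    from sagemaker.tuner import CategoricalParameter, ContinuousParameter

    ranges = {}
    for name, spec in space.items():
        if spec[0] == "continuous":
            ranges[name] = ContinuousParameter(spec[1], spec[2])
        else:
            ranges[name] = CategoricalParameter(spec[1])
    return ranges
```

The resulting dictionary is the kind of value passed as `hyperparameter_ranges` when constructing a `sagemaker.tuner.HyperparameterTuner`.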

Screenshot of the completed training jobs

Metrics logged during the training process

First Job

Second Job

Third Job

The Best Hyperparameters

Debugging and Profiling

Model debugging is useful for capturing the values of the tensors as they flow through the model during the training and evaluation phases. In addition to saving and monitoring the tensors, SageMaker provides prebuilt rules that analyze the tensors and extract insights useful for understanding how the model trains and evaluates.

While debugging the model, I chose to monitor the Loss Not Decreasing rule, which checks whether the loss is failing to decrease at an adequate rate.
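To make the idea behind such a rule concrete, here is a small pure-Python illustration. It is not the implementation of the built-in SageMaker rule; the function name, the window size and the percentage threshold are all assumptions chosen for the example.

```python
def loss_not_decreasing(losses, window=5, min_pct_decrease=1.0):
    """Return True if the latest loss has not dropped by at least
    `min_pct_decrease` percent relative to the loss `window` steps earlier.

    Illustrative only: the threshold and window are arbitrary assumptions.
    """
    if len(losses) <= window:
        return False  # not enough history to judge
    previous, current = losses[-window - 1], losses[-1]
    if previous <= 0:
        return False  # a relative decrease is undefined for non-positive loss
    pct_decrease = 100.0 * (previous - current) / previous
    return pct_decrease < min_pct_decrease


# A healthy run keeps improving; a stalled run trips the check.
healthy = [2.0, 1.5, 1.0, 0.8, 0.6, 0.5]
stalled = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(loss_not_decreasing(healthy), loss_not_decreasing(stalled))
```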

Model profiling is useful for capturing system metrics such as CPU and GPU utilization and for identifying bottlenecks. I used the ProfilerReport rule to generate a profiler report with statistics about the training run.
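A sketch of how these two rules can be declared with the SageMaker Python SDK follows. The monitoring interval and the number of profiled steps are assumed values for illustration, not settings taken from this project.

```python
# Assumed profiler settings, for illustration only.
SYSTEM_MONITOR_INTERVAL_MS = 500
FRAMEWORK_PROFILE_STEPS = 10


def build_debug_config():
    """Return the rules and profiler configuration for an estimator."""
    # Imported lazily so the constants above are usable without the SDK.
    from sagemaker.debugger import (
        FrameworkProfile,
        ProfilerConfig,
        ProfilerRule,
        Rule,
        rule_configs,
    )

    rules = [
        # Debugger rule: flags a loss that stops decreasing.
        Rule.sagemaker(rule_configs.loss_not_decreasing()),
        # Profiler rule: aggregates system statistics into a report.
        ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
    ]
    profiler_config = ProfilerConfig(
        system_monitor_interval_millis=SYSTEM_MONITOR_INTERVAL_MS,
        framework_profile_params=FrameworkProfile(num_steps=FRAMEWORK_PROFILE_STEPS),
    )
    return {"rules": rules, "profiler_config": profiler_config}
```

The two returned values correspond to the `rules` and `profiler_config` arguments accepted by SageMaker estimators.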

Results

Insights from the Plot

The training loss decreases with the number of steps.
The training loss is a bit noisy, which may mean that training would have benefited from a larger batch size.
The validation loss seems to be almost constant, and it is very low compared to the training loss from the beginning, which might be a sign of overfitting.

What to apply if the plot was erroneous

In order to avoid overfitting, we might try the following solutions:
Use a smaller model than ResNet50, for example ResNet18.
Apply regularization to avoid overfitting the dataset.
Collect more data for the model.

Model Deployment

Overview of Endpoint

The deployed model is a ResNet50 model pretrained on the ImageNet dataset and fine-tuned on the dog breed classification dataset.

The model takes an image of size (3, 224, 224) as input and outputs 133 values, one for each of the 133 dog breeds available in the dataset.

The model doesn't apply softmax or log softmax (these are applied only inside the nn.CrossEntropyLoss loss during training), so the outputs are raw scores rather than probabilities.

The model's output label can be found by taking the maximum over the 133 output values and finding its corresponding index.

The model was fine-tuned for 1 epoch using a batch size of 128 and a learning rate of ~0.05.
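Because the endpoint returns raw scores, the caller has to pick the label. The following pure-Python sketch shows one way to take the index of the maximum over the 133 outputs and, optionally, turn the scores into probabilities with softmax. The function names are hypothetical, and the mapping from index to breed name depends on how the dataset classes were ordered, which is not shown here.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return (class index, probability) for the highest-scoring class."""
    index = max(range(len(logits)), key=lambda i: logits[i])
    return index, softmax(logits)[index]


# Example with made-up scores: class 42 has the largest raw output.
scores = [0.0] * 133
scores[42] = 5.0
index, probability = predict_label(scores)
```

Softmax does not change which index is largest, so it is only needed when a confidence value is wanted alongside the label.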

Instructions to query the model

Provide the path of a local image to the Image.open() function from the PIL library to load the image as a PIL image.
Preprocess the image to prepare the tensor input for the ResNet50 network. First the image is resized to 256x256 (shape 3x256x256), then a center crop is applied to reduce it to 224x224 (shape 3x224x224). The image is then converted to a tensor with values from 0.0 to 1.0, and finally it is normalized using commonly used values for the mean and the standard deviation.
A request is then sent to the endpoint with the image as its payload.
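The steps above can be sketched as follows. Several details are assumptions: the mean and standard deviation are the widely used ImageNet statistics (the text only says "commonly used values"), the endpoint name is supplied by the caller, and the `image/jpeg` content type depends on what the endpoint's inference script accepts, which is not shown here. Whether preprocessing runs on the client or inside the endpoint is likewise not stated, so both pieces are shown separately.

```python
# Assumption: the standard ImageNet channel statistics.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Show the per-channel arithmetic that normalization performs."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


def preprocess(image_path):
    """Load an image and return a tensor of shape (1, 3, 224, 224)."""
    # Imported lazily so the helper above runs without Pillow or torchvision.
    from PIL import Image
    from torchvision import transforms

    pipeline = transforms.Compose([
        transforms.Resize((256, 256)),   # resize to 256x256
        transforms.CenterCrop(224),      # crop the central 224x224 region
        transforms.ToTensor(),           # scale pixel values to [0.0, 1.0]
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
    ])
    image = Image.open(image_path).convert("RGB")
    return pipeline(image).unsqueeze(0)  # add the batch dimension


def query_endpoint(endpoint_name, image_path, content_type="image/jpeg"):
    """Send the raw image bytes to a deployed endpoint and return the raw reply."""
    import boto3

    with open(image_path, "rb") as image_file:
        payload = image_file.read()
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,  # assumption: must match the inference script
        Body=payload,
    )
    return response["Body"].read()
```

The reply contains the 133 output values in whatever serialization the inference script produces; the predicted label is the index of the largest one.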

Tanya-1109/Image-Classification-using-AWS-SageMaker

Folders and files

Latest commit

History

Repository files navigation

Image Classification using AWS SageMaker

Project Set Up and Installation

Dataset

Access

Hyperparameter Tuning

Debugging and Profiling

Results

Model Deployment

Overview of Endpoint

Instructions to query the model

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages