- PROJECT OVERVIEW
- ABOUT THE DATASET
- PRE-REQUISITES
- HOW TO USE
- CREATE VIRTUAL ENVIRONMENT
- TEST NOTEBOOKS
- TRAIN SCRIPT
- MAKE PREDICTIONS
- With Docker
- CLOUD SERVING
- CONCLUSION
- NEXT STEPS
- CONTRIBUTORS
- ACKNOWLEDGEMENTS
- CONTRIBUTIONS
This repository is a project dedicated to classifying images of various sports into their respective sports using Convolutional Neural Networks.
We use Transfer Learning to leverage models pre-trained on the ImageNet dataset and try to improve on their performance using fine-tuning techniques.
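For illustration, a minimal sketch of this transfer-learning setup with a MobileNetV2 base (the architecture the saved models in models/ are built on) could look like the following. The layer sizes and hyperparameters here are placeholders, not the project's actual configuration; the real code lives in notebooks/sports-classification.ipynb and train.py.

```python
import tensorflow as tf
from tensorflow import keras

# Illustrative sketch: a MobileNetV2 base pre-trained on ImageNet with a new
# classification head for the 100 sport classes. Hyperparameters are placeholders.
base = keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False  # freeze the pre-trained weights for the first training phase

inputs = keras.Input(shape=(224, 224, 3))
x = keras.applications.mobilenet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(100)(x)  # 100 sports, raw logits

model = keras.Model(inputs, outputs)
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss=keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```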
The dataset used for the project contains images from 100 different kinds of sports and is already split into train, valid and test sets to make life a bit easier. The dataset has been copied from a Kaggle source.
Link to the dataset: https://www.kaggle.com/datasets/gpiosenka/sports-classification?select=sports.csv
It contains 13,493 training, 500 test and 500 validation images.
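For reference, splits laid out like this can be read with Keras directory iterators roughly as follows. This is a sketch: the image size and batch size are illustrative, and preprocessing/augmentation are omitted, so it is not necessarily what the notebooks use.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Sketch: iterate the train/valid/test folders of the dataset. Preprocessing and
# augmentation are omitted here; image size and batch size are illustrative.
gen = ImageDataGenerator()

train_data = gen.flow_from_directory("data/train", target_size=(224, 224), batch_size=32)
valid_data = gen.flow_from_directory("data/valid", target_size=(224, 224), batch_size=32)
test_data = gen.flow_from_directory("data/test", target_size=(224, 224), batch_size=32, shuffle=False)
```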
- git
- linux / WSL2
- miniconda
- python
- docker
- awscli
- kaggle account
First and foremost, clone the repository to your local machine using:
git clone https://github.com/abhijitchak103/sports-classification-cnn.git
To download the dataset, you can download the zip file from Kaggle and unzip it manually into the /data folder in your local copy of this repo, or use the terminal to help you with it.
Make sure you have copied your Kaggle API key kaggle.json from your Kaggle account settings to the .kaggle folder in your home directory (C:\Users\{user}\.kaggle on Windows, or ~/.kaggle on Linux/WSL2).
Then change to the directory where you cloned the repo using cd and run the following commands:
kaggle datasets download -d gpiosenka/sports-classification
unzip sports-classification.zip -d data
rm sports-classification.zip
cd data
rm "EfficientNetB0-100-(224 X 224)- 98.40.h5"
cd ..
This should unzip the data required for this project into the correct folder structure.
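Alternatively, if you prefer to stay in Python, the Kaggle API package can download and unzip the dataset programmatically. This is an optional sketch equivalent to the commands above, using the same kaggle.json key.

```python
from kaggle.api.kaggle_api_extended import KaggleApi

# Optional alternative to the shell commands above: download and unzip the
# dataset into ./data using the Kaggle API (authenticates via kaggle.json).
api = KaggleApi()
api.authenticate()
api.dataset_download_files(
    "gpiosenka/sports-classification", path="data", unzip=True
)
# The bundled EfficientNetB0 .h5 file can then be removed manually, as above.
```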
After this, the folder structure would look like this:
.
├── data
│ ├── test
│ ├── train
│ ├── valid
│ └── sports.csv
├── images
│ ├── api-endpoint.JPG
│ └── using-api-endpoint.JPG
├── models
│ ├── mobilenetv2_v5_aug_height_16_0.922.h5
│ ├── mobilenetv2_v6_aug_rot_17_0.920.h5
│ └── mobilenetv2_v7_17_0.912.h5
├── notebooks
│ ├── model-converter.ipynb
│ ├── sports-classification.ipynb
│ └── test.ipynb
├── .gitignore
├── Dockerfile
├── lambda_function.py
├── prediction.tflite
├── README.md
├── requirements.txt
├── test.py
├── tflite_runtime-2.4.4-cp38-cp38-linux_x86_64.whl
├── train.py
└── utils.py
To run the project and test out the notebooks, you can create a new virtual environment using a tool of your choice, e.g.:
conda create -n project python==3.8 -y
conda activate project
Once you activate the venv, install the dependencies:
pip install -r requirements.txt
You can test out the notebooks and rerun them end to end. Keep in mind that this will be time-consuming. To do so:
jupyter notebook
The Jupyter Notebook environment should open in your web browser. Open the notebooks folder, then open sports-classification.ipynb to test it.
A Python script has been provided to train the network on the provided data. To do so, simply activate the environment if it is not active yet, and run train.py:
conda activate project
python train.py
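The file names in models/ encode the epoch and validation accuracy of each checkpoint. A Keras ModelCheckpoint callback along these lines (illustrative, not the exact contents of train.py) produces files named that way.

```python
from tensorflow import keras

# Illustrative sketch: save a checkpoint whenever validation accuracy improves,
# encoding the epoch and accuracy in the filename, similar to the files in models/.
# The version tag "v8" and the commented training call are placeholders.
checkpoint = keras.callbacks.ModelCheckpoint(
    "models/mobilenetv2_v8_{epoch:02d}_{val_accuracy:.3f}.h5",
    monitor="val_accuracy",
    save_best_only=True,
    mode="max",
)

# model.fit(train_data, validation_data=valid_data, epochs=20, callbacks=[checkpoint])
```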
To test the prediction service locally, you can use Docker.
docker build -t sports .
docker run -it --rm -p 8080:8080 sports
Then, in a new terminal, cd to the working directory and run:
python test.py
This should give results similar to the following:
[['cricket', 10.8796835], ['baseball', 8.193538], ['croquet', 7.908885], ['football', 6.1987066], ['golf', 5.5249014]]
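For reference, a minimal local test client works roughly like the sketch below. This is not the repository's actual test.py; it assumes the container exposes the standard AWS Lambda runtime interface emulator endpoint on port 8080 and that the handler expects a JSON payload with an image URL.

```python
import requests

# Sketch of a local test client (not the repository's actual test.py).
# Assumes the container exposes the standard Lambda runtime interface emulator
# endpoint and that the handler accepts a JSON payload with an image URL.
url = "http://localhost:8080/2015-03-31/functions/function/invocations"
payload = {"url": "https://example.com/some-sports-image.jpg"}  # placeholder image URL

response = requests.post(url, json=payload)
print(response.json())  # expected: top predicted sports with their scores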
The model has been served on the AWS cloud using a Python Lambda function and a REST API. For this, the Docker image was first published to AWS ECR and then used to create a Lambda function, which is in turn exposed as a REST API.
Below are some images which show the REST API endpoint and its usage.
The AWS link will not be functioning, hence the images provided above. If the images do not load, please find them in the images folder.
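As a rough idea of how such a Lambda handler can be structured with the bundled TFLite model (a sketch, not the actual lambda_function.py): it loads prediction.tflite once and runs inference per request. The preprocessing step and the event format below are placeholders.

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Rough sketch of a Lambda handler serving a TFLite model (not the actual
# lambda_function.py). Preprocessing and the event format are placeholders.
interpreter = tflite.Interpreter(model_path="prediction.tflite")
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

def predict(image_array):
    # image_array: preprocessed float32 batch of shape (1, 224, 224, 3)
    interpreter.set_tensor(input_index, image_array)
    interpreter.invoke()
    return interpreter.get_tensor(output_index)[0]

def lambda_handler(event, context):
    # In a real handler, the image would be downloaded from the event payload
    # (e.g. an image URL) and preprocessed; that part is omitted here.
    dummy_input = np.zeros((1, 224, 224, 3), dtype=np.float32)
    scores = predict(dummy_input)
    return {"scores": scores.tolist()}
```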
The model yields an accuracy of 93% on the test set, which is lower than the benchmark of 95%.
- Fine-tune and test different pre-built models to improve model accuracy.
- Build a model from scratch without using Transfer Learning.
Abhijit Chakraborty ([email protected])
All sorts of contributions and ideas to extend the current project and improve the models are welcome. Any feedback will be highly appreciated.