This is a Keras implementation of the object detection and classification algorithm described in the ECCV 2014 paper "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". It applies the Fast R-CNN technique on top of the AlexNet architecture, using the Keras API with TensorFlow.
(Image credit: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, K. He, X. Zhang, S. Ren, J. Sun)
The implementation covers both object detection and classification, and has been trained and tested on the PASCAL VOC 2007 dataset.
Paper: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Dataset: The PASCAL Visual Object Classes Challenge 2007
Detailed presentation report - NNFL presentation.pptx
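The spatial pyramid pooling layer named in the paper title turns a feature map of any height and width into a fixed-length vector by max-pooling over grids of several sizes. The following is a minimal pure-Python sketch of that idea, not the code used in the notebook: the function name `spatial_pyramid_pool`, the nested-list input format, and the default `levels` are assumptions chosen for illustration.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    For each pyramid level n, every channel is split into an n x n grid of
    bins and the maximum of each bin is kept. The output length is
    C * sum(n * n for n in levels), independent of H and W.
    """
    pooled = []
    for n in levels:
        for channel in feature_map:
            height, width = len(channel), len(channel[0])
            for i in range(n):
                # Integer floor/ceil bin edges: bins cover the whole map
                # and always contain at least one row and one column.
                r0 = (i * height) // n
                r1 = ((i + 1) * height + n - 1) // n
                for j in range(n):
                    c0 = (j * width) // n
                    c1 = ((j + 1) * width + n - 1) // n
                    pooled.append(
                        max(channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1))
                    )
    return pooled


# Two inputs with different spatial sizes give vectors of the same length.
small = [[[r * 6 + c for c in range(6)] for r in range(4)] for _ in range(2)]
large = [[[r * 3 + c for c in range(3)] for r in range(5)] for _ in range(2)]
print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

Because the number of bins depends only on `levels` and the channel count, the fully connected layers that follow can accept region proposals of varying size.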
The PASCAL VOC 2007 dataset consists of 20 visual object classes from realistic scenes:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
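For use in code, the class groups listed above can be written as a small lookup table. This is only a sketch: the names `VOC2007_CLASSES`, `ALL_CLASSES`, and `CLASS_TO_INDEX` are illustrative and do not come from the notebook, and the spellings follow this README, so they may need adjusting to match the labels in the annotation files.

```python
# Class groups as listed in this README (spellings may differ in the annotations).
VOC2007_CLASSES = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Flat list of class names and a name-to-index mapping for label encoding.
ALL_CLASSES = [name for group in VOC2007_CLASSES.values() for name in group]
CLASS_TO_INDEX = {name: index for index, name in enumerate(ALL_CLASSES)}

print(len(ALL_CLASSES))
```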
- Copy the dataset to your drive: click here
- Open sppnetFinal.ipynb on Google Colab to run the respective experiments.
- Run the code to obtain the result.
In the paper, the mAP on the test set is 35.11%.
After executing sppnetFinal2.ipynb, the ground-truth and detected values will be stored in your drive. Because of out-of-memory issues, we could train the detection model on only 500 of the 5017 images.
The mAP obtained is 10.4%.
You can run the code on all images by removing the size restriction on read_images_for_detection:
input_imgs, region_proposals, out, no_of_proposals, gt_boxes = read_images_for_detection(df_anno, list_of_img_name, img_dir)
- Copy ground-truth files into the folder input/ground-truth/
- Copy detection-results files into the folder input/detection-results/
- Run python mainmAP.py
The mAP output can be viewed in the output folder.
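The two copy steps above can also be scripted. The sketch below assumes the ground-truth and detection files sit in two flat folders; `stage_files` is a hypothetical helper, not part of this repository, and the source paths in the commented usage lines are placeholders.

```python
import shutil
from pathlib import Path


def stage_files(source_dir, target_dir):
    """Copy every regular file in source_dir into target_dir; return the count."""
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create input/... if missing
    copied = 0
    for path in sorted(source.iterdir()):
        if path.is_file():
            shutil.copy2(path, target / path.name)
            copied += 1
    return copied


# Placeholder source folders; replace them with wherever the notebook saved its output.
# stage_files("path/to/ground_truth_files", "input/ground-truth")
# stage_files("path/to/detection_files", "input/detection-results")
```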
The precision-recall curve for diningTable is shown below. Curves for the remaining classes can be found in the output folder.
Precision = True Positives / (True Positives + False Positives)
Recall = True Positives / (True Positives + False Negatives)
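As a worked illustration of the two formulas above, the sketch below computes precision and recall from raw counts and then traces the points of a precision-recall curve for one class. It is not taken from mainmAP.py: the function names are hypothetical, and it assumes each detection has already been marked as a true or false positive and sorted by descending confidence.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall); a ratio with a zero denominator is reported as 0.0."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def precision_recall_points(is_true_positive, num_ground_truth):
    """Return one (precision, recall) pair after each detection.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: number of ground-truth objects of this class.
    """
    points = []
    tp = fp = 0
    for hit in is_true_positive:
        if hit:
            tp += 1
        else:
            fp += 1
        # Ground-truth objects not yet matched count as false negatives.
        points.append(precision_recall(tp, fp, num_ground_truth - tp))
    return points


# Three detections (hit, miss, hit) against four ground-truth objects.
for precision, recall in precision_recall_points([True, False, True], 4):
    print(round(precision, 3), round(recall, 3))
```

Lowering the confidence threshold adds detections to the end of the list, so recall can only stay the same or rise while precision may fall, which is what gives the curve its shape.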