
# Object Detection using Spatial Pyramid Pooling (SPPNet) (paper id: 74)

This is a Keras implementation of the object detection and classification algorithm described in the ECCV 2014 paper "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". Detection is implemented with a Fast R-CNN-style technique on top of an AlexNet architecture, using the Keras API with TensorFlow.

(Figure: spatial pyramid pooling architecture.)

(Image credit: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, K. He, X. Zhang, S. Ren, J. Sun)

This implementation covers both object detection and classification, and has been trained and tested on the PASCAL VOC 2007 dataset.

Paper: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Dataset: The PASCAL Visual Object Classes Challenge 2007

Detailed presentation report - NNFL presentation.pptx
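
For intuition about the core layer: spatial pyramid pooling max-pools the last convolutional feature map over a pyramid of grids (e.g. 1x1, 2x2, 4x4), producing a fixed-length vector for any input size. Below is a minimal Keras sketch of such a layer; it is an illustrative assumption, not this repo's exact code.

```python
import tensorflow as tf
from tensorflow.keras import layers

class SpatialPyramidPooling(layers.Layer):
    """Pools an arbitrary-size feature map into a fixed-length vector."""

    def __init__(self, pool_levels=(1, 2, 4), **kwargs):
        super().__init__(**kwargs)
        self.pool_levels = pool_levels

    def call(self, x):
        # x: (batch, height, width, channels); height/width may vary.
        h = tf.shape(x)[1]
        w = tf.shape(x)[2]
        outputs = []
        for level in self.pool_levels:
            # Window size rounds up and stride rounds down, so the
            # level x level grid always covers the whole feature map.
            win_h = tf.cast(tf.math.ceil(h / level), tf.int32)
            win_w = tf.cast(tf.math.ceil(w / level), tf.int32)
            str_h = tf.cast(tf.math.floor(h / level), tf.int32)
            str_w = tf.cast(tf.math.floor(w / level), tf.int32)
            for i in range(level):
                for j in range(level):
                    patch = x[:, i * str_h:i * str_h + win_h,
                              j * str_w:j * str_w + win_w, :]
                    # Max-pool each grid cell to one value per channel.
                    outputs.append(tf.reduce_max(patch, axis=[1, 2]))
        # With levels (1, 2, 4): (1 + 4 + 16) * channels output bins.
        return tf.concat(outputs, axis=-1)
```

Because the output length depends only on the pool levels and the channel count, the fully connected layers after this layer see a fixed input size even when images of different sizes are fed to the convolutional layers.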

## Dataset

The PASCAL VOC 2007 dataset consists of 20 visual object classes from realistic scenes:

  1. Person: person
  2. Animal: bird, cat, cow, dog, horse, sheep
  3. Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  4. Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
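
Implementations typically keep these 20 class names in a fixed list so annotation labels can be mapped to integer indices. A small illustrative snippet (the variable names here are assumptions, not identifiers from this repo):

```python
# The 20 PASCAL VOC 2007 object classes, in the commonly used order.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Map class names to integer labels for training, e.g. "dog" -> 11.
CLASS_TO_INDEX = {name: i for i, name in enumerate(VOC_CLASSES)}
```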

## Instructions to run

  1. Copy the dataset to your Google Drive: click here
  2. Open sppnetFinal.ipynb in Google Colab to run the respective experiments (see the Drive-mount sketch below).
  3. Run the notebook cells to obtain the results.
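
If you run the notebook on Colab, the dataset copied to Drive in step 1 is usually accessed by mounting Drive first. A minimal sketch of that step (the standard Colab API; the dataset path below is an assumption):

```python
# Mount Google Drive inside the Colab session so the notebook can
# read the dataset copied in step 1 (standard google.colab API).
from google.colab import drive

drive.mount('/content/drive')

# Hypothetical location; adjust to wherever you copied the dataset.
img_dir = '/content/drive/MyDrive/VOC2007/JPEGImages'
```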

## Detection results

(Sample detection result images.)

## Obtaining mean Average Precision

In the paper, the mAP on the test set is 35.11%.
After executing sppnetFinal2.ipynb, the ground-truth and detected values will be stored in your Drive (we could train the detection model on only 500 of the 5017 images, due to an out-of-memory issue).

The mAP obtained is 10.4%. You can run the code on all images by removing the size restriction on read_images_for_detection:

```python
input_imgs, region_proposals, out, no_of_proposals, gt_boxes = read_images_for_detection(df_anno, list_of_img_name, img_dir)
```

To compute the mAP:

- Copy the ground-truth files into the folder input/ground-truth/
- Copy the detection-results files into the folder input/detection-results/
- Run `python mainmAP.py`
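
The evaluation script reads one text file per image from each folder. A common convention for such VOC mAP scripts is one line per box, `class confidence left top right bottom` for detections (ground truth omits the confidence); whether mainmAP.py expects exactly this format is an assumption. A sketch of writing a detection-results file under that assumption:

```python
# Hypothetical helper that writes one detection-results file per image
# in the "class confidence left top right bottom" line format commonly
# used by VOC mAP scripts. The exact format mainmAP.py expects is an
# assumption; check the script before relying on this.
import os

def write_detections(image_id, detections, out_dir="input/detection-results"):
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, f"{image_id}.txt"), "w") as f:
        for class_name, confidence, left, top, right, bottom in detections:
            f.write(f"{class_name} {confidence:.4f} "
                    f"{left} {top} {right} {bottom}\n")

# Example usage with one detected box.
write_detections("000001", [("dog", 0.87, 48, 240, 195, 371)])
```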

The mAP output can be viewed in the output folder.
The precision-recall curve for the diningtable class is shown below; curves for the remaining classes can also be found in the output folder.

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

(Precision-recall curve for the diningtable class.)
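
As a quick illustration of how these counts turn into precision/recall values and then into an average precision, here is a sketch of the standard PASCAL 11-point interpolation (not necessarily the exact procedure in mainmAP.py):

```python
# A sketch of PASCAL VOC-style 11-point interpolated average precision,
# computed from precision/recall pairs. Illustrates the formulas above;
# not necessarily the exact procedure used by mainmAP.py.
def voc_ap_11point(recalls, precisions):
    ap = 0.0
    for t in [i / 10 for i in range(11)]:  # recall thresholds 0.0 .. 1.0
        # Interpolated precision: the best precision at recall >= t.
        p = max((p for r, p in zip(recalls, precisions) if r >= t),
                default=0.0)
        ap += p / 11
    return ap

# Worked example of the formulas above: 8 true positives,
# 2 false positives, 4 false negatives.
tp, fp, fn = 8, 2, 4
precision = tp / (tp + fp)   # 0.8
recall = tp / (tp + fn)      # ~0.667
print(precision, recall, voc_ap_11point([recall], [precision]))
```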