Capstone project for the Metis Data Science Bootcamp: Species Identification with Convolutional Neural Networks
The goal: build a model accurate enough to automate the processing of trail camera imagery, with the ability to push a notification when a specific animal is identified.
Notebook | Description
---|---
cnn-model.ipynb | Convolutional Neural Network model
download-process.ipynb | Download & processing with the Microsoft Cognitive Services API
label-rekognition.ipynb | Image label verification with AWS Rekognition
exif-to-dataframe.ipynb | Convert image EXIF data to a DataFrame
If a Jupyter notebook does not render properly on GitHub, paste its URL into https://nbviewer.jupyter.org/.
Trail cameras, also known as game cameras, automatically take photos when motion is detected. They have a variety of applications and their popularity has driven the market size to $60B worldwide - a number expected to double over the next decade.
While these cameras are automatic, they are not yet intelligent. When the SD card is retrieved from the field, the user is typically met with thousands of photographs to review. These images are often low quality or redundant, making it a labor-intensive process to locate animals of interest.
Microsoft Cognitive Services’ Bing Image Search API was used to download over 5,000 training images to AWS S3. AWS Rekognition was used to verify the image labels (see blog post), correctly eliminating 20% of the directory's files. All processing was done on an m8.large EC2 GPU instance running Ubuntu, which also hosted JupyterLab running the model, built on Keras with a TensorFlow backend. All visuals were produced with Tableau.
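The label-verification step can be sketched as a small predicate plus a Rekognition call via boto3. The confidence threshold, bucket, and species names below are illustrative assumptions, not the project's actual values:

```python
def label_matches(detected_labels, expected, min_confidence=80.0):
    """True if Rekognition's labels include the expected species at or
    above min_confidence (threshold value is an assumption)."""
    return any(
        lbl["Name"].lower() == expected.lower() and lbl["Confidence"] >= min_confidence
        for lbl in detected_labels
    )

def verify_s3_image(bucket, key, expected):
    """Ask Rekognition to label an S3 object, then apply the predicate above."""
    import boto3  # deferred import so label_matches works without AWS credentials
    client = boto3.client("rekognition")
    resp = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
    )
    return label_matches(resp["Labels"], expected)

# label_matches([{"Name": "Deer", "Confidence": 91.2}], "deer")  -> True
```

Images whose labels fail the check can then be deleted from the training directory, which is how roughly 20% of the downloaded files were weeded out.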
After iterating through several Sequential variants, I settled on a CNN built on VGG16 pretrained on ImageNet. I applied transfer learning by adding two dense layers and used seven kinds of image augmentation to compensate for a relatively small dataset, achieving an accuracy of 81%.
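A minimal sketch of that transfer-learning setup is below. The specific augmentation values, dense-layer width, and species count are assumptions for illustration; the source only states that two dense layers and seven kinds of augmentation were used:

```python
# Seven kinds of augmentation (specific values are illustrative assumptions).
AUGMENTATION = {
    "rotation_range": 20,
    "width_shift_range": 0.1,
    "height_shift_range": 0.1,
    "shear_range": 0.1,
    "zoom_range": 0.2,
    "brightness_range": (0.8, 1.2),
    "horizontal_flip": True,
}

try:
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import Dense, Flatten
    from tensorflow.keras.models import Model
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Frozen VGG16 convolutional base, pretrained on ImageNet.
    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False

    # Two added dense layers on top of the frozen base.
    x = Flatten()(base.output)
    x = Dense(256, activation="relu")(x)
    out = Dense(5, activation="softmax")(x)  # one unit per species (count assumed)

    model = Model(inputs=base.input, outputs=out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])

    train_gen = ImageDataGenerator(rescale=1.0 / 255, **AUGMENTATION)
except ImportError:
    pass  # TensorFlow not installed; the config above still documents the idea
```

Freezing the base means only the two new dense layers are trained, which is what lets a few thousand images go a long way.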
I applied this model to 8,000 images from my personal game cameras, captured over just three months - see the problem? Combining the model's species identification with each image's EXIF data (the metadata embedded when a photograph is taken) yields time series analytics by species, which I refer to as "animalytics."
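The core of that combination can be sketched in plain Python: parse the EXIF `DateTimeOriginal` string and bucket predictions by hour. The record shape and sample timestamps below are hypothetical, not project data:

```python
from collections import Counter
from datetime import datetime

def parse_exif_timestamp(raw):
    """EXIF DateTimeOriginal uses the 'YYYY:MM:DD HH:MM:SS' format."""
    return datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")

def hourly_distribution(records, species):
    """records: (predicted_species, exif_timestamp) pairs -- an assumed shape.
    Returns a Counter mapping hour of day -> sighting count."""
    return Counter(
        parse_exif_timestamp(ts).hour
        for sp, ts in records
        if sp == species
    )

# Tiny fabricated sample for illustration:
sample = [
    ("deer", "2019:06:14 05:32:10"),
    ("deer", "2019:06:15 05:47:01"),
    ("hog",  "2019:06:15 23:10:44"),
]
print(hourly_distribution(sample, "deer"))  # Counter({5: 2})
```

The same per-hour counts, joined with weather or moon-phase data, drive the charts discussed below.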
Below is the hour-of-day distribution of deer visits to a feeder, as identified in the photos.
Overlaying historic weather data reveals an inverse relationship between rain and feeding patterns. This may reflect a lack of alternative food sources when grass diminishes in the drier summer months.
Conventional wisdom says animals are more active during full moons; this dataset tells a different story. Activity was highest during the waning gibbous phase.
The model was deployed in a live demo of the text-notification feature, built with AWS Simple Notification Service (SNS) and the Python SDK Boto3. Users receive a notification on their mobile device when the neural network positively identifies a preselected animal species. The image below links to a video demonstration.
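A sketch of that notification path: a pure gate deciding whether a prediction warrants an alert, plus the SNS publish call. The topic ARN, watchlist, threshold, and message text are placeholders, not the demo's actual values:

```python
def should_notify(species, confidence, watchlist, threshold=0.9):
    """Only alert on preselected species above a confidence cutoff
    (the threshold value is an assumption)."""
    return species in watchlist and confidence >= threshold

def send_sms_alert(topic_arn, species, confidence):
    """Publish an alert to an SNS topic subscribed to by the user's phone."""
    import boto3  # deferred import so should_notify works without AWS credentials
    sns = boto3.client("sns")
    sns.publish(
        TopicArn=topic_arn,
        Message=f"Trail camera alert: {species} detected ({confidence:.0%} confidence)",
    )

# should_notify("deer", 0.95, {"deer"})  -> True
# should_notify("hog", 0.95, {"deer"})  -> False
```

Subscribing a phone number to the topic via SMS is what turns each qualifying model prediction into a text message.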