Project developed for the class SEM5952 - Neural Networks and Machine Learning, from the Graduate Program in Mechanical Engineering at EESC-USP.
Carlos André Persiani Filho
María José Burbano Guzmán
Vincent Edward Wong Díaz
The São Carlos campus of the University of São Paulo has a large array of cameras, but not enough security personnel to monitor them all. A system that draws attention to the cameras showing possibly anomalous activity could therefore make surveillance more effective. We present a proof-of-concept implementation of such a system, based on the Inception v2 neural network.
Given a set of video or screen inputs (the cameras), the algorithm uses Inception v2 to detect classes of interest (a person, or selected objects). Each class has a weight associated with it. The code ranks the inputs by an attention score, computed as the weighted sum of all detections in that input. The camera streams are then composed into a single display in which the highest-scoring camera is in focus and takes up the largest portion of the screen.
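The ranking step can be illustrated with a minimal sketch. It assumes each camera's detections arrive as a plain list of class labels; the weight values and the names `CLASS_WEIGHTS`, `attention_score` and `rank_cameras` are hypothetical and not taken from the project code.

```python
# Hypothetical class weights, for illustration only; the project's real
# weights are not given in this document.
CLASS_WEIGHTS = {"person": 1.0, "backpack": 2.0, "knife": 5.0}


def attention_score(labels, weights=CLASS_WEIGHTS):
    """Weighted sum over detected class labels; unlisted classes add 0."""
    return sum(weights.get(label, 0.0) for label in labels)


def rank_cameras(labels_by_camera, weights=CLASS_WEIGHTS):
    """Return (camera names sorted by descending score, score per camera)."""
    scores = {
        camera: attention_score(labels, weights)
        for camera, labels in labels_by_camera.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Example input: three hypothetical cameras with their detected labels.
example = {
    "cam_a": ["person"],
    "cam_b": ["person", "knife"],
    "cam_c": [],
}
ranking, scores = rank_cameras(example)
```

In this sketch the first entry of `ranking` is the camera that would be put into focus on the display.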
On the video streams, interactions between people and objects of interest are highlighted: the bounding boxes involved are drawn in red to draw attention to the action. All other detections are drawn in blue, so that every detected instance remains visible.
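This document does not say how an interaction is decided. The sketch below assumes, purely for illustration, that a person and an object interact when their bounding boxes overlap. The box format `(x_min, y_min, x_max, y_max)`, the BGR colour tuples and the function names are likewise assumptions, not the project's actual code.

```python
# Colours as BGR tuples (an assumption, matching the OpenCV convention).
RED = (0, 0, 255)
BLUE = (255, 0, 0)


def boxes_overlap(a, b):
    """True if two (x_min, y_min, x_max, y_max) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def colour_detections(detections):
    """Assign RED to overlapping person/object pairs and BLUE to the rest.

    `detections` is a list of (label, box) pairs; the result is a list of
    colours in the same order.
    """
    interacting = set()
    for i, (label_i, box_i) in enumerate(detections):
        if label_i != "person":
            continue
        for j, (label_j, box_j) in enumerate(detections):
            if label_j != "person" and boxes_overlap(box_i, box_j):
                interacting.update((i, j))
    return [RED if k in interacting else BLUE for k in range(len(detections))]


# Example: one person overlapping a backpack, one person standing apart.
example_detections = [
    ("person", (0, 0, 10, 10)),
    ("backpack", (5, 5, 15, 15)),
    ("person", (100, 100, 110, 110)),
]
colours = colour_detections(example_detections)
```

A stricter criterion, such as a minimum overlap ratio or persistence over several frames, would reduce false highlights from people merely walking past an object.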
The code generates a log that lists each camera's attention score and every object involved in an interaction, to aid later review of the recorded activity.
The project uses a Faster-RCNN Inception v2 network, trained using the COCO dataset. The algorithm was tested and debugged using the VIRAT Video Dataset as inputs.
In the project directory, execute the command below to run the main code. The camera inputs must be set in the code, as entries of the list cameras.
python main.py
The file settings.py contains most of the configuration options for the algorithm. The libraries that require specific versions are listed in requirements.txt.