Fall Detection - alternative infer input (#397)

* Removed returning of detection lines and infer now takes list of poses in addition to images * Removed drawing of detection lines and added bbox on detected person even when not fallen * Updated fall det readme appropriately * Fixed ROS1 fall det node based on modified infer output * Fixed ROS2 fall det node based on modified infer output * Added tests cases for using infer with pose lists * Fixed webcam demo based on new infer and improved its drawing * Fixed and improved jupyter notebook tutorial * Removed unused import * Fixed hr pose estimation test according to latest changes in the infer method --------- Co-authored-by: ad-daniel <[email protected]>
opendr-eu · Feb 21, 2023 · 968706e · 968706e
1 parent 1e4238a
commit 968706e
Show file tree

Hide file tree

Showing 9 changed files with 175 additions and 120 deletions.
diff --git a/docs/reference/fall-detection.md b/docs/reference/fall-detection.md
@@ -6,7 +6,7 @@ The *fall_detection* module contains the *FallDetectorLearner* class, which inhe
 Bases: `engine.learners.Learner`
 
 The *FallDetectorLearner* class contains the implementation of a rule-based fall detector algorithm.
-It can be used to perform fall detection on images (inference) using a pose estimator.
+It can be used to perform fall detection on images (inference) using a pose estimator or perform fall detection straight on poses.
 
 This rule-based method can provide **cheap and fast** fall detection capabilities when pose estimation
 is already being used. Its inference time cost is ~0.1% of pose estimation, adding negligible overhead.
@@ -27,8 +27,20 @@ FallDetectorLearner(self, pose_estimator)
 
 Constructor parameters:
 
+- **pose_estimator**: *object, default=None*\
+  Optionally provided pose estimator class used to detect poses that the fall detector uses to determine if a person has fallen.
+
+#### `FallDetectorLearner.initialize`
+```python
+FallDetectorLearner.initialize(self, pose_estimator)
+```
+
+This method is used to initialize the fall detector with a pose estimator, in case one was not provided in the constructor.
+
+Parameters:
+
 - **pose_estimator**: *object*\
-  The provided pose estimator class used to detect poses that the fall detector uses to determine if a person has fallen.
+  Provided pose estimator class used to detect poses that the fall detector uses to determine if a person has fallen.
 
 #### `FallDetectorLearner.eval`
 ```python
@@ -48,20 +60,22 @@ Parameters:
 
 #### `FallDetectorLearner.infer`
 ```python
-FallDetectorLearner.infer(self, img)
+FallDetectorLearner.infer(self, input_)
 ```
 
-This method is used to perform fall detection on an image.
-Returns a list of tuples, one for every person detected, that each contains an `engine.target.Category`, a list of three keypoints that define two lines that are used to determine if the person has fallen, and the complete poses that were detected.
+This method is used to perform fall detection on provided input.
+The input can be either an opendr.engine.data.Image object, an OpenCV image type (ndarray) or a list of poses.
+The list of poses input is helpful when the pose estimator is run externally.
+Returns a list of tuples, one for every person detected, that each contains an `engine.target.Category`, and the corresponding pose that was detected.
 It returns an empty list if no pose detections were made.
 
 The `engine.target.Category` is `1` if person has fallen, `-1` if person is standing and `0` if a person is detected, but
 the algorithm is unable to detect if person is standing or fallen.
 
 Parameters:
 
-- **img**: *object*\
-  Object of type engine.data.Image.
+- **input_**: *object*\
+  Object of type engine.data.Image, ndarray or a list of poses.
 
 #### `FallDetectorLearner.download`
 ```python
@@ -84,7 +98,7 @@ Parameters:
 
 #### Examples
 
-* **Inference and result drawing example on a test image using OpenCV.**
+* **Inference using the pose estimator internally and result drawing example on a test image using OpenCV.**
   ```python
   import cv2
   from opendr.engine.data import Image
@@ -104,7 +118,7 @@ Parameters:
   img = Image.open("test_images/fallen.png")
   detections = fall_detector.infer(img)
   fallen = detections[0][0].data  # Get fallen int from first detection
-  pose = detections[0][2]  # Get pose from first detection
+  pose = detections[0][1]  # Get pose from first detection
   img = img.opencv()
   draw(img, pose)  # Draw the detected pose
 
@@ -115,3 +129,37 @@ Parameters:
   cv2.imshow('Result', img)
   cv2.waitKey(0)
   ```
+
+
+* **Inference using the pose estimator externally, passing a list of poses to infer and result drawing example on a test image using OpenCV.**
+  ```python
+  import cv2
+  from opendr.engine.data import Image
+  from opendr.perception.fall_detection import FallDetectorLearner
+  from opendr.perception.pose_estimation import LightweightOpenPoseLearner
+  from opendr.perception.pose_estimation import draw, get_bbox
+
+  pose_estimator = LightweightOpenPoseLearner(device="cuda", mobilenet_use_stride=False)
+  pose_estimator.download(verbose=True)  # Download the default pretrained mobilenet model
+  pose_estimator.load("openpose_default")
+
+  fall_detector = FallDetectorLearner()
+
+  # Download a sample dataset
+  fall_detector.download(verbose=True)
+
+  img = Image.open("test_images/fallen.png")
+  poses = pose_estimator.infer(img)
+  detections = fall_detector.infer(poses)
+  fallen = detections[0][0].data  # Get fallen int from first detection
+  pose = detections[0][1]  # Get pose from first detection
+  img = img.opencv()
+  draw(img, pose)  # Draw the detected pose
+
+  if fallen == 1:
+      x, y, w, h = get_bbox(pose)
+      cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
+      cv2.putText(img, "Detected fallen person", (5, 12), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
+  cv2.imshow('Result', img)
+  cv2.waitKey(0)
+  ```
diff --git a/projects/opendr_ws/src/opendr_perception/scripts/fall_detection_node.py b/projects/opendr_ws/src/opendr_perception/scripts/fall_detection_node.py
@@ -111,7 +111,7 @@ def callback(self, data):
             fallen = detection[0].data
 
             if fallen == 1:
-                pose = detection[2]
+                pose = detection[1]
                 x, y, w, h = get_bbox(pose)
                 if self.image_publisher is not None:
                     # Paint person bounding box inferred from pose

diff --git a/projects/opendr_ws_2/src/opendr_perception/opendr_perception/fall_detection_node.py b/projects/opendr_ws_2/src/opendr_perception/opendr_perception/fall_detection_node.py
@@ -106,7 +106,7 @@ def callback(self, data):
         fallen_pose_id = 0
         for detection in detections:
             fallen = detection[0].data
-            pose = detection[2]
+            pose = detection[1]
             x, y, w, h = get_bbox(pose)
 
             if fallen == 1:

diff --git a/projects/python/perception/fall_detection/demos/inference_demo.py b/projects/python/perception/fall_detection/demos/inference_demo.py
@@ -22,7 +22,7 @@
 from opendr.perception.pose_estimation import draw, get_bbox
 
 
-def fall_detection_on_img(img, draw_pose=False, draw_fall_detection_lines=False):
+def fall_detection_on_img(img, draw_pose=False):
     detections = fall_detector.infer(img)
     img = img.opencv()
 
@@ -35,8 +35,7 @@ def fall_detection_on_img(img, draw_pose=False, draw_fall_detection_lines=False)
     else:
         for detection in detections:
             fallen = detection[0].data
-            keypoints = detection[1]
-            pose = detection[2]
+            pose = detection[1]
             print("- Detected person.")
             if fallen == 1:
                 print("  Detected fallen person.")
@@ -48,21 +47,19 @@ def fall_detection_on_img(img, draw_pose=False, draw_fall_detection_lines=False)
             if draw_pose:
                 draw(img, pose)
 
-            color = (255, 255, 255)
             if fallen == 1:
                 color = (0, 0, 255)
                 x, y, w, h = get_bbox(pose)
                 cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
                 cv2.putText(img, "Detected fallen person", (5, 12), cv2.FONT_HERSHEY_SIMPLEX,
                             0.5, color, 1, cv2.LINE_AA)
+            elif fallen == -1:
+                color = (0, 255, 0)
+                x, y, w, h = get_bbox(pose)
+                cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
+                cv2.putText(img, "Detected standing person", (5, 12), cv2.FONT_HERSHEY_SIMPLEX,
+                            0.5, color, 1, cv2.LINE_AA)
 
-            if draw_fall_detection_lines:
-                if keypoints[0].data[0] != -1:
-                    cv2.line(img, (int(keypoints[0].x), int(keypoints[0].y)),
-                             (int(keypoints[1].x), int(keypoints[1].y)), color, 4)
-                if keypoints[2].data[0] != -1:
-                    cv2.line(img, (int(keypoints[1].x), int(keypoints[1].y)),
-                             (int(keypoints[2].x), int(keypoints[2].y)), color, 4)
         cv2.imshow('Results', img)
         cv2.waitKey(0)
 
@@ -84,8 +81,8 @@ def fall_detection_on_img(img, draw_pose=False, draw_fall_detection_lines=False)
     fall_detector.download(".", verbose=True)
 
     print("Running detector on image without humans...")
-    fall_detection_on_img(Image.open("test_images/no_person.png"), True)
+    fall_detection_on_img(Image.open("test_images/no_person.png"))
     print("Running detector on image with a fallen person...")
-    fall_detection_on_img(Image.open("test_images/fallen.png"), True)
+    fall_detection_on_img(Image.open("test_images/fallen.png"))
     print("Running detector on image with a standing person...")
-    fall_detection_on_img(Image.open("test_images/standing.png"), True)
+    fall_detection_on_img(Image.open("test_images/standing.png"))
diff --git a/projects/python/perception/fall_detection/demos/inference_tutorial.ipynb b/projects/python/perception/fall_detection/demos/inference_tutorial.ipynb
diff --git a/projects/python/perception/fall_detection/demos/webcam_demo.py b/projects/python/perception/fall_detection/demos/webcam_demo.py
@@ -77,27 +77,29 @@ def __next__(self):
 
             for detection in detections:
                 fallen = detection[0].data
-                keypoints = detection[1]
-                pose = detection[2]
+                pose = detection[1]
 
                 if draw_poses:
                     draw(img, pose)
 
-                color = (255, 255, 255)
                 if fallen == 1:
                     color = (0, 0, 255)
                     x, y, w, h = get_bbox(pose)
                     cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
-                    cv2.putText(img, "Detected fallen person", (5, 55), cv2.FONT_HERSHEY_SIMPLEX,
-                                0.75, color, 1, cv2.LINE_AA)
-
-                if draw_poses:
-                    if keypoints[0].data[0] != -1:
-                        cv2.line(img, (int(keypoints[0].x), int(keypoints[0].y)),
-                                 (int(keypoints[1].x), int(keypoints[1].y)), color, 4)
-                    if keypoints[2].data[0] != -1:
-                        cv2.line(img, (int(keypoints[1].x), int(keypoints[1].y)),
-                                 (int(keypoints[2].x), int(keypoints[2].y)), color, 4)
+                    cv2.putText(img, "Detected fallen person", (x, y-5), cv2.FONT_HERSHEY_SIMPLEX,
+                                0.5, color, 1, cv2.LINE_AA)
+                elif fallen == -1:
+                    color = (0, 255, 0)
+                    x, y, w, h = get_bbox(pose)
+                    cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
+                    cv2.putText(img, "Detected standing person", (x, y-5), cv2.FONT_HERSHEY_SIMPLEX,
+                                0.5, color, 1, cv2.LINE_AA)
+                elif fallen == 0:
+                    color = (255, 255, 255)
+                    x, y, w, h = get_bbox(pose)
+                    cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
+                    cv2.putText(img, "Unable to detect if fallen", (x, y-5), cv2.FONT_HERSHEY_SIMPLEX,
+                                0.5, color, 1, cv2.LINE_AA)
             cv2.imshow('Result', img)
             cv2.waitKey(1)
             counter += 1

diff --git a/src/opendr/perception/fall_detection/fall_detector_learner.py b/src/opendr/perception/fall_detection/fall_detector_learner.py
@@ -17,19 +17,21 @@
 from urllib.request import urlretrieve
 
 from tqdm import tqdm
-from numpy import arctan2, linalg, rad2deg
+from numpy import arctan2, linalg, rad2deg, ndarray
 
 from opendr.engine.constants import OPENDR_SERVER_URL
 from opendr.engine.learners import Learner
 from opendr.engine.datasets import ExternalDataset, DatasetIterator
-from opendr.engine.target import Category, Keypoint
+from opendr.engine.target import Category
 from opendr.engine.data import Image
 
 
 class FallDetectorLearner(Learner):
-    def __init__(self, pose_estimator):
+    def __init__(self, pose_estimator=None):
         super().__init__()
+        self.pose_estimator = pose_estimator
 
+    def initialize(self, pose_estimator):
         self.pose_estimator = pose_estimator
 
     def fit(self, dataset, val_dataset=None, logging_path='', silent=True, verbose=True):
@@ -46,6 +48,9 @@ def eval(self, dataset, verbose=True):
         Lastly, it returns the absolute number of frames where the pose detection was entirely missed.
 
         """
+        if self.pose_estimator is None:
+            raise AttributeError("self.pose_estimator is None, please initialize the fall detector learner by "
+                                 "providing a pose estimator using initialize(pose_estimator) first.")
         data = self.__prepare_val_dataset(dataset)
 
         tp = 0
@@ -216,16 +221,27 @@ def download(path=None, mode="test_data", verbose=False,
             if verbose:
                 print("Test data download complete.")
 
-    def infer(self, img):
-        poses = self.pose_estimator.infer(img)
-        results = []
-        for pose in poses:
-            results.append(self.__naive_fall_detection(pose))
+    def infer(self, input_):
+        if type(input_) is Image or type(input_) is ndarray:
+            if self.pose_estimator is None:
+                raise AttributeError("self.pose_estimator is None, please initialize the fall detector learner by "
+                                     "providing a pose estimator in the learner's constructor or by "
+                                     "using fall_detector.initialize(pose_estimator).")
+            poses = self.pose_estimator.infer(input_)
+            results = []
+            for pose in poses:
+                results.append(self.__naive_fall_detection(pose))
+            return results
 
-        if len(results) >= 1:
+        elif type(input_) is list:
+            results = []
+            for pose in input_:
+                results.append(self.__naive_fall_detection(pose))
             return results
 
-        return []
+        else:
+            raise ValueError(f"Input provided is wrong type: {type(input_)}, please provide one of "
+                             f"{Image}, {ndarray} (opencv image) or a list of poses.")
 
     @staticmethod
     def __get_angle_to_horizontal(v1, v2):
@@ -242,20 +258,20 @@ def __naive_fall_detection(self, pose):
         """
         # Hip detection, hip average serves as the middle point of the pose
         if pose["r_hip"][0] != -1 and pose["l_hip"][0] != -1:
-            hips = (pose["r_hip"] + pose["l_hip"])/2
+            hips = (pose["r_hip"] + pose["l_hip"]) / 2
         elif pose["r_hip"][0] != -1:
             hips = pose["r_hip"]
         elif pose["l_hip"][0] != -1:
             hips = pose["l_hip"]
         else:
             # Can't detect fall without hips
-            return Category(0), [Keypoint([-1, -1]), Keypoint([-1, -1]), Keypoint([-1, -1])], pose
+            return Category(0), pose
 
         # Figure out head average position
         head = [-1, -1]
         # Try to detect approximate head position from shoulders, eyes, neck
         if pose["r_eye"][0] != -1 and pose["l_eye"][0] != -1 and pose["neck"][0] != -1:  # Eyes and neck detected
-            head = (pose["r_eye"] + pose["l_eye"] + pose["neck"])/3
+            head = (pose["r_eye"] + pose["l_eye"] + pose["neck"]) / 3
         elif pose["r_eye"][0] != -1 and pose["l_eye"][0] != -1:  # Eyes detected
             head = (pose["r_eye"] + pose["l_eye"]) / 2
         elif pose["r_sho"][0] != -1 and pose["l_sho"][0] != -1:  # Shoulders detected
@@ -313,21 +329,21 @@ def __naive_fall_detection(self, pose):
 
         if calves_vertical != -1:
             if calves_vertical == 0:  # Calves are not vertical, so person has fallen
-                return Category(1), [Keypoint(head), Keypoint(hips), Keypoint(legs)], pose
+                return Category(1), pose
 
         if legs_vertical != -1:
             if legs_vertical == 0:  # Legs are not vertical, probably not under torso, so person has fallen
-                return Category(1), [Keypoint(head), Keypoint(hips), Keypoint(legs)], pose
+                return Category(1), pose
             elif legs_vertical == 1:  # Legs are vertical, so person is standing
-                return Category(-1), [Keypoint(head), Keypoint(hips), Keypoint(legs)], pose
+                return Category(-1), pose
         elif torso_vertical != -1:
             if torso_vertical == 0:  # Torso is not vertical, without legs we assume person has fallen
-                return Category(1), [Keypoint(head), Keypoint(hips), Keypoint(legs)], pose
+                return Category(1), pose
             elif torso_vertical == 1:  # Torso is vertical, without legs we assume person is standing
-                return Category(-1), [Keypoint(head), Keypoint(hips), Keypoint(legs)], pose
+                return Category(-1), pose
         else:
             # Only hips detected, can't detect fall
-            return Category(0), [Keypoint([-1, -1]), Keypoint([-1, -1]), Keypoint([-1, -1])], pose
+            return Category(0), pose
 
 
 class URFallDatasetIterator(DatasetIterator):

diff --git a/tests/sources/tools/perception/fall_detection/test_fall_detector.py b/tests/sources/tools/perception/fall_detection/test_fall_detector.py
@@ -82,14 +82,23 @@ def test_infer(self):
         img = Image.open(os.path.join(self.temp_dir, "test_images", "fallen.png"))
         # Detector should detect fallen person on fallen.png
         self.assertTrue(self.fall_detector.infer(img)[0][0].data == 1,
-                        msg="Fall detector didn't detect fallen person on fallen.png")
+                        msg="Fall detector didn't detect fallen person on provided image fallen.png")
+        poses = self.pose_estimator.infer(img)
+        self.assertTrue(self.fall_detector.infer(poses)[0][0].data == 1,
+                        msg="Fall detector didn't detect fallen person on poses provided for fallen.png")
 
         img = Image.open(os.path.join(self.temp_dir, "test_images", "standing.png"))
         # Detector should detect standing person on standing.png
         self.assertTrue(self.fall_detector.infer(img)[0][0].data == -1,
                         msg="Fall detector didn't detect standing person on standing.png")
+        poses = self.pose_estimator.infer(img)
+        self.assertTrue(self.fall_detector.infer(poses)[0][0].data == -1,
+                        msg="Fall detector didn't detect standing person on poses provided for standing.png")
 
         img = Image.open(os.path.join(self.temp_dir, "test_images", "no_person.png"))
         # Detector should not detect fallen nor standing person on no_person.png
         self.assertTrue(len(self.fall_detector.infer(img)) == 0,
                         msg="Fall detector detected fallen or standing person on no_person.png")
+        poses = self.pose_estimator.infer(img)
+        self.assertTrue(len(self.fall_detector.infer(poses)) == 0,
+                        msg="Fall detector detected fallen or standing person on poses provided for no_person.png")
diff --git a/...n/pose_estimation/high_resolution_pose_estimation/test_high_resolution_pose_estimation.py b/...n/pose_estimation/high_resolution_pose_estimation/test_high_resolution_pose_estimation.py
@@ -87,7 +87,7 @@ def test_infer(self):
 
         img = Image.open(os.path.join(self.temp_dir, "dataset", "image", "000000000785_1080.jpg"))
         # Default pretrained mobilenet model detects 18 keypoints on img with id 785
-        self.assertGreater(len(self.pose_estimator.infer(img)[0].data), 0,
+        self.assertGreater(len(self.pose_estimator.infer(img)[0][0].data), 0,
                            msg="Returned pose must have non-zero number of keypoints.")