
Strange result for face detection gpu demo #9

Closed
xuguozhi opened this issue Jul 16, 2019 · 20 comments

Comments

@xuguozhi

Screenshot_2019-07-16-13-49-43-467_com.google.mediapipe.apps.facedetectiongpu.png
Hi, I have encountered strange face detection results with the face detection GPU demo; I have uploaded a screenshot above. Any suggestions? BTW, my phone runs Android 8.0.0.

@xuguozhi
Author

The results of the Face Detection CPU demo are fine.

@mgyong

mgyong commented Jul 16, 2019

@xuguozhi Why does the first screenshot show the bounding box positioned incorrectly? Is this only temporary?

@jiuqiant
Contributor

jiuqiant commented Jul 16, 2019

Thanks for reporting. We are aware of this rendering issue; another user has already reported the same problem on a Xiaomi Mix 2S. We believe it's a GPU rendering issue inside the AnnotationOverlayCalculator on certain Android phones. Unfortunately, we have not been able to reproduce it on our testing devices. To help us reproduce it, please let us know your device model if possible. Thanks.

@xuguozhi
Author

xuguozhi commented Jul 16, 2019

> @xuguozhi Why does the first screenshot show the bounding box positioned incorrectly? Is this only temporary?

@mgyong It's not temporary; it always shows like that. My testing phone is a Xiaomi Black Shark: https://www.mi.com/blackshark-game2/

@xuguozhi
Author

xuguozhi commented Jul 16, 2019

> Thanks for reporting. We are aware of this rendering issue; another user has already reported the same problem on a Xiaomi Mix 2S. We believe it's a GPU rendering issue inside the AnnotationOverlayCalculator on certain Android phones. Unfortunately, we have not been able to reproduce it on our testing devices. To help us reproduce it, please let us know your device model if possible. Thanks.

@jiuqiant Xiaomi Black Shark: https://www.mi.com/blackshark-game2/

@jiuqiant
Contributor

@xuguozhi, we are able to reproduce the problem on a Redmi Note 4. We will be working on a fix. Thanks.

@mcclanahoochie

mcclanahoochie commented Jul 16, 2019

Hi @xuguozhi, I think the root cause has been found:

Some Xiaomi phones use an odd camera image resolution (1269x1692) by default, and MediaPipe's GpuBuffer assumes all texture dimensions are evenly divisible by 4.

I can provide a temporary workaround until a proper solution is found.
Please make the following edits:

  • In tflite_tensors_to_detections_calculator.cc, line 631, change num_coords_ to kNumCoordsPerBox, so that the line reads size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;

  • In CameraXPreviewHelper.java, line 49, add .setTargetResolution(new Size(600,800)) to the builder, so that the line reads new PreviewConfig.Builder().setLensFacing(cameraLensFacing).setTargetResolution(new Size(600,800)).build();

The first fix (num coords) addresses a real bug, broader than this issue, and fixes a memory allocation problem.

The second fix (Java) is a way to request a different-sized camera texture. You can play around with the requested size, but the goal is to have the camera deliver an image whose dimensions are multiples of 4. In my experiments, the size requested here results in a 1080x1920 image.
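
For reference, a rough sketch of where that builder change would sit; the class and method names below are placeholders for illustration, not the actual CameraXPreviewHelper.java source, and it assumes the alpha CameraX API in use at the time:

import android.util.Size;
import androidx.camera.core.CameraX;
import androidx.camera.core.PreviewConfig;

// Sketch only: placeholder class, not MediaPipe code.
class CameraXPreviewHelperSketch {
  PreviewConfig buildPreviewConfig(CameraX.LensFacing cameraLensFacing) {
    return new PreviewConfig.Builder()
        .setLensFacing(cameraLensFacing)
        // Request a small target resolution; CameraX negotiates a supported size from it.
        // On the Xiaomi phone above, 600x800 resolved to 1080x1920, whose dimensions
        // are both divisible by 4.
        .setTargetResolution(new Size(600, 800))
        .build();
  }
}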

Hope that helps, and thanks for working with us to help make MediaPipe better!

@xuguozhi
Author

xuguozhi commented Jul 17, 2019

Hi @mcclanahoochie

Sorry, at line 624, should it be
size_t raw_boxes_length = num_boxes_ * kNumCoordsPerBox; or size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;?

@jiuqiant
Contributor

jiuqiant commented Jul 17, 2019

@xuguozhi, sorry for the confusion.
Please modify line 631 of tflite_tensors_to_detections_calculator.cc to read size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;

@xuguozhi
Author

@jiuqiant @mcclanahoochie

> @xuguozhi, sorry for the confusion.
> Please modify line 631 of tflite_tensors_to_detections_calculator.cc to read size_t raw_anchors_length = num_boxes_ * kNumCoordsPerBox;

It doesn't work; the boxes in the face detection GPU and object detection GPU demos are drawn in red, but they are not proper rectangles. They look like rectangles with an incorrect affine transformation applied. However, the CPU versions of both face detection and object detection work fine.

@mcclanahoochie

Ah, sorry about the line number mix-up; fixed.

Seeing red squares is progress!

Did you also modify the camera size? Can you verify the new resolution? Another screenshot may also help.
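
If it helps with verifying, here is one possible way to log the resolution the camera actually delivers; this is only a sketch assuming the alpha CameraX Preview API, not code from the repo:

import android.util.Log;
import androidx.camera.core.Preview;

// Sketch only: logs the camera texture size each time the preview output updates.
// Note: setOnPreviewOutputUpdateListener replaces any previously set listener.
class PreviewResolutionLogger {
  static void attach(Preview preview) {
    preview.setOnPreviewOutputUpdateListener(
        previewOutput ->
            Log.i("FaceDetectionGpu", "Camera texture size: " + previewOutput.getTextureSize()));
  }
}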

@xuguozhi
Author

> Ah, sorry about the line number mix-up; fixed.
>
> Seeing red squares is progress!
>
> Did you also modify the camera size? Can you verify the new resolution? Another screenshot may also help.

Hi @mcclanahoochie,
I have modified the camera size to new PreviewConfig.Builder().setLensFacing(cameraLensFacing).setTargetResolution(new Size(600,800)).build();
and the screenshot looks like this:
Screenshot_2019-07-17-10-44-50-007_com.google.mediapipe.apps.facedetectiongpu.png

@mcclanahoochie

What is the resolution of the camera frames (before and after the Java edit)?

Another option, instead of the Java edit, is to insert an ImageTransformationCalculator node at the beginning of the graph to resize the image to a known size.
It would look like this (comments removed, 1200x1600 chosen based on the 1269x1692 frames on the phone here):


input_stream: "input_video"
output_stream: "output_video"

node { 
  calculator: "RealTimeFlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:detections"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video_0"
}

node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video_0"
  output_stream: "IMAGE_GPU:throttled_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 1200
      output_height: 1600
    }
  }
}

node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  output_stream: "IMAGE_GPU:transformed_input_video"
  output_stream: "LETTERBOX_PADDING:letterbox_padding"
  node_options: {
    [type.googleapis.com/mediapipe.ImageTransformationCalculatorOptions] {
      output_width: 128
      output_height: 128
      scale_mode: FIT
    }
  }
}

node {
  calculator: "TfLiteConverterCalculator"
  input_stream: "IMAGE_GPU:transformed_input_video"
  output_stream: "TENSORS_GPU:image_tensor"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteConverterCalculatorOptions] {
      zero_center: true
      flip_vertically: true
    }
  }
}

node {
  calculator: "TfLiteInferenceCalculator"
  input_stream: "TENSORS_GPU:image_tensor"
  output_stream: "TENSORS_GPU:detection_tensors"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions] {
      model_path: "facedetector_front.tflite"
    }
  }
}

node {
  calculator: "SsdAnchorsCalculator"
  output_side_packet: "anchors"
  node_options: {
    [type.googleapis.com/mediapipe.SsdAnchorsCalculatorOptions] {
      num_layers: 4
      min_scale: 0.1484375
      max_scale: 0.75
      input_size_height: 128
      input_size_width: 128
      anchor_offset_x: 0.5
      anchor_offset_y: 0.5
      strides: 8
      strides: 16
      strides: 16
      strides: 16
      aspect_ratios: 1.0
      fixed_anchor_size: true
    }
  }
}

node {
  calculator: "TfLiteTensorsToDetectionsCalculator"
  input_stream: "TENSORS_GPU:detection_tensors"
  input_side_packet: "ANCHORS:anchors"
  output_stream: "DETECTIONS:detections"
  node_options: {
    [type.googleapis.com/mediapipe.TfLiteTensorsToDetectionsCalculatorOptions] {
      num_classes: 1
      num_boxes: 896
      num_coords: 16
      box_coord_offset: 0
      keypoint_coord_offset: 4
      num_keypoints: 6
      num_values_per_keypoint: 2
      sigmoid_score: true
      score_clipping_thresh: 100.0
      reverse_output_order: true
      x_scale: 128.0
      y_scale: 128.0
      h_scale: 128.0
      w_scale: 128.0
      flip_vertically: true
    }
  }
}

node {
  calculator: "NonMaxSuppressionCalculator"
  input_stream: "detections"
  output_stream: "filtered_detections"
  node_options: {
    [type.googleapis.com/mediapipe.NonMaxSuppressionCalculatorOptions] {
      min_suppression_threshold: 0.3
      min_score_threshold: 0.75
      overlap_type: INTERSECTION_OVER_UNION
      algorithm: WEIGHTED
    }
  }
}

node {
  calculator: "DetectionLabelIdToTextCalculator"
  input_stream: "filtered_detections"
  output_stream: "labeled_detections"
  node_options: {
    [type.googleapis.com/mediapipe.DetectionLabelIdToTextCalculatorOptions] {
      label_map_path: "facedetector_front_labelmap.txt"
    }
  }
}

node {
  calculator: "DetectionLetterboxRemovalCalculator"
  input_stream: "DETECTIONS:labeled_detections"
  input_stream: "LETTERBOX_PADDING:letterbox_padding"
  output_stream: "DETECTIONS:output_detections"
}

node {
  calculator: "DetectionsToRenderDataCalculator"
  input_stream: "DETECTION_VECTOR:output_detections"
  output_stream: "RENDER_DATA:render_data"
  node_options: {
    [type.googleapis.com/mediapipe.DetectionsToRenderDataCalculatorOptions] {
      thickness: 8.0
      color { r: 255 g: 0 b: 0 }
    }
  }
}

node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "INPUT_FRAME_GPU:throttled_input_video"
  input_stream: "render_data"
  output_stream: "OUTPUT_FRAME_GPU:output_video"
  node_options: {
    [type.googleapis.com/mediapipe.AnnotationOverlayCalculatorOptions] {
      flip_text_vertically: true
    }
  }
}

This is to test my theory about the multiple-of-4 size issue; this graph fixes the skew on the Xiaomi phone we have here.
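
As a tiny standalone illustration of that condition (not MediaPipe code), the check the workaround is aiming to satisfy looks like this:

// Sketch only: checks the GpuBuffer divisibility assumption described earlier in this thread.
class MultipleOfFourCheck {
  static boolean isMultipleOfFour(int width, int height) {
    return width % 4 == 0 && height % 4 == 0;
  }

  public static void main(String[] args) {
    System.out.println(isMultipleOfFour(1269, 1692));  // false: 1269 % 4 == 1, so rendering skews
    System.out.println(isMultipleOfFour(1200, 1600));  // true: the size the graph above resizes to
  }
}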

@mgyong

mgyong commented Jul 23, 2019

@xuguozhi Did you try out @mcclanahoochie's ImageTransformationCalculator suggestion, and did it work? If it did, please let us know.

@xuguozhi
Author

> @xuguozhi Did you try out @mcclanahoochie's ImageTransformationCalculator suggestion, and did it work? If it did, please let us know.
@mgyong Sorry for the late reply, I will try it soon :)

@xuguozhi
Author

Hi, the same issue appears on OPPO Find X and Xiaomi Max 2 phones.
@mcclanahoochie, about inserting an ImageTransformationCalculator node at the beginning of the graph to resize the image to a known size: at which line should it be inserted?

@jiuqiant
Contributor

@xuguozhi What @mcclanahoochie gave you is a complete new MediaPipe graph. You can visualize it at http://viz.mediapipe.dev. Please manually replace the content of the face detection GPU graph with the snippet in @mcclanahoochie's comment.

FYI, the new graph looks like:
Screen Shot 2019-07-24 at 10 01 51 AM

@xuguozhi
Author

@mgyong @jiuqiant @mcclanahoochie Cool, it works! Thanks, you guys are great!

@mcclanahoochie

Awesome!

@Risingabhi

Not sure, but when I input a high-resolution image into MediaPipe I get an error (NoneType object). But when I crop and send the same image, it gives me an error. Any clue?
girl
girl1
Uploading girl.PNG…
