
Remove perspective from depth data #8044

Closed
mrehacek opened this issue Dec 22, 2020 · 12 comments

@mrehacek

Required Info
- Camera Model: D415
- Operating System & Version: Win 10
- SDK Version: 2.38.1
- Language: C++

Issue Description

Is there a way to remove perspective from the collected depth data?
Let me explain with a depth map from a camera that was mounted on a ceiling (i.e., parallel to the floor), looking down at people:
[image: example depth map]
If a person is standing right below the camera (as in the white rectangle), the camera mostly sees only their head (blue = high points). But in the corners of the camera's FOV we can see the whole body, and that's my problem.

I understand that this is simply due to the way the depth camera works, as the IR rays are projected from one point, and I'm guessing no one tries to remove that, because it would mean losing data. I'm working on a multi-camera setup, though, and when every camera in the setup has its own perspective, I cannot fuse the cameras' point clouds together. Is there any function (or combination of functions) in librealsense that would allow me to calculate an orthographic/multiview/plan projection for the frame? If it's not in the lib, what should I search for?

I have already looked at the wiki (Projection in RealSense SDK 2.0), #7279 and #5225, but they didn't help me.

Thank you!

@MartyG-RealSense
Collaborator

Hi @mrehacek If your aim is to "stitch" the individual point clouds of multiple cameras with different viewpoints together into a single point cloud, then this can be done. A common method for doing this is to perform an affine transform.

https://community.intel.com/t5/Items-with-no-label/Combining-multiple-depth-streams-into-one/m-p/483016/highlight/true#M5089

An example of an affine transform instruction in librealsense is rs2_transform_point_to_point

#5583
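A minimal sketch of applying such a transform in C++ (the extrinsic values below are placeholders standing in for a real camera-to-camera calibration, not values from this case):

```cpp
#include <librealsense2/rs.hpp>
#include <librealsense2/rsutil.h> // rs2_transform_point_to_point

int main()
{
    // Camera-to-camera extrinsic: column-major 3x3 rotation + translation.
    // Identity rotation and a 0.5 m shift along X, as placeholders for a
    // real calibration result.
    rs2_extrinsics extrin{};
    extrin.rotation[0] = extrin.rotation[4] = extrin.rotation[8] = 1.0f;
    extrin.translation[0] = 0.5f;

    float from_point[3] = { 0.1f, 0.2f, 1.5f }; // a point in camera A's frame, meters
    float to_point[3];
    rs2_transform_point_to_point(to_point, &extrin, from_point);
    // to_point now holds the same point expressed in camera B's frame.
    return 0;
}
```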

Alternatively, point clouds from different perspectives can be stitched with ROS:

https://www.intelrealsense.com/how-to-multiple-camera-setup-with-ros/

At a more advanced level, the scenario of stitching point clouds from up to twenty 400 Series cameras has been demonstrated by the CONIX Research Center at Carnegie Mellon.

https://github.com/conix-center/pointcloud_stitching

@ev-mp
Collaborator

ev-mp commented Dec 28, 2020

@mrehacek, if you're trying to perform a "head count" then you can probably mitigate the issues you're dealing with by limiting the maximum visible distance. This way you'll filter out all the depth data for objects at heights below (TBD) cm from the floor.
The SDK provides a dedicated post-processing threshold filter to crop the viewing frustum according to user-defined min and max ranges.
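For reference, a minimal C++ sketch of the threshold filter (the 2.2 m maximum is a placeholder for your ceiling height minus the (TBD) clearance, not a recommended value):

```cpp
#include <librealsense2/rs.hpp>

int main()
{
    rs2::pipeline pipe;
    pipe.start();

    // Keep only depth between 0.3 m and 2.2 m; values outside are zeroed.
    rs2::threshold_filter threshold(0.3f, 2.2f);

    while (true)
    {
        rs2::frameset frames = pipe.wait_for_frames();
        rs2::depth_frame depth = frames.get_depth_frame();
        rs2::frame cropped = threshold.process(depth);
        // ... use `cropped` for the head-count logic
    }
}
```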

You can try it in the realsense-viewer application:
[image: threshold filter controls in the realsense-viewer]

@MartyG-RealSense
Collaborator

Hi @mrehacek Do you require further assistance with this case, please? Thanks!

@mrehacek
Author

mrehacek commented Jan 3, 2021

Hi @MartyG-RealSense and @ev-mp, thank you for your assistance and sorry for the late reply - I needed to run some tests. I wish you all the best in the new year :)

@MartyG-RealSense I'm doing affine transforms using Eigen, and visualizing using PCL (along the lines of the sketch below). I have looked at the CONIX Center project you mentioned; they do not seem to be doing more than hardcoded affine transforms, the same as me.
@ev-mp I'm doing head tracking, yes. I cannot use thresholding like that, as it would ignore children; maybe adaptive thresholding would work. But this is not related.
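My stitching step is roughly the following (a sketch; the transform values here are placeholders, the real ones come from my calibration):

```cpp
#include <Eigen/Geometry>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/common/transforms.h>

// Move one camera's cloud into the common world frame and append it.
void stitch(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr& cam_cloud,
            pcl::PointCloud<pcl::PointXYZ>& merged)
{
    Eigen::Affine3f T = Eigen::Affine3f::Identity();
    T.translate(Eigen::Vector3f(0.5f, 0.0f, 0.0f));                 // placeholder
    T.rotate(Eigen::AngleAxisf(0.7854f, Eigen::Vector3f::UnitZ())); // placeholder, ~45 deg

    pcl::PointCloud<pcl::PointXYZ> transformed;
    pcl::transformPointCloud(*cam_cloud, transformed, T);
    merged += transformed; // operator+= concatenates clouds
}
```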

I think I still haven't presented the problem clearly enough, so I'm going to support it with more pictures.

Here is a visualization from the Viewer, the same setup as in the first post - camera on the ceiling facing the floor:
[image: point cloud with lines drawn to match the room geometry]
I have drawn lines corresponding to the room geometry. Drawing the rs2::frame directly, I believe its data are projected onto the image plane with perspective distortion.

However, when stitching the point clouds generated using rs2::pointcloud, I believe that rs2::pointcloud is actually performing a (non-affine) transformation that removes the perspective from the data obtained from rs2::frame. Is this the case, or am I wrong?
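To be concrete, by point clouds "generated using rs2::pointcloud" I mean this step (a sketch):

```cpp
#include <librealsense2/rs.hpp>

// rs2::pointcloud deprojects every depth pixel through the camera
// intrinsics into metric 3D, so the vertices are already free of the
// image-plane perspective.
rs2::points depth_to_points(const rs2::depth_frame& depth)
{
    rs2::pointcloud pc;
    rs2::points points = pc.calculate(depth);
    const rs2::vertex* verts = points.get_vertices(); // (x, y, z) in meters
    (void)verts; // iterate points.size() vertices in the real pipeline
    return points;
}
```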

I think I was further confused by using the spatial filter. My hypothesis is that the data generated by rs2::pointcloud do not contain perspective, and can therefore be stitched with only an affine transform. But there is noise which scatters some of the points (those further from the camera, around the contour of the person perhaps) in a way that creates a sort of ellipsoid; see the left part of the image:
[image: point clouds before and after the spatial filter]
On the left, one can clearly see people in the noise. On the right, after using the spatial filter, the noisy parts get smoother and more solid. The blob is then stretched in the direction away from the camera, like a shadow of the person, which for me created an illusion of perspective. But now I see there is no perspective; it's just a problem created by using the spatial filter.

So: the filtering problem now makes the stitched point cloud less useful to me than just doing my calculations on a 2D image from rs2::frame. But if rs2::frame only holds data with perspective distortion, that is a complication. Is there a way of getting rid of the perspective?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 4, 2021

Hi @mrehacek The effect in the above image resembles a distortion phenomenon called ghost noise. This distortion can take different forms. The point spray from the humans in your particular case resembles point-cloud images of humans from a previous case where the camera was also mounted in an elevated position.

#7021

[image: point cloud of humans from the referenced case]

The RealSense user in that case settled on using the Medium Density camera preset in combination with a high Laser Power setting to improve their results (Medium Density provides a balance between accuracy and image 'fill-rate').

If the humans are walking when the image is captured, it also reminds me of a case where humans were leaving a trail behind them in the image. Dorodnic, the RealSense SDK Manager, thought that it might be an artifact of a temporal filter. Do you have a temporal filter included in your application, please?

#7445
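For reference, a generic sketch (not code from this case) of where a temporal filter usually sits in a post-processing chain:

```cpp
#include <librealsense2/rs.hpp>

// Filters are kept alive between frames because the temporal filter
// accumulates per-pixel history; that history is what can leave motion
// trails behind moving people.
rs2::frame post_process(const rs2::depth_frame& depth)
{
    static rs2::spatial_filter  spatial;  // edge-preserving smoothing
    static rs2::temporal_filter temporal; // smoothing across frames

    rs2::frame f = spatial.process(depth);
    return temporal.process(f);
}
```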

@mrehacek
Author

mrehacek commented Jan 4, 2021

@MartyG-RealSense I noticed I could remove the trails because the points are distinctly sparse, and then found this issue under the name "shadow points" in the PCL lib - it's precisely what you're mentioning! I will note the camera settings as well, but I'm working only with a dataset, so I won't be able to confirm.
I have tried turning off the temporal filter; the filter (or its defaults) seems fine and isn't the cause. I have successfully used pcl::StatisticalOutlierRemoval to get rid of the shadow points (sketch below). I cannot tell whether the method will be fast enough, though, until I implement the rest of my pipeline.
I consider this particular part with the point clouds clear now.
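The outlier-removal step looks roughly like this (a sketch; the parameters are the common PCL defaults, not values tuned on my dataset):

```cpp
#include <pcl/point_types.h>
#include <pcl/filters/statistical_outlier_removal.h>

// Drop points whose mean distance to their k nearest neighbors deviates
// too far from the global average -- this removes the sparse "shadow
// point" trails.
pcl::PointCloud<pcl::PointXYZ>::Ptr
remove_shadow_points(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr& cloud)
{
    pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::StatisticalOutlierRemoval<pcl::PointXYZ> sor;
    sor.setInputCloud(cloud);
    sor.setMeanK(50);            // neighbors per point for the distance estimate
    sor.setStddevMulThresh(1.0); // reject beyond 1 stddev of the mean distance
    sor.filter(*filtered);
    return filtered;
}
```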

Let me summarize the questions we still haven't discussed:

  1. The depth data in rs2::frame are from the camera's perspective, i.e., there is no other processing inside the lib?
  2. Using rs2_deproject_pixel_to_point(), one can remove the camera's perspective? I see in the sources that rs2::pointcloud does use it, and it seems like the only option.
  3. If I want to project back from a point cloud to an rs2::frame with a different projection, I need to write something similar to rs2_project_point_to_pixel()?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 5, 2021

  1. The origin point of depth data on the 400 Series cameras is the center of the left infrared imager. From the perspective of the stereo camera looking out at the world, the left imager is on the left side of the camera module. Thus, when the user is facing the D4 camera, the left imager is actually on the right side of the camera module.

The link below describes the coordinate system in relation to the center of the left IR imager.

#7279 (comment)

  2. rs2_deproject_pixel_to_point maps 2D image pixel coordinates to 3D world coordinates. An example of its use would be finding the 3D z-depth of a coordinate, or converting a 2D image to a 3D point-cloud image.

  3. Yes, rs2_project_point_to_pixel translates in the other direction, from 3D point coordinates to 2D pixel coordinates.
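A minimal sketch of the round trip in C++ using rsutil.h (the pixel chosen is just the image center, for illustration):

```cpp
#include <librealsense2/rs.hpp>
#include <librealsense2/rsutil.h>

int main()
{
    rs2::pipeline pipe;
    rs2::pipeline_profile profile = pipe.start();

    // Intrinsics of the depth stream (origin: left IR imager).
    rs2_intrinsics intrin = profile.get_stream(RS2_STREAM_DEPTH)
                                   .as<rs2::video_stream_profile>()
                                   .get_intrinsics();

    rs2::frameset frames = pipe.wait_for_frames();
    rs2::depth_frame depth = frames.get_depth_frame();

    float pixel[2] = { intrin.width / 2.0f, intrin.height / 2.0f };
    float dist = depth.get_distance((int)pixel[0], (int)pixel[1]); // meters

    float point[3];
    rs2_deproject_pixel_to_point(point, &intrin, pixel, dist); // 2D -> 3D

    float back[2];
    rs2_project_point_to_pixel(back, &intrin, point);          // 3D -> 2D
    // For an orthographic "plan view" you would instead map the (x, y) of
    // each 3D point directly to output pixels, writing your own projection
    // in place of rs2_project_point_to_pixel.
    return 0;
}
```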

Reading back through the case from the start, it seems that when you talk about 'removing perspective' you are having difficulty aligning images from multiple cameras, because they show full-body views of the humans in the scene rather than just the tops of their heads. So you would prefer, if possible, a view of the head alone, excluding the rest of the body (removing the perspective).

You confirm that you are performing a head count and so only need the heads, but cannot limit the maximum observable depth to neck level using a threshold filter because this may exclude the heads of children passing beneath the camera.

@mrehacek
Author

mrehacek commented Jan 7, 2021

Thank you @MartyG-RealSense, your observation is correct.
As for the use of tracking SDKs: I will have occlusions, so I cannot use them. I've only seen full-body skeletal tracking, and I assume it would fail if given this ceiling-view data. But I will try cubemos; maybe their ML model can deal with it.

I will need a few more days for testing, so please leave this open if you can.

@MartyG-RealSense
Collaborator

Yes, I will check with you for an update in 7 days if I have not heard from you by then. Good luck!

@MartyG-RealSense
Collaborator

Hi @mrehacek Do you have an update that you can provide, please? Thanks!

@mrehacek
Author

@MartyG-RealSense I do not. However, I think I now know which direction I'm heading, so you can close this if you need to. Thank you, and I wish you well!

@MartyG-RealSense
Collaborator

Okay, thanks very much @mrehacek - please feel free to create a new question on this forum at a future date if you need to. Good luck!
