Remove perspective from depth data #8044
Comments
Hi @mrehacek If your aim is to "stitch" the individual point clouds of multiple cameras with different viewpoints together into a single point cloud, then this can be done. A common method is to perform an affine transform; an example of an affine transform instruction in librealsense is rs2_transform_point_to_point.

Alternatively, point clouds from different perspectives can be stitched with ROS: https://www.intelrealsense.com/how-to-multiple-camera-setup-with-ros/

At a more advanced level, the scenario of stitching point clouds from up to twenty 400 Series cameras has been demonstrated by the CONIX Research Center at Carnegie Mellon: https://github.com/conix-center/pointcloud_stitching
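For illustration, a minimal sketch of calling rs2_transform_point_to_point to map a 3D point from one camera's frame into another's. The extrinsics values here are placeholders; in a real multi-camera rig they would come from your own camera-to-camera calibration.

```cpp
#include <librealsense2/rs.hpp>
#include <librealsense2/rsutil.h>

int main()
{
    // Placeholder extrinsics: identity rotation, 0.5 m translation on x.
    rs2_extrinsics extrin{};
    extrin.rotation[0] = extrin.rotation[4] = extrin.rotation[8] = 1.0f;
    extrin.translation[0] = 0.5f;

    float from_point[3] = { 0.1f, 0.2f, 1.5f }; // a point in camera A's frame (meters)
    float to_point[3];
    rs2_transform_point_to_point(to_point, &extrin, from_point);
    // to_point now holds the same physical point expressed in camera B's frame
    return 0;
}
```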
@mrehacek, if you're trying to perform a "head count" then you can probably mitigate the issues you're dealing with by limiting the maximum visible distance. This way you'll filter out all the depth data for objects at heights below (TBD) cm from the floor.
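A hedged sketch of this suggestion using librealsense's rs2::threshold_filter processing block. The 0.2 m / 2.5 m range is an assumption for a ceiling-mounted camera; tune it so the floor (and bodies below neck height) falls outside the kept range.

```cpp
#include <librealsense2/rs.hpp>

int main()
{
    rs2::pipeline pipe;
    pipe.start();

    // Keep only depth between 0.2 m and 2.5 m (assumed values; tune per setup).
    rs2::threshold_filter thresh(0.2f, 2.5f);

    for (int i = 0; i < 300; ++i)
    {
        rs2::frameset frames = pipe.wait_for_frames();
        rs2::depth_frame depth = frames.get_depth_frame();
        rs2::frame filtered = thresh.process(depth); // out-of-range depth is zeroed
        // ... generate the point cloud from 'filtered' instead of 'depth'
    }
    return 0;
}
```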
Hi @mrehacek Do you require further assistance with this case, please? Thanks!
Hi @MartyG-RealSense and @ev-mp, thank you for your assistance and sorry for the late reply - I needed to do some tests. I wish you all the best in the new year :)

@MartyG-RealSense I'm doing affine transforms using Eigen, and visualizing using PCL. I have looked at the CONIX Center project you mentioned; they do not seem to be doing more than hardcoded affine transforms, the same as me. I think I still didn't present the problem clearly enough, so I'm going to support it with more pictures. Here is a visualization from the Viewer, the same setup as in the first post - camera on the ceiling facing the floor:

However, when stitching point clouds generated using rs2::pointcloud, I believe that rs2::pointcloud is actually applying a (non-affine) transformation to remove the perspective from the data obtained from rs2::frame. Is this the case, or am I wrong? I think I was further confused by using a spatial filter. My hypothesis is that the data generated by rs2::pointcloud does not contain perspective, and can therefore be stitched with only an affine transform. But there is noise which scatters some of the points (those further from the camera, around the contour of the person perhaps) in a way that creates a sort of ellipsoid, see the left part of the image:
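For reference, a sketch of what rs2::pointcloud effectively does per pixel: it deprojects each depth pixel through the stream intrinsics into metric 3D space, which is why the resulting vertices are free of perspective and can be fused with a rigid (affine) transform alone.

```cpp
#include <librealsense2/rs.hpp>
#include <librealsense2/rsutil.h>

// Deproject one depth pixel into metric 3D space, given the depth
// stream's intrinsics (obtainable from a rs2::video_stream_profile).
void deproject_example(const rs2_intrinsics& intrin)
{
    float pixel[2] = { 320.0f, 240.0f }; // an example image coordinate
    float depth_m  = 1.5f;               // measured depth at that pixel, in meters
    float vertex[3];
    rs2_deproject_pixel_to_point(vertex, &intrin, pixel, depth_m);
    // vertex = (x, y, z) in meters relative to the camera's optical center;
    // the division by focal length inside undoes the perspective projection.
}
```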
Hi @mrehacek The effect in the above image resembles a distortion phenomenon called ghost noise. This distortion can take different forms. The point spray from the humans in your particular case resembles point-cloud images of humans from a previous case where the camera was also mounted in an elevated position. The RealSense user in that case settled on using the Medium Density camera preset in combination with a high Laser Power setting to improve their results (Medium Density provides a balance between accuracy and image 'fill-rate').

If the humans are walking when the image is captured, it also reminds me of a case where humans were leaving a trail behind them on the image. Dorodnic, the RealSense SDK Manager, thought that it might be an artifact of a Temporal Filter. Do you have a temporal filter included in your application, please?
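A rough sketch of those two suggestions; the option values are illustrative and option ranges vary per device.

```cpp
#include <librealsense2/rs.hpp>

int main()
{
    rs2::pipeline pipe;
    rs2::pipeline_profile profile = pipe.start();
    rs2::depth_sensor sensor = profile.get_device().first<rs2::depth_sensor>();

    // Medium Density preset: a balance between accuracy and fill-rate.
    if (sensor.supports(RS2_OPTION_VISUAL_PRESET))
        sensor.set_option(RS2_OPTION_VISUAL_PRESET,
                          static_cast<float>(RS2_RS400_VISUAL_PRESET_MEDIUM_DENSITY));

    // Raise laser power toward the top of the device's supported range.
    if (sensor.supports(RS2_OPTION_LASER_POWER))
    {
        rs2::option_range r = sensor.get_option_range(RS2_OPTION_LASER_POWER);
        sensor.set_option(RS2_OPTION_LASER_POWER, r.max);
    }

    // A temporal filter smooths depth over time but can smear moving
    // people into trails; consider removing it if trails appear.
    rs2::temporal_filter temporal;
    return 0;
}
```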
@MartyG-RealSense I noticed that I could remove the trails because the points are distinctively sparse, and then found this issue described under the name "shadow points" in the PCL library; it's precisely what you're mentioning! I will note the camera settings as well, but I'm working only with a dataset, so I won't be able to confirm. I will try to better summarize the questions we still haven't discussed:
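Regarding the "shadow points" mention above, a minimal sketch of removing them with PCL's pcl::ShadowPoints filter, which needs per-point normals; the k-neighborhood and threshold values are assumptions to tune.

```cpp
#include <pcl/point_types.h>
#include <pcl/features/normal_estimation.h>
#include <pcl/filters/shadowpoints.h>

pcl::PointCloud<pcl::PointXYZ>::Ptr
remove_shadow_points(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    // Estimate normals (a k-neighborhood of 20 is an arbitrary choice).
    pcl::PointCloud<pcl::PointNormal>::Ptr normals(new pcl::PointCloud<pcl::PointNormal>);
    pcl::NormalEstimation<pcl::PointXYZ, pcl::PointNormal> ne;
    ne.setInputCloud(cloud);
    ne.setKSearch(20);
    ne.compute(*normals);

    // Drop points whose normal is nearly perpendicular to the viewing
    // ray -- the sparse "shadow" spray along object contours.
    pcl::ShadowPoints<pcl::PointXYZ, pcl::PointNormal> sp;
    sp.setInputCloud(cloud);
    sp.setNormals(normals);
    sp.setThreshold(0.1f); // assumption: tune for your data

    pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
    sp.filter(*filtered);
    return filtered;
}
```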
The link below describes the coordinate system in relation to the center of the left IR imager.
Reading back through the case from the start, it seems as though when you talk about 'removing perspective', you are having difficulty aligning images from multiple cameras because they show full-body views of the people in the scene instead of just the tops of their heads. So you would prefer to have a view of the head alone if possible, excluding the rest of the body (removing perspective). You confirm that you are performing a head count and so only need the heads, but you cannot limit the maximum observable depth to neck level using a threshold filter because that might exclude the heads of children passing beneath the camera.
Thank you @MartyG-RealSense, your observation is correct. I will need a few more days for testing; could you please leave this open?
Yes, I will check with you for an update in 7 days if I have not heard from you by then. Good luck! |
Hi @mrehacek Do you have an update that you can provide, please? Thanks! |
@MartyG-RealSense I do not. However, I think I now know which direction I'm heading in, so you can close this if you need to. Thank you, and I wish you well!
Okay, thanks very much @mrehacek - please feel free to create a new question on this forum at a future date if you need to. Good luck! |
Issue Description
Is there a way to remove perspective from the collected depth data?
Let me explain with a depth map where the camera was mounted on a ceiling (parallel to the floor), looking down at people:
If a person is standing right below the camera (as in the white rectangle), the camera mostly sees only their head (blue = high points). But in the corners of the camera's FOV, we can see the whole body, and that's my problem.
I understand that this is simply due to how the depth camera works: the IR rays are projected from one point, and I'm guessing no one tries to remove that because it would mean losing data. I'm working on a multi-camera setup, though, and when every camera in the setup has its own perspective, I cannot fuse the cameras' point clouds together. Is there any function, or combination of functions, in librealsense that would allow me to calculate an orthographic/multiview/plan projection for the frame? If it's not in the lib, what should I search for?
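For illustration, a plan (top-down orthographic) projection can be computed outside librealsense once metric 3D points are available, e.g. from rs2::pointcloud: bin each point by its floor-plane coordinates and keep the highest point per cell. A minimal sketch, assuming the points are already transformed into a frame where y is height above the floor; all grid and cell sizes are arbitrary assumptions.

```cpp
#include <vector>
#include <algorithm>
#include <cmath>

struct Point3 { float x, y, z; };

// Build a perspective-free height map: for each 5 cm floor cell,
// record the highest point that falls into it.
std::vector<float> plan_projection(const std::vector<Point3>& pts,
                                   float cell = 0.05f,     // 5 cm cells
                                   int   w = 200, int h = 200)
{
    std::vector<float> height(static_cast<size_t>(w) * h, 0.0f);
    for (const auto& p : pts)
    {
        int u = static_cast<int>(std::floor(p.x / cell)) + w / 2;
        int v = static_cast<int>(std::floor(p.z / cell)) + h / 2;
        if (u < 0 || u >= w || v < 0 || v >= h) continue;
        float& cellHeight = height[static_cast<size_t>(v) * w + u];
        cellHeight = std::max(cellHeight, p.y);
    }
    return height; // an orthographic, top-down view of the scene
}
```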
I have already looked at the wiki (Projection in RealSense SDK 2.0), #7279, and #5225, but they didn't help me.
Thank you!