Aligning depth picture coordinates to real world coordinates #8333
Hi @Danilich1994 The placement of your two cameras is fine, though you may achieve improved depth sensing results if you position the cameras closer together so that their fields of view overlap. This has the benefit of producing redundant depth data, as more than one camera observes the same area where the FOVs overlap. Using overlapping FOVs may require a third camera to achieve the same overall width of view as two cameras spaced further apart, though. The more cameras you use, the fewer blind spots there will be. The link below provides information on methods of stitching point clouds with RealSense. The RealSense user in that case was also using a top-down view and trying to exclude detail representing areas outside of the top view, so that case may be worth reading from the start.
@MartyG-RealSense, thank you for your reply. It definitely helped me to understand the logic behind the process that must be done to solve my problem. But let's make sure I understood it the right way:
Some clarifying questions:
I have not performed the stitching process myself, so it is difficult for me to verify the listed procedure beyond knowing that it should involve moving and rotating the individual clouds and then appending them together. I would say though that my expectation - like your own 'clarifying question' about depth frame conversion - would be that the point cloud is generated to produce XYZ coordinates before the affine transform is applied to the 3D point cloud data with rs2_transform_point_to_point. Some more information about rs2_transform_point_to_point can be found in the link below. The link below may also be useful in regard to multi-camera data alignment:
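For illustration only, here is a minimal Python sketch (not taken from the linked material) of how a custom extrinsics object could be handed to rs2_transform_point_to_point in pyrealsense2; the rotation and translation values are placeholder assumptions:

```python
import pyrealsense2 as rs

# Describe the desired affine transform as a librealsense extrinsics object.
# rotation is a 3x3 matrix in column-major order, translation is in metres.
extrin = rs.extrinsics()
extrin.rotation = [1.0, 0.0, 0.0,
                   0.0, 0.0, 1.0,
                   0.0, -1.0, 0.0]    # placeholder: 90-degree rotation about X
extrin.translation = [0.0, 0.0, 0.0]  # placeholder: no translation

# Apply the transform to one XYZ point (metres); repeat for every vertex.
point_in = [0.1, 0.2, 0.5]
point_out = rs.rs2_transform_point_to_point(extrin, point_in)
print(point_out)
```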
Great! Only the last step is unknown to me - how to apply the affine transform to the point cloud.
I'm not certain whether the Python method in the link below for obtaining point cloud vertices as a Python array will be applicable to your particular project, but I will share it in case it is helpful:
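For reference, a commonly used pattern (a sketch, not the code from the linked post) for turning the SDK point cloud into an Nx3 numpy array looks like this:

```python
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
pipeline.start()
try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()

    # Generate the point cloud and view its vertices as an Nx3 numpy array.
    pc = rs.pointcloud()
    points = pc.calculate(depth_frame)
    vertices = np.asanyarray(points.get_vertices()).view(np.float32).reshape(-1, 3)
    print(vertices.shape)  # (number_of_pixels, 3) -> XYZ in metres
finally:
    pipeline.stop()
```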
Hello @MartyG-RealSense, thanks for the assistance! I'm slowly making progress on my project, and that is great!
Is this view still a top-down view from the ceiling like the diagram in your opening comment of this discussion, please? I ran extensive tests to replicate your image. I had a similar color change on my image during camera movement and could not achieve much improvement. In lower light conditions (e.g. late afternoon in a lounge-sized indoor room), maximizing laser power to '360' instead of the default of '150' may offer some improvement in image quality. Also, other projects with a top-down view sometimes take the same approach that you did of depth-clamping out (thresholding) the far distance, such as the floor of a top-down view.

If your project is able to do so (i.e. if it will not affect the greyscale image processing), using depth to color alignment can enhance an image considerably. It also makes it easier to distinguish foreground pixels from background pixels. Python has an example alignment program called align_depth2color.py. If you do not require a fully colorized depth image, you could have a greyscale image by default by setting the White to Black color scheme instead of the Jet color scheme.
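Putting those suggestions together, a hedged Python sketch (the stream settings, laser power value, and color-scheme index are assumptions, not values quoted from the thread) might look like this:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)
profile = pipeline.start(config)

# Maximize laser power (default 150, maximum 360 on the D400 series).
depth_sensor = profile.get_device().first_depth_sensor()
depth_sensor.set_option(rs.option.laser_power, 360)

align = rs.align(rs.stream.color)                # align depth to the color viewpoint
colorizer = rs.colorizer()
colorizer.set_option(rs.option.color_scheme, 2)  # assumed: 2 = White to Black (greyscale)

try:
    frames = pipeline.wait_for_frames()
    aligned = align.process(frames)
    depth_frame = aligned.get_depth_frame()
    grey_depth = colorizer.colorize(depth_frame)  # greyscale depth visualization
finally:
    pipeline.stop()
```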
Your scenario of the objects attached to the wall reminds me of a past case where a RealSense user did the same with VHS videotape boxes to try to detect them and measure the distance to them. They also had difficulties with picking the boxes out from the image. Their ultimate goal was to view objects from a top-down perspective. https://support.intelrealsense.com/hc/en-us/community/posts/360033574174-415-depth-sense-granularity

I ran extensive further tests, but the best improvements that I could get in the image quality of the far wall were to use a D455 camera (which has 2x better accuracy over distance compared to the D435 models) or to disable the Histogram Equalization option in the colorization settings, as shown below. RealSense 400 Series cameras can also better perceive white walls at a far distance if a physical optical filter called a longpass filter is applied over the camera lenses on the outside of the camera, as described in Section 4.2.1 of Intel's white-paper document about optical filters.

As you are using a D415, may I ask which resolution you are using, please, as it is not mentioned in the discussion. For the D415 model the optimal depth accuracy resolution is 1280x720, whereas on the D435 models it is 848x480.
Depth frame resolution, in my case, is 1280x720 while capturing. Then a decimation filter reduces it to 640x360.
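For context, that decimation step corresponds to something like the following sketch in pyrealsense2:

```python
import pyrealsense2 as rs

# Downsample depth by a factor of 2 in each dimension: 1280x720 -> 640x360.
decimation = rs.decimation_filter()
decimation.set_option(rs.option.filter_magnitude, 2)

# Applied per frame inside the capture loop, e.g.:
# filtered_depth = decimation.process(depth_frame)
```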
Python code for disabling Histogram Equalization can be found in the link below. The colorization options such as the Color Preset change how the depth data is colored, not the depth itself.
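The linked code is not reproduced here, but a sketch of the idea using the colorizer options exposed by pyrealsense2 (the distance values are assumptions) could be:

```python
import pyrealsense2 as rs

colorizer = rs.colorizer()

# Turn off histogram equalization and fix the visualized range manually.
colorizer.set_option(rs.option.histogram_equalization_enabled, 0)
colorizer.set_option(rs.option.min_distance, 0.3)  # metres, assumed
colorizer.set_option(rs.option.max_distance, 4.0)  # metres, assumed

# colorized_depth = colorizer.colorize(depth_frame)
```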
Hi @Danilich1994 Do you require further assistance with this case, please? Thanks!
Hi @MartyG-RealSense. Thank you for your assistance, you are very helpful! I think I found the solution to my problem, so I'll close this issue.
You are very welcome - great to hear that you found a solution that works for you. Thanks for the update!
Hello @MartyG-RealSense.
What is the difference between the colorizer presets and the sensor visual presets, and do they interact somehow?
The Visual Presets that are best known are the ones that affect depth data, such as Default, Hand, High Accuracy, High Density, etc. These affect which depth coordinates are rendered in the depth image. The Depth Visualization presets do not affect which depth coordinates are rendered, but instead determine the style in which those coordinates are color-shaded according to their depth values. I do not believe that there is any interaction between the depth presets and the depth colorization presets.

Minimum Distance and Maximum Distance are the minimum and maximum distances at which color visualization will be applied to the image. Yes, the Color Schemes such as Jet and White To Black / Black To White are color schemes for colorization. Pages 11 to 14 of a PDF guide to the RealSense Viewer that Intel published explain the colorization settings very well. The link below also provides useful resources about programming colorization settings. https://support.intelrealsense.com/hc/en-us/community/posts/360048767633/comments/360012325493
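To make the distinction concrete, here is a hedged sketch showing where each kind of preset is set in pyrealsense2 (the preset and color-scheme indices are assumptions on my part):

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
profile = pipeline.start()

# Depth Visual Preset: changes which depth values the sensor produces.
depth_sensor = profile.get_device().first_depth_sensor()
depth_sensor.set_option(rs.option.visual_preset, 3)  # assumed: 3 = High Accuracy on D400

# Colorizer settings: change only how the depth values are shaded for display.
colorizer = rs.colorizer()
colorizer.set_option(rs.option.color_scheme, 0)      # assumed: 0 = Jet

pipeline.stop()
```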
Hi @Danilich1994 Do you require further assistance with this case, please? Thanks!
Case closed due to no further comments received. |
@Danilich1994 Sorry to re-activate this thread, but I was wondering whether you succeeded in using rs2_transform_point_to_point() to apply an affine transformation to your point cloud? I want to do the same in order to then save a .ply file that is rotated by 90°. Thank you for your help!
Hello, @julienguegan. Yeah, I did it. And it's not necessary to use rs2_transform_point_to_point if you just want to rotate a ready point cloud. In RealSense this function is used to "change" the point cloud origin from the depth to the color camera for later color data alignment, so it uses the depth-to-color transformation matrix defined by the camera properties. So, if you need to rotate all points around X by 90 degrees without any translation, you just have to put the proper numbers into the rotation matrix and multiply all the point cloud's points by that matrix. My code was taken from the RealSense Python example box_dimensioner_multicam, so if you want, you can go through that example to get an understanding of basic point cloud usage.
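A minimal numpy sketch of that rotation (my own illustration, not the box_dimensioner_multicam code itself) could look like this:

```python
import numpy as np

# Rotation of 90 degrees about the X axis, no translation.
theta = np.pi / 2
rotation_x = np.array([[1, 0, 0],
                       [0, np.cos(theta), -np.sin(theta)],
                       [0, np.sin(theta),  np.cos(theta)]])

# points: Nx3 array of XYZ vertices taken from the point cloud.
points = np.random.rand(1000, 3).astype(np.float32)  # placeholder data
rotated_points = points @ rotation_x.T                # multiply every point by the matrix
```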
@Danilich1994 Actually, I know that I can do the affine transformation using the method you explained, but I wanted to use rs2_transform_point_to_point() because it is from the librealsense library. I was hoping to make it work on a pointcloud object or a frame object (and not a numpy array), because my final objective is to save the point cloud as a .ply file, and currently, to save this file, I am using the following code lines:
I am using only librealsense methods and objects to do so, and I am not sure whether it is really possible to interface them with a numpy array... I asked @MartyG-RealSense about this here, but it seems that he does not have any good solutions 😐
@julienguegan if you want to use only RealSense library functions, then you have to pass the old-origin-to-new-origin extrinsic parameters into rs2_transform_point_to_point() as a RealSense object, and I don't know how to do that. Maybe @MartyG-RealSense can help? Otherwise: get the depth frame; convert it to a numpy array; convert the depth map to a point cloud using rs2_deproject_pixel_to_point(...); apply the affine transform; THIS link gives two options for converting a point cloud stored as a numpy array into a .ply file. Otherwise, just save the .ply file. I suppose you are going to use this data later anyway, so the conversion into a numpy array will be present in any case. Just don't forget to apply the affine transform before using it.
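Assembling those steps into one sketch (the helper names and the plain ASCII .ply writer are my own assumptions, not code from the thread or the SDK):

```python
import numpy as np
import pyrealsense2 as rs

def deproject_and_transform(depth_frame, rotation, translation, step=4):
    """Deproject depth pixels to XYZ points (metres) and apply an affine transform."""
    intrin = depth_frame.profile.as_video_stream_profile().intrinsics
    points = []
    for y in range(0, intrin.height, step):      # subsample for speed
        for x in range(0, intrin.width, step):
            d = depth_frame.get_distance(x, y)
            if d > 0:                            # skip invalid (zero) depth
                points.append(rs.rs2_deproject_pixel_to_point(intrin, [x, y], d))
    points = np.asarray(points, dtype=np.float32).reshape(-1, 3)
    return points @ rotation.T + translation

def save_ascii_ply(points, path):
    """Write an Nx3 numpy array to a minimal ASCII .ply file."""
    header = ("ply\nformat ascii 1.0\n"
              f"element vertex {len(points)}\n"
              "property float x\nproperty float y\nproperty float z\n"
              "end_header\n")
    with open(path, "w") as f:
        f.write(header)
        np.savetxt(f, points, fmt="%.6f")
```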
@Danilich1994 yes, agreed. I am thinking of using open3d, as it seems more relevant, better documented, and more flexible for this purpose.
Hello,
Here you can see my current camera setup (2x D415).
The main idea is to use two cameras (placed parallel to the scanning plane) to increase the overall FOV. The distance between the cameras and the linear FOV at a certain height are calculated from the cameras' 69.4°x42.5° FOV. The height "h" is chosen so that the edges of the frames from the two cameras coincide without overlapping.
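For reference, the linear coverage of a single camera at height h follows directly from the FOV geometry; a small sketch with an assumed height of 1 m:

```python
import math

# Linear coverage of one D415 at height h (metres), from the nominal 69.4 x 42.5 degree FOV.
h = 1.0                                            # assumed mounting height
width = 2 * h * math.tan(math.radians(69.4 / 2))   # ~1.38 m at h = 1 m
height = 2 * h * math.tan(math.radians(42.5 / 2))  # ~0.78 m at h = 1 m
```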
I'm trying to scan the top view of objects placed on a flat surface at about the same height from the camera (the orange ones in the picture). "c" in the picture represents the threshold filter applied to the depth image, so I can capture only the objects' top surfaces (within some range) and ignore their side surfaces (which can be seen because of parallax). I'm taking "photos" with the two cameras simultaneously and then just stitching them together to get a larger image.
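A hedged sketch of that threshold step in pyrealsense2 (the distance band is a placeholder, not the actual value of "c"):

```python
import pyrealsense2 as rs

# Keep only depth within a band around the object top surfaces.
threshold = rs.threshold_filter()
threshold.set_option(rs.option.min_distance, 0.8)  # metres, placeholder
threshold.set_option(rs.option.max_distance, 1.0)  # metres, placeholder

# Applied per frame, e.g.:
# filtered_depth = threshold.process(depth_frame)
```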
Now I have only two pieces of the whole process: the scanning tool and a program that takes the images and searches for objects in them. But I'm missing the essential thing - the connection between real-world coordinates and image coordinates, because I want to locate the objects' centre coordinates (x, y) in the real world. In the picture you can see the "origin" point; this is the origin of the scanning area. How do I bind the stitched image origin to the real-world scene origin?
After extensive Google searching, I found a lot of info related to multi-camera setups. There is info about inward/outward configurations, but none of the options looks like mine.
Here is how I understand the connection process:
Am I right?
Does my camera setup even work? Googling about translation-based panorama creation shows that it is a big pain.
What if I add two more cameras along the x axis? Should I overlap the frames to exclude the "black" strip of invalid depth info on the left side of the frame?