Some D455 not working as slave in sync system #12877

Closed
FrGrQuim opened this issue Apr 22, 2024 · 22 comments

@FrGrQuim

FrGrQuim commented Apr 22, 2024

Required Info
Camera Model: D455
Firmware Version: 5.16.0.1
Operating System & Version: Linux (Ubuntu 22)
Kernel Version (Linux Only): 5.15.0
Platform: PC
SDK Version: 2.55.1
Language:
Segment: Robot

Issue Description

For a robotic project, we are utilizing the RealSense D455 with synchronization. However, for the past few weeks, we've encountered a new issue: when switching between master and slave roles, the "new" slave camera ceases to send framesets. Once this occurs, a hardware reset is necessary to resume data transmission. This problem arises exclusively when depth frames are configured at 5Hz (we've also tested at 15Hz and 30Hz, where it functions correctly). Upon reverting the roles, the cameras operate correctly with synchronized depth frames (although a hard reset is needed if the slave camera was "broken").

If the depth stream is disabled on the master camera, the slave camera functions properly. Similarly, if the sync cable is disconnected, frames from both cameras are received but remain unsynchronized (which is normal).

We've replicated this issue using both a complete sync cable and a cable with only pins 5 and 9 connected.

Additionally, we've measured the sync signal:
image
image (1)
image (2)
M2 and S2 represent the initial Master and Slave cameras (when the system functions properly).

This behavior has been precisely replicated using the RealSense Viewer program by solely adjusting the frequency of the depth frame and the role of the two cameras (all other parameters remain unchanged). Therefore, it seems that the problem does not come from our hardware or our application.

Is it possible that we've damaged a camera, rendering it unable to function as a slave (crashing upon receiving a sync signal)? If so, why does this occur only at 5Hz, and how can we prevent it from happening to other cameras? If not, do you have any other insights into why we're experiencing this behavior?

If you require any further information, please don't hesitate to ask, and I'll do my best to provide it.

@MartyG-RealSense
Collaborator

Hi @FrGrQuim Changing a camera's Inter Cam Sync Mode setting from master to slave should not damage it and it does not sound as though there is damage since it works normally at 15 and 30 Hz.

The ideal situation is for the master to be enabled first and the slave(s) afterwards.

Are you changing the camera from master to slave whilst depth streaming is still enabled? If you are, does the problem still occur at 5 Hz if you disable the depth streams, change the Inter Cam Sync Modes and then re-enable the depth streams?
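
The ordering discussed here (stop the depth streams, change the Inter Cam Sync Mode, then start the master before the slave) can be sketched as below. This is a minimal illustration, not code from this thread: `SyncCamera` and its `stop`, `set_sync_mode` and `start` methods are a hypothetical wrapper. With pyrealsense2, `set_sync_mode` would typically wrap `depth_sensor.set_option(rs.option.inter_cam_sync_mode, mode)`. The mode values 1 (master) and 2 (slave) are the ones used in this thread.

```python
from typing import Protocol

# Inter Cam Sync Mode values as used in this thread.
MODE_MASTER = 1
MODE_SLAVE = 2


class SyncCamera(Protocol):
    """Hypothetical wrapper around one camera (not a librealsense type)."""

    def stop(self) -> None: ...
    def set_sync_mode(self, mode: int) -> None: ...
    def start(self) -> None: ...


def swap_roles(new_master: SyncCamera, new_slave: SyncCamera) -> None:
    """Reassign roles with streaming stopped, then start the master first."""
    # 1. Stop both depth streams before touching the sync mode.
    new_master.stop()
    new_slave.stop()
    # 2. Change the roles while nothing is streaming.
    new_master.set_sync_mode(MODE_MASTER)
    new_slave.set_sync_mode(MODE_SLAVE)
    # 3. Start the master first, then the slave.
    new_master.start()
    new_slave.start()
```

Keeping the sequence in one function makes it harder for application code to change a role while a stream is still running.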

@FrGrQuim
Author

Hi @MartyG-RealSense, thanks for your response. We always enable the master before the slave, and in our app, we disable the stream before configuring the cameras. (In the RealSense Viewer, it's not possible to change the Inter-Cam Sync Mode when the stream is enabled).

If the issue isn't with the hardware, another hypothesis is that there may be corruption in the persistent memory. I've updated the cameras, so I don't believe it could be from the firmware, but perhaps there's a persistent configuration that causes the cameras to crash under certain conditions. Is there a way to factory reset the camera, not just the calibration parameters?

Also, is there a method to retrieve error codes, logs, or any other type of information that could help us understand why the camera isn't sending frames?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Apr 23, 2024

A factory reset of the calibration creates a new calibration table inside the camera. The camera cannot function fully correctly if the calibration table has somehow become corrupted, so writing a new table through a calibration reset can rectify problems caused by a damaged table. The calibration reset can be performed in the Viewer using the instructions at #10182 (comment)

Also in the Viewer, you can expand open a debug console panel that displays a continuously updating message log by clicking on a small upward-pointing arrow icon in the bottom corner of the Viewer window.

image

You can set a path on your computer for the log to be sent to a file by clicking on the gear-wheel icon at the top corner of the Viewer window and selecting the Settings option from its drop-down menu, then selecting the General tab on the settings window that pops up and clicking on the box beside 'Output librealsense log to file'.

Select a path on your computer to save the log file to. Finally, click the Apply button on the bottom of the Settings pop-up to confirm the enabling of logging to file.

image

The link below provides information about accessing other types of logging such as kernel logs.

https://github.com/IntelRealSense/librealsense/blob/master/doc/troubleshooting.md

@FrGrQuim
Author

I have attempted to factory reset the calibration table of the two cameras, but this didn't resolve the issue. However, when I checked the logs in the RealSense Viewer app, I observed the following logs when I started the depth stream of the slave camera (the master camera's depth stream was already enabled without any error logs):
Screenshot from 2024-04-24 13-31-21

After that, I encountered the following log that loops indefinitely:
Screenshot from 2024-04-24 13-36-09

@MartyG-RealSense
Collaborator

Do the Stream Start Failure and Left IC2 Config Error errors display every time that this particular camera has its depth stream enabled?

'Left IC2 Config Error' is a very rare error that typically indicates that an interposer cable that joins two circuit boards together inside the camera may not be fully seated in its connectors. It is unlikely that there would be a problem with this internal cable though if that particular camera is able to start the depth stream normally most of the time.

@FrGrQuim
Author

The Left IC2 Config Error occurs each time I initiate the depth stream on this camera under the following conditions:

  • The camera is in slave mode
  • The stream is set to 5Hz
  • The sync cable is connected (with both the master and slave cameras)
  • The depth stream of the master camera is enabled

If any of these conditions are not met, I don't encounter the Left IC2 Config Error, and the stream starts correctly. Additionally, if the camera is already stuck in this issue, attempting to restart the stream (even without meeting the aforementioned conditions) will still fail with the Left IC2 Config Error. A hardware reset is necessary to successfully restart the stream.

The occurrence of Stream Start Failure is not consistent, and I haven't identified the specific conditions that trigger it.
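
Since the stuck camera only recovers after a hardware reset, an application could detect the missing framesets and trigger that reset itself. The following is a minimal sketch under assumptions: the caller supplies two callables, `wait_for_frames` (with pyrealsense2 this could be `pipeline.wait_for_frames`, which raises `RuntimeError` when its timeout expires) and `hardware_reset` (for example `device.hardware_reset`). The function name and the retry count are hypothetical.

```python
from typing import Any, Callable, Optional


def frames_or_reset(
    wait_for_frames: Callable[[], Any],
    hardware_reset: Callable[[], None],
    max_timeouts: int = 3,
) -> Optional[Any]:
    """Return a frameset, or reset the camera after repeated timeouts.

    Returns None when the reset was triggered; the caller is then
    expected to reopen the device and restart streaming.
    """
    for _ in range(max_timeouts):
        try:
            return wait_for_frames()
        except RuntimeError:
            # Timeout: no frameset arrived, try again.
            continue
    hardware_reset()
    return None
```

After a hardware reset the device re-enumerates on USB, so the pipeline has to be restarted; that part is left out of the sketch.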

@MartyG-RealSense
Collaborator

There is likely not a physical hardware fault, then.

I would recommend avoiding use of 5 Hz if possible and using a minimum of 15 Hz. At 5 Hz new frames arrive in the frame pipeline slowly enough that the risk of a problem occurring increases compared to 15 Hz and above, where such problems experienced at 5 Hz disappear.

@FrGrQuim
Author

Okay, so currently, there is no explanation for the root cause of this issue? Is there something we can do to help you understand this problem? Our robot cannot handle 15Hz streams, and even if we limit the rate by requesting a framerate of 5Hz, we still want to keep the camera's framerate consistent with our application.

Additionally, we would like to retrieve the camera logs (similar to the ones available on the RealSense Viewer) directly on the robot. Is there an API that allows us to do this? I tried using the rs2::device::get_info with this parameter:
Screenshot from 2024-04-25 13-40-53
However, I couldn't find where I can locate the meaning of the debug OP code.

@MartyG-RealSense
Collaborator

I do not have hardware sync equipment to replicate and test your sync setup, unfortunately.

It may be worth testing to see whether the problem still occurs when using genlock hardware sync. This will involve setting 5 FPS and Inter Cam Sync Mode '1' on your master camera and Inter Cam Sync Mode '4' (instead of 2) on your slave camera, with the slave camera's FPS set to 15 instead of 5. This is because when using a genlock sync mode with a master camera, the slave FPS must be 3x the master FPS. The slave camera will actually operate at the master's 5 FPS, not 15.
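
The genlock arrangement described above can be written down as a small helper. This sketch only encodes the rule as stated in this comment (master in mode 1, slave in mode 4, slave FPS requested at 3x the master FPS); the function and key names are hypothetical.

```python
def genlock_settings(master_fps: int) -> dict:
    """Return per-camera settings for the genlock test described above."""
    return {
        "master": {"inter_cam_sync_mode": 1, "fps": master_fps},
        # The slave FPS is requested at 3x the master FPS; per the
        # comment above it still delivers frames at the master's rate.
        "slave": {"inter_cam_sync_mode": 4, "fps": 3 * master_fps},
    }
```

For the 5 FPS case in this issue, `genlock_settings(5)` gives a slave entry with mode 4 and a requested FPS of 15.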

If you are programming your own application with script code then a workaround in the past for problems with a particular FPS speed has been to only use every "nth" frame - for example, setting 30 FPS but only using every 6th frame in order to achieve an actual FPS of 5. A RealSense user developed a Python script to demonstrate this principle at #3169 - it sounds as though you are using C++ but the Python script might provide insights about how to implement a similar custom FPS mechanism in C++.
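
The "every nth frame" idea can be sketched in a few lines. This is not the script from #3169, only a minimal illustration; `every_nth` is a hypothetical helper that works on any iterable of frames.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def every_nth(frames: Iterable[T], n: int) -> Iterator[T]:
    """Yield the first frame and then every n-th frame after it."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield frame
```

Streaming at 30 FPS and keeping every 6th frame leaves 5 frames per second for the application, while the camera itself keeps running at 30 FPS.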

There is not a list of meanings provided by Intel for the debug op codes as the purpose of the logging mechanism was for RealSense end-users to provide the log to Intel so that an Intel RealSense firmware engineer could interpret it.

The RealSense SDK provides a C++ firmware logging tool called rs-fw-logger that provides similar functionality to the RealSense Viewer's firmware log mechanism.

https://github.com/IntelRealSense/librealsense/tree/master/tools/fw-logger

Further information about this logging tool can be found at #1215
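
Separately from rs-fw-logger, hardware errors may also surface in application code through the SDK's sensor notifications. The following is a minimal sketch, assuming pyrealsense2, where a sensor offers `set_notifications_callback` and each notification offers `get_description()`. The helper names are hypothetical, and whether a given error such as Left IC2 Config Error arrives this way on a particular setup is an assumption to verify.

```python
from typing import Any, Callable, List


def make_notification_recorder(sink: List[str]) -> Callable[[Any], None]:
    """Build a callback that stores each notification's description."""

    def on_notification(notification: Any) -> None:
        # pyrealsense2 notifications expose get_description().
        sink.append(notification.get_description())

    return on_notification


def attach_recorder(sensor: Any, sink: List[str]) -> None:
    """Register the recorder on a sensor (e.g. the depth sensor)."""
    sensor.set_notifications_callback(make_notification_recorder(sink))
```

With a real camera, `sensor` would be something like `device.first_depth_sensor()`, and the collected descriptions could be written to the robot's own log.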

@FrGrQuim
Author

Hi @MartyG-RealSense,

I've tested genlock 3 mode, and it appears to be functioning correctly. I hope this result can help you understand the root cause of this issue.
Thank you for your assistance. However, we remain interested in understanding the underlying issue and how to address it (or at least prevent its recurrence with new cameras).

@MartyG-RealSense
Collaborator

Genlock mode '3' is known as Full Slave mode and follows the same rules as slave mode 2 except that it attempts to sync both depth and RGB instead of only depth. The RGB sync in this mode was intended for use with the D415 camera model and it did not work well, but I'm pleased to hear that it helped to resolve your depth syncing between master and slave.

@FrGrQuim
Author

FrGrQuim commented Apr 30, 2024

Sorry, I made a mistake: it is Inter Cam Sync Mode '4' that I used.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Apr 30, 2024

No problem, thanks very much for the clarification.

A key difference between mode 2 and mode 4 (other than mode 4 not syncing RGB) is that mode 2 listens for a sync trigger pulse on each frame and if it does not recognize a pulse within a certain time period then it proceeds with initiating an unsynced capture on that frame. So the camera is always capturing regardless of whether a sync trigger is received or not.

With mode 4 though, the camera waits indefinitely for a trigger pulse and then initiates capture on that frame once the pulse is received. Then it starts waiting indefinitely again until the next trigger pulse arrives.
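
The difference between the two modes can be illustrated with a toy timing model. This is only a sketch of the behaviour as described in this comment, not of the firmware: the slot boundaries, the timeout value and the function names are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mode4_captures(trigger_times_ms: Sequence[float]) -> List[float]:
    """Mode 4 (toy model): one capture per trigger, nothing otherwise."""
    return list(trigger_times_ms)


def mode2_captures(
    trigger_times_ms: Sequence[float],
    period_ms: float,
    n_frames: int,
    timeout_ms: float,
) -> List[Tuple[float, bool]]:
    """Mode 2 (toy model): capture on every frame slot.

    Each slot starts at k * period_ms. If a trigger arrives within
    timeout_ms of the slot start, the capture is synced to it; otherwise
    the camera captures unsynced once the timeout has passed.
    Returns (capture_time_ms, synced) pairs.
    """
    captures = []
    for k in range(n_frames):
        start = k * period_ms
        hit = next(
            (t for t in trigger_times_ms if start <= t <= start + timeout_ms),
            None,
        )
        if hit is not None:
            captures.append((hit, True))
        else:
            captures.append((start + timeout_ms, False))
    return captures
```

In this model, if the triggers stop arriving, mode 2 keeps producing unsynced frames while mode 4 produces nothing.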

@MartyG-RealSense
Collaborator

Hi @FrGrQuim Do you have an update about this case that you can provide, please? Thanks!

@FrGrQuim
Author

Hi @MartyG-RealSense,

After testing the genlock mode for two weeks, I can provide some feedback.

The genlock mode seems to work well for several consecutive hours, but after this period, the frames from the slave camera start to be consistently +/- 66ms more recent than the frames from the master camera (66 ms corresponds to a time period equivalent to 15Hz). Performing a hard reset of the camera did not correct the problem, but unplugging and re-plugging it did.
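
The 66 ms figure matches one frame period at 15 FPS (1000 / 15 ≈ 66.7 ms). A small sketch of that arithmetic follows, expressing a timestamp offset as a whole number of frame periods. The function names are hypothetical, and the timestamps are assumed to be in milliseconds, as returned by `frame.get_timestamp()` in librealsense.

```python
def frame_period_ms(fps: float) -> float:
    """Duration of one frame in milliseconds at the given frame rate."""
    return 1000.0 / fps


def offset_in_frames(master_ts_ms: float, slave_ts_ms: float, fps: float) -> int:
    """Signed slave-minus-master offset, rounded to whole frame periods."""
    return round((slave_ts_ms - master_ts_ms) / frame_period_ms(fps))
```

An offset of about +66 ms or -66 ms evaluated at 15 FPS comes out as +1 or -1 frame, while a few milliseconds of jitter rounds to 0.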

So far, I have tested only one set of cameras, and I will try another set today. I will update you with the results.

@MartyG-RealSense
Collaborator

Thank you, @FrGrQuim - I look forward to your next update.

@MartyG-RealSense
Collaborator

Hi @FrGrQuim Do you require further assistance with this case, please? Thanks!

@MartyG-RealSense
Collaborator

Case closed due to no further comments received.

@FrGrQuim
Author

Hi Marty,

Sorry for the delay in responding, but we had other priorities to work on. The genlock problem I described in the previous message also happens with other robots. Until now, we have reverted to the classic master/slave sync, changing the master/slave depending on the composition that works. However, now we need a method that works for every robot. Therefore, we want to return to the genlock method, but we must understand why, after several hours, there is a 66ms delay between the two frames.

Do you know if the clocks are synchronized via the trigger? Indeed, if the clocks are not synchronized, the timestamp can shift even if the frames are taken at the same time. Is the trigger sent by the master before or after taking the picture? I saw in a document that in genlock, the slave starts taking the picture when the trigger is received. If the master sends the trigger after it finishes taking its picture, a delay will always be present between the two frames.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jun 12, 2024

In the case of non-genlock at least (I do not know if the same applies to genlock), it is normal and correct for there to be a noticeable gap between the timestamps of master and slave after an extended period of time. Please visit the link below and scroll down to the paragraph that starts with the line "Now to the somewhat counter intuitive aspect of time stamps."

https://dev.intelrealsense.com/docs/multiple-depth-cameras-configuration#3-multi-camera-programming

In genlock (Inter Cam Sync Mode 4 and above), the capture is only taken after a trigger is received, and the camera then waits until the next trigger is received before performing another capture and then waiting again for the next trigger after that.

After the trigger is initially received, there is a period of 'exposure time' and then the capture takes place when that exposure period has completed.

@FrGrQuim
Author

It's strange, because we have already let the system run for more than 24 hours and never saw the shift appear (we allow 10 ms between the two timestamps), which would seem to show that the synchronization is not active. However, when we disable the synchronization, we clearly see that the delay between the two frames' timestamps is present from start-up.

Also, my question about the master trigger timing was focused on the master only. I attach a picture explaining the two possibilities:

master_trigger_timing (1)

Are we in case 1 or case 2? If it is case 2, how does the slave know what delay to apply between the trigger and the start of its capture so that it finishes at the same time as the master?

@MartyG-RealSense
Collaborator

The trigger from the master camera needs to arrive at the slave camera first before the slave can initiate capture.

The hardware sync documentation only talks about the master camera in terms of it being the generator of the trigger pulse that the slave cameras will sync their capture timing to. The master is not described as generating captures itself when a trigger pulse is produced by it.

When a trigger is received by a slave camera and the exposure period begins, the camera sensor initiates capture once the period of exposure time has elapsed and exposure is completed.

image

When sync is not enabled (when the Inter Cam Sync Mode of all cameras is '0'), the cameras will each independently initiate their captures without listening for a trigger pulse.
