Communicating Failure in Updated Protocol #68

dcyoung · 2017-04-04T01:36:52Z

Scanse will start shipping devices shortly. The latest firmware that will come installed on the devices is using an updated communication protocol. The updated protocol is outlined in the docs on the motor-ready branch.

The relevant additions for this issue are...
Data Start - DS will NOT trigger data acquisition unless the motor is spinning ( > 0Hz) and the speed has stabilized.

If the motor is spinning and the speed has stabilized, then the device will respond to a DS command with a normal header ('DS00P\n'... ie: status bytes == '00') and data acquisition will commence.
If the motor is stationary or the speed has not stabilized, then the device will respond to a DS command with a modified header and data acquisition will NOT commence. This modified header will use the status bytes to indicate the type of failure. For example 'DS12S\n' (status bytes == '12') indicates that the motor speed has not yet stabilized, while 'DS13T\n' (status bytes == '13') indicates that the motor is stationary/0Hz. In either case the data acquisition will not commence.

Adjust Motor Speed - MS will not trigger an adjustment if the motor speed has not yet stabilized. Similar to above, the response message will use status bytes to indicate the failure.

Status bytes '00' indicate normal processing (ie: not issue)
Status bytes '11' indicate the original command was sent with an invalid parameter
Status bytes '12' indicate that the motor speed has not yet stabilized

We need a way to communicate these failures to the user without exiting or destroying the sweep. At the very least, the high level functions in sweep.cc need to be able to handle these conditions. I'm wondering about the following options. @daniel-j-h I am very open to suggestions, but need to get this going ASAP.

Add return values to many of the void functions (ie: sweep_device_start_scanning() could return false or an int capable of representing success + multiple failure conditions).
Error on these failures, but store the most recent error somewhere. Do not destroy anything on these kinds of errors. Include well defined error codes in the header such that the sweep.cc functions (and even the user) can check the most recent error code in the event of a failure.

Ie:

if(start scanning produces error){
  check the error type
  if (error type indicates the motor is stationary) {
    adjust the motor speed
    wait for motor speed to stabilize
    try to start scanning again
  }
}

Internal functions could opt to handle some of this logic. For example: sweep_device_set_motor_speed() might encounter an error because the motor speed is still adjusting from a previous command. The function could witness the error, wait for the motor speed to stabilize and then try again. All of this could happen without requiring any user code. However, some conditions like invalid parameters or a stationary motor will require user handling.

The text was updated successfully, but these errors were encountered:

daniel-j-h · 2017-04-04T08:10:41Z

Thanks for flagging this. Thinking a bit about a clean API here what if we split the start_scanning and stop_scanning functions into two aspects:

Start and stop motor: blocks and waits for stabilization. Sets a boolean flag in the device when motor speed stabilized. I would block here because it's the simplest interface for the user. If non-blocking is needed the user can do this herself, e.g. by fulfilling a future's promise on unblock.
Start and stop scanning: gathers point cloud data. Sets a boolean flag in the device when gathering data (we already do this). Asserts for the motor speed stabilization flag to be true.

The motor speed adjust function can assert for stabilized && !scanning. The get scan function can assert for stabilized && scanning. This would work as long as the motor speed adjustment function blocks until the new speed stabilized, too.

Another option would be to hide the stabilization from the user completely: if the speed is still stabilizing we simply return a scan with less / zero samples in the get scan function. The adjust motor speed function could return an error when the speed is still stabilizing.

Is there a way to figure out how long we have to wait for the speed to stabilize? I would prefer the simplest and most effective solution here. If that means we can e.g. simply block for a second in the constructor and adjust motor speed function, I would do so.

MikeGitb · 2017-04-04T08:30:17Z

Is there a particular reason, why it is not possible to change the motor speed before it has stabilized?

Regarding sweep_device_start_scanning(): Personally I'd prefer return values over some errno like mechanism.

If you decide to implement a blocking Interface, please allow for a (maybe user configurable) timeout.

dcyoung · 2017-04-04T17:53:46Z

Start and stop motor:

@daniel-j-h Are you proposing the addition of a new set of functions for stopping/starting the motor, separate from adjusting the motor speed? This set of functions would toggle the motor between motor speed setting 0Hz and whatever speed setting it was last using? This would require us to store the most recent motor speed... which means we'd have to query the motor speed on device creation as well. This would also necessitate that the set_motor_speed method only work with 1:10Hz, and the 0Hz setting only be applicable through the start/stop motor function?

This would handle the failure case for DS where it returns 'DS13T\n', indicating a stationary motor. Inside the start_scanning function we could check for this failure. In the event of a 13 status code failure, start_scanning could call the new start_motor method which would change the motor speed to the previous value before it was stopped. However, this solution requires that we:

store state information (current/previous motor speed)
query motor speed at creation
handle the odd edge case where the motor speed is 0Hz at the time of device creation

I can see this setup being possible, but I'm wondering if abstracting motor speeds 0:10Hz into 2 separate concepts is necessary. ie:

Stationary (0Hz) vs. Mobile (>0Hz)
Speed (1:10Hz)

Or.... are you proposing that the device be stationary at all times, unless it is scanning. This would remove some control from the user, but would make the logic more simple. All calls to start_scanning would then block for a significant period of time (minimum 6 seconds). ie:

device_construct enforces that the motor is stationary from instantiation
- calls stop_scanning to accomplish the following:
  - stop scanning
  - stop motor
start_scanning takes a motor speed as an input parameter. Then it performs the following:
- adjusts motor speed to the input value
  - requires 0-6 seconds depending on how recent any previous change to motor speed occurred
- waits for motor speed to stabilize
  - requires ~6 seconds
- begins data acquisition
stop_scanning performs the following:
- stops data acquisition
- stops the motor
set_motor_speed performs the following:
- waits for motor speed to stabilize
  - requires 0-6 seconds depending on how recently any previous change to motor speed occurred. If the motor speed was already stable, including stationary 0Hz case it will take 0 seconds. But if the motor speed was just changed before this new call to set_motor_speed, then this wait could take up to ~6seconds.
- adjusts motor speed

This removes the need to store any motor speed information, as the parameter is always passed into start_scanning. This makes the logic extremely simple for the user, but removes control and lengthens the process of start/stop scanning. In the best case scenario, the start_scanning takes ~6 seconds. In the worst case scenario start_scanning could take up to ~12 seconds (ex: stop_scanning is called immediately before start_scanning). This is because it has to wait for the previous change to stabilize, change motor speed, wait for the new change to stabilize and then finally it can start scanning.

Is there a way to figure out how long we have to wait for the speed to stabilize?

Is there a particular reason, why it is not possible to change the motor speed before it has stabilized?

@daniel-j-h and @MikeGitb When the motor speed is changed, the sensor also performs a calibration on the encoder ticks to account for variation in plastics on each device. This routine takes ~6 seconds. After changing the motor speed or when the device is powering on, this time window is a grace period where the device will not handle certain tasks related to motor speed. We hope to add more optimized control in the future, where the grace period might decrease but currently this is what we have to work with.

please allow for a (maybe user configurable) timeout

@MikeGitb I agree that we definitely want timeouts, but I think these should be based on the maximum possible duration for a motor speed stabilization, and not a user config. If you look at the sweep_device_wait_until_motor_ready() function, you can see that there is a timeout in there. Currently its set to something high like 10 seconds, but this could be dialed back to just longer than the maximum time set by the firmware ie: >6seconds.

MikeGitb · 2017-04-04T18:33:54Z

@dcyoung: Ok, sorry I was again thinking too much in terms of running this lib on a microcontroller/embedded system. Calling a function that might block for 6-10 seconds would be an absolute no-go for me (although it is probably fine for most other systems). But I think I'll just write my own library for my specific needs.

Regarding the grace period: I don't know how you control the motor, but I find it strange that you can't just change the setpoint before you reached the previous one.

dcyoung · 2017-04-04T18:46:52Z

@MikeGitb i edited the original response with more details. Its really due to a calibration routine that runs when the motor speed is adjusted. As for using the library without blocking. I see no reason why we couldn't break the functions down into base units, and provide them as well. For example, the sweep_device_wait_until_motor_ready() makes use of the sweep_device_get_motor_ready() function which does not block.

dcyoung · 2017-04-04T21:18:28Z

What if we implemented all of the base (non-blocking) functions similar to the current fashion. Ie: the methods populate an error in the event of a status failure. The user would have access to these methods, but we would also provide simple abstractions that block.

Do NOT maintain that the device be stationary when not scanning. Ie: device is allowed to be at any speed when not scanning.
The methods would then work like this:

low_level_set_motor_speed [low-level, non-blocking]
- sends MS command
- wait for response
- error on error or failure
low_level_get_motor_ready [low-level, non blocking]
- sends MZ command
- wait for response
low_level_start_scanning [low-level, non blocking]
- sends DS command
- wait for response
- error on error or failure
wait_until_motor_stabilized [high-level, blocking]
- loops on low_level_get_motor_ready() until timeout or true...
- will return very quickly if the motor speed was not changed soon before... worst case 6 second block
start_scanning [high level, blocking]
- calls wait_until_motor_stabilized()
- calls low_level_start_scanning()
set_motor_speed [high-level, blocking]
- calls get_motor_speed, and returns early if the requested speed is the same as the current speed
- calls wait_until_motor_stabilized()
- calls low_level_set_motor_speed()

High level user code could then look like:

set_motor_speed(...)
start_scanning()

A user looking for non-blocking functions could simply write their own logic, being careful not to call the low_level_start_scanning or low_level_set_motor_speed unless they've first checked that...

the motor speed is not 0Hz via get_motor_speed
the motor speed is stabilized via low_level_get_motor_ready

dcyoung · 2017-04-05T01:13:42Z

I pushed a status-codes branch with these changes that seems to be working well.

Let me know what you think or if you have questions.

PR is up #70

dheera · 2017-04-20T02:03:02Z

Is there any reason the device should ever not be scanning? Can it be always outputting scans?

dcyoung · 2017-04-20T18:42:11Z

@dheera There are quite a few reasons. The most important reason is that the device cannot communicate or process commands (other than STOP) while scanning. Additionally, unlike the rest of the communication protocol, data packets are not delimited. This is meant to keep the data stream compact and efficient. The packets are parsed by expected size and then validated. So even if the device could process and respond to other commands while streaming, the responses would be lost in the stream.

These kinds of discussions are related more to the device's firmware and communication protocol, to which this SDK conforms. In the future, Scanse may offer other platforms for providing feedback about the device firmware. But for now, if you have suggestions or feedback regarding the device's firmware or the communication protocol, always feel free to send a support message directly via the Scanse website.

dcyoung · 2017-04-21T18:07:44Z

Resolved in b867c37

dcyoung added Discussion libsweep labels Apr 4, 2017

dcyoung closed this as completed Apr 21, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Communicating Failure in Updated Protocol #68

Communicating Failure in Updated Protocol #68

dcyoung commented Apr 4, 2017 •

edited

Loading

daniel-j-h commented Apr 4, 2017

MikeGitb commented Apr 4, 2017 •

edited

Loading

dcyoung commented Apr 4, 2017 •

edited

Loading

MikeGitb commented Apr 4, 2017

dcyoung commented Apr 4, 2017

dcyoung commented Apr 4, 2017 •

edited

Loading

dcyoung commented Apr 5, 2017 •

edited

Loading

dheera commented Apr 20, 2017

dcyoung commented Apr 20, 2017 •

edited

Loading

dcyoung commented Apr 21, 2017

Communicating Failure in Updated Protocol #68

Communicating Failure in Updated Protocol #68

Comments

dcyoung commented Apr 4, 2017 • edited Loading

daniel-j-h commented Apr 4, 2017

MikeGitb commented Apr 4, 2017 • edited Loading

dcyoung commented Apr 4, 2017 • edited Loading

MikeGitb commented Apr 4, 2017

dcyoung commented Apr 4, 2017

dcyoung commented Apr 4, 2017 • edited Loading

dcyoung commented Apr 5, 2017 • edited Loading

dheera commented Apr 20, 2017

dcyoung commented Apr 20, 2017 • edited Loading

dcyoung commented Apr 21, 2017

dcyoung commented Apr 4, 2017 •

edited

Loading

MikeGitb commented Apr 4, 2017 •

edited

Loading

dcyoung commented Apr 4, 2017 •

edited

Loading

dcyoung commented Apr 4, 2017 •

edited

Loading

dcyoung commented Apr 5, 2017 •

edited

Loading

dcyoung commented Apr 20, 2017 •

edited

Loading