Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NotifySetpointStop odd behavior #464

Closed
The-SS opened this issue Aug 24, 2021 · 8 comments
Closed

NotifySetpointStop odd behavior #464

The-SS opened this issue Aug 24, 2021 · 8 comments

Comments

@The-SS
Copy link
Contributor

The-SS commented Aug 24, 2021

I was testing out NotifySetpointStop() but noticed some strange behavior.

  • cmdPosition followed by NotifySetpointStop(100), sleeping for 0.1sec, then land: lands after 0.1 sec
  • cmdPosition followed by NotifySetpointStop(1000), sleeping for 1sec, then land: cf starts to drift, but then land kicks in and the cf lands normally
  • cmdPosition followed by NotifySetpointStop(5000), sleeping for 5sec, then land: once NotifySetpointStop is called, the propellers just stop immediately and the drone crash lands. 5 seconds later, the drone hovers motors start again and the drone hovers at landing height then lands properly. (The motors stopping was also noted in PR #202)
  • cmdPosition followed by NotifySetpointStop(5000) then land: even though the cf should still be in low-level control mode for 5 seconds, the drone lands normally.
  • cmdPosition followed by NotifySetpointStop(100) then land: the drone lands normally. (this is how it's used in other crazyswarm)
  • cmdPosition followed by land: The drone drifts a bit for about 0.5 second then lands where it should have landed before drifting
  • cmdPosition followed by sleep for 5sec then land: The drone drifts a bit for about 0.5 second then the motors stop and it crash lands. When land is called, the drone hovers again and lands in it's place.

NotifySetpointStop doesn't seem to be doing what it should, at least not always. When the "remainValidMillisecs" argument is large, it seems to cause the propellers to stop if a high-level command is not received immediately. Varying the duration "remainValidMillisecs" doesn't seem to be effective.

Does the cf usually use 100ms as the default timeout before switching to high-level control (even if NotifySetpointStop is not called)?

@whoenig
Copy link
Contributor

whoenig commented Aug 26, 2021

This seems more like a firmware issue, rather than a Crazyswarm problem? I personally haven't tried switching between modes recently.

@The-SS
Copy link
Contributor Author

The-SS commented Aug 28, 2021

I just thought I'd note it in case anyone else is trying to use it or if someone else is using this successfully. I'm just landing with low-level control (using cmdPosition makes this easy) followed by cmdStop.

@jpreiss
Copy link
Collaborator

jpreiss commented Oct 3, 2021

Thanks for doing these detailed experiments. I implemented the firmware for notifySetpointsStop but I would not be shocked if I made a mistake.

@jpreiss jpreiss closed this as completed Oct 3, 2021
@jpreiss jpreiss reopened this Oct 3, 2021
@jpreiss
Copy link
Collaborator

jpreiss commented Oct 3, 2021

Before discussing, I will repost some code for convenience. I have added comments and removed some concurrency-related stuff for simplicity. First, here is the timeout logic in commander.c:

if ((currentTime - setpoint->timestamp) > COMMANDER_WDT_TIMEOUT_SHUTDOWN) {
  // enableHighLevel is controlled by the param system - 
  // it's always true if you use the default Crazyswarm launch file.
  if (enableHighLevel) { 
    crtpCommanderHighLevelGetSetpoint(setpoint, state);
  }
  if (!enableHighLevel || crtpCommanderHighLevelIsStopped()) {
    memcpy(setpoint, &nullSetpoint, sizeof(nullSetpoint));
  }
} else if ((currentTime - setpoint->timestamp) > COMMANDER_WDT_TIMEOUT_STABILIZE) {
  xQueueOverwrite(priorityQueue, &priorityDisable);
  // Leveling ...
  setpoint->mode.x = modeDisable;
  setpoint->mode.y = modeDisable;
  setpoint->mode.roll = modeAbs;
  setpoint->mode.pitch = modeAbs;
  setpoint->mode.yaw = modeVelocity;
  setpoint->attitude.roll = 0;
  setpoint->attitude.pitch = 0;
  setpoint->attitudeRate.yaw = 0;
  // Keep Z as it is
}

and here is what happens when notifySetpointsStop is received

uint32_t currentTime = xTaskGetTickCount();
// Note: this MIN has no effect unless currentTime is very small, i.e. we just rebooted.
// In normal operation we can assume:
// timeSetback == COMMANDER_WDT_TIMEOUT_SHUTDOWN - M2T(remainValidMillisecs).
int timeSetback = MIN(
  COMMANDER_WDT_TIMEOUT_SHUTDOWN - M2T(remainValidMillisecs),
  currentTime
);
tempSetpoint.timestamp = currentTime - timeSetback;

So essentially, we are dictating the time at which the first if branch in the previous code block is taken. In other words, it tells the firmware: "if I have not sent a high level command after remainValidMillisecs, something is wrong." Therefore, the script should always use a timeout period that is comfortably longer than the expected time needed to send the high-level command.

On the other hand, if the timeout period is longer than COMMANDER_WDT_TIMEOUT_STABILIZE, then the else if branch will still be taken. This will cause the "drifting" behavior you observed, as the controller is no longer trying to hold the position in the xy-plane. This is probably not desired, but we never thought about it because we always assumed that the timeout period would be less than 0.5 seconds, the current value of COMMANDER_WDT_TIMEOUT_STABILIZE.

Remember, the only purpose of notifySetpointsStop is to account for the time it takes to execute the next line of your script and send another command. The default value of 0.1 seconds is already intended to be conservative. I am not aware of any practical reason why you would want to use a timeout of 1 or 5 seconds.

Now, moving on to your experiments. Keep in mind that my interpretations might be wrong, since this part of the firmware is complicated.

  1. cmdPosition followed by NotifySetpointStop(100), sleeping for 0.1sec, then land: lands after 0.1 sec
  • Bad input. This is like a race condition. The script should never intentionally exceed the timeout period before sending the high-level command. If the timeout activates and you have used the high-level commander previously (e.g. takeoff), then the setpoint will revert to the end of the last high-level trajectory, which could be far from the current position. If the timout activates and you have never used the high-level commander (e.g. you took off using position commands), the motors will stop.
  1. cmdPosition followed by NotifySetpointStop(1000), sleeping for 1sec, then land: cf starts to drift, but then land kicks in and the cf lands normally
  • Input not wrong, but not designed for. The drifting occurs because COMMANDER_WDT_TIMEOUT_STABILIZE is shorter than the supplied timeout period, as explained after the second code block.
  1. cmdPosition followed by NotifySetpointStop(5000), sleeping for 5sec, then land: once NotifySetpointStop is called, the propellers just stop immediately and the drone crash lands. 5 seconds later, the drone hovers motors start again and the drone hovers at landing height then lands properly. (The motors stopping was also noted in PR #202)
  • Bug. We would not be surprised to see the motors stop after 5 seconds due to the race condition. However, they should not stop immediately.
  1. cmdPosition followed by NotifySetpointStop(5000) then land: even though the cf should still be in low-level control mode for 5 seconds, the drone lands normally.
  • Don't know. This is not what the firmware code intends to do, but it probably should be! The current code will never switch back to high-level commander until the timeout expires, but the high level commander still starts the trajectory the moment it receives the command. I would expect the CF to hold the last position for 0.5 seconds, then start drifiting due to COMMANDER_WDT_TIMEOUT_STABILIZE. Then after 4.5 more seconds, if the landing duration was less than 5 seconds, it will hit the crtpCommanderHighLevelIsStopped() branch in the first block. This will send the nullSetpoint which (EDIT) is zero-initialized, which corresponds to modeDisable for all controls, so essentially should let the CF fall.
  1. cmdPosition followed by NotifySetpointStop(100) then land: the drone lands normally. (this is how it's used in other crazyswarm)
  • Good. This is the intended use.
  1. cmdPosition followed by land: The drone drifts a bit for about 0.5 second then lands where it should have landed before drifting
  • Not sure. This is equivalent to calling notifySetpointsStop(COMMANDER_WDT_TIMEOUT_SHUTDOWN), which is 2.0 seconds by default. I would expect it to hold the last position for 0.5 seconds, then drift for 1.5 seconds, then land. How sure are you about the time?
  1. cmdPosition followed by sleep for 5sec then land: The drone drifts a bit for about 0.5 second then the motors stop and it crash lands. When land is called, the drone hovers again and lands in it's place.
  • As designed. Again, are you sure the drift is not 2.0 seconds? It's questionable whether or not the final land should exit the emergency state (i.e. cause crtpCommanderHighLevelIsStopped() to become false, so the CF tracks the landing trajectory), but the motor stop is expected.

In summary: I think there is probably at least one firmware bug related to notifySetpointsStop. I am most concerned with the motor stop in item 3. However, it seems that all the serious problems are happening when you use long timeout values. Is there any reason why you want to use values longer than a few hundred milliseconds?

I think this part of the firmware is bug-prone because there are too many different subsystems interacting. I would like to replace several parts of the firmware with a unified state machine that deals with low-level setpoint timeout, high/low-level switching, and idle/active transition in the high-level commander (takeoff/land) in one place. Maybe free-fall, tumble, and not-flying detection could also be subsumed. I am also not sure why the param commander.enHighLevel needs to exist, aside from historical reasons.

Edit: The notion of "priority" for setpoints is another factor that's complicated to reconcile with timeout, low/high switching, etc.

@The-SS
Copy link
Contributor Author

The-SS commented Oct 4, 2021

From the notifySetpointsStop docstring it sounded like the drone would hover in place until the specified time ends (maybe it would be good to note that this is true only for COMMANDER_WDT_TIMEOUT_STABILIZE milliseconds). We have multiple drones reaching their targets at different times. I think we were trying to exploit notifySetpointsStop to basically keep the drone hovering at the destination for a long time without having to continuously send it a new command. I know this is not the intended use (lol).

Regarding the reported times, I tried repeating the experiments several times to make sure that they were right. So, I think the reported 0.5 sec in 6 and 7 are correct. I am almost certain it wasn't 2 sec (which is the value in COMMANDER_WDT_TIMEOUT_STABILIZE). However, it's been over a month since then and I cannot repeat the experiments now, so take this with a grain of salt.

@jpreiss
Copy link
Collaborator

jpreiss commented Oct 4, 2021

For your application, why not notifySetpointsStop(100) followed immediately by goTo(hover_destination)?

@The-SS
Copy link
Contributor Author

The-SS commented Oct 4, 2021

High-level commands have additional time over-head that we didn't want (they are slower than the low-level commands). We could just send the drone its final location repeatedly, but we thought maybe we can also avoid that if notifySetpointsStop holds the drone in place until the provided time has elapsed.
I ended up using cmdPosition to land the drone followed by cmdStop. It's pretty straightforward to do that.

@jpreiss
Copy link
Collaborator

jpreiss commented Apr 8, 2022

The firmware changes in bitcraze/crazyflie-firmware#903 should have made notifySetpointsStop reliable. They were merged into the official firmware release, but I do not know how much they have been tested besides my own expriments.

Now the remainValidMillisecs parameter is ignored. The last setpoint will remain valid according to the usual timeout rules. This means you must send a high-level command less than COMMANDER_WDT_TIMEOUT_STABILIZE duration after sending notifySetpointsStop.

@jpreiss jpreiss closed this as completed Apr 8, 2022
jpreiss pushed a commit to jpreiss/crazyswarm that referenced this issue Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants