-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syslink requires flow control #703
Comments
@ataffanel Has this been prioritized/planned on? I am currently hitting this problem when implementing the param/log TOC retrieval using crazyflie-link-cpp. The logic is to first send all requests for every single entry of the TOC in the first loop and then receiving all responses in the second loop. Sometimes I am missing responses. The workaround to have it in a single loop (i.e., send a request and then wait for the response) works, but is much less efficient. From what I remember, one potential workaround was to somehow limit the number of packets that are in flight per CRTP port to be 1? Is there a less restrictive workaround? Also, this should likely be implemented directly in crazyflie-link-cpp then? |
I had actually decided, independently of your message, to dedicate my week to implementing Syslink flow control :-). No promises, we are a skeleton crew currently at Bitcraze, but all my non-support/production time will be on this. I have had the same problem implemening TOC fetch in Rust. My solution is the same as yours, I fetch the element one by one. The few time fetching as a bulk worked, it reduces the fetch time from 2.4 seconds down to ~0.8 seconds. It is interesting to note that when fetching multiple tocs (log and param+param values) in parallel, the runtime is still 2.4 seconds since we are using the link very sparsely so the final win might not be that high. But I am happy that this is a very easily repeatable test case for the flow control. I would not advise you to start on in-flight limiting right now, it would indeed be a solution but it sounds complex to implement and it will become mostly useless once flow control is implemented. I will update this ticket during the week with my progress. One thing I am not sure about yet but that might happen is that we might eventually need a new version of Safelink (the radio packet counters). If I remember well, the current implementation of safelink both in the Crazyflie and in the libs does not allow for the Crazyflie to block the up-link even though it could. Without that, the only option is to stop ack-ing packet at radio level which is going to work but is not ideal so we should avoid it in the future. |
Awesome - thanks! Let me know if you need any help/brainstorming. For the C++ side it sometimes works to fetch in bulk, but sometimes it doesn't. My code is at https://github.com/whoenig/crazyflie_cpp/blob/70beb44478c627b4ec7b9e50573e71d61d6c0dfe/src/Crazyflie.cpp#L735-L766 (including an assert that finds the situation in Debug builds, but I can imagine that release builds are even worse.) The timing is exactly the same for me in C++. Using the (optional) cache, protected by CRC and numEntries, brings it down to about 0.2s. I don't think changing the safelink protocol is a big deal. We can always bump up the protocol version, use the "old" implementation to query the version first (potentially even without any use of safelink), and then switch to the newer version if the protocol number is high enough. |
This one is pretty important to me. It will allow me to switch to crazyflie-link-cpp as backend for crazyflie_cpp and the Crazyswarm. At the same time, I expect fixing this will make all libraries/clients more robust especially for areas with a bad radio connection. In some places it might also allow us to simplify the library/client code (where we have to apply ugly workarounds otherwise, see discussion above). |
I am looking at this next week. I did not manage to unlock enough time during the summer and it ended up being a bit harder than I initially though (as usual ....). Now I have a good idea of what needs to be done, I will update this ticket next week with my progress. |
After some great talk with @tobbeanton I have a new easier strategy:
So the strategy is to implement a new syslink packet that tell the nRF to stop or restart sending radio packets to the STM. This is higher level flow control assuming the low level UART is fine. We have asserts in the low level UART code so if those are a problem we will see it. We have some strategy like Non Maskable Interrupt, above FreeRTOS, and DMA that can improve the low level UART handling if needed. I will make a PR so that you can test it with your use-case @whoenig |
Cool! The main drawback I see is that it might require some tuning on when to send those packets to keep the risk of race conditions very low in practice. |
Yes agreed, I plan on keeping a wide margin to account for delay in the stop packet arrival to the nRF. On possible problem is that since Safelink on the nRF is not securing the uplink, the only way the nRF will have to stop the packet flow will be to not send any Ack back which will also stop the downlink, this is potential for a deadlock. If this is a problem that can be solved by implementing the uplink in Safelink both on the nRF and the Lib (it is an incompatible change so it will need some new packets at connection time). |
I would still prefer having that, as it might reduce lots of headache in the future. We already have to manually enable syslink at connection time. Couldn't we use this as a |
Hi @ataffanel: Is there any update on this issue? We are actively developing and using Crazyswarm2 with the native-link backend and do occasional issues that I think are related to this. |
As you probably noticed by now, no there is no update on this... I don't think @ataffanel has worked on this. Perhaps a good topic for the our next '3rd week' project? Anyway I'll make a todo for us on our board |
This will be on hold for a while due to arnauds absence. Perhaps I should use an 'inactive' tag? |
@knmcguire @krichardsson , I did not find TAG side code in repository. Do you know where it is? |
@amitx64 we don't know what you mean with TAG in the context of this issue. Please ask any support questions on https://discussions.bitcraze.io/ |
Since Arnaud is back, perhaps this could actually be worked on:-)? |
not yet back :) but next week he'll come back. Next week is an async week and jetlag for him so perhaps the week after? |
The communication between NRF51 and STM32 requires flow control to avoid packet loss. Currently, there is hardware-supported flow control from STM->NRF51 to prevent the STM to send data too quickly, but not the other way around. However, for long-lasting operations on the STM side, packets might be lost. A simple example is using the CRTP Echo port with an echo delay. Since the HW is not prepared for duplex HW flow control, this will require some protocol changes between the NRF51 and STM32.
The text was updated successfully, but these errors were encountered: