-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Firmware USB streaming can stall after the first 16K USB transfer following RX restart #1042
Comments
This was referenced Jan 29, 2022
martinling
added a commit
to martinling/hackrf
that referenced
this issue
Feb 3, 2022
This fixes bug greatscottgadgets#1042, which occured when an RX->OFF->RX sequence happened quickly enough that the loop in rx_mode() did not see the change. As a result, the enable_baseband_streaming() call at the start of that function was not repeated for the new RX operation, so RX progress stalled. To solve this, the vendor request handler now increments a sequence number when it changes the transceiver mode. Instead of the RX loop checking whether the transceiver mode is still RX, it now checks whether the current sequence number is the same as when it was started. If not, there must have been at least one mode change, so the loop exits, and the main loop starts the necessary loop for the new mode. The same behaviour is implemented for the TX and sweep loops. For this approach to be reliable, we must ensure that when deciding which mode and sequence number to use, we take both values from the same set_transceiver_mode request. To achieve this, we briefly disable the USB0 interrupt to stop the vendor request handler from running whilst reading the mode and sequence number together. Then the loop dispatch proceeds using those pre-read values.
This was referenced Feb 3, 2022
martinling
added a commit
to martinling/hackrf
that referenced
this issue
Feb 8, 2022
This is a defensive change to make the transceiver code easier to reason about, and to avoid the possibility of races such as that seen in greatscottgadgets#1042. Previously, set_transceiver_mode() was called in the vendor request handler for the SET_TRANSCEIVER_MODE request, as well in the callback for a USB configuration change. Both these calls are made from the USB0 ISR, so could interrupt the rx_mode(), tx_mode() and sweep_mode() functions at any point. It was hard to tell if this was safe. Instead, set_transceiver_mode() has been removed, and its work is split into three parts: - request_transceiver_mode(), which is safe to call from ISR context. All this function does is update the requested mode and increment a sequence number. This builds on work already done in PR greatscottgadgets#1029, but the interface has been simplified to use a shared volatile structure. - transceiver_startup(), which transitions the transceiver from an idle state to the configuration required for a specific mode, including setting up the RF path, configuring the M0, adjusting LEDs and UI etc. - transceiver_shutdown(), which transitions the transceiver back to an idle state. The *_loop functions that implement the transceiver modes now call transceiver_startup() before starting work, and transceiver_shutdown() before returning, and all this happens in the main thread of execution. As such, it is now guaranteed that all the steps involved happen in a consistent order, with the transceiver starting from an idle state, and being returned to an idle state before control returns to the main loop. For consistency of interface, an off_mode() function has been added to implement the behaviour of the OFF transceiver mode. Since the transceiver is already guaranteed to be in an idle state when this is called, the only work required is to set the UI mode and wait for a new mode request.
martinling
added a commit
to martinling/hackrf
that referenced
this issue
Feb 8, 2022
This is a defensive change to make the transceiver code easier to reason about, and to avoid the possibility of races such as that seen in greatscottgadgets#1042. Previously, set_transceiver_mode() was called in the vendor request handler for the SET_TRANSCEIVER_MODE request, as well in the callback for a USB configuration change. Both these calls are made from the USB0 ISR, so could interrupt the rx_mode(), tx_mode() and sweep_mode() functions at any point. It was hard to tell if this was safe. Instead, set_transceiver_mode() has been removed, and its work is split into three parts: - request_transceiver_mode(), which is safe to call from ISR context. All this function does is update the requested mode and increment a sequence number. This builds on work already done in PR greatscottgadgets#1029, but the interface has been simplified to use a shared volatile structure. - transceiver_startup(), which transitions the transceiver from an idle state to the configuration required for a specific mode, including setting up the RF path, configuring the M0, adjusting LEDs and UI etc. - transceiver_shutdown(), which transitions the transceiver back to an idle state. The *_mode() functions that implement the transceiver modes now call transceiver_startup() before starting work, and transceiver_shutdown() before returning, and all this happens in the main thread of execution. As such, it is now guaranteed that all the steps involved happen in a consistent order, with the transceiver starting from an idle state, and being returned to an idle state before control returns to the main loop. For consistency of interface, an off_mode() function has been added to implement the behaviour of the OFF transceiver mode. Since the transceiver is already guaranteed to be in an idle state when this is called, the only work required is to set the UI mode and wait for a new mode request.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This has been very hard to pin down, but I think I've now proven it.
In both #883 and #916, we saw host-side problems that caused a rapid RX stop/start sequence to fail, due to concurrency bugs in our usage of libusb. I believe I have now fixed all these host-side issues in #1029.
However, there was still a failure mode being seen in which no libusb errors occured, yet the RX data flow failed to restart. The RX LED was on when this happened, indicating that the transceiver mode change had been processed on the device. The libusb event handling was proceeding correctly, but a bulk transfer was never filled.
I've now succeeded in using LUNA to catch this happening on the wire. A pcap file, trimmed to just the relevant section, is attached: clip.pcap.zip
After the vendor requests to change the transceiver mode to
OFF
, and then toRX
, only a single 16KB transfer from the firmware is completed. The host continues to sendIN
tokens to request data from the endpoint, but the device only responds withNAK
.From the host side, it appears as if no data at all is received, because we submit libusb transfers for 256KB of data, with no timeout.
I have no idea what's going wrong on the firmware side yet, but I do know that the symptoms are reproducible both with and without the changes in #982.
The text was updated successfully, but these errors were encountered: