Fix for extra power cycle between firmware flashes on Linux platforms #6

Open: wants to merge 2 commits into base: main

Elianabeth

Resolves the issue where, on Linux, the firmware can't be loaded onto the board unless a load is attempted, the board is power cycled (and replugged), and the load is then attempted again (first reported by WPI).

@Elianabeth
Author

Also, for context on why the firmware failed to load before: the bootloader never actually entered the flashing stage, because the loader gave up early after falsely assuming the bootloader update step had failed (verify_resp(ser, BootloaderResponseCode.StartUpdate) fails).

The null char showing up in the first place is likely due to some driver-specific difference between the Linux serial driver and the Windows/macOS ones. When you replug the board, the bootloader responds with the sequence '0x0 0x1 0x2 0x3' to indicate it's ready for the next flash step.

However, when you just reset the board while it's already plugged in, you can receive the stream '0x0 0x1 0x0 0x2 0x3'. The loader mistakenly reads the '0x0' after the '0x1' as an error code (in verify_resp). It isn't actually an error; it's a result of the Linux serial driver weirdness.
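
To make the two streams concrete, here is a minimal sketch of the idea, assuming (as noted later in this thread) that 0x0 is never a meaningful response code. The helper name read_non_null is hypothetical, and io.BytesIO stands in for the serial port; this is not the loader's actual verify_resp.

```python
import io

NULL = b"\x00"

def read_non_null(port, count):
    """Read up to `count` bytes from `port`, discarding stray 0x00 bytes."""
    kept = bytearray()
    while len(kept) < count:
        byte = port.read(1)
        if byte == b"":      # nothing left (a real port would report a timeout this way)
            break
        if byte == NULL:     # assumption: never a real response code
            continue
        kept += byte
    return bytes(kept)

# Stream seen after replugging the board.
replugged = io.BytesIO(bytes([0x0, 0x1, 0x2, 0x3]))
# Stream seen after only resetting the board on Linux.
reset_only = io.BytesIO(bytes([0x0, 0x1, 0x0, 0x2, 0x3]))

print(read_non_null(replugged, 3))   # b'\x01\x02\x03'
print(read_non_null(reset_only, 3))  # b'\x01\x02\x03'
```

With the null bytes dropped, both streams reduce to the same three bytes, so the extra '0x0' no longer looks like an error.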

@Elianabeth
Author

Updated based on feedback to continue waiting while serial timeouts are received.

"Kyle Scaplen:
Please note that the reason the fixes in that PR work is that the byte b"\x00" is never an expected return code from the bootloader. Also, if the serial read were to time out with this fix applied, you would receive b"" back and not b"\x00" causing the while loop to exit and fail.
The actual bug here is that when working with a linux host, extra null bytes are received over the UART line (i.e. the byte 0x00 is never sent by the bootloader firmware, but the python library is reading a byte with the value 0x00). We have only seen this when using a linux host machine, and are not sure the best way to solve this."
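
A rough sketch of what "continue waiting while serial timeouts are received" could look like, treating both b"" (timeout) and b"\x00" (stray null) as "keep waiting". The names wait_for_code, ScriptedPort, and max_empty_reads are hypothetical and only for illustration; the real change in this PR may be structured differently.

```python
class ScriptedPort:
    """Stand-in for a serial port: read(1) returns the next scripted chunk."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def read(self, size=1):
        # An exhausted script behaves like a port that keeps timing out.
        return self._chunks.pop(0) if self._chunks else b""


def wait_for_code(port, expected, max_empty_reads=10):
    """Return True once `expected` arrives, ignoring timeouts and stray null bytes."""
    empty_reads = 0
    while empty_reads < max_empty_reads:
        byte = port.read(1)
        if byte == b"":        # read timed out: count it and keep waiting
            empty_reads += 1
            continue
        if byte == b"\x00":    # stray null byte: never a valid code, skip it
            continue
        return byte == expected
    return False               # gave up after too many consecutive timeouts


# A timeout and a stray null arrive before the real code.
port = ScriptedPort([b"", b"\x00", b"\x01"])
print(wait_for_code(port, b"\x01"))  # True
```

Bounding the number of empty reads keeps the loop from hanging forever if the board never answers.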

@krscaplen

@dtwalter approved

@jzhan357

jzhan357 commented Mar 1, 2023

The fix in this PR allows firmware loading to complete, but it doesn't fix further communication over the serial device: afterwards, cat {TTY} exits immediately.
A bit of experimenting led another UCSC student and me to find that stty -F {TTY} sane and screen {TTY} 115200 fix the serial device.
Another, more drastic solution is to unload and reload the cdc_acm driver with sudo modprobe -r cdc_acm and sudo modprobe cdc_acm. This causes all ttyACM devices to be dropped, which may be unacceptable in shared computing environments.

@jzhan357

jzhan357 commented Mar 1, 2023

The actual issue seems to be that pyserial reconfigures the serial device in a way that causes reads from the device to return immediately if no data is in the buffer.
A bit of poking with pyserial shows that it sets min (the termios VMIN setting, as reported by stty) to 0, which seems to be the culprit.
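
One way to check this without stty is to read the value back through Python's termios module. This is only a sketch: vmin_of and read_vmin are hypothetical helper names, and the device path passed to read_vmin is whatever {TTY} is on your machine.

```python
import os
import termios

def vmin_of(attrs):
    """Extract VMIN from an attribute list shaped like termios.tcgetattr() output."""
    value = attrs[6][termios.VMIN]   # index 6 is the control-character list
    # CPython gives an int in non-canonical mode and a 1-byte bytes object otherwise.
    return value if isinstance(value, int) else value[0]

def read_vmin(tty_path):
    """Open the given tty without blocking and report its current VMIN."""
    fd = os.open(tty_path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        return vmin_of(termios.tcgetattr(fd))
    finally:
        os.close(fd)
```

Running read_vmin on the device before and after the loader has used it should show whether the value changed. Note that VMIN only affects reads when the port is in non-canonical mode.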

@jzhan357

jzhan357 commented Mar 1, 2023

A bit of testing shows that when opening the serial device, setting inter_byte_timeout to a non-None value prevents pyserial from setting min to 0. The offending line is here.
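
If that observation holds, a loader could pass a non-None inter_byte_timeout when it opens the port. A minimal sketch, assuming pyserial is installed: the helper names and the timeout values are hypothetical, and the 115200 baud rate is simply the one used with screen earlier in this thread.

```python
def loader_port_kwargs(baudrate=115200, timeout=2.0, inter_byte_timeout=0.1):
    """Keyword arguments for serial.Serial; the values here are assumptions."""
    return {
        "baudrate": baudrate,
        "timeout": timeout,
        # Non-None on purpose: per the comment above, this stops pyserial
        # from setting min to 0 on the device.
        "inter_byte_timeout": inter_byte_timeout,
    }

def open_loader_port(tty_path):
    """Open the serial device with the settings above."""
    import serial  # pyserial, third-party; imported here so the sketch loads without it
    return serial.Serial(tty_path, **loader_port_kwargs())
```

Whether this alone leaves the device usable after flashing (for example, cat {TTY} no longer exiting immediately) would still need to be confirmed on a Linux host.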
