µSD-logging fails in combination with other SPI decks like loco-deck and flow-deck #270

Closed
Woody7777 opened this issue Dec 1, 2017 · 21 comments
@Woody7777

Using both the µSD-card deck and the loco deck, logging works only for a short time (a few hundred log entries) before it stops. Interestingly, this behaviour does not seem to depend on the variables logged, the sampling rate or the buffer size set in the µSD config file, but solely on the time after startup: e.g. with 1000 Hz you'll get ~150 log entries while with 10 Hz you'll get only 20 or so (not exactly reproducible in my experience).

I used the latest firmware and VM2017.03 to reproduce this behaviour. Unfortunately, I don't have a debugger at hand currently. Maybe someone can have a look at the debug prints?

@tobbeanton
Member

I've started to investigate this and my first impression is that the usdWriteTask doesn't get time to run. I will investigate further whether that really is the case and, if so, why.

@ataffanel changed the title from "µSD-logging fails in combination with loco-deck" to "µSD-logging fails in combination with other SPI decks like loco-deck and flow-deck" on Feb 5, 2018
@ataffanel
Member

The problem has also been observed with the flow deck: https://forum.bitcraze.io/viewtopic.php?f=2&t=2821

@simplexsigil

I played around with changing the task priorities of the USDLOG_TASK_PRI and USDWRITE_TASK_PRI as well as the priority of the "pamotion" task of the flow deck (it was hard coded to 3).

However, apart from not getting enough runtime, there also appears to be a mutex hold problem.

With the values
USDWRITE_TASK_PRI = 2
"pamotion"-Task-Priority = 1
USDLOG_TASK_PRI = 0

the following error occurs:
SYS: The system resumed after watchdog timeout [WARNING]
SYS: Assert failed at src/lib/FreeRTOS/tasks.c:3495

With these settings this always happens, with the standard settings it only happens sometimes.
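
To illustrate why the priority assignment matters, here is a toy model of fixed-priority preemptive scheduling. It is not FreeRTOS code and it does not model the mutex or the assert; it only shows how the lowest-priority task can be starved when the higher-priority tasks use up the whole period. The priorities are the ones quoted above (in FreeRTOS a higher number means a higher priority); the period and burst figures are hypothetical.

```python
def simulate(tasks, ticks):
    """Toy fixed-priority scheduler.

    tasks: dict name -> (priority, period, burst). Every `period` ticks the
    task becomes ready and needs `burst` ticks of CPU time.
    Each tick, the ready task with the highest priority number runs.
    Returns dict name -> number of ticks the task actually ran.
    """
    remaining = {name: 0 for name in tasks}
    ran = {name: 0 for name in tasks}
    for t in range(ticks):
        # Release new work at the start of each period.
        for name, (_, period, burst) in tasks.items():
            if t % period == 0:
                remaining[name] += burst
        ready = [name for name in tasks if remaining[name] > 0]
        if not ready:
            continue
        current = max(ready, key=lambda name: tasks[name][0])
        remaining[current] -= 1
        ran[current] += 1
    return ran


# Priorities from the comment above; period/burst values are made up.
tasks = {
    "USDWRITE": (2, 10, 4),
    "pamotion": (1, 10, 6),
    "USDLOG": (0, 10, 2),
}
result = simulate(tasks, 1000)
print(result)  # {'USDWRITE': 400, 'pamotion': 600, 'USDLOG': 0}
```

With these made-up loads the two higher-priority tasks consume all ten ticks of every period, so the priority-0 task never runs.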

@Woody7777
Author

I made an interesting discovery when playing around with the baudrate for SPI communication:
As long as it is set to the default (2MBit), the error described in the post before occurs.
However, after changing it to something higher, the error disappears but logging still ceases after a couple of log entries, even when using higher priorities. Still, I was asking myself why we're using the lowest baudrate available?

Unfortunately, a watchdog timeout sometimes occurs when starting up using these settings (with no logged assert). I only have one crazyflie for testing, so I can't verify the behaviour.

Well, priorities and baudrate are just the most obvious settings that I think may lead to this issue.
I'm sure there are people out there knowing more details regarding this module.
I know it's not easy to instruct someone on issues like that, but I'd be really happy to help... on my own, I'm missing clues where to have a look next :-(

@tobbeanton
Member

I quickly looked at this too and I could not find an easy solution, but I got a bit wiser. Getting fast enough bus access is one of the reasons. The flow deck runs on max 2MHz clock and the loco deck does a lot of small accesses. Either of these is giving the sd-card problems. Why it just stops logging I don't know though, that feels like a bug.

Optimizing the SPI access could be one way forward, e.g. only switching the SPI speed if needed. Another thing would be to look at the mutex protection of the SPI driver. Maybe it doesn't work as it is supposed to.
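
As a rough back-of-the-envelope check on the bus-access point, the sketch below computes how long the raw clocking of a transfer takes at a given SPI clock. It ignores command overhead, chip-select handling and mutex waits, so real transfers take longer. The 2 MHz figure is the flow deck limit mentioned above; the 20 MHz figure is a hypothetical faster clock for comparison, and 512 bytes is the usual SD sector size.

```python
def spi_transfer_time_s(n_bytes, clock_hz):
    """Time to clock n_bytes over SPI at clock_hz, ignoring all overhead."""
    return n_bytes * 8 / clock_hz


def spi_max_bytes_per_s(clock_hz):
    """Upper bound on throughput: one bit per clock cycle."""
    return clock_hz / 8


block_ms_2mhz = spi_transfer_time_s(512, 2_000_000) * 1000    # about 2.05 ms
block_ms_20mhz = spi_transfer_time_s(512, 20_000_000) * 1000  # about 0.20 ms
max_rate_2mhz = spi_max_bytes_per_s(2_000_000)                # 250000.0 bytes/s

print(block_ms_2mhz, block_ms_20mhz, max_rate_2mhz)
```

At 2 MHz a single sector occupies the shared bus for about 2 ms, during which no other deck on the same SPI port can be serviced.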

@shushuai3

In order to use both the flow and SD decks, I changed the SD port from SPI1 to SPI3. After changing the related driver as well as the DMA stream, it can log for an arbitrarily long time while the flow deck is working.

However, this is a workaround rather than a direct fix for the bug. I look forward to seeing a smarter solution.

@tungtumkit

@shushuai3 can you share your changes (i.e. which lines of code were modified) to make this work? For our research experiment this is very important. Thanks in advance :).

@shushuai3

@tungnx94 You can find the code on branch "sdspi" in https://github.com/shushuai3/crazyflie-firmware.git
Remember to change the sd card connection as CS->PA3, SCLK->PC10, MISO->PC11, MOSI->PC12, while keeping the VCC, GND, and OW. Good luck.

@tungtumkit

@shushuai3 I pulled your code just now. Can I use it right away, or do I need to modify anything?

@ataffanel
Member

@tungnx94 this requires patching the sd-card deck quite heavily. Nothing impossible, but you need a good soldering iron and good soldering skills.

@wydmynd

wydmynd commented Oct 2, 2019

I just want to comment that I tried the code from SdSPIUpdate branch on tungnx94's fork. It worked.
The wiring is not so bad if, instead of modifying the SD deck, you just wire the relevant pads from an existing deck to a microSD breakout like this and use deck-force to force the initialization of the deck driver. I tried this successfully with wiring to a multiranger.

@wydmynd

wydmynd commented Nov 21, 2019

To keep using this hack in recent firmware versions you must remove all references to RZR platforms from cf2_platform.c.
The use of an SPI channel for the IMU in the RZR causes a collision.

@tobbeanton
Member

In commit bfedc1c the work from @shushuai3 was merged and updated and now put behind a compile flag.

To have the sd-card on SPI3 the SD-deck needs to be patched:
CS->RX2(PA3), SCLK->TX1(PC10), MISO->RX1(PC11), MOSI->IO_4(PC12)
or an sd-card adapter breakout can be used. Compile the firmware with "CFLAGS += -DUSDDECK_USE_ALT_PINS_AND_SPI" in config.mk.

A side effect is that the led-ring will not work due to a DMA conflict with SPI3.

@wydmynd

wydmynd commented Mar 16, 2020

Thank you! This helps a lot. Just making sure: will this fix work on the Bolt, since the SPI configuration there is slightly different?

@matejkarasek
Contributor

I have tested this on the Bolt with the Loco deck and it is working.
You have to change the pinout of the Loco deck as described on the wiki (very bottom of the page)

@ZaneKaminski

ZaneKaminski commented Mar 27, 2020

Are the schematic and layout files for the microSD deck available? What microSD and connector part numbers are used on the board? Seems only the Crazyflie 1.0 hardware is available on Bitcraze's GitHub, not the decks.

@shushuai3

@ZaneKaminski Both microSD and Crazyflie 2.0 schematics are in the wiki (https://wiki.bitcraze.io/projects:crazyflie2:index).
Meanwhile, thanks to @tobbeanton for merging the code into master.

@matejkarasek
Contributor

Great to read this issue has been solved!

Already wanted to celebrate, but I just tested this with the latest clean master (on both the CF2.1 and the Bolt) and, unfortunately, the logging will stop after a (very short) while and the crazyflie restarts, blinking red 5 times and showing the following console message:

...
DECK_CORE: Deck 0 test [OK].
DECK_CORE: Deck 1 test [OK].
SYS: The system resumed after watchdog timeout [WARNING]
SYS: No assert information found

No assert is what I got most of the time; however, once I also saw this

SYS: Assert failed at .//vendor/FreeRTOS/timers.c:822

and once this

ESTKALMAN: WARNING: Kalman prediction rate low (96)

in the console just before the restart.

The setup is CF2.1, Loco Deck & SD card Deck, LPS is set to TDoA v3, and I use the following config.txt:

10   # frequency
50    # buffer size
log   # file name
1     # enable on startup (0/1)
1     # mode (0: disabled, 1: synchronous stabilizer, 2: asynchronous)
stabilizer.roll
stabilizer.pitch
stabilizer.yaw
stabilizer.thrust
pm.vbat
stateEstimate.x
stateEstimate.y
stateEstimate.z

At 10 Hz I get a few seconds of data before it crashes, going to 1Hz allows for ~30 seconds but then also crashes.
As expected, switching to TWR results in an immediate crash...
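
For anyone comparing configurations, the config.txt layout shown above can be read with a few lines of Python. This is only a sketch that mirrors the layout as it appears in this thread (five header lines followed by one log variable per line, with `#` starting a comment); the firmware's own parser may behave differently, and the function and key names are made up for illustration.

```python
def parse_usd_config(text):
    """Parse a config.txt in the layout shown in this thread (assumed layout)."""
    lines = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            lines.append(line)
    if len(lines) < 5:
        raise ValueError("expected at least 5 header lines")
    return {
        "frequency_hz": int(lines[0]),
        "buffer_size": int(lines[1]),
        "file_name": lines[2],
        "enable_on_startup": lines[3] == "1",
        "mode": int(lines[4]),
        "variables": lines[5:],
    }


EXAMPLE = """\
10   # frequency
50    # buffer size
log   # file name
1     # enable on startup (0/1)
1     # mode (0: disabled, 1: synchronous stabilizer, 2: asynchronous)
stabilizer.roll
stabilizer.pitch
stabilizer.yaw
stabilizer.thrust
pm.vbat
stateEstimate.x
stateEstimate.y
stateEstimate.z
"""

cfg = parse_usd_config(EXAMPLE)
print(cfg["frequency_hz"], cfg["buffer_size"], len(cfg["variables"]))  # 10 50 8
```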

Anyone else experiencing this?

@tobbeanton
Member

We probably didn't manage to test all the cases, but possibly the sd-card could be too slow. Do you have any other card you could test with? When I tested I logged e.g. this:

1000   # frequency
100    # buffer size
log   # file name
1     # enable on startup (0/1)
1     # mode (0: disabled, 1: synchronous stabilizer, 2: asynchronous)
tdoa2.d7-0
tdoa2.d0-1
tdoa2.d1-2
tdoa2.d2-3
tdoa2.d3-4
tdoa2.d4-5
tdoa2.d5-6
tdoa2.d6-7
tdoa2.cc0
tdoa2.cc1
tdoa2.cc2
tdoa2.cc3
tdoa2.cc4
tdoa2.cc5
tdoa2.cc6
tdoa2.cc7
tdoa2.dist7-0
tdoa2.dist0-1
tdoa2.dist1-2
tdoa2.dist2-3
tdoa2.dist3-4
tdoa2.dist4-5
tdoa2.dist5-6
tdoa2.dist6-7

There are some new logging parameters to see the SPI access and write rate:
[image: new logging parameters for SPI access and write rate]

I used a SanDisk Ultra 16GB card.
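
To get a feel for the load such a configuration puts on the card, the sketch below estimates the raw data rate of the 24-variable, 1000 Hz example above. The entry layout is an assumption, not taken from the firmware: 4 bytes per variable plus a 4-byte timestamp per entry. Actual sizes depend on the variable types and the log file format.

```python
def log_data_rate(n_variables, frequency_hz,
                  bytes_per_variable=4, timestamp_bytes=4):
    """Return (bytes per entry, bytes per second) under the assumed layout."""
    entry_bytes = timestamp_bytes + n_variables * bytes_per_variable
    return entry_bytes, entry_bytes * frequency_hz


entry_bytes, bytes_per_s = log_data_rate(24, 1000)
print(entry_bytes, bytes_per_s)  # 100 100000
```

Under these assumptions the card has to sustain roughly 100 kB/s on top of whatever bus time the other decks need.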

@matejkarasek
Contributor

Switching to a different uSD card type indeed solved the problem (CF2.1, uSD & LPS with TDoA v3), thanks for the tip Tobias!

  • 16GB SanDisk Ultra (class 10, A1) seems to work fine
  • 2GB Transcend (ts2gusd) apparently too slow

Still some glitches once in a while, but no more crashes:

Sample rate vs. time (should be 250 Hz):
[image: frequency]
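
Glitches like these can be located in a log by looking at the gaps between consecutive timestamps. The sketch below assumes the timestamps have already been extracted from the log file as a list of milliseconds; the sample data is synthetic and the tolerance is an arbitrary choice.

```python
def find_rate_glitches(timestamps_ms, expected_hz, tolerance=0.5):
    """Return (index, gap_ms) for every gap longer than expected by `tolerance`."""
    expected_dt = 1000.0 / expected_hz
    glitches = []
    for i in range(1, len(timestamps_ms)):
        dt = timestamps_ms[i] - timestamps_ms[i - 1]
        if dt > expected_dt * (1 + tolerance):
            glitches.append((i, dt))
    return glitches


# Synthetic 250 Hz timestamps (4 ms apart) with one 28 ms gap.
ts = [0, 4, 8, 12, 40, 44, 48]
glitches = find_rate_glitches(ts, 250)
print(glitches)  # [(4, 28)]
```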

@tobbeanton
Member

Glad it worked with another sd-card!

It would be nice to get rid of the glitches, and it might be possible by increasing the log buffer a lot. The problem is that the sd-card blocks from time to time while it writes data down to flash.
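
As a rough guide to how large "a lot" needs to be: the buffer has to hold every entry produced while the card is blocked. The sketch below does that arithmetic with a safety margin. The 100 ms stall is a hypothetical figure, not a measurement from this thread, and whether the buffer size in config.txt is counted in entries or bytes is not stated here.

```python
import math


def min_buffer_entries(frequency_hz, stall_ms, margin=2.0):
    """Entries produced during a card stall, times a safety margin."""
    return math.ceil(frequency_hz * stall_ms / 1000.0 * margin)


needed = min_buffer_entries(250, 100)  # 250 Hz logging, assumed 100 ms stall
print(needed)  # 50
```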
