I'm currently debugging a nasty freeze on a CM3+ with a 4.19 kernel using a Yocto build based on meta-raspberrypi. The symptoms are as follows:
Everything from the UI to all other services on the device locks up.
No new SSH connections are accepted (but previously existing ones, most of the time, persist)
Attached USB keyboards do not get recognized (no num lock, but SysRq keys over serial still work!)
This usually happens within the first few minutes of the device running (most I/O happens there).
Sustained I/O over a long period seems to trigger the bug more often (I don't have clear data on this yet), and disabling a number of our applications makes the freeze less likely.
MemFree & MemAvailable seem to rise at the moment of the freeze, so this is not a typical OOM situation.
When it happens, it tends to happen in batches and then stays away for a few days.
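To get clearer data on the MemFree/MemAvailable observation, the two values could be sampled periodically and compared around the time of a freeze. The following is a minimal sketch, assuming a POSIX shell with awk. The function name `meminfo_snapshot` and its optional file argument are hypothetical and not part of our system.

```shell
# Print MemFree and MemAvailable (in kB) from a meminfo-style file.
# The optional first argument is a hypothetical override used for testing;
# it defaults to /proc/meminfo.
meminfo_snapshot() {
    awk '/^(MemFree|MemAvailable):/ {
             # $1 is e.g. "MemFree:", so strip the trailing colon.
             printf "%s%s=%s", sep, substr($1, 1, length($1) - 1), $2
             sep = " "
         }
         END { print "" }' "${1:-/proc/meminfo}"
}
```

It could be run in a loop such as `while sleep 5; do echo "$(date +%s) $(meminfo_snapshot)"; done`. Sending the output to the serial console rather than to a file on the eMMC is probably the safer choice, since storage I/O is what appears to hang.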
After quite some time, I found a workaround of sorts. Applying a device-tree overlay that sets non-removable and brcm,force-pio on the SD host seems to make the freeze far less likely, or to prevent it entirely (I'm not sure yet which of the two applies).
The non-removable part does not seem to be necessary. I added it because I often saw the kernel getting stuck in mmc_rescan, and since the CM3+ has an eMMC, the rescan only needs to run once. Freezes also happen with only non-removable set, so the critical part seems to be brcm,force-pio, which forces the bcm2835-sdhost driver to use the slower PIO instead of DMA.
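For readers who want to try the same workaround, an overlay with these two properties might look roughly like the sketch below. This is a reconstruction from the description above, not the exact overlay file used here. The `&sdhost` target label, the `compatible` string, and the fragment layout are assumptions based on how Raspberry Pi overlays are commonly written.

```dts
/dts-v1/;
/plugin/;

/ {
	/* Assumed; matches the convention used by Raspberry Pi overlays. */
	compatible = "brcm,bcm2835";

	fragment@0 {
		/* Assumed label of the SD host controller node. */
		target = <&sdhost>;
		__overlay__ {
			/* eMMC is soldered on, so no repeated card detection. */
			non-removable;
			/* Disable DMA in the sdhost driver and fall back to PIO. */
			brcm,force-pio;
		};
	};
};
```

Such a file would typically be compiled with something like `dtc -@ -I dts -O dtb -o sdhost-pio.dtbo sdhost-pio.dts` (the file name is hypothetical) and then loaded as an overlay.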
Steps to reproduce the behaviour
Let our proprietary (sorry) system run for some time on a regular I/O board with a good power supply.
Usually you get a freeze within 24 hours.
I have not found a reliable way to trigger the problem. Running the system under stress-ng alone does not seem to provoke it.
Device(s)
Raspberry Pi CM3+
System
$ uname -a
Linux hostname 4.19.126-v7 #1 SMP Fri Sep 30 08:06:14 UTC 2022 armv7l GNU/Linux
$ vcgencmd version
Aug 26 2022 14:04:36
Copyright (c) 2012 Broadcom
version 102f1e848393c2112206fadffaaf86db04e98326 (clean) (release) (start)
Logs
This log (dmesg-prior-to-fix.log) is from one of the devices that exhibited the issue. The actual frozen tasks differ a bit from time to time, but the common one is always mmc_rescan.
With a device-tree overlay that only sets "non-removable", the log (freeze-with-non-removable-but-without-force-pio.log) becomes a bit more interesting: there are a lot of tasks, and the memory info at the top seems to indicate that there are enough free pages. A bit of swap was in use, and the applications/tasks using it seem to be completely stuck too. Another bunch of tasks is stuck in rpi_firmware_transaction, which is interesting since they should not be blocked by MMC.
Note: The latter log was obtained from a serial connection with SysRq keys.
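When a previously opened SSH session survives the freeze, the same kind of blocked-task dump can also be requested without a serial console by writing `w` to /proc/sysrq-trigger. The sketch below assumes root privileges and a kernel built with CONFIG_MAGIC_SYSRQ. The function name `dump_blocked_tasks` and its optional path argument are hypothetical.

```shell
# Ask the kernel to dump all blocked (uninterruptible) tasks to the kernel log.
# Needs root and CONFIG_MAGIC_SYSRQ. The optional argument is a hypothetical
# override of the trigger path, so the function can be dry-run against a file.
dump_blocked_tasks() {
    printf 'w' > "${1:-/proc/sysrq-trigger}"
}
```

Calling `dump_blocked_tasks` and then reading the end of `dmesg` should show the stack traces of the stuck tasks, provided the shell itself is not blocked on storage I/O.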
Additional context
This issue was first reported at the meta-raspberrypi repository, where it was suggested to report it here. The real question is: why does setting force-pio help? Is it an actual workaround for a bug, or is it just making the freeze less likely due to slower I/O?
An update on this: the freeze apparently went away after upgrading the kernel to 5.15.x. No freezes so far. Since this is probably the only proper fix, I consider this closed. If we still get freezes, I will come back and create another issue.