[4.19.126] Sporadic freeze related to mmc #5190

Closed
sahib opened this issue Sep 30, 2022 · 2 comments

sahib commented Sep 30, 2022

Describe the bug

Hello,

I'm currently debugging a nasty freeze on a CM3+ with a 4.19 kernel using a Yocto build based on meta-raspberrypi. The symptoms are as follows:

  • Everything on the device locks up, from the UI to all other services.
  • No new SSH connections are accepted (but previously existing ones persist most of the time).
  • Attached USB keyboards are not recognized (no Num Lock, but SysRq keys over serial still work!).
  • This usually happens within the first few minutes of the device running (most I/O happens there).
  • More I/O over a long time range seems to trigger the bug more often (I don't have clear data on this yet),
    and disabling a good number of our applications makes the freeze less likely.
  • MemFree & MemAvailable seem to rise at the moment of the freeze, so this is not a typical OOM situation.
  • If it happens, it often happens in batches and then stays away for a few days.
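
The MemFree/MemAvailable observation could be captured with a small logger that samples /proc/meminfo and prints to a console that survives the freeze (for example the serial line). This is only a sketch: the function names, the sampling interval and the output format are illustrative assumptions, not part of the original setup.

```python
import time


def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def log_memory(path="/proc/meminfo", interval_s=1.0, samples=5):
    """Print a timestamp plus MemFree and MemAvailable, once per interval."""
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        # flush so the line reaches the console even if the system stalls next
        print(time.strftime("%H:%M:%S"),
              "MemFree:", info.get("MemFree"),
              "MemAvailable:", info.get("MemAvailable"),
              flush=True)
        time.sleep(interval_s)
```

Calling `log_memory(samples=3600)` from a session that is already open before the freeze would leave a record of how both values move in the seconds before the lockup.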

After quite some time, I found a workaround of sorts. Applying this device-tree overlay seems to make the freeze far less likely, or to eliminate it entirely (I'm not sure yet which of the two applies):

/dts-v1/;
/plugin/;

/ {
	compatible = "brcm,bcm2708";
	fragment@0 {
		target = <&sdhost>;
		__overlay__ {
			non-removable;
			brcm,force-pio;
		};
	};
};

The non-removable part does not seem to be necessary. I added it because I often saw the kernel getting stuck in mmc_rescan, and since the CM3+ has an eMMC, the rescan only needs to run once. Freezes also happen with only non-removable set, so the critical part seems to be brcm,force-pio, which effectively forces the bcm2835-sdhost driver to use the slower PIO instead of DMA.
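
To confirm that the overlay was actually applied on a running device, one option is to look for the property in the live device tree, where boolean properties such as brcm,force-pio show up as empty files inside the node's directory. A minimal sketch, assuming the live tree is exposed at /proc/device-tree; the helper name is hypothetical:

```python
import os


def nodes_with_property(root, prop):
    """Return device-tree node directories under root that contain a property named prop."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if prop in filenames:
            matches.append(dirpath)
    return sorted(matches)
```

On the target, `nodes_with_property("/proc/device-tree", "brcm,force-pio")` should return the sdhost node if the overlay was merged, and an empty list otherwise.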

Steps to reproduce the behaviour

  • Let our proprietary (sorry) system run for some time on a regular I/O board with a good power supply.
  • Usually you get a freeze within 24 hours.
  • I have not found a reliable way to trigger the problem. Letting the system run under stress-ng seems fine.
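
Since there is no reliable trigger, one way to at least timestamp the onset of a stall is a heartbeat that times a small synchronous write to the eMMC. The following is a sketch under assumptions: the function name, the payload size and the idea of printing the latency are illustrative, and this is not a reproducer for the freeze.

```python
import os
import time


def timed_sync_write(path, payload=b"x" * 4096):
    """Write payload to path, fsync it, and return the elapsed time in seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # block until the data has reached the storage device
    finally:
        os.close(fd)
    return time.monotonic() - start
```

Calling this in a loop on a file that lives on the eMMC and printing each result to the serial console would show whether write latency climbs before the lockup or whether the last call simply never returns.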

Device(s)

Raspberry Pi CM3+

System

$ uname -a
Linux hostname 4.19.126-v7 #1 SMP Fri Sep 30 08:06:14 UTC 2022 armv7l GNU/Linux
$ vcgencmd version
Aug 26 2022 14:04:36
Copyright (c) 2012 Broadcom
version 102f1e848393c2112206fadffaaf86db04e98326 (clean) (release) (start)
$ cat /proc/cmdline
8250.nr_uarts=1 bcm2708_fb.fbwidth=480 bcm2708_fb.fbheight=800 bcm2708_fb.fbswap=1 dwc_otg.lpm_enable=0 
usbhid.mousepoll=0 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 cma=512M@128M coherent_pool=6M 
fbcon=vc:2-4 logo.nologo quiet video=HDMI-A-1:480x800MR-24@60 ostree=/ostree/boot.1
/poky/2fb5f4d11ab0a08ad01ad2b44fee499ee3d129c276dd0f41184fdc79850e413e/0  ostree_root=/dev/mmcblk0p2 
root=/dev/ram0 rw rootwait rootdelay=2 ramdisk_size=8192 panic=1
$ cat config.txt
disable_overscan=1
gpu_mem=128
boot_delay=0
boot_delay_ms=0
disable_splash=1
dispmanx_offline=1
dtparam=i2c1=on
dtparam=i2c_arm=on
enable_uart=1
dtoverlay=vc4-kms-v3d
hdmi_edid_file=1
avoid_warnings=2
mask_gpu_interrupt0=0x400
dtparam=audio=on

Logs

This log is from one of the devices that exhibited the issue. The frozen tasks differ a bit from time to time, but the common one is always mmc_rescan.

dmesg-prior-to-fix.log

With a device-tree overlay that sets only non-removable, the log becomes a bit more interesting: many tasks are listed, and the memory info at the top seems to indicate that there are enough free pages. A bit of swap was in use, and the applications/tasks using it seem to be completely stuck too. Another bunch of tasks is stuck in rpi_firmware_transaction, which is interesting since they should not be blocked by MMC.

freeze-with-non-removable-but-without-force-pio.log

Note: The latter log was obtained from a serial connection with SysRq keys.
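
While the system is still responsive (for example over an SSH session that survived), a similar view of blocked tasks can be polled from /proc by looking for tasks in uninterruptible sleep (state D) and reading where they wait. This is a rough sketch with hypothetical helper names; it only scans thread-group leaders under /proc, not individual threads, and it is no substitute for the SysRq dump once userspace is fully stuck.

```python
import os


def parse_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, comm, state)."""
    open_idx = text.index("(")
    close_idx = text.rindex(")")  # comm itself may contain spaces or parentheses
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Return (pid, comm, wchan) for every task currently in state D."""
    result = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
            if state != "D":
                continue
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            continue  # task exited in the meantime or the file is unreadable
        result.append((pid, comm, wchan))
    return sorted(result)
```

Tasks whose wait channel points into the MMC code or into rpi_firmware_transaction would be the ones to compare against the attached logs.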

Additional context

This issue was first reported at the meta-raspberrypi repository, where it was suggested to report it here. The real question is: why does setting force-pio help? Is it an actual workaround for a bug, or does it just make the freeze less likely because I/O is slower?

sahib commented Jul 5, 2023

An update on this: the problem apparently went away after upgrading the kernel to 5.15.x. No freezes so far. Since this is probably the only proper fix, I consider this closed. If we still get freezes, I will come back and create another issue.

sahib closed this as completed Jul 5, 2023

sresam89 commented Mar 7, 2024

@sahib can you review this post please? raspberrypi/firmware#1522 (comment)
