Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A lot of random errors from brcmfmac #1289

Closed
tomty89 opened this issue Nov 21, 2019 · 18 comments
Closed

A lot of random errors from brcmfmac #1289

tomty89 opened this issue Nov 21, 2019 · 18 comments

Comments

@tomty89
Copy link

tomty89 commented Nov 21, 2019

Describe the bug
Samples:

Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: count never zeroed: last 0xffff
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: unexpected NAKHANDLED!
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
--- repeat many times ---
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: mailbox indicates firmware halted
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: Version mismatch, dongle reports 255, expecting 4
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: Unknown mailbox data content: 0xffffffff
Nov 21 03:56:52 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:56:Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: count never zeroed: last 0xffff
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: unexpected NAKHANDLED!
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
52 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:56:55 alarmpi kernel: brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Nov 21 03:56:55 alaNov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110
Nov 21 03:51:21 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_rxfail: count never zeroed: last 0xffff
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_hostmail: unexpected NAKHANDLED!
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: Failed access turning clock off: -110
Nov 21 03:51:23 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
rmpi kernel: brcmfmac: brcmf_sdio_isr: failed backplane access
Nov 21 03:56:57 alarmpi kernel: brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
Nov 21 03:56:57 alarmpi kernel: brcmfmac: brcmf_set_mpc: fail to set mpc
Nov 21 03:56:57 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:57:00 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:02 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:05 alarmpi kernel: brcmfmac: brcmf_configure_wpaie: wme_bss_disable error -110
Nov 21 03:57:05 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:57:07 alarmpi kernel: brcmfmac: brcmf_set_mpc: fail to set mpc
Nov 21 03:57:07 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:57:10 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:12 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:15 alarmpi kernel: brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
Nov 21 03:57:15 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:57:18 alarmpi kernel: brcmfmac: brcmf_cfg80211_del_station: SCB_DEAUTHENTICATE_FOR_REASON failed -110
Nov 21 03:57:43 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:45 alarmpi kernel: brcmfmac: brcmf_cfg80211_change_iface: WLC_SET_INFRA error (-110)
Nov 21 03:57:45 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:48 alarmpi kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Nov 21 03:57:48 alarmpi kernel: brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
Nov 21 03:57:48 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:50 alarmpi kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Nov 21 03:57:50 alarmpi kernel: brcmfmac: brcmf_cfg80211_get_tx_power: error (-110)
Nov 21 03:57:50 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:53 alarmpi kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Nov 21 03:57:53 alarmpi kernel: brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
Nov 21 03:57:53 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:55 alarmpi kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110
Nov 21 03:57:55 alarmpi kernel: brcmfmac: brcmf_cfg80211_get_tx_power: error (-110)
Nov 21 03:57:55 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:57:58 alarmpi kernel: brcmfmac: _brcmf_set_multicast_list: Setting mcast_list failed, -110
Nov 21 03:57:58 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:58:00 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:58:03 alarmpi kernel: brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
Nov 21 03:58:03 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:58:06 alarmpi kernel: brcmfmac: brcmf_cfg80211_change_iface: WLC_SET_INFRA error (-110)
Nov 21 03:58:06 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -5
Nov 21 03:58:08 alarmpi kernel: brcmfmac: _brcmf_set_multicast_list: Setting BRCMF_C_SET_PROMISC failed, -110
Nov 21 03:58:08 alarmpi kernel: brcmfmac: brcmf_sdio_htclk: HT Avail request error: -110
Nov 21 03:58:11 alarmpi kernel: brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed w/status -110

To reproduce
Once I update the bootloader to 6e4a723, there are errors flooding (in an occasion the wlan interface won't even show up). Downgrading to b9f83b1 eliminates them. I am not sure if it has anything to do with the kernel update, but downgrading to 4.19.83 instead doesn't help.

System
OS is Arch Linux ARM
Device is Raspberry Pi 4 4GB
uname -a:

Linux alarmpi 4.19.84-1-ARCH #1 SMP PREEMPT Thu Nov 21 00:52:06 UTC 2019 armv7l GNU/Linux
@tomty89
Copy link
Author

tomty89 commented Nov 21, 2019

The WLAN interface is being used as an AP by the way. Here's the diff of the shipped configuration of hostapd and the one I am using:

[root@alarmpi ~]# diff /etc/hostapd/hostapd.conf{.orig,}
88c88
< ssid=test
---
> ssid=redacted
155c155
< hw_mode=g
---
> hw_mode=a
165c165
< channel=1
---
> channel=153
560c560
< #ieee80211n=1
---
> ieee80211n=1
595c595
< #ht_capab=[HT40-][SHORT-GI-20][SHORT-GI-40]
---
> ht_capab=[HT40-][SHORT-GI-20][SHORT-GI-40]
598c598
< #require_ht=1
---
> require_ht=0
1467c1467
< #wpa=2
---
> wpa=2
1476c1476
< #wpa_passphrase=secret passphrase
---
> wpa_passphrase=redacted
1531c1531
< #wpa_pairwise=TKIP CCMP
---
> wpa_pairwise=CCMP

@pelwell
Copy link
Contributor

pelwell commented Nov 21, 2019

Can you confirm that, starting with a working system using the kernel and firmware from b9f83b1, only updating start4.elf and fixup4.dat from 6e4a723 causes the errors you've reported? Have you tried the intermediate commit - 68ec481?

@tomty89
Copy link
Author

tomty89 commented Nov 21, 2019

So I downgraded to 4.19.83 again, have start4.elf and fixup4.dat from 68ec481 as the only .elf and .dat in /boot. Similar issue occurs, and upgrading to 4.19.84 doesn't eliminate it, but downgrading start4.elf and fixup4.dat to those from b9f83b1 does (they work for both 4.19.83 and 4.19.84).

For the record the errors that shows up with .83 and .84 seem to differ. Also the interface seems to have a higher chance of disappearing with .83 (I don't exactly remember if it is ever gone with .84 / always gone with .83).

@pelwell
Copy link
Contributor

pelwell commented Nov 21, 2019

Thanks - that narrows it down a bit. Unfortunately there are a large number of internal commits in that one revision, and I've not seen the error yet.

  1. How long do you typically have to wait after booting to see a problem?
  2. Assuming the errors aren't immediate, can you try keeping the system moderately busy with a background thread or two running while true; do true; done &?

@pelwell
Copy link
Contributor

pelwell commented Nov 21, 2019

  1. If the Pi is busy, try keeping it idle instead.

@timg236
Copy link

timg236 commented Nov 21, 2019

It would be useful to know if the errors only occur during throttled so running
vcgencmd get_throttled and vcgencmd measure_temp and vcgencmd measure_clock emmc

Another thing to try is setting this on config.txt
core_freq=500
core_freq_min=500

The HDMI display resolution will also alter the voltage and clock requirements so it would be useful to know that. If you are running 4Kp60 try 4kp30

@tomty89
Copy link
Author

tomty89 commented Nov 21, 2019

Right after boot. And although the errors differ a bit every boot, so far they always show up. I have systemd-networkd brings up the wlan interface on boot (hostapd is started on boot as well). Was going to see if the errors show up if it is not brought up but a miss in file copying stops the Pi from booting now, so I can't test further (maybe until next month).

For the record, the linux-raspberrypi4 package in Arch Linux ARM pulls in some extra brcmfmac firmware of unknown origin. Not sure if it is relevant (was going to delete them and see if it helps, but again I can't boot it anymore, so)

P.S. It's a headless server. Neither of the HDMI ports was being used.

@pelwell
Copy link
Contributor

pelwell commented Nov 21, 2019

systemd has been described as "aggressively parallel", starting all the things as concurrently as possible. Current draw is at its maximum during boot, and the WiFi chip has been seen to suffer when the power supply is marginal, so that's something else to bear in mind along with @timg236's suggestions.

@tomty89
Copy link
Author

tomty89 commented Nov 21, 2019

Well, I will test with another power supply (even though the current one should be 5V/3A), and see if unplugging the usb 3.0 thumb drive helps (it was being used for the root filesystem), when I can.

@timg236
Copy link

timg236 commented Nov 22, 2019

I was able to reproduce the issue and resolve it by setting over_voltage=1. If that fails try over_voltage=2

Please could you try just that change in your original config and report the values for 'vcgencmd measure voltage' with and without the config change.

N.B. The voltage varies with clock speed and higher resolutions require higher clock speeds so it's important to be consistent there i.e. if my theory is correct then a higher resolution might appear to make WiFi work as a side effect of increasing clocks / voltage.

@timg236
Copy link

timg236 commented Nov 27, 2019

@tomty89 Please could you try the over_voltage=6240 setting to see if that resolves the WiFi issue?

@tomty89
Copy link
Author

tomty89 commented Nov 28, 2019

While I couldn't test it yet anyway, why 6240?

From https://www.raspberrypi.org/documentation/configuration/config-txt/overclocking.md:

Values above 6 are only allowed when force_turbo is specified

Also:

[-16,8] equates to [0.8V,1.4V] with 0.025V steps.

Doesn't that mean 8 is the largest value that makes sense to the option?

So should I set force_turbo as well? And what does this drastically large value do anyway?

@timg236
Copy link

timg236 commented Nov 28, 2019

Sorry that should be 6250. It’s possibly board specific but large numbers translate to microvolts and that’s the smallest step that makes sense on your board.
If you are not comfortable with that then testing with over_voltage=1 would also be useful

@pelwell
Copy link
Contributor

pelwell commented Nov 28, 2019

FYI values over 500,000 (used to be 1,000,000) are treated as absolute values in microvolts, e.g. 880000 is 0.88V.

@JamesH65
Copy link
Contributor

I had this error a couple of days ago, completely filled up the SD card with syslog/kernel errors. If it reoccurs will take a closer look. Looks like the error report, if nothing else, needs to be rate limited.

popcornmix added a commit that referenced this issue Nov 29, 2019
kernel: configs:2711: Add V3D clock support on arm64
kernel: config: arm64: Switch to upstream cpufreq driver

kernel: arm/dts: 2711: Add 'pcie0' alias

kernel: drm/v3d: Set dma_mask as well as coherent_dma_mask

firmware: platform: Allow fixed voltage with avs_disable=1

firmware: EMMC: Use PLLD for EMMC for 250MHz host-clock
See: #1289

firmware: platform: Round down effective frequencies when they exceed max
See: #1290

firmware: arm_loader: Pass video mode via kernel command for composite
See: #1285

firmware: Fix lens shading table generation buglet
See: https://www.raspberrypi.org/forums/viewtopic.php?f=43&t=190586&start=75#p1534672

firmware: hdmi: Use RB2 timing for 2560x1440@60 if pixel clock is 241.5 MHz

firmware: arm_dt: Look for ethernet0 before ethernet

firmware: arm_dt: Set PCIe dma-ranges from memory size
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Nov 29, 2019
kernel: configs:2711: Add V3D clock support on arm64
kernel: config: arm64: Switch to upstream cpufreq driver

kernel: arm/dts: 2711: Add 'pcie0' alias

kernel: drm/v3d: Set dma_mask as well as coherent_dma_mask

firmware: platform: Allow fixed voltage with avs_disable=1

firmware: EMMC: Use PLLD for EMMC for 250MHz host-clock
See: raspberrypi/firmware#1289

firmware: platform: Round down effective frequencies when they exceed max
See: raspberrypi/firmware#1290

firmware: arm_loader: Pass video mode via kernel command for composite
See: raspberrypi/firmware#1285

firmware: Fix lens shading table generation buglet
See: https://www.raspberrypi.org/forums/viewtopic.php?f=43&t=190586&start=75#p1534672

firmware: hdmi: Use RB2 timing for 2560x1440@60 if pixel clock is 241.5 MHz

firmware: arm_dt: Look for ethernet0 before ethernet

firmware: arm_dt: Set PCIe dma-ranges from memory size
@popcornmix
Copy link
Contributor

Can you try with latest rpi-update firmware?

@tomty89
Copy link
Author

tomty89 commented Dec 16, 2019

601d36d fixed it and no more error with current master either. Thanks!

@tomty89 tomty89 closed this as completed Dec 16, 2019
@timg236
Copy link

timg236 commented Dec 16, 2019

Great, thanks for confirming the fix :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants