Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USB disconnects and reconnects cause kernel panics #1272

Closed
jkuek opened this issue Jan 28, 2016 · 45 comments
Closed

USB disconnects and reconnects cause kernel panics #1272

jkuek opened this issue Jan 28, 2016 · 45 comments

Comments

@jkuek
Copy link

jkuek commented Jan 28, 2016

I'm experiencing an infrequent crash, which has a number of reported errors:

Other common messages seen are
"Unable to handle kernel paging request at virtual address 0x......"

In normal use, this seems to occur because one or USB devices have spontaneously disconnected and reconnected. The root cause of this is still unknown (but outside the scope of this issue)

However I am able to make the OS crash easily by running the following script to force USB devices to reconnect:

echo 0 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower
echo 1 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower

I have a second script which runs this in a loop (with a "sleep 2" in between each cycle), and this results in a kernel panic after a few minutes.

I'm not running any other applications at the time, and this is at the moment just a fresh install of Raspbian Jessie Lite.

I'm using a RPi B+ , with the following USB connections:
Port 2 - keyboard
Port 4 - Sierra Wireless 3g modem
Port 5 - Audio card

I've attached the output of lsusb -v, lsusb.txt

I initially thought it might be related to one of the USB devices, but I tried unplugging the modem and audio card but the problem still occurred. Interestingly, I have never seen the crash if I don't have any USB devices connected.

I originally noticed the problem on Raspbian Jessie Lite, kernel 4.1.13+. I also tried running rpi-update to update the kernel to 4.1.16+, but the problem persisted.

I'm not sure how to get the output of the kernel dumps, after the problem occurs, I reset the unit and there's no kernel info in the logs. Some photos are here: wp_20160128_16_27_29_pro
wp_20160128_17_19_45_pro

What other information can I provide to help resolve the issue?

@jkuek
Copy link
Author

jkuek commented Jan 28, 2016

I am able to crash Raspbian Wheezy 3.18.11+ with the same script, although the trace is different

@jkuek
Copy link
Author

jkuek commented Feb 2, 2016

I updated to the 4.4.0 development kernel and the problem still occurs. It's worth noting that having multiple USB devices plugged in seems to make it easier to reproduce (i.e. less cycles of USB reset needed to crash), but it can occur with just a keyboard plugged in.

@jkuek
Copy link
Author

jkuek commented Feb 3, 2016

Here's the syslog output after one run of my script:

echo 0 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower
echo 1 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower

on kernel 4.4.0+, same USB configuration as listed in the original post. This time it shows a warning, but this doesn't always happen, If I leave the script running in a loop, then Linux eventually dies with one of the other error messages listed earlier.

Feb  3 02:57:17 raspberrypi kernel: [  129.093029] ERROR::dwc_otg_hcd_urb_enqueue:505: Not connected
Feb  3 02:57:17 raspberrypi kernel: [  129.093029] 
Feb  3 02:57:17 raspberrypi avahi-daemon[347]: Interface eth0.IPv6 no longer relevant for mDNS.
Feb  3 02:57:17 raspberrypi avahi-daemon[347]: Leaving mDNS multicast group on interface eth0.IPv6 with address fe80::f222:ec88:3118:6fa.
Feb  3 02:57:17 raspberrypi kernel: [  129.292945] usb 1-1: USB disconnect, device number 2
Feb  3 02:57:17 raspberrypi kernel: [  129.292986] usb 1-1.1: USB disconnect, device number 3
Feb  3 02:57:17 raspberrypi kernel: [  129.293262] smsc95xx 1-1.1:1.0 eth0: unregister 'smsc95xx' usb-20980000.usb-1.1, smsc95xx USB 2.0 Ethernet
Feb  3 02:57:17 raspberrypi kernel: [  129.293356] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Feb  3 02:57:18 raspberrypi kernel: [  129.313049] ------------[ cut here ]------------
Feb  3 02:57:18 raspberrypi kernel: [  129.313131] WARNING: CPU: 0 PID: 15 at include/asm-generic/dma-mapping-common.h:274 __DWC_DMA_FREE+0xac/0xc8()
Feb  3 02:57:18 raspberrypi kernel: [  129.313148] Modules linked in: cfg80211 rfkill evdev snd_usb_audio joydev snd_usbmidi_lib snd_hwdep sierra_net snd_rawmidi snd_seq_device sierra usbserial bcm2835_gpiomem snd_bcm2835 bcm2835_wdt snd_pcm snd_timer snd uio_pdrv_genirq uio
Feb  3 02:57:18 raspberrypi kernel: [  129.313255] CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 4.4.0+ #834
Feb  3 02:57:18 raspberrypi kernel: [  129.313269] Hardware name: BCM2708
Feb  3 02:57:18 raspberrypi kernel: [  129.313307] Workqueue: usb_hub_wq hub_event
Feb  3 02:57:18 raspberrypi kernel: [  129.313377] [<c00169b8>] (unwind_backtrace) from [<c0013944>] (show_stack+0x20/0x24)
Feb  3 02:57:18 raspberrypi kernel: [  129.313412] [<c0013944>] (show_stack) from [<c02df2f4>] (dump_stack+0x20/0x28)
Feb  3 02:57:18 raspberrypi kernel: [  129.313447] [<c02df2f4>] (dump_stack) from [<c0021a28>] (warn_slowpath_common+0x8c/0xc4)
Feb  3 02:57:18 raspberrypi kernel: [  129.313479] [<c0021a28>] (warn_slowpath_common) from [<c0021b1c>] (warn_slowpath_null+0x2c/0x34)
Feb  3 02:57:18 raspberrypi kernel: [  129.313515] [<c0021b1c>] (warn_slowpath_null) from [<c03f8368>] (__DWC_DMA_FREE+0xac/0xc8)
Feb  3 02:57:18 raspberrypi kernel: [  129.313549] [<c03f8368>] (__DWC_DMA_FREE) from [<c03f1e20>] (dwc_otg_hcd_qh_free+0xbc/0xec)
Feb  3 02:57:18 raspberrypi kernel: [  129.313579] [<c03f1e20>] (dwc_otg_hcd_qh_free) from [<c03ea6bc>] (dwc_otg_hcd_endpoint_disable+0xac/0xc0)
Feb  3 02:57:18 raspberrypi kernel: [  129.313609] [<c03ea6bc>] (dwc_otg_hcd_endpoint_disable) from [<c03edca8>] (endpoint_disable+0x28/0x34)
Feb  3 02:57:18 raspberrypi kernel: [  129.313649] [<c03edca8>] (endpoint_disable) from [<c03c4148>] (usb_hcd_disable_endpoint+0x3c/0x40)
Feb  3 02:57:18 raspberrypi kernel: [  129.313687] [<c03c4148>] (usb_hcd_disable_endpoint) from [<c03c66c8>] (usb_disable_endpoint+0x70/0x8c)
Feb  3 02:57:18 raspberrypi kernel: [  129.313720] [<c03c66c8>] (usb_disable_endpoint) from [<c03c6904>] (usb_disable_device+0x1bc/0x1d8)
Feb  3 02:57:18 raspberrypi kernel: [  129.313752] [<c03c6904>] (usb_disable_device) from [<c03bec10>] (usb_disconnect+0x78/0x208)
Feb  3 02:57:18 raspberrypi kernel: [  129.313781] [<c03bec10>] (usb_disconnect) from [<c03bed44>] (usb_disconnect+0x1ac/0x208)
Feb  3 02:57:18 raspberrypi kernel: [  129.313809] [<c03bed44>] (usb_disconnect) from [<c03bf85c>] (hub_event+0x300/0xe4c)
Feb  3 02:57:18 raspberrypi kernel: [  129.313839] [<c03bf85c>] (hub_event) from [<c0036b84>] (process_one_work+0x110/0x39c)
Feb  3 02:57:18 raspberrypi kernel: [  129.313866] [<c0036b84>] (process_one_work) from [<c0036f6c>] (worker_thread+0x15c/0x4e4)
Feb  3 02:57:18 raspberrypi kernel: [  129.313901] [<c0036f6c>] (worker_thread) from [<c003c6e8>] (kthread+0xe0/0xfc)
Feb  3 02:57:18 raspberrypi kernel: [  129.313933] [<c003c6e8>] (kthread) from [<c000f608>] (ret_from_fork+0x14/0x2c)
Feb  3 02:57:18 raspberrypi kernel: [  129.313950] ---[ end trace 38aa7ae261971bca ]---
Feb  3 02:57:18 raspberrypi kernel: [  129.313979] ------------[ cut here ]------------
Feb  3 02:57:18 raspberrypi kernel: [  129.314012] WARNING: CPU: 0 PID: 15 at include/asm-generic/dma-mapping-common.h:274 __DWC_DMA_FREE+0xac/0xc8()
Feb  3 02:57:18 raspberrypi kernel: [  129.314025] Modules linked in: cfg80211 rfkill evdev snd_usb_audio joydev snd_usbmidi_lib snd_hwdep sierra_net snd_rawmidi snd_seq_device sierra usbserial bcm2835_gpiomem snd_bcm2835 bcm2835_wdt snd_pcm snd_timer snd uio_pdrv_genirq uio
Feb  3 02:57:18 raspberrypi kernel: [  129.314112] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G        W       4.4.0+ #834
Feb  3 02:57:18 raspberrypi kernel: [  129.314125] Hardware name: BCM2708
Feb  3 02:57:18 raspberrypi kernel: [  129.314147] Workqueue: usb_hub_wq hub_event
Feb  3 02:57:18 raspberrypi kernel: [  129.314187] [<c00169b8>] (unwind_backtrace) from [<c0013944>] (show_stack+0x20/0x24)
Feb  3 02:57:18 raspberrypi kernel: [  129.314215] [<c0013944>] (show_stack) from [<c02df2f4>] (dump_stack+0x20/0x28)
Feb  3 02:57:18 raspberrypi kernel: [  129.314243] [<c02df2f4>] (dump_stack) from [<c0021a28>] (warn_slowpath_common+0x8c/0xc4)
Feb  3 02:57:18 raspberrypi kernel: [  129.314271] [<c0021a28>] (warn_slowpath_common) from [<c0021b1c>] (warn_slowpath_null+0x2c/0x34)
Feb  3 02:57:18 raspberrypi kernel: [  129.314303] [<c0021b1c>] (warn_slowpath_null) from [<c03f8368>] (__DWC_DMA_FREE+0xac/0xc8)
Feb  3 02:57:18 raspberrypi kernel: [  129.314334] [<c03f8368>] (__DWC_DMA_FREE) from [<c03f1e20>] (dwc_otg_hcd_qh_free+0xbc/0xec)
Feb  3 02:57:18 raspberrypi kernel: [  129.314362] [<c03f1e20>] (dwc_otg_hcd_qh_free) from [<c03ea6bc>] (dwc_otg_hcd_endpoint_disable+0xac/0xc0)
Feb  3 02:57:18 raspberrypi kernel: [  129.314390] [<c03ea6bc>] (dwc_otg_hcd_endpoint_disable) from [<c03edca8>] (endpoint_disable+0x28/0x34)
Feb  3 02:57:18 raspberrypi kernel: [  129.314422] [<c03edca8>] (endpoint_disable) from [<c03c4148>] (usb_hcd_disable_endpoint+0x3c/0x40)
Feb  3 02:57:18 raspberrypi kernel: [  129.314455] [<c03c4148>] (usb_hcd_disable_endpoint) from [<c03c66c8>] (usb_disable_endpoint+0x70/0x8c)
Feb  3 02:57:18 raspberrypi kernel: [  129.314486] [<c03c66c8>] (usb_disable_endpoint) from [<c03c68f0>] (usb_disable_device+0x1a8/0x1d8)
Feb  3 02:57:18 raspberrypi kernel: [  129.314514] [<c03c68f0>] (usb_disable_device) from [<c03bec10>] (usb_disconnect+0x78/0x208)
Feb  3 02:57:18 raspberrypi kernel: [  129.314541] [<c03bec10>] (usb_disconnect) from [<c03bed44>] (usb_disconnect+0x1ac/0x208)
Feb  3 02:57:18 raspberrypi kernel: [  129.314566] [<c03bed44>] (usb_disconnect) from [<c03bf85c>] (hub_event+0x300/0xe4c)
Feb  3 02:57:18 raspberrypi kernel: [  129.314592] [<c03bf85c>] (hub_event) from [<c0036b84>] (process_one_work+0x110/0x39c)
Feb  3 02:57:18 raspberrypi kernel: [  129.314617] [<c0036b84>] (process_one_work) from [<c0036f6c>] (worker_thread+0x15c/0x4e4)
Feb  3 02:57:18 raspberrypi kernel: [  129.314644] [<c0036f6c>] (worker_thread) from [<c003c6e8>] (kthread+0xe0/0xfc)
Feb  3 02:57:18 raspberrypi kernel: [  129.314673] [<c003c6e8>] (kthread) from [<c000f608>] (ret_from_fork+0x14/0x2c)
Feb  3 02:57:18 raspberrypi kernel: [  129.314688] ---[ end trace 38aa7ae261971bcb ]---
Feb  3 02:57:18 raspberrypi dhcpcd[581]: eth0: arp if_readrawpacket: Network is down
Feb  3 02:57:18 raspberrypi kernel: [  129.333121] usb 1-1.2: USB disconnect, device number 5
Feb  3 02:57:18 raspberrypi dhcpcd[581]: eth0: carrier lost
Feb  3 02:57:18 raspberrypi kernel: [  129.345031] usb 1-1.4: USB disconnect, device number 6
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Interface eth0.IPv4 no longer relevant for mDNS.
Feb  3 02:57:18 raspberrypi kernel: [  129.359543] usb 1-1.5: USB disconnect, device number 4
Feb  3 02:57:18 raspberrypi kernel: [  129.360124] sierra ttyUSB0: Sierra USB modem converter now disconnected from ttyUSB0
Feb  3 02:57:18 raspberrypi kernel: [  129.360240] sierra 1-1.5:1.0: device disconnected
Feb  3 02:57:18 raspberrypi kernel: [  129.360857] sierra ttyUSB1: Sierra USB modem converter now disconnected from ttyUSB1
Feb  3 02:57:18 raspberrypi kernel: [  129.360968] sierra 1-1.5:1.1: device disconnected
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.166.77.
Feb  3 02:57:18 raspberrypi kernel: [  129.364122] sierra ttyUSB2: Sierra USB modem converter now disconnected from ttyUSB2
Feb  3 02:57:18 raspberrypi kernel: [  129.364246] sierra 1-1.5:1.3: device disconnected
Feb  3 02:57:18 raspberrypi kernel: [  129.364561] sierra_net 1-1.5:1.7 wwan0: unregister 'sierra_net' usb-20980000.usb-1.5, Sierra Wireless USB-to-WWAN Modem
Feb  3 02:57:18 raspberrypi kernel: [  129.379400] sierra_net 1-1.5:1.7 wwan0 (unregistered): usb_control_msg failed, status -19
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Withdrawing address record for fe80::f222:ec88:3118:6fa on eth0.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Withdrawing address record for 192.168.166.77 on eth0.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Withdrawing workstation service for eth0.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Interface wwan0.IPv6 no longer relevant for mDNS.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Leaving mDNS multicast group on interface wwan0.IPv6 with address fe80::2bc6:b343:fc1d:35be.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Withdrawing address record for fe80::2bc6:b343:fc1d:35be on wwan0.
Feb  3 02:57:18 raspberrypi avahi-daemon[347]: Withdrawing workstation service for wwan0.
Feb  3 02:57:18 raspberrypi kernel: [  129.542987] Indeed it is in host mode hprt0 = 00021501
Feb  3 02:57:18 raspberrypi kernel: [  129.862879] usb 1-1: new high-speed USB device number 7 using dwc_otg
Feb  3 02:57:18 raspberrypi kernel: [  129.863135] Indeed it is in host mode hprt0 = 00001101
Feb  3 02:57:18 raspberrypi kernel: [  130.193345] usb 1-1: New USB device found, idVendor=0424, idProduct=9514
Feb  3 02:57:18 raspberrypi kernel: [  130.193412] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Feb  3 02:57:18 raspberrypi kernel: [  130.209964] hub 1-1:1.0: USB hub found
Feb  3 02:57:18 raspberrypi kernel: [  130.210178] hub 1-1:1.0: 5 ports detected
Feb  3 02:57:19 raspberrypi kernel: [  130.492870] usb 1-1.1: new high-speed USB device number 8 using dwc_otg
Feb  3 02:57:19 raspberrypi kernel: [  130.593336] usb 1-1.1: New USB device found, idVendor=0424, idProduct=ec00
Feb  3 02:57:19 raspberrypi kernel: [  130.593379] usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Feb  3 02:57:19 raspberrypi kernel: [  130.608658] smsc95xx v1.0.4
Feb  3 02:57:19 raspberrypi kernel: [  130.689434] smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-20980000.usb-1.1, smsc95xx USB 2.0 Ethernet, b8:27:eb:9c:3f:64
Feb  3 02:57:19 raspberrypi dhcpcd[581]: eth0: deleting address fe80::f222:ec88:3118:6fa
Feb  3 02:57:19 raspberrypi kernel: [  130.782882] usb 1-1.5: new high-speed USB device number 9 using dwc_otg
Feb  3 02:57:19 raspberrypi dhcpcd[581]: eth0: deleting route to 192.168.166.0/24
Feb  3 02:57:19 raspberrypi dhcpcd[581]: eth0: deleting default route via 192.168.166.254
Feb  3 02:57:19 raspberrypi kernel: [  130.884465] usb 1-1.5: config 1 has an invalid interface number: 7 but max is 3
Feb  3 02:57:19 raspberrypi kernel: [  130.884512] usb 1-1.5: config 1 has no interface number 2
Feb  3 02:57:19 raspberrypi kernel: [  130.886061] usb 1-1.5: New USB device found, idVendor=1199, idProduct=68a3
Feb  3 02:57:19 raspberrypi kernel: [  130.886101] usb 1-1.5: New USB device strings: Mfr=3, Product=2, SerialNumber=0
Feb  3 02:57:19 raspberrypi kernel: [  130.886123] usb 1-1.5: Product: SL8084T
Feb  3 02:57:19 raspberrypi kernel: [  130.886142] usb 1-1.5: Manufacturer: Sierra Wireless, Incorporated
Feb  3 02:57:19 raspberrypi kernel: [  130.890455] sierra 1-1.5:1.0: Sierra USB modem converter detected
Feb  3 02:57:19 raspberrypi kernel: [  130.891908] usb 1-1.5: Sierra USB modem converter now attached to ttyUSB0
Feb  3 02:57:19 raspberrypi kernel: [  130.892805] sierra 1-1.5:1.1: Sierra USB modem converter detected
Feb  3 02:57:19 raspberrypi kernel: [  130.906451] usb 1-1.5: Sierra USB modem converter now attached to ttyUSB1
Feb  3 02:57:19 raspberrypi kernel: [  130.907240] sierra 1-1.5:1.3: Sierra USB modem converter detected
Feb  3 02:57:19 raspberrypi kernel: [  130.911344] usb 1-1.5: Sierra USB modem converter now attached to ttyUSB2
Feb  3 02:57:19 raspberrypi kernel: [  130.920175] sierra_net 1-1.5:1.7 wwan0: register 'sierra_net' at usb-20980000.usb-1.5, Sierra Wireless USB-to-WWAN Modem, 06:12:10:a2:02:07
Feb  3 02:57:19 raspberrypi kernel: [  131.192869] usb 1-1.2: new low-speed USB device number 10 using dwc_otg
Feb  3 02:57:20 raspberrypi kernel: [  131.311945] usb 1-1.2: New USB device found, idVendor=1a2c, idProduct=0021
Feb  3 02:57:20 raspberrypi kernel: [  131.311988] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Feb  3 02:57:20 raspberrypi kernel: [  131.312009] usb 1-1.2: Product: USB Keykoard
Feb  3 02:57:20 raspberrypi kernel: [  131.312029] usb 1-1.2: Manufacturer: USB
Feb  3 02:57:20 raspberrypi kernel: [  131.342982] input: USB USB Keykoard as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2:1.0/0003:1A2C:0021.0004/input/input3
Feb  3 02:57:20 raspberrypi dhcpcd[581]: eth0: removing interface
Feb  3 02:57:20 raspberrypi kernel: [  131.405856] hid-generic 0003:1A2C:0021.0004: input,hidraw0: USB HID v1.10 Keyboard [USB USB Keykoard] on usb-20980000.usb-1.2/input0
Feb  3 02:57:20 raspberrypi rsyslogd-2007: action 'action 17' suspended, next retry is Wed Feb  3 02:57:50 2016 [try http://www.rsyslog.com/e/2007 ]
Feb  3 02:57:20 raspberrypi dhcpcd[581]: wwan0: removing interface
Feb  3 02:57:20 raspberrypi dhcpcd[581]: wwan0: deleting address fe80::2bc6:b343:fc1d:35be
Feb  3 02:57:20 raspberrypi ntpd[598]: Deleting interface #6 wwan0, fe80::2bc6:b343:fc1d:35be#123, interface stats: received=0, sent=0, dropped=0, active_time=99 secs
Feb  3 02:57:20 raspberrypi ntpd[598]: Deleting interface #5 eth0, fe80::f222:ec88:3118:6fa#123, interface stats: received=0, sent=0, dropped=0, active_time=99 secs
Feb  3 02:57:20 raspberrypi ntpd[598]: Deleting interface #3 eth0, 192.168.166.77#123, interface stats: received=0, sent=8, dropped=0, active_time=99 secs
Feb  3 02:57:20 raspberrypi ntpd[598]: 103.242.68.68 interface 192.168.166.77 -> (none)
Feb  3 02:57:20 raspberrypi ntpd[598]: 203.99.128.34 interface 192.168.166.77 -> (none)
Feb  3 02:57:20 raspberrypi ntpd[598]: 103.239.8.22 interface 192.168.166.77 -> (none)
Feb  3 02:57:20 raspberrypi ntpd[598]: 202.78.240.38 interface 192.168.166.77 -> (none)
Feb  3 02:57:20 raspberrypi ntpd[598]: peers refreshed
Feb  3 02:57:21 raspberrypi dhcpcd[581]: eth0: adding address fe80::f222:ec88:3118:6fa
Feb  3 02:57:21 raspberrypi dhcpcd[581]: eth0: waiting for carrier
Feb  3 02:57:21 raspberrypi kernel: [  132.686507] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
Feb  3 02:57:21 raspberrypi kernel: [  132.687844] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Feb  3 02:57:21 raspberrypi dhcpcd[581]: wwan0: adding address fe80::8750:7251:8d3a:2672
Feb  3 02:57:21 raspberrypi avahi-daemon[347]: Joining mDNS multicast group on interface wwan0.IPv6 with address fe80::8750:7251:8d3a:2672.
Feb  3 02:57:21 raspberrypi kernel: [  133.032084] IPv6: ADDRCONF(NETDEV_UP): wwan0: link is not ready
Feb  3 02:57:21 raspberrypi avahi-daemon[347]: New relevant interface wwan0.IPv6 for mDNS.
Feb  3 02:57:21 raspberrypi avahi-daemon[347]: Registering new address record for fe80::8750:7251:8d3a:2672 on wwan0.*.
Feb  3 02:57:21 raspberrypi dhcpcd[581]: wwan0: waiting for carrier
Feb  3 02:57:21 raspberrypi dhcpcd[581]: eth0: removing interface
Feb  3 02:57:21 raspberrypi dhcpcd[581]: eth0: deleting address fe80::f222:ec88:3118:6fa
Feb  3 02:57:22 raspberrypi dhcpcd[581]: wwan0: new hardware address: 06:12:10:a2:01:07
Feb  3 02:57:22 raspberrypi dhcpcd[581]: wwan0: removing interface
Feb  3 02:57:22 raspberrypi dhcpcd[581]: wwan0: deleting address fe80::8750:7251:8d3a:2672
Feb  3 02:57:22 raspberrypi avahi-daemon[347]: Withdrawing address record for fe80::8750:7251:8d3a:2672 on wwan0.
Feb  3 02:57:22 raspberrypi avahi-daemon[347]: Leaving mDNS multicast group on interface wwan0.IPv6 with address fe80::8750:7251:8d3a:2672.
Feb  3 02:57:22 raspberrypi avahi-daemon[347]: Interface wwan0.IPv6 no longer relevant for mDNS.
Feb  3 02:57:22 raspberrypi kernel: [  134.219030] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC1E1
Feb  3 02:57:22 raspberrypi kernel: [  134.227609] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Feb  3 02:57:30 raspberrypi kernel: [  141.412941] hid-generic 0003:1A2C:0021.0005: usb_submit_urb(ctrl) failed: -1
Feb  3 02:57:30 raspberrypi kernel: [  141.412989] hid-generic 0003:1A2C:0021.0005: timeout initializing reports
Feb  3 02:57:30 raspberrypi kernel: [  141.414006] input: USB USB Keykoard as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2:1.1/0003:1A2C:0021.0005/input/input4
Feb  3 02:57:30 raspberrypi kernel: [  141.474519] hid-generic 0003:1A2C:0021.0005: input,hidraw1: USB HID v1.10 Device [USB USB Keykoard] on usb-20980000.usb-1.2/input1
Feb  3 02:57:30 raspberrypi kernel: [  141.712927] usb 1-1.4: new full-speed USB device number 11 using dwc_otg
Feb  3 02:57:30 raspberrypi kernel: [  141.825464] usb 1-1.4: New USB device found, idVendor=1908, idProduct=2070
Feb  3 02:57:30 raspberrypi kernel: [  141.825507] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Feb  3 02:57:30 raspberrypi kernel: [  141.825529] usb 1-1.4: Product: USB PnP Sound Device
Feb  3 02:57:30 raspberrypi kernel: [  141.825548] usb 1-1.4: Manufacturer: Generic
Feb  3 02:57:30 raspberrypi kernel: [  141.825565] usb 1-1.4: SerialNumber: 〲㌱㈱
Feb  3 02:57:30 raspberrypi kernel: [  141.842989] usb 1-1.4: 1:1: cannot get freq at ep 0x2
Feb  3 02:57:30 raspberrypi kernel: [  141.844839] usb 1-1.4: 2:1: cannot get freq at ep 0x82
Feb  3 02:57:30 raspberrypi kernel: [  141.880042] input: Generic USB PnP Sound Device as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.4/1-1.4:1.3/0003:1908:2070.0006/input/input5
Feb  3 02:57:30 raspberrypi kernel: [  141.935226] hid-generic 0003:1908:2070.0006: input,hidraw2: USB HID v2.01 Device [Generic USB PnP Sound Device] on usb-20980000.usb-1.4/input3
Feb  3 02:59:04 raspberrypi kernel: [  235.745060] smsc95xx 1-1.1:1.0 eth0: link down
Feb  3 02:59:04 raspberrypi rsyslogd-2007: action 'action 17' suspended, next retry is Wed Feb  3 02:59:34 2016 [try http://www.rsyslog.com/e/2007 ]
Feb  3 02:59:06 raspberrypi kernel: [  237.305140] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC1E1
Feb  3 02:59:13 raspberrypi dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Feb  3 02:59:13 raspberrypi dhclient: DHCPNAK from 192.168.166.2
Feb  3 02:59:13 raspberrypi dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Feb  3 02:59:13 raspberrypi dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Feb  3 02:59:13 raspberrypi dhclient: DHCPOFFER from 192.168.166.2
Feb  3 02:59:13 raspberrypi dhclient: DHCPACK from 192.168.166.2
Feb  3 02:59:13 raspberrypi avahi-daemon[347]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.166.77.
Feb  3 02:59:13 raspberrypi avahi-daemon[347]: New relevant interface eth0.IPv4 for mDNS.
Feb  3 02:59:13 raspberrypi avahi-daemon[347]: Registering new address record for 192.168.166.77 on eth0.IPv4.
Feb  3 02:59:14 raspberrypi dhclient: bound to 192.168.166.77 -- renewal in 310951 seconds.

@HiassofT
Copy link
Contributor

HiassofT commented Feb 3, 2016

The reason for the 4.4 backtrace seems to be the same issue that we had with the i2s driver, dma_free_coherent being called with interrupts disabled. dwc_otg_hcd_qh_free calls DWC_DMA_FREE inside a spinlock_irqsave block.

Looks like preparation for using DMA pool had been started but it looks unfinished and everything's "#if 0"ed out.

@jkuek
Copy link
Author

jkuek commented Feb 4, 2016

Here's another backtrace from 4.1.13
panic 4.1.13.txt

@jkuek
Copy link
Author

jkuek commented Feb 4, 2016

Here's kern.log showing multiple oops and eventual panic on 4.4.1
panic 4.4.1.txt

@jkuek
Copy link
Author

jkuek commented Feb 4, 2016

Another log, different backtrace
dwc error 4.4.1.txt

@jkuek jkuek changed the title Kernel panics (USB-triggered?) USB disconnects and reconnects cause kernel panics Feb 9, 2016
@jkuek
Copy link
Author

jkuek commented Feb 9, 2016

Updated the kernel using apt-get to 4.1.17+ #834, still getting a crash, but this time it says

kernel BUG at drivers/usb/host/dwc_otg/dwc_otg_hcd_intr.c:2425!
Internal error: Oops - BUG: 0 [#1] ARM

Full crash log: crash 4.1.17.txt

Is anyone still maintaining the dwc_otg driver? I see there is a dwc2 driver in mainline, should I switch to that instead?

@P33M
Copy link
Contributor

P33M commented Feb 10, 2016

It's possible that memory's being leaked when disconnecting/reconnecting as dwc_otg will stop and start the host driver.

dwc2 is the upstream driver but lacks FIQ support. If you have no low- or full-speed devices then it will probably work OK. It is the go-to for Pi Zero in device mode, however.

@tpfkanep
Copy link

I found usb_modeswitch 2.3.0-1 (on archlinux) to be the cause of my kernel panics when I plug in/out my usb modem. When I downgrade the package to the previous version (2.2.6-1), all is fine.

@jkuek
Copy link
Author

jkuek commented Mar 13, 2016

@tpfkanep This particular problem doesn't just happen with the USB modem. I can reproduce it with a USB audio card or other devices as well. Having multiple USB devices seems to make it easier to reproduce though.

@jkuek
Copy link
Author

jkuek commented Mar 21, 2016

Seen again on 4.4.6+ #861:

[  767.863412] Unable to handle kernel NULL pointer dereference at virtual address 00000034
[  767.875664] pgd = c0004000
[  767.882356] [00000034] *pgd=00000000
[  767.889882] Internal error: Oops: 817 [#1] ARM
[  767.898390] Modules linked in: cfg80211 rfkill joydev evdev snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device sierra sierra_net usbserial snd_soc_bcm2835_i2s regmap_mmio snd_soc_core snd_pcm_dmaengine spi_bcm2835 snd_bcm2835 bcm2835_gpiomem bcm2835_wdt snd_pcm snd_timer snd uio_pdrv_genirq uio
[  767.935694] CPU: 0 PID: 173 Comm: kworker/0:5 Not tainted 4.4.6+ #861
[  767.946820] Hardware name: BCM2708
[  767.955090] Workqueue: usb_hub_wq hub_event
[  767.964199] task: da75b680 ti: d8c18000 task.ti: d8c18000
[  767.974568] PC is at assign_and_init_hc+0x60/0x648
[  767.984376] LR is at dwc_otg_hcd_select_transactions+0xb8/0x3a0
[  767.995263] pc : [<c03ed8c0>]    lr : [<c03efacc>]    psr: 00000193
[  767.995263] sp : d8c19698  ip : d8c196d0  fp : d8c196cc
[  768.016908] r10: d8c00a48  r9 : daf94460  r8 : da5ca1c0
[  768.027283] r7 : daf94400  r6 : d8c00a20  r5 : da5ca080  r4 : daf169c0
[  768.039059] r3 : daf94460  r2 : 00000000  r1 : 00000000  r0 : daf94400
[  768.050823] Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[  768.063511] Control: 00c5387d  Table: 18e78008  DAC: 00000055
[  768.074696] Process kworker/0:5 (pid: 173, stack limit = 0xd8c18188)
[  768.086546] Stack: (0xd8c19698 to 0xd8c1a000)
[  768.096315] 9680:                                                       daf94428 d8c196e4
[  768.110122] 96a0: daf94430 daf94400 00000001 daf94428 d8c196e4 daf94430 daf94400 daf94460
[  768.124018] 96c0: d8c19714 d8c196d0 c03efacc c03ed86c 00000000 daf94438 c03edc30 daf169c0
[  768.137960] 96e0: da405f28 20000193 c08a88a0 daf94400 daf94428 daf94430 02000008 c08a8700
[  768.151863] 9700: f301080e 00000008 d8c19734 d8c19718 c03f39c0 c03efa20 00000000 daf94400
[  768.165837] 9720: daf94000 00000008 d8c19764 d8c19738 c03f5cf4 c03f38fc c03f20d8 dac1ec00
[  768.179936] 9740: daea1bc0 00000038 00000000 00000000 00000000 dac1ec00 d8c19774 d8c19768
[  768.194135] 9760: c03f20f4 c03f5a0c d8c19784 d8c19778 c03c4fc4 c03f20e4 d8c197bc d8c19788
[  768.208363] 9780: c005a9b4 c03c4f9c 0000000a 00000010 32343537 dac1ec00 dac1ec10 00000000
[  768.222554] 97a0: 00000000 00000000 da57fc38 daf82580 d8c197d4 d8c197c0 c005ab48 c005a92c
[  768.236665] 97c0: 00020000 dac1ec00 d8c197ec d8c197d8 c005d8d4 c005ab1c 00000038 c089a864
[  768.250759] 97e0: d8c197fc d8c197f0 c005a028 c005d844 d8c19824 d8c19800 c005a2c0 c005a004
[  768.264700] 9800: d8c19850 c0859534 d8c19850 ffffffff d8c19884 daf19c00 d8c19834 d8c19828
[  768.278615] 9820: c0010748 c005a274 d8c1984c d8c19838 c000940c c0010728 c03fcdac 60000013
[  768.292592] 9840: d8c198b4 d8c19850 c05a5024 c00093dc 00000000 00000000 da405f00 c0858f74
[  768.306637] 9860: 60000013 5a7b0d00 da5ca5c0 daf94400 daf19c00 da57fc38 daf82580 d8c198b4
[  768.320729] 9880: d8c198a0 d8c198a0 c03fcd98 c03fcdac 60000013 ffffffff c03fcd98 c00b4f28
[  768.334716] 98a0: 00000000 00000000 d8c1990c d8c198b8 c03f1f14 c03fcd80 00000004 da7b0d00
[  768.348820] 98c0: 5a7b0d00 00000001 00000000 c00198ac 00000000 da57fc50 00000400 60000013
[  768.363089] 98e0: c0862b5c daf82580 daf19c00 daf82588 00000000 02400000 00000002 00000000
[  768.377438] 9900: d8c199bc d8c19910 c03c6fcc c03f1d2c daf94400 c03f6024 d8c19954 d8c19928
[  768.391853] 9920: c03f6024 c03fbf14 00000000 00000000 d8c00a20 c03f6de4 da5ca080 da5ca1c0
[  768.406331] 9940: 00000001 d8d71840 d8c19974 c03f033c 00000008 daf94438 daf9441c daf94460
[  768.420856] 9960: d8d71840 d8c6e780 d8c19994 daf94400 00000000 d8d71858 da5ca1c0 00000001
[  768.435291] 9980: d8d71840 d8c6e780 d8c199c4 d8c19998 60000013 daf82580 02400000 00000000
[  768.449779] 99a0: 00000000 da57fc00 00000002 00000000 d8c19a04 d8c199c0 c03c88b4 c03c6f10
[  768.464401] 99c0: c03f1f14 c03fcd80 00000040 00000000 00000000 00000001 00000040 daf82580
[  768.479203] 99e0: d8c19a08 00000000 d8c19a4c 00001388 00000000 00000000 d8c19a3c d8c19a08
[  768.494090] 9a00: c03c8e38 c03c85d8 00000000 d8c19a0c d8c19a0c 00000000 d8c19a3c da7b0d00
[  768.509039] 9a20: 00000004 00000007 80002800 da57fc00 d8c19a7c d8c19a40 c03c8f60 c03c8df0
[  768.524095] 9a40: d8c4d514 00000021 00000021 00000000 da7b0a80 da7b0a80 00000004 00000007
[  768.539232] 9a60: 00000000 00000021 00000000 da42dc00 d8c19ac4 d8c19a80 c03bc54c c03c8ebc
[  768.554370] 9a80: 00000000 00000007 da7b0a80 00000004 00001388 c096e33c c82a0640 00000007
[  768.569488] 9aa0: da42dc00 bf0d947c d8f6b400 da57fc00 bf0d96e8 00000000 d8c19aec d8c19ac8
[  768.584690] 9ac0: c03bc5a8 c03bc4d0 00000007 d8c4d514 00000004 c05a15b0 d8c18000 d8c4d500
[  768.599951] 9ae0: d8c19b1c d8c19af0 bf0d884c c03bc588 00000007 d8c4d514 00000004 da57fc00
[  768.615319] 9b00: bf0d96e8 d8c4d500 da42dc00 d8f6b420 d8c19b34 d8c19b20 bf0d88a8 bf0d8810
[  768.630825] 9b20: 00000000 da42dc00 d8c19b4c d8c19b38 bf0d8d74 bf0d888c bf0d96e0 bf0d91a8
[  768.646366] 9b40: d8c19b7c d8c19b50 c03cc63c bf0d8d34 c03cc52c c095195c d8f6b420 00000000
[  768.661940] 9b60: bf0d96e8 00000911 00000000 c08a8000 d8c19ba4 d8c19b80 c03670d8 c03cc538
[  768.677499] 9b80: 00000007 d8f6b420 d8c19bf0 bf0d96e8 00000001 c0951938 d8c19bc4 d8c19ba8
[  768.693059] 9ba0: c03673dc c0366f30 d8c00a00 00000000 d8c19bf0 c0367330 d8c19bec d8c19bc8
[  768.708652] 9bc0: c0365514 c036733c dad34f5c d8c009f4 c05a31cc d8f6b420 d8f6b420 d8f6b454
[  768.724287] 9be0: d8c19c14 d8c19bf0 c0366ec0 c03654ac d8f6b420 00000001 d8f6b428 d8f6b420
[  768.739723] 9c00: c08a8018 00000000 d8c19c24 d8c19c18 c036752c c0366e30 d8c19c44 d8c19c28
[  768.754935] 9c20: c0366544 c036751c db38bd80 d8f6b428 d8f6b420 da57fc68 d8c19c84 d8c19c48
[  768.770052] 9c40: c0364534 c03664bc da5c2600 da57fc00 d8c19c84 d8c19c60 c03c9e5c d8f6b400
[  768.785148] 9c60: d8eed85c d8eed85c d8eed84c da57fc68 d8eed85c d8c1a2c8 d8c19cfc d8c19c88
[  768.800279] 9c80: c03cb038 c0364190 00000001 00000000 00000000 00000000 00001388 da403c30
[  768.815467] 9ca0: 00000001 00000004 da44b080 d8eed800 d8eed80c da57fc04 d8eed85c daf19c00
[  768.830722] 9cc0: 00000001 c03d420c d8f6b420 da44b08c c03d420c da57fc00 00000001 00000000
[  768.846072] 9ce0: c08a8620 0000090b 00000000 c08a7ed8 d8c19d14 d8c19d00 c03d4404 c03cab9c
[  768.861454] 9d00: c095195c da57fc68 d8c19d24 d8c19d18 c03cb6d8 c03d43d4 d8c19d4c d8c19d28
[  768.876854] 9d20: c03670d8 c03cb6bc 00000007 da57fc68 d8c19d98 c08a8620 00000001 dac03ac0
[  768.892311] 9d40: d8c19d6c d8c19d50 c03673dc c0366f30 dad5f1c0 00000000 d8c19d98 c0367330
[  768.907796] 9d60: d8c19d94 d8c19d70 c0365514 c036733c dad34f5c dad5f1b4 c05a31cc da57fc68
[  768.923310] 9d80: da57fc68 da57fc9c d8c19dbc d8c19d98 c0366ec0 c03654ac da57fc68 00000001
[  768.938824] 9da0: da57fc70 da57fc68 c08a8018 00000000 d8c19dcc d8c19dc0 c036752c c0366e30
[  768.954283] 9dc0: d8c19dec d8c19dd0 c0366544 c036751c db38bd80 da57fc70 da57fc68 d8ed7068
[  768.969701] 9de0: d8c19e2c d8c19df0 c0364534 c03664bc 393831a0 0039333a d8c19e2c d8c19e08
[  768.985126] 9e00: c0358480 da57fc00 da57fc68 da44b5a0 00000000 d8f93930 00000005 daf19c00
[  769.000587] 9e20: d8c19e64 d8c19e30 c03c2c6c c0364190 00000000 00000028 d8c19e64 da57fc00
[  769.016009] 9e40: 00000000 d8ed7000 00000000 d8f93930 00000005 daf19c00 d8c19eec d8c19e68
[  769.031444] 9e60: c03c3afc c03c2a78 c0862b01 00000000 00000000 00000064 00000010 da57fde4
[  769.046958] 9e80: d8ed7000 daf19c00 d8f6e6f4 d8f93a20 d8f93838 d8ed709c d8f6e600 d8f9382c
[  769.062534] 9ea0: d8ed7000 d8f6e6f4 d8f6e6f4 d8f93830 d8f9387d d8f93800 00000101 dbfd9c00
[  769.078101] 9ec0: d8c19eec d8f93930 d8c00ae0 c086072c 00000000 00000000 dbfda500 c086072c
[  769.093684] 9ee0: d8c19f2c d8c19ef0 c00374c4 c03c302c c086072c c086073c 00000008 c08bdf1c
[  769.109260] 9f00: d8c00ae0 c086072c d8c00af8 c086073c 00000008 c08bdf1c d8c00ae0 c086072c
[  769.124807] 9f20: d8c19f64 d8c19f30 c0037784 c00373b4 c0037744 00000000 00000000 00000000
[  769.140378] 9f40: da532080 d8c00ae0 c0037744 00000000 00000000 00000000 d8c19fac d8c19f68
[  769.155925] 9f60: c003d118 c0037750 80007d46 00000000 da532080 d8c00ae0 00000000 d8c19f7c
[  769.171406] 9f80: d8c19f7c 00000000 d8c19f88 d8c19f88 da532080 c003d03c 00000000 00000000
[  769.186818] 9fa0: 00000000 d8c19fb0 c000f8a8 c003d048 00000000 00000000 00000000 00000000
[  769.202207] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  769.217521] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 03227821 494d4448
[  769.232804] [<c03ed8c0>] (assign_and_init_hc) from [<c03efacc>] (dwc_otg_hcd_select_transactions+0xb8/0x3a0)
[  769.249805] [<c03efacc>] (dwc_otg_hcd_select_transactions) from [<c03f39c0>] (dwc_otg_hcd_handle_sof_intr+0xd0/0x11c)
[  769.267613] [<c03f39c0>] (dwc_otg_hcd_handle_sof_intr) from [<c03f5cf4>] (dwc_otg_hcd_handle_intr+0x2f4/0x370)
[  769.284836] [<c03f5cf4>] (dwc_otg_hcd_handle_intr) from [<c03f20f4>] (dwc_otg_hcd_irq+0x1c/0x28)
[  769.300849] [<c03f20f4>] (dwc_otg_hcd_irq) from [<c03c4fc4>] (usb_hcd_irq+0x34/0x48)
[  769.315802] [<c03c4fc4>] (usb_hcd_irq) from [<c005a9b4>] (handle_irq_event_percpu+0x94/0x1f0)
[  769.331548] [<c005a9b4>] (handle_irq_event_percpu) from [<c005ab48>] (handle_irq_event+0x38/0x4c)
[  769.347708] [<c005ab48>] (handle_irq_event) from [<c005d8d4>] (handle_level_irq+0x9c/0x134)
[  769.363367] [<c005d8d4>] (handle_level_irq) from [<c005a028>] (generic_handle_irq+0x30/0x44)
[  769.379139] [<c005a028>] (generic_handle_irq) from [<c005a2c0>] (__handle_domain_irq+0x58/0xb8)
[  769.395216] [<c005a2c0>] (__handle_domain_irq) from [<c0010748>] (handle_IRQ+0x2c/0x30)
[  769.410586] [<c0010748>] (handle_IRQ) from [<c000940c>] (bcm2835_handle_irq+0x3c/0x58)
[  769.425886] [<c000940c>] (bcm2835_handle_irq) from [<c05a5024>] (__irq_svc+0x44/0x5c)
[  769.441083] Exception stack(0xd8c19850 to 0xd8c19898)
[  769.453456] 9840:                                     00000000 00000000 da405f00 c0858f74
[  769.468887] 9860: 60000013 5a7b0d00 da5ca5c0 daf94400 daf19c00 da57fc38 daf82580 d8c198b4
[  769.484346] 9880: d8c198a0 d8c198a0 c03fcd98 c03fcdac 60000013 ffffffff
[  769.498244] [<c05a5024>] (__irq_svc) from [<c03fcdac>] (DWC_SPINUNLOCK_IRQRESTORE+0x38/0x58)
[  769.514082] [<c03fcdac>] (DWC_SPINUNLOCK_IRQRESTORE) from [<c03f1f14>] (dwc_otg_urb_enqueue+0x1f4/0x36c)
[  769.531037] [<c03f1f14>] (dwc_otg_urb_enqueue) from [<c03c6fcc>] (usb_hcd_submit_urb+0xc8/0x8e4)
[  769.547301] [<c03c6fcc>] (usb_hcd_submit_urb) from [<c03c88b4>] (usb_submit_urb+0x2e8/0x4d8)
[  769.563241] [<c03c88b4>] (usb_submit_urb) from [<c03c8e38>] (usb_start_wait_urb+0x54/0xcc)
[  769.578993] [<c03c8e38>] (usb_start_wait_urb) from [<c03c8f60>] (usb_control_msg+0xb0/0xe8)
[  769.594824] [<c03c8f60>] (usb_control_msg) from [<c03bc54c>] (__usbnet_write_cmd+0x88/0xb8)
[  769.610628] [<c03bc54c>] (__usbnet_write_cmd) from [<c03bc5a8>] (usbnet_write_cmd+0x2c/0x34)
[  769.626527] [<c03bc5a8>] (usbnet_write_cmd) from [<bf0d884c>] (sierra_net_send_cmd.constprop.2+0x48/0x7c [sierra_net])
[  769.644785] [<bf0d884c>] (sierra_net_send_cmd.constprop.2 [sierra_net]) from [<bf0d88a8>] (sierra_net_dosync+0x28/0xa4 [sierra_net])
[  769.664336] [<bf0d88a8>] (sierra_net_dosync [sierra_net]) from [<bf0d8d74>] (sierra_net_probe+0x4c/0x58 [sierra_net])
[  769.682583] [<bf0d8d74>] (sierra_net_probe [sierra_net]) from [<c03cc63c>] (usb_probe_interface+0x110/0x1fc)
[  769.699936] [<c03cc63c>] (usb_probe_interface) from [<c03670d8>] (really_probe+0x1b4/0x2a8)
[  769.715822] [<c03670d8>] (really_probe) from [<c03673dc>] (__device_attach_driver+0xac/0xb0)
[  769.731798] [<c03673dc>] (__device_attach_driver) from [<c0365514>] (bus_for_each_drv+0x74/0xa8)
[  769.748184] [<c0365514>] (bus_for_each_drv) from [<c0366ec0>] (__device_attach+0x9c/0xe0)
[  769.763979] [<c0366ec0>] (__device_attach) from [<c036752c>] (device_initial_probe+0x1c/0x20)
[  769.780167] [<c036752c>] (device_initial_probe) from [<c0366544>] (bus_probe_device+0x94/0x9c)
[  769.796492] [<c0366544>] (bus_probe_device) from [<c0364534>] (device_add+0x3b0/0x54c)
[  769.812100] [<c0364534>] (device_add) from [<c03cb038>] (usb_set_configuration+0x4a8/0x9f4)
[  769.828142] [<c03cb038>] (usb_set_configuration) from [<c03d4404>] (generic_probe+0x3c/0x88)
[  769.844273] [<c03d4404>] (generic_probe) from [<c03cb6d8>] (usb_probe_device+0x28/0x2c)
[  769.859979] [<c03cb6d8>] (usb_probe_device) from [<c03670d8>] (really_probe+0x1b4/0x2a8)
[  769.875788] [<c03670d8>] (really_probe) from [<c03673dc>] (__device_attach_driver+0xac/0xb0)
[  769.891911] [<c03673dc>] (__device_attach_driver) from [<c0365514>] (bus_for_each_drv+0x74/0xa8)
[  769.908414] [<c0365514>] (bus_for_each_drv) from [<c0366ec0>] (__device_attach+0x9c/0xe0)
[  769.924297] [<c0366ec0>] (__device_attach) from [<c036752c>] (device_initial_probe+0x1c/0x20)
[  769.940569] [<c036752c>] (device_initial_probe) from [<c0366544>] (bus_probe_device+0x94/0x9c)
[  769.956930] [<c0366544>] (bus_probe_device) from [<c0364534>] (device_add+0x3b0/0x54c)
[  769.972615] [<c0364534>] (device_add) from [<c03c2c6c>] (usb_new_device+0x200/0x3a8)
[  769.988074] [<c03c2c6c>] (usb_new_device) from [<c03c3afc>] (hub_event+0xadc/0xf3c)
[  770.003361] [<c03c3afc>] (hub_event) from [<c00374c4>] (process_one_work+0x11c/0x39c)
[  770.018798] [<c00374c4>] (process_one_work) from [<c0037784>] (worker_thread+0x40/0x4d8)
[  770.034538] [<c0037784>] (worker_thread) from [<c003d118>] (kthread+0xdc/0xf8)
[  770.049393] [<c003d118>] (kthread) from [<c000f8a8>] (ret_from_fork+0x14/0x2c)
[  770.064216] Code: e1520003 05942034 15941034 05872064 (15821034)
[  770.077988] ---[ end trace 465aa20bc07a3b82 ]---
[  770.090270] Kernel panic - not syncing: Fatal exception in interrupt

@tpfkanep
Copy link

tpfkanep commented Jul 3, 2016

A user on the usb_modeswitch forum posted the following solution (working great on my Pi for a couple of days now):

I solved it - use the following config:

TargetVendor=0x1bbb
TargetProductList="0000,0017,00b7,011e,0191,0195"
StandardEject=1

instead of:

TargetVendor=0x1bbb
TargetProductList="0000,0017,00b7,011e,0191,0195"
MessageContent="55534243123456788000000080000606f50402527000000000000000000000"

http://www.draisberghof.de/usb_modeswitch/bb/viewtopic.php?f=4&t=2498

@jkuek
Copy link
Author

jkuek commented Jul 3, 2016

@tpfkanep Thanks for the info.

This might fix your particular issue, but as previously mentioned, I can reproduce the kernel panics even without usb_modeswitch and a usb modem, e.g with a USB audio card.

@Ruffio
Copy link

Ruffio commented Aug 17, 2016

@jkuek has your issue been resolved? If so, please close this issue. Thanks.

@vpelletier
Copy link
Contributor

vpelletier commented Dec 22, 2016

I experience what looks like the same crash. This raspberry pi B+ is headless, uses the onboard ethernet adapter and has 3 USB devices plugged in. I connected another, otherwise unused, B+ to the failing pi's on-board serial console to capture the crash, see below.

This is with a pure Debian sid armel install plus rpi_update-provided kernel. Both Debian and the lernel were updated about 4 days ago, and the pi was last rebooted a bit under 4 days ago:

>>> import time
>>> time.ctime(246972.664501)
'Sat Jan  3 20:36:12 1970'

(ie, the crash timestamp represents an offset of 3 days and 20 hours since kernel boot)

[246972.664501] Unable to handle kernel NULL pointer dereference at virtual address 00000034
[246972.672987] pgd = d5adc000
[246972.675879] [00000034] *pgd=00000000
[246972.679683] Internal error: Oops: 805 [#1] ARM

Entering kdb (current=0xd63e8da0, pid 2769) Oops: (null)
due to oops @ 0xc03fcae0
CPU: 0 PID: 2769 Comm: python Not tainted 4.4.38+ #938
Hardware name: BCM2708
task: d63e8da0 ti: d5afe000 task.ti: d5afe000
PC is at assign_and_init_hc+0x60/0x648
LR is at dwc_otg_hcd_select_transactions+0x26c/0x3a0
pc : [<c03fcae0>]    lr : [<c03feea0>]    psr: 00000193
sp : d5aff808  ip : d5aff840  fp : d5aff83c
r10: dafae400  r9 : d5869988  r8 : d5a2e680
r7 : dafae400  r6 : d5869960  r5 : d5a2e780  r4 : daf2f9c0
r3 : dafae460  r2 : 00000000  r1 : 00000000  r0 : dafae400
Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 00c5387d  Table: 15adc008  DAC: 00000055
CPU: 0 PID: 2769 Comm: python Not tainted 4.4.38+ #938
Hardware name: BCM2708
[<c0016cfc>] (unwind_backtrace) from [<c0013c20>] (show_stack+0x20/0x24)
[<c0013c20>] (show_stack) from [<c02e437c>] (dump_stack+0x20/0x28)
[<c02e437c>] (dump_stack) from [<c0010a9c>] (show_regs+0x1c/0x20)
[<c0010a9c>] (show_regs) from [<c00969f4>] (kdb_main_loop+0x32c/0x74c)
[<c00969f4>] (kdb_main_loop) from [<c00994f4>] (kdb_stub+0x1d0/0x404)
[<c00994f4>] (kdb_stub) from [<c008fb10>] (kgdb_handle_exception+0x28c/0x7bc)
[<c008fb10>] (kgdb_handle_exception) from [<c0016244>] (kgdb_notify+0x3c/0x94)
[<c0016244>] (kgdb_notify) from [<c003e018>] (notifier_call_chain+0x54/0x94)
[<c003e018>] (notifier_call_chain) from [<c003e220>] (notify_die+0x48/0x54)
[<c003e220>] (notify_die) from [<c0013cec>] (die+0xc8/0x374)
[<c0013cec>] (die) from [<c001e264>] (__do_kernel_fault.part.0+0x64/0x1f4)
[<c001e264>] (__do_kernel_fault.part.0) from [<c0576598>] (do_page_fault+0x234/0x3c4)
[<c0576598>] (do_page_fault) from [<c05767d4>] (do_translation_fault+0xac/0xb4)
[<c05767d4>] (do_translation_fault) from [<c0009204>] (do_DataAbort+0x44/0xc4)
[<c0009204>] (do_DataAbort) from [<c0575cfc>] (__dabt_svc+0x3c/0x60)
Exception stack(0xd5aff7b8 to 0xd5aff800)
f7a0:                                                       dafae400 00000000
f7c0: 00000000 dafae460 daf2f9c0 d5a2e780 d5869960 dafae400 d5a2e680 d5869988
f7e0: dafae400 d5aff83c d5aff840 d5aff808 c03feea0 c03fcae0 00000193 ffffffff
[<c0575cfc>] (__dabt_svc) from [<c03fcae0>] (assign_and_init_hc+0x60/0x648)
[<c03fcae0>] (assign_and_init_hc) from [<c03feea0>] (dwc_otg_hcd_select_transactions+0x26c/0x3a0)
[<c03feea0>] (dwc_otg_hcd_select_transactions) from [<c0402060>] (release_channel+0x138/0x224)
[<c0402060>] (release_channel) from [<c040251c>] (halt_channel+0x9c/0x150)
[<c040251c>] (halt_channel) from [<c0402634>] (complete_non_periodic_xfer+0x64/0x84)
[<c0402634>] (complete_non_periodic_xfer) from [<c0403334>] (handle_hc_xfercomp_intr+0x2b0/0x46c)
[<c0403334>] (handle_hc_xfercomp_intr) from [<c04047b0>] (dwc_otg_hcd_handle_hc_n_intr+0x394/0x718)
[<c04047b0>] (dwc_otg_hcd_handle_hc_n_intr) from [<c0404bec>] (dwc_otg_hcd_handle_hc_intr+0xb8/0xf8)
[<c0404bec>] (dwc_otg_hcd_handle_hc_intr) from [<c0404ea0>] (dwc_otg_hcd_handle_intr+0x274/0x370)
[<c0404ea0>] (dwc_otg_hcd_handle_intr) from [<c0401314>] (dwc_otg_hcd_irq+0x1c/0x28)
[<c0401314>] (dwc_otg_hcd_irq) from [<c03d21e0>] (usb_hcd_irq+0x34/0x48)
[<c03d21e0>] (usb_hcd_irq) from [<c005ac60>] (handle_irq_event_percpu+0x94/0x1f0)
[<c005ac60>] (handle_irq_event_percpu) from [<c005adf4>] (handle_irq_event+0x38/0x4c)
[<c005adf4>] (handle_irq_event) from [<c005db80>] (handle_level_irq+0x9c/0x134)
[<c005db80>] (handle_level_irq) from [<c005a2d4>] (generic_handle_irq+0x30/0x44)
[<c005a2d4>] (generic_handle_irq) from [<c005a56c>] (__handle_domain_irq+0x58/0xb8)
[<c005a56c>] (__handle_domain_irq) from [<c0010748>] (handle_IRQ+0x2c/0x30)
[<c0010748>] (handle_IRQ) from [<c000940c>] (bcm2835_handle_irq+0x3c/0x58)
[<c000940c>] (bcm2835_handle_irq) from [<c0575d64>] (__irq_svc+0x44/0x5c)
Exception stack(0xd5affac0 to 0xd5affb08)
fac0: 00000000 00000001 00000001 00000000 d5a68b38 da46cb80 dafe6000 40000113
fae0: dafe6480 00000000 00000000 d5affb44 d5affb10 d5affb10 c03c8880 c03c8894
fb00: 40000113 ffffffff
[<c0575d64>] (__irq_svc) from [<c03c8894>] (usbnet_start_xmit+0x570/0x658)
[<c03c8894>] (usbnet_start_xmit) from [<c049987c>] (dev_hard_start_xmit+0x258/0x3dc)
[<c049987c>] (dev_hard_start_xmit) from [<c04bb1d0>] (sch_direct_xmit+0xf8/0x1dc)
[<c04bb1d0>] (sch_direct_xmit) from [<c0499d44>] (__dev_queue_xmit+0x278/0x554)
[<c0499d44>] (__dev_queue_xmit) from [<c049a03c>] (dev_queue_xmit+0x1c/0x20)
[<c049a03c>] (dev_queue_xmit) from [<c04d4558>] (ip_finish_output2+0x18c/0x360)
[<c04d4558>] (ip_finish_output2) from [<c04d6078>] (ip_finish_output+0x168/0x250)
[<c04d6078>] (ip_finish_output) from [<c04d6ba4>] (ip_output+0xa4/0x114)
[<c04d6ba4>] (ip_output) from [<c04d630c>] (ip_local_out+0x50/0x54)
[<c04d630c>] (ip_local_out) from [<c04d6608>] (ip_queue_xmit+0x140/0x378)
[<c04d6608>] (ip_queue_xmit) from [<c04edb28>] (tcp_transmit_skb+0x448/0x914)
[<c04edb28>] (tcp_transmit_skb) from [<c04ee1e0>] (tcp_write_xmit+0x1ec/0xea0)
[<c04ee1e0>] (tcp_write_xmit) from [<c04ef274>] (tcp_push_one+0x44/0x50)
[<c04ef274>] (tcp_push_one) from [<c04e16dc>] (tcp_sendmsg+0x3f8/0xadc)
[<c04e16dc>] (tcp_sendmsg) from [<c0509548>] (inet_sendmsg+0x44/0x78)
[<c0509548>] (inet_sendmsg) from [<c047cd58>] (sock_sendmsg+0x24/0x34)
[<c047cd58>] (sock_sendmsg) from [<c047ddb0>] (SyS_sendto+0xc8/0xf0)
[<c047ddb0>] (SyS_sendto) from [<c047de00>] (SyS_send+0x28/0x30)
[<c047de00>] (SyS_send) from [<c000f820>] (ret_fast_syscall+0x0/0x1c)

@P33M
Copy link
Contributor

P33M commented Dec 22, 2016

Please provide the output of sudo lsusb -v. Also please provide information on

  • Details of any userspace programs running and accessing devices
  • The typical length of time it takes to trigger an OOPS and the usage case when the OOPS happens

@vpelletier
Copy link
Contributor

That pi is used to stream TV in my lan. Are plugged in:

  • a micro SD card adapter, to which the main sd card is rsync'ed periodically (once a month)

  • a TV tuner

  • a smartcard reader (needed to decrypt broadcast)

  • plus the b+ internal hub and ethernet adapter

The workload at crash time is: recfsusb2i (userland usb driver for the TV tuner) reading the encrypted mpeg transport stream, talking to the smartcard reader via pcscd, decrypting the stream, piping the output to a short python program which handles a tcp socket, sending the output verbatim to any connected peer.

In the case of this crash, a single host was connected.

Crashs happen with a rather low probability compared to the number of transmitted packets (TCP bandwidth when streaming is peer*2MB/s, peer being 1 or 0 in practice). For about an hour per day of streaming, crashes happen once or twice a week. I expect the TV tuner to also send 2MB/s of data and the smartcard reader just a tiny amount (not measured, though).

So, from a USB level, I expect the relevant pieces to be:

  • an interrupt endpoint (smartcard reader card inserted/removed notification), plus the interrupt endpoints for the internal hub
  • 4MB/s total bulk bandwidth usage, 2 up and 2 down

I rebooted from kdb, and have built a kernel with vmlinux image:

$ git checkout rpi-4.4.y
$ git rev-parse HEAD
5e46914b3417fe9ff42546dcacd0f41f9a0fb172
$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcmrpi_defconfig
$ nice -n 19 make -j16 ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- zImage vmlinux modules dtbs

and just booted it. Next time, I should have an actually usable kgdb... @Hexxeh: Could you build and make vmlinux available whenever you make a release ? It would help debugging.

Here are the descriptors, minus serial numbers:

Bus 001 Device 006: ID 1e3d:4082 Chipsbank Microelectronics Co., Ltd CBM4082 SD Card Reader
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x1e3d Chipsbank Microelectronics Co., Ltd
  idProduct          0x4082 CBM4082 SD Card Reader
  bcdDevice            1.00
  iManufacturer           0
  iProduct                2 Flash Disk
  iSerial                 3
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 005: ID 04e6:5120 SCM Microsystems, Inc. SCR331-DI SmartCard Reader
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        16
  idVendor           0x04e6 SCM Microsystems, Inc.
  idProduct          0x5120 SCR331-DI SmartCard Reader
  bcdDevice            6.45
  iManufacturer           1 SCM Microsystems Inc.
  iProduct                2 SCR331-DI USB Smart Card Reader
  iSerial                 5
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           93
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          3 Vendor Class
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              200mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0
      bInterfaceProtocol      0
      iInterface              4 Vendor Interface
      ** UNRECOGNIZED:  36 21 00 01 01 01 03 00 00 00 a0 0f 00 00 e0 2e 00 00 00 01 2a 00 00 29 40 05 00 00 fc 00 00 00 00 00 00 00 00 00 00 00 ba 00 01 00 07 01 00 00 ff ff 00 00 00 01
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval              16
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 004: ID 0511:0046 N'Able (DataBook) Technologies, Inc.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x0511 N'Able (DataBook) Technologies, Inc.
  idProduct          0x0046
  bcdDevice            2.00
  iManufacturer           1 DigiBest
  iProduct                2 ISDB-T 2045
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           46
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           4
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0
      bInterfaceProtocol      0
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x85  EP 5 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp. SMSC9512/9514 Fast Ethernet Adapter
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0
  bDeviceProtocol         1
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0xec00 SMSC9512/9514 Fast Ethernet Adapter
  bcdDevice            2.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0
      bInterfaceProtocol    255
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               4
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0
  bDeviceProtocol         1
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 001 Device 002: ID 0424:9514 Standard Microsystems Corp. SMC9514 Hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         2 TT per port
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0x9514 SMC9514 Hub
  bcdDevice            2.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      1 Single TT
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      2 TT per port
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             5
  wHubCharacteristic 0x000d
    Per-port power switching
    Compound device
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent      1 milli Ampere
  DeviceRemovable    0x02
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
   Port 2: 0000.0503 highspeed power enable connect
   Port 3: 0000.0103 power enable connect
   Port 4: 0000.0100 power
   Port 5: 0000.0503 highspeed power enable connect
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 Unused
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            4.04
  iManufacturer           3 Linux 4.4.38+ dwc_otg_hcd
  iProduct                2 DWC OTG Controller
  iSerial                 1 20980000.usb
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           25
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             1
  wHubCharacteristic 0x0008
    Ganged power switching
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
Device Status:     0x0001
  Self Powered

@jkuek
Copy link
Author

jkuek commented Dec 23, 2016

I haven't tried recently but when I originally reported the issue, I was able to reproduce it fairly often by disconnecting/reconnecting USB devices from the RPi. The disconnects were either manual, or by changing the buspower file as documented.

It was definitely easier to reproduce with multiple USB devices plugged in, but the issue did still occur with a single USB device (I tried both USB audio and USB 3g modem)

My conclusion is there was something happening when multiple USB devices (or a composite device with many interfaces) are enumerating...

@vpelletier
Copy link
Contributor

vpelletier commented Dec 24, 2016

I got a similar crash and have a working gdb this time. It turns out I needed more than just vmlinux to get full debug symbols.

The crash is still in assign_and_init_hc, but this time it is the userland driver which crashed.

kdb output:

Entering kdb (current=0xd8b06d00, pid 4391) Oops: (null)
due to oops @ 0xc04039a8
CPU: 0 PID: 4391 Comm: recfsusb2i Not tainted 4.4.39+ #76
Hardware name: BCM2708
task: d8b06d00 ti: d4c46000 task.ti: d4c46000
PC is at assign_and_init_hc+0x60/0x648
LR is at dwc_otg_hcd_select_transactions+0xb8/0x3a4
pc : [<c04039a8>]    lr : [<c0405ba4>]    psr: 00000193
sp : d4c47de0  ip : d4c47e20  fp : d4c47e1c
r10: d8ba6208  r9 : dafae460  r8 : d8bb3a00
r7 : dafae400  r6 : d8ba61e0  r5 : d8bb30c0  r4 : daf2f960
r3 : dafae460  r2 : 00000000  r1 : 00000000  r0 : dafae400
Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 00c5387d  Table: 14ca0008  DAC: 00000055
CPU: 0 PID: 4391 Comm: recfsusb2i Not tainted 4.4.39+ #76
Hardware name: BCM2708
[<c0016a1c>] (unwind_backtrace) from [<c0013a14>] (show_stack+0x20/0x24)
[<c0013a14>] (show_stack) from [<c02e9ee4>] (dump_stack+0x20/0x28)
[<c02e9ee4>] (dump_stack) from [<c0010850>] (show_regs+0x1c/0x20)
[<c0010850>] (show_regs) from [<c00980b0>] (kdb_main_loop+0x32c/0x870)
[<c00980b0>] (kdb_main_loop) from [<c009ad04>] (kdb_stub+0x1c4/0x3f8)
[<c009ad04>] (kdb_stub) from [<c0091238>] (kgdb_handle_exception+0x2a8/0x7b8)
[<c0091238>] (kgdb_handle_exception) from [<c0015f74>] (kgdb_notify+0x3c/0x94)
[<c0015f74>] (kgdb_notify) from [<c003e594>] (notifier_call_chain+0x54/0x90)
[<c003e594>] (notifier_call_chain) from [<c003e798>] (notify_die+0x48/0x54)
[<c003e798>] (notify_die) from [<c0013ad8>] (die+0xc0/0x340)
[<c0013ad8>] (die) from [<c001e16c>] (__do_kernel_fault.part.0+0x64/0x1f4)
[<c001e16c>] (__do_kernel_fault.part.0) from [<c057e3f8>] (do_page_fault+0x1e8/0x414)
[<c057e3f8>] (do_page_fault) from [<c000922c>] (do_DataAbort+0x48/0xc8)
[<c000922c>] (do_DataAbort) from [<c057dbbc>] (__dabt_svc+0x3c/0x60)
Exception stack(0xd4c47d90 to 0xd4c47dd8)
7d80:                                     dafae400 00000000 00000000 dafae460
7da0: daf2f960 d8bb30c0 d8ba61e0 dafae400 d8bb3a00 dafae460 d8ba6208 d4c47e1c
7dc0: d4c47e20 d4c47de0 c0405ba4 c04039a8 00000193 ffffffff
[<c057dbbc>] (__dabt_svc) from [<c04039a8>] (assign_and_init_hc+0x60/0x648)
[<c04039a8>] (assign_and_init_hc) from [<c0405ba4>] (dwc_otg_hcd_select_transactions+0xb8/0x3a4)
[<c0405ba4>] (dwc_otg_hcd_select_transactions) from [<c0409a40>] (dwc_otg_hcd_handle_sof_intr+0xd0/0x120)
[<c0409a40>] (dwc_otg_hcd_handle_sof_intr) from [<c040bccc>] (dwc_otg_hcd_handle_intr+0x2d4/0x360)
[<c040bccc>] (dwc_otg_hcd_handle_intr) from [<c0408188>] (dwc_otg_hcd_irq+0x1c/0x28)
[<c0408188>] (dwc_otg_hcd_irq) from [<c03d907c>] (usb_hcd_irq+0x34/0x48)
[<c03d907c>] (usb_hcd_irq) from [<c005b3f8>] (handle_irq_event_percpu+0x58/0x218)
[<c005b3f8>] (handle_irq_event_percpu) from [<c005b5f0>] (handle_irq_event+0x38/0x4c)
[<c005b5f0>] (handle_irq_event) from [<c005e3f4>] (handle_level_irq+0xa4/0x13c)
[<c005e3f4>] (handle_level_irq) from [<c005aab0>] (generic_handle_irq+0x30/0x44)
[<c005aab0>] (generic_handle_irq) from [<c005ad48>] (__handle_domain_irq+0x58/0xb8)
[<c005ad48>] (__handle_domain_irq) from [<c0010504>] (handle_IRQ+0x28/0x2c)
[<c0010504>] (handle_IRQ) from [<c000943c>] (bcm2835_handle_irq+0x44/0x5c)
[<c000943c>] (bcm2835_handle_irq) from [<c057def4>] (__irq_usr+0x54/0x80)
Exception stack(0xd4c47fb0 to 0xd4c47ff8)
7fa0:                                     00000004 00000003 ee5a6c66 d28c953f
7fc0: f2fe785c 07fa333e 3afb0755 2071e31d cf722380 890575ba aa3c7534 4677563a
7fe0: c2b528aa bed679f8 3d01346b b6f7736c 20000010 ffffffff

reassembled gdb session:

(gdb) lx-dmesg
[...]
[81169.614263] Unable to handle kernel NULL pointer dereference at virtual address 00000034
[81169.622633] pgd = d4ca0000
[81169.625424] [00000034] *pgd=18a15831, *pte=00000000, *ppte=00000000
[81169.631923] Internal error: Oops: 817 [#1] ARM
[81269.525220] Unable to handle kernel NULL pointer dereference at virtual address 00000034
[81269.542157] pgd = c0004000
[81269.549264] [00000034] *pgd=00000000
[81269.557194] Internal error: Oops: 817 [#2] ARM
[81269.566015] KGDB: waiting... or $3#33 for KDB
(gdb) bt
#0  0xc04039a8 in assign_and_init_hc (hcd=0xdafae400, qh=0x0 <__vectors_start>) at drivers/usb/host/dwc_otg/dwc_otg_hcd.c:1204
#1  0xc0405ba4 in dwc_otg_hcd_select_transactions (hcd=0xdafae400) at drivers/usb/host/dwc_otg/dwc_otg_hcd.c:1997
#2  0xc04063a4 in dwc_otg_hcd_urb_enqueue (hcd=0xdafae400, dwc_otg_urb=<optimized out>, ep_handle=0xda48b9d8, atomic_alloc=<optimized out>) at drivers/usb/host/dwc_otg/dwc_otg_hcd.c:550
#3  0xc0407ffc in dwc_otg_urb_enqueue (hcd=0xdaf52e00, urb=0xda475000, mem_flags=<optimized out>) at drivers/usb/host/dwc_otg/dwc_otg_hcd_linux.c:818
#4  0xc03db20c in usb_hcd_submit_urb (urb=0xda475000, mem_flags=37748736) at drivers/usb/core/hcd.c:1624
#5  0xc03dcb48 in usb_submit_urb (urb=0xda475000, mem_flags=37748736) at drivers/usb/core/urb.c:542
#6  0xc041464c in usb_stor_msg_common (us=0xdafae400, timeout=0) at drivers/usb/storage/transport.c:154
#7  0xc0414b10 in usb_stor_bulk_transfer_buf (us=0xda5483f0, pipe=3221259776, buf=<optimized out>, length=31, act_len=0x0 <__vectors_start>) at drivers/usb/storage/transport.c:401
#8  0xc0414dc0 in usb_stor_Bulk_transport (srb=0xd4c63300, us=0xda5483f0) at drivers/usb/storage/transport.c:1106
#9  0xc04156e8 in usb_stor_invoke_transport (srb=0xd4c63300, us=0xda5483f0) at drivers/usb/storage/transport.c:604
#10 0xc0414330 in usb_stor_transparent_scsi_command (srb=<optimized out>, us=<optimized out>) at drivers/usb/storage/protocol.c:122
#11 0xc0416cb4 in usb_stor_control_thread (__us=0xda5483f0) at drivers/usb/storage/usb.c:381
#12 0xc003d7e0 in kthread (_create=0xda471d60) at kernel/kthread.c:209
#13 0xc000f668 in ret_from_fork () at arch/arm/kernel/entry-common.S:118
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) print hcd->free_hc_list.cqh_first.hc_list_entry
$41 = {cqe_next = 0x0 <__vectors_start>, cqe_prev = 0x0 <__vectors_start>}
(gdb) disassemble /m
Dump of assembler code for function assign_and_init_hc:
[...]
1201            hc = DWC_CIRCLEQ_FIRST(&hcd->free_hc_list);
   0xc040398c <+68>:    mov     r3, r7
   0xc0403990 <+72>:    ldr     r4, [r3, #96]!  ; 0x60

1202
1203            /* Remove the host channel from the free list. */
1204            DWC_CIRCLEQ_REMOVE_INIT(&hcd->free_hc_list, hc, hc_list_entry);
   0xc0403994 <+76>:    ldr     r2, [r4, #48]   ; 0x30
   0xc0403998 <+80>:    cmp     r2, r3
   0xc040399c <+84>:    ldreq   r2, [r4, #52]   ; 0x34
   0xc04039a0 <+88>:    ldrne   r1, [r4, #52]   ; 0x34
   0xc04039a4 <+92>:    streq   r2, [r7, #100]  ; 0x64
=> 0xc04039a8 <+96>:    strne   r1, [r2, #52]   ; 0x34
   0xc04039ac <+100>:   ldr     r2, [r4, #52]   ; 0x34
   0xc04039b0 <+104>:   cmp     r3, r2
   0xc04039b4 <+108>:   ldr     r3, [r4, #48]   ; 0x30
   0xc04039b8 <+112>:   strne   r3, [r2, #48]   ; 0x30
   0xc04039bc <+116>:   streq   r3, [r7, #96]   ; 0x60
   0xc04039c0 <+120>:   mov     r2, #0
   0xc04039c8 <+128>:   str     r2, [r4, #48]   ; 0x30
   0xc04039cc <+132>:   str     r2, [r4, #52]   ; 0x34

1205
1206            qh->channel = hc;
   0xc04039d0 <+136>:   str     r4, [r6, #16]

So the null pointer dereference happens when removing an entry from free_hc_list.cqh_first.

I'll keep reading the code and report if I find something - but if anyone is familiar with this code, please do not wait for me.

[edit]: a bit more from the gdb session:

(gdb) print *hcd
$13 = {
  otg_dev = 0xdaf952e0,
  core_if = 0xdafae000,
  fops = 0xc08769a0 <hcd_fops>,
  flags = {
    d32 = 2,
    b = {
      port_connect_status_change = 0,
      port_connect_status = 1,
      port_reset_change = 0,
      port_enable_change = 0,
      port_suspend_change = 0,
      port_over_current_change = 0,
      port_l1_change = 0,
      reserved = 0
    }
  },
  non_periodic_sched_inactive = {next = 0xda480ec8, prev = 0xda480ec8},
  non_periodic_sched_active = {next = 0xd8bdabc8, prev = 0xda6ef088},
  non_periodic_qh_ptr = 0xd8bdabc8,
  periodic_sched_inactive = {next = 0xda6ef388, prev = 0xdaf9f868},
  periodic_sched_ready = {next = 0xd8ba6208, prev = 0xd8ba6208},
  periodic_sched_assigned = {next = 0xdafae438, prev = 0xdafae438},
  periodic_sched_queued = {next = 0xdafae440, prev = 0xdafae440},
  periodic_usecs = 3,
  frame_usecs = {97, 100, 100, 100, 100, 100, 30, 0},
  frame_number = 1791,
  periodic_qh_count = 3,
  free_hc_list = {cqh_first = 0xdaf2f960, cqh_last = 0xdaf2fa20},
  periodic_channels = 0,
  non_periodic_channels = 0,
  available_host_channels = 51,
  hc_ptr_array = {
    0xdaf2f960,
    0xdaf2f9c0,
    0xdaf2fa20,
    0xdaf2fa80,
    0xdaf2fae0,
    0xdaf2fb40,
    0xdaf2fba0,
    0xdaf2fc00,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>,
    0x0 <__vectors_start>
  },
  status_buf = 0xdb812000 "",
  status_buf_dma = 1535188992,
  conn_timer = 0xdaf95540,
  reset_tasklet = 0xdaf95580,
  completion_tasklet = 0xdaf955a0,
  completed_urb_list = {tqh_first = 0x0 <__vectors_start>, tqh_last = 0xdafae4c8},
  lock = 0x1 <__vectors_start>,
  channel_lock = 0x1 <__vectors_start>,
  priv = 0xdaf52e00,
  otg_port = 1 '\001',
  frame_list = 0x0 <__vectors_start>,
  hub_port = {0 <repeats 128 times>},
  frame_list_dma = 0,
  fiq_stack = 0xdac36000,
  fiq_state = 0xdafae800,
  fiq_dmab = 0xdb814000
}

[edit2] Dumping details from free_hc_list and hc_ptr_array

(gdb) print  hcd->core_if->core_params->host_channels
$53 = 8
(gdb) print &hcd->free_hc_list
$54 = (struct hc_list *) 0xdafae460

(gdb) print hcd->hc_ptr_array[0]->hc_list_entry
$45 = {cqe_next = 0x0 <__vectors_start>, cqe_prev = 0x0 <__vectors_start>}
(gdb) print hcd->hc_ptr_array[1]->hc_list_entry
$46 = {cqe_next = 0x0 <__vectors_start>, cqe_prev = 0x0 <__vectors_start>}
(gdb) print hcd->hc_ptr_array[2]->hc_list_entry
$47 = {cqe_next = 0xdafae460, cqe_prev = 0xdaf2fa80}
(gdb) print hcd->hc_ptr_array[3]->hc_list_entry
$48 = {cqe_next = 0xdaf2fa20, cqe_prev = 0xdaf2f960}
(gdb) print hcd->hc_ptr_array[4]->hc_list_entry
$49 = {cqe_next = 0xdaf2fa80, cqe_prev = 0xdaf2fa80}
(gdb) print hcd->hc_ptr_array[5]->hc_list_entry
$50 = {cqe_next = 0xdaf2f9c0, cqe_prev = 0xdaf2f9c0}
(gdb) print hcd->hc_ptr_array[6]->hc_list_entry
$51 = {cqe_next = 0xdaf2f9c0, cqe_prev = 0xdaf2f9c0}
(gdb) print hcd->hc_ptr_array[7]->hc_list_entry
$52 = {cqe_next = 0xdaf2fb40, cqe_prev = 0xdaf2fb40}

So hc_ptr_array contains what looks like a happy mess of links (circular links being asymetric in many places)... Which I fail to understand so far:

  • So free_hc_list contains entries 0, 2 and 3 when following cqh_last on head and cqe_prev on each element. 3's cqe_next is free_hc_list. 0 is the broken entry.

  • Entry 4 is linked (next & prev) to entry 2.

  • Entries 5 and 6 are linked (next & prev) to entry 1, which has null links. Entry 7 is linked (next & prev) to entry 5.

@vpelletier
Copy link
Contributor

So, after more investigations...

The qh=0 in #0 0xc04039a8 in assign_and_init_hc (hcd=0xdafae400, qh=0x0 <__vectors_start>) at drivers/usb/host/dwc_otg/dwc_otg_hcd.c:1204 is a red herring: just r1 being being reused early in the function.

The issue seems to revolve around free_hc_list getting broken, until the first entry has null pointers in its hc_list_entry. At a glance, the circular list operations are fine. The pattern seems to be producer/consumer, producer being an interrupt handler (and related ddma code), and consumer being dwc_otg_hcd.c, initialising the list on initialisation, filling it on hcd role start and "doing stuff" when the free list is not empty. I did not check locking yet, and am trying to see if there would be a "surprise" re-init which could confuse the data structure. I'm not sure how to add tests for locking.

@vpelletier
Copy link
Contributor

For the reference, I switched to dwc2 driver (loading the proper devicetree overlay), and did not have a crash so far. If I understand correctly, the dwc2 module lacks some ARM-specific IRQ-ish mechanism the out-of[-vanilla]-tree driver has. As the hc_ptr_array items are produced in the interrupt handler it makes some sense.

At a glance, I did not notice a system load increase, but this rpi is comfortably overspec for what it does so it would take a large one for an increase to be noticeable.

@jkuek
Copy link
Author

jkuek commented Jan 23, 2017

Thanks for the update. Last time I tried the dwc2 driver, I had some problems with other USB full-speed devices, eg. keyboards. I'm not sure whether that is still the case.

@P33M
Copy link
Contributor

P33M commented May 4, 2017

There's been several bugfixes that may have a bearing on this issue. Please retest with a Pi that has latest rpi-update kernel.

@jkuek
Copy link
Author

jkuek commented May 8, 2017

I re-ran the tests today, and could still cause the system to throw a kernel panic.

I've attached the console output.
console.txt

For completeness, here is my USB set-up:

pi@raspberrypi:~$ lsusb -t
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 2: Dev 6, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 4: Dev 4, If 1, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 4, If 2, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 4, If 0, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 4, If 7, Class=Vendor Specific Class, Driver=sierra_net, 480M
        |__ Port 4: Dev 4, If 3, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 5: Dev 5, If 0, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 5, If 1, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 5, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 5, If 3, Class=Human Interface Device, Driver=usbhid, 12M
pi@raspberrypi:~$ lsusb
Bus 001 Device 005: ID 0d8c:0016 C-Media Electronics, Inc.
Bus 001 Device 004: ID 1199:68a3 Sierra Wireless, Inc. MC8700 Modem
Bus 001 Device 006: ID 1a2c:0021 China Resource Semico Co., Ltd Keyboard
Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp. SMSC9512/9514 Fast Ethernet Adapter
Bus 001 Device 002: ID 0424:9514 Standard Microsystems Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

To trigger the panic, I am running this script:

#!/bin/bash
tail -f /var/log/syslog &

while :
do
	./usb
	sleep 10
done

which calls this script:

echo 0 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower
sleep 5
echo 1 >  /sys/bus/platform/drivers/dwc_otg/20980000.usb/buspower

I ran it a few times, the issue appears to happen randomly

@P33M
Copy link
Contributor

P33M commented May 8, 2017

I'm running the same script (with a few modifications) with 3x full-speed devices and a keyboard in addition to a connected eth0 interface.

I'm not seeing a crash, which implies that whatever happening is a function of what devices you have connected (:()
What does userspace do when the devices get connected? Do programs start automatically using the devices?

@P33M
Copy link
Contributor

P33M commented May 8, 2017

Ah. I have seen one crash after ~30 minutes of runtime. Investigating...

@P33M
Copy link
Contributor

P33M commented May 8, 2017

I've also seen two crashes where in dwc_otg_hcd_handle_hc_fsm(), qh->qtd_list ends up being NULL. Unsure if this is related to having a corrupted free_hc_list.

I have found that hcd->available_host_channels gets steadily incremented above the hardware maximum of 8 channels as we potentially increment it twice when dequeueing a transfer currently being performed by the FIQ. This may or may not be enough to trigger double-frees/double-adds in free_hc_list...

@P33M
Copy link
Contributor

P33M commented May 8, 2017

Smoking gun located: we modify host channel lists, queue heads and URB lists in kill_urbs_in_qh_list(), which is called from cil_hcd_disconnect(). It appears that we do this without taking either the HCD lock or the host channel lock.

@jkuek
Copy link
Author

jkuek commented May 9, 2017

Sounds like great progress. As I've mentioned previously, different combinations of USB devices do seem to influence how often the crash occurs, but I've seen it fail with just a USB keyboard attached.

@jkuek
Copy link
Author

jkuek commented May 9, 2017

@P33M There was a comment in this issue from HiassofT on 4 Feb 2016. I don't claim to understand it, but it might help get to the bottom of what's going on.

@P33M
Copy link
Contributor

P33M commented May 9, 2017

The more I dig the more wrong-ness I keep finding...

  • As far as I can tell, hcd->channel_lock is basically useless as it's only held when the hcd->lock is being held.
  • There are multiple call sites for cil_hcd_disconnect(), and depending on the state of the OTG core (device, B host or A host) the hcd->lock is held in A host but dropped in the other two cases (?)
  • It's possible to queue a FIQ split transaction with the state machine not in the idle state. This is easily reproducible as the second or third transfer when enumerating any low- or full-speed device typically ends up in this state. This is bad because based on my current understanding, the host channel shouldn't even be considered for a transfer if one is still running.

@P33M
Copy link
Contributor

P33M commented May 10, 2017

As far as point 2) goes, cil_hcd_disconnect() is called while holding the core lock but not the hcd lock. After adding the HCD lock inside the disconnect callback, I no longer get spurious crashes in dwc_otg_hcd_handle_hc_fsm() (or at least, I've not seen any more) but the crash due to a corrupt free_hc_list in assign_and_init_hc() is still there.
hcd->available_host_channels is still getting incremented above the hardware maximum which is baffling.

I think I will swap away from root port disconnect crashes while I have a more reproducible bug pending (queueing transactions without a sane initial FIQ state).

@P33M
Copy link
Contributor

P33M commented May 12, 2017

Point 3) is a small wild goose chase - because there are only two non-periodic transaction queues (inactive and active), on each scheduling call dwc_otg_queue_transactions() will repeatedly process queue heads that have already been assigned to a channel. I've replicated the early-out mechanism for FIQ enabled channels.

I've fixed several potential crash-inducing conditions due to the FIQ racing against cil_hcd_disconnect() and as far as I can tell, the bug has gone down in severity but is still present - now I get a much longer interval between crashes in assign_and_init_hc(), but before then the driver "forgets" about host channels that it has available and no USB transfers happen after that point.

@P33M
Copy link
Contributor

P33M commented May 15, 2017

I believe I have a fix for the list corruption - in kill_all_urbs()->kill_urbs_in_qh_list(), if a transfer is in progress, the channel is forcibly halted but HAINTMSK is not altered. Subsequently in dwc_otg_hcd_disconnect_cb() the channel is unconditionally cleaned up (all interrupt sources masked) and reinserted into free_hc_list.

The corruption happens because if the CPU is slow in executing the second unnecessary disable, a bit in HAINT is set as the channel already halted, which then triggers ...handle_hc_n_intr() which results in the channel being inserted into the list a second time.

The fix is to just let the dequeue code path run to completion and delete the second cleanup.

@jkuek
Copy link
Author

jkuek commented May 16, 2017

I have run the test script for a few hours - so far so good.
I notice a number of new error messages show up in syslog:

May 16 14:18:23 raspberrypi kernel: [ 8600.477366] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 14:18:23 raspberrypi kernel: [ 8600.477366] 
May 16 14:18:23 raspberrypi kernel: [ 8600.477481] input: USB USB Keykoard as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2:1.1/0003:1A2C:0021.0006/input/input5
May 16 14:18:23 raspberrypi kernel: [ 8600.477542] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 14:18:23 raspberrypi kernel: [ 8600.477542] 
May 16 14:18:23 raspberrypi kernel: [ 8600.505325] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 14:18:23 raspberrypi kernel: [ 8600.505325] 
May 16 14:18:24 raspberrypi kernel: [ 8600.545356] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 14:18:24 raspberrypi kernel: [ 8600.545356] 

I've left the unit to run overnight, will report again tomorrow.

As an aside, I've seen issues on current 4.4.x kernels where sometimes disconnecting/reconnecting my 3G modem can seemingly cause all other USB devices to disconnect and reconnect. I had put this down as a probably hardware problem (perhaps power-related?), but have you seen anything in your recent travels that might suggest a software reason?

@P33M
Copy link
Contributor

P33M commented May 16, 2017

3G modems typically play fast and loose with the USB maximum Vbus current draw (500mA). Peak loading when the modem is transmitting can exceed several watts.

I can well imagine that the power supplies draw a surge on hotplug that upsets Vbus on the Pi.

I've run the disconnect/reconnect script overnight with 3 full-speed devices plugged in, no crashes resulted. Closing as fixed.

@P33M P33M closed this as completed May 16, 2017
@jkuek
Copy link
Author

jkuek commented May 16, 2017

Ah yes, but in my case the modem is externally powered, and it's just a USB connection to the Pi.

I concur with your findings, I ran the script overnight and the unit is still running! Thanks.

Will those syslog error messages be in the official release? The additional extra empty line could be removed.

@jkuek
Copy link
Author

jkuek commented May 17, 2017

@P33M just noticed one issue:

I stopped the script after running it for > 24 hours, but my USB network and 3G devices were no longer working.

My USB devices were present:

pi@raspberrypi:~$ lsusb
Bus 001 Device 100: ID 0d8c:0016 C-Media Electronics, Inc.
Bus 001 Device 099: ID 1199:68a3 Sierra Wireless, Inc. MC8700 Modem
Bus 001 Device 101: ID 1a2c:0021 China Resource Semico Co., Ltd Keyboard
Bus 001 Device 098: ID 0424:ec00 Standard Microsystems Corp. SMSC9512/9514 Fast Ethernet Adapter
Bus 001 Device 097: ID 0424:9514 Standard Microsystems Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
pi@raspberrypi:~$ lsusb -t
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 1: Dev 97, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 1: Dev 98, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 2: Dev 101, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 2: Dev 101, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
        |__ Port 4: Dev 99, If 1, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 99, If 2, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 99, If 0, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 4: Dev 99, If 7, Class=Vendor Specific Class, Driver=sierra_net, 480M
        |__ Port 4: Dev 99, If 3, Class=Vendor Specific Class, Driver=sierra, 480M
        |__ Port 5: Dev 100, If 0, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 100, If 1, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 100, If 2, Class=Audio, Driver=snd-usb-audio, 12M
        |__ Port 5: Dev 100, If 3, Class=Human Interface Device, Driver=usbhid, 12M
pi@raspberrypi:~$

However, neither eth0 nor wwan0 (3g modem) are accessible.

pi@raspberrypi:~$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:90 errors:0 dropped:0 overruns:0 frame:0
          TX packets:90 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:12554 (12.2 KiB)  TX bytes:12554 (12.2 KiB)
pi@raspberrypi:~$ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
12436: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether b8:27:eb:8f:13:fc brd ff:ff:ff:ff:ff:ff
12437: wwan0: <BROADCAST,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether de:65:07:14:49:07 brd ff:ff:ff:ff:ff:ff

I checked the syslog and the dhcpcd.service failed about 2 hours after starting the test

May 16 16:35:31 raspberrypi kernel: [ 8074.530821] Indeed it is in host mode hprt0 = 00021501
May 16 16:35:31 raspberrypi kernel: [ 8074.730699] usb 1-1: new high-speed USB device number 18 using dwc_otg
May 16 16:35:31 raspberrypi kernel: [ 8074.730914] Indeed it is in host mode hprt0 = 00001101
May 16 16:35:31 raspberrypi kernel: [ 8074.961137] usb 1-1: New USB device found, idVendor=0424, idProduct=9514
May 16 16:35:31 raspberrypi kernel: [ 8074.961160] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
May 16 16:35:31 raspberrypi kernel: [ 8074.977154] hub 1-1:1.0: USB hub found
May 16 16:35:31 raspberrypi kernel: [ 8074.977326] hub 1-1:1.0: 5 ports detected
May 16 16:35:32 raspberrypi kernel: [ 8075.290717] usb 1-1.1: new high-speed USB device number 19 using dwc_otg
May 16 16:35:32 raspberrypi kernel: [ 8075.421146] usb 1-1.1: New USB device found, idVendor=0424, idProduct=ec00
May 16 16:35:32 raspberrypi kernel: [ 8075.421169] usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
May 16 16:35:32 raspberrypi kernel: [ 8075.424638] smsc95xx v1.0.5
May 16 16:35:32 raspberrypi kernel: [ 8075.518008] smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-20980000.usb-1.1, smsc95xx USB 2.0 Ethernet, b8:27:eb:8f:13:fc
May 16 16:35:32 raspberrypi kernel: [ 8075.610701] usb 1-1.4: new high-speed USB device number 20 using dwc_otg
May 16 16:35:32 raspberrypi kernel: [ 8075.742809] usb 1-1.4: config 1 has an invalid interface number: 7 but max is 4
May 16 16:35:32 raspberrypi kernel: [ 8075.742831] usb 1-1.4: config 1 has no interface number 4
May 16 16:35:32 raspberrypi kernel: [ 8075.744753] usb 1-1.4: New USB device found, idVendor=1199, idProduct=68a3
May 16 16:35:32 raspberrypi kernel: [ 8075.744772] usb 1-1.4: New USB device strings: Mfr=3, Product=2, SerialNumber=0
May 16 16:35:32 raspberrypi kernel: [ 8075.744783] usb 1-1.4: Product: SL8084T
May 16 16:35:32 raspberrypi kernel: [ 8075.744793] usb 1-1.4: Manufacturer: Sierra Wireless, Incorporated
May 16 16:35:32 raspberrypi kernel: [ 8075.762976] sierra 1-1.4:1.0: Sierra USB modem converter detected
May 16 16:35:32 raspberrypi kernel: [ 8075.764075] usb 1-1.4: Sierra USB modem converter now attached to ttyUSB0
May 16 16:35:32 raspberrypi kernel: [ 8075.764871] sierra 1-1.4:1.1: Sierra USB modem converter detected
May 16 16:35:32 raspberrypi kernel: [ 8075.765968] usb 1-1.4: Sierra USB modem converter now attached to ttyUSB1
May 16 16:35:32 raspberrypi kernel: [ 8075.766733] sierra 1-1.4:1.2: Sierra USB modem converter detected
May 16 16:35:32 raspberrypi kernel: [ 8075.775632] usb 1-1.4: Sierra USB modem converter now attached to ttyUSB2
May 16 16:35:32 raspberrypi kernel: [ 8075.776476] sierra 1-1.4:1.3: Sierra USB modem converter detected
May 16 16:35:32 raspberrypi kernel: [ 8075.777822] usb 1-1.4: Sierra USB modem converter now attached to ttyUSB3
May 16 16:35:32 raspberrypi kernel: [ 8075.793309] sierra_net 1-1.4:1.7 wwan0: register 'sierra_net' at usb-20980000.usb-1.4, Sierra Wireless USB-to-WWAN Modem, de:65:07:14:fc:07
May 16 16:35:32 raspberrypi kernel: [ 8075.900766] usb 1-1.5: new full-speed USB device number 21 using dwc_otg
May 16 16:35:32 raspberrypi kernel: [ 8076.039883] usb 1-1.5: New USB device found, idVendor=0d8c, idProduct=0016
May 16 16:35:32 raspberrypi kernel: [ 8076.039907] usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
May 16 16:35:32 raspberrypi kernel: [ 8076.039918] usb 1-1.5: Product: USB Audio Device
May 16 16:35:32 raspberrypi kernel: [ 8076.039928] usb 1-1.5: Manufacturer: C-Media Electronics Inc.
May 16 16:35:32 raspberrypi kernel: [ 8076.103015] input: C-Media Electronics Inc. USB Audio Device as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.5/1-1.5:1.3/0003:0D8C:0016.05F2/input/input1521
May 16 16:35:32 raspberrypi kernel: [ 8076.171643] hid-generic 0003:0D8C:0016.05F2: input,hidraw0: USB HID v1.00 Device [C-Media Electronics Inc. USB Audio Device] on usb-20980000.usb-1.5/input3
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: adding address fe80::9b31:ff3e:31f5:e93e
May 16 16:35:33 raspberrypi kernel: [ 8076.470769] usb 1-1.2: new low-speed USB device number 22 using dwc_otg
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: waiting for carrier
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: carrier acquired
May 16 16:35:33 raspberrypi kernel: [ 8076.616668] usb 1-1.2: New USB device found, idVendor=1a2c, idProduct=0021
May 16 16:35:33 raspberrypi kernel: [ 8076.616693] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
May 16 16:35:33 raspberrypi kernel: [ 8076.616705] usb 1-1.2: Product: USB Keykoard
May 16 16:35:33 raspberrypi kernel: [ 8076.616715] usb 1-1.2: Manufacturer: USB
May 16 16:35:33 raspberrypi kernel: [ 8076.629507] input: USB USB Keykoard as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2:1.0/0003:1A2C:0021.05F3/input/input1522
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: IAID 07:14:fc:07
May 16 16:35:33 raspberrypi kernel: [ 8076.692634] hid-generic 0003:1A2C:0021.05F3: input,hidraw1: USB HID v1.10 Keyboard [USB USB Keykoard] on usb-20980000.usb-1.2/input0
May 16 16:35:33 raspberrypi dhcpcd[514]: eth0: adding address fe80::e3be:a19f:6472:355d
May 16 16:35:33 raspberrypi dhcpcd[514]: eth0: waiting for carrier
May 16 16:35:33 raspberrypi kernel: [ 8076.987227] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
May 16 16:35:33 raspberrypi kernel: [ 8076.988534] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: rebinding lease of 100.74.94.245
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: leased 100.74.94.245 for 259200 seconds
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: forcing router 100.74.94.245 through interface
May 16 16:35:33 raspberrypi dhcpcd[514]: wwan0: adding route to 0.0.0.0/0
May 16 16:35:34 raspberrypi dhcpcd[514]: wwan0: soliciting an IPv6 router
May 16 16:35:34 raspberrypi ntpd[538]: Listen normally on 1528 wwan0 100.74.94.245 UDP 123
May 16 16:35:34 raspberrypi ntpd[538]: Listen normally on 1529 wwan0 fe80::9b31:ff3e:31f5:e93e UDP 123
May 16 16:35:34 raspberrypi ntpd[538]: peers refreshed
May 16 16:35:35 raspberrypi kernel: [ 8078.535492] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
May 16 16:35:35 raspberrypi dhcpcd[514]: eth0: carrier acquired
May 16 16:35:35 raspberrypi kernel: [ 8078.543181] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC1E1
May 16 16:35:35 raspberrypi dhcpcd[514]: eth0: IAID eb:8f:13:fc
May 16 16:35:35 raspberrypi dhcpcd[514]: eth0: soliciting a DHCP lease
May 16 16:35:35 raspberrypi dhcpcd[514]: eth0: soliciting an IPv6 router
May 16 16:35:37 raspberrypi ntpd[538]: Listen normally on 1530 eth0 fe80::e3be:a19f:6472:355d UDP 123
May 16 16:35:37 raspberrypi ntpd[538]: peers refreshed
May 16 16:35:40 raspberrypi dhcpcd[514]: eth0: probing for an IPv4LL address
May 16 16:35:40 raspberrypi kernel: [ 8084.173399] input: USB USB Keykoard as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.2/1-1.2:1.1/0003:1A2C:0021.05F4/input/input1523
May 16 16:35:40 raspberrypi kernel: [ 8084.173468] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:40 raspberrypi kernel: [ 8084.173468]
May 16 16:35:40 raspberrypi kernel: [ 8084.173688] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:40 raspberrypi kernel: [ 8084.173688]
May 16 16:35:40 raspberrypi kernel: [ 8084.173808] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:40 raspberrypi kernel: [ 8084.173808]
May 16 16:35:41 raspberrypi kernel: [ 8084.200688] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:41 raspberrypi kernel: [ 8084.200688]
May 16 16:35:41 raspberrypi kernel: [ 8084.240688] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:41 raspberrypi kernel: [ 8084.240688]
May 16 16:35:41 raspberrypi kernel: [ 8084.242300] hid-generic 0003:1A2C:0021.05F4: input,hidraw2: USB HID v1.10 Device [USB USB Keykoard] on usb-20980000.usb-1.2/input1
May 16 16:35:41 raspberrypi kernel: [ 8084.310745] ERROR::dwc_otg_hcd_urb_enqueue:487: Not connected
May 16 16:35:41 raspberrypi kernel: [ 8084.310745]
May 16 16:35:41 raspberrypi dhcpcd[514]: eth0: arp if_readrawpacket: Network is down
May 16 16:35:41 raspberrypi dhcpcd[514]: eth0: carrier lost
May 16 16:35:41 raspberrypi kernel: [ 8084.331008] usb 1-1: USB disconnect, device number 18
May 16 16:35:41 raspberrypi kernel: [ 8084.331033] usb 1-1.1: USB disconnect, device number 19
May 16 16:35:41 raspberrypi kernel: [ 8084.331438] smsc95xx 1-1.1:1.0 eth0: unregister 'smsc95xx' usb-20980000.usb-1.1, smsc95xx USB 2.0 Ethernet
May 16 16:35:41 raspberrypi kernel: [ 8084.331520] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
May 16 16:35:41 raspberrypi kernel: [ 8084.351870] usb 1-1.2: USB disconnect, device number 22
May 16 16:35:41 raspberrypi kernel: [ 8084.365330] usb 1-1.4: USB disconnect, device number 20
May 16 16:35:41 raspberrypi kernel: [ 8084.380602] sierra ttyUSB0: Sierra USB modem converter now disconnected from ttyUSB0
May 16 16:35:41 raspberrypi kernel: [ 8084.380840] sierra 1-1.4:1.0: device disconnected
May 16 16:35:41 raspberrypi kernel: [ 8084.383358] sierra ttyUSB1: Sierra USB modem converter now disconnected from ttyUSB1
May 16 16:35:41 raspberrypi kernel: [ 8084.383501] sierra 1-1.4:1.1: device disconnected
May 16 16:35:41 raspberrypi kernel: [ 8084.392216] sierra ttyUSB2: Sierra USB modem converter now disconnected from ttyUSB2
May 16 16:35:41 raspberrypi kernel: [ 8084.392336] sierra 1-1.4:1.2: device disconnected
May 16 16:35:41 raspberrypi kernel: [ 8084.402119] sierra ttyUSB3: Sierra USB modem converter now disconnected from ttyUSB3
May 16 16:35:41 raspberrypi kernel: [ 8084.402246] sierra 1-1.4:1.3: device disconnected
May 16 16:35:41 raspberrypi kernel: [ 8084.402711] sierra_net 1-1.4:1.7 wwan0: unregister 'sierra_net' usb-20980000.usb-1.4, Sierra Wireless USB-to-WWAN Modem
May 16 16:35:41 raspberrypi kernel: [ 8084.412115] sierra_net 1-1.4:1.7 wwan0 (unregistered): usb_control_msg failed, status -19
May 16 16:35:41 raspberrypi kernel: [ 8084.413073] usb 1-1.5: USB disconnect, device number 21
May 16 16:35:41 raspberrypi systemd[1]: dhcpcd.service: main process exited, code=killed, status=11/SEGV
May 16 16:35:41 raspberrypi dhcpcd[13144]: sending signal TERM to pid 514
May 16 16:35:41 raspberrypi dhcpcd[13144]: dhcpcd not running
May 16 16:35:41 raspberrypi dhcpcd[13144]: dhcpcd[13144]: sending signal TERM to pid 514
May 16 16:35:41 raspberrypi dhcpcd[13144]: dhcpcd[13144]: dhcpcd not running
May 16 16:35:41 raspberrypi systemd[1]: dhcpcd.service: control process exited, code=exited status=1
May 16 16:35:41 raspberrypi systemd[1]: Unit dhcpcd.service entered failed state.
May 16 16:35:42 raspberrypi ntpd[538]: Deleting interface #1530 eth0, fe80::e3be:a19f:6472:355d#123, interface stats: received=0, sent=0, dropped=0, active_time=5 secs
May 16 16:35:42 raspberrypi ntpd[538]: Deleting interface #1529 wwan0, fe80::9b31:ff3e:31f5:e93e#123, interface stats: received=0, sent=0, dropped=0, active_time=8 secs
May 16 16:35:42 raspberrypi ntpd[538]: Deleting interface #1528 wwan0, 100.74.94.245#123, interface stats: received=4, sent=4, dropped=0, active_time=8 secs
May 16 16:35:42 raspberrypi ntpd[538]: 122.252.184.186 interface 100.74.94.245 -> (none)
May 16 16:35:43 raspberrypi ntpd[538]: 103.242.70.5 interface 100.74.94.245 -> (none)
May 16 16:35:43 raspberrypi ntpd[538]: 130.217.226.50 interface 100.74.94.245 -> (none)
May 16 16:35:43 raspberrypi ntpd[538]: 203.97.213.52 interface 100.74.94.245 -> (none)
May 16 16:35:43 raspberrypi ntpd[538]: peers refreshed

After restarting dhcpcd (systemctl restart dhcpcd), everything is ok again.

@P33M
Copy link
Contributor

P33M commented May 17, 2017

So eth0 came back without a reboot, correct?

@jkuek
Copy link
Author

jkuek commented May 17, 2017

as reported, eth0 seemed to be ok from a driver perspective (it was present in /sys/class/net) but did not show in the list when I ran ifconfig.
Everything was fine again after starting dhcpcd.service (without rebooting the device)

@dimaqq
Copy link

dimaqq commented Sep 21, 2017

@P33M do you know if your fix (or some other fix) landed upstream?

I'm having similar hangs on desktop on stock 4.4.0 and custom USB device that's externally powered.

@P33M
Copy link
Contributor

P33M commented Sep 21, 2017

The fixes are specific to the dwc_otg driver, which is a downstream implementation.

Also, unless your desktop hardware is powered by a mobile phone SoC using an OTG USB core, it's highly unlikely that this issue is the one you're seeing.

@dimaqq
Copy link

dimaqq commented Sep 21, 2017

Ugh, thanks, I'll keep digging!

@l-spitz
Copy link

l-spitz commented Sep 26, 2017

@P33M I'm getting a lot of issues similar to what you describe, when running webcams over the usb, namely sporadically get dmesg messages, and eventually get following, which requires a power cycle to recover from.

I'm running,
Linux version 4.9.41-v7+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1023 SMP Tue Aug 8 16:00:15 BST 2017

Is the fix you mention in this release?

[25134.316545] Unable to handle kernel NULL pointer dereference at virtual address 00000104
[25134.319093] pgd = 80004000
[25134.320352] [00000104] *pgd=00000000
[25134.321583] Internal error: Oops: 817 [#1] SMP ARM
[25134.322799] Modules linked in: evdev snd_usb_audio snd_hwdep snd_usbmidi_lib uvcvideo snd_rawmidi snd_seq_device videobuf2_vmalloc videobuf2_memops snd_pcm videobuf2_v4l2 snd_timer videobuf2_core snd videodev media cmac bnep hci_uart btbcm bluetooth brcmfmac brcmutil cfg80211 rfkill bcm2835_gpiomem i2c_bcm2835 uio_pdrv_genirq fixed uio i2c_dev ip_tables x_tables ipv6
[25134.329754] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.41-v7+ #1023
[25134.331237] Hardware name: BCM2835
[25134.332698] task: 80c06cc0 task.stack: 80c00000
[25134.334163] PC is at uvc_queue_next_buffer+0x7c/0xf0 [uvcvideo]
[25134.335611] LR is at 0x100
[25134.337000] pc : [<7f4463dc>]    lr : [<00000100>]    psr: 60000193
               sp : 80c01d48  ip : 00000100  fp : 80c01d64
[25134.339793] r10: b67be400  r9 : b73a0b98  r8 : 00000002
[25134.341171] r7 : bad41800  r6 : b73a0ce8  r5 : b73a0b98  r4 : b3844c00
[25134.342544] r3 : b73a0b98  r2 : 00000200  r1 : 00000000  r0 : 60000193
[25134.343899] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[25134.345250] Control: 10c5383d  Table: 373d006a  DAC: 00000055
[25134.346573] Process swapper/0 (pid: 0, stack limit = 0x80c00210)
[25134.347926] Stack: (0x80c01d48 to 0x80c02000)
[25134.349287] 1d40:                   000ea600 b3844c00 b73a0800 b67be420 80c01d9c 80c01d68
[25134.352041] 1d60: 7f449b0c 7f44636c 7f4536c0 b3844c00 00000000 b3844c00 b67be400 b73a0800
[25134.354782] 1d80: b73a0ce8 80b79340 00000000 00000001 80c01dbc 80c01da0 7f4486b8 7f4499a8
[25134.357512] 1da0: b67be400 60000113 00000000 00000000 80c01ddc 80c01dc0 80557f5c 7f448640
[25134.360270] 1dc0: b346c4c0 b67be400 b67be400 00000000 80c01dfc 80c01de0 80558088 80557ee8
[25134.363150] 1de0: baad6f30 b346c4c0 b9538000 b67be400 80c01e24 80c01e00 805820c8 80558048
[25134.366144] 1e00: 80c01e44 20000113 804ec0a0 b9463940 b9463944 00000000 80c01e34 80c01e28
[25134.369264] 1e20: 80591178 80582068 80c01e5c 80c01e38 80123144 80591164 801230d0 80c02080
[25134.372539] 1e40: 00000000 80c00020 00000100 b9c05180 80c01ec4 80c01e60 8010169c 801230dc
[25134.375884] 1e60: 801741b4 80177910 00000001 00200100 80c02d00 0025e4d4 0000000a 80c7b980
[25134.379313] 1e80: 80c71f38 80c0312c 80b793b8 80c02080 80c01e60 00000001 80173074 80b7cec0
[25134.382807] 1ea0: 00000000 00000000 00000001 b9c05180 80c00000 babffa80 80c01ed4 80c01ec8
[25134.386318] 1ec0: 80122cd8 8010151c 80c01efc 80c01ed8 801736b8 80122bd8 80c01f18 80c04670
[25134.389831] 1ee0: 60000013 ffffffff 80c01f4c 80c15f08 80c01f14 80c01f00 8010150c 80173654
[25134.393347] 1f00: 80c040a4 80108a58 80c01f74 80c01f18 8071a47c 80101470 00000000 ba35c470
[25134.396863] 1f20: 00000000 00000000 80c00000 80c0312c 00000001 80c03198 80c15f08 80c15f08
[25134.400380] 1f40: babffa80 80c01f74 80c040a4 80c01f68 80108a54 80108a58 60000013 ffffffff
[25134.403896] 1f60: 80108a54 00000000 80c01f84 80c01f78 80719bc4 80108a30 80c01f9c 80c01f88
[25134.407409] 1f80: 80161df0 80719b9c 808023f0 ffffffff 80c01fac 80c01fa0 80714450 80161d18
[25134.410923] 1fa0: 80c01ff4 80c01fb0 80b00d28 807143d8 ffffffff ffffffff 00000000 80b006f8
[25134.414436] 1fc0: 00000000 80b56a30 00000000 80c7b294 80c0311c 80b56a2c 80c089a8 0000406a
[25134.417951] 1fe0: 410fd034 00000000 00000000 80c01ff8 0000807c 80b00980 00000000 00000000
[25134.421514] [<7f4463dc>] (uvc_queue_next_buffer [uvcvideo]) from [<7f449b0c>] (uvc_video_decode_isoc+0x170/0x198 [uvcvideo])
[25134.425101] [<7f449b0c>] (uvc_video_decode_isoc [uvcvideo]) from [<7f4486b8>] (uvc_video_complete+0x84/0xec [uvcvideo])
[25134.428676] [<7f4486b8>] (uvc_video_complete [uvcvideo]) from [<80557f5c>] (__usb_hcd_giveback_urb+0x80/0x160)
[25134.432217] [<80557f5c>] (__usb_hcd_giveback_urb) from [<80558088>] (usb_hcd_giveback_urb+0x4c/0xfc)
[25134.435753] [<80558088>] (usb_hcd_giveback_urb) from [<805820c8>] (completion_tasklet_func+0x6c/0x98)
[25134.439300] [<805820c8>] (completion_tasklet_func) from [<80591178>] (tasklet_callback+0x20/0x24)
[25134.442850] [<80591178>] (tasklet_callback) from [<80123144>] (tasklet_hi_action+0x74/0x10c)
[25134.446398] [<80123144>] (tasklet_hi_action) from [<8010169c>] (__do_softirq+0x18c/0x3cc)
[25134.449948] [<8010169c>] (__do_softirq) from [<80122cd8>] (irq_exit+0x10c/0x168)
[25134.453494] [<80122cd8>] (irq_exit) from [<801736b8>] (__handle_domain_irq+0x70/0xc4)
[25134.457052] [<801736b8>] (__handle_domain_irq) from [<8010150c>] (bcm2836_arm_irqchip_handle_irq+0xa8/0xac)
[25134.460637] [<8010150c>] (bcm2836_arm_irqchip_handle_irq) from [<8071a47c>] (__irq_svc+0x5c/0x7c)
[25134.464210] Exception stack(0x80c01f18 to 0x80c01f60)
[25134.465988] 1f00:                                                       00000000 ba35c470
[25134.469402] 1f20: 00000000 00000000 80c00000 80c0312c 00000001 80c03198 80c15f08 80c15f08
[25134.472754] 1f40: babffa80 80c01f74 80c040a4 80c01f68 80108a54 80108a58 60000013 ffffffff
[25134.476105] [<8071a47c>] (__irq_svc) from [<80108a58>] (arch_cpu_idle+0x34/0x4c)
[25134.479442] [<80108a58>] (arch_cpu_idle) from [<80719bc4>] (default_idle_call+0x34/0x48)
[25134.482789] [<80719bc4>] (default_idle_call) from [<80161df0>] (cpu_startup_entry+0xe4/0x160)
[25134.486143] [<80161df0>] (cpu_startup_entry) from [<80714450>] (rest_init+0x84/0x88)
[25134.489488] [<80714450>] (rest_init) from [<80b00d28>] (start_kernel+0x3b4/0x3c0)
[25134.492827] Code: e594215c e594c158 e1a03005 e3a0ec01 (e58c2004)

popcornmix pushed a commit that referenced this issue Dec 22, 2021
Fix up unprivileged test case results for 'Dest pointer in r0' verifier tests
given they now need to reject R0 containing a pointer value, and add a couple
of new related ones with 32bit cmpxchg as well.

  root@foo:~/bpf/tools/testing/selftests/bpf# ./test_verifier
  #0/u invalid and of negative number OK
  #0/p invalid and of negative number OK
  [...]
  #1268/p XDP pkt read, pkt_meta' <= pkt_data, bad access 1 OK
  #1269/p XDP pkt read, pkt_meta' <= pkt_data, bad access 2 OK
  #1270/p XDP pkt read, pkt_data <= pkt_meta', good access OK
  #1271/p XDP pkt read, pkt_data <= pkt_meta', bad access 1 OK
  #1272/p XDP pkt read, pkt_data <= pkt_meta', bad access 2 OK
  Summary: 1900 PASSED, 0 SKIPPED, 0 FAILED

Acked-by: Brendan Jackman <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
popcornmix pushed a commit that referenced this issue Dec 22, 2021
commit e523102 upstream.

Fix up unprivileged test case results for 'Dest pointer in r0' verifier tests
given they now need to reject R0 containing a pointer value, and add a couple
of new related ones with 32bit cmpxchg as well.

  root@foo:~/bpf/tools/testing/selftests/bpf# ./test_verifier
  #0/u invalid and of negative number OK
  #0/p invalid and of negative number OK
  [...]
  #1268/p XDP pkt read, pkt_meta' <= pkt_data, bad access 1 OK
  #1269/p XDP pkt read, pkt_meta' <= pkt_data, bad access 2 OK
  #1270/p XDP pkt read, pkt_data <= pkt_meta', good access OK
  #1271/p XDP pkt read, pkt_data <= pkt_meta', bad access 1 OK
  #1272/p XDP pkt read, pkt_data <= pkt_meta', bad access 2 OK
  Summary: 1900 PASSED, 0 SKIPPED, 0 FAILED

Acked-by: Brendan Jackman <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants