-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
root filesystem corruption on apt-get upgrade to Raspbian Jessie #398
Comments
Do you use a real /tmp folder or a tmpfs one? IIRC, dpkg might fill up /tmp hence some errors. |
The real errors are: [ 1421.942344] mmc0: final write to SD card still running This means that communication with the SDCard has failed in some way... This is the usual corruption we've seen in the part and is something I'm working on diagnosing now. I've written an SD protocol analyser expressly for this purpose and it's working, so it's really a matter of time. Can't be very helpful on timing yet but I'm hoping to nail this one down over the next two weeks Gordon |
It's been two weeks. Any progress? |
Yes, I've been able to identify a bug somewhere in the dma - FIFO interface in the Arasan module. Basically under very specific circumstances we fail to transmit data correctly to the sd card which results in corruption. But it was a very specific sd card that caused the problem so still don't understand what it is about the card that triggers the effect. Stay tuned but I'm hoping to fully diagnose the issue in the next two weeks Gordon |
Looks like #413 is where these issues are now being discussed / referenced. |
@zerxy |
Not yet. I have a card ready for testing, a class 10 Transcend. If that tests okay I'll try the new firmware with my Sandisk card. |
I've just finished testing. Upgraded from Wheezy to Jessie twice on two different cards, a class 4 Sandisk and a class 10 Transcend. No SD card errors were logged and no corruption has been detected. |
Great! Okay to close? |
Trinity has reported: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1)) CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G W 3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398 lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) remove_migration_pte (mm/migrate.c:137) rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699) remove_migration_ptes (mm/migrate.c:224) migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126) migrate_misplaced_page (mm/migrate.c:1733) __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925) handle_mm_fault (mm/memory.c:3948) __get_user_pages (mm/memory.c:1851) __mlock_vma_pages_range (mm/mlock.c:255) __mm_populate (mm/mlock.c:711) SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791) I believe this comes about because, whereas collapsing and splitting THP functions take anon_vma lock in write mode (which excludes concurrent rmap walks), faulting THP functions (write protection and misplaced NUMA) do not - and mostly they do not need to. But they do use a pmdp_clear_flush(), set_pmd_at() sequence which, for an instant (indeed, for a long instant, given the inter-CPU TLB flush in there), leaves *pmd neither present not trans_huge. Which can confuse a concurrent rmap walk, as when removing migration ptes, seen in the dumped trace. Although that rmap walk has a 4k page to insert, anon_vmas containing THPs are in no way segregated from 4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to cope with that instant when a trans_huge pmd is temporarily absent. I don't think we need strengthen the locking at the THP end: it's easily handled with an ACCESS_ONCE() before testing both conditions. And since mm_find_pmd() had only one caller who wanted a THP rather than a pmd, let's slightly repurpose it to fail when it hits a THP or non-present pmd, and open code split_huge_page_address() again. Signed-off-by: Hugh Dickins <[email protected]> Reported-by: Sasha Levin <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Bob Liu <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Jones <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
This issue is now happening for the B+ with latest 2014-06-20 Raspbian. See here for discussions: http://www.raspberrypi.org/forums/viewtopic.php?f=28&t=85210&p=602257#p602257 |
commit f72e7dc upstream. Trinity has reported: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1)) CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G W 3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398 lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) remove_migration_pte (mm/migrate.c:137) rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699) remove_migration_ptes (mm/migrate.c:224) migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126) migrate_misplaced_page (mm/migrate.c:1733) __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925) handle_mm_fault (mm/memory.c:3948) __get_user_pages (mm/memory.c:1851) __mlock_vma_pages_range (mm/mlock.c:255) __mm_populate (mm/mlock.c:711) SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791) I believe this comes about because, whereas collapsing and splitting THP functions take anon_vma lock in write mode (which excludes concurrent rmap walks), faulting THP functions (write protection and misplaced NUMA) do not - and mostly they do not need to. But they do use a pmdp_clear_flush(), set_pmd_at() sequence which, for an instant (indeed, for a long instant, given the inter-CPU TLB flush in there), leaves *pmd neither present not trans_huge. Which can confuse a concurrent rmap walk, as when removing migration ptes, seen in the dumped trace. Although that rmap walk has a 4k page to insert, anon_vmas containing THPs are in no way segregated from 4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to cope with that instant when a trans_huge pmd is temporarily absent. I don't think we need strengthen the locking at the THP end: it's easily handled with an ACCESS_ONCE() before testing both conditions. And since mm_find_pmd() had only one caller who wanted a THP rather than a pmd, let's slightly repurpose it to fail when it hits a THP or non-present pmd, and open code split_huge_page_address() again. Signed-off-by: Hugh Dickins <[email protected]> Reported-by: Sasha Levin <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Bob Liu <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Jones <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Jiri Slaby <[email protected]>
[ Upstream commit 28261da ] Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel. Example where leftover Disconnection Response (frame raspberrypi#402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting. Btmon capture before patch: ** snip ** > ACL Data RX: Handle 43 flags 0x02 dlen 8 raspberrypi#394 [hci1] 10.748949 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 raspberrypi#395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 raspberrypi#396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 raspberrypi#397 [hci1] 10.752391 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 raspberrypi#398 [hci1] 10.753394 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 raspberrypi#399 [hci1] 10.756499 L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 raspberrypi#400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 raspberrypi#401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 raspberrypi#402 [hci1] 10.759148 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447 > HCI Event: Number of Completed Packets (0x13) plen 5 raspberrypi#403 [hci1] 10.759386 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 raspberrypi#404 [hci1] 10.760397 L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 raspberrypi#405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 raspberrypi#406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > HCI Event: Number of Completed Packets (0x13) plen 5 raspberrypi#407 [hci1] 10.761399 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 16 raspberrypi#408 [hci1] 10.762942 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Similar case after the patch: *snip* > ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371 L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804 L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Fix is to check that channel is in state BT_DISCONN before deleting the channel. This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics. Reported-by: Matti Kamunen <[email protected]> Reported-by: Ari Timonen <[email protected]> Signed-off-by: Matias Karhumaa <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 28261da ] Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel. Example where leftover Disconnection Response (frame #402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting. Btmon capture before patch: ** snip ** > ACL Data RX: Handle 43 flags 0x02 dlen 8 #394 [hci1] 10.748949 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 #397 [hci1] 10.752391 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #398 [hci1] 10.753394 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #399 [hci1] 10.756499 L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #402 [hci1] 10.759148 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447 > HCI Event: Number of Completed Packets (0x13) plen 5 #403 [hci1] 10.759386 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #404 [hci1] 10.760397 L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > HCI Event: Number of Completed Packets (0x13) plen 5 #407 [hci1] 10.761399 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #408 [hci1] 10.762942 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Similar case after the patch: *snip* > ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371 L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804 L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Fix is to check that channel is in state BT_DISCONN before deleting the channel. This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics. Reported-by: Matti Kamunen <[email protected]> Reported-by: Ari Timonen <[email protected]> Signed-off-by: Matias Karhumaa <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 28261da ] Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel. Example where leftover Disconnection Response (frame #402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting. Btmon capture before patch: ** snip ** > ACL Data RX: Handle 43 flags 0x02 dlen 8 #394 [hci1] 10.748949 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 #397 [hci1] 10.752391 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #398 [hci1] 10.753394 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #399 [hci1] 10.756499 L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #402 [hci1] 10.759148 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447 > HCI Event: Number of Completed Packets (0x13) plen 5 #403 [hci1] 10.759386 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #404 [hci1] 10.760397 L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > HCI Event: Number of Completed Packets (0x13) plen 5 #407 [hci1] 10.761399 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #408 [hci1] 10.762942 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Similar case after the patch: *snip* > ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371 L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804 L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Fix is to check that channel is in state BT_DISCONN before deleting the channel. This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics. Reported-by: Matti Kamunen <[email protected]> Reported-by: Ari Timonen <[email protected]> Signed-off-by: Matias Karhumaa <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
I am experiencing root filesystem corruption when I do a apt-get upgrade from Raspbian Wheezy (from 2013-09-25-wheezy-raspbian.img) to Jessie. Prior to this I used rpi-update to update to latest VideoCore libraries and SDK with no errors at this point in dmesg. Relevant dmesg output after performing apt-get:
Relevant apt-get output:
The file /var/lib/dpkg/triggers/aspell-autobuildhash appears to be corrupt.
After this I deleted .rpi-firmware/ and redid rpi-update. Got the following errors in dmesg:
Booting from a 8 gig class 4 Sandisk SD card which was purchased locally from a reputable bricks and mortar retailer so I know it's genuine and using files from September 10 in /boot . No overclocking or overvolting.
The text was updated successfully, but these errors were encountered: