[syzkaller] KASAN: wild-memory-access Write in subflow_req_destructor #125

Closed
cpaasch opened this issue Dec 10, 2020 · 7 comments

@cpaasch
Member

cpaasch commented Dec 10, 2020

TCP: request_sock_subflow: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
==================================================================
BUG: KASAN: wild-memory-access in instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
BUG: KASAN: wild-memory-access in atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline]
BUG: KASAN: wild-memory-access in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: wild-memory-access in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: wild-memory-access in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: wild-memory-access in sock_put include/net/sock.h:1798 [inline]
BUG: KASAN: wild-memory-access in subflow_req_destructor+0x5b/0x120 net/mptcp/subflow.c:40
Write of size 4 at addr 043d2cdf043d2c6f by task syz-executor.4/30810

CPU: 1 PID: 30810 Comm: syz-executor.4 Not tainted 5.10.0-rc6 #52
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xbe/0xf9 lib/dump_stack.c:118
 __kasan_report mm/kasan/report.c:549 [inline]
 kasan_report.cold+0x5/0x37 mm/kasan/report.c:562
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
 atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline]
 __refcount_sub_and_test include/linux/refcount.h:272 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 sock_put include/net/sock.h:1798 [inline]
 subflow_req_destructor+0x5b/0x120 net/mptcp/subflow.c:40
 __reqsk_free include/net/request_sock.h:117 [inline]
 tcp_conn_request+0x2480/0x2df0 net/ipv4/tcp_input.c:6883
 subflow_v4_conn_request net/mptcp/subflow.c:418 [inline]
 subflow_v4_conn_request+0x9b/0x150 net/mptcp/subflow.c:408
 tcp_rcv_state_process+0x9e9/0x4ba0 net/ipv4/tcp_input.c:6332
 tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1695
 tcp_v4_rcv+0x2667/0x2e60 net/ipv4/tcp_ipv4.c:2043
 ip_protocol_deliver_rcu+0x2b/0x200 net/ipv4/ip_input.c:204
 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:447 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5305
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5419
 process_backlog+0x1e5/0x6e0 net/core/dev.c:6309
 napi_poll net/core/dev.c:6787 [inline]
 net_rx_action+0x3fa/0xe30 net/core/dev.c:6870
 __do_softirq+0x187/0x585 kernel/softirq.c:298
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x32/0x40 arch/x86/kernel/irq_64.c:77
 do_softirq.part.0+0x26/0x30 kernel/softirq.c:343
 do_softirq arch/x86/include/asm/preempt.h:26 [inline]
 __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:195
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:730 [inline]
 ip_finish_output2+0x71e/0x17e0 net/ipv4/ip_output.c:231
 __ip_finish_output+0x516/0x880 net/ipv4/ip_output.c:308
 dst_output include/net/dst.h:441 [inline]
 ip_local_out+0x18a/0x1f0 net/ipv4/ip_output.c:126
 __ip_queue_xmit+0x77c/0x1500 net/ipv4/ip_output.c:532
 __tcp_transmit_skb+0x2bed/0x36a0 net/ipv4/tcp_output.c:1405
 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
 tcp_connect+0x24e9/0x3510 net/ipv4/tcp_output.c:3853
 tcp_v4_connect+0x1461/0x1ba0 net/ipv4/tcp_ipv4.c:312
 __inet_stream_connect+0x812/0xd50 net/ipv4/af_inet.c:664
 inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:728
 mptcp_stream_connect+0x161/0x790 net/mptcp/protocol.c:3188
 __sys_connect_file net/socket.c:1830 [inline]
 __sys_connect+0x268/0x2f0 net/socket.c:1847
 __do_sys_connect net/socket.c:1857 [inline]
 __se_sys_connect net/socket.c:1854 [inline]
 __x64_sys_connect+0x6f/0xb0 net/socket.c:1854
 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe9e52ad469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007fe9e599ddc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000069bf40 RCX: 00007fe9e52ad469
RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000005
RBP: 000000000069bf40 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069bf4c
R13: 00007ffd81a3182f R14: 000000000041556d R15: 0000000000000003
==================================================================

HEAD is at:
05cb27b ("DO-NOT-MERGE: mptcp: enabled by default") (HEAD, tag: export/20201209T060936, mptcp_net-next/export) (12 hours ago)
525593c ("DO-NOT-MERGE: mptcp: add GitHub Actions") (12 hours ago)
6aa8731 ("DO-NOT-MERGE: mptcp: use kmalloc on kasan build") (12 hours ago)
2227bfd ("mptcp: let MPTCP create max size skbs") (12 hours ago)
908c632 ("mptcp: pm: simplify select_local_address()") (12 hours ago)
a771b76 ("mptcp: parse and act on incoming FASTCLOSE option") (12 hours ago)
7dbc6b7 ("tcp: parse mptcp options contained in reset packets") (12 hours ago)
4598a67 ("mptcp: hold mptcp socket before calling tcp_done") (12 hours ago)
3630500 ("mptcp: use MPTCPOPT_HMAC_LEN macro") (12 hours ago)
905c00c ("selftests: mptcp: add the flush addrs testcase") (12 hours ago)
2d0de9b ("mptcp: remove address when netlink flushes addrs") (12 hours ago)
389cb8d ("mptcp: use the variable sk instead of open-coding") (12 hours ago)
62ad6da ("mptcp: rename add_addr_signal and mptcp_add_addr_status") (12 hours ago)
56607a9 ("mptcp: drop rm_addr_signal flag") (12 hours ago)
f561498 ("mptcp: print out port and ahmac when receiving ADD_ADDR") (12 hours ago)
faec918 ("mptcp: add port parameter for mptcp_pm_announce_addr") (12 hours ago)
1bab32f ("mptcp: send out dedicated packet for ADD_ADDR using port") (12 hours ago)
a7429bb ("mptcp: add the outgoing ADD_ADDR port support") (12 hours ago)
a8787a8 ("mptcp: use adding up size to get ADD_ADDR length") (12 hours ago)
1690597 ("mptcp: add port support for ADD_ADDR suboption writing") (12 hours ago)
4021cd8 ("mptcp: unify ADD_ADDR and ADD_ADDR6 suboptions writing") (12 hours ago)
0b86309 ("mptcp: unify ADD_ADDR and echo suboptions writing") (12 hours ago)
c855f89 ("bpf:selftests: add bpf_mptcp_sock() verifier tests") (12 hours ago)
0eaea54 ("bpf:selftests: add MPTCP test base") (12 hours ago)
eed59ab ("bpf: add 'bpf_mptcp_sock' structure and helper") (12 hours ago)
6dd1da9 ("mptcp: attach subflow socket to parent cgroup") (12 hours ago)
58a4d0c ("bpf: expose is_mptcp flag to bpf_tcp_sock") (12 hours ago)
d188dfe ("mptcp: be careful on subflows shutdown") (12 hours ago)
9910201 ("mptcp: plug subflow context memory leak") (12 hours ago)
ae1cd5e ("mptcp: link MPC subflow into msk only after accept") (12 hours ago)
afae3cc ("net: atheros: simplify the return expression of atl2_phy_setup_autoneg_adv()") (mptcp_net-next/net-next) (18 hours ago)

No reproducer yet.

CONFIG-file:
CONFIG.txt

@cpaasch
Member Author

cpaasch commented Dec 10, 2020

Different stack-trace but same issue (I guess ;-) ):

TCP: request_sock_subflow: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
BUG: unable to handle page fault for address: fffff52003024ffb
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 123fee067 P4D 123fee067 PUD 1000ea067 PMD 10a7f5067 PTE 0
Oops: 0000 [#1] SMP KASAN
CPU: 0 PID: 11336 Comm: syz-executor.6 Not tainted 5.10.0-rc6 #52
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:91 [inline]
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:108 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:134 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:165 [inline]
RIP: 0010:check_memory_region_inline mm/kasan/generic.c:183 [inline]
RIP: 0010:check_memory_region+0xdd/0x1b0 mm/kasan/generic.c:192
Code: 38 00 74 f2 41 b8 01 00 00 00 48 85 c0 75 6b 5b 44 89 c0 5d 41 5c c3 4d 85 c9 74 4d 49 01 d9 eb 09 48 83 c0 01 4c 39 c8 74 3f <80> 38 00 74 f2 eb d3 41 bc 08 00 00 00 45 29 c4 49 89 d8 4d 8d 0c
RSP: 0018:ffffc90000007728 EFLAGS: 00010286
RAX: fffff52003024ffb RBX: fffff52003024ffb RCX: ffffffff82aeb14b
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90018127fd8
RBP: fffff52003024ffc R08: ffff888109e1b900 R09: fffff52003024ffc
R10: ffffc90018127fdb R11: fffff52003024ffb R12: ffffc90018127f58
R13: ffffc90018127fd8 R14: ffff888104fd6270 R15: ffffffff849fd280
FS:  00007f934a5d0700(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff52003024ffb CR3: 00000001054b5003 CR4: 0000000000170ef0
Call Trace:
 <IRQ>
 instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
 atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline]
 __refcount_sub_and_test include/linux/refcount.h:272 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 sock_put include/net/sock.h:1798 [inline]
 subflow_req_destructor+0x5b/0x120 net/mptcp/subflow.c:40
 __reqsk_free include/net/request_sock.h:117 [inline]
 tcp_conn_request+0x2480/0x2df0 net/ipv4/tcp_input.c:6883
 subflow_v4_conn_request net/mptcp/subflow.c:418 [inline]
 subflow_v4_conn_request+0x9b/0x150 net/mptcp/subflow.c:408
 tcp_rcv_state_process+0x9e9/0x4ba0 net/ipv4/tcp_input.c:6332
 tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1695
 tcp_v4_rcv+0x2667/0x2e60 net/ipv4/tcp_ipv4.c:2043
 ip_protocol_deliver_rcu+0x2b/0x200 net/ipv4/ip_input.c:204
 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:447 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5305
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5419
 process_backlog+0x1e5/0x6e0 net/core/dev.c:6309
 napi_poll net/core/dev.c:6787 [inline]
 net_rx_action+0x3fa/0xe30 net/core/dev.c:6870
 __do_softirq+0x187/0x585 kernel/softirq.c:298
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x32/0x40 arch/x86/kernel/irq_64.c:77
 do_softirq.part.0+0x26/0x30 kernel/softirq.c:343
 do_softirq arch/x86/include/asm/preempt.h:26 [inline]
 __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:195
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:730 [inline]
 ip_finish_output2+0x71e/0x17e0 net/ipv4/ip_output.c:231
 __ip_finish_output+0x516/0x880 net/ipv4/ip_output.c:308
 dst_output include/net/dst.h:441 [inline]
 ip_local_out+0x18a/0x1f0 net/ipv4/ip_output.c:126
 __ip_queue_xmit+0x77c/0x1500 net/ipv4/ip_output.c:532
 __tcp_transmit_skb+0x2bed/0x36a0 net/ipv4/tcp_output.c:1405
 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
 tcp_connect+0x24e9/0x3510 net/ipv4/tcp_output.c:3853
 tcp_v4_connect+0x1461/0x1ba0 net/ipv4/tcp_ipv4.c:312
 __inet_stream_connect+0x812/0xd50 net/ipv4/af_inet.c:664
 inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:728
 mptcp_stream_connect+0x161/0x790 net/mptcp/protocol.c:3188
 __sys_connect_file net/socket.c:1830 [inline]
 __sys_connect+0x268/0x2f0 net/socket.c:1847
 __do_sys_connect net/socket.c:1857 [inline]
 __se_sys_connect net/socket.c:1854 [inline]
 __x64_sys_connect+0x6f/0xb0 net/socket.c:1854
 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f9349f42469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007f934a5cfdc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000069c138 RCX: 00007f9349f42469
RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000004
RBP: 000000000069c138 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069c144
R13: 00007ffc285846ff R14: 000000000041556d R15: 0000000000000003
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
CR2: fffff52003024ffb
---[ end trace 1937b17370385895 ]---
RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:91 [inline]
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:108 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:134 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:165 [inline]
RIP: 0010:check_memory_region_inline mm/kasan/generic.c:183 [inline]
RIP: 0010:check_memory_region+0xdd/0x1b0 mm/kasan/generic.c:192
Code: 38 00 74 f2 41 b8 01 00 00 00 48 85 c0 75 6b 5b 44 89 c0 5d 41 5c c3 4d 85 c9 74 4d 49 01 d9 eb 09 48 83 c0 01 4c 39 c8 74 3f <80> 38 00 74 f2 eb d3 41 bc 08 00 00 00 45 29 c4 49 89 d8 4d 8d 0c
RSP: 0018:ffffc90000007728 EFLAGS: 00010286
RAX: fffff52003024ffb RBX: fffff52003024ffb RCX: ffffffff82aeb14b
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc90018127fd8
RBP: fffff52003024ffc R08: ffff888109e1b900 R09: fffff52003024ffc
R10: ffffc90018127fdb R11: fffff52003024ffb R12: ffffc90018127f58
R13: ffffc90018127fd8 R14: ffff888104fd6270 R15: ffffffff849fd280
FS:  00007f934a5d0700(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff52003024ffb CR3: 00000001054b5003 CR4: 0000000000170ef0
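
One way to check that this second trace is the same problem: the faulting address in CR2 looks like the KASAN shadow address of the value in RDI, which would mean KASAN faulted while checking a bogus socket pointer rather than reporting it. The sketch below assumes the default x86-64 generic KASAN parameters (shadow offset 0xdffffc0000000000, scale shift 3) and assumes RDI holds the address passed to check_memory_region(); neither is confirmed by the attached config here. The helper name is hypothetical.

```c
#include <stdint.h>

/* Assumed defaults for x86-64 generic KASAN; not read from CONFIG.txt. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Hypothetical helper: map a kernel address to its KASAN shadow byte. */
static uint64_t kasan_mem_to_shadow(uint64_t addr)
{
    /* One shadow byte covers 8 bytes of memory; wraps modulo 2^64. */
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

Feeding it RDI (ffffc90018127fd8) gives the CR2 value from the log, which supports the guess that both traces come from the same bad pointer in subflow_req_destructor().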

@cpaasch
Member Author

cpaasch commented Jan 21, 2021

Last seen on 01/14/2021 - Keeping open for now.

@cpaasch
Member Author

cpaasch commented Jan 28, 2021

Closing.

@cpaasch cpaasch closed this as completed Jan 28, 2021
@cpaasch
Member Author

cpaasch commented Feb 3, 2021

It came back!

TCP: request_sock_subflow: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
==================================================================
BUG: KASAN: wild-memory-access in instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
BUG: KASAN: wild-memory-access in atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline]
BUG: KASAN: wild-memory-access in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: wild-memory-access in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: wild-memory-access in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: wild-memory-access in sock_put include/net/sock.h:1796 [inline]
BUG: KASAN: wild-memory-access in subflow_req_destructor+0x5b/0x120 net/mptcp/subflow.c:43
Write of size 4 at addr 403a380028740ce0 by task syz-executor.0/20884

CPU: 0 PID: 20884 Comm: syz-executor.0 Not tainted 5.11.0-rc5d82d76887ec676c2e37b496f1e7d094f3f2507d6 #68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xbe/0xf9 lib/dump_stack.c:120
 __kasan_report mm/kasan/report.c:400 [inline]
 kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
 check_memory_region_inline mm/kasan/generic.c:179 [inline]
 check_memory_region+0x142/0x190 mm/kasan/generic.c:185
 instrument_atomic_read_write include/linux/instrumented.h:101 [inline]
 atomic_fetch_sub_release include/asm-generic/atomic-instrumented.h:220 [inline]
 __refcount_sub_and_test include/linux/refcount.h:272 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 sock_put include/net/sock.h:1796 [inline]
 subflow_req_destructor+0x5b/0x120 net/mptcp/subflow.c:43
 __reqsk_free include/net/request_sock.h:117 [inline]
 tcp_conn_request+0x22a5/0x2e20 net/ipv4/tcp_input.c:6901
 subflow_v4_conn_request net/mptcp/subflow.c:462 [inline]
 subflow_v4_conn_request+0x9b/0x150 net/mptcp/subflow.c:452
 tcp_rcv_state_process+0x9bf/0x48f0 net/ipv4/tcp_input.c:6350
 tcp_v4_do_rcv+0x30e/0x860 net/ipv4/tcp_ipv4.c:1698
 tcp_v4_rcv+0x2490/0x2b40 net/ipv4/tcp_ipv4.c:2047
 ip_protocol_deliver_rcu+0x2b/0x200 net/ipv4/ip_input.c:204
 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_local_deliver+0x2bf/0x370 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:447 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
 NF_HOOK include/linux/netfilter.h:409 [inline]
 ip_rcv+0xeb/0x140 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5332
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5446
 process_backlog+0x1ad/0x560 net/core/dev.c:6325
 napi_poll net/core/dev.c:6803 [inline]
 net_rx_action+0x3d6/0xe90 net/core/dev.c:6886
 __do_softirq+0x183/0x56f kernel/softirq.c:343
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x32/0x40 arch/x86/kernel/irq_64.c:77
 do_softirq kernel/softirq.c:246 [inline]
 do_softirq+0x5f/0x80 kernel/softirq.c:233
 __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:196
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:737 [inline]
 ip_finish_output2+0x6d0/0x16f0 net/ipv4/ip_output.c:231
 __ip_finish_output+0x3bb/0x7c0 net/ipv4/ip_output.c:308
 dst_output include/net/dst.h:441 [inline]
 ip_local_out+0x184/0x1e0 net/ipv4/ip_output.c:126
 __ip_queue_xmit+0x77a/0x1500 net/ipv4/ip_output.c:532
 __tcp_transmit_skb+0x2a65/0x35e0 net/ipv4/tcp_output.c:1405
 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
 tcp_connect+0x2a0b/0x3c20 net/ipv4/tcp_output.c:3856
 tcp_v4_connect+0x1437/0x1b90 net/ipv4/tcp_ipv4.c:312
 __inet_stream_connect+0x860/0xd90 net/ipv4/af_inet.c:664
 inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:728
 mptcp_stream_connect+0x161/0x790 net/mptcp/protocol.c:3200
 __sys_connect_file net/socket.c:1835 [inline]
 __sys_connect+0x276/0x2f0 net/socket.c:1852
 __do_sys_connect net/socket.c:1862 [inline]
 __se_sys_connect net/socket.c:1859 [inline]
 __x64_sys_connect+0x6e/0xb0 net/socket.c:1859
 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fbd7bc61469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007fbd7c351da8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000000002a RCX: 00007fbd7bc61469
RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000004
RBP: 000000000000002a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069bf6c
R13: 00007ffe8d19b6cf R14: 00007fbd7c332000 R15: 0000000000000003
==================================================================

HEAD is at:
d82d76887ec6 ("mptcp: fix spurious retransmissions") (HEAD) (8 hours ago)
9c23f272d8c2 ("Eric's fix") (8 hours ago)
e2fe949 ("DO-NOT-MERGE: mptcp: enabled by default") (tag: export/20210202T061758, mptcp_net-next/export) (18 hours ago)
13b4d63 ("DO-NOT-MERGE: mptcp: add GitHub Actions") (18 hours ago)
c2c6844 ("DO-NOT-MERGE: mptcp: use kmalloc on kasan build") (18 hours ago)
6399d64 ("mptcp: add netlink event support") (18 hours ago)
cf6cac4 ("genetlink: add CAP_NET_ADMIN test for multicast bind") (18 hours ago)
875cda9 ("mptcp: avoid lock_fast usage in accept path") (18 hours ago)
d30f162 ("mptcp: pass subflow socket to a few helpers") (18 hours ago)
5ee34d5 ("mptcp: split __mptcp_close_ssk helper") (18 hours ago)
fbc8817 ("mptcp: move pm netlink work into pm_netlink") (18 hours ago)
77a274c ("mptcp: pm: add lockdep assertions") (18 hours ago)
e75bbfc ("selftests: mptcp: add command line arguments for mptcp_join.sh") (18 hours ago)
7b74dee ("selftests: mptcp: add testcases for ADD_ADDR with port") (18 hours ago)
7dad582 ("mptcp: add the mibs for ADD_ADDR with port") (18 hours ago)
fc67ce1 ("selftests: mptcp: add port argument for pm_nl_ctl") (18 hours ago)
4864e76 ("mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink") (18 hours ago)
b99b4c0 ("mptcp: enable use_port when invoke addresses_equal") (18 hours ago)
6f9d0f9 ("mptcp: add port number check for MP_JOIN") (18 hours ago)
6f2398c ("mptcp: add a new helper subflow_req_create_thmac") (18 hours ago)
28985de ("mptcp: drop unused skb in subflow_token_join_request") (18 hours ago)
47c71c6 ("mptcp: create the listening socket for new port") (18 hours ago)
b8a22e0 ("selftests: mptcp: add testcases for newly added addresses") (18 hours ago)
5978f84 ("selftests: mptcp: use minus values for removing address numbers") (18 hours ago)
b8cda5a ("mptcp: send ack for every add_addr") (18 hours ago)
8333d08 ("mptcp: create subflow or signal addr for newly added address") (18 hours ago)
1031d4b ("mptcp: drop *_max fields in mptcp_pm_data") (18 hours ago)
bb2d333 ("mptcp: use WRITE_ONCE/READ_ONCE for the pernet *_max") (18 hours ago)
e2868c0 ("bpf:selftests: add bpf_mptcp_sock() verifier tests") (18 hours ago)
c681345 ("bpf:selftests: add MPTCP test base") (18 hours ago)
0bb29bf ("bpf: add 'bpf_mptcp_sock' structure and helper") (18 hours ago)
bc464f0 ("bpf: expose is_mptcp flag to bpf_tcp_sock") (18 hours ago)
d4a677f ("linux: handle MPTCP consistently with TCP") (18 hours ago)
2c87774 ("mptcp: fix length of MP_PRIO suboption") (18 hours ago)
9ae4bdc ("Merge branch 'rework-the-memory-barrier-for-scrq-entry'") (mptcp_net-next/net-next) (20 hours ago)

CONFIG-file:
CONFIG.txt

No reproducer...

@cpaasch cpaasch reopened this Feb 3, 2021
@matttbe
Member

matttbe commented Feb 6, 2021

Should be fixed by a patch from @pabeni; see the ML.

@cpaasch
Member Author

cpaasch commented Feb 8, 2021

Yes - closing.

@cpaasch cpaasch closed this as completed Feb 8, 2021
@matttbe
Member

matttbe commented Feb 8, 2021

Patch ref: 2195b45: mptcp: init mptcp request socket earlier
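
A minimal user-space sketch of what the patch title suggests, assuming the crash comes from the request socket being freed (the traces show __reqsk_free() reached from tcp_conn_request()) before its MPTCP-specific fields were initialized. All struct and function names below are hypothetical models, not the kernel's code, and the 0x43 fill only stands in for stale slab contents.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the parent MPTCP socket: only a refcount. */
struct msk_model { int refcnt; };

/* Hypothetical model of the MPTCP part of a request socket. */
struct subflow_req_model {
    struct msk_model *msk;  /* parent reference; NULL when none is held */
};

/* Early init: put the MPTCP fields into a state the destructor can handle. */
static void req_init(struct subflow_req_model *req)
{
    req->msk = NULL;
}

/* Destructor: drops the parent reference if one is recorded. */
static void req_destructor(struct subflow_req_model *req)
{
    if (req->msk)
        req->msk->refcnt--;  /* models sock_put(); wild write if msk is garbage */
}

/* Allocate a request; init_early selects whether init runs before any
 * path that could free the request. */
static struct subflow_req_model *req_alloc(int init_early)
{
    struct subflow_req_model *req = malloc(sizeof(*req));

    if (!req)
        return NULL;
    memset(req, 0x43, sizeof(*req));  /* stand-in for stale slab contents */
    if (init_early)
        req_init(req);
    return req;
}
```

With init_early set, an early free sees msk == NULL and the destructor does nothing. Without it, the destructor would decrement a refcount at whatever address the stale bytes encode, which matches the "wild-memory-access" pattern in the reports.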

jenkins-tessares pushed a commit that referenced this issue Nov 18, 2021
When the SUSPEND_DISCONNECTING bit is set, a Disconnect is pending, but
the code was checking whether the list is empty before calling
hci_conn_del, which does the actual cleanup and removes the connection
from the list. The bit was therefore never cleared, causing the suspend
procedure to always time out when there are connections to be
disconnected:

Suspend/Resume - Success 5 (Pairing - Legacy) - waiting done
  Set the system into Suspend via force_suspend
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.200458
= mgmt-tester: Set the system into Suspend via force_suspend    17:03:13.205812
< HCI Command: Write Scan E.. (0x03|0x001a) plen 1  #122 [hci0] 17:03:13.213561
        Scan enable: No Scans (0x00)
> HCI Event: Command Complete (0x0e) plen 4         #123 [hci0] 17:03:13.214710
      Write Scan Enable (0x03|0x001a) ncmd 1
        Status: Success (0x00)
< HCI Command: Disconnect (0x01|0x0006) plen 3      #124 [hci0] 17:03:13.215830
        Handle: 42
        Reason: Remote Device Terminated due to Power Off (0x15)
> HCI Event: Command Status (0x0f) plen 4           #125 [hci0] 17:03:13.216602
      Disconnect (0x01|0x0006) ncmd 1
        Status: Success (0x00)
> HCI Event: Disconnect Complete (0x05) plen 4      #126 [hci0] 17:03:13.217342
        Status: Success (0x00)
        Handle: 42
        Reason: Remote Device Terminated due to Power Off (0x15)
@ MGMT Event: Device Disconn.. (0x000c) plen 8  {0x0002} [hci0] 17:03:13.217688
        BR/EDR Address: 00:AA:01:01:00:00 (Intel Corporation)
        Reason: Connection terminated by local host for suspend (0x05)
@ MGMT Event: Device Disconn.. (0x000c) plen 8  {0x0001} [hci0] 17:03:13.217688
        BR/EDR Address: 00:AA:01:01:00:00 (Intel Corporation)
        Reason: Connection terminated by local host for suspend (0x05)
Suspend/Resume - Success 5 (Pairing - Legacy) - test timed out
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.939317
Suspend/Resume - Success 5 (Pairing - Legacy) - teardown
= mgmt-tester: Suspend/Resume - Success 5 (Pairing -..   17:03:13.947267
[   13.284291] Bluetooth: hci0: Timed out waiting for suspend events
[   13.287324] Bluetooth: hci0: Suspend timeout bit: 6
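
The ordering problem described in this commit message can be modelled in a few lines. This is a sketch only: the struct and function names are hypothetical, the connection list is reduced to a counter, and whether the upstream fix reorders the calls exactly this way is an assumption.

```c
#include <stdbool.h>

/* Hypothetical model of the controller state relevant to suspend. */
struct hdev_model {
    int conn_count;              /* connections still on the list */
    bool suspend_disconnecting;  /* models the SUSPEND_DISCONNECTING bit */
};

/* Order described as buggy: the list is tested before the removal. */
static void disconn_complete_buggy(struct hdev_model *h)
{
    if (h->conn_count == 0)
        h->suspend_disconnecting = false;  /* never reached for the last conn */
    h->conn_count--;                       /* models hci_conn_del() */
}

/* Corrected order: remove first, then test whether the list is empty. */
static void disconn_complete_fixed(struct hdev_model *h)
{
    h->conn_count--;                       /* models hci_conn_del() */
    if (h->conn_count == 0)
        h->suspend_disconnecting = false;
}
```

With a single connection, the buggy order leaves the bit set after the last disconnect, so anything waiting for it to clear would time out, as in the log above.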

Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
jenkins-tessares pushed a commit that referenced this issue Jan 25, 2022
…ed bind()

Syzbot detected a NULL pointer dereference of nfc_llcp_sock->dev pointer
(which is a 'struct nfc_dev *') with calls to llcp_sock_sendmsg() after
a failed llcp_sock_bind(). The message being sent is a SOCK_DGRAM.

KASAN report:

  BUG: KASAN: null-ptr-deref in nfc_alloc_send_skb+0x2d/0xc0
  Read of size 4 at addr 00000000000005c8 by task llcp_sock_nfc_a/899

  CPU: 5 PID: 899 Comm: llcp_sock_nfc_a Not tainted 5.16.0-rc6-next-20211224-00001-gc6437fbf18b0 #125
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x59
   ? nfc_alloc_send_skb+0x2d/0xc0
   __kasan_report.cold+0x117/0x11c
   ? mark_lock+0x480/0x4f0
   ? nfc_alloc_send_skb+0x2d/0xc0
   kasan_report+0x38/0x50
   nfc_alloc_send_skb+0x2d/0xc0
   nfc_llcp_send_ui_frame+0x18c/0x2a0
   ? nfc_llcp_send_i_frame+0x230/0x230
   ? __local_bh_enable_ip+0x86/0xe0
   ? llcp_sock_connect+0x470/0x470
   ? llcp_sock_connect+0x470/0x470
   sock_sendmsg+0x8e/0xa0
   ____sys_sendmsg+0x253/0x3f0
   ...

The issue was visible only with multiple simultaneous calls to bind() and
sendmsg(), which resulted in most of the bind() calls failing.  The
bind() was failing when checking whether a WKS/SDP/SAP is available
(respective bit in 'struct nfc_llcp_local' fields).  When there was no
available WKS/SDP/SAP, bind() returned an error, but a sendmsg() on such
a socket was able to trigger the mentioned NULL pointer dereference of
nfc_llcp_sock->dev.

The code looks simply racy: it currently protects several paths against
the race with checks for (!nfc_llcp_sock->local), which is set to NULL
in the error paths of bind().  llcp_sock_sendmsg() did not have such a
check, but the function it calls, nfc_llcp_send_ui_frame(), did, although
not protected with lock_sock().

Therefore the race could look like (same socket is used all the time):
  CPU0                                     CPU1
  ====                                     ====
  llcp_sock_bind()
  - lock_sock()
    - success
  - release_sock()
  - return 0
                                           llcp_sock_sendmsg()
                                           - lock_sock()
                                           - release_sock()
  llcp_sock_bind(), same socket
  - lock_sock()
    - error
                                           - nfc_llcp_send_ui_frame()
                                             - if (!llcp_sock->local)
    - llcp_sock->local = NULL
    - nfc_put_device(dev)
                                             - dereference llcp_sock->dev
  - release_sock()
  - return -ERRNO

nfc_llcp_send_ui_frame() checked llcp_sock->local outside of the
lock, which is a racy and ineffective check.  Instead, its caller,
llcp_sock_sendmsg(), should perform the check inside lock_sock().

Reported-and-tested-by: [email protected]
Fixes: b874dec ("NFC: Implement LLCP connection less Tx path")
Cc: <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
matttbe pushed a commit that referenced this issue Jan 26, 2024
Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs")
for s390x, add support for 64-bit pointers to kfuncs for LoongArch.
Since the infrastructure is already implemented in BPF core, the only
thing that needs to be done is to override bpf_jit_supports_far_kfunc_call().

Before this change, several test_verifier tests failed:

  # ./test_verifier | grep # | grep FAIL
  #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL
  #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL
  #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL
  #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL
  #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL
  #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL
  #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL
  #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL
  #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL
  #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL
  #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL
  #486/p map_kptr: ref: reference state created and released on xchg FAIL

This is because the kfuncs in the loaded module are far away from
__bpf_call_base:

  ffff800002009440 t bpf_kfunc_call_test_fail1    [bpf_testmod]
  9000000002e128d8 T __bpf_call_base

The offset relative to __bpf_call_base does NOT fit in s32, which breaks
the assumption in BPF core. Enabling bpf_jit_supports_far_kfunc_call() lifts
this limit.
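
The s32 claim can be checked with the two addresses quoted above. This is a small sketch with hypothetical helper names; it only does the arithmetic and does not reflect how the BPF core stores the offset internally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed distance from the call base to a kfunc (wraps modulo 2^64). */
static int64_t kfunc_offset(uint64_t kfunc_addr, uint64_t call_base)
{
    return (int64_t)(kfunc_addr - call_base);
}

/* True if the value is representable as a signed 32-bit immediate. */
static bool fits_s32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

For bpf_kfunc_call_test_fail1 at ffff800002009440 and __bpf_call_base at 9000000002e128d8, fits_s32(kfunc_offset(...)) is false, whereas a kfunc a few kilobytes from the base would pass.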

Note that to reproduce the above result, tools/testing/selftests/bpf/config
should be applied, and the test should be run with JIT and unprivileged
BPF enabled.

With this change, the test_verifier tests now all pass:

  # ./test_verifier
  ...
  Summary: 777 PASSED, 0 SKIPPED, 0 FAILED

Tested-by: Tiezhu Yang <[email protected]>
Signed-off-by: Hengqi Chen <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
matttbe pushed a commit that referenced this issue Sep 6, 2024
The start_kthread() and stop_thread() code was not always called with the
interface_lock held. This means that the kthread variable could be
unexpectedly changed, causing kthread_stop() to be called on it when it
should not have been. Running:

 while true; do
   rtla timerlat top -u -q & PID=$!;
   sleep 5;
   kill -INT $PID;
   sleep 0.001;
   kill -TERM $PID;
   wait $PID;
  done

Causing the following OOPS:

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 RIP: 0010:hrtimer_active+0x58/0x300
 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
 RSP: 0018:ffff88811d97f940 EFLAGS: 00010202
 RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b
 RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28
 RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60
 R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d
 R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28
 FS:  0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0
 Call Trace:
  <TASK>
  ? die_addr+0x40/0xa0
  ? exc_general_protection+0x154/0x230
  ? asm_exc_general_protection+0x26/0x30
  ? hrtimer_active+0x58/0x300
  ? __pfx_mutex_lock+0x10/0x10
  ? __pfx_locks_remove_file+0x10/0x10
  hrtimer_cancel+0x15/0x40
  timerlat_fd_release+0x8e/0x1f0
  ? security_file_release+0x43/0x80
  __fput+0x372/0xb10
  task_work_run+0x11e/0x1f0
  ? _raw_spin_lock+0x85/0xe0
  ? __pfx_task_work_run+0x10/0x10
  ? poison_slab_object+0x109/0x170
  ? do_exit+0x7a0/0x24b0
  do_exit+0x7bd/0x24b0
  ? __pfx_migrate_enable+0x10/0x10
  ? __pfx_do_exit+0x10/0x10
  ? __pfx_read_tsc+0x10/0x10
  ? ktime_get+0x64/0x140
  ? _raw_spin_lock_irq+0x86/0xe0
  do_group_exit+0xb0/0x220
  get_signal+0x17ba/0x1b50
  ? vfs_read+0x179/0xa40
  ? timerlat_fd_read+0x30b/0x9d0
  ? __pfx_get_signal+0x10/0x10
  ? __pfx_timerlat_fd_read+0x10/0x10
  arch_do_signal_or_restart+0x8c/0x570
  ? __pfx_arch_do_signal_or_restart+0x10/0x10
  ? vfs_read+0x179/0xa40
  ? ksys_read+0xfe/0x1d0
  ? __pfx_ksys_read+0x10/0x10
  syscall_exit_to_user_mode+0xbc/0x130
  do_syscall_64+0x74/0x110
  ? __pfx___rseq_handle_notify_resume+0x10/0x10
  ? __pfx_ksys_read+0x10/0x10
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? syscall_exit_to_user_mode+0x116/0x130
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  entry_SYSCALL_64_after_hwframe+0x71/0x79
 RIP: 0033:0x7ff0070eca9c
 Code: Unable to access opcode bytes at 0x7ff0070eca72.
 RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c
 RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003
 RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0
 R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003
 R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008
  </TASK>
 Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
 ---[ end trace 0000000000000000 ]---

This happens because kthread_stop() is mistakenly called on a user-space
thread, making it "exit" before it actually exits.

Since kthreads are created based on global behavior, use a cpumask to track
which CPUs have kthreads running, so that they can be shut down before
proceeding to do new work.
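
The idea of tracking running kthreads in a mask can be sketched as a small user-space model. This is not the kernel implementation: the class `KthreadMask` and the function `stop_all` are hypothetical names, and a plain integer stands in for a cpumask. The point it illustrates is that a thread is only stopped if its bit was set, so a user-space thread that was never recorded in the mask is left alone.

```python
# User-space model of a per-CPU "kthread is running" bitmask.
# All names here are hypothetical and chosen for illustration only.

class KthreadMask:
    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self.mask = 0  # bit N set => a kernel thread was started on CPU N

    def set_cpu(self, cpu):
        """Record that a kernel thread was started on this CPU."""
        self.mask |= 1 << cpu

    def test_and_clear_cpu(self, cpu):
        """Clear the bit for this CPU and report whether it was set."""
        bit = 1 << cpu
        was_set = bool(self.mask & bit)
        self.mask &= ~bit
        return was_set


def stop_all(mask):
    """Return the CPUs whose kernel threads would be stopped."""
    stopped = []
    for cpu in range(mask.nr_cpus):
        # Only CPUs recorded in the mask are stopped; anything else
        # (for example a user-space thread) is skipped.
        if mask.test_and_clear_cpu(cpu):
            stopped.append(cpu)
    return stopped


running = KthreadMask(nr_cpus=4)
running.set_cpu(1)
running.set_cpu(3)
print(stop_all(running))  # [1, 3]
print(stop_all(running))  # []
```

Because the bit is tested and cleared in one step, a second stop request for the same CPU becomes a no-op instead of acting on a stale thread pointer.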

Link: https://lore.kernel.org/all/[email protected]/

This was debugged by using the persistent ring buffer:

Link: https://lore.kernel.org/all/[email protected]/

Note, locking was originally used to fix this, but that proved to cause too
many deadlocks to work around:

  https://lore.kernel.org/linux-trace-kernel/[email protected]/

Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Luis Claudio R. Goncalves" <[email protected]>
Link: https://lore.kernel.org/[email protected]
Fixes: e88ed22 ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>