syzkaller: memory leak in subflow_create_ctx
#356
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this issue
Mar 1, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the newly created children will be released by inet_csk_complete_hashdance(). In that scenario, without any further hint, the ulp release callback will keep the ulp context alive, expecting that the msk socket will later free it. Anyway the dying child is not linked to any msk socket, and the context will be leaked, as reported by Christoph. Address the issue explicitly releasing the context in the critical scenario.

Reported-by: Christoph Paasch <[email protected]>
Closes: multipath-tcp/mptcp_net-next#356
Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Paolo Abeni <[email protected]>
matttbe
pushed a commit
that referenced
this issue
Mar 6, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the newly created children will be released by inet_csk_complete_hashdance(). In that scenario, without any further hint, the ulp release callback will keep the ulp context alive, expecting that the msk socket will later free it. Anyway the dying child is not linked to any msk socket, and the context will be leaked, as reported by Christoph. Address the issue explicitly releasing the context in the critical scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <[email protected]>
Closes: #356
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
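The ownership problem described in this commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the kernel patch: every `toy_*` name and the `msk_will_free` flag are hypothetical stand-ins, and the real logic lives in the MPTCP subflow code. The model shows a release hook that defers freeing the context to an owner that never exists, and an explicit release on the lost-race path that closes the gap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-subflow ULP context. */
struct toy_ctx {
    bool msk_will_free;      /* release hook assumes an msk frees it later */
};

/* Hypothetical stand-in for the newly created child socket. */
struct toy_child {
    struct toy_ctx *ctx;
};

static struct toy_child *toy_child_new(void)
{
    struct toy_child *child = calloc(1, sizeof(*child));

    if (!child)
        return NULL;
    child->ctx = calloc(1, sizeof(*child->ctx));
    if (!child->ctx) {
        free(child);
        return NULL;
    }
    child->ctx->msk_will_free = true;
    return child;
}

/* Model of the release hook: detach the context and free it only when
 * no msk is expected to do so. Returns true if the context was freed. */
static bool toy_ulp_release(struct toy_child *child)
{
    struct toy_ctx *ctx = child->ctx;

    child->ctx = NULL;
    if (!ctx || ctx->msk_will_free)
        return false;
    free(ctx);
    return true;
}

/* Model of the lost hash-insert race: the child is dropped without ever
 * being linked to an msk. Returns the context left allocated after the
 * child is gone (NULL when nothing is orphaned). */
static struct toy_ctx *toy_lost_race(struct toy_child *child, bool apply_fix)
{
    struct toy_ctx *ctx;

    assert(child != NULL);
    ctx = child->ctx;
    if (apply_fix && ctx) {
        /* Explicit release in the critical scenario. */
        free(ctx);
        child->ctx = NULL;
        ctx = NULL;
    }
    if (toy_ulp_release(child))
        ctx = NULL;
    free(child);
    return ctx;
}
```

In this model, calling `toy_lost_race(child, false)` hands back an orphaned context that nothing else would free, while `toy_lost_race(child, true)` returns NULL.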
matttbe pushed three more commits that referenced this issue (one on Mar 6, 2023 and two on Mar 7, 2023), each carrying the same commit message as above.
@cpaasch we think this issue has been fixed, but that is hard to confirm without a reproducer.
matttbe pushed one more commit that referenced this issue on Mar 7, 2023, and jenkins-tessares pushed four (two on Mar 8, 2023 and two on Mar 9, 2023), each carrying the same commit message as above.
matttbe
added a commit
that referenced
this issue
Mar 9, 2023
This reverts commit ccd4c33. According to Paolo, this one is causing issue #356 and indirectly #370. Closes: #356 Signed-off-by: Matthieu Baerts <[email protected]>
matttbe
pushed a commit
that referenced
this issue
Dec 1, 2023
With latest upstream llvm18, the following test cases failed:

    $ ./test_progs -j
    #13/2 bpf_cookie/multi_kprobe_link_api:FAIL
    #13/3 bpf_cookie/multi_kprobe_attach_api:FAIL
    #13 bpf_cookie:FAIL
    #77 fentry_fexit:FAIL
    #78/1 fentry_test/fentry:FAIL
    #78 fentry_test:FAIL
    #82/1 fexit_test/fexit:FAIL
    #82 fexit_test:FAIL
    #112/1 kprobe_multi_test/skel_api:FAIL
    #112/2 kprobe_multi_test/link_api_addrs:FAIL
    [...]
    #112 kprobe_multi_test:FAIL
    #356/17 test_global_funcs/global_func17:FAIL
    #356 test_global_funcs:FAIL

Further analysis shows llvm upstream patch [1] is responsible for the above failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c, without [1], the asm code is:

    0000000000000400 <bpf_fentry_test7>:
     400: f3 0f 1e fa           endbr64
     404: e8 00 00 00 00        callq 0x409 <bpf_fentry_test7+0x9>
     409: 48 89 f8              movq %rdi, %rax
     40c: c3                    retq
     40d: 0f 1f 00              nopl (%rax)
    ...

and with [1], the asm code is:

    0000000000005d20 <bpf_fentry_test7.specialized.1>:
     5d20: e8 00 00 00 00       callq 0x5d25 <bpf_fentry_test7.specialized.1+0x5>
     5d25: c3                   retq
    ...

and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7> and this caused test failures for #13/#77 etc. except #356.

For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog looks like:

    0000000000000000 <global_func17>:
     0: b4 00 00 00 2a 00 00 00   w0 = 0x2a
     1: 95 00 00 00 00 00 00 00   exit
    ...

which passed verification while the test itself expects a verification failure.

Let us add 'barrier_var' style asm code in both places to prevent function specialization which caused selftests failure.

[1] llvm/llvm-project#72903

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
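For readers unfamiliar with the 'barrier_var' idiom mentioned in this commit message, a minimal sketch follows. The macro has the same shape as `barrier_var()` in libbpf's `bpf_helpers.h`; the function `toy_identity()` is a hypothetical stand-in, not the kernel's `bpf_fentry_test7()`. The empty asm statement tells the compiler the variable may have been modified, so the return value can no longer be treated as a known function of the argument. Whether that stops a particular compiler version from specializing a function is not something this sketch demonstrates.

```c
#include <assert.h>

/* Empty inline asm with a read-write register operand (GCC/Clang only).
 * The compiler must assume `var` changed, which blocks value propagation. */
#define barrier_var(var) __asm__ volatile("" : "+r"(var))

/* Hypothetical small test target: returns its argument unchanged, but the
 * barrier hides that fact from the optimizer. */
__attribute__((noinline)) long toy_identity(long arg)
{
    barrier_var(arg);
    return arg;
}
```

The barrier emits no instructions, so at run time `toy_identity()` still returns its argument.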
matttbe
pushed a commit
that referenced
this issue
Nov 8, 2024
KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows:

    [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
    [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
    ...
    [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
    [ 33.861816] Tainted: [W]=WARN
    [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
    [ 33.861822] Call Trace:
    [ 33.861826] <TASK>
    [ 33.861829] dump_stack_lvl+0x66/0x90
    [ 33.861838] print_report+0xce/0x620
    [ 33.861853] kasan_report+0xda/0x110
    [ 33.862794] kasan_check_range+0xfd/0x1a0
    [ 33.862799] __asan_memset+0x23/0x40
    [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
    [ 33.867135] dev_attr_show+0x43/0xc0
    [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0
    [ 33.867155] seq_read_iter+0x3f8/0x1140
    [ 33.867173] vfs_read+0x76c/0xc50
    [ 33.867198] ksys_read+0xfb/0x1d0
    [ 33.867214] do_syscall_64+0x90/0x160
    ...
    [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
    [ 33.867358] kasan_save_stack+0x33/0x50
    [ 33.867364] kasan_save_track+0x17/0x60
    [ 33.867367] __kasan_kmalloc+0x87/0x90
    [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
    [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu]
    [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu]
    [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
    [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
    [ 33.869608] local_pci_probe+0xda/0x180
    [ 33.869614] pci_device_probe+0x43f/0x6b0

Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block.

Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit.

The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write.

v2:
 * Drop impossible v3_0 case. (Mario)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Fixes: 41cec40 ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <[email protected]>
Cc: Evan Quan <[email protected]>
Cc: Wenyou Yang <[email protected]>
Cc: Alex Deucher <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703)
Cc: [email protected] # v6.6+
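The "largest possible allocation" approach described in this commit message can be sketched in plain C. This is an illustration under stated assumptions rather than the amdgpu code: the `toy_*` names are hypothetical, and the two struct sizes are simply the 152 and 168 byte figures quoted in the report. The point is that a table sized for the largest layout stays in bounds whichever layout is later initialized into it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical metrics layouts; only their sizes matter here. */
struct toy_metrics_small { unsigned char raw[152]; };
struct toy_metrics_large { unsigned char raw[168]; };

#define TOY_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Size the shared table for the largest layout that may be written. */
static size_t toy_metrics_table_size(void)
{
    return TOY_MAX(sizeof(struct toy_metrics_small),
                   sizeof(struct toy_metrics_large));
}

static void *toy_alloc_metrics_table(void)
{
    return calloc(1, toy_metrics_table_size());
}

/* Initialize the table for one layout; the caller passes that layout's
 * size, which must not exceed the allocated table size. */
static void toy_init_metrics(void *table, size_t layout_size)
{
    assert(layout_size <= toy_metrics_table_size());
    memset(table, 0, layout_size);
}
```

A helper that maps each part version to its exact size, as the commit message suggests for later, would avoid over-allocating for the smaller layouts.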
Head: 89ce2ca (02/28)
3038497 (02/27)
0150d5b
Trace:
No reproducers yet.
Kconfig: Kconfig_k8_kmemleak.txt