-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RK3328 : Decoding issues #12
Comments
The patch which was provided below seems to solve the playback start issue
But when stopping playback, we're getting another crash. Here is the associated kernel log :
|
Here is a full kernel log file where this issue happenned : |
Did you modify dma code yourselves? |
No i don't think we changed anything regarding dma. I know @Kwiboo mentionned that recent changes to pl330 seems to have introduced such issues. |
try it |
commit ba80aa9 upstream. This patch closes a long standing race in configfs between the creation of a new symlink in create_link(), while the symlink target's config_item is being concurrently removed via configfs_rmdir(). This can happen because the symlink target's reference is obtained by config_item_get() in create_link() before the CONFIGFS_USET_DROPPING bit set by configfs_detach_prep() during configfs_rmdir() shutdown is actually checked.. This originally manifested itself on ppc64 on v4.8.y under heavy load using ibmvscsi target ports with Novalink API: [ 7877.289863] rpadlpar_io: slot U8247.22L.212A91A-V1-C8 added [ 7879.893760] ------------[ cut here ]------------ [ 7879.893768] WARNING: CPU: 15 PID: 17585 at ./include/linux/kref.h:46 config_item_get+0x7c/0x90 [configfs] [ 7879.893811] CPU: 15 PID: 17585 Comm: targetcli Tainted: G O 4.8.17-customv2.22 #12 [ 7879.893812] task: c00000018a0d3400 task.stack: c0000001f3b40000 [ 7879.893813] NIP: d000000002c664ec LR: d000000002c60980 CTR: c000000000b70870 [ 7879.893814] REGS: c0000001f3b43810 TRAP: 0700 Tainted: G O (4.8.17-customv2.22) [ 7879.893815] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28222242 XER: 00000000 [ 7879.893820] CFAR: d000000002c664bc SOFTE: 1 GPR00: d000000002c60980 c0000001f3b43a90 d000000002c70908 c0000000fbc06820 GPR04: c0000001ef1bd900 0000000000000004 0000000000000001 0000000000000000 GPR08: 0000000000000000 0000000000000001 d000000002c69560 d000000002c66d80 GPR12: c000000000b70870 c00000000e798700 c0000001f3b43ca0 c0000001d4949d40 GPR16: c00000014637e1c0 0000000000000000 0000000000000000 c0000000f2392940 GPR20: c0000001f3b43b98 0000000000000041 0000000000600000 0000000000000000 GPR24: fffffffffffff000 0000000000000000 d000000002c60be0 c0000001f1dac490 GPR28: 0000000000000004 0000000000000000 c0000001ef1bd900 c0000000f2392940 [ 7879.893839] NIP [d000000002c664ec] config_item_get+0x7c/0x90 [configfs] [ 7879.893841] LR [d000000002c60980] check_perm+0x80/0x2e0 [configfs] [ 7879.893842] Call Trace: [ 7879.893844] [c0000001f3b43ac0] [d000000002c60980] check_perm+0x80/0x2e0 [configfs] [ 7879.893847] [c0000001f3b43b10] [c000000000329770] do_dentry_open+0x2c0/0x460 [ 7879.893849] [c0000001f3b43b70] [c000000000344480] path_openat+0x210/0x1490 [ 7879.893851] [c0000001f3b43c80] [c00000000034708c] do_filp_open+0xfc/0x170 [ 7879.893853] [c0000001f3b43db0] [c00000000032b5bc] do_sys_open+0x1cc/0x390 [ 7879.893856] [c0000001f3b43e30] [c000000000009584] system_call+0x38/0xec [ 7879.893856] Instruction dump: [ 7879.893858] 409d0014 38210030 e8010010 7c0803a6 4e800020 3d220000 e94981e0 892a0000 [ 7879.893861] 2f890000 409effe0 39200001 992a0000 <0fe00000> 4bffffd0 60000000 60000000 [ 7879.893866] ---[ end trace 14078f0b3b5ad0aa ]--- To close this race, go ahead and obtain the symlink's target config_item reference only after the existing CONFIGFS_USET_DROPPING check succeeds. This way, if configfs_rmdir() wins create_link() will return -ENONET, and if create_link() wins configfs_rmdir() will return -EBUSY. Reported-by: Bryant G. Ly <[email protected]> Tested-by: Bryant G. Ly <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
The power domain change did not have much affect but I have not seen this issue with latest kernel updates. #20 might be related to this issue. |
…ns[] When adding GP-1-28 port pin support it was forgotten to remove the CLKOUT pin from the list of pins that are not associated with a GPIO port in pinmux_pins[]. This results in a warning when reading the pinctrl files in sysfs as the CLKOUT pin is still added as a none GPIO pin. Fix this by removing the duplicated entry which is no longer needed. ~ # cat /sys/kernel/debug/pinctrl/e6060000.pin-controller/pinconf-pins [ 89.432081] ------------[ cut here ]------------ [ 89.436904] Pin 496 is not in bias info list [ 89.441252] WARNING: CPU: 1 PID: 456 at drivers/pinctrl/sh-pfc/core.c:408 sh_pfc_pin_to_bias_reg+0xb0/0xb8 [ 89.451002] CPU: 1 PID: 456 Comm: cat Not tainted 4.16.0-rc1-arm64-renesas-00048-gdfafc344a4f24dde #12 [ 89.460394] Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) [ 89.468910] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 89.473747] pc : sh_pfc_pin_to_bias_reg+0xb0/0xb8 [ 89.478495] lr : sh_pfc_pin_to_bias_reg+0xb0/0xb8 [ 89.483241] sp : ffff00000aff3ab0 [ 89.486587] x29: ffff00000aff3ab0 x28: ffff00000893c698 [ 89.491955] x27: ffff000008ad7d98 x26: 0000000000000000 [ 89.497323] x25: ffff8006fb3f5028 x24: ffff8006fb3f5018 [ 89.502690] x23: 0000000000000001 x22: 00000000000001f0 [ 89.508057] x21: ffff8006fb3f5018 x20: ffff000008bef000 [ 89.513423] x19: 0000000000000000 x18: ffffffffffffffff [ 89.518790] x17: 0000000000006c4a x16: ffff000008d67c98 [ 89.524157] x15: 0000000000000001 x14: ffff00000896ca98 [ 89.529524] x13: 00000000cce5f611 x12: ffff8006f8d3b5a8 [ 89.534891] x11: ffff00000981e000 x10: ffff000008befa08 [ 89.540258] x9 : ffff8006f9b987a0 x8 : ffff000008befa08 [ 89.545625] x7 : ffff000008137094 x6 : 0000000000000000 [ 89.550991] x5 : 0000000000000000 x4 : 0000000000000001 [ 89.556357] x3 : 0000000000000007 x2 : 0000000000000007 [ 89.561723] x1 : 1ff24f80f1818600 x0 : 0000000000000000 [ 89.567091] Call trace: [ 89.569561] sh_pfc_pin_to_bias_reg+0xb0/0xb8 [ 89.573960] r8a7795_pinmux_get_bias+0x30/0xc0 [ 89.578445] sh_pfc_pinconf_get+0x1e0/0x2d8 [ 89.582669] pin_config_get_for_pin+0x20/0x30 [ 89.587067] pinconf_generic_dump_one+0x180/0x1c8 [ 89.591815] pinconf_generic_dump_pins+0x84/0xd8 [ 89.596476] pinconf_pins_show+0xc8/0x130 [ 89.600528] seq_read+0xe4/0x510 [ 89.603789] full_proxy_read+0x60/0x90 [ 89.607576] __vfs_read+0x30/0x140 [ 89.611010] vfs_read+0x90/0x170 [ 89.614269] SyS_read+0x60/0xd8 [ 89.617443] __sys_trace_return+0x0/0x4 [ 89.621314] ---[ end trace 99c8d0d39c13e794 ]--- Fixes: 82d2de5 ("pinctrl: sh-pfc: r8a7795: Add GP-1-28 port pin support") Reviewed-and-tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Niklas Söderlund <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
Scheduler debug stats include newlines that display out of alignment when prefixed by timestamps. For example, the dmesg utility: % echo t > /proc/sysrq-trigger % dmesg ... [ 83.124251] runnable tasks: S task PID tree-key switches prio wait-time sum-exec sum-sleep ----------------------------------------------------------------------------------------------------------- At the same time, some syslog utilities (like rsyslog by default) don't like the additional newlines control characters, saving lines like this to /var/log/messages: Mar 16 16:02:29 localhost kernel: #012runnable tasks:#12 S task PID tree-key ... ^^^^ ^^^^ Clean these up by moving newline characters to their own SEQ_printf invocation. This leaves the /proc/sched_debug unchanged, but brings the entire output into alignment when prefixed: % echo t > /proc/sysrq-trigger % dmesg ... [ 62.410368] runnable tasks: [ 62.410368] S task PID tree-key switches prio wait-time sum-exec sum-sleep [ 62.410369] ----------------------------------------------------------------------------------------------------------- [ 62.410369] I kworker/u12:0 5 1932.215593 332 120 0.000000 3.621252 0.000000 0 0 / and no escaped control characters from rsyslog in /var/log/messages: Mar 16 16:15:06 localhost kernel: runnable tasks: Mar 16 16:15:06 localhost kernel: S task PID tree-key ... Signed-off-by: Joe Lawrence <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
The seg6_build_state() function is called with RCU read lock held, so we cannot use GFP_KERNEL. This patch uses GFP_ATOMIC instead. [ 92.770271] ============================= [ 92.770628] WARNING: suspicious RCU usage [ 92.770921] 4.16.0-rc4+ #12 Not tainted [ 92.771277] ----------------------------- [ 92.771585] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section! [ 92.772279] [ 92.772279] other info that might help us debug this: [ 92.772279] [ 92.773067] [ 92.773067] rcu_scheduler_active = 2, debug_locks = 1 [ 92.773514] 2 locks held by ip/2413: [ 92.773765] #0: (rtnl_mutex){+.+.}, at: [<00000000e5461720>] rtnetlink_rcv_msg+0x441/0x4d0 [ 92.774377] #1: (rcu_read_lock){....}, at: [<00000000df4f161e>] lwtunnel_build_state+0x59/0x210 [ 92.775065] [ 92.775065] stack backtrace: [ 92.775371] CPU: 0 PID: 2413 Comm: ip Not tainted 4.16.0-rc4+ #12 [ 92.775791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014 [ 92.776608] Call Trace: [ 92.776852] dump_stack+0x7d/0xbc [ 92.777130] __schedule+0x133/0xf00 [ 92.777393] ? unwind_get_return_address_ptr+0x50/0x50 [ 92.777783] ? __sched_text_start+0x8/0x8 [ 92.778073] ? rcu_is_watching+0x19/0x30 [ 92.778383] ? kernel_text_address+0x49/0x60 [ 92.778800] ? __kernel_text_address+0x9/0x30 [ 92.779241] ? unwind_get_return_address+0x29/0x40 [ 92.779727] ? pcpu_alloc+0x102/0x8f0 [ 92.780101] _cond_resched+0x23/0x50 [ 92.780459] __mutex_lock+0xbd/0xad0 [ 92.780818] ? pcpu_alloc+0x102/0x8f0 [ 92.781194] ? seg6_build_state+0x11d/0x240 [ 92.781611] ? save_stack+0x9b/0xb0 [ 92.781965] ? __ww_mutex_wakeup_for_backoff+0xf0/0xf0 [ 92.782480] ? seg6_build_state+0x11d/0x240 [ 92.782925] ? lwtunnel_build_state+0x1bd/0x210 [ 92.783393] ? ip6_route_info_create+0x687/0x1640 [ 92.783846] ? ip6_route_add+0x74/0x110 [ 92.784236] ? inet6_rtm_newroute+0x8a/0xd0 Fixes: 6c8702c ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: David Lebrun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
[ Upstream commit 2bbea6e ] when mounting an ISO filesystem sometimes (very rarely) the system hangs because of a race condition between two tasks. PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount" #0 [ffff880078447ae0] __schedule at ffffffff8168d605 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995 #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod] #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50 #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3 #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs] #8 [ffff880078447da8] mount_bdev at ffffffff81202570 #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs] #10 [ffff880078447e28] mount_fs at ffffffff81202d09 #11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f #12 [ffff880078447ea8] do_mount at ffffffff81220fee #13 [ffff880078447f28] sys_mount at ffffffff812218d6 #14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49 RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246 RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010 RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010 R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040 R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b This task was trying to mount the cdrom. It allocated and configured a super_block struct and owned the write-lock for the super_block->s_umount rwsem. While exclusively owning the s_umount lock, it called sr_block_ioctl and waited to acquire the global sr_mutex lock. PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd" #0 [ffff880078417898] __schedule at ffffffff8168d605 #1 [ffff880078417900] schedule at ffffffff8168dc59 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838 #4 [ffff8800784179d0] down_read at ffffffff8168cde0 #5 [ffff8800784179e8] get_super at ffffffff81201cc7 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de #7 [ffff880078417a40] flush_disk at ffffffff8123a94b #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom] #10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod] #11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86 #12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65 #13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b #14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7 #15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf #16 [ffff880078417d00] do_last at ffffffff8120d53d #17 [ffff880078417db0] path_openat at ffffffff8120e6b2 #18 [ffff880078417e48] do_filp_open at ffffffff8121082b #19 [ffff880078417f18] do_sys_open at ffffffff811fdd33 #20 [ffff880078417f70] sys_open at ffffffff811fde4e #21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49 RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246 RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000 RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70 RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020 R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010 ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b This task tried to open the cdrom device, the sr_block_open function acquired the global sr_mutex lock. The call to check_disk_change() then saw an event flag indicating a possible media change and tried to flush any cached data for the device. As part of the flush, it tried to acquire the super_block->s_umount lock associated with the cdrom device. This was the same super_block as created and locked by the previous task. The first task acquires the s_umount lock and then the sr_mutex_lock; the second task acquires the sr_mutex_lock and then the s_umount lock. This patch fixes the issue by moving check_disk_change() out of cdrom_open() and let the caller take care of it. Signed-off-by: Maurizio Lombardi <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…sion commit 8afb1d2 upstream. Commit 40f70c0 ("serial: sh-sci: add locking to console write function to avoid SMP lockup") copied the strategy to avoid locking problems in conjuncture with the console from the UART8250 driver. Instead using directly spin_{try}lock_irqsave(), local_irq_save() followed by spin_{try}lock() was used. While this is correct on mainline, for -rt it is a problem. spin_{try}lock() will check if it is running in a valid context. Since the local_irq_save() has already been executed, the context has changed and spin_{try}lock() will complain. The reason why spin_{try}lock() complains is that on -rt the spin locks are turned into mutexes and therefore can sleep. Sleeping with interrupts disabled is not valid. BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995 in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0 CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12 Hardware name: Generic RZ/G1 (Flattened Device Tree) Backtrace: [<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c) r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0 [<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94) [<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194) r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000 [<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74) r5:c06f4d60 r4:c06f4d60 [<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118) r5:c06f4d60 r4:c06f4d60 [<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124) r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798 r4:c06b9914 r3:c02576e4 [<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430) r10:c06d30d8 r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798 r4:c06d2798 r3:00000028 [<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0) r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027 r4:00000003 [<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30) r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c r4:c0062fa8 [<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14) [<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84) [<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc) r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523 r4:00000006 [<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178) r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c r4:dfbcf680 [<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248) r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000 r4:df525180 [<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c) r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00 r4:decac000 [<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c) r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00 Cc: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> [dw: Backported to 4.4.] Signed-off-by: Daniel Wagner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe #4 [9a068ba8] 440 bytes lock_acquire at 3576ee #5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 #6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 #7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 #8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 #9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c #10 [9a069dd0] 168 bytes wake_up_worker at 23f97c #11 [9a069e78] 200 bytes insert_work at 23fc2e #12 [9a069f40] 648 bytes __queue_work at 2487c0 #13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 #14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 #15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 #16 [9a06a3a0] 288 bytes __blk_mq_delay_run_hw_queue at 158168c #17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c #18 [9a06a580] 184 bytes blk_mq_sched_insert_requests at 15a2192 #19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a #20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 #21 [9a06acf8] 320 bytes schedule at 219e476 #22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac #23 [9a06b130] 408 bytes wait_for_common at 21a1706 #24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 #25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 #26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 #27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 #28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 #29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e #30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 #31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 #32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca #33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce #34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 #35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de #36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e #37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 rockchip-linux#38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 rockchip-linux#39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 rockchip-linux#40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 rockchip-linux#41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 rockchip-linux#42 [9a06e518] 88 bytes do_writepages at 73fe6a rockchip-linux#43 [9a06e570] 872 bytes __writeback_single_inode at a20cb6 rockchip-linux#44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 rockchip-linux#45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 rockchip-linux#46 [9a06ec98] 928 bytes wb_writeback at a2500e rockchip-linux#47 [9a06f038] 848 bytes wb_do_writeback at a260ae rockchip-linux#48 [9a06f388] 536 bytes wb_workfn at a28228 rockchip-linux#49 [9a06f5a0] 1088 bytes process_one_work at 24a234 rockchip-linux#50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 rockchip-linux#51 [9a06fe40] 104 bytes kthread at 26545a rockchip-linux#52 [9a06fea8] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
info->nr_rings isn't adjusted in case of ENOMEM error from negotiate_mq(). This leads to kernel panic in error path. Typical call stack involving panic - #8 page_fault at ffffffff8175936f [exception RIP: blkif_free_ring+33] RIP: ffffffffa0149491 RSP: ffff8804f7673c08 RFLAGS: 00010292 ... #9 blkif_free at ffffffffa0149aaa [xen_blkfront] #10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront] #11 blkback_changed at ffffffffa014ea8b [xen_blkfront] #12 xenbus_otherend_changed at ffffffff81424670 #13 backend_changed at ffffffff81426dc3 #14 xenwatch_thread at ffffffff81422f29 #15 kthread at ffffffff810abe6a #16 ret_from_fork at ffffffff81754078 Cc: [email protected] Fixes: 7ed8ce1 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs") Signed-off-by: Manjunath Patil <[email protected]> Acked-by: Roger Pau Monné <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]>
Commit 9b6f7e1 ("mm: rework memcg kernel stack accounting") will result in fork failing if allocating a kernel stack for a task in dup_task_struct exceeds the kernel memory allowance for that cgroup. Unfortunately, it also results in a crash. This is due to the code jumping to free_stack and calling free_thread_stack when the memcg kernel stack charge fails, but without tsk->stack pointing at the freshly allocated stack. This in turn results in the vfree_atomic in free_thread_stack oopsing with a backtrace like this: #5 [ffffc900244efc88] die at ffffffff8101f0ab #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86 #7 [ffffc900244efce0] general_protection at ffffffff818ff082 [exception RIP: llist_add_batch+7] RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000 RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0 R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37 #10 [ffffc900244efe98] _do_fork at ffffffff810884e0 #11 [ffffc900244eff10] sys_vfork at ffffffff810887ff #12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43 RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948 RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421 RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00 R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040 R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018 ORIG_RAX: 000000000000003a CS: 0033 SS: 002b The simplest fix is to assign tsk->stack right where it is allocated. Link: http://lkml.kernel.org/r/[email protected] Fixes: 9b6f7e1 ("mm: rework memcg kernel stack accounting") Signed-off-by: Rik van Riel <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
commit a007734 upstream. The bus master was not removed after unloading the module or unbinding the driver. That lead to oopses like this [ 127.842987] Unable to handle kernel paging request at virtual address bf01d04c [ 127.850646] pgd = 70e3cd9a [ 127.853698] [bf01d04c] *pgd=8f908811, *pte=00000000, *ppte=00000000 [ 127.860412] Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM [ 127.866668] Modules linked in: bq27xxx_battery overlay [last unloaded: omap_hdq] [ 127.874542] CPU: 0 PID: 1022 Comm: w1_bus_master1 Not tainted 4.19.0-rc4-00001-g2d51da718324 #12 [ 127.883819] Hardware name: Generic OMAP36xx (Flattened Device Tree) [ 127.890441] PC is at 0xbf01d04c [ 127.893798] LR is at w1_search_process_cb+0x4c/0xfc [ 127.898956] pc : [<bf01d04c>] lr : [<c05f9580>] psr: a0070013 [ 127.905609] sp : cf885f48 ip : bf01d04c fp : ddf1e11c [ 127.911132] r10: cf8fe040 r9 : c05f8d00 r8 : cf8fe040 [ 127.916656] r7 : 000000f0 r6 : cf8fe02c r5 : cf8fe000 r4 : cf8fe01c [ 127.923553] r3 : c05f8d00 r2 : 000000f0 r1 : cf8fe000 r0 : dde1ef10 [ 127.930450] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 127.938018] Control: 10c5387d Table: 8f8f0019 DAC: 00000051 [ 127.944091] Process w1_bus_master1 (pid: 1022, stack limit = 0x9135699f) [ 127.951171] Stack: (0xcf885f48 to 0xcf886000) [ 127.955810] 5f40: cf8fe000 00000000 cf884000 cf8fe090 000003e8 c05f8d00 [ 127.964477] 5f60: dde5fc34 c05f9700 ddf1e100 ddf1e540 cf884000 cf8fe000 c05f9694 00000000 [ 127.973114] 5f80: dde5fc34 c01499a4 00000000 ddf1e540 c0149874 00000000 00000000 00000000 [ 127.981781] 5fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000 [ 127.990447] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 127.999114] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [ 128.007781] [<c05f9580>] (w1_search_process_cb) from [<c05f9700>] (w1_process+0x6c/0x118) [ 128.016479] [<c05f9700>] (w1_process) from [<c01499a4>] (kthread+0x130/0x148) [ 128.024047] [<c01499a4>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c) [ 128.031677] Exception stack(0xcf885fb0 to 0xcf885ff8) [ 128.037017] 5fa0: 00000000 00000000 00000000 00000000 [ 128.045684] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 128.054351] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 128.061340] Code: bad PC value [ 128.064697] ---[ end trace af066e33c0e14119 ]--- Cc: <[email protected]> Signed-off-by: Andreas Kemnade <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit c5a94f4 ] It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Ido Schimmel says: ==================== Add drop monitor for offloaded data paths Users have several ways to debug the kernel and understand why a packet was dropped. For example, using drop monitor and perf. Both utilities trace kfree_skb(), which is the function called when a packet is freed as part of a failure. The information provided by these tools is invaluable when trying to understand the cause of a packet loss. In recent years, large portions of the kernel data path were offloaded to capable devices. Today, it is possible to perform L2 and L3 forwarding in hardware, as well as tunneling (IP-in-IP and VXLAN). Different TC classifiers and actions are also offloaded to capable devices, at both ingress and egress. However, when the data path is offloaded it is not possible to achieve the same level of introspection since packets are dropped by the underlying device and never reach the kernel. This patchset aims to solve this by allowing users to monitor packets that the underlying device decided to drop along with relevant metadata such as the drop reason and ingress port. The above is achieved by exposing a fundamental capability of devices capable of data path offloading - packet trapping. In much the same way as drop monitor registers its probe function with the kfree_skb() tracepoint, the device is instructed to pass to the CPU (trap) packets that it decided to drop in various places in the pipeline. The configuration of the device to pass such packets to the CPU is performed using devlink, as it is not specific to a port, but rather to a device. In the future, we plan to control the policing of such packets using devlink, in order not to overwhelm the CPU. While devlink is used as the control path, the dropped packets are passed along with metadata to drop monitor, which reports them to userspace as netlink events. This allows users to use the same interface for the monitoring of both software and hardware drops. Logically, the solution looks as follows: Netlink event: Packet w/ metadata Or a summary of recent drops ^ | Userspace | +---------------------------------------------------+ Kernel | | +-------+--------+ | | | drop_monitor | | | +-------^--------+ | | | +----+----+ | | Kernel's Rx path | devlink | (non-drop traps) | | +----^----+ ^ | | +-----------+ | +-------+-------+ | | | Device driver | | | +-------^-------+ Kernel | +---------------------------------------------------+ Hardware | | Trapped packet | +--+---+ | | | ASIC | | | +------+ In order to reduce the patch count, this patchset only includes integration with netdevsim. A follow-up patchset will add devlink-trap support in mlxsw. Patches #1-#7 extend drop monitor to also monitor hardware originated drops. Patches #8-#10 add the devlink-trap infrastructure. Patches #11-#12 add devlink-trap support in netdevsim. Patches #13-#16 add tests for the generic infrastructure over netdevsim. Example ======= Instantiate netdevsim --------------------- List supported traps -------------------- netdevsim/netdevsim10: name source_mac_is_multicast type drop generic true action drop group l2_drops name vlan_tag_mismatch type drop generic true action drop group l2_drops name ingress_vlan_filter type drop generic true action drop group l2_drops name ingress_spanning_tree_filter type drop generic true action drop group l2_drops name port_list_is_empty type drop generic true action drop group l2_drops name port_loopback_filter type drop generic true action drop group l2_drops name fid_miss type exception generic false action trap group l2_drops name blackhole_route type drop generic true action drop group l3_drops name ttl_value_is_too_small type exception generic true action trap group l3_drops name tail_drop type drop generic true action drop group buffer_drops Enable a trap ------------- Query statistics ---------------- netdevsim/netdevsim10: name blackhole_route type drop generic true action trap group l3_drops stats: rx: bytes 7384 packets 52 Monitor dropped packets ----------------------- dropwatch> set alertmode packet Setting alert mode Alert mode successfully set dropwatch> set sw true setting software drops monitoring to 1 dropwatch> set hw true setting hardware drops monitoring to 1 dropwatch> start Enabling monitoring... Kernel monitoring activated. Issue Ctrl-C to stop monitoring drop at: ttl_value_is_too_small (l3_drops) origin: hardware input port ifindex: 55 input port name: eth0 timestamp: Mon Aug 12 10:52:20 2019 445911505 nsec protocol: 0x800 length: 142 original length: 142 drop at: ip6_mc_input+0x8b8/0xef8 (0xffffffff9e2bb0e8) origin: software input port ifindex: 4 timestamp: Mon Aug 12 10:53:37 2019 024444587 nsec protocol: 0x86dd length: 110 original length: 110 Future plans ============ * Provide more drop reasons as well as more metadata * Add dropmon support to libpcap, so that tcpdump/tshark could specifically listen on dropmon traffic, instead of capturing all netlink packets via nlmon interface Changes in v3: * Place test with the rest of the netdevsim tests * Fix test to load netdevsim module * Move devlink helpers from the test to devlink_lib.sh. Will be used by mlxsw tests * Re-order netdevsim includes in alphabetical order * Fix reverse xmas tree in netdevsim * Remove double include in netdevsim Changes in v2: * Use drop monitor to report dropped packets instead of devlink * Add drop monitor patches * Add test cases ==================== Signed-off-by: David S. Miller <[email protected]>
The mkey_table xarray is touched by the reg_mr_callback() function which is called from a hard irq. Thus all other uses of xa_lock must use the _irq variants. WARNING: inconsistent lock state 5.4.0-rc1 #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30 {IN-HARDIRQ-W} state was registered at: lock_acquire+0xe1/0x200 _raw_spin_lock_irqsave+0x35/0x50 reg_mr_callback+0x2dd/0x450 [mlx5_ib] mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core] mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core] [..] Possible unsafe locking scenario: CPU0 ---- lock(&(&xa->xa_lock)->rlock#3); <Interrupt> lock(&(&xa->xa_lock)->rlock#3); *** DEADLOCK *** 2 locks held by python3/343: #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs] #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs] stack backtrace: CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x86/0xca print_usage_bug.cold.50+0x2e5/0x355 mark_lock+0x871/0xb50 ? match_held_lock+0x20/0x250 ? check_usage_forwards+0x240/0x240 __lock_acquire+0x7de/0x23a0 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xae/0xb50 ? mark_held_locks+0xb0/0xb0 ? find_held_lock+0xca/0xf0 lock_acquire+0xe1/0x200 ? xa_erase+0x12/0x30 _raw_spin_lock+0x2a/0x40 ? xa_erase+0x12/0x30 xa_erase+0x12/0x30 mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib] uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs] uverbs_free_mw+0x1a/0x20 [ib_uverbs] destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs] [..] Fixes: 0417791 ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases") Link: https://lore.kernel.org/r/[email protected] Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
The following deadlock was captured. The first process is holding 'kernfs_mutex' and hung by io. The io was staging in 'r1conf.pending_bio_list' of raid1 device, this pending bio list would be flushed by second process 'md127_raid1', but it was hung by 'kernfs_mutex'. Using sysfs_notify_dirent_safe() to replace sysfs_notify() can fix it. There were other sysfs_notify() invoked from io path, removed all of them. PID: 40430 TASK: ffff8ee9c8c65c40 CPU: 29 COMMAND: "probe_file" #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06 #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6 #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs] #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs] #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs] #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs] #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs] #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs] #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7 #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93 #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968 #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2 #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445 #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a #16 [ffffb87c4df37958] new_slab at ffffffff9a251430 #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85 #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89 #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10 #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854 #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377 #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118 #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83 #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619 #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566 #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787 #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949 #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad RIP: 00007f617a5f2905 RSP: 00007f607334f838 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007f6064044b20 RCX: 00007f617a5f2905 RDX: 00007f6064044b20 RSI: 00007f6064044b20 RDI: 00007f6064005890 RBP: 00007f6064044aa0 R8: 0000000000000030 R9: 000000000000011c R10: 0000000000000013 R11: 0000000000000246 R12: 00007f606417e6d0 R13: 00007f6064044aa0 R14: 00007f6064044b10 R15: 00000000ffffffff ORIG_RAX: 0000000000000006 CS: 0033 SS: 002b PID: 927 TASK: ffff8f15ac5dbd80 CPU: 42 COMMAND: "md127_raid1" #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06 #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013 #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83 #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696 #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5 #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1] #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348 #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005 #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344 Signed-off-by: Junxiao Bi <[email protected]> Signed-off-by: Song Liu <[email protected]>
[ Upstream commit 27dada0 ] The defer ops code has been finishing items in the wrong order -- if a top level defer op creates items A and B, and finishing item A creates more defer ops A1 and A2, we'll put the new items on the end of the chain and process them in the order A B A1 A2. This is kind of weird, since it's convenient for programmers to be able to think of A and B as an ordered sequence where all the sub-tasks for A must finish before we move on to B, e.g. A A1 A2 D. Right now, our log intent items are not so complex that this matters, but this will become important for the atomic extent swapping patchset. In order to maintain correct reference counting of extents, we have to unmap and remap extents in that order, and we want to complete that work before moving on to the next range that the user wants to swap. This patch fixes defer ops to satsify that requirement. The primary symptom of the incorrect order was noticed in an early performance analysis of the atomic extent swap code. An astonishingly large number of deferred work items accumulated when userspace requested an atomic update of two very fragmented files. The cause of this was traced to the same ordering bug in the inner loop of xfs_defer_finish_noroll. If the ->finish_item method of a deferred operation queues new deferred operations, those new deferred ops are appended to the tail of the pending work list. To illustrate, say that a caller creates a transaction t0 with four deferred operations D0-D3. The first thing defer ops does is roll the transaction to t1, leaving us with: t1: D0(t0), D1(t0), D2(t0), D3(t0) Let's say that finishing each of D0-D3 will create two new deferred ops. After finish D0 and roll, we'll have the following chain: t2: D1(t0), D2(t0), D3(t0), d4(t1), d5(t1) d4 and d5 were logged to t1. Notice that while we're about to start work on D1, we haven't actually completed all the work implied by D0 being finished. So far we've been careful (or lucky) to structure the dfops callers such that D1 doesn't depend on d4 or d5 being finished, but this is a potential logic bomb. There's a second problem lurking. Let's see what happens as we finish D1-D3: t3: D2(t0), D3(t0), d4(t1), d5(t1), d6(t2), d7(t2) t4: D3(t0), d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3) t5: d4(t1), d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) Let's say that d4-d11 are simple work items that don't queue any other operations, which means that we can complete each d4 and roll to t6: t6: d5(t1), d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) t7: d6(t2), d7(t2), d8(t3), d9(t3), d10(t4), d11(t4) ... t11: d10(t4), d11(t4) t12: d11(t4) <done> When we try to roll to transaction #12, we're holding defer op d11, which we logged way back in t4. This means that the tail of the log is pinned at t4. If the log is very small or there are a lot of other threads updating metadata, this means that we might have wrapped the log and cannot get roll to t11 because there isn't enough space left before we'd run into t4. Let's shift back to the original failure. I mentioned before that I discovered this flaw while developing the atomic file update code. In that scenario, we have a defer op (D0) that finds a range of file blocks to remap, creates a handful of new defer ops to do that, and then asks to be continued with however much work remains. So, D0 is the original swapext deferred op. The first thing defer ops does is rolls to t1: t1: D0(t0) We try to finish D0, logging d1 and d2 in the process, but can't get all the work done. We log a done item and a new intent item for the work that D0 still has to do, and roll to t2: t2: D0'(t1), d1(t1), d2(t1) We roll and try to finish D0', but still can't get all the work done, so we log a done item and a new intent item for it, requeue D0 a second time, and roll to t3: t3: D0''(t2), d1(t1), d2(t1), d3(t2), d4(t2) If it takes 48 more rolls to complete D0, then we'll finally dispense with D0 in t50: t50: D<fifty primes>(t49), d1(t1), ..., d102(t50) We then try to roll again to get a chain like this: t51: d1(t1), d2(t1), ..., d101(t50), d102(t50) ... t152: d102(t50) <done> Notice that in rolling to transaction rockchip-linux#51, we're holding on to a log intent item for d1 that was logged in transaction #1. This means that the tail of the log is pinned at t1. If the log is very small or there are a lot of other threads updating metadata, this means that we might have wrapped the log and cannot roll to t51 because there isn't enough space left before we'd run into t1. This is of course problem #2 again. But notice the third problem with this scenario: we have 102 defer ops tied to this transaction! Each of these items are backed by pinned kernel memory, which means that we risk OOM if the chains get too long. Yikes. Problem #1 is a subtle logic bomb that could hit someone in the future; problem #2 applies (rarely) to the current upstream, and problem #3 applies to work under development. This is not how incremental deferred operations were supposed to work. The dfops design of logging in the same transaction an intent-done item and a new intent item for the work remaining was to make it so that we only have to juggle enough deferred work items to finish that one small piece of work. Deferred log item recovery will find that first unfinished work item and restart it, no matter how many other intent items might follow it in the log. Therefore, it's ok to put the new intents at the start of the dfops chain. For the first example, the chains look like this: t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0) t3: d5(t1), D1(t0), D2(t0), D3(t0) ... t9: d9(t7), D3(t0) t10: D3(t0) t11: d10(t10), d11(t10) t12: d11(t10) For the second example, the chains look like this: t1: D0(t0) t2: d1(t1), d2(t1), D0'(t1) t3: d2(t1), D0'(t1) t4: D0'(t1) t5: d1(t4), d2(t4), D0''(t4) ... t148: D0<50 primes>(t147) t149: d101(t148), d102(t148) t150: d102(t148) <done> This actually sucks more for pinning the log tail (we try to roll to t10 while holding an intent item that was logged in t1) but we've solved problem #1. We've also reduced the maximum chain length from: sum(all the new items) + nr_original_items to: max(new items that each original item creates) + nr_original_items This solves problem #3 by sharply reducing the number of defer ops that can be attached to a transaction at any given time. The change makes the problem of log tail pinning worse, but is improvement we need to solve problem #2. Actually solving #2, however, is left to the next patch. Note that a subsequent analysis of some hard-to-trigger reflink and COW livelocks on extremely fragmented filesystems (or systems running a lot of IO threads) showed the same symptoms -- uncomfortably large numbers of incore deferred work items and occasional stalls in the transaction grant code while waiting for log reservations. I think this patch and the next one will also solve these problems. As originally written, the code used list_splice_tail_init instead of list_splice_init, so change that, and leave a short comment explaining our actions. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 0e435be ] crq->msgs could be NULL if the previous reset did not complete after freeing crq->msgs. Check for NULL before dereferencing them. Snippet of call trace: ... ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ ibmvnic 30000003 env3 (unregistering): Releasing CRQ BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0000000000c1a30 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic] CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G E 5.10.0-rc1+ #12 Workqueue: events __ibmvnic_reset [ibmvnic] NIP: c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400 REGS: c00000000d05b7a0 TRAP: 0380 Tainted: G E (5.10.0-rc1+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002480 XER: 20040000 CFAR: c0000000000c19ec IRQMASK: 0 GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2 GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280 GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002 GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550 GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800 NIP [c0000000000c1a30] memset+0x68/0x104 LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic] Call Trace: [c00000000d05ba30] [0000000000000800] 0x800 (unreliable) [c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic] [c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic] [c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800 [c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520 [c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0 [c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c Fixes: 032c5e8 ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Found by leak sanitizer: ``` ==1632594==ERROR: LeakSanitizer: detected memory leaks Direct leak of 21 byte(s) in 1 object(s) allocated from: #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369 #2 0x556701d70589 in perf_env__cpuid util/env.c:465 #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14 #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83 #5 0x556701d8f78b in evsel__config util/evsel.c:1366 #6 0x556701ef5872 in evlist__config util/record.c:108 #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112 #8 0x556701cacd07 in run_test tests/builtin-test.c:236 #9 0x556701cacfac in test_and_print tests/builtin-test.c:265 #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402 #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559 #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323 #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377 #14 0x556701d3be90 in run_argv tools/perf/perf.c:421 #15 0x556701d3c3f8 in main tools/perf/perf.c:537 #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s). ``` Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
[ Upstream commit 99d4850 ] Found by leak sanitizer: ``` ==1632594==ERROR: LeakSanitizer: detected memory leaks Direct leak of 21 byte(s) in 1 object(s) allocated from: #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369 #2 0x556701d70589 in perf_env__cpuid util/env.c:465 #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14 #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83 #5 0x556701d8f78b in evsel__config util/evsel.c:1366 #6 0x556701ef5872 in evlist__config util/record.c:108 #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112 #8 0x556701cacd07 in run_test tests/builtin-test.c:236 #9 0x556701cacfac in test_and_print tests/builtin-test.c:265 #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402 #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559 #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323 #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377 #14 0x556701d3be90 in run_argv tools/perf/perf.c:421 #15 0x556701d3c3f8 in main tools/perf/perf.c:537 #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s). ``` Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
commit 17b17fc upstream. While trying to get the subpage blocksize tests running, I hit the following panic on generic/476 assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:229 kernel BUG at fs/btrfs/subpage.c:229! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 1 PID: 1453 Comm: fsstress Not tainted 6.4.0-rc7+ #12 Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20230301gitf80f052277c8-26.fc38 03/01/2023 pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : btrfs_subpage_assert+0xbc/0xf0 lr : btrfs_subpage_assert+0xbc/0xf0 Call trace: btrfs_subpage_assert+0xbc/0xf0 btrfs_subpage_clear_checked+0x38/0xc0 btrfs_page_clear_checked+0x48/0x98 btrfs_truncate_block+0x5d0/0x6a8 btrfs_cont_expand+0x5c/0x528 btrfs_write_check.isra.0+0xf8/0x150 btrfs_buffered_write+0xb4/0x760 btrfs_do_write_iter+0x2f8/0x4b0 btrfs_file_write_iter+0x1c/0x30 do_iter_readv_writev+0xc8/0x158 do_iter_write+0x9c/0x210 vfs_iter_write+0x24/0x40 iter_file_splice_write+0x224/0x390 direct_splice_actor+0x38/0x68 splice_direct_to_actor+0x12c/0x260 do_splice_direct+0x90/0xe8 generic_copy_file_range+0x50/0x90 vfs_copy_file_range+0x29c/0x470 __arm64_sys_copy_file_range+0xcc/0x498 invoke_syscall.constprop.0+0x80/0xd8 do_el0_svc+0x6c/0x168 el0_svc+0x50/0x1b0 el0t_64_sync_handler+0x114/0x120 el0t_64_sync+0x194/0x198 This happens because during btrfs_cont_expand we'll get a page, set it as mapped, and if it's not Uptodate we'll read it. However between the read and re-locking the page we could have called release_folio() on the page, but left the page in the file mapping. release_folio() can clear the page private, and thus further down we blow up when we go to modify the subpage bits. Fix this by putting the set_page_extent_mapped() after the read. This is safe because read_folio() will call set_page_extent_mapped() before it does the read, and then if we clear page private but leave it on the mapping we're completely safe re-setting set_page_extent_mapped(). With this patch I can now run generic/476 without panicing. CC: [email protected] # 6.1+ Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Fix an error detected by memory sanitizer: ``` ==4033==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6 #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2 #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9 #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32 #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6 #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8 #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8 #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9 #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8 #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15 #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9 #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8 #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5 #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9 #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11 #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8 #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2 #17 0x55fb0f941dab in main tools/perf/perf.c:535:3 ``` Fixes: 7b723db ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
Syzkaller reported the following issue: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2807 at mm/vmalloc.c:3247 __vmalloc_node_range (mm/vmalloc.c:3361) Modules linked in: CPU: 0 PID: 2807 Comm: repro Not tainted 6.6.0-rc2+ #12 Hardware name: Generic DT based system unwind_backtrace from show_stack (arch/arm/kernel/traps.c:258) show_stack from dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) dump_stack_lvl from __warn (kernel/panic.c:633 kernel/panic.c:680) __warn from warn_slowpath_fmt (./include/linux/context_tracking.h:153 kernel/panic.c:700) warn_slowpath_fmt from __vmalloc_node_range (mm/vmalloc.c:3361 (discriminator 3)) __vmalloc_node_range from vmalloc_user (mm/vmalloc.c:3478) vmalloc_user from xskq_create (net/xdp/xsk_queue.c:40) xskq_create from xsk_setsockopt (net/xdp/xsk.c:953 net/xdp/xsk.c:1286) xsk_setsockopt from __sys_setsockopt (net/socket.c:2308) __sys_setsockopt from ret_fast_syscall (arch/arm/kernel/entry-common.S:68) xskq_get_ring_size() uses struct_size() macro to safely calculate the size of struct xsk_queue and q->nentries of desc members. But the syzkaller repro was able to set q->nentries with the value initially taken from copy_from_sockptr() high enough to return SIZE_MAX by struct_size(). The next PAGE_ALIGN(size) is such case will overflow the size_t value and set it to 0. This will trigger WARN_ON_ONCE in vmalloc_user() -> __vmalloc_node_range(). The issue is reproducible on 32-bit arm kernel. Fixes: 9f78bf3 ("xsk: support use vaddr as ring") Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Reported-by: [email protected] Closes: https://lore.kernel.org/all/[email protected]/T/ Signed-off-by: Andrew Kanner <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: [email protected] Acked-by: Magnus Karlsson <[email protected]> Link: https://syzkaller.appspot.com/bug?extid=fae676d3cf469331fc89 Link: https://lore.kernel.org/bpf/[email protected]
[ Upstream commit a154f5f ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Since the 4.4 kernel merge and updating to it, as well updating mpp, we have various playback crashes.
Some files will play about right, some other result in a black screen.
Checking kernel log will show :
and then after a few secs we get
I have uploaded a short sample where i see this issue
https://www.dropbox.com/s/grxtyzyzq8dj6i4/My%20Little%20Pony-%20Friendship%20is%20Magic%20-%20S05E04%20-%20Bloom%20and%20Gloom%20SDTV.mp4?dl=0
The text was updated successfully, but these errors were encountered: