Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RK3328 : device locks on reboot #14

Closed
LongChair opened this issue Jun 15, 2017 · 6 comments
Closed

RK3328 : device locks on reboot #14

LongChair opened this issue Jun 15, 2017 · 6 comments

Comments

@LongChair
Copy link
Collaborator

Quite often when rebooting with reboot, it seems that the device gets locked.

It seems it won't happen if we boot in console mode only.
The only way to restart the device then is to pull the plug.

I think kwiboo saw it stuck on "Starting kernel", @Kwiboo can you confirm ?

@LongChair
Copy link
Collaborator Author

Here is a log thta @Kwiboo posted some time ago :

LibreELEC:~ # [  307.820028] reboot: Restarting system
INFO:    PSCI Power Domain Map:
INFO:      Domain Node : Level 2, parent_node -1, State ON (0x0)
INFO:      Domain Node : Level 1, parent_node 0, State ON (0x0)
INFO:      Domain Node : Level 0, parent_node 0, State ON (0x0)
INFO:      Domain Node : Level 0, parent_node 0, State ON (0x0)
INFO:      CPU Node : MPID 0x0, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x1, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x2, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x3, parent_node 1, State ON (0x0)
DDR version 1.06 20170424
In
SRX
DDR3
786MHz
Bus Width=32 Col=10 Bank=8 Row=15/15 CS=2 Die Bus-Width=16 Size=2048MB
ddrconfig:6
OUT
Boot1 Release Time: 2017-05-18, version: 2.43
ChipType = 0x11, 186
SdmmcInit=2 0
BootCapSize=2000
UserCapSize=15028MB
FwPartOffset=2000 , 2000
SdmmcInit=0 0
BootCapSize=0
UserCapSize=15193MB
FwPartOffset=2000 , 0
StorageInit ok = 59216
Raw SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit ret = 0, SecureMode = 0
LoadTrustBL
No find bl30.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0x62060
RunBL31 0x10000
NOTICE:  BL31: v1.3(release):aa660e1
NOTICE:  BL31: Built : 16:38:31, Mar  9 2017
NOTICE:  BL31: Release version: v1.1
INFO:    ARM GICv2 driver initialized
INFO:    boot cpu mask: 1
INFO:    plat_rockchip_pmu_init: pd status 0xe
INFO:    BL31: Initializing runtime services
INFO:    BL31: Initializing BL32
INF [0x0] TEE-CORE:init_primary_helper:336: Initializing (1.1.0-68-gf89f3ad-dev #4 Fri Mar  3 00:43:47 UTC 2017 aarch64)
INF [0x0] TEE-CORE:init_primary_helper:337: Release version: 1.1
INF [0x0] TEE-CORE:init_teecore:83: teecore inits done
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9
U-Boot 2017.05 (May 27 2017 - 12:46:11 +0200)
Model: Rockchip RK3328 EVB
DRAM:  2 GiB
MMC:   rksdmmc@ff500000: 1, rksdmmc@ff520000: 0
*** Warning - bad CRC, using default environment
In:    serial@ff130000
Out:   serial@ff130000
Err:   serial@ff130000
Net:   Net Initialization Skipped
No ethernet found.
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
reading /extlinux/extlinux.conf
173 bytes read in 3 ms (55.7 KiB/s)
1:      LibreELEC
Retrieving file: /Image
reading /Image
20028584 bytes read in 1601 ms (11.9 MiB/s)
append: earlyprintk console=ttyFIQ0 console=tty0 boot=/dev/mmcblk0p1 disk=/dev/mmcblk0p2 debugging textmode ssh
Retrieving file: /rk3328-box.dtb
reading /rk3328-box.dtb
39414 bytes read in 6 ms (6.3 MiB/s)
## Flattened Device Tree blob at 01f00000
   Booting using the fdt blob at 0x1f00000
   Loading Device Tree to 000000007df3f000, end 000000007df4b9f5 ... OK
Starting kernel ...

@omegamoon
Copy link
Collaborator

kernel logging of a reboot:

[   23.692333] reboot: Restarting system
INFO:    PSCI Power Domain Map:
INFO:      Domain Node : Level 2, parent_node -1, State ON (0x0)
INFO:      Domain Node : Level 1, parent_node 0, State ON (0x0)
INFO:      Domain Node : Level 0, parent_node 0, State ON (0x0)
INFO:      Domain Node : Level 0, parent_node 0, State ON (0x0)
INFO:      CPU Node : MPID 0x0, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x1, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x2, parent_node 1, State ON (0x0)
INFO:      CPU Node : MPID 0x3, parent_node 1, State ON (0x0)
DDR version 1.06 20170424
In
SRX
DDR3
933MHz
Bus Width=32 Col=10 Bank=8 Row=15 CS=1 Die Bus-Width=16 Size=1024MB
ddrconfig:1
OUT
Boot1 Release Time: 2017-05-18, version: 2.43
ChipType = 0x11, 184
SdmmcInit=2 0
BootCapSize=2000
UserCapSize=7456MB
FwPartOffset=2000 , 2000
SdmmcInit=0 2
StorageInit ok = 35286
Raw SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit ret = 0, SecureMode = 0
LoadTrustBL
No find bl30.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0x57bcc
RunBL31 0x10000
�NOTICE:  BL31: v1.3(debug):3cb76c2
NOTICE:  BL31: Built : 21:10:34, Feb 23 2017
NOTICE:  BL31: Release version: v1.1
INFO:    ARM GICv2 driver initialized
INFO:    boot cpu mask: 1
INFO:    plat_rockchip_pmu_init: pd status 0xe
INFO:    BL31: Initializing runtime services
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2017.02-RK3328-06-g84e0a7b (Apr 20 2017 - 19:05:18)

CPU: rk3328
cpu version = 1
CPU's clock information:
    arm pll = 600000000HZ
    general pll = 800000000HZ
    ddr pll = 1848000000HZ
    codec pll = 594000000HZ
    new pll = 594000000HZ
Board:	Rockchip platform Board
Uboot as second level loader
DRAM:  Found dram banks: 1
Adding bank:0000000000200000(000000003fe00000)
128 MiB
GIC CPU mask = 0x00000001
SdmmcInit = 0 20
SdmmcInit = 2 0
storage init OK!
Using default environment

GetParam
remotectl v0.1
pwm freq=0x11b3dc
pwm_freq_nstime=0x35d
Load FDT from boot image.
no key node
pmic:rk805
can't find dts node for pwm0

@yanghanxing
Copy link

Did you meet this issue on Rock64?
I can't find on my RK3328 EVB board.

@LongChair
Copy link
Collaborator Author

We met that issue with Z28, so far we didn't recieve Rock64 boards yet.

@LongChair
Copy link
Collaborator Author

I didn't seem to have those locks anymore with the latest changes, will close it for now, we'll reopen it if we get some proper way to reporduce it.

@yanghanxing
Copy link

OK,It‘s good news.

Kwiboo pushed a commit that referenced this issue Mar 1, 2018
... before the first use of kaiser_enabled as otherwise funky
things happen:

  about to get started...
  (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
  (XEN) Pagetable walk from ffff88022a449090:
  (XEN)  L4[0x110] = 0000000229e0e067 0000000000001e0e
  (XEN)  L3[0x008] = 0000000000000000 ffffffffffffffff
  (XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08
  entry.o#create_bounce_frame+0x135/0x14d
  (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
  (XEN) ----[ Xen-4.9.1_02-3.21  x86_64  debug=n   Not tainted ]----
  (XEN) CPU:    0
  (XEN) RIP:    e033:[<ffffffff81007460>]
  (XEN) RFLAGS: 0000000000000286   EM: 1   CONTEXT: pv guest (d0v0)

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Kwiboo pushed a commit that referenced this issue Jul 3, 2018
[ Upstream commit 2bbea6e ]

when mounting an ISO filesystem sometimes (very rarely)
the system hangs because of a race condition between two tasks.

PID: 6766   TASK: ffff88007b2a6dd0  CPU: 0   COMMAND: "mount"
 #0 [ffff880078447ae0] __schedule at ffffffff8168d605
 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
 #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
 #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
 #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
 #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
 #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
 #8 [ffff880078447da8] mount_bdev at ffffffff81202570
 #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007fd9ea914e9a  RSP: 00007ffd5d9bf648  RFLAGS: 00010246
    RAX: 00000000000000a5  RBX: ffffffff81698c49  RCX: 0000000000000010
    RDX: 00007fd9ec2bc210  RSI: 00007fd9ec2bc290  RDI: 00007fd9ec2bcf30
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000010
    R10: 00000000c0ed0001  R11: 0000000000000206  R12: 00007fd9ec2bc040
    R13: 00007fd9eb6b2380  R14: 00007fd9ec2bc210  R15: 00007fd9ec2bcf30
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b

This task was trying to mount the cdrom.  It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.

PID: 6785   TASK: ffff880078720fb0  CPU: 0   COMMAND: "systemd-udevd"
 #0 [ffff880078417898] __schedule at ffffffff8168d605
 #1 [ffff880078417900] schedule at ffffffff8168dc59
 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
 #4 [ffff8800784179d0] down_read at ffffffff8168cde0
 #5 [ffff8800784179e8] get_super at ffffffff81201cc7
 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
 #7 [ffff880078417a40] flush_disk at ffffffff8123a94b
 #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
#20 [ffff880078417f70] sys_open at ffffffff811fde4e
#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007f29438b0c20  RSP: 00007ffc76624b78  RFLAGS: 00010246
    RAX: 0000000000000002  RBX: ffffffff81698c49  RCX: 0000000000000000
    RDX: 00007f2944a5fa70  RSI: 00000000000a0800  RDI: 00007f2944a5fa70
    RBP: 00007f2944a5f540   R8: 0000000000000000   R9: 0000000000000020
    R10: 00007f2943614c40  R11: 0000000000000246  R12: ffffffff811fde4e
    R13: ffff880078417f78  R14: 000000000000000c  R15: 00007f2944a4b010
    ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b

This task tried to open the cdrom device, the sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.

The first task acquires the s_umount lock and then the sr_mutex_lock;
the second task acquires the sr_mutex_lock and then the s_umount lock.

This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and let the caller take care of it.

Signed-off-by: Maurizio Lombardi <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Kwiboo pushed a commit that referenced this issue Nov 5, 2018
Call it 'nr', as in this context it should be expressive enough, i.e.:

  # perf trace -e sched:*waking/nr=8,call-graph=fp/
     0.000 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
                                       try_to_wake_up ([kernel.kallsyms])
                                       sched_clock ([kernel.kallsyms])
     3.933 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001
                                       try_to_wake_up ([kernel.kallsyms])
                                       sched_clock ([kernel.kallsyms])
     3.970 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
                                       try_to_wake_up ([kernel.kallsyms])
                                       __libc_write (/usr/lib64/libpthread-2.26.so)
    20.069 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
                                       try_to_wake_up ([kernel.kallsyms])
                                       __libc_write (/usr/lib64/libpthread-2.26.so)
    37.170 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
                                       try_to_wake_up ([kernel.kallsyms])
                                       __libc_write (/usr/lib64/libpthread-2.26.so)
    53.267 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
                                       try_to_wake_up ([kernel.kallsyms])
                                       __libc_write (/usr/lib64/libpthread-2.26.so)
    70.365 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003
                                       try_to_wake_up ([kernel.kallsyms])
                                       __libc_write (/usr/lib64/libpthread-2.26.so)
    75.781 Web Content/3649 sched:sched_waking:comm=JS Helper pid=3670 prio=120 target_cpu=000
                                       try_to_wake_up ([kernel.kallsyms])
                                       try_to_wake_up ([kernel.kallsyms])
                                       wake_up_q ([kernel.kallsyms])
                                       futex_wake ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       pthread_cond_signal@@GLIBC_2.3.2 (/usr/lib64/libpthread-2.26.so)
  #

  # perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
     0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> trace:3367 [120]
     0.046 :0/0 sched:sched_switch:swapper/1:0 [120] S ==> kworker/u16:58:2722 [120]
   570.670 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ef00 len=66
                                       __dev_queue_xmit ([kernel.kallsyms])
  1106.141 jbd2/dm-0-8/476 block:block_plug:[jbd2/dm-0-8]
  1106.175 jbd2/dm-0-8/476 block:block_unplug:[jbd2/dm-0-8] 1
  1618.088 kworker/u16:30/2694 block:block_plug:[kworker/u16:30]
  1810.000 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051ef00 len=52
                                       __dev_queue_xmit ([kernel.kallsyms])
  3857.974 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051f900 len=52
                                       __dev_queue_xmit ([kernel.kallsyms])
  4790.277 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
  4790.448 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8]
  #

The global --max-events has precendence:

  # trace --max-events 3 -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/
     0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> qemu-system-x86:2252 [120]
     0.029 qemu-system-x8/2252 sched:sched_switch:qemu-system-x86:2252 [120] D ==> swapper/0:0 [120]
    58.047 DNS Res~er #14/31661 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff9346966af100 len=84
                                       __dev_queue_xmit ([kernel.kallsyms])
                                       __libc_send (/usr/lib64/libpthread-2.26.so)
  #

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Kwiboo pushed a commit that referenced this issue Dec 15, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 rockchip-linux#38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 rockchip-linux#39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 rockchip-linux#40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 rockchip-linux#41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 rockchip-linux#42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 rockchip-linux#43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 rockchip-linux#44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 rockchip-linux#45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 rockchip-linux#46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 rockchip-linux#47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 rockchip-linux#48 [9a06f388]  536 bytes  wb_workfn at a28228
 rockchip-linux#49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 rockchip-linux#50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 rockchip-linux#51 [9a06fe40]  104 bytes  kthread at 26545a
 rockchip-linux#52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
Kwiboo pushed a commit that referenced this issue Dec 15, 2018
info->nr_rings isn't adjusted in case of ENOMEM error from
negotiate_mq(). This leads to kernel panic in error path.

Typical call stack involving panic -
 #8 page_fault at ffffffff8175936f
    [exception RIP: blkif_free_ring+33]
    RIP: ffffffffa0149491  RSP: ffff8804f7673c08  RFLAGS: 00010292
 ...
 #9 blkif_free at ffffffffa0149aaa [xen_blkfront]
 #10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront]
 #11 blkback_changed at ffffffffa014ea8b [xen_blkfront]
 #12 xenbus_otherend_changed at ffffffff81424670
 #13 backend_changed at ffffffff81426dc3
 #14 xenwatch_thread at ffffffff81422f29
 #15 kthread at ffffffff810abe6a
 #16 ret_from_fork at ffffffff81754078

Cc: [email protected]
Fixes: 7ed8ce1 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs")
Signed-off-by: Manjunath Patil <[email protected]>
Acked-by: Roger Pau Monné <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
Kwiboo pushed a commit that referenced this issue Dec 15, 2018
There are actually two kinds of discard merge:

- one is the normal discard merge, just like normal read/write request,
and call it single-range discard

- another is the multi-range discard, queue_max_discard_segments(rq->q) > 1

For the former case, queue_max_discard_segments(rq->q) is 1, and we
should handle this kind of discard merge like the normal read/write
request.

This patch fixes the following kernel panic issue[1], which is caused by
not removing the single-range discard request from elevator queue.

Guangwu has one raid discard test case, in which this issue is a bit
easier to trigger, and I verified that this patch can fix the kernel
panic issue in Guangwu's test case.

[1] kernel panic log from Jens's report

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
 PGD 0 P4D 0.
 Oops: 0000 [#1] SMP PTI
 CPU: 37 PID: 763 Comm: kworker/37:1H Not tainted \
4.20.0-rc3-00649-ge64d9a554a91-dirty #14  Hardware name: Wiwynn \
Leopard-Orv2/Leopard-DDR BW, BIOS LBM08   03/03/2017       Workqueue: kblockd \
blk_mq_run_work_fn                                            RIP: \
0010:blk_mq_get_driver_tag+0x81/0x120                                       Code: 24 \
10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 00 00 00 \
0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 20 72 37 \
f6 87 b0 00 00 00 02  RSP: 0018:ffffc90004aabd30 EFLAGS: 00010246                     \
  RAX: 0000000000000003 RBX: ffff888465ea1300 RCX: ffffc90004aabde8
 RDX: 00000000ffffffff RSI: ffffc90004aabde8 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff888465ea1348 R09: 0000000000000000
 R10: 0000000000001000 R11: 00000000ffffffff R12: ffff888465ea1300
 R13: 0000000000000000 R14: ffff888465ea1348 R15: ffff888465d10000
 FS:  0000000000000000(0000) GS:ffff88846f9c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000148 CR3: 000000000220a003 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  blk_mq_dispatch_rq_list+0xec/0x480
  ? elv_rb_del+0x11/0x30
  blk_mq_do_dispatch_sched+0x6e/0xf0
  blk_mq_sched_dispatch_requests+0xfa/0x170
  __blk_mq_run_hw_queue+0x5f/0xe0
  process_one_work+0x154/0x350
  worker_thread+0x46/0x3c0
  kthread+0xf5/0x130
  ? process_one_work+0x350/0x350
  ? kthread_destroy_worker+0x50/0x50
  ret_from_fork+0x1f/0x30
 Modules linked in: sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel \
kvm switchtec irqbypass iTCO_wdt iTCO_vendor_support efivars cdc_ether usbnet mii \
cdc_acm i2c_i801 lpc_ich mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq \
button sch_fq_codel nfsd nfs_acl lockd grace auth_rpcgss oid_registry sunrpc nvme \
nvme_core fuse sg loop efivarfs autofs4  CR2: 0000000000000148                        \

 ---[ end trace 340a1fb996df1b9b ]---
 RIP: 0010:blk_mq_get_driver_tag+0x81/0x120
 Code: 24 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 \
00 00 00 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 \
20 72 37 f6 87 b0 00 00 00 02

Fixes: 445251d ("blk-mq: fix discard merge with scheduler attached")
Reported-by: Jens Axboe <[email protected]>
Cc: Guangwu Zhang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jianchao Wang <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Kwiboo pushed a commit that referenced this issue Dec 1, 2019
Fatal read errors are worth warning about, unless of course the device
was just unplugged from the machine - something that's a rather normal
occurrence when the igb/igc adapter is located on a Thunderbolt dock. So,
let's only WARN() if there's a fatal read error while the device is
still present.

This fixes the following WARN splat that's been appearing whenever I
unplug my Caldigit TS3 Thunderbolt dock from my laptop:

  igb 0000:09:00.0 enp9s0: PCIe link lost
  ------------[ cut here ]------------
  igb: Failed to read reg 0x18!
  WARNING: CPU: 7 PID: 516 at
  drivers/net/ethernet/intel/igb/igb_main.c:756 igb_rd32+0x57/0x6a [igb]
  Modules linked in: igb dca thunderbolt fuse vfat fat elan_i2c mei_wdt
  mei_hdcp i915 wmi_bmof intel_wmi_thunderbolt iTCO_wdt
  iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp joydev
  coretemp crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel
  intel_cstate drm_kms_helper intel_uncore syscopyarea sysfillrect
  sysimgblt fb_sys_fops intel_rapl_perf intel_xhci_usb_role_switch mei_me
  drm roles idma64 i2c_i801 ucsi_acpi typec_ucsi mei intel_lpss_pci
  processor_thermal_device typec intel_pch_thermal intel_soc_dts_iosf
  intel_lpss int3403_thermal thinkpad_acpi wmi int340x_thermal_zone
  ledtrig_audio int3400_thermal acpi_thermal_rel acpi_pad video
  pcc_cpufreq ip_tables serio_raw nvme nvme_core crc32c_intel uas
  usb_storage e1000e i2c_dev
  CPU: 7 PID: 516 Comm: kworker/u16:3 Not tainted 5.2.0-rc1Lyude-Test+ #14
  Hardware name: LENOVO 20L8S2N800/20L8S2N800, BIOS N22ET35W (1.12 ) 04/09/2018
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  RIP: 0010:igb_rd32+0x57/0x6a [igb]
  Code: 87 b8 fc ff ff 48 c7 47 08 00 00 00 00 48 c7 c6 33 42 9b c0 4c 89
  c7 e8 47 45 cd dc 89 ee 48 c7 c7 43 42 9b c0 e8 c1 94 71 dc <0f> 0b eb
  08 8b 00 ff c0 75 b0 eb c8 44 89 e0 5d 41 5c c3 0f 1f 44
  RSP: 0018:ffffba5801cf7c48 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff9e7956608840 RCX: 0000000000000007
  RDX: 0000000000000000 RSI: ffffba5801cf7b24 RDI: ffff9e795e3d6a00
  RBP: 0000000000000018 R08: 000000009dec4a01 R09: ffffffff9e61018f
  R10: 0000000000000000 R11: ffffba5801cf7ae5 R12: 00000000ffffffff
  R13: ffff9e7956608840 R14: ffff9e795a6f10b0 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff9e795e3c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000564317bc4088 CR3: 000000010e00a006 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   igb_release_hw_control+0x1a/0x30 [igb]
   igb_remove+0xc5/0x14b [igb]
   pci_device_remove+0x3b/0x93
   device_release_driver_internal+0xd7/0x17e
   pci_stop_bus_device+0x36/0x75
   pci_stop_bus_device+0x66/0x75
   pci_stop_bus_device+0x66/0x75
   pci_stop_and_remove_bus_device+0xf/0x19
   trim_stale_devices+0xc5/0x13a
   ? __pm_runtime_resume+0x6e/0x7b
   trim_stale_devices+0x103/0x13a
   ? __pm_runtime_resume+0x6e/0x7b
   trim_stale_devices+0x103/0x13a
   acpiphp_check_bridge+0xd8/0xf5
   acpiphp_hotplug_notify+0xf7/0x14b
   ? acpiphp_check_bridge+0xf5/0xf5
   acpi_device_hotplug+0x357/0x3b5
   acpi_hotplug_work_fn+0x1a/0x23
   process_one_work+0x1a7/0x296
   worker_thread+0x1a8/0x24c
   ? process_scheduled_works+0x2c/0x2c
   kthread+0xe9/0xee
   ? kthread_destroy_worker+0x41/0x41
   ret_from_fork+0x35/0x40
  ---[ end trace 252bf10352c63d22 ]---

Signed-off-by: Lyude Paul <[email protected]>
Fixes: 47e1669 ("igb/igc: warn when fatal read failure happens")
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Acked-by: Feng Tang <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Kwiboo pushed a commit that referenced this issue Jul 20, 2020
The following deadlock was captured. The first process is holding 'kernfs_mutex'
and hung by io. The io was staging in 'r1conf.pending_bio_list' of raid1 device,
this pending bio list would be flushed by second process 'md127_raid1', but
it was hung by 'kernfs_mutex'. Using sysfs_notify_dirent_safe() to replace
sysfs_notify() can fix it. There were other sysfs_notify() invoked from io
path, removed all of them.

 PID: 40430  TASK: ffff8ee9c8c65c40  CPU: 29  COMMAND: "probe_file"
  #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec
  #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06
  #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6
  #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs]
  #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs]
  #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs]
  #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs]
  #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs]
  #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs]
  #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7
 #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93
 #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968
 #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2
 #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445
 #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f
 #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a
 #16 [ffffb87c4df37958] new_slab at ffffffff9a251430
 #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85
 #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d
 #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89
 #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10
 #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854
 #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377
 #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b
 #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118
 #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83
 #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619
 #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af
 #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566
 #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787
 #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d
 #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e
 #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949
 #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad
     RIP: 00007f617a5f2905  RSP: 00007f607334f838  RFLAGS: 00000246
     RAX: ffffffffffffffda  RBX: 00007f6064044b20  RCX: 00007f617a5f2905
     RDX: 00007f6064044b20  RSI: 00007f6064044b20  RDI: 00007f6064005890
     RBP: 00007f6064044aa0   R8: 0000000000000030   R9: 000000000000011c
     R10: 0000000000000013  R11: 0000000000000246  R12: 00007f606417e6d0
     R13: 00007f6064044aa0  R14: 00007f6064044b10  R15: 00000000ffffffff
     ORIG_RAX: 0000000000000006  CS: 0033  SS: 002b

 PID: 927    TASK: ffff8f15ac5dbd80  CPU: 42  COMMAND: "md127_raid1"
  #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec
  #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06
  #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e
  #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc
  #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013
  #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f
  #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83
  #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a
  #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696
  #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5
 #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c
 #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1]
 #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348
 #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005
 #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344

Signed-off-by: Junxiao Bi <[email protected]>
Signed-off-by: Song Liu <[email protected]>
rubenvb pushed a commit to rubenvb/linux-rockchip that referenced this issue Jan 3, 2021
[ Upstream commit 5f0ce58 ]

Fix the following possible deadlock reported by lockdep disabling BH
running mt76_free_pending_txwi()

================================
WARNING: inconsistent lock state
5.9.0-rc6 Kwiboo#14 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
rmmod/1227 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888156a83530 (&dev->lock#2){+.?.}-{2:2}, at: mt76_dma_cleanup+0x125/0x150 [mt76]
{IN-SOFTIRQ-W} state was registered at:
  __lock_acquire+0x20c/0x6b0
  lock_acquire+0x9d/0x220
  _raw_spin_lock+0x2c/0x70
  mt76_dma_tx_cleanup+0xc7/0x200 [mt76]
  mt76x02_poll_tx+0x31/0xb0 [mt76x02_lib]
  napi_poll+0x3a/0x100
  net_rx_action+0xa8/0x200
  __do_softirq+0xc4/0x430
  asm_call_on_stack+0xf/0x20
  do_softirq_own_stack+0x49/0x60
  irq_exit_rcu+0x9a/0xd0
  common_interrupt+0xa4/0x190
  asm_common_interrupt+0x1e/0x40
irq event stamp: 9915
hardirqs last  enabled at (9915): [<ffffffff8124e286>] __free_pages_ok+0x336/0x3b0
hardirqs last disabled at (9914): [<ffffffff8124e24e>] __free_pages_ok+0x2fe/0x3b0
softirqs last  enabled at (9912): [<ffffffffa03aa672>] mt76_dma_rx_cleanup+0xa2/0x120 [mt76]
softirqs last disabled at (9846): [<ffffffffa03aa5ea>] mt76_dma_rx_cleanup+0x1a/0x120 [mt76]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dev->lock#2);
  <Interrupt>
    lock(&dev->lock#2);

 *** DEADLOCK ***

1 lock held by rmmod/1227:
 #0: ffff88815b5eb240 (&dev->mutex){....}-{3:3}, at: driver_detach+0xb5/0x110

stack backtrace:
CPU: 1 PID: 1227 Comm: rmmod Kdump: loaded Not tainted 5.9.0-rc6-wdn-src+ Kwiboo#14
Hardware name: Dell Inc. Studio XPS 1340/0K183D, BIOS A11 09/08/2009
Call Trace:
 dump_stack+0x77/0xa0
 mark_lock_irq.cold+0x15/0x39
 mark_lock+0x1fc/0x500
 mark_usage+0xc7/0x140
 __lock_acquire+0x20c/0x6b0
 ? find_held_lock+0x2b/0x80
 ? sched_clock_cpu+0xc/0xb0
 lock_acquire+0x9d/0x220
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 _raw_spin_lock+0x2c/0x70
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76x2_cleanup+0x5a/0x70 [mt76x2e]
 mt76x2e_remove+0x18/0x30 [mt76x2e]
 pci_device_remove+0x36/0xa0
 __device_release_driver+0x16c/0x220
 driver_detach+0xcf/0x110
 bus_remove_driver+0x56/0xca
 pci_unregister_driver+0x36/0x80
 __do_sys_delete_module.constprop.0+0x127/0x200
 ? syscall_enter_from_user_mode+0x1d/0x50
 ? trace_hardirqs_on+0x1c/0xe0
 do_syscall_64+0x33/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff0da54e36b
Code: 73 01 c3 48 8b 0d 2d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 0c 00 f7 d8 64 89 01 48

Fixes: dd57a95 ("mt76: move txwi handling code to dma.c, since it is mmio specific")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Kwiboo pushed a commit that referenced this issue Jun 17, 2023
Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    #2 0x556701d70589 in perf_env__cpuid util/env.c:465
    #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    #6 0x556701ef5872 in evlist__config util/record.c:108
    #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    #8 0x556701cacd07 in run_test tests/builtin-test.c:236
    #9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    #14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    #15 0x556701d3c3f8 in main tools/perf/perf.c:537
    #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Kwiboo pushed a commit that referenced this issue Jul 29, 2023
[ Upstream commit 99d4850 ]

Found by leak sanitizer:
```
==1632594==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 21 byte(s) in 1 object(s) allocated from:
    #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369
    #2 0x556701d70589 in perf_env__cpuid util/env.c:465
    #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14
    #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83
    #5 0x556701d8f78b in evsel__config util/evsel.c:1366
    #6 0x556701ef5872 in evlist__config util/record.c:108
    #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112
    #8 0x556701cacd07 in run_test tests/builtin-test.c:236
    #9 0x556701cacfac in test_and_print tests/builtin-test.c:265
    #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402
    #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559
    #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323
    #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377
    #14 0x556701d3be90 in run_argv tools/perf/perf.c:421
    #15 0x556701d3c3f8 in main tools/perf/perf.c:537
    #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s).
```

Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD")
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Kwiboo pushed a commit that referenced this issue Oct 23, 2023
Fix an error detected by memory sanitizer:
```
==4033==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6
    #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2
    #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9
    #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32
    #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6
    #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8
    #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8
    #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9
    #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8
    #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15
    #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9
    #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8
    #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5
    #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9
    #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11
    #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8
    #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2
    #17 0x55fb0f941dab in main tools/perf/perf.c:535:3
```

Fixes: 7b723db ("perf pmu: Be lazy about loading event info files from sysfs")
Signed-off-by: Ian Rogers <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kan Liang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Kwiboo pushed a commit that referenced this issue Oct 23, 2023
The following call trace shows a deadlock issue due to recursive locking of
mutex "device_mutex". First lock acquire is in target_for_each_device() and
second in target_free_device().

 PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
  #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
  #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
  #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
  #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
  #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
  #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
  #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
  #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
  #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
  #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
 #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
 #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
 #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
 #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
 #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
 #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
 #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
 #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364

Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
Signed-off-by: Junxiao Bi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Mike Christie <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Kwiboo pushed a commit that referenced this issue Mar 5, 2024
[ Upstream commit a154f5f ]

The following call trace shows a deadlock issue due to recursive locking of
mutex "device_mutex". First lock acquire is in target_for_each_device() and
second in target_free_device().

 PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
  #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
  #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
  #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
  #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
  #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
  #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
  #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
  #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
  #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
  #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
 #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
 #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
 #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
 #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
 #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
 #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
 #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
 #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364

Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
Signed-off-by: Junxiao Bi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Mike Christie <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Kwiboo pushed a commit that referenced this issue Mar 5, 2024
…nect to sensor

fixes:

[    2.431002][    T1] rkcif rkcif-mipi-lvds: clear unready subdev num: 1
[    2.431031][    T1] ================================================================================
[    2.431039][    T1] UBSAN: shift-out-of-bounds in drivers/media/platform/rockchip/cif/mipi-csi2.c:523:18
[    2.431048][    T1] shift exponent -1 is negative
[    2.431059][    T1] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.198 #14
[    2.431065][    T1] Hardware name: Rockchip RK3588 EVB1 LP4 V10 Board (DT)
[    2.431072][    T1] Call trace:
[    2.431088][    T1]  dump_backtrace+0x0/0x1dc
[    2.431096][    T1]  show_stack+0x18/0x24
[    2.431107][    T1]  dump_stack_lvl+0xe4/0x12c
[    2.431114][    T1]  dump_stack+0x18/0x3c
[    2.431126][    T1]  __ubsan_handle_shift_out_of_bounds+0x2dc/0x30c
[    2.431137][    T1]  csi2_g_mbus_config+0xf8/0x100
[    2.431147][    T1]  call_get_mbus_config+0x2c/0x44
[    2.431154][    T1]  subdev_notifier_complete+0xec/0x6dc
[    2.431162][    T1]  v4l2_async_notifier_clr_unready_dev+0xd0/0xd8
[    2.431168][    T1]  rkcif_clr_unready_dev+0x38/0x6c
[    2.431175][    T1]  do_one_initcall+0xe8/0x320
[    2.431186][    T1]  do_initcall_level+0xd4/0x1d8
[    2.431193][    T1]  do_initcalls+0x54/0x94
[    2.431199][    T1]  do_basic_setup+0x24/0x30
[    2.431205][    T1]  kernel_init_freeable+0xf0/0x168
[    2.431212][    T1]  kernel_init+0x14/0x1ac
[    2.431219][    T1]  ret_from_fork+0x10/0x18
[    2.431225][    T1] ================================================================================
[    2.431639][    T1] rkcif-mipi-lvds: rkcif_update_sensor_info: stream[0] get remote terminal sensor failed!
[    2.431662][    T1] rkcif-mipi-lvds: Async subdev notifier completed
[    2.431679][    T1] rkcif-mipi-lvds: rkcif_update_sensor_info: stream[0] get remote terminal sensor failed!
[    2.431687][    T1] rkcif-mipi-lvds: There is not terminal subdev, not synchronized with ISP

Signed-off-by: Zefa Chen <[email protected]>
Change-Id: Id6bebc37a2e649a856ae12cbe3eb60917ecdf165
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants