Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VC4 breakage at bootup #20

Closed
Hecatron opened this issue Mar 12, 2016 · 2 comments
Closed

VC4 breakage at bootup #20

Hecatron opened this issue Mar 12, 2016 · 2 comments

Comments

@Hecatron
Copy link

Hi,
I recently raised an issue on the rpi linux sources repo, but I think this might be the better place for it
raspberrypi#1344

One of the recent changes there, seems to prevent the kernel from booting up when dtoverlay=vc4-kms-v3d is enabled, commit id 8924ecf

that being said without that commit my system will bootup but vc4 is having problems recognising the monitor for some reason (I'm still in the process of trying different options from the default to see if I'm missing something that needs to be compiled in directly)

[drm:vc4_hdmi_bind] *ERROR* Failed to get pixel clock
[    2.228836] vc4-drm soc:gpu@7e4c0000: failed to bind 3f902000.hdmi (ops vc4_hdmi_ops): -517
[    2.242500] vc4-drm soc:gpu@7e4c0000: master bind failed: -517

perhaps this is related to the issue of the system not booting when there's no monitor connected that others have been seeing

@anholt
Copy link
Owner

anholt commented Mar 14, 2016

If you don't have the clock driver, then we won't be able to get clocks. You need the clock driver.

If you have troubles booting the current branch, you're going to need to hook up a serial console to find what's going wrong. If you're trying to boot without HDMI connected, you'll probably need the HDMI power on series.

@anholt anholt closed this as completed Mar 14, 2016
@Hecatron
Copy link
Author

I'm booting with a hdmi cable connected to a monitor, it's just for some reason it stalls during the boot process. The lack of a clock must be allowing it to continue some how (although be it in a broken state).
I think like you've said a uart cable is probably the only way forward

anholt pushed a commit that referenced this issue Apr 20, 2016
commit d913897 upstream.

When we're using LZ4 multi compression streams for zram swap, we found
out page allocation failure message in system running test.  That was
not only once, but a few(2 - 5 times per test).  Also, some failure
cases were continually occurring to try allocation order 3.

In order to make parallel compression private data, we should call
kzalloc() with order 2/3 in runtime(lzo/lz4).  But if there is no order
2/3 size memory to allocate in that time, page allocation fails.  This
patch makes to use vmalloc() as fallback of kmalloc(), this prevents
page alloc failure warning.

After using this, we never found warning message in running test, also
It could reduce process startup latency about 60-120ms in each case.

For reference a call trace :

    Binder_1: page allocation failure: order:3, mode:0x10c0d0
    CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
    Call trace:
      dump_backtrace+0x0/0x270
      show_stack+0x10/0x1c
      dump_stack+0x1c/0x28
      warn_alloc_failed+0xfc/0x11c
      __alloc_pages_nodemask+0x724/0x7f0
      __get_free_pages+0x14/0x5c
      kmalloc_order_trace+0x38/0xd8
      zcomp_lz4_create+0x2c/0x38
      zcomp_strm_alloc+0x34/0x78
      zcomp_strm_multi_find+0x124/0x1ec
      zcomp_strm_find+0xc/0x18
      zram_bvec_rw+0x2fc/0x780
      zram_make_request+0x25c/0x2d4
      generic_make_request+0x80/0xbc
      submit_bio+0xa4/0x15c
      __swap_writepage+0x218/0x230
      swap_writepage+0x3c/0x4c
      shrink_page_list+0x51c/0x8d0
      shrink_inactive_list+0x3f8/0x60c
      shrink_lruvec+0x33c/0x4cc
      shrink_zone+0x3c/0x100
      try_to_free_pages+0x2b8/0x54c
      __alloc_pages_nodemask+0x514/0x7f0
      __get_free_pages+0x14/0x5c
      proc_info_read+0x50/0xe4
      vfs_read+0xa0/0x12c
      SyS_read+0x44/0x74
    DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
         0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB

[[email protected]: change vmalloc gfp and adding comment about gfp]
[[email protected]: tweak comments and styles]
Signed-off-by: Kyeongdon Kim <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
Acked-by: Sergey Senozhatsky <[email protected]>
Sergey Senozhatsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Oct 17, 2016
Function ib_create_qp() was failing to return an error when
rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
when trying to dereferece the qp pointer which was actually a negative
errno.

The crash:

crash> log|grep BUG
[  136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
crash> bt
PID: 3736   TASK: ffff8808543215c0  CPU: 2   COMMAND: "kworker/u64:2"
 #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
 #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
 #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
 #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
 #4 [ffff88084d3234e0] no_context at ffffffff8106e431
 #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
 #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
 #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
 #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
 #9 [ffff88084d323660] page_fault at ffffffff816e3148
    [exception RIP: ib_create_qp+427]
    RIP: ffffffffa02554fb  RSP: ffff88084d323718  RFLAGS: 00010246
    RAX: 0000000000000004  RBX: fffffffffffffff4  RCX: 000000018020001f
    RDX: ffff880830997fc0  RSI: 0000000000000001  RDI: ffff88085f407200
    RBP: ffff88084d323778   R8: 0000000000000001   R9: ffffea0020bae210
    R10: ffffea0020bae218  R11: 0000000000000001  R12: ffff88084d3237c8
    R13: 00000000fffffff4  R14: ffff880859fa5000  R15: ffff88082eb89800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
#11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
#12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
#13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
#14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
#15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
#16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
#17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
#18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
#19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
#20 [ffff88084d323d90] worker_thread at ffffffff810a211d
#21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
#22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf

Fixes: 632bc3f ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
Signed-off-by: Steve Wise <[email protected]>
Cc: [email protected]
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
anholt pushed a commit that referenced this issue Jan 23, 2017
commit 4dfce57 upstream.

There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439  TASK: ffff880550584fa0  CPU: 6   COMMAND: "xfs_fsr"
    [exception RIP: xfs_trans_log_inode+0x10]
 #9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d

As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate.  When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.

This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.

Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.

Thanks to dchinner for looking at this with me and spotting the
root cause.

Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Mar 30, 2017
[ Upstream commit 45caeaa ]

As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

It's found the freed dst_entry here:

 224 static bool tcp_in_quickack_mode(struct sock *sk)↩
 225 {↩
 226 ▹       const struct inet_connection_sock *icsk = inet_csk(sk);↩
 227 ▹       const struct dst_entry *dst = __sk_dst_get(sk);↩
 228 ↩
 229 ▹       return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩
 230 ▹       ▹       (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩
 231 }↩

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.

- All vmcores showed a postitive LockDroppedIcmps value, e.g:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:

do_redirect()->__sk_dst_check()-> dst_release().

Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.

To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.

The dccp/IPv6 code is very similar in this respect, so fixing it there too.

As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <[email protected]>
Cc: Hannes Sowa <[email protected]>
Signed-off-by: Jon Maxwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Mar 30, 2017
[ Upstream commit 45caeaa ]

As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

It's found the freed dst_entry here:

 224 static bool tcp_in_quickack_mode(struct sock *sk)↩
 225 {↩
 226 ▹       const struct inet_connection_sock *icsk = inet_csk(sk);↩
 227 ▹       const struct dst_entry *dst = __sk_dst_get(sk);↩
 228 ↩
 229 ▹       return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩
 230 ▹       ▹       (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩
 231 }↩

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.

- All vmcores showed a postitive LockDroppedIcmps value, e.g:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:

do_redirect()->__sk_dst_check()-> dst_release().

Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.

To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.

The dccp/IPv6 code is very similar in this respect, so fixing it there too.

As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <[email protected]>
Cc: Hannes Sowa <[email protected]>
Signed-off-by: Jon Maxwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Apr 8, 2017
As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.

We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:

 #8 [] page_fault at ffffffff8163e648
    [exception RIP: __tcp_ack_snd_check+74]
.
.
 #9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8

Of course it may happen with other NIC drivers as well.

It's found the freed dst_entry here:

 224 static bool tcp_in_quickack_mode(struct sock *sk)↩
 225 {↩
 226 ▹       const struct inet_connection_sock *icsk = inet_csk(sk);↩
 227 ▹       const struct dst_entry *dst = __sk_dst_get(sk);↩
 228 ↩
 229 ▹       return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩
 230 ▹       ▹       (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩
 231 }↩

But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.

All the vmcores showed 2 significant clues:

- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.

- All vmcores showed a postitive LockDroppedIcmps value, e.g:

LockDroppedIcmps                  267

A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:

do_redirect()->__sk_dst_check()-> dst_release().

Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.

To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.

The dccp/IPv6 code is very similar in this respect, so fixing it there too.

As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().

Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <[email protected]>
Cc: Hannes Sowa <[email protected]>
Signed-off-by: Jon Maxwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
anholt pushed a commit that referenced this issue May 4, 2017
Commit 57e5568 ("sata_via: Implement hotplug for VT6421") adds
hotplug IRQ handler for VT6421 but enables hotplug on all chips. This
is a bug because it causes "irq xx: nobody cared" error on VT6420 when
hot-(un)plugging a drive:

[  381.839948] irq 20: nobody cared (try booting with the "irqpoll" option)
[  381.840014] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc5+ #148
[  381.840066] Hardware name:          P4VM800/P4VM800, BIOS P1.60 05/29/2006
[  381.840117] Call Trace:
[  381.840167]  <IRQ>
[  381.840225]  ? dump_stack+0x44/0x58
[  381.840278]  ? __report_bad_irq+0x14/0x97
[  381.840327]  ? handle_edge_irq+0xa5/0xa5
[  381.840376]  ? note_interrupt+0x155/0x1cf
[  381.840426]  ? handle_edge_irq+0xa5/0xa5
[  381.840474]  ? handle_irq_event_percpu+0x32/0x38
[  381.840524]  ? handle_irq_event+0x1f/0x38
[  381.840573]  ? handle_fasteoi_irq+0x69/0xb8
[  381.840625]  ? handle_irq+0x4f/0x5d
[  381.840672]  </IRQ>
[  381.840726]  ? do_IRQ+0x2e/0x8b
[  381.840782]  ? common_interrupt+0x2c/0x34
[  381.840836]  ? mwait_idle+0x60/0x82
[  381.840892]  ? arch_cpu_idle+0x6/0x7
[  381.840949]  ? do_idle+0x96/0x18e
[  381.841002]  ? cpu_startup_entry+0x16/0x1a
[  381.841057]  ? start_kernel+0x319/0x31c
[  381.841111]  ? startup_32_smp+0x166/0x168
[  381.841165] handlers:
[  381.841219] [<c12a7263>] ata_bmdma_interrupt
[  381.841274] Disabling IRQ #20

Seems that VT6420 can do hotplug too (there's no documentation) but the
comments say that SCR register access (required for detecting hotplug
events) can cause problems on these chips.

For now, just keep hotplug disabled on anything other than VT6421.

Signed-off-by: Ondrej Zary <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
anholt pushed a commit that referenced this issue Aug 15, 2017
According to the kerneldoc[0], should do fbdev setup before calling
drm_kms_helper_poll_init(), otherwise, Kms hotplug event may race
into fbdev helper initial, and fb_helper->dev may be NULL pointer,
that would cause the bug:
[    0.735411] [00000200] *pgd=00000000f6ffe003, *pud=00000000f6ffe003, *pmd=0000000000000000
[    0.736156] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    0.736648] Modules linked in:
[    0.736930] CPU: 2 PID: 20 Comm: kworker/2:0 Not tainted 4.4.41 #20
[    0.737480] Hardware name: Rockchip RK3399 Board rev2 (BOX) (DT)
[    0.738020] Workqueue: events cdn_dp_pd_event_work
[    0.738447] task: ffffffc0f21f3100 ti: ffffffc0f2218000 task.ti: ffffffc0f2218000
[    0.739109] PC is at mutex_lock+0x14/0x44
[    0.739469] LR is at drm_fb_helper_hotplug_event+0x30/0x114
[    0.756253] [<ffffff8008a344f4>] mutex_lock+0x14/0x44
[    0.756260] [<ffffff8008445708>] drm_fb_helper_hotplug_event+0x30/0x114
[    0.756271] [<ffffff8008473c84>] rockchip_drm_output_poll_changed+0x18/0x20
[    0.756280] [<ffffff8008439fcc>] drm_kms_helper_hotplug_event+0x28/0x34
[    0.756286] [<ffffff800846c444>] cdn_dp_pd_event_work+0x394/0x3c4
[    0.756295] [<ffffff80080b2b38>] process_one_work+0x218/0x3e0
[    0.756302] [<ffffff80080b3538>] worker_thread+0x2e8/0x404
[    0.756308] [<ffffff80080b7e70>] kthread+0xe8/0xf0
[    0.756316] [<ffffff8008082690>] ret_from_fork+0x10/0x40

[0]: https://01.org/linuxgraphics/gfx-docs/drm/gpu/drm-kms-helpers.html

Signed-off-by: Mark Yao <[email protected]>
Reviewed-by: Sandy huang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
anholt pushed a commit that referenced this issue Aug 15, 2017
This patch fixes a generate_node_acls = 1 + cache_dynamic_acls = 0
regression, that was introduced by

  commit 01d4d67
  Author: Nicholas Bellinger <[email protected]>
  Date:   Wed Dec 7 12:55:54 2016 -0800

which originally had the proper list_del_init() usage, but was
dropped during list review as it was thought unnecessary by HCH.

However, list_del_init() usage is required during the special
generate_node_acls = 1 + cache_dynamic_acls = 0 case when
transport_free_session() does a list_del(&se_nacl->acl_list),
followed by target_complete_nacl() doing the same thing.

This was manifesting as a general protection fault as reported
by Justin:

kernel: general protection fault: 0000 [#1] SMP
kernel: Modules linked in:
kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20
kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS S5500.86B.01.00.0064.050520141428 05/05/2014
kernel: task: ffff88026939e800 task.stack: ffffc90007884000
kernel: RIP: 0010:target_put_nacl+0x49/0xb0
kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246
kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000
kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028
kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000
kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020
kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540
kernel: FS:  0000000000000000(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel:  transport_free_session+0x67/0x140
kernel:  transport_deregister_session+0x7a/0xc0
kernel:  iscsit_close_session+0x92/0x210
kernel:  iscsit_close_connection+0x5f9/0x840
kernel:  iscsit_take_action_for_connection_exit+0xfe/0x110
kernel:  iscsi_target_tx_thread+0x140/0x1e0
kernel:  ? wait_woken+0x90/0x90
kernel:  kthread+0x124/0x160
kernel:  ? iscsit_thread_get_cpumask+0x90/0x90
kernel:  ? kthread_create_on_node+0x40/0x40
kernel:  ret_from_fork+0x22/0x30
kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c
89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89
ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20
kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70
kernel: ---[ end trace f12821adbfd46fed ]---

To address this, go ahead and use proper list_del_list() for all
cases of se_nacl->acl_list deletion.

Reported-by: Justin Maggard <[email protected]>
Tested-by: Justin Maggard <[email protected]>
Cc: Justin Maggard <[email protected]>
Cc: [email protected] # 4.1+
Signed-off-by: Nicholas Bellinger <[email protected]>
anholt pushed a commit that referenced this issue Nov 13, 2017
commit 6f48655 upstream.

This patch fixes a generate_node_acls = 1 + cache_dynamic_acls = 0
regression, that was introduced by

  commit 01d4d67
  Author: Nicholas Bellinger <[email protected]>
  Date:   Wed Dec 7 12:55:54 2016 -0800

which originally had the proper list_del_init() usage, but was
dropped during list review as it was thought unnecessary by HCH.

However, list_del_init() usage is required during the special
generate_node_acls = 1 + cache_dynamic_acls = 0 case when
transport_free_session() does a list_del(&se_nacl->acl_list),
followed by target_complete_nacl() doing the same thing.

This was manifesting as a general protection fault as reported
by Justin:

kernel: general protection fault: 0000 [#1] SMP
kernel: Modules linked in:
kernel: CPU: 0 PID: 11047 Comm: iscsi_ttx Not tainted 4.13.0-rc2.x86_64.1+ #20
kernel: Hardware name: Intel Corporation S5500BC/S5500BC, BIOS S5500.86B.01.00.0064.050520141428 05/05/2014
kernel: task: ffff88026939e800 task.stack: ffffc90007884000
kernel: RIP: 0010:target_put_nacl+0x49/0xb0
kernel: RSP: 0018:ffffc90007887d70 EFLAGS: 00010246
kernel: RAX: dead000000000200 RBX: ffff8802556ca000 RCX: 0000000000000000
kernel: RDX: dead000000000100 RSI: 0000000000000246 RDI: ffff8802556ce028
kernel: RBP: ffffc90007887d88 R08: 0000000000000001 R09: 0000000000000000
kernel: R10: ffffc90007887df8 R11: ffffea0009986900 R12: ffff8802556ce020
kernel: R13: ffff8802556ce028 R14: ffff8802556ce028 R15: ffffffff88d85540
kernel: FS:  0000000000000000(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fffe36f5f94 CR3: 0000000009209000 CR4: 00000000003406f0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel:  transport_free_session+0x67/0x140
kernel:  transport_deregister_session+0x7a/0xc0
kernel:  iscsit_close_session+0x92/0x210
kernel:  iscsit_close_connection+0x5f9/0x840
kernel:  iscsit_take_action_for_connection_exit+0xfe/0x110
kernel:  iscsi_target_tx_thread+0x140/0x1e0
kernel:  ? wait_woken+0x90/0x90
kernel:  kthread+0x124/0x160
kernel:  ? iscsit_thread_get_cpumask+0x90/0x90
kernel:  ? kthread_create_on_node+0x40/0x40
kernel:  ret_from_fork+0x22/0x30
kernel: Code: 00 48 89 fb 4c 8b a7 48 01 00 00 74 68 4d 8d 6c 24 08 4c
89 ef e8 e8 28 43 00 48 8b 93 20 04 00 00 48 8b 83 28 04 00 00 4c 89
ef <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 20
kernel: RIP: target_put_nacl+0x49/0xb0 RSP: ffffc90007887d70
kernel: ---[ end trace f12821adbfd46fed ]---

To address this, go ahead and use proper list_del_list() for all
cases of se_nacl->acl_list deletion.

Reported-by: Justin Maggard <[email protected]>
Tested-by: Justin Maggard <[email protected]>
Cc: Justin Maggard <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Nov 13, 2017
[ Upstream commit 3cf8645 ]

Commit 57e5568 ("sata_via: Implement hotplug for VT6421") adds
hotplug IRQ handler for VT6421 but enables hotplug on all chips. This
is a bug because it causes "irq xx: nobody cared" error on VT6420 when
hot-(un)plugging a drive:

[  381.839948] irq 20: nobody cared (try booting with the "irqpoll" option)
[  381.840014] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc5+ #148
[  381.840066] Hardware name:          P4VM800/P4VM800, BIOS P1.60 05/29/2006
[  381.840117] Call Trace:
[  381.840167]  <IRQ>
[  381.840225]  ? dump_stack+0x44/0x58
[  381.840278]  ? __report_bad_irq+0x14/0x97
[  381.840327]  ? handle_edge_irq+0xa5/0xa5
[  381.840376]  ? note_interrupt+0x155/0x1cf
[  381.840426]  ? handle_edge_irq+0xa5/0xa5
[  381.840474]  ? handle_irq_event_percpu+0x32/0x38
[  381.840524]  ? handle_irq_event+0x1f/0x38
[  381.840573]  ? handle_fasteoi_irq+0x69/0xb8
[  381.840625]  ? handle_irq+0x4f/0x5d
[  381.840672]  </IRQ>
[  381.840726]  ? do_IRQ+0x2e/0x8b
[  381.840782]  ? common_interrupt+0x2c/0x34
[  381.840836]  ? mwait_idle+0x60/0x82
[  381.840892]  ? arch_cpu_idle+0x6/0x7
[  381.840949]  ? do_idle+0x96/0x18e
[  381.841002]  ? cpu_startup_entry+0x16/0x1a
[  381.841057]  ? start_kernel+0x319/0x31c
[  381.841111]  ? startup_32_smp+0x166/0x168
[  381.841165] handlers:
[  381.841219] [<c12a7263>] ata_bmdma_interrupt
[  381.841274] Disabling IRQ #20

Seems that VT6420 can do hotplug too (there's no documentation) but the
comments say that SCR register access (required for detecting hotplug
events) can cause problems on these chips.

For now, just keep hotplug disabled on anything other than VT6421.

Signed-off-by: Ondrej Zary <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Nov 21, 2017
…-text symbols"

This reverts commit 83e840c ("powerpc64/elfv1: Only dereference
function descriptor for non-text symbols").

Chandan reported that on newer kernels, trying to enable function_graph
tracer on ppc64 (BE) locks up the system with the following trace:

  Unable to handle kernel paging request for data at address 0x600000002fa30010
  Faulting instruction address: 0xc0000000001f1300
  Thread overran stack, or stack corrupted
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
  Modules linked in:
  CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty #20
  task: c000000625c07200 task.stack: c000000625c07310
  NIP:  c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8
  REGS: c000000625c088c0 TRAP: 0380   Not tainted  (4.14.0-rc3-00162-g6e51f1f-dirty)
  MSR:  8000000000001032 <SF,ME,IR,DR,RI>  CR: 28002848  XER: 00000000
  CFAR: c0000000001f1320 SOFTE: 0
  ...
  NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90
  LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0
  Call Trace:
  [c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable)
  [c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0
  [c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0
  [c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  ...
  [c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0
  [c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0

This is because ftrace is using ppc_function_entry() for obtaining the
address of return_to_handler() in prepare_ftrace_return(). The call to
kernel_text_address() itself gets traced and we end up in a recursive
loop.

Fixes: 83e840c ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols")
Cc: [email protected] # v4.13+
Reported-by: Chandan Rajendra <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
anholt pushed a commit that referenced this issue Nov 21, 2017
When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
and N2, we want to transfer socket ownership to the new fresh skbs.

In order to avoid expensive atomic operations on a cache line subject to
cache bouncing, we replace the sequence :

refcount_add(N1, &sk->sk_wmem_alloc);
refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments

refcount_sub(O, &sk->sk_wmem_alloc);

by a single

refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);

Problem is :

In some pathological cases, sum(N) - O might be a negative number, and
syzkaller bot was apparently able to trigger this trace [1]

atomic_t was ok with this construct, but we need to take care of the
negative delta with refcount_t

[1]
refcount_t: saturated; leaking memory.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x41c kernel/panic.c:183
 __warn+0x1c4/0x1e0 kernel/panic.c:546
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
 do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
 refcount_add+0x1b/0x60 lib/refcount.c:101
 tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
 tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
 inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
 skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
 __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
 skb_gso_segment include/linux/netdevice.h:3971 [inline]
 validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
 __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
 neigh_hh_output include/net/neighbour.h:471 [inline]
 neigh_output include/net/neighbour.h:479 [inline]
 ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
 ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
 NF_HOOK_COND include/linux/netfilter.h:238 [inline]
 ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
 dst_output include/net/dst.h:459 [inline]
 ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
 ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
 tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
 __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
 tcp_push_pending_frames include/net/tcp.h:1722 [inline]
 tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
 tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
 tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
 sk_backlog_rcv include/net/sock.h:909 [inline]
 __release_sock+0x124/0x360 net/core/sock.c:2264
 release_sock+0xa4/0x2a0 net/core/sock.c:2776
 tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
 sock_sendmsg_nosec net/socket.c:632 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:642
 ___sys_sendmsg+0x31c/0x890 net/socket.c:2048
 __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138

Fixes: 14afee4 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
lategoodbye pushed a commit that referenced this issue Oct 10, 2019
Masoud Sharbiani noticed that commit 29ef680 ("memcg, oom: move
out_of_memory back to the charge path") broke memcg OOM called from
__xfs_filemap_fault() path.  It turned out that try_charge() is retrying
forever without making forward progress because mem_cgroup_oom(GFP_NOFS)
cannot invoke the OOM killer due to commit 3da88fb ("mm, oom:
move GFP_NOFS check to out_of_memory").

Allowing forced charge due to being unable to invoke memcg OOM killer will
lead to global OOM situation.  Also, just returning -ENOMEM will be risky
because OOM path is lost and some paths (e.g.  get_user_pages()) will leak
-ENOMEM.  Therefore, invoking memcg OOM killer (despite GFP_NOFS) will be
the only choice we can choose for now.

Until 29ef680, we were able to invoke memcg OOM killer when
GFP_KERNEL reclaim failed [1].  But since 29ef680, we need to
invoke memcg OOM killer when GFP_NOFS reclaim failed [2].  Although in the
past we did invoke memcg OOM killer for GFP_NOFS [3], we might get
pre-mature memcg OOM reports due to this patch.

[1]

 leaker invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
 CPU: 0 PID: 2746 Comm: leaker Not tainted 4.18.0+ #19
 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
 Call Trace:
  dump_stack+0x63/0x88
  dump_header+0x67/0x27a
  ? mem_cgroup_scan_tasks+0x91/0xf0
  oom_kill_process+0x210/0x410
  out_of_memory+0x10a/0x2c0
  mem_cgroup_out_of_memory+0x46/0x80
  mem_cgroup_oom_synchronize+0x2e4/0x310
  ? high_work_func+0x20/0x20
  pagefault_out_of_memory+0x31/0x76
  mm_fault_error+0x55/0x115
  ? handle_mm_fault+0xfd/0x220
  __do_page_fault+0x433/0x4e0
  do_page_fault+0x22/0x30
  ? page_fault+0x8/0x30
  page_fault+0x1e/0x30
 RIP: 0033:0x4009f0
 Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
 RSP: 002b:00007ffe29ae96f0 EFLAGS: 00010206
 RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001ce1000
 RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
 RBP: 000000000000000c R08: 0000000000000000 R09: 00007f94be09220d
 R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
 R13: 0000000000000003 R14: 00007f949d845000 R15: 0000000002800000
 Task in /leaker killed as a result of limit of /leaker
 memory: usage 524288kB, limit 524288kB, failcnt 158965
 memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
 kmem: usage 2016kB, limit 9007199254740988kB, failcnt 0
 Memory cgroup stats for /leaker: cache:844KB rss:521136KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:132KB writeback:0KB inactive_anon:0KB active_anon:521224KB inactive_file:1012KB active_file:8KB unevictable:0KB
 Memory cgroup out of memory: Kill process 2746 (leaker) score 998 or sacrifice child
 Killed process 2746 (leaker) total-vm:536704kB, anon-rss:521176kB, file-rss:1208kB, shmem-rss:0kB
 oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

[2]

 leaker invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
 CPU: 1 PID: 2746 Comm: leaker Not tainted 4.18.0+ #20
 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
 Call Trace:
  dump_stack+0x63/0x88
  dump_header+0x67/0x27a
  ? mem_cgroup_scan_tasks+0x91/0xf0
  oom_kill_process+0x210/0x410
  out_of_memory+0x109/0x2d0
  mem_cgroup_out_of_memory+0x46/0x80
  try_charge+0x58d/0x650
  ? __radix_tree_replace+0x81/0x100
  mem_cgroup_try_charge+0x7a/0x100
  __add_to_page_cache_locked+0x92/0x180
  add_to_page_cache_lru+0x4d/0xf0
  iomap_readpages_actor+0xde/0x1b0
  ? iomap_zero_range_actor+0x1d0/0x1d0
  iomap_apply+0xaf/0x130
  iomap_readpages+0x9f/0x150
  ? iomap_zero_range_actor+0x1d0/0x1d0
  xfs_vm_readpages+0x18/0x20 [xfs]
  read_pages+0x60/0x140
  __do_page_cache_readahead+0x193/0x1b0
  ondemand_readahead+0x16d/0x2c0
  page_cache_async_readahead+0x9a/0xd0
  filemap_fault+0x403/0x620
  ? alloc_set_pte+0x12c/0x540
  ? _cond_resched+0x14/0x30
  __xfs_filemap_fault+0x66/0x180 [xfs]
  xfs_filemap_fault+0x27/0x30 [xfs]
  __do_fault+0x19/0x40
  __handle_mm_fault+0x8e8/0xb60
  handle_mm_fault+0xfd/0x220
  __do_page_fault+0x238/0x4e0
  do_page_fault+0x22/0x30
  ? page_fault+0x8/0x30
  page_fault+0x1e/0x30
 RIP: 0033:0x4009f0
 Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
 RSP: 002b:00007ffda45c9290 EFLAGS: 00010206
 RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001a1e000
 RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
 RBP: 000000000000000c R08: 0000000000000000 R09: 00007f6d061ff20d
 R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
 R13: 0000000000000003 R14: 00007f6ce59b2000 R15: 0000000002800000
 Task in /leaker killed as a result of limit of /leaker
 memory: usage 524288kB, limit 524288kB, failcnt 7221
 memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
 kmem: usage 1944kB, limit 9007199254740988kB, failcnt 0
 Memory cgroup stats for /leaker: cache:3632KB rss:518232KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:518408KB inactive_file:3908KB active_file:12KB unevictable:0KB
 Memory cgroup out of memory: Kill process 2746 (leaker) score 992 or sacrifice child
 Killed process 2746 (leaker) total-vm:536704kB, anon-rss:518264kB, file-rss:1188kB, shmem-rss:0kB
 oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

[3]

 leaker invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=0
 leaker cpuset=/ mems_allowed=0
 CPU: 1 PID: 3206 Comm: leaker Not tainted 3.10.0-957.27.2.el7.x86_64 #1
 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
 Call Trace:
  [<ffffffffaf364147>] dump_stack+0x19/0x1b
  [<ffffffffaf35eb6a>] dump_header+0x90/0x229
  [<ffffffffaedbb456>] ? find_lock_task_mm+0x56/0xc0
  [<ffffffffaee32a38>] ? try_get_mem_cgroup_from_mm+0x28/0x60
  [<ffffffffaedbb904>] oom_kill_process+0x254/0x3d0
  [<ffffffffaee36c36>] mem_cgroup_oom_synchronize+0x546/0x570
  [<ffffffffaee360b0>] ? mem_cgroup_charge_common+0xc0/0xc0
  [<ffffffffaedbc194>] pagefault_out_of_memory+0x14/0x90
  [<ffffffffaf35d072>] mm_fault_error+0x6a/0x157
  [<ffffffffaf3717c8>] __do_page_fault+0x3c8/0x4f0
  [<ffffffffaf371925>] do_page_fault+0x35/0x90
  [<ffffffffaf36d768>] page_fault+0x28/0x30
 Task in /leaker killed as a result of limit of /leaker
 memory: usage 524288kB, limit 524288kB, failcnt 20628
 memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
 Memory cgroup stats for /leaker: cache:840KB rss:523448KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:523448KB inactive_file:464KB active_file:376KB unevictable:0KB
 Memory cgroup out of memory: Kill process 3206 (leaker) score 970 or sacrifice child
 Killed process 3206 (leaker) total-vm:536692kB, anon-rss:523304kB, file-rss:412kB, shmem-rss:0kB

Bisected by Masoud Sharbiani.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 3da88fb ("mm, oom: move GFP_NOFS check to out_of_memory") [necessary after 29ef680]
Signed-off-by: Tetsuo Handa <[email protected]>
Reported-by: Masoud Sharbiani <[email protected]>
Tested-by: Masoud Sharbiani <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[4.19+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants