Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update from linux #1

Merged
merged 10,000 commits into from
Dec 30, 2017
Merged

Update from linux #1

merged 10,000 commits into from
Dec 30, 2017

Conversation

chaosmamama
Copy link
Owner

No description provided.

pmachata and others added 30 commits December 19, 2017 11:08
This reverts commit 63dd00f.

RAUHT DELETE_ALL seems to trigger a bug in FW. That manifests by later
calls to RAUHT ADD of an IPv6 neighbor to fail with "bad parameter"
error code.

Signed-off-by: Petr Machata <[email protected]>
Fixes: 63dd00f ("mlxsw: spectrum_router: Add batch neighbour deletion")
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
A few thousand such pages are usually left around due to the re-use of
L1 tables having been provided by the hypervisor (Dom0) or tool stack
(DomU). Set NX in the direct map variant, which needs to be done in L2
due to the dual use of the re-used L1s.

For x86_configure_nx() to actually do what it is supposed to do, call
get_cpu_cap() first. This was broken by commit 4763ed4 ("x86, mm:
Clean up and simplify NX enablement") when switching away from the
direct EFER read.

Signed-off-by: Jan Beulich <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
The current solution would setup fixed and force link of 1Gbps to the both
GMAC on the default. However, The GMAC should always be put to link down
state when the GMAC is disabled on certain target boards. Otherwise,
the driver possibly receives unexpected data from the floating hardware
connection through the unused GMAC. Although the driver had been added
certain protection in RX path to get rid of such kind of unexpected data
sent to the upper stack.

Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
arc_emac_rx() has some issues found by code review.

In case netdev_alloc_skb_ip_align() or dma_map_single() failure
rx fifo entry will not be returned to EMAC.

In case dma_map_single() failure previously allocated skb became
lost to driver. At the same time address of newly allocated skb
will not be provided to EMAC.

Signed-off-by: Alexander Kochetkov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Under certain conditions EMAC stop reception of incoming packets and
continuously increment R_MISS register instead of saving data into
provided buffer. The commit implement workaround for such situation.
Then the stall detected EMAC will be restarted.

On device the stall looks like the device lost it's dynamic IP address.
ifconfig shows that interface error counter rapidly increments.
At the same time on the DHCP server we can see continues DHCP-requests
from device.

In real network stalls happen really rarely. To make them frequent the
broadcast storm[1] should be simulated. For simulation it is necessary
to make following connections:
    1. connect radxarock to 1st port of switch
    2. connect some PC to 2nd port of switch
    3. connect two other free ports together using standard ethernet cable,
       in order to make a switching loop.

After that, is necessary to make a broadcast storm. For example, running on
PC 'ping' to some IP address triggers ARP-request storm. After some
time (~10sec), EMAC on rk3188 will stall.

Observed and tested on rk3188 radxarock.

[1] https://en.wikipedia.org/wiki/Broadcast_radiation

Signed-off-by: Alexander Kochetkov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Unlike ip tunnels, now vxlan doesn't do any pmtu update for
upper dst pmtu, even if it doesn't match the lower dst pmtu
any more.

The problem can be reproduced when reducing the vxlan lower
dev's pmtu when running netperf. In jianlin's testing, the
performance went to 1/7 of the previous.

This patch is to update the upper dst pmtu to match the lower
dst pmtu on tx path so that packets can be sent out even when
lower dev's pmtu has been changed.

It also works for metadata dst.

Note that this patch doesn't process any pmtu icmp packet.
But even in the future, the support for pmtu icmp packets
process of udp tunnels will also needs this.

The same thing will be done for geneve in another patch.

Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
ipgre tap driver calls ether_setup(), after commit 61e8462
("net: centralize net_device min/max MTU checking"), the range
of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.

It causes the dev mtu of the ipgre tap device to not be greater
than 1500, this limit value is not correct for ipgre tap device.

Besides, it's .change_mtu already does the right check. So this
patch is just to set max_mtu as 0, and leave the check to it's
.change_mtu.

Fixes: 61e8462 ("net: centralize net_device min/max MTU checking")
Reported-by: Jianlin Shi <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The same fix as the patch "ip_gre: remove the incorrect mtu limit for
ipgre tap" is also needed for ip6_gre.

Fixes: 61e8462 ("net: centralize net_device min/max MTU checking")
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Now it's using IPV6_MIN_MTU as the min mtu in ip6_tnl_xmit, but
IPV6_MIN_MTU actually only works when the inner packet is ipv6.

With IPV6_MIN_MTU for ipv4 packets, the new pmtu for inner dst
couldn't be set less than 1280. It would cause tx_err and the
packet to be dropped when the outer dst pmtu is close to 1280.

Jianlin found it by running ipv4 traffic with the topo:

  (client) gre6 <---> eth1 (route) eth2 <---> gre6 (server)

After changing eth2 mtu to 1300, the performance became very
low, or the connection was even broken. The issue also affects
ip4ip6 and ip6ip6 tunnels.

So if the inner packet is ipv4, 576 should be considered as the
min mtu.

Note that for ip4ip6 and ip6ip6 tunnels, the inner packet can
only be ipv4 or ipv6, but for gre6 tunnel, it may also be ARP.
This patch using 576 as the min mtu for non-ipv6 packet works
for all those cases.

Reported-by: Jianlin Shi <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When, during a join operation, or during message transmission, a group
member needs to be added to the group's 'congested' list, we sort it
into the list in ascending order, according to its current advertised
window size. However, we miss the case when the member is already on
that list. This will have the result that the member, after the window
size has been decremented, might be at the wrong position in that list.
This again may have the effect that we during broadcast and multicast
transmissions miss the fact that a destination is not yet ready for
reception, and we end up sending anyway. From this point on, the
behavior during the remaining session is unpredictable, e.g., with
underflowing window sizes.

We now correct this bug by unconditionally removing the member from
the list before (re-)sorting it in.

Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When using GMAC4 the value written in PTP_SSIR should be shifted however
the shifted value is also used in subsequent calculations which results
in a bad timestamp value.

Signed-off-by: Fredrik Hallenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
As noted in dwmac4_wrback_get_rx_timestamp_status the timestamp is found
in the context descriptor following the current descriptor. However the
current code looks for the context descriptor in the current
descriptor, which will always fail.

Signed-off-by: Fredrik Hallenberg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The recently added fib_metrics_match() causes a regression for routes
with both RTAX_FEATURES and RTAX_CC_ALGO if the latter has
TCP_CONG_NEEDS_ECN flag set:

| # ip link add d0 type dummy
| # ip link set d0 up
| # ip route add 172.29.29.0/24 dev d0 features ecn congctl dctcp
| # ip route del 172.29.29.0/24 dev d0 features ecn congctl dctcp
| RTNETLINK answers: No such process

During route insertion, fib_convert_metrics() detects that the given CC
algo requires ECN and hence sets DST_FEATURE_ECN_CA bit in
RTAX_FEATURES.

During route deletion though, fib_metrics_match() compares stored
RTAX_FEATURES value with that from userspace (which obviously has no
knowledge about DST_FEATURE_ECN_CA) and fails.

Fixes: 5f9ae3d ("ipv4: do metrics match when looking up and deleting a route")
Signed-off-by: Phil Sutter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Our MMC host driver now issues a reset, instead of just deasserting
the reset control, since commit c34eda6 ("mmc: sunxi: Reset the
device at probe time"). The sun9i-mmc clock driver does not support
this, and will fail, which results in MMC not probing.

This patch implements the reset callback by asserting the reset control,
then deasserting it after a small delay.

Fixes: 7a6fca8 ("clk: sunxi: Add driver for A80 MMC config clocks/resets")
Cc: <[email protected]> # 4.14.x
Signed-off-by: Chen-Yu Tsai <[email protected]>
Acked-by: Philipp Zabel <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Michael Turquette <[email protected]>
Link: lkml.kernel.org/r/[email protected]
Currently, if a size of zero is passed to
mlx5_fpga_mem_{read|write}_i2c()
the "err" return value will not be initialized, which triggers gcc
warnings:

[..]/mlx5/core/fpga/sdk.c:87 mlx5_fpga_mem_read_i2c() error:
uninitialized symbol 'err'.
[..]/mlx5/core/fpga/sdk.c:115 mlx5_fpga_mem_write_i2c() error:
uninitialized symbol 'err'.

fix that.

Fixes: a9956d3 ('net/mlx5: FPGA, Add SBU infrastructure')
Signed-off-by: Kamal Heib <[email protected]>
Reviewed-by: Yevgeny Kliteynik <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Before the offending commit, mlx5 core did the IRQ affinity itself,
and it seems that the new generic code have some drawbacks and one
of them is the lack for user ability to modify irq affinity after
the initial affinity values got assigned.

The issue is still being discussed and a solution in the new generic code
is required, until then we need to revert this patch.

This fixes the following issue:
echo <new affinity> > /proc/irq/<x>/smp_affinity
fails with  -EIO

This reverts commit a435393.
Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
it is used in mlx5_ib driver.

Fixes: a435393 ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jes Sorensen <[email protected]>
Reported-by: Jes Sorensen <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
In mlx5_ifc, struct size was not complete, and thus driver was sending
garbage after the last defined field. Fixed it by adding reserved field
to complete the struct size.

In addition, rename all set_rate_limit to set_pp_rate_limit to be
compliant with the Firmware <-> Driver definition.

Fixes: 7486216 ("{net,IB}/mlx5: mlx5_ifc updates")
Fixes: 1466cc5 ("net/mlx5: Rate limit tables support")
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Fix bug that allows ets bw sum to be 0% when ets tc type exists.

Fixes: 08fb1da ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Moshe Shemesh <[email protected]>
Reviewed-by: Huy Nguyen <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
The assumption that the next header field contains the transport
protocol is wrong for IPv6 packets with extension headers.
Instead, we should look the inner-most next header field in the buffer.
This will fix TSO offload for tunnels over IPv6 with extension headers.

Performance testing: 19.25x improvement, cool!
Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap.
CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
TSO: Enabled
Before: 4,926.24  Mbps
Now   : 94,827.91 Mbps

Fixes: b3f63c3 ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Fixes the bug when turning on/off CQE compression mechanism
resets the RX rings size to default value when it is not
needed.

Fixes: 2fc4bfb ("net/mlx5e: Dynamic RQ type infrastructure")
Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Fix misspelling in word syndrome.

Fixes: e126ba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eugenia Emantayev <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
In error flow, when DESTROY_QP command should be executed, the wrong
mailbox was set with data, not the one that is written to hardware,
Fix that.

Fixes: 09a7d9e '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc'
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user
context) and mlx5e_features_check (softirq), but the lock acquired does
not disable bottom half and might result in deadlock. Fix it by simply
replacing spin_lock() with spin_lock_bh().
While at it, replace all unnecessary spin_lock_irq() to spin_lock_bh().

lockdep's WARNING: inconsistent lock state
[  654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes:
[  654.028321]  (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028528] {SOFTIRQ-ON-W} state was registered at:
[  654.028607]   _raw_spin_lock+0x3c/0x70
[  654.028689]   mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[  654.028794]   mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core]
[  654.028878]   process_one_work+0x1e9/0x640
[  654.028942]   worker_thread+0x4a/0x3f0
[  654.029002]   kthread+0x141/0x180
[  654.029056]   ret_from_fork+0x24/0x30
[  654.029114] irq event stamp: 579088
[  654.029174] hardirqs last  enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0
[  654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0
[  654.029446] softirqs last  enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80
[  654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0
[  654.029684] other info that might help us debug this:
[  654.029781]  Possible unsafe locking scenario:

[  654.029868]        CPU0
[  654.029908]        ----
[  654.029947]   lock(&(&vxlan_db->lock)->rlock);
[  654.030045]   <Interrupt>
[  654.030090]     lock(&(&vxlan_db->lock)->rlock);
[  654.030162]
 *** DEADLOCK ***

Fixes: b3f63c3 ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
A refcount mechanism must be implemented in order to prevent unwanted
scenarios such as:
- Open an IPv4 VXLAN interface
- Open an IPv6 VXLAN interface (different socket)
- Remove one of the interfaces

With current implementation, the UDP port will be removed from our VXLAN
database and turn off the offloads for the other interface, which is
still active.
The reference count mechanism will only allow UDP port removals once all
consumers are gone.

Fixes: b3f63c3 ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
When calling add/remove VXLAN port, a lock must be held in order to
prevent race scenarios when more than one add/remove happens at the
same time.
Fix by holding our state_lock (mutex) as done by all other parts of the
driver.
Note that the spinlock protecting the radix-tree is still needed in
order to synchronize radix-tree access from softirq context.

Fixes: b3f63c3 ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Flow steering priority and namespace are software only objects that
didn't have the proper destructors and were not freed during steering
cleanup.

Fix it by adding destructor functions for these objects.

Fixes: bd71b08 ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
When mlx5_stop_eqs fails to destroy any of the eqs it returns with an error.
In such failure flow the function will return without
releasing all EQs irqs and then pci_free_irq_vectors will fail.
Fix by only warn on destroy EQ failure and continue to release other
EQs and their irqs.

It fixes the following kernel trace:
kernel: kernel BUG at drivers/pci/msi.c:352!
...
...
kernel: Call Trace:
kernel: pci_disable_msix+0xd3/0x100
kernel: pci_free_irq_vectors+0xe/0x20
kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]

Fixes: e126ba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
During unload, on mlx5_stop_eqs we move command interface from events
mode to polling mode, but if command interface EQ destroy fail we move
back to events mode.
That's wrong since even if we fail to destroy command interface EQ, we
do release its irq, so no interrupts will be received.

Fixes: e126ba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
fdo#104340.

Signed-off-by: Ben Skeggs <[email protected]>
The alignment checks at pfn driver startup fail to properly account for
the 'start_pad' in the case where the namespace is misaligned relative
to its internal alignment. This is typically triggered in 1G aligned
namespace, but could theoretically trigger with small namespace
alignments. When this triggers the kernel reports messages of the form:

    dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000

Cc: <[email protected]>
Fixes: 1ee6667 ("libnvdimm, pfn, dax: fix initialization vs autodetect...")
Reported-by: Jane Chu <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
xpu22 and others added 28 commits December 27, 2017 13:47
The patch(180d8cd) replaces all uses of struct sock fields'
memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem
to accessor macros. But the sockets_allocated field of sctp sock is
not replaced at all. Then replace it now for unifying the code.

Fixes: 180d8cd ("foundations of per-cgroup memory pressure controlling.")
Cc: Glauber Costa <[email protected]>
Signed-off-by: Tonghao Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.

What wasn't handled, was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These data bits were returned with the
length making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.

Cc: [email protected]
Fixes: 66a8cb9 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
….org/drm/drm-intel into drm-fixes

GLK pipe C related fix, and a gvt fix.

* tag 'drm-intel-fixes-2017-12-22-1' of git://anongit.freedesktop.org/drm/drm-intel:
  i915: Reject CCS modifiers for pipe C on Geminilake
  drm/i915/gvt: Fix pipe A enable as default for vgpu
The ring_buffer_read_page() takes care of zeroing out any extra data in the
page that it returns. There's no need to zero it out again from the
consumer. It was removed from one consumer of this function, but
read_buffers_splice_read() did not remove it, and worse, it contained a
nasty bug because of it.

Cc: [email protected]
Fixes: 2711ca2 ("ring-buffer: Move zeroing out excess in page to ring buffer code")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
To free the reader page that is allocated with ring_buffer_alloc_read_page(),
ring_buffer_free_read_page() must be called. For faster performance, this
page can be reused by the ring buffer to avoid having to free and allocate
new pages.

The issue arises when the page is used with a splice pipe into the
networking code. The networking code may up the page counter for the page,
and keep it active while sending it is queued to go to the network. The
incrementing of the page ref does not prevent it from being reused in the
ring buffer, and this can cause the page that is being sent out to the
network to be modified before it is sent by reading new data.

Add a check to the page ref counter, and only reuse the page if it is not
being used anywhere else.

Cc: [email protected]
Fixes: 73a757e ("ring-buffer: Return reader page back into existing ring buffer")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Double free of the ring buffer happens when it fails to alloc new
ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured.
The root cause is that the pointer is not set to NULL after the buffer
is freed in allocate_trace_buffers(), and the freeing of the ring
buffer is invoked again later if the pointer is not equal to Null,
as:

instance_mkdir()
    |-allocate_trace_buffers()
        |-allocate_trace_buffer(tr, &tr->trace_buffer...)
	|-allocate_trace_buffer(tr, &tr->max_buffer...)

          // allocate fail(-ENOMEM),first free
          // and the buffer pointer is not set to null
        |-ring_buffer_free(tr->trace_buffer.buffer)

       // out_free_tr
    |-free_trace_buffers()
        |-free_trace_buffer(&tr->trace_buffer);

	      //if trace_buffer is not null, free again
	    |-ring_buffer_free(buf->buffer)
                |-rb_free_cpu_buffer(buffer->buffers[cpu])
                    // ring_buffer_per_cpu is null, and
                    // crash in ring_buffer_per_cpu->pages

Link: http://lkml.kernel.org/r/[email protected]

Cc: [email protected]
Fixes: 737223f ("tracing: Consolidate buffer allocation code")
Signed-off-by: Jing Xia <[email protected]>
Signed-off-by: Chunyan Zhang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Jing Xia and Chunyan Zhang reported that on failing to allocate part of the
tracing buffer, memory is freed, but the pointers that point to them are not
initialized back to NULL, and later paths may try to free the freed memory
again. Jing and Chunyan fixed one of the locations that does this, but
missed a spot.

Link: http://lkml.kernel.org/r/[email protected]

Cc: [email protected]
Fixes: 737223f ("tracing: Consolidate buffer allocation code")
Reported-by: Jing Xia <[email protected]>
Reported-by: Chunyan Zhang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
With the current code, the following sequence won't work :
echo timer > trigger

echo 0 >  delay_off
* at this point we call
** led_delay_off_store
** led_blink_set
*** stop timer
** led_blink_setup
** led_set_software_blink
*** if !delay_on, led off
*** if !delay_off, set led_set_brightness_nosleep <--- LED_BLINK_SW is set but timer is stop
*** otherwise start timer/set LED_BLINK_SW flag

echo xxx > brightness
* led_set_brightness
** if LED_BLINK_SW
*** if brightness=0, led off
*** else apply brightness if next timer <--- timer is stop, and will never apply new setting
** otherwise set led_set_brightness_nosleep

To fix that, when we delete the timer, we should clear LED_BLINK_SW.

Cc: [email protected]
Signed-off-by: Matthieu CASTET <[email protected]>
Signed-off-by: Jacek Anaszewski <[email protected]>
…el/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "It seems that Santa overslept with a bunch of gifts; the majority of
  changes here are various device-specific ASoC fixes, most notably the
  revert of rcar IOMMU support and fsl_ssi AC97 fixes, but also lots of
  small fixes for codecs. Besides that, the usual HD-audio quirks and
  fixes are included, too"

* tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
  ALSA: hda - Fix missing COEF init for ALC225/295/299
  ALSA: hda: Drop useless WARN_ON()
  ALSA: hda - change the location for one mic on a Lenovo machine
  ALSA: hda - fix headset mic detection issue on a Dell machine
  ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
  ASoC: rsnd: fixup ADG register mask
  ASoC: rt5514-spi: only enable wakeup when fully initialized
  ASoC: nau8825: fix issue that pop noise when start capture
  ASoC: rt5663: Fix the wrong result of the first jack detection
  ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update
  ASoC: Intel: Change kern log level to avoid unwanted messages
  ASoC: atmel-classd: select correct Kconfig symbol
  ASoC: wm_adsp: Fix validation of firmware and coeff lengths
  ASoC: Intel: Skylake: Do not check dev_type for dmic link type
  ASoC: rockchip: disable clock on error
  ASoC: tlv320aic31xx: Fix GPIO1 register definition
  ASoC: codecs: msm8916-wcd: Fix supported formats
  ASoC: fsl_asrc: Fix typo in a field define
  ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes
  ASoC: da7218: Correct IRQ level in DT binding example
  ...
…nel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "While doing tests on tracing over the network, I found that the
  packets were getting corrupted.

  In the process I found three bugs.

  One was the culprit, but the other two scared me. After deeper
  investigation, they were not as major as I thought they were, due to a
  signed compared to an unsigned that prevented a negative number from
  doing actual harm.

  The two bigger bugs:

   - Mask the ring buffer data page length. There are data flags at the
     high bits of the length field. These were not cleared via the
     length function, and the length could return a negative number.
     (Although the number returned was unsigned, but was assigned to a
     signed number) Luckily, this value was compared to PAGE_SIZE which
     is unsigned and kept it from entering the path that could have
     caused damage.

   - Check the page usage before reusing the ring buffer reader page.
     TCP increments the page ref when passing the page off to the
     network. The page is passed back to the ring buffer for use on
     free. But the page could still be in use by the TCP stack.

  Minor bugs:

   - Related to the first bug. No need to clear out the unused ring
     buffer data before sending to user space. It is now done by the
     ring buffer code itself.

   - Reset pointers after free on error path. There were some cases in
     the error path that pointers were freed but not set to NULL, and
     could have them freed again, having a pointer freed twice"

* tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix possible double free on failure of allocating trace buffer
  tracing: Fix crash when it fails to alloc ring buffer
  ring-buffer: Do no reuse reader page if still in use
  tracing: Remove extra zeroing out of the ring buffer page
  ring-buffer: Mask out the info bits when returning buffer page length
User-space applications can do mmap and munmap directly at
any time.

Since the VMA list is not protected with a mutex, concurrent
accesses to the VMA list from the mmap and munmap can cause
data corruption. Add a mutex around the list.

Cc: <[email protected]> # v4.7
Fixes: 7c2344c ("IB/mlx5: Implements disassociate_ucontext API")
Reviewed-by: Yishai Hadas <[email protected]>
Signed-off-by: Majd Dibbiny <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
If the input command length is larger than the kernel supports an error should
be returned in case the unsupported bytes are not cleared, instead of the
other way aroudn. This matches what all other callers of ib_is_udata_cleared
do and will avoid user ABI problems in the future.

Cc: <[email protected]> # v4.10
Fixes: 189aba9 ("IB/uverbs: Extend modify_qp and support packet pacing")
Reviewed-by: Yishai Hadas <[email protected]>
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
The XRC target QP create flow sets up qp_sec only if there is an IB link with
LSM security enabled. However, several other related uAPI entry points blindly
follow the qp_sec NULL pointer, resulting in a possible oops.

Check for NULL before using qp_sec.

Cc: <[email protected]> # v4.12
Fixes: d291f1a ("IB/core: Enforce PKey security on QPs")
Reviewed-by: Daniel Jurgens <[email protected]>
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
ibmr.device is being set only after ib_alloc_mr() is
(successfully) complete. Therefore, in case mlx5_core_create_mkey()
return with error, the error flow calls mlx5_free_priv_descs()
which uses ibmr.device (which doesn't exist yet), causing
a NULL dereference oops.

To fix this, the IB device should be set in the mr struct earlier
stage (e.g. prior to calling mlx5_core_create_mkey()).

Fixes: 8a187ee ("IB/mlx5: Support the new memory registration API")
Signed-off-by: Max Gurtovoy <[email protected]>
Signed-off-by: Nitzan Carmi <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-28

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two small fixes for bpftool. Fix otherwise broken output if any of
   the system calls failed when listing maps in json format and instead
   of bailing out, skip maps or progs that disappeared between fetching
   next id and getting an fd for that id, both from Jakub.

2) Small fix in BPF selftests to respect LLC passed from command line
   when testing for -mcpu=probe presence, from Quentin.
====================

Signed-off-by: David S. Miller <[email protected]>
Since the recent remote cpufreq callback work, its possible that a cpufreq
update is triggered from a remote CPU. For single policies however, the current
code uses the local CPU when trying to determine if the remote sg_cpu entered
idle or is busy. This is incorrect. To remedy this, compare with the nohz tick
idle_calls counter of the remote CPU.

Fixes: 674e754 (sched: cpufreq: Allow remote cpufreq callbacks)
Acked-by: Viresh Kumar <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Joel Fernandes <[email protected]>
Cc: 4.14+ <[email protected]> # 4.14+
Signed-off-by: Rafael J. Wysocki <[email protected]>
In commit 42b531d ("tipc: Fix missing connection request
handling"), we replaced unconditional wakeup() with condtional
wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND.

This breaks the applications which do a connect followed by poll
with POLLOUT flag. These applications are not woken when the
connection is ESTABLISHED and hence sleep forever.

In this commit, we fix it by including the POLLOUT event for
sockets in TIPC_CONNECTING state.

Fixes: 42b531d ("tipc: Fix missing connection request handling")
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Parthasarathy Bhuvaragan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
skb_copy_ubufs must unclone before it is safe to modify its
skb_shared_info with skb_zcopy_clear.

Commit b90ddd5 ("skbuff: skb_copy_ubufs must release uarg even
without user frags") ensures that all skbs release their zerocopy
state, even those without frags.

But I forgot an edge case where such an skb arrives that is cloned.

The stack does not build such packets. Vhost/tun skbs have their
frags orphaned before cloning. TCP skbs only attach zerocopy state
when a frag is added.

But if TCP packets can be trimmed or linearized, this might occur.
Tracing the code I found no instance so far (e.g., skb_linearize
ends up calling skb_zcopy_clear if !skb->data_len).

Still, it is non-obvious that no path exists. And it is fragile to
rely on this.

Fixes: b90ddd5 ("skbuff: skb_copy_ubufs must release uarg even without user frags")
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
This allows checking socket lock ownership with producing lockdep
warnings.

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
strparser wants to check socket ownership without producing any
warnings. As indicated by the comment in the code, it is permissible
for owned_by_user to return true.

Fixes: 43a0c67 ("strparser: Stream parser for messages")
Reported-by: syzbot <[email protected]>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Tom Herbert says:

====================
strparser: Fix lockdep issue

When sock_owned_by_user returns true in strparser. Fix is to add and
call sock_owned_by_user_nocheck since the check for owned by user is
not an error condition in this case.
====================

Fixes: 43a0c67 ("strparser: Stream parser for messages")
Reported-by: syzbot <[email protected]>
Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
Signed-off-by: David S. Miller <[email protected]>
…t/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "This is the next batch of for-rc patches from RDMA. It includes the
  fix for the ipoib regression I mentioned last time, and the result of
  a fairly major debugging effort to get iser working reliably on cxgb4
  hardware - it turns out the cxgb4 driver was not handling QP error
  flushing properly causing iser to fail.

   - cxgb4 fix for an iser testing failure as debugged by Steve and
     Sagi. The problem was a driver bug in the handling of shutting down
     a QP.

   - Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on
     error unwind and a use after free bug

   - Improper congestion counter values on mlx5 when link aggregation is
     enabled

   - ipoib lockdep regression introduced in this merge window

   - hfi1 regression supporting the device in a VM introduced in a
     recent patch

   - Typo that breaks future uAPI compatibility in the verbs core

   - More SELinux related oops fixing

   - Fix an oops during error unwind in mlx5"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Fix mlx5_ib_alloc_mr error flow
  IB/core: Verify that QP is security enabled in create and destroy
  IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
  IB/mlx5: Serialize access to the VMA list
  IB/hfi: Only read capability registers if the capability exists
  IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
  IB/mlx5: Fix congestion counters in LAG mode
  RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
  RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
  RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
  iw_cxgb4: when flushing, complete all wrs in a chain
  iw_cxgb4: reflect the original WR opcode in drain cqes
  iw_cxgb4: Only validate the MSN for successful completions
…nux/kernel/git/j.anaszewski/linux-leds

Pull LED fix from Jacek Anaszewski:
 "A single LED fix for brightness setting when delay_off is 0"

* tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  led: core: Fix brightness setting when setting delay_off=0
…/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One more fix for the runtime PM clk patches. We're calling a runtime
  PM API that may schedule from somewhere that we can't do that. We
  change to the async version of pm_runtime_put() to fix it"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: use atomic runtime pm api in clk_core_is_enabled
…airlied/linux

Pull drm fixes from Dave Airlie:
 "nouveau and i915 regression fixes"

* tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: fix race when adding delayed work items
  i915: Reject CCS modifiers for pipe C on Geminilake
  drm/i915/gvt: Fix pipe A enable as default for vgpu
Pull networking fixes from David Miller:

 1) IPv6 gre tunnels end up with different default features enabled
    depending upon whether netlink or ioctls are used to bring them up.
    Fix from Alexey Kodanev.

 2) Fix read past end of user control message in RDS< from Avinash
    Repaka.

 3) Missing RCU barrier in mini qdisc code, from Cong Wang.

 4) Missing policy put when reusing per-cpu route entries, from Florian
    Westphal.

 5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G.
    Piccoli.

 6) Run nested transport mode IPSEC packets via tasklet, from Herbert
    Xu.

 7) Fix handling poll() for stream sockets in tipc, from Parthasarathy
    Bhuvaragan.

 8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.

 9) Another zerocopy ubuf handling fix, from Willem de Bruijn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
  strparser: Call sock_owned_by_user_nocheck
  sock: Add sock_owned_by_user_nocheck
  skbuff: in skb_copy_ubufs unclone before releasing zerocopy
  tipc: fix hanging poll() for stream sockets
  sctp: Replace use of sockets_allocated with specified macro.
  bnx2x: Improve reliability in case of nested PCI errors
  tg3: Enable PHY reset in MTU change path for 5720
  tg3: Add workaround to restrict 5762 MRRS to 2048
  tg3: Update copyright
  net: fec: unmap the xmit buffer that are not transferred by DMA
  tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
  tipc: error path leak fixes in tipc_enable_bearer()
  RDS: Check cmsg_len before dereferencing CMSG_DATA
  tcp: Avoid preprocessor directives in tracepoint macro args
  tipc: fix memory leak of group member when peer node is lost
  net: sched: fix possible null pointer deref in tcf_block_put
  tipc: base group replicast ack counter on number of actual receivers
  net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
  net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
  ip6_gre: fix device features for ioctl setup
  ...
…git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "This fixes a schedutil cpufreq governor regression from the 4.14 cycle
  that may cause a CPU idleness check to return incorrect results in
  some cases which leads to suboptimal decisions (Joel Fernandes)"

* tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: schedutil: Use idle_calls counter of the remote CPU
…x/kernel/git/tip/tip

Pull x86 page table isolation updates from Thomas Gleixner:
 "This is the final set of enabling page table isolation on x86:

   - Infrastructure patches for handling the extra page tables.

   - Patches which map the various bits and pieces which are required to
     get in and out of user space into the user space visible page
     tables.

   - The required changes to have CR3 switching in the entry/exit code.

   - Optimizations for the CR3 switching along with documentation how
     the ASID/PCID mechanism works.

   - Updates to dump pagetables to cover the user space page tables for
     W+X scans and extra debugfs files to analyze both the kernel and
     the user space visible page tables

  The whole functionality is compile time controlled via a config switch
  and can be turned on/off on the command line as well"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  x86/ldt: Make the LDT mapping RO
  x86/mm/dump_pagetables: Allow dumping current pagetables
  x86/mm/dump_pagetables: Check user space page table for WX pages
  x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
  x86/mm/pti: Add Kconfig
  x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
  x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
  x86/mm: Use INVPCID for __native_flush_tlb_single()
  x86/mm: Optimize RESTORE_CR3
  x86/mm: Use/Fix PCID to optimize user/kernel switches
  x86/mm: Abstract switching CR3
  x86/mm: Allow flushing for future ASID switches
  x86/pti: Map the vsyscall page if needed
  x86/pti: Put the LDT in its own PGD if PTI is on
  x86/mm/64: Make a full PGD-entry size hole in the memory map
  x86/events/intel/ds: Map debug buffers in cpu_entry_area
  x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
  x86/mm/pti: Map ESPFIX into user space
  x86/mm/pti: Share entry text PMD
  x86/entry: Align entry text section to PMD boundary
  ...
@chaosmamama chaosmamama merged commit 2baf6e5 into chaosmamama:master Dec 30, 2017
chaosmamama pushed a commit that referenced this pull request Jan 2, 2018
Trying to read from debugfs after the system has resumed from
hibernate causes a use-after-free and thus a protection fault.

Steps to reproduce:
Hibernate system, resume from hibernate, then run
$ cat /sys/kernel/debug/usb/xhci/*/command-ring/enqueue

[ 3902.765086] general protection fault: 0000 [#1] PREEMPT SMP
...
[ 3902.765136] RIP: 0010:xhci_trb_virt_to_dma.part.50+0x5/0x30
...
[ 3902.765178] Call Trace:
[ 3902.765188]  xhci_ring_enqueue_show+0x1e/0x40
[ 3902.765197]  seq_read+0xdb/0x3a0
[ 3902.765204]  ? __handle_mm_fault+0x5fb/0x1210
[ 3902.765211]  full_proxy_read+0x4a/0x70
[ 3902.765219]  __vfs_read+0x23/0x120
[ 3902.765228]  vfs_read+0x8e/0x130
[ 3902.765235]  SyS_read+0x42/0x90
[ 3902.765242]  do_syscall_64+0x6b/0x290
[ 3902.765251]  entry_SYSCALL64_slow_path+0x25/0x25

The issue is caused by the xhci ring structures being reallocated
when the system is resumed, but pointers to the old structures
being retained in the debugfs files "private" field:

The proposed patch fixes this issue by storing a pointer to the xhci_ring
field in the xhci device structure in debugfs rather than directly
storing a pointer to the xhci_ring.

Fixes: 02b6fdc ("usb: xhci: Add debugfs interface for xHCI driver")
Signed-off-by: Alexander Kappner <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.