Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux 5.17.4 #133

Closed
wants to merge 10,000 commits into from
Closed

Linux 5.17.4 #133

wants to merge 10,000 commits into from

Conversation

jackpot51
Copy link
Member

Peter Zijlstra and others added 30 commits April 13, 2022 19:27
commit 5b6547e upstream.

Steve reported that ChromeOS encounters the forceidle balancer being
ran from rt_mutex_setprio()'s balance_callback() invocation and
explodes.

Now, the forceidle balancer gets queued every time the idle task gets
selected, set_next_task(), which is strictly too often.
rt_mutex_setprio() also uses set_next_task() in the 'change' pattern:

	queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
	running = task_current(rq, p); /* rq->curr == p */

	if (queued)
		dequeue_task(...);
	if (running)
		put_prev_task(...);

	/* change task properties */

	if (queued)
		enqueue_task(...);
	if (running)
		set_next_task(...);

However, rt_mutex_setprio() will explicitly not run this pattern on
the idle task (since priority boosting the idle task is quite insane).
Most other 'change' pattern users are pidhash based and would also not
apply to idle.

Also, the change pattern doesn't contain a __balance_callback()
invocation and hence we could have an out-of-band balance-callback,
which *should* trigger the WARN in rq_pin_lock() (which guards against
this exact anti-pattern).

So while none of that explains how this happens, it does indicate that
having it in set_next_task() might not be the most robust option.

Instead, explicitly queue the forceidle balancer from pick_next_task()
when it does indeed result in forceidle selection. Having it here,
ensures it can only be triggered under the __schedule() rq->lock
instance, and hence must be ran from that context.

This also happens to clean up the code a little, so win-win.

Fixes: d2dfa17 ("sched: Trivial forced-newidle balancer")
Reported-by: Steven Rostedt <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: T.J. Alumbaugh <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 386ef21 upstream.

try_steal_cookie() looks at task_struct::cpus_mask to decide if the
task could be moved to `this' CPU. It ignores that the task might be in
a migration disabled section while not on the CPU. In this case the task
must not be moved otherwise per-CPU assumption are broken.

Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if the a
task can be moved.

Fixes: d2dfa17 ("sched: Trivial forced-newidle balancer")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1cd5f05 upstream.

Paolo reported that the instruction sequence that is used to replace:

    call __static_call_return0

namely:

    66 66 48 31 c0	data16 data16 xor %rax,%rax

decodes to something else on i386, namely:

    66 66 48		data16 dec %ax
    31 c0		xor    %eax,%eax

Which is a nonsensical sequence that happens to have the same outcome.
*However* an important distinction is that it consists of 2
instructions which is a problem when the thing needs to be overwriten
with a regular call instruction again.

As such, replace the instruction with something that decodes the same
on both i386 and x86_64.

Fixes: 3f2a8fc ("static_call/x86: Add __static_call_return0()")
Reported-by: Paolo Bonzini <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 334865b upstream.

Bernardo reported an error that Nathan bisected down to
(x86_64) defconfig+LTO_CLANG_FULL+X86_PMEM_LEGACY.

    LTO     vmlinux.o
  ld.lld: error: <instantiation>:1:13: redefinition of 'found'
  .set found, 0
              ^

  <inline asm>:29:1: while in macro instantiation
  extable_type_reg reg=%eax, type=(17 | ((0) << 16))
  ^

This appears to be another LTO specific issue similar to what was folded
into commit 4b5305d ("x86/extable: Extend extable functionality"),
where the `.set found, 0` in DEFINE_EXTABLE_TYPE_REG in
arch/x86/include/asm/asm.h conflicts with the symbol for the static
function `found` in arch/x86/kernel/pmem.c.

Assembler .set directive declare symbols with global visibility, so the
assembler may not rename such symbols in the event of a conflict. LTO
could rename static functions if there was a conflict in C sources, but
it cannot see into symbols defined in inline asm.

The symbols are also retained in the symbol table, regardless of LTO.

Give the symbols .L prefixes making them locally visible, so that they
may be renamed for LTO to avoid conflicts, and to drop them from the
symbol table regardless of LTO.

Fixes: 4b5305d ("x86/extable: Extend extable functionality")
Reported-by: Bernardo Meurer Costa <[email protected]>
Debugged-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…duling

commit af27e41 upstream.

The way KVM drives GICv4.{0,1} is as follows:
- vcpu_load() makes the VPE resident, instructing the RD to start
  scanning for interrupts
- just before entering the guest, we check that the RD has finished
  scanning and that we can start running the vcpu
- on preemption, we deschedule the VPE by making it invalid on
  the RD

However, we are preemptible between the first two steps. If it so
happens *and* that the RD was still scanning, we nonetheless write
to the GICR_VPENDBASER register while Dirty is set, and bad things
happen (we're in UNPRED land).

This affects both the 4.0 and 4.1 implementations.

Make sure Dirty is cleared before performing the deschedule,
meaning that its_clear_vpend_valid() becomes a sort of full VPE
residency barrier.

Reported-by: Jingyi Wang <[email protected]>
Tested-by: Nianyao Tang <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Fixes: 57e3ceb ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit af41d28 upstream.

Using conditional branches between two files is hasardous,
they may get linked too far from each other.

  arch/powerpc/kvm/book3s_64_entry.o:(.text+0x3ec): relocation truncated
  to fit: R_PPC64_REL14 (stub) against symbol `system_reset_common'
  defined in .text section in arch/powerpc/kernel/head_64.o

Reorganise the code to use non conditional branches.

Fixes: 89d35b2 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C")
Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Avoid odd-looking bne ., use named local labels]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/89cf27bf43ee07a0b2879b9e8e2f5cd6386a3645.1648366338.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 544808f upstream.

At the moment the GIC IRQ domain translation routine happily converts
ACPI table GSI numbers below 16 to GIC SGIs (Software Generated
Interrupts aka IPIs). On the Devicetree side we explicitly forbid this
translation, actually the function will never return HWIRQs below 16 when
using a DT based domain translation.

We expect SGIs to be handled in the first part of the function, and any
further occurrence should be treated as a firmware bug, so add a check
and print to report this explicitly and avoid lengthy debug sessions.

Fixes: 64b499d ("irqchip/gic-v3: Configure SGIs as standard interrupts")
Signed-off-by: Andre Przywara <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a431dbb upstream.

The gcc 12 compiler reports a "'mem_section' will never be NULL" warning
on the following code:

    static inline struct mem_section *__nr_to_section(unsigned long nr)
    {
    #ifdef CONFIG_SPARSEMEM_EXTREME
        if (!mem_section)
                return NULL;
    #endif
        if (!mem_section[SECTION_NR_TO_ROOT(nr)])
                return NULL;
       :

It happens with CONFIG_SPARSEMEM_EXTREME off.  The mem_section definition
is

    #ifdef CONFIG_SPARSEMEM_EXTREME
    extern struct mem_section **mem_section;
    #else
    extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
    #endif

In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static
2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]"
doesn't make sense.

Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]"
check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an
explicit NR_SECTION_ROOTS check to make sure that there is no
out-of-bound array access.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 3e34726 ("sparsemem extreme implementation")
Signed-off-by: Waiman Long <[email protected]>
Reported-by: Justin Forbes <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Rafael Aquini <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 8fd4ddd upstream.

System.map shows that vmlinux contains several instances of
__static_call_return0():

	c0004fc0 t __static_call_return0
	c0011518 t __static_call_return0
	c00d8160 t __static_call_return0

arch_static_call_transform() uses the middle one to check whether we are
setting a call to __static_call_return0 or not:

	c0011520 <arch_static_call_transform>:
	c0011520:       3d 20 c0 01     lis     r9,-16383	<== r9 =  0xc001 << 16
	c0011524:       39 29 15 18     addi    r9,r9,5400	<== r9 += 0x1518
	c0011528:       7c 05 48 00     cmpw    r5,r9		<== r9 has value 0xc0011518 here

So if static_call_update() is called with one of the other instances of
__static_call_return0(), arch_static_call_transform() won't recognise it.

In order to work properly, global single instance of __static_call_return0() is required.

Fixes: 3f2a8fc ("static_call/x86: Add __static_call_return0()")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 584b018 upstream.

In preparation for not necessarily having a file assigned at prep time,
defer any initialization associated with the file to when the opcode
handler is run.

Cc: [email protected] # v5.15+
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5106dd6 upstream.

We'll need this in a future patch, when we could be assigning the file
after the prep stage. While at it, get rid of the io_file_get() helper,
it just makes the code harder to read.

Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 6bf9c47 upstream.

If an application uses direct open or accept, it knows in advance what
direct descriptor value it will get as it picks it itself. This allows
combined requests such as:

sqe = io_uring_get_sqe(ring);
io_uring_prep_openat_direct(sqe, ..., file_slot);
sqe->flags |= IOSQE_IO_LINK | IOSQE_CQE_SKIP_SUCCESS;

sqe = io_uring_get_sqe(ring);
io_uring_prep_read(sqe,file_slot, buf, buf_size, 0);
sqe->flags |= IOSQE_FIXED_FILE;

io_uring_submit(ring);

where we prepare both a file open and read, and only get a completion
event for the read when both have completed successfully.

Currently links are fully prepared before the head is issued, but that
fails if the dependent link needs a file assigned that isn't valid until
the head has completed.

Conversely, if the same chain is performed but the fixed file slot is
already valid, then we would be unexpectedly returning data from the
old file slot rather than the newly opened one. Make sure we're
consistent here.

Allow deferral of file setup, which makes this documented case work.

Cc: [email protected] # v5.15+
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit d536123 upstream.

io_uring tracks requests that are referencing an io_uring descriptor to
be able to cancel without worrying about loops in the references. Since
we now assign the file at execution time, the easier approach is to drop
a potentially problematic reference before we punt the request. This
eliminates the need to special case these types of files beyond just
marking them as such, and simplifies cancelation quite a bit.

This also fixes a recent issue where an async punted tee operation would
with the io_uring descriptor as the output file would crash when
attempting to get a reference to the file from the io-wq worker. We
could have worked around that, but this is the much cleaner fix.

Fixes: 6bf9c47 ("io_uring: defer file assignment")
Reported-by: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ronald Warsow <[email protected]>
Tested-by: Ron Economos <[email protected]>
Tested-by: Justin M. Forbes <[email protected]>
Tested-by: Shuah Khan <[email protected]>
Tested-by: Fox Chen <[email protected]>
Tested-by: Zan Aziz <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Tested-by: Linux Kernel Functional Testing <[email protected]>
Tested-by: Jiri Slaby <[email protected]>
Tested-by: Rudi Heitbaum <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e7031d8 upstream.

[Why]
To debug when p-state is being blocked and avoid PMFW hangs when
it does occur.

[How]
Re-use the DCN10 hardware sequencer by adding a new interface for
verifying p-state high on the hubbub. The interface is mostly the
same as the DCN10 interface, but the bit definitions have changed for
the debug bus.

Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Eric Yang <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3107e1a upstream.

[Why]
It changed since dcn30 but the hubbub31 constructor hasn't been
modified to reflect this.

[How]
Update the value in the constructor to 0x6 so we're checking the right
bits for p-state allow.

It worked before by accident, but can falsely assert 0 depending on HW
state transitions. The most frequent of which appears to be when
all pipes turn off during IGT tests.

Cc: Harry Wentland <[email protected]>

Fixes: e7031d8 ("drm/amd/display: Add pstate verification and recovery for DCN31")
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Eric Yang <[email protected]>
Acked-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 01f6c73 upstream.

Currently the first thing checked is whether the PCSI cpu_suspend function
has been initialized.

Another change will be overloading `acpi_processor_ffh_lpi_probe` and
calling it sooner.  So make the `has_lpi` check the first thing checked
to prepare for that change.

Reviewed-by: Sudeep Holla <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit eb087f3 upstream.

When `osc_pc_lpi_support_confirmed` is set through `_OSC` and `_LPI` is
populated then the cpuidle driver assumes that LPI is fully functional.

However currently the kernel only provides architectural support for LPI
on ARM.  This leads to high power consumption on X86 platforms that
otherwise try to enable LPI.

So probe whether or not LPI support is implemented before enabling LPI in
the kernel.  This is done by overloading `acpi_processor_ffh_lpi_probe` to
check whether it returns `-EOPNOTSUPP`. It also means that all future
implementations of `acpi_processor_ffh_lpi_probe` will need to follow
these semantics as well.

Reviewed-by: Sudeep Holla <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 907e772 upstream.

Currently there is no way for Realtek DSA subdrivers to serialize
consecutive regmap accesses. In preparation for a bugfix relating to
indirect PHY register access - which involves a series of regmap
reads and writes - add a facility for subdrivers to serialize their
regmap access.

Specifically, a mutex is added to the driver private data structure and
the standard regmap is initialized with custom lock/unlock ops which use
this mutex. Then, a "nolock" variant of the regmap is added, which is
functionally equivalent to the existing regmap except that regmap
locking is disabled. Functions that wish to serialize a sequence of
regmap accesses may then lock the newly introduced driver-owned mutex
before using the nolock regmap.

Doing things this way means that subdriver code that doesn't care about
serialized register access - i.e. the vast majority of code - needn't
worry about synchronizing register access with an external lock: it can
just continue to use the original regmap.

Another advantage of this design is that, while regmaps with locking
disabled do not expose a debugfs interface for obvious reasons, there
still exists the original regmap which does expose this interface. This
interface remains safe to use even combined with driver codepaths that
use the nolock regmap, because said codepaths will use the same mutex
to synchronize access.

With respect to disadvantages, it can be argued that having
near-duplicate regmaps is confusing. However, the naming is rather
explicit, and examples will abound.

Finally, while we are at it, rename realtek_smi_mdio_regmap_config to
realtek_smi_regmap_config. This makes it consistent with the naming
realtek_mdio_regmap_config in realtek-mdio.c.

Signed-off-by: Alvin Šipraga <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[alsi: backport to 5.16: s/priv/smi/g and remove realtek-mdio changes]
Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2796728 upstream.

Realtek switches in the rtl8365mb family can access the PHY registers of
the internal PHYs via the switch registers. This method is called
indirect access. At a high level, the indirect PHY register access
method involves reading and writing some special switch registers in a
particular sequence. This works for both SMI and MDIO connected
switches.

Currently the rtl8365mb driver does not take any care to serialize the
aforementioned access to the switch registers. In particular, it is
permitted for other driver code to access other switch registers while
the indirect PHY register access is ongoing. Locking is only done at the
regmap level. This, however, is a bug: concurrent register access, even
to unrelated switch registers, risks corrupting the PHY register value
read back via the indirect access method described above.

Arınç reported that the switch sometimes returns nonsense data when
reading the PHY registers. In particular, a value of 0 causes the
kernel's PHY subsystem to think that the link is down, but since most
reads return correct data, the link then flip-flops between up and down
over a period of time.

The aforementioned bug can be readily observed by:

 1. Enabling ftrace events for regmap and mdio
 2. Polling BSMR PHY register for a connected port;
    it should always read the same (e.g. 0x79ed)
 3. Wait for step 2 to give a different value

Example command for step 2:

    while true; do phytool read swp2/2/0x01; done

On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY
register within a matter of seconds. The interleaved register access it
then evident in the trace log:

 kworker/3:4-70      [003] .......  1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd
     phytool-16816   [002] .......  1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0
 kworker/3:4-70      [003] .......  1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0
     phytool-16816   [002] .......  1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69
 kworker/3:4-70      [003] .......  1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0
     phytool-16816   [002] .......  1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041
 kworker/3:4-70      [003] .......  1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0
     phytool-16816   [002] .......  1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1
 kworker/3:4-70      [003] .......  1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be
     phytool-16816   [002] .......  1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0
 kworker/3:4-70      [003] .......  1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0
     phytool-16816   [002] .......  1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0
     phytool-16816   [002] .......  1927.142641: mdio_access: SMI-0 read  phy:0x02 reg:0x01 val:0x0000 <- ?!
 kworker/3:4-70      [003] .......  1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0
 kworker/3:4-70      [003] .......  1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89
 kworker/3:4-70      [003] .......  1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be
 kworker/3:4-70      [003] .......  1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0
 kworker/3:4-70      [003] .......  1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0
 kworker/3:4-70      [003] .......  1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6

The kworker here is polling MIB counters for stats, as evidenced by the
register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This
polling is performed every 3 seconds, but is just one example of such
unsynchronized access. In Arınç's case, the driver was not using the
switch IRQ, so the PHY subsystem was itself doing polling analogous to
phytool in the above example.

A test module was created [see second Link] to simulate such spurious
switch register accesses while performing indirect PHY register reads
and writes. Realtek was also consulted to confirm whether this is a
known issue or not. The conclusion of these lines of inquiry is as
follows:

1. Reading of PHY registers via indirect access will be aborted if,
   after executing the read operation (via a write to the
   INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than
   INDIRECT_ACCESS_STATUS_REG.

2. The PHY register indirect read is only complete when
   INDIRECT_ACCESS_STATUS_REG reads zero.

3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the
   PHY read, will contain the result of the last successful read
   operation. If there was spurious register access and the indirect
   read was aborted, then this register is not guaranteed to hold
   anything meaningful and the PHY read will silently fail.

4. PHY writes do not appear to be affected by this mechanism.

5. Other similar access routines, such as for MIB counters, although
   similar to the PHY indirect access method, are actually table access.
   Table access is not affected by spurious reads or writes of other
   registers. However, concurrent table access is not allowed. Currently
   this is protected via mib_lock, so there is nothing to fix.

The above statements are corroborated both via the test module and
through consultation with Realtek. In particular, Realtek states that
this is simply a property of the hardware design and is not a hardware
bug.

To fix this problem, one must guard against regmap access while the
PHY indirect register read is executing. Fix this by using the newly
introduced "nolock" regmap in all PHY-related functions, and by aquiring
the regmap mutex at the top level of the PHY register access callbacks.
Although no issue has been observed with PHY register _writes_, this
change also serializes the indirect access method there. This is done
purely as a matter of convenience and for reasons of symmetry.

Fixes: 4af2950 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@mail.gmail.com/
Link: https://lore.kernel.org/netdev/[email protected]/
Reported-by: Arınç ÜNAL <[email protected]>
Reported-by: Luiz Angelo Daros de Luca <[email protected]>
Signed-off-by: Alvin Šipraga <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[alsi: backport to 5.16: s/priv/smi/g]
Cc: [email protected] # v5.16+
Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 109d899 upstream.

The kernel test robot reported build warnings with a randconfig that
built realtek-{smi,mdio} without CONFIG_OF set. Since both interface
drivers are using OF and will not probe without, add the corresponding
dependency to Kconfig.

Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/all/[email protected]/
Fixes: aac9400 ("net: dsa: realtek: add new mdio interface for drivers")
Fixes: 765c39a ("net: dsa: realtek: convert subdrivers into modules")
Signed-off-by: Alvin Šipraga <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Acked-by: Luiz Angelo Daros de Luca <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
[alsi: backport to 5.16: remove mdio part]
Cc: [email protected] # v5.16+
Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ad3fc79 upstream.

After commit 92082d4 ("btrfs: integrate page status update for
data read path into begin/end_page_read"), the 'nr' counter at
btrfs_do_readpage() is no longer used, we increment it but we never
read from it. So just remove it.

Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ps()

commit 6d4a6b5 upstream.

Clang's version of -Wunused-but-set-variable recently gained support for
unary operations, which reveals two unused variables:

  fs/btrfs/block-group.c:2949:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable]
          int num_started = 0;
              ^
  fs/btrfs/block-group.c:3116:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable]
          int num_started = 0;
              ^
  2 errors generated.

These variables appear to be unused from their introduction, so just
remove them to silence the warnings.

Fixes: c9dc4c6 ("Btrfs: two stage dirty block group writeout")
Fixes: 1bbc621 ("Btrfs: allow block group cache writeout outside critical section in commit")
CC: [email protected] # 5.4+
Link: ClangBuiltLinux#1614
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 8c3ce49 upstream.

We might have RISC-V systems (such as QEMU) where VMID is not part
of the TLB entry tag so these systems will have to flush all TLB
entries upon any change in hgatp.VMID.

Currently, we zero-out hgatp CSR in kvm_arch_vcpu_put() and we
re-program hgatp CSR in kvm_arch_vcpu_load(). For above described
systems, this will flush all TLB entries whenever VCPU exits to
user-space hence reducing performance.

This patch fixes above described performance issue by not clearing
hgatp CSR in kvm_arch_vcpu_put().

Fixes: 34bde9d ("RISC-V: KVM: Implement VCPU world-switch")
Cc: [email protected]
Signed-off-by: Anup Patel <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3ae87d2 upstream.

Fix firmware file names assignment in si2157 tuner, allow for running
devices without firmware files needed.

modprobe gives error: unknown chip version Si2147-A30 ROM 0x50
Device initialization is interrupted.

Caused by:
1. table si2157_tuners has swapped fields rom_id and required vs struct
   si2157_tuner_info.
2. both firmware file names can be null for devices with
   required == false - device uses build-in firmware in this case

Tested on this device:
	m07ca:1871 AVerMedia Technologies, Inc. TD310 DVB-T/T2/C dongle

[mchehab: fix mangled patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215726
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/linux-media/[email protected]
Fixes: 1c35ba3 ("media: si2157: use a different namespace for firmware")
Cc: [email protected] # 5.17.x
Signed-off-by: Piotr Chmura <[email protected]>
Tested-by: Robert Schlabbach <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 55037ed upstream.

Add include guard wrapper define to uapi/linux/stddef.h to prevent macro
redefinition errors when stddef.h is included more than once. This was not
needed before since the only contents already used a redefinition test.

Signed-off-by: Tadeusz Struk <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Fixes: 50d7bd3 ("stddef: Introduce struct_group() helper macro")
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 887f75c upstream.

DP/HDMI audio on AMD PRO VII stops working after S3:
[  149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset
[  149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset
[  149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset
[  149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot
[  150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot
...
[  155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535

The offending commit is daf8de0 ("drm/amdgpu: always reset the asic in
suspend (v2)"). Commit 34452ac ("drm/amdgpu: don't use BACO for
reset in S3 ") doesn't help, so the issue is something different.

Assuming that to make HDA resume to D0 fully realized, it needs to be
successfully put to D3 first. And this guesswork proves working, by
moving amdgpu_asic_reset() to noirq callback, so it's called after HDA
function is in D3.

Fixes: daf8de0 ("drm/amdgpu: always reset the asic in suspend (v2)")
Signed-off-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 6d82ad1 upstream.

Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there are outstanding extents left.

In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the COW case, which releases fewer outstanding extents than expected.

Fix it by calling btrfs_delalloc_release_extents() for the original length.

To reproduce the warning, the filesystem should be 1 GiB.  It's
triggering a short-write, due to not being able to allocate a large
extent and instead allocating a smaller one.

  WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  Modules linked in: btrfs blake2b_generic xor lzo_compress
  lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
  zsmalloc
  CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
  RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
  RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
  RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
  RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
  RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
  R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
  R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
  FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   destroy_inode+0x33/0x70
   dispose_list+0x43/0x60
   evict_inodes+0x161/0x1b0
   generic_shutdown_super+0x2d/0x110
   kill_anon_super+0xf/0x20
   btrfs_kill_super+0xd/0x20 [btrfs]
   deactivate_locked_super+0x27/0x90
   cleanup_mnt+0x12c/0x180
   task_work_run+0x54/0x80
   exit_to_user_mode_prepare+0x152/0x160
   syscall_exit_to_user_mode+0x12/0x30
   do_syscall_64+0x42/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   RIP: 0033:0x7f854a000fb7

Fixes: f0bfa76 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: [email protected] # 5.17
Reviewed-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit acee08a upstream.

This restores the logic from commit 46bcff2 ("btrfs: fix compressed
write bio blkcg attribution") which added cgroup attribution to btrfs
writeback. It also adds back the REQ_CGROUP_PUNT flag for these ios.

Fixes: 9150724 ("btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write")
CC: [email protected] # 5.16+
Signed-off-by: Dennis Zhou <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 820c363 upstream.

Return the allocated block group from do_chunk_alloc(). This is a
preparation patch for the next patch.

CC: [email protected] # 5.16+
Reviewed-by: Johannes Thumshirn <[email protected]>
Tested-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@jackpot51 jackpot51 requested review from a team April 26, 2022 00:07
@jackpot51 jackpot51 self-assigned this Apr 26, 2022
@jackpot51 jackpot51 mentioned this pull request Apr 26, 2022
@n3m0-22 n3m0-22 self-assigned this Apr 26, 2022
@leviport
Copy link
Member

Looks like Aquantia ethernet on my desktop is not causing any problems with this kernel 🎉

@n3m0-22
Copy link

n3m0-22 commented Apr 26, 2022

Kernel Releases Tests

gaze15

  • Updating to new kernel works with apt update && apt upgrade
    • No new dependencies are required for kernel update
  • system76-power still operates as expected (across Intel/NVIDIA/switchable machines)
  • Mic in
  • Audio out:
    • Laptop's built-in speakers
    • Headphones
    • Display Port
    • HDMI
  • Video out via:
    • DisplayPort
    • HDMI
  • Suspend and resume:
    • Switchable graphics laptops
      • In Hybrid graphics mode
      • In Nvidia graphics mode
      • In Integrated graphics mode
    • Nvidia desktop
    • Integrated Intel desktop
    • 150 suspend/resume cycles (fwts s3 --s3-multiple 150)
  • Graphics drivers included in the kernel (Intel/AMD)
    • Switchable/hybrid graphics and graphics switching
  • Steam
    • Steam installs via the Pop!_OS .deb in Pop!_Shop
    • Steam launches from launcher
    • Linux native game installs and runs
    • Proton game installs and runs
  • Virtualbox installs and works as expected

@n3m0-22
Copy link

n3m0-22 commented Apr 26, 2022

Kernel Releases Tests

darp6

  • Updating to new kernel works with apt update && apt upgrade
    • No new dependencies are required for kernel update
  • Mic in
  • Audio out:
    • Laptop's built-in speakers
    • Headphones
    • Display Port
    • HDMI
  • Video out via:
    • DisplayPort
    • HDMI
  • Suspend and resume:
    • Integrated Intel desktop
    • 150 suspend/resume cycles (fwts s3 --s3-multiple 150)
  • Graphics drivers included in the kernel (Intel/AMD)
  • Steam
    • Steam installs via the Pop!_OS .deb in Pop!_Shop
    • Steam launches from launcher
    • Linux native game installs and runs
    • Proton game installs and runs
  • Virtualbox installs and works as expected

@jackpot51
Copy link
Member Author

As usual, please test NVIDIA, VirtualBox, and ZFS DKMS packages

@jackpot51
Copy link
Member Author

It was requested that we test that bluetooth audio is working.

@linuxgnuru
Copy link

Tested with kudu6; zfs, nvidia, and virtualbox worked fine and DKMS packages updated.

Copy link

@n3m0-22 n3m0-22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The darp6 won't connect to two sets of bluetooth headphones, but will connect to and plays through a bluetooth speaker.

Update
This may specific to the darp6 as both headphones and the speaker connected and played fine on the gaze15.

@leviport leviport dismissed n3m0-22’s stale review April 26, 2022 20:43

Seems to be a darp6 problem, not a regression with this kernel

@linuxgnuru
Copy link

kudu6 is not resuming from suspend with power button; only way to wake is by closing and opening the lid

@jackpot51
Copy link
Member Author

kudu6 is not resuming from suspend with power button; only way to wake is by closing and opening the lid

Is this a regression?

@linuxgnuru
Copy link

Yes it is a regression. On clean install with 5.16.19 resume worked with the power button on the kudu6.

Tested on pang11 and resume worked with both kernels so may be unique to the kudu

@n3m0-22
Copy link

n3m0-22 commented Apr 28, 2022

@linuxgnuru Did you have the 510.68.02 driver installed as well when this suspend issues happened?

@linuxgnuru
Copy link

linuxgnuru commented Apr 28, 2022

Yes the 510 driver was installed but I was in integrated mode

@n3m0-22
Copy link

n3m0-22 commented Apr 28, 2022

Thanks for the info.

@jackpot51
Copy link
Member Author

Yes the 510 driver was installed but I was in integrated mode

Which 510 driver? If you are testing this change, it should be the one we have released, not one from another PR.

@linuxgnuru
Copy link

Sorry, it has the 510.60.02 from a clean install

@jackpot51
Copy link
Member Author

Sorry, it has the 510.60.02 from a clean install

Welp, since we have now released a newer NVIDIA driver, it might be worth trying it out

@jackpot51
Copy link
Member Author

Obsolete, see #135. I'd recommend testing the kudu6 issue first.

@jackpot51 jackpot51 closed this May 2, 2022
@jackpot51 jackpot51 deleted the linux-5.17.4 branch May 2, 2022 15:22
@n3m0-22 n3m0-22 removed their assignment May 11, 2022
jackpot51 pushed a commit that referenced this pull request Sep 20, 2023
[ Upstream commit 7f74563 ]

LE Create CIS command shall not be sent before all CIS Established
events from its previous invocation have been processed. Currently it is
sent via hci_sync but that only waits for the first event, but there can
be multiple.

Make it wait for all events, and simplify the CIS creation as follows:

Add new flag HCI_CONN_CREATE_CIS, which is set if Create CIS has been
sent for the connection but it is not yet completed.

Make BT_CONNECT state to mean the connection wants Create CIS.

On events after which new Create CIS may need to be sent, send it if
possible and some connections need it. These events are:
hci_connect_cis, iso_connect_cfm, hci_cs_le_create_cis,
hci_le_cis_estabilished_evt.

The Create CIS status/completion events shall queue new Create CIS only
if at least one of the connections transitions away from BT_CONNECT, so
that we don't loop if controller is sending bogus events.

This fixes sending multiple CIS Create for the same CIS in the
"ISO AC 6(i) - Success" BlueZ test case:

< HCI Command: LE Create Co.. (0x08|0x0064) plen 9  #129 [hci0]
        Number of CIS: 2
        CIS Handle: 257
        ACL Handle: 42
        CIS Handle: 258
        ACL Handle: 42
> HCI Event: Command Status (0x0f) plen 4           #130 [hci0]
      LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1
        Status: Success (0x00)
> HCI Event: LE Meta Event (0x3e) plen 29           #131 [hci0]
      LE Connected Isochronous Stream Established (0x19)
        Status: Success (0x00)
        Connection Handle: 257
        ...
< HCI Command: LE Setup Is.. (0x08|0x006e) plen 13  #132 [hci0]
        ...
> HCI Event: Command Complete (0x0e) plen 6         #133 [hci0]
      LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1
        ...
< HCI Command: LE Create Co.. (0x08|0x0064) plen 5  #134 [hci0]
        Number of CIS: 1
        CIS Handle: 258
        ACL Handle: 42
> HCI Event: Command Status (0x0f) plen 4           #135 [hci0]
      LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1
        Status: ACL Connection Already Exists (0x0b)
> HCI Event: LE Meta Event (0x3e) plen 29           #136 [hci0]
      LE Connected Isochronous Stream Established (0x19)
        Status: Success (0x00)
        Connection Handle: 258
        ...

Fixes: c09b80b ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED")
Signed-off-by: Pauli Virtanen <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.