[5.15] Track Steam performance patches #16

kakra · 2021-11-13T17:15:01Z

Export patch series: https://github.com/kakra/linux/pull/16.patch

orbea · 2022-03-29T22:19:54Z

@kakra Is there any information on usage for these patches? I enabled the winesync kernel module, but I am not sure what else needs to be done?

kakra · 2022-03-30T00:55:47Z

The winesync kernel module does nothing by itself unless you also use the matching wine patches. There's very little activity around these patches. Last time I looked, some commits still contained todo and fixme lines, though the branch seems to have been rebased since. The patches should be here: https://repo.or.cz/wine/zf.git/shortlog/refs/heads/fastsync3

This is the commit that actually opens the winesync device: https://repo.or.cz/wine/zf.git/commitdiff/85481c0a11baabc529c252fd36e58ee9e626860d (search for /dev/winesync). I'm linking it just to identify that we are using the correct fastsync branch. You'll also need the patch before it and the patches following it.

I don't think any of the major protonized wine distributions currently include these patches, so you'd need to rebase the patchset yourself. They'll conflict with esync and fsync, so you'd need to either remove those or integrate winesync properly. The conflicts should be easy to resolve: just take care to enable winesync with highest priority when detected, and rebuild the wineserver protocol headers after resolving the conflicts.

It looks like Tkg has limited support for it but it's not enabled by default: https://github.com/Frogging-Family/wine-tkg-git/tree/master/wine-tkg-git/wine-tkg-patches/misc/fastsync

So with the kernel module enabled, you may receive better support from Tkg on how to actually make use of the module in wine.

Since winesync is still in very early (and silent) stages, I'm not currently tracking updates for the kernel module, so it might be out of date. But I don't think there have been any critical updates to it. If you find evidence of a new kernel patch revision, let me know, and I'll happily update this patchset.

orbea · 2022-03-30T02:53:01Z

Thank you, that is useful information. The xanmod project also seems to have kernel patches for winesync, but I haven't compared the patches with what is here.

https://xanmod.org/

If none of the major proton wine builds are using these patches how does this PR relate to steam? Maybe there is something else I missed?

kakra · 2022-03-30T03:27:32Z

I just collect patches somehow related to my Steam installation here... It'll also improve non-Steam gaming probably. It's just a personal preference, and I had to give it a name. OTOH, we have to consider that these changes were mostly pushed by Valve activities (directly and indirectly) - so this gives "Steam" some credit for it. ;-)

kakra · 2022-03-30T03:28:59Z

I'm using sets of patches for different systems, this is one of the kernel patchsets I'm using for the system that has Steam installed - maybe take it that way... ;-)

orbea · 2022-03-30T14:13:30Z

That makes sense, thanks for explaining! I'm not sure how much time I will spend getting winesync to work right now, but I'll update here if I have any more information. :)

[ Upstream commit e4a41c2 ]

The following error is reported when running "./test_progs -t for_each" under arm64:

    bpf_jit: multi-func JIT bug 58 != 56
    [...]
    JIT doesn't support bpf-to-bpf calls

The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below:

    0x48:  21 00 C0 D2    movz x1, #0x1, lsl #32
    0x4c:  21 00 80 F2    movk x1, #0x1

    0x48:  E1 3F C0 92    movn x1, #0x1ff, lsl #32
    0x4c:  41 FE A2 F2    movk x1, #0x17f2, lsl #16
    0x50:  81 70 9F F2    movk x1, #0xfb84

Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change.

Fixes: 69c087b ("bpf: Add bpf_for_each_map_elem() helper")
Signed-off-by: Hou Tao <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
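To see why the instruction count can change between JIT passes, here is a small userspace model in C of the two strategies the commit message contrasts. It is a simplified sketch, not the kernel's code: `variable_mov_len()` and `fixed_addr_mov_len()` are hypothetical names, and the model assumes a variable-length load emits one MOVZ/MOVN plus one MOVK per remaining 16-bit chunk that still needs setting, while the fixed-length load always emits three instructions. The two test values `0x100000001` and `0xfffffe0017f2fb84` are reconstructed from the instruction listing above.

```c
#include <stdint.h>

/* Index of the highest set bit, or -1 if x is zero. */
static int top_bit(uint64_t x)
{
    return x ? 63 - __builtin_clzll(x) : -1;
}

/* Number of 16-bit chunks of val that differ from skip (0x0000 or 0xffff). */
static int chunks_not_equal(uint64_t val, uint16_t skip)
{
    int n = 0;

    for (int shift = 0; shift < 64; shift += 16)
        if (((val >> shift) & 0xffff) != skip)
            n++;
    return n;
}

/*
 * Simplified model of a variable-length 64-bit immediate load:
 * one MOVZ (or MOVN when more chunks are 0xffff than 0x0000), then one
 * MOVK for every lower chunk that is not already correct.
 */
int variable_mov_len(uint64_t val)
{
    int inverse = chunks_not_equal(val, 0xffff) < chunks_not_equal(val, 0x0000);
    uint16_t skip = inverse ? 0xffff : 0x0000;
    int top = top_bit(inverse ? ~val : val);
    int shift = top < 0 ? 0 : (top / 16) * 16;
    int count = 1;                          /* the leading MOVZ/MOVN */

    for (shift -= 16; shift >= 0; shift -= 16)
        if (((val >> shift) & 0xffff) != skip)
            count++;                        /* one MOVK */
    return count;
}

/* Fixed-length model: MOVN plus two MOVKs, whatever the value is. */
int fixed_addr_mov_len(uint64_t val)
{
    (void)val;
    return 3;
}
```

In this model the placeholder value needs two instructions and the settled address needs three, which is the growth the commit message describes; the fixed-length variant gives the same length for both.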

Add support to wait on multiple futexes. This is the interface implemented by this syscall:

    futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
                unsigned int flags, struct timespec *timeout, clockid_t clockid)

    struct futex_waitv {
            __u64 val;
            __u64 uaddr;
            __u32 flags;
            __u32 __reserved;
    };

Given an array of struct futex_waitv, wait on each uaddr. The thread wakes if a futex_wake() is performed at any uaddr. The syscall returns immediately if any waiter has *uaddr != val. *timeout is an optional absolute timeout value for the operation. This syscall supports only 64bit sized timeout structs. The flags argument of the syscall should be empty, but it can be used for future extensions. Flags for shared futexes, sizes, etc. should be used on the individual flags of each waiter.

__reserved is used for explicit padding and should be 0, but it might be used for future extensions. If the userspace uses 32-bit pointers, it should make sure to explicitly cast it when assigning to waitv::uaddr.

Returns the array index of one of the woken futexes. There’s no given information of how many were woken, or any particular attribute of it (if it’s the first woken, if it is of the smaller index...).

Signed-off-by: André Almeida <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
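A minimal userspace sketch of the "returns immediately if any waiter has *uaddr != val" case, written in C against the raw syscall. Several things here are assumptions rather than part of the patch: the struct is mirrored locally as `struct demo_waitv` so the example builds with older headers, the fallback syscall number 449 applies to the x86 and generic syscall tables, `DEMO_FUTEX_32` mirrors the value of `FUTEX_32`, and `demo_value_mismatch()` is a hypothetical helper name. A one-second absolute timeout is passed only as a safety net.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SYS_futex_waitv
#define SYS_futex_waitv 449   /* assumption: x86/generic number, for older headers */
#endif

#define DEMO_FUTEX_32 2       /* assumption: mirrors FUTEX_32 from <linux/futex.h> */

/* Local mirror of struct futex_waitv as laid out in the commit message. */
struct demo_waitv {
    uint64_t val;
    uint64_t uaddr;
    uint32_t flags;
    uint32_t reserved;        /* explicit padding, must stay 0 */
};

static long demo_futex_waitv(struct demo_waitv *waiters, unsigned int nr,
                             unsigned int flags, struct timespec *timeout,
                             clockid_t clockid)
{
    return syscall(SYS_futex_waitv, waiters, nr, flags, timeout, clockid);
}

/*
 * Wait on two futex words where the second one does not hold the expected
 * value. Returns 0 if the kernel rejected the wait with EAGAIN, 1 if the
 * syscall is unavailable (ENOSYS, or EPERM from a sandbox), -1 otherwise.
 */
int demo_value_mismatch(void)
{
    static uint32_t word_a = 0, word_b = 1;
    struct demo_waitv waiters[2] = {
        /* Pointers are cast explicitly, as the commit message advises. */
        { .val = 0, .uaddr = (uint64_t)(uintptr_t)&word_a, .flags = DEMO_FUTEX_32 },
        { .val = 0, .uaddr = (uint64_t)(uintptr_t)&word_b, .flags = DEMO_FUTEX_32 },
    };
    struct timespec deadline;
    long ret;

    /* The timeout is absolute, so build it from the current clock value. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += 1;

    ret = demo_futex_waitv(waiters, 2, 0, &deadline, CLOCK_MONOTONIC);
    if (ret == -1 && errno == EAGAIN)
        return 0;
    if (ret == -1 && (errno == ENOSYS || errno == EPERM))
        return 1;
    return -1;
}
```

On a kernel without this patch series the helper is expected to report the syscall as unavailable instead of failing.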

Currently, the table that stores information about the connected hidraw devices has a mutex to prevent concurrent hidraw users from manipulating the hidraw table (e.g. deleting an entry) while someone is trying to use the table (e.g. issuing an ioctl to the device), which prevents the kernel from dereferencing a NULL pointer. However, since every user has to take the mutex to access the table, whether to manipulate it or only to read its content, this also prevents concurrent read-only access to the table for different devices or the same device (e.g. two hidraw ioctls can't happen at the same time, even if they are completely unrelated).

This proves to be a bottleneck and causes performance issues when using multiple HID devices at the same time, like VR kits where one can have two controllers, the headset and some tracking sensors.

To improve the performance, replace the table mutex with a read-write semaphore, enabling multiple threads to issue parallel syscalls to multiple devices at the same time while protecting the table against concurrent modifications.

Signed-off-by: André Almeida <[email protected]>

Use [defer+madvise] as default khugepaged defrag strategy:

For some reason, the default strategy to respond to THP fault fallbacks is still just madvise, meaning stall if the program wants transparent hugepages, but don't trigger a background reclaim / compaction if THP begins to fail allocations. This creates a snowball effect where we still use the THP code paths, but we almost always fail once a system has been active and busy for a while.

The option "defer" was created for interactive systems where THP can still improve performance. If we have to fall back to a regular page due to an allocation failure or anything else, we will trigger a background reclaim and compaction so future THP attempts succeed and previous attempts eventually have their smaller pages combined without stalling running applications.

We still want madvise to stall applications that explicitly want THP, so defer+madvise _does_ make a ton of sense. Make it the default for interactive systems, especially if the kernel maintainer left transparent hugepages on "always".

Reasoning and details in the original patch: https://lwn.net/Articles/711248/

Signed-off-by: Kai Krakow <[email protected]>
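To check which defrag strategy a running kernel actually uses, the active choice can be read from sysfs, where it is shown in square brackets among the available options. The following C sketch assumes the standard path `/sys/kernel/mm/transparent_hugepage/defrag`; `active_choice()` and `read_thp_defrag()` are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed (active) choice from a sysfs selector line such as
 * "always defer [defer+madvise] madvise never" into out.
 * Returns 0 on success, -1 if no bracketed choice fits into out.
 */
int active_choice(const char *line, char *out, size_t out_len)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!close)
        return -1;
    len = (size_t)(close - open - 1);
    if (len + 1 > out_len)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the active THP defrag strategy of the running kernel. */
int read_thp_defrag(char *out, size_t out_len)
{
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/defrag", "r");

    if (!f)
        return -1;                /* THP not built in, or sysfs not mounted */
    if (!fgets(line, sizeof line, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return active_choice(line, out, out_len);
}
```

On a kernel built with this patch and no runtime override, `read_thp_defrag()` would be expected to yield `defer+madvise`.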

[ Upstream commit a699781 ]

A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg:

        [exception RIP: qed_get_current_link+17]
     #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
     #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
    #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
    #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
    #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
    #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
    #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
    #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
    #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
    #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

    crash> struct net_device.state ffff9a9d21336000
      state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <[email protected]>
Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
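The `state = 5` reading can be decoded with a few lines of C. The bit positions below are taken from the commit message (START is bit 0, PRESENT is bit 1, NOCARRIER is bit 2) and are mirrored in a local enum with hypothetical names instead of being included from kernel headers.

```c
#include <stdbool.h>

/* Bit positions as quoted in the commit message (mirrored locally). */
enum {
    DEMO_LINK_STATE_START     = 0,   /* 0b001 */
    DEMO_LINK_STATE_PRESENT   = 1,   /* 0b010 */
    DEMO_LINK_STATE_NOCARRIER = 2,   /* 0b100 */
};

/* True if the given bit is set in the device state word. */
bool state_bit(unsigned long state, int bit)
{
    return (state >> bit) & 1UL;
}

/* The presence check that guards access to the device. */
bool device_present(unsigned long state)
{
    return state_bit(state, DEMO_LINK_STATE_PRESENT);
}
```

For `state = 5` (0b101), the START and NOCARRIER bits are set and PRESENT is clear, so `device_present(5)` is false. That is the condition the fix tests for before touching the device.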

kakra mentioned this pull request Nov 13, 2021

[5.10] Track Steam performance patches #10

Closed

kakra force-pushed the rebase-5.15/steam-patches branch from d3b6690 to cde56e4 Compare June 2, 2022 07:11

kakra force-pushed the rebase-5.15/steam-patches branch from cde56e4 to 4d58cd7 Compare August 1, 2022 22:03

kakra added the done To be superseded by next LTS label Dec 27, 2022

kakra force-pushed the rebase-5.15/steam-patches branch from 4d58cd7 to de844b1 Compare January 10, 2023 09:21

andrealmeid and others added 18 commits January 10, 2023 10:24

futex,x86: Wire up sys_futex_waitv()

c8febfb

Wire up syscall entry point for x86 arch, for both i386 and x86_64. Signed-off-by: André Almeida <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]

winesync: Introduce the winesync driver and character device.

d583c0e

Signed-off-by: Kai Krakow <[email protected]>

winesync: Reserve a minor device number and ioctl range.

e7a4892

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_CREATE_SEM and WINESYNC_IOC_DELETE.

9e72be7

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_PUT_SEM.

34bab2f

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_WAIT_ANY.

3d2c1a0

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_WAIT_ALL.

d597bb9

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_CREATE_MUTEX.

690a781

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_PUT_MUTEX.

638fa76

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_KILL_OWNER.

43e6bc2

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_READ_SEM.

0259383

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_READ_MUTEX.

e37374c

Signed-off-by: Kai Krakow <[email protected]>

docs: winesync: Add documentation for the winesync uAPI.

e6e331c

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for semaphore state.

416a66e

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for mutex state.

b1d0815

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for WINESYNC_IOC_WAIT_ANY.

abe2505

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for WINESYNC_IOC_WAIT_ALL.

72d8579

Signed-off-by: Kai Krakow <[email protected]>

Zebediah Figura and others added 27 commits January 10, 2023 10:24

selftests: winesync: Add some tests for invalid object handling.

096b9a0

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for wakeup signaling with WINESYNC_IOC_WAIT_ANY.

67e3d34

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for wakeup signaling with WINESYNC_IOC_WAIT_ALL.

978a760

Signed-off-by: Kai Krakow <[email protected]>

maintainers: Add an entry for winesync.

9207ab7

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_CREATE_EVENT.

ea22739

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_SET_EVENT.

26b8d0c

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_RESET_EVENT.

9245256

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_PULSE_EVENT.

c94cc05

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce WINESYNC_IOC_READ_EVENT.

f95eadd

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for manual-reset event state.

2e6d92d

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for auto-reset event state.

ccc60ef

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for wakeup signaling with events.

a08e5d1

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for invalid object handling with events.

1a14a45

Signed-off-by: Kai Krakow <[email protected]>

docs: winesync: Document event APIs.

5fa39b7

Signed-off-by: Kai Krakow <[email protected]>

winesync: Introduce alertable waits.

1230c4e

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add tests for alertable waits.

40d97b5

Signed-off-by: Kai Krakow <[email protected]>

selftests: winesync: Add some tests for wakeup signaling via alerts.

c54c87e

Signed-off-by: Kai Krakow <[email protected]>

docs: winesync: Document alertable waits.

9c2eb2a

Signed-off-by: Kai Krakow <[email protected]>

mm: protect mappings under memory pressure

3976731

Signed-off-by: Oleksandr Natalenko <[email protected]>

mm: Support soft dirty flag reset for VA range.

213451d

Signed-off-by: Kai Krakow <[email protected]>

mm: Support soft dirty flag read with reset.

f44419b

Signed-off-by: Kai Krakow <[email protected]>

blk: elevator: always use bfq unless overridden by flag

338baa5

Also add ifdefs so that elevator_get_default() remains unchanged with respect to upstream if CONFIG_IOSCHED_BFQ is disabled. Signed-off-by: Juuso Alasuutari <[email protected]>

XANMOD: block: set rq_affinity to force full multithreading I/O requests

f00cb22

Signed-off-by: Alexandre Frade <[email protected]>

Make threaded IRQs optionally the default which can be disabled.

b9881e5

Signed-off-by: Kai Krakow <[email protected]>

sched/core: nr_migrate = 128 increases number of tasks to iterate in a single balance run.

f60df06

Signed-off-by: Alexandre Frade <[email protected]>

kakra force-pushed the rebase-5.15/steam-patches branch from de844b1 to f60df06 Compare January 10, 2023 09:25

kakra closed this Mar 11, 2023
