-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.15] Track Steam performance patches #16
Conversation
@kakra Is there any information on usage for these patches? I enabled the winesync kernel module, but I am not sure what else needs to be done? |
The winesync kernel module itself does nothing unless you also use the appropriate wine patches. There's very low activity around these patches, last time I looked at it, some commits still contained todo and fixme lines. It seems it has since been rebased. The patches should be here: https://repo.or.cz/wine/zf.git/shortlog/refs/heads/fastsync3 This is the commit, that actually opens the winesync device: https://repo.or.cz/wine/zf.git/commitdiff/85481c0a11baabc529c252fd36e58ee9e626860d (search for I don't think that any of the major protonized wine distributions currently include these patches so you'd need to rebase the patchset yourself. They'll conflict with esync and fsync so you'd need to either remove those, or integrate it properly (those conflicts should be easy to resolve, just take care to enable winesync with highest priority when detected, rebuilding of the wineserver protocol headers will be needed after resolving conflicts). It looks like Tkg has limited support for it but it's not enabled by default: https://github.com/Frogging-Family/wine-tkg-git/tree/master/wine-tkg-git/wine-tkg-patches/misc/fastsync So with the kernel module enabled, you may receive better support from Tkg on how to actually make use of the module in wine. Since winesync is still in very early (and silent) stages, I do not currently look into updates for the kernel module, it might be out of date. But I don't think there have been any critical updates to it. If you find evidence for a new kernel patch revision, let me know, and I'd happily update this patchset. |
Thank you, that is useful information. The xanmod project also seems to have kernel patches for winesync, but I haven't compared the patches with what is here. If none of the major proton wine builds are using these patches how does this PR relate to steam? Maybe there is something else I missed? |
I just collect patches somehow related to my Steam installation here... It'll also improve non-Steam gaming probably. It's just a personal preference, and I had to give it a name. OTOH, we have to consider that these changes were mostly pushed by Valve activities (directly and indirectly) - so this gives "Steam" some credit for it. ;-) |
I'm using sets of patches for different systems, this is one of the kernel patchsets I'm using for the system that has Steam installed - maybe take it that way... ;-) |
That makes sense, thanks for explaining! I'm not sure how much time I will spend getting winesync to work right now, but I'll update here if I have any more information. :) |
d3b6690
to
cde56e4
Compare
[ Upstream commit e4a41c2 ] The following error is reported when running "./test_progs -t for_each" under arm64: bpf_jit: multi-func JIT bug 58 != 56 [...] JIT doesn't support bpf-to-bpf calls The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below: 0x48: 21 00 C0 D2 movz x1, #0x1, lsl #32 0x4c: 21 00 80 F2 movk x1, #0x1 0x48: E1 3F C0 92 movn x1, #0x1ff, lsl #32 0x4c: 41 FE A2 F2 movk x1, #0x17f2, lsl #16 0x50: 81 70 9F F2 movk x1, #0xfb84 Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change. Fixes: 69c087b ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Hou Tao <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
cde56e4
to
4d58cd7
Compare
4d58cd7
to
de844b1
Compare
Add support to wait on multiple futexes. This is the interface implemented by this syscall: futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes, unsigned int flags, struct timespec *timeout, clockid_t clockid) struct futex_waitv { __u64 val; __u64 uaddr; __u32 flags; __u32 __reserved; }; Given an array of struct futex_waitv, wait on each uaddr. The thread wakes if a futex_wake() is performed at any uaddr. The syscall returns immediately if any waiter has *uaddr != val. *timeout is an optional absolute timeout value for the operation. This syscall supports only 64bit sized timeout structs. The flags argument of the syscall should be empty, but it can be used for future extensions. Flags for shared futexes, sizes, etc. should be used on the individual flags of each waiter. __reserved is used for explicit padding and should be 0, but it might be used for future extensions. If the userspace uses 32-bit pointers, it should make sure to explicitly cast it when assigning to waitv::uaddr. Returns the array index of one of the woken futexes. There’s no given information of how many were woken, or any particular attribute of it (if it’s the first woken, if it is of the smaller index...). Signed-off-by: André Almeida <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
Wire up syscall entry point for x86 arch, for both i386 and x86_64. Signed-off-by: André Almeida <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
…C_IOC_WAIT_ANY. Signed-off-by: Kai Krakow <[email protected]>
…C_IOC_WAIT_ALL. Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
…events. Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Currently, the table that stores information about the connected hidraw devices has a mutex to prevent concurrent hidraw users to manipulate the hidraw table (e.g. delete an entry) while someone is trying to use the table (e.g. issuing an ioctl to the device), preventing the kernel to referencing a NULL pointer. However, since that every user that wants to access the table for both manipulating it and reading it content, this prevents concurrent access to the table for read-only operations for different or the same device (e.g. two hidraw ioctls can't happen at the same time, even if they are completely unrelated). This proves to be a bottleneck and gives performance issues when using multiple HID devices at same time, like VR kits where one can have two controllers, the headset and some tracking sensors. To improve the performance, replace the table mutex with a read-write semaphore, enabling multiple threads to issue parallel syscalls to multiple devices at the same time while protecting the table for concurrent modifications. Signed-off-by: André Almeida <[email protected]>
Use [defer+madvise] as default khugepaged defrag strategy: For some reason, the default strategy to respond to THP fault fallbacks is still just madvise, meaning stall if the program wants transparent hugepages, but don't trigger a background reclaim / compaction if THP begins to fail allocations. This creates a snowball affect where we still use the THP code paths, but we almost always fail once a system has been active and busy for a while. The option "defer" was created for interactive systems where THP can still improve performance. If we have to fallback to a regular page due to an allocation failure or anything else, we will trigger a background reclaim and compaction so future THP attempts succeed and previous attempts eventually have their smaller pages combined without stalling running applications. We still want madvise to stall applications that explicitely want THP, so defer+madvise _does_ make a ton of sense. Make it the default for interactive systems, especially if the kernel maintainer left transparent hugepages on "always". Reasoning and details in the original patch: https://lwn.net/Articles/711248/ Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Oleksandr Natalenko <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
Also add ifdefs so that elevator_get_default() remains unchanged with respect to upstream if CONFIG_IOSCHED_BFQ is disabled. Signed-off-by: Juuso Alasuutari <[email protected]>
Signed-off-by: Alexandre Frade <[email protected]>
Signed-off-by: Kai Krakow <[email protected]>
…a single balance run. Signed-off-by: Alexandre Frade <[email protected]>
de844b1
to
f60df06
Compare
[ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c torvalds#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Export patch series: https://github.com/kakra/linux/pull/16.patch