[pull] master from torvalds:master #130

pull · 2020-09-22T23:08:41Z

See Commits and Changes for more details.

Created by pull[bot]. Want to support this open source service? Please star it : )

We should call destroy_workqueue to destroy mlme_workqueue in error branch. Fixes: ded845a ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Signed-off-by: Liu Jian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stefan Schmidt <[email protected]>

Clang static analysis reports this error adf7242.c:887:6: warning: Assigned value is garbage or undefined len = len_u8; ^ ~~~~~~ len_u8 is set in adf7242_read_reg(lp, 0, &len_u8); When this call fails, len_u8 is not set. So check the return code. Fixes: 7302b9d ("ieee802154/adf7242: Driver for ADF7242 MAC IEEE802154") Signed-off-by: Tom Rix <[email protected]> Acked-by: Michael Hennerich <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stefan Schmidt <[email protected]>

The 4329 throughput drops from 40.2 Mbits/sec to 190 Kbits/sec in 2G 11n mode because the commit b41c232 ("brcmfmac: reserve 2 credits for host tx control path"). To fix the issue, host driver only reserves tx control credit when there is a txctl frame is pending to send. And we also check available credit by using "not equal to 0" instead of "greater than 0" because tx_max and tx_seq are circled positive numbers. Reported-by: Dmitry Osipenko <[email protected]> Fixes: b41c232 ("brcmfmac: reserve 2 credits for host tx control path") Signed-off-by: Wright Feng <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]

Fix the check for the mainline vboxsf code being used with the old mount.vboxsf mount binary from the out-of-tree vboxsf version doing a comparison between signed and unsigned data types. This fixes the following smatch warnings: fs/vboxsf/super.c:390 vboxsf_parse_monolithic() warn: impossible condition '(options[1] == (255)) => ((-128)-127 == 255)' fs/vboxsf/super.c:391 vboxsf_parse_monolithic() warn: impossible condition '(options[2] == (254)) => ((-128)-127 == 254)' fs/vboxsf/super.c:392 vboxsf_parse_monolithic() warn: impossible condition '(options[3] == (253)) => ((-128)-127 == 253)' Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Al Viro <[email protected]>

…ving stations The implementation of embedding WTBL update inside the STA_REC update is buggy on the MT7615 v2 firmware. This leads to connection issues after a station has connected and disconnected again. Switch to the v1 MCU API ops, since they have received much more testing and should be more stable. On MT7622 and later, the v2 API is more actively used, so we should keep using it as well. Fixes: 6849e29 ("mt76: mt7615: add starec operating flow for firmware v2") Cc: [email protected] Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]

Using dev_kfree_skb for tx skbs breaks AQL. This worked until now only by accident, because a mac80211 issue breaks AQL on drivers with firmware rate control that report the rate via ieee80211_tx_status_ext as struct rate_info. Signed-off-by: Felix Fietkau <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]

Following commit e186967 ("mwifiex: Prevent memory corruption handling keys") the mwifiex driver fails to authenticate with certain networks, specifically networks with 256 bit keys, and repeatedly asks for the password. The kernel log repeats the following lines (id and bssid redacted): mwifiex_pcie 0000:01:00.0: info: trying to associate to '<id>' bssid <bssid> mwifiex_pcie 0000:01:00.0: info: associated to bssid <bssid> successfully mwifiex_pcie 0000:01:00.0: crypto keys added mwifiex_pcie 0000:01:00.0: info: successfully disconnected from <bssid>: reason code 3 Tracking down this problem lead to the overflow check introduced by the aforementioned commit into mwifiex_ret_802_11_key_material_v2(). This check fails on networks with 256 bit keys due to the current storage size for AES keys in struct mwifiex_aes_param being only 128 bit. To fix this issue, increase the storage size for AES keys to 256 bit. Fixes: e186967 ("mwifiex: Prevent memory corruption handling keys") Signed-off-by: Maximilian Luz <[email protected]> Reported-by: Kaloyan Nikolov <[email protected]> Tested-by: Kaloyan Nikolov <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Reviewed-by: Brian Norris <[email protected]> Tested-by: Brian Norris <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]

It seems that due to a copy & paste error the void pointer in batadv_choose_backbone_gw() is cast to the wrong type. Fixing this by using "struct batadv_bla_backbone_gw" instead of "struct batadv_bla_claim" which better matches the caller's side. For now it seems that we were lucky because the two structs both have their orig/vid and addr/vid in the beginning. However I stumbled over this issue when I was trying to add some debug variables in front of "orig" in batadv_backbone_gw, which caused hash lookups to fail. Fixes: 07568d0 ("batman-adv: don't rely on positions in struct for hashing") Signed-off-by: Linus Lüssing <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]>

While compiling libbpf, some GCC versions (at least 8.4.0) have difficulty determining control flow and a emit warning for potentially uninitialized usage of 'map', which results in a build error if using "-Werror": In file included from libbpf.c:56: libbpf.c: In function '__bpf_object__open': libbpf_internal.h:59:2: warning: 'map' may be used uninitialized in this function [-Wmaybe-uninitialized] libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~~ libbpf.c:5032:18: note: 'map' was declared here struct bpf_map *map, *targ_map; ^~~ The warning/error is false based on code inspection, so silence it with a NULL initialization. Fixes: 646f02f ("libbpf: Add BTF-defined map-in-map support") Reference: 063e688 ("libbpf: Fix false uninitialized variable warning") Signed-off-by: Tony Ambardar <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

The new resolve_btfids tool did not clean up the feature detection folder on 'make clean', and also was not called properly from the clean rule in tools/make/ folder on its 'make clean'. This lead to stale objects being left around, which could cause feature detection to fail on subsequent builds. Fixes: fbbb68d ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

Ubuntu mainline builds for ppc64le are failing with the below error (*): CALL /home/kernel/COD/linux/scripts/atomic/check-atomics.sh DESCEND bpf/resolve_btfids Auto-detecting system features: ... libelf: [ [32mon[m ] ... zlib: [ [32mon[m ] ... bpf: [ [31mOFF[m ] BPF API too old make[6]: *** [Makefile:295: bpfdep] Error 1 make[5]: *** [Makefile:54: /home/kernel/COD/linux/debian/build/build-generic/tools/bpf/resolve_btfids//libbpf.a] Error 2 make[4]: *** [Makefile:71: bpf/resolve_btfids] Error 2 make[3]: *** [/home/kernel/COD/linux/Makefile:1890: tools/bpf/resolve_btfids] Error 2 make[2]: *** [/home/kernel/COD/linux/Makefile:335: __build_one_by_one] Error 2 make[2]: Leaving directory '/home/kernel/COD/linux/debian/build/build-generic' make[1]: *** [Makefile:185: __sub-make] Error 2 make[1]: Leaving directory '/home/kernel/COD/linux' resolve_btfids needs to be build as a host binary and it needs libbpf. However, libbpf Makefile hardcodes an include path utilizing $(ARCH). This results in mixing of cross-architecture headers resulting in a build failure. The specific header include path doesn't seem necessary for a libbpf build. Hence, remove the same. (*) https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.9-rc3/ppc64el/log Reported-by: Vaidyanathan Srinivasan <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

Currently, for hashmap, the bpf iterator will grab a bucket lock, a spinlock, before traversing the elements in the bucket. This can ensure all bpf visted elements are valid. But this mechanism may cause deadlock if update/deletion happens to the same bucket of the visited map in the program. For example, if we added bpf_map_update_elem() call to the same visited element in selftests bpf_iter_bpf_hash_map.c, we will have the following deadlock: ============================================ WARNING: possible recursive locking detected 5.9.0-rc1+ #841 Not tainted -------------------------------------------- test_progs/1750 is trying to acquire lock: ffff9a5bb73c5e70 (&htab->buckets[i].raw_lock){....}-{2:2}, at: htab_map_update_elem+0x1cf/0x410 but task is already holding lock: ffff9a5bb73c5e20 (&htab->buckets[i].raw_lock){....}-{2:2}, at: bpf_hash_map_seq_find_next+0x94/0x120 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&htab->buckets[i].raw_lock); lock(&htab->buckets[i].raw_lock); *** DEADLOCK *** ... Call Trace: dump_stack+0x78/0xa0 __lock_acquire.cold.74+0x209/0x2e3 lock_acquire+0xba/0x380 ? htab_map_update_elem+0x1cf/0x410 ? __lock_acquire+0x639/0x20c0 _raw_spin_lock_irqsave+0x3b/0x80 ? htab_map_update_elem+0x1cf/0x410 htab_map_update_elem+0x1cf/0x410 ? lock_acquire+0xba/0x380 bpf_prog_ad6dab10433b135d_dump_bpf_hash_map+0x88/0xa9c ? find_held_lock+0x34/0xa0 bpf_iter_run_prog+0x81/0x16e __bpf_hash_map_seq_show+0x145/0x180 bpf_seq_read+0xff/0x3d0 vfs_read+0xad/0x1c0 ksys_read+0x5f/0xe0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ... The bucket_lock first grabbed in seq_ops->next() called by bpf_seq_read(), and then grabbed again in htab_map_update_elem() in the bpf program, causing deadlocks. Actually, we do not need bucket_lock here, we can just use rcu_read_lock() similar to netlink iterator where the rcu_read_{lock,unlock} likes below: seq_ops->start(): rcu_read_lock(); seq_ops->next(): rcu_read_unlock(); /* next element */ rcu_read_lock(); seq_ops->stop(); rcu_read_unlock(); Compared to old bucket_lock mechanism, if concurrent updata/delete happens, we may visit stale elements, miss some elements, or repeat some elements. I think this is a reasonable compromise. For users wanting to avoid stale, missing/repeated accesses, bpf_map batch access syscall interface can be used. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

Added bpf_{updata,delete}_map_elem to the very map element the iter program is visiting. Due to rcu protection, the visited map elements, although stale, should still contain correct values. $ ./test_progs -n 4/18 #4/18 bpf_hash_map:OK #4 bpf_iter:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

Yonghong Song says: ==================== Currently, the bpf hashmap iterator takes a bucket_lock, a spin_lock, before visiting each element in the bucket. This will cause a deadlock if a map update/delete operates on an element with the same bucket id of the visited map. To avoid the deadlock, let us just use rcu_read_lock instead of bucket_lock. This may result in visiting stale elements, missing some elements, or repeating some elements, if concurrent map delete/update happens for the same map. I think using rcu_read_lock is a reasonable compromise. For users caring stale/missing/repeating element issues, bpf map batch access syscall interface can be used. Note that another approach is during bpf_iter link stage, we check whether the iter program might be able to do update/delete to the visited map. If it is, reject the link_create. Verifier needs to record whether an update/delete operation happens for each map for this approach. I just feel this checking is too specialized, hence still prefer rcu_read_lock approach. Patch #1 has the kernel implementation and Patch #2 added a selftest which can trigger deadlock without Patch #1. ==================== Signed-off-by: Alexei Starovoitov <[email protected]>

PVC devices are virtual devices in this driver stacked on top of the actual HDLC device. They are the devices normal users would use. PVC devices have two types: normal PVC devices and Ethernet-emulating PVC devices. When transmitting data with PVC devices, the ndo_start_xmit function will prepend a header of 4 or 10 bytes. Currently this driver requests this headroom to be reserved for normal PVC devices by setting their hard_header_len to 10. However, this does not work when these devices are used with AF_PACKET/RAW sockets. Also, this driver does not request this headroom for Ethernet-emulating PVC devices (but deals with this problem by reallocating the skb when needed, which is not optimal). This patch replaces hard_header_len with needed_headroom, and set needed_headroom for Ethernet-emulating PVC devices, too. This makes the driver to request headroom for all PVC devices in all cases. Cc: Krzysztof Halasa <[email protected]> Cc: Martin Schiller <[email protected]> Signed-off-by: Xie He <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

eni_init_one() misses to call pci_disable_device() in an error path. Jump to err_disable to fix it. Fixes: ede58ef ("atm: remove deprecated use of pci api") Signed-off-by: Jing Xiangfeng <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

The following deadlock scenario is triggered by syzbot: Thread A: Thread B: tcf_idr_check_alloc() ... populate_metalist() rtnl_unlock() rtnl_lock() ... request_module() tcf_idr_check_alloc() rtnl_lock() At this point, thread A is waiting for thread B to release RTNL lock, while thread B is waiting for thread A to commit the IDR change with tcf_idr_insert() later. Break this deadlock situation by preloading ife modules earlier, before tcf_idr_check_alloc(), this is fine because we only need to load modules we need potentially. Reported-and-tested-by: [email protected] Fixes: 0190c1d ("net: sched: atomically check-allocate action") Cc: Jamal Hadi Salim <[email protected]> Cc: Vlad Buslov <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

The unicast packet rerouting code makes several assumptions. For instance it assumes that there is always exactly one destination in the TT. This breaks for multicast frames in a unicast packets in several ways: For one thing if there is actually no TT entry and the destination node was selected due to the multicast tvlv flags it announced. Then an intermediate node will wrongly drop the packet. For another thing if there is a TT entry but the TTVN of this entry is newer than the originally addressed destination node: Then the intermediate node will wrongly redirect the packet, leading to duplicated multicast packets at a multicast listener and missing packets at other multicast listeners or multicast routers. Fixing this by not applying the unicast packet rerouting to batman-adv unicast packets with a multicast payload. We are not able to detect a roaming multicast listener at the moment and will just continue to send the multicast frame to both the new and old destination for a while in case of such a roaming multicast listener. Fixes: a73105b ("batman-adv: improved client announcement mechanism") Signed-off-by: Linus Lüssing <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]>

We free memory regardless of the return value of SET_FUNC_STATE cmd in hinic_close function to avoid memory leak and this cmd may timeout when fw is busy with handling other cmds, so we bump up the timeout of this cmd to ensure it won't return failure. Fixes: 00e57a6 ("net-next/hinic: Add Tx operation") Signed-off-by: Luo bin <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

Firmware erases the entire flash region which may take several seconds before flashing, so we bump up the timeout to ensure this cmd won't return failure. Fixes: 5e126e7 ("hinic: add firmware update support") Signed-off-by: Luo bin <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

Luo bin says: ==================== hinic: BugFixes The bugs fixed in this patchset have been present since the following commits: patch #1: Fixes: 00e57a6 ("net-next/hinic: Add Tx operation") patch #2: Fixes: 5e126e7 ("hinic: add firmware update support") ==================== Signed-off-by: Jakub Kicinski <[email protected]>

Pass the correct offset to clear the stale filter hit bytes counter. Otherwise, the counter starts incrementing from the stale information, instead of 0. Fixes: 12b276f ("cxgb4: add support to create hash filters") Signed-off-by: Ganji Aravind <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

PAE bit of NCFGR register, when set, pauses transmission if a non-zero 802.3 classic pause frame is received. Fixes: 7897b07 ("net: macb: convert to phylink") Signed-off-by: Parshuram Thombare <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

When removing a port from a VLAN we are just erasing the member config for the VLAN, which is wrong: other ports can be using it. Just mask off the port and only zero out the rest of the member config once ports using of the VLAN are removed from it. Reported-by: Florian Fainelli <[email protected]> Fixes: d865295 ("net: dsa: realtek-smi: Add Realtek SMI driver") Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

This patch causes a regression betwen Kernel 5.7 and 5.8 at wlcore: with it applied, WiFi stops working, and the Kernel starts printing this message every second: wlcore: PHY firmware version: Rev 8.2.0.0.242 wlcore: firmware booted (Rev 8.9.0.0.79) wlcore: ERROR command execute failure 14 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 133 at drivers/net/wireless/ti/wlcore/main.c:795 wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore] Modules linked in: wl18xx wlcore mac80211 libarc4 cfg80211 rfkill snd_soc_hdmi_codec crct10dif_ce wlcore_sdio adv7511 cec kirin9xx_drm(C) kirin9xx_dw_drm_dsi(C) drm_kms_helper drm ip_tables x_tables ipv6 nf_defrag_ipv6 CPU: 0 PID: 133 Comm: kworker/0:1 Tainted: G WC 5.8.0+ #186 Hardware name: HiKey970 (DT) Workqueue: events_freezable ieee80211_restart_work [mac80211] pstate: 60000005 (nZCv daif -PAN -UAO BTYPE=--) pc : wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore] lr : wl12xx_queue_recovery_work+0x24/0x30 [wlcore] sp : ffff8000126c3a60 x29: ffff8000126c3a60 x28: 00000000000025de x27: 0000000000000010 x26: 0000000000000005 x25: ffff0001a5d49e80 x24: ffff8000092cf580 x23: ffff0001b7c12623 x22: ffff0001b6fcf2e8 x21: ffff0001b7e46200 x20: 00000000fffffffb x19: ffff0001a78e6400 x18: 0000000000000030 x17: 0000000000000001 x16: 0000000000000001 x15: ffff0001b7e46670 x14: ffffffffffffffff x13: ffff8000926c37d7 x12: ffff8000126c37e0 x11: ffff800011e01000 x10: ffff8000120526d0 x9 : 0000000000000000 x8 : 3431206572756c69 x7 : 6166206574756365 x6 : 0000000000000c2c x5 : 0000000000000000 x4 : ffff0001bf1361e8 x3 : ffff0001bf1790b0 x2 : 0000000000000000 x1 : ffff0001a5d49e80 x0 : 0000000000000001 Call trace: wl12xx_queue_recovery_work.part.0+0x6c/0x74 [wlcore] wl12xx_queue_recovery_work+0x24/0x30 [wlcore] wl1271_cmd_set_sta_key+0x258/0x25c [wlcore] wl1271_set_key+0x7c/0x2dc [wlcore] wlcore_set_key+0xe4/0x360 [wlcore] wl18xx_set_key+0x48/0x1d0 [wl18xx] wlcore_op_set_key+0xa4/0x180 [wlcore] ieee80211_key_enable_hw_accel+0xb0/0x2d0 [mac80211] ieee80211_reenable_keys+0x70/0x110 [mac80211] ieee80211_reconfig+0xa00/0xca0 [mac80211] ieee80211_restart_work+0xc4/0xfc [mac80211] process_one_work+0x1cc/0x350 worker_thread+0x13c/0x470 kthread+0x154/0x160 ret_from_fork+0x10/0x30 ---[ end trace b1f722abf9af5919 ]--- wlcore: WARNING could not set keys wlcore: ERROR Could not add or replace key wlan0: failed to set key (4, ff:ff:ff:ff:ff:ff) to hardware (-5) wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.79 wlcore: pc: 0x0, hint_sts: 0x00000040 count: 39 wlcore: down wlcore: down ieee80211 phy0: Hardware restart was requested mmc_host mmc0: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 25000000Hz (slot req 25000000Hz, actual 25000000HZ div = 0) wlcore: PHY firmware version: Rev 8.2.0.0.242 wlcore: firmware booted (Rev 8.9.0.0.79) wlcore: ERROR command execute failure 14 ------------[ cut here ]------------ Tested on Hikey 970. This reverts commit 2b7aadd. Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/f0a2cb7ea606f1a284d4c23cbf983da2954ce9b6.1598420968.git.mchehab+huawei@kernel.org

When the driver goes through PCIe AER reset in error state, all firmware messages will timeout because the PCIe bus is no longer accessible. This can lead to AER reset taking many minutes to complete as each firmware command takes time to timeout. Define a new macro BNXT_NO_FW_ACCESS() to skip these firmware messages when either firmware is in fatal error state or when pci_channel_offline() is true. It now takes a more reasonable 20 to 30 seconds to complete AER recovery. Fixes: b4fff20 ("bnxt_en: Do not send firmware messages if firmware is in error state.") Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

bnxt_fw_reset_task() which runs from a workqueue can race with bnxt_remove_one(). For example, if firmware reset and VF FLR are happening at about the same time. bnxt_remove_one() already cancels the workqueue and waits for it to finish, but we need to do this earlier before the devlink reporters are destroyed. This will guarantee that the devlink reporters will always be valid when bnxt_fw_reset_task() is still running. Fixes: b148bb2 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().") Reviewed-by: Edwin Peer <[email protected]> Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

Michael Chan says: ==================== bnxt_en: Two bug fixes. The first patch fixes AER recovery by reducing the time from several minutes to a more reasonable 20 - 30 seconds. The second patch fixes a possible NULL pointer crash during firmware reset. ==================== Signed-off-by: Jakub Kicinski <[email protected]>

Fix kernel-doc warning in <linux/netdevice.h>: ../include/linux/netdevice.h:2158: warning: Function parameter or member 'proto_down_reason' not described in 'net_device' Fixes: 829eb20 ("rtnetlink: add support for protodown reason") Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

Fix kernel-doc warning in <linux/netdevice.h>: ../include/linux/netdevice.h:2158: warning: Function parameter or member 'xdp_state' not described in 'net_device' Fixes: 7f0a838 ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") Signed-off-by: Randy Dunlap <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Alexei Starovoitov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>

Currently, the RQs are temporarily deactivated while hot-replacing the XDP program, and napi_synchronize is used to make sure rq->xdp_prog is not in use. However, napi_synchronize is not ideal: instead of waiting till the end of a NAPI cycle, it polls and waits until NAPI is not running, sleeping for 1ms between the periodic checks. Under heavy workloads, this loop will never end, which may even lead to a kernel panic if the kernel detects the hangup. Such workloads include XSK TX and possibly also heavy RX (XSK or normal). The fix is inspired by commit 326fe02 ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"). As mlx5e_xdp_handle is already protected by rcu_read_lock, and bpf_prog_put uses call_rcu to free the program, there is no need for additional synchronization if proper RCU functions are used to access the pointer. This patch converts all accesses to rq->xdp_prog to use RCU functions. Fixes: 8699415 ("net/mlx5e: XDP fast RX drop bpf programs support") Fixes: db05815 ("net/mlx5e: Add XSK zero-copy support") Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

As described in the previous commit, napi_synchronize doesn't quite fit the purpose when we just need to wait until the currently running NAPI quits. Its implementation waits until NAPI is not running by polling and waiting for 1ms in between. In cases where we need to deactivate one queue (e.g., recovery flows) or where we deactivate them one-by-one (deactivate channel flow), we may get stuck in napi_synchronize forever if other queues keep NAPI active, causing a soft lockup. Depending on kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result in a kernel panic. To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap the whole NAPI in rcu_read_lock. Fixes: acc6c59 ("net/mlx5e: Split open/close channels to stages") Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

…ot ready When deleting vxlan flow rule under multipath, tun_info in parse_attr is not freed when the rule is not ready. Fixes: ef06c9e ("net/mlx5e: Allow one failure when offloading tc encap rules under multipath") Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

Add missing mapping remove call when removing ct rule, as the mapping was allocated when ct rule was adding with ct_label. Also there is a missing mapping remove call in error flow. Fixes: 54b154e ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID") Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Eli Britstein <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

…pported The cited commit creates peer miss group during switchdev mode initialization in order to handle miss packets correctly while in VF LAG mode. This is done regardless of FW support of such groups which could cause rules setups failure later on. Fix by adding FW capability check before creating peer groups/rule. Fixes: ac004b8 ("net/mlx5e: E-Switch, Add peer miss rules") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Raed Salem <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

The field mask value is provided in network byte order and has to be converted to host byte order before calculating pedit mask first bit. Fixes: 88f30bb ("net/mlx5e: Bit sized fields rewrite support") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

Currently the FW does not generate events for counters other than error counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s uses) might run in atomic context, while the FW interface is non atomic. Thus, 'ip' is not allowed to issue FW commands, so it will only display cached counters in the driver. Add a SW counter (mcast_packets) in the driver to count rx multicast packets. The counter also counts broadcast packets, as we consider it a special case of multicast. Use the counter value when calling "ip -s"/"ifconfig". Fixes: f62b8bb ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality") Signed-off-by: Ron Diskin <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

The cited commit started to reuse function mlx5e_update_ndo_stats() for the representors as well. However, the function is hard-coded to work on mlx5e_nic_stats_grps only. Due to this issue, the representors statistics were not updated in the output of "ip -s". Fix it to work with the correct group by extracting it from the caller's profile. Also, while at it and since this function became generic, move it to en_stats.c and rename it accordingly. Fixes: 8a236b1 ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra") Signed-off-by: Alaa Hleihel <[email protected]> Reviewed-by: Vlad Buslov <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

The set of TLS TX global SW counters in mlx5e_tls_sw_stats_desc is updated from all rings by using atomic ops. This set of stats is used only in the FPGA TLS use case, not in the Connect-X TLS one, where regular per-ring counters are used. Do not expose them in the Connect-X use case, as this would cause counter duplication. For example, tx_tls_drop_no_sync_data would appear twice in the ethtool stats. Fixes: d2ead1f ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

Using synchronize_rcu() is sufficient to wait until running NAPI quits. See similar upstream fix with detailed explanation: ("net/mlx5e: Use synchronize_rcu to sync with NAPI") This change also fixes a possible use-after-free as the NAPI might be already released at this stage. Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>

Progress params dma address is never unmapped, unmap it when completion handling is over. Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Tariq Toukan <[email protected]>

Resync progress params buffer and dma weren't released on error, Add missing error unwinding for resync_post_get_progress_params(). Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Tariq Toukan <[email protected]>

The spinlock only needed when accessing the channel's icosq, grab the lock after the buf allocation in resync_post_get_progress_params() to avoid kzalloc(GFP_KERNEL) in atomic context. Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support") Reported-by: YueHaibing <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Tariq Toukan <[email protected]>

Returning errno is a bug, fix that. Also fixes smatch warnings: drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453 mlx5e_fec_in_caps() warn: signedness bug returning '(-95)' Fixes: 2132b71 ("net/mlx5e: Advertise globaly supported FEC modes") Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Reviewed-by: Aya Levin <[email protected]>

Update maintainers for MediaTek switch driver with Landen Chao who is familiar with MediaTek MT753x switch devices and will help maintenance from the vendor side. Cc: Steven Liu <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Landen Chao <[email protected]> Signed-off-by: David S. Miller <[email protected]>

…ux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes-2020-09-18 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. v1->v2: Remove missing patch from -stable list. For -stable v5.1 ('net/mlx5: Fix FTE cleanup') For -stable v5.3 ('net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported') ('net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported') For -stable v5.7 ('net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready') For -stable v5.8 ('net/mlx5e: Use RCU to protect rq->xdp_prog') ('net/mlx5e: Fix endianness when calculating pedit mask first bit') ('net/mlx5e: Use synchronize_rcu to sync with NAPI') ==================== Signed-off-by: David S. Miller <[email protected]>

… under RCU When calling the RCU brother of br_vlan_get_pvid(), lockdep warns: ============================= WARNING: suspicious RCU usage 5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted ----------------------------- net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage! Call trace: lockdep_rcu_suspicious+0xd4/0xf8 __br_vlan_get_pvid+0xc0/0x100 br_vlan_get_pvid_rcu+0x78/0x108 The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group() which calls rtnl_dereference() instead of rcu_dereference(). In turn, rtnl_dereference() calls rcu_dereference_protected() which assumes operation under an RCU write-side critical section, which obviously is not the case here. So, when the incorrect primitive is used to access the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may cause various unexpected problems. I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot share the same implementation. So fix the bug by splitting the 2 functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups under proper locking annotations. Fixes: 7582f5b ("bridge: add br_vlan_get_pvid_rcu()") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>

User space could send an invalid INET_DIAG_REQ_PROTOCOL attribute as caught by syzbot. BUG: KMSAN: uninit-value in inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline] BUG: KMSAN: uninit-value in __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147 CPU: 0 PID: 8505 Comm: syz-executor174 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219 inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline] __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147 inet_diag_dump_compat+0x2a5/0x380 net/ipv4/inet_diag.c:1254 netlink_dump+0xb73/0x1cb0 net/netlink/af_netlink.c:2246 __netlink_dump_start+0xcf2/0xea0 net/netlink/af_netlink.c:2354 netlink_dump_start include/linux/netlink.h:246 [inline] inet_diag_rcv_msg_compat+0x5da/0x6c0 net/ipv4/inet_diag.c:1288 sock_diag_rcv_msg+0x24f/0x620 net/core/sock_diag.c:256 netlink_rcv_skb+0x6d7/0x7e0 net/netlink/af_netlink.c:2470 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11c8/0x1490 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x173a/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d1/0x820 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x441389 Code: e8 fc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff3b02ce98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441389 RDX: 0000000000000000 RSI: 0000000020001500 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000402130 R13: 00000000004021c0 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80 slab_alloc_node mm/slub.c:2907 [inline] __kmalloc_node_track_caller+0x9aa/0x12f0 mm/slub.c:4511 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x35f/0xb30 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1094 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline] netlink_sendmsg+0xdb9/0x1840 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x6d1/0x820 net/socket.c:2440 __do_sys_sendmsg net/socket.c:2449 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 3f935c7 ("inet_diag: support for wider protocol numbers") Signed-off-by: Eric Dumazet <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: Christoph Paasch <[email protected]> Cc: Mat Martineau <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT, L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from matching on src_port and dst_port for TCP and UDP packets. Signed-off-by: Xiaoliang Yang <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Since these were copied from the Felix VCAP IS2 code, and only the offsets were adjusted, the order of the bit fields is still wrong. Fix it. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>

The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514 datasheet. Whether they work or not is unknown to me. On VSC9959 and VSC9953, with the same mistake and same discrepancy from the documentation, tc-flower src_port and dst_port rules did not work, so I am assuming the same is true here. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Vladimir Oltean says: ==================== Fix broken tc-flower rules for mscc_ocelot switches All 3 switch drivers from the Ocelot family have the same bug in the VCAP IS2 key offsets, which is that some keys are in the incorrect order. ==================== Signed-off-by: David S. Miller <[email protected]>

Pull block fixes from Jens Axboe: "A few NVMe fixes, and a dasd write zero fix" * tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block: nvmet: get transport reference for passthru ctrl nvme-core: get/put ctrl and transport module in nvme_dev_open/release() nvme-tcp: fix kconfig dependency warning when !CRYPTO nvme-pci: disable the write zeros command for Intel 600P/P3100 s390/dasd: Fix zero write for FBA devices

Pull io_uring fixes from Jens Axboe: "A few fixes - most of them regression fixes from this cycle, but also a few stable heading fixes, and a build fix for the included demo tool since some systems now actually have gettid() available" * tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block: io_uring: fix openat/openat2 unified prep handling io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL tools/io_uring: fix compile breakage io_uring: don't use retry based buffered reads for non-async bdev io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there io_uring: don't run task work on an exiting task io_uring: drop 'ctx' ref on task work cancelation io_uring: grab any needed state during defer prep

Pull networking fixes from Jakub Kicinski: - fix failure to add bond interfaces to a bridge, the offload-handling code was too defensive there and recent refactoring unearthed that. Users complained (Ido) - fix unnecessarily reflecting ECN bits within TOS values / QoS marking in TCP ACK and reset packets (Wei) - fix a deadlock with bpf iterator. Hopefully we're in the clear on this front now... (Yonghong) - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel) - fix AQL on mt76 devices with FW rate control and add a couple of AQL issues in mac80211 code (Felix) - fix authentication issue with mwifiex (Maximilian) - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro) - fix exception handling for multipath routes via same device (David Ahern) - revert back to a BH spin lock flavor for nsid_lock: there are paths which do require the BH context protection (Taehee) - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke) - fix ife module load deadlock (Cong) - make an adjustment to netlink reply message type for code added in this release (the sole change touching uAPI here) (Michal) - a number of fixes for small NXP and Microchip switches (Vladimir) [ Pull request acked by David: "you can expect more of this in the future as I try to delegate more things to Jakub" ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits) net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU net: Update MAINTAINERS for MediaTek switch driver net/mlx5e: mlx5e_fec_in_caps() returns a boolean net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock net/mlx5e: kTLS, Fix leak on resync error flow net/mlx5e: kTLS, Add missing dma_unmap in RX resync net/mlx5e: kTLS, Fix napi sync and possible use-after-free net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats() net/mlx5e: Fix multicast counter not up-to-date in "ip -s" net/mlx5e: Fix endianness when calculating pedit mask first bit net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported net/mlx5e: CT: Fix freeing ct_label mapping net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready net/mlx5e: Use synchronize_rcu to sync with NAPI net/mlx5e: Use RCU to protect rq->xdp_prog ...

…/viro/vfs Pull vfs fixes from Al Viro: "No common topic, just assorted fixes" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fuse: fix the ->direct_IO() treatment of iov_iter fs: fix cast in fsparam_u32hex() macro vboxsf: Fix the check for the old binary mount-arguments struct

A cleanup patch from my legacy timer series broke ia64 and led to RCU stall errors and a fast system clock: [ 909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds. [ 909.360108] Not tainted 5.10.0+ #130 [ 909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 909.360108] task:systemd-sysv-ge state:D stack: 0 pid: 200 ppid: 189 flags:0x00000000 [ 909.364108] [ 909.364108] Call Trace: [ 909.364423] [<a00000010109b210>] __schedule+0x890/0x21e0 [ 909.364423] sp=e0000100487d7b70 bsp=e0000100487d1748 [ 909.368423] [<a00000010109cc00>] schedule+0xa0/0x240 [ 909.368423] sp=e0000100487d7b90 bsp=e0000100487d16e0 [ 909.368558] [<a00000010109ce70>] io_schedule+0x70/0xa0 [ 909.368558] sp=e0000100487d7b90 bsp=e0000100487d16c0 [ 909.372290] [<a00000010109e1c0>] bit_wait_io+0x20/0xe0 [ 909.372290] sp=e0000100487d7b90 bsp=e0000100487d1698 [ 909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 909.376290] [<a00000010109d860>] __wait_on_bit+0xc0/0x1c0 [ 909.376290] sp=e0000100487d7b90 bsp=e0000100487d1648 [ 909.374168] rcu: 3-....: (2 ticks this GP) idle=19e/1/0x4000000000000002 softirq=1581/1581 fqs=2 [ 909.374168] (detected by 0, t=5661 jiffies, g=1089, q=3) [ 909.376290] [<a00000010109da80>] out_of_line_wait_on_bit+0x120/0x140 [ 909.376290] sp=e0000100487d7b90 bsp=e0000100487d1610 [ 909.374168] Task dump for CPU 3: [ 909.374168] task:khungtaskd state:R running task Revert most of my patch to make this work again, including the extra update_process_times()/profile_tick() and the local_irq_enable() in the loop that I expected not to be needed here. I have not found out exactly what goes wrong, and would suggest that someone with hardware access tries to convert this code into a singleshot clockevent driver, which should give better behavior in all cases. Reported-by: John Paul Adrian Glaubitz <[email protected]> Fixes: 2b49ddc ("ia64: convert to legacy_timer_tick") Cc: John Stultz <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Frederic Weisbecker <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>

Like commit 1cf3bfc ("bpf: Support 64-bit pointers to kfuncs") for s390x, add support for 64-bit pointers to kfuncs for LoongArch. Since the infrastructure is already implemented in BPF core, the only thing need to be done is to override bpf_jit_supports_far_kfunc_call(). Before this change, several test_verifier tests failed: # ./test_verifier | grep # | grep FAIL #119/p calls: invalid kfunc call: ptr_to_mem to struct with non-scalar FAIL #120/p calls: invalid kfunc call: ptr_to_mem to struct with nesting depth > 4 FAIL #121/p calls: invalid kfunc call: ptr_to_mem to struct with FAM FAIL #122/p calls: invalid kfunc call: reg->type != PTR_TO_CTX FAIL #123/p calls: invalid kfunc call: void * not allowed in func proto without mem size arg FAIL #124/p calls: trigger reg2btf_ids[reg->type] for reg->type > __BPF_REG_TYPE_MAX FAIL #125/p calls: invalid kfunc call: reg->off must be zero when passed to release kfunc FAIL #126/p calls: invalid kfunc call: don't match first member type when passed to release kfunc FAIL #127/p calls: invalid kfunc call: PTR_TO_BTF_ID with negative offset FAIL #128/p calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset FAIL #129/p calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL #130/p calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID FAIL #486/p map_kptr: ref: reference state created and released on xchg FAIL This is because the kfuncs in the loaded module are far away from __bpf_call_base: ffff800002009440 t bpf_kfunc_call_test_fail1 [bpf_testmod] 9000000002e128d8 T __bpf_call_base The offset relative to __bpf_call_base does NOT fit in s32, which breaks the assumption in BPF core. Enable bpf_jit_supports_far_kfunc_call() lifts this limit. Note that to reproduce the above result, tools/testing/selftests/bpf/config should be applied, and run the test with JIT enabled, unpriv BPF enabled. With this change, the test_verifier tests now all passed: # ./test_verifier ... Summary: 777 PASSED, 0 SKIPPED, 0 FAILED Tested-by: Tiezhu Yang <[email protected]> Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>

liujian56 and others added 30 commits July 21, 2020 10:31

Maxim Mikityanskiy and others added 26 commits September 21, 2020 17:22

pull bot added the ⤵️ pull label Sep 22, 2020

pull bot merged commit 805c6d3 into vchong:master Sep 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] master from torvalds:master #130

[pull] master from torvalds:master #130

pull bot commented Sep 22, 2020 •

edited

Loading

[pull] master from torvalds:master #130

[pull] master from torvalds:master #130

Conversation

pull bot commented Sep 22, 2020 • edited Loading

pull bot commented Sep 22, 2020 •

edited

Loading