summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-06-13bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded responseMichael Chan2-2/+61
Firmware interface 1.10.2.118 has increased the size of HWRM_PORT_PHY_QCFG response beyond the maximum size that can be forwarded. When the VF's link state is not the default auto state, the PF will need to forward the response back to the VF to indicate the forced state. This regression may cause the VF to fail to initialize. Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum 96 bytes. The SPEEDS2_SUPPORTED flag needs to be cleared because the new speeds2 fields are beyond the legacy structure. Also modify bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96 bytes to make this failure more obvious. Fixes: 84a911db8305 ("bnxt_en: Update firmware interface to 1.10.2.118") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240612231736.57823-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13gve: Clear napi->skb before dev_kfree_skb_any()Ziwei Xiao1-3/+5
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer. Fix this by clearing napi->skb before the skb is freed. Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Cc: stable@vger.kernel.org Reported-by: Shailend Chand <shailend@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13ionic: fix use after netif_napi_del()Taehee Yoo1-3/+1
When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13Revert "igc: fix a log entry using uninitialized netdev"Sasha Neftin1-3/+2
This reverts commit 86167183a17e03ec77198897975e9fdfbd53cb0b. igc_ptp_init() needs to be called before igc_reset(), otherwise kernel crash could be observed. Following the corresponding discussion [1] and [2] revert this commit. Link: https://lore.kernel.org/all/8fb634f8-7330-4cf4-a8ce-485af9c0a61a@intel.com/ [1] Link: https://lore.kernel.org/all/87o78rmkhu.fsf@intel.com/ [2] Fixes: 86167183a17e ("igc: fix a log entry using uninitialized netdev") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240611162456.961631-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs ↵Xiaolei Wang1-14/+11
parameters The current cbs parameter depends on speed after uplinking, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula. Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-11gve: ignore nonrelevant GSO type bits when processing TSO headersJoshua Washington1-15/+5
TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type |= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrei Vagin <avagin@gmail.com> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packetsGal Pressman1-2/+1
Move the vxlan_features_check() call to after we verified the packet is a tunneled VXLAN packet. Without this, tunneled UDP non-VXLAN packets (for ex. GENENVE) might wrongly not get offloaded. In some cases, it worked by chance as GENEVE header is the same size as VXLAN, but it is obviously incorrect. Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10geneve: Fix incorrect inner network header offset when innerprotoinherit is setGal Pressman1-4/+6
When innerprotoinherit is set, the tunneled packets do not have an inner Ethernet header. Change 'maclen' to not always assume the header length is ETH_HLEN, as there might not be a MAC header. This resolves issues with drivers (e.g. mlx5, in mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset to be correct, and use it for TX offloads. Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10net dsa: qca8k: fix usages of device_get_named_child_node()Andy Shevchenko1-2/+10
The documentation for device_get_named_child_node() mentions this important point: " The caller is responsible for calling fwnode_handle_put() on the returned fwnode pointer. " Add fwnode_handle_put() to avoid leaked references. Fixes: 1e264f9d2918 ("net: dsa: qca8k: add LEDs basic support") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10netdevsim: fix backwards compatibility in nsim_get_iflink()David Wei1-1/+2
The default ndo_get_iflink() implementation returns the current ifindex of the netdev. But the overridden nsim_get_iflink() returns 0 if the current nsim is not linked, breaking backwards compatibility for userspace that depend on this behaviour. Fix the problem by returning the current ifindex if not linked to a peer. Fixes: 8debcf5832c3 ("netdevsim: add ndo_get_iflink() implementation") Reported-by: Yu Watanabe <watanabe.yu@gmail.com> Suggested-by: Yu Watanabe <watanabe.yu@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-09net: stmmac: dwmac-qcom-ethqos: Configure host DMA widthSagar Cheluvegowda1-0/+4
Commit 070246e4674b ("net: stmmac: Fix for mismatched host/device DMA address width") added support in the stmmac driver for platform drivers to indicate the host DMA width, but left it up to authors of the specific platforms to indicate if their width differed from the addr64 register read from the MAC itself. Qualcomm's EMAC4 integration supports only up to 36 bit width (as opposed to the addr64 register indicating 40 bit width). Let's indicate that in the platform driver to avoid a scenario where the driver will allocate descriptors of size that is supported by the CPU which in our case is 36 bit, but as the addr64 register is still capable of 40 bits the device will use two descriptors as one address. Fixes: 8c4d92e82d50 ("net: stmmac: dwmac-qcom-ethqos: add support for emac4 on sa8775p platforms") Signed-off-by: Sagar Cheluvegowda <quic_scheluve@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-07liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packetAleksandr Mishin1-6/+5
In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value, but then it is unconditionally passed to skb_add_rx_frag() which looks strange and could lead to null pointer dereference. lio_vf_rep_copy_packet() call trace looks like: octeon_droq_process_packets octeon_droq_fast_process_packets octeon_droq_dispatch_pkt octeon_create_recv_info ...search in the dispatch_list... ->disp_fn(rdisp->rinfo, ...) lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...) In this path there is no code which sets pg_info->page to NULL. So this check looks unneeded and doesn't solve potential problem. But I guess the author had reason to add a check and I have no such card and can't do real test. In addition, the code in the function liquidio_push_packet() in liquidio/lio_core.c does exactly the same. Based on this, I consider the most acceptable compromise solution to adjust this issue by moving skb_add_rx_frag() into conditional scope. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1f233f327913 ("liquidio: switchdev support for LiquidIO NIC") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-07net: hns3: add cond_resched() to hns3 ring buffer init processJie Wang2-0/+6
Currently hns3 ring buffer init process would hold cpu too long with big Tx/Rx ring depth. This could cause soft lockup. So this patch adds cond_resched() to the process. Then cpu can break to run other tasks instead of busy looping. Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-07net: hns3: fix kernel crash problem in concurrent scenarioYonglong Liu1-5/+16
When link status change, the nic driver need to notify the roce driver to handle this event, but at this time, the roce driver may uninit, then cause kernel crash. To fix the problem, when link status change, need to check whether the roce registered, and when uninit, need to wait link update finish. Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-06net: sfp: Always call `sfp_sm_mod_remove()` on removeCsókás, Bence1-2/+1
If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will not be run. As a consequence, `sfp_hwmon_remove()` is not getting run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()` itself checks `sfp->sm_mod_state` anyways, so this check was not really needed in the first place. Fixes: d2e816c0293f ("net: sfp: handle module remove outside state machine") Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06Merge tag 'net-6.10-rc3' of ↵Linus Torvalds49-314/+713
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from BPF and big collection of fixes for WiFi core and drivers. Current release - regressions: - vxlan: fix regression when dropping packets due to invalid src addresses - bpf: fix a potential use-after-free in bpf_link_free() - xdp: revert support for redirect to any xsk socket bound to the same UMEM as it can result in a corruption - virtio_net: - add missing lock protection when reading return code from control_buf - fix false-positive lockdep splat in DIM - Revert "wifi: wilc1000: convert list management to RCU" - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config Previous releases - regressions: - rtnetlink: make the "split" NLM_DONE handling generic, restore the old behavior for two cases where we started coalescing those messages with normal messages, breaking sloppily-coded userspace - wifi: - cfg80211: validate HE operation element parsing - cfg80211: fix 6 GHz scan request building - mt76: mt7615: add missing chanctx ops - ath11k: move power type check to ASSOC stage, fix connecting to 6 GHz AP - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS - iwlwifi: mvm: fix a crash on 7265 Previous releases - always broken: - ncsi: prevent multi-threaded channel probing, a spec violation - vmxnet3: disable rx data ring on dma allocation failure - ethtool: init tsinfo stats if requested, prevent unintentionally reporting all-zero stats on devices which don't implement any - dst_cache: fix possible races in less common IPv6 features - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED - ax25: fix two refcounting bugs - eth: ionic: fix kernel panic in XDP_TX action Misc: - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB" * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) selftests: net: lib: set 'i' as local selftests: net: lib: avoid error removing empty netns name selftests: net: lib: support errexit with busywait net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() ipv6: fix possible race in __fib6_drop_pcpu_from() af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. af_unix: Annotate data-races around sk->sk_sndbuf. af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). af_unix: Annotate data-race of sk->sk_state in unix_accept(). af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). af_unix: Annodate data-races around sk->sk_state for writers. af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer. ...
2024-06-06Merge tag 'efi-fixes-for-v6.10-2' of ↵Linus Torvalds3-11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Ensure that .discard sections are really discarded in the EFI zboot image build - Return proper error numbers from efi-pstore - Add __nocfi annotations to EFI runtime wrappers * tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Add missing __nocfi annotations to runtime wrappers efi: pstore: Return proper errors on UEFI failures efi/libstub: zboot.lds: Discard .discard sections
2024-06-06net: wwan: iosm: Fix tainted pointer delete is case of region creation failAleksandr Mishin1-1/+1
In case of region creation fail in ipc_devlink_create_region(), previously created regions delete process starts from tainted pointer which actually holds error code value. Fix this bug by decreasing region index before delete. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 4dcd183fbd67 ("net: wwan: iosm: devlink registration") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240604082500.20769-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-05igc: Fix Energy Efficient Ethernet support declarationSasha Neftin2-2/+11
The commit 01cf893bf0f4 ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") removed SUPPORTED_Autoneg field but left inappropriate ethtool_keee structure initialization. When "ethtool --show <device>" (get_eee) invoke, the 'ethtool_keee' structure was accidentally overridden. Remove the 'ethtool_keee' overriding and add EEE declaration as per IEEE specification that allows reporting Energy Efficient Ethernet capabilities. Examples: Before fix: ethtool --show-eee enp174s0 EEE settings for enp174s0: EEE status: not supported After fix: EEE settings for enp174s0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full 2500baseT/Full Fixes: 01cf893bf0f4 ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-6-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ice: map XDP queues to vectors in ice_vsi_map_rings_to_vectors()Larysa Zaremba4-46/+68
ice_pf_dcb_recfg() re-maps queues to vectors with ice_vsi_map_rings_to_vectors(), which does not restore the previous state for XDP queues. This leads to no AF_XDP traffic after rebuild. Map XDP queues to vectors in ice_vsi_map_rings_to_vectors(). Also, move the code around, so XDP queues are mapped independently only through .ndo_bpf(). Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-5-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ice: add flag to distinguish reset from .ndo_bpf in XDP rings configLarysa Zaremba3-14/+24
Commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") has placed ice_vsi_free_q_vectors() after ice_destroy_xdp_rings() in the rebuild process. The behaviour of the XDP rings config functions is context-dependent, so the change of order has led to ice_destroy_xdp_rings() doing additional work and removing XDP prog, when it was supposed to be preserved. Also, dependency on the PF state reset flags creates an additional, fortunately less common problem: * PFR is requested e.g. by tx_timeout handler * .ndo_bpf() is asked to delete the program, calls ice_destroy_xdp_rings(), but reset flag is set, so rings are destroyed without deleting the program * ice_vsi_rebuild tries to delete non-existent XDP rings, because the program is still on the VSI * system crashes With a similar race, when requested to attach a program, ice_prepare_xdp_rings() can actually skip setting the program in the VSI and nevertheless report success. Instead of reverting to the old order of function calls, add an enum argument to both ice_prepare_xdp_rings() and ice_destroy_xdp_rings() in order to distinguish between calls from rebuild and .ndo_bpf(). Fixes: efc2214b6047 ("ice: Add support for XDP") Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-4-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ice: remove af_xdp_zc_qps bitmapLarysa Zaremba3-26/+27
Referenced commit has introduced a bitmap to distinguish between ZC and copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this for us. The bitmap would be especially useful when restoring previous state after rebuild, if only it was not reallocated in the process. This leads to e.g. xdpsock dying after changing number of queues. Instead of preserving the bitmap during the rebuild, remove it completely and distinguish between ZC and copy-mode queues based on the presence of a device associated with the pool. Fixes: e102db780e1c ("ice: track AF_XDP ZC enabled queues in bitmap") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ice: fix reads from NVM Shadow RAM on E830 and E825-C devicesJacob Keller2-9/+93
The ice driver reads data from the Shadow RAM portion of the NVM during initialization, including data used to identify the NVM image and device, such as the ETRACK ID used to populate devlink dev info fw.bundle. Currently it is using a fixed offset defined by ICE_CSS_HEADER_LENGTH to compute the appropriate offset. This worked fine for E810 and E822 devices which both have CSS header length of 330 words. Other devices, including both E825-C and E830 devices have different sizes for their CSS header. The use of a hard coded value results in the driver reading from the wrong block in the NVM when attempting to access the Shadow RAM copy. This results in the driver reporting the fw.bundle as 0x0 in both the devlink dev info and ethtool -i output. The first E830 support was introduced by commit ba20ecb1d1bb ("ice: Hook up 4 E830 devices by adding their IDs") and the first E825-C support was introducted by commit f64e18944233 ("ice: introduce new E825C devices family") The NVM actually contains the CSS header length embedded in it. Remove the hard coded value and replace it with logic to read the length from the NVM directly. This is more resilient against all existing and future hardware, vs looking up the expected values from a table. It ensures the driver will read from the appropriate place when determining the ETRACK ID value used for populating the fw.bundle_id and for reporting in ethtool -i. The CSS header length for both the active and inactive flash bank is stored in the ice_bank_info structure to avoid unnecessary duplicate work when accessing multiple words of the Shadow RAM. Both banks are read in the unlikely event that the header length is different for the NVM in the inactive bank, rather than being different only by the overall device family. Fixes: ba20ecb1d1bb ("ice: Hook up 4 E830 devices by adding their IDs") Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-2-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ice: fix iteration of TLVs in Preserved Fields AreaJacob Keller1-7/+21
The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value structures in the Preserved Fields Area (PFA) of the NVM. This is used by the driver to access data such as the Part Board Assembly identifier. The function uses simple logic to iterate over the PFA. First, the pointer to the PFA in the NVM is read. Then the total length of the PFA is read from the first word. A pointer to the first TLV is initialized, and a simple loop iterates over each TLV. The pointer is moved forward through the NVM until it exceeds the PFA area. The logic seems sound, but it is missing a key detail. The Preserved Fields Area length includes one additional final word. This is documented in the device data sheet as a dummy word which contains 0xFFFF. All NVMs have this extra word. If the driver tries to scan for a TLV that is not in the PFA, it will read past the size of the PFA. It reads and interprets the last dummy word of the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA as a length. The PFA resides within the Shadow RAM portion of the NVM, which is relatively small. All of its offsets are within a 16-bit size. The PFA pointer and TLV pointer are stored by the driver as 16-bit values. In almost all cases, the word following the PFA will be such that interpreting it as a length will result in 16-bit arithmetic overflow. Once overflowed, the new next_tlv value is now below the maximum offset of the PFA. Thus, the driver will continue to iterate the data as TLVs. In the worst case, the driver hits on a sequence of reads which loop back to reading the same offsets in an endless loop. To fix this, we need to correct the loop iteration check to account for this extra word at the end of the PFA. This alone is sufficient to resolve the known cases of this issue in the field. However, it is plausible that an NVM could be misconfigured or have corrupt data which results in the same kind of overflow. Protect against this by using check_add_overflow when calculating both the maximum offset of the TLVs, and when calculating the next_tlv offset at the end of each loop iteration. This ensures that the driver will not get stuck in an infinite loop when scanning the PFA. Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get") Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05ptp: Fix error message on failed pin verificationKarol Kolacinski1-1/+2
On failed verification of PTP clock pin, error message prints channel number instead of pin index after "pin", which is incorrect. Fix error message by adding channel number to the message and printing pin number instead of channel number. Fixes: 6092315dfdec ("ptp: introduce programmable pins.") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20240604120555.16643-1-karol.kolacinski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05Merge tag 'thermal-6.10-rc3' of ↵Linus Torvalds4-25/+50
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "Fix issues related to the handling of invalid trip points in the thermal core and in the thermal debug code that have been overlooked by some recent thermal control core changes" * tag 'thermal-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid thermal: core: Introduce thermal_trip_crossed() thermal/debugfs: Allow tze_seq_show() to print statistics for invalid trips thermal/debugfs: Print initial trip temperature and hysteresis in tze_seq_show()
2024-06-05Merge tag 'acpi-6.10-rc3' of ↵Linus Torvalds6-7/+19
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix the ACPI EC and AC drivers, the ACPI APEI error injection driver and build issues related to the dev_is_pnp() macro referring to pnp_bus_type that is not exported to modules. Specifics: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf) - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams) - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko) - Update the ACPI AC driver to use power_supply_changed() to let the power supply core handle configuration changes properly (Thomas Weißschuh)" * tag 'acpi-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: AC: Properly notify powermanagement core about changes PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error ACPI: APEI: EINJ: Fix einj_dev release leak
2024-06-05Merge tag 'pm-6.10-rc3' of ↵Linus Torvalds4-4/+142
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the intel_pstate and amd-pstate cpufreq drivers and the cpupower utility. Specifics: - Fix a recently introduced unchecked HWP MSR access in the intel_pstate driver (Srinivas Pandruvada) - Add missing conversion from MHz to KHz to amd_pstate_set_boost() to address sysfs inteface inconsistency and fix P-state frequency reporting on AMD Family 1Ah CPUs in the cpupower utility (Dhananjay Ugwekar) - Get rid of an excess global header file used by the amd-pstate cpufreq driver (Arnd Bergmann)" * tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Fix unchecked HWP MSR access cpufreq: amd-pstate: Fix the inconsistency in max frequency units cpufreq: amd-pstate: remove global header file tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs
2024-06-05net/mlx5: Fix tainted pointer delete is case of flow rules creation failAleksandr Mishin1-2/+6
In case of flow rule creation fail in mlx5_lag_create_port_sel_table(), instead of previously created rules, the tainted pointer is deleted deveral times. Fix this bug by using correct flow rules pointers. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240604100552.25201-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05Merge tag 'i2c-for-6.10-rc3' of ↵Linus Torvalds1-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "This should have been my second pull request during the merge window but one dependency in the drm subsystem fell through the cracks and was only applied for rc2. Now we can finally remove I2C_CLASS_SPD" * tag 'i2c-for-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: Remove I2C_CLASS_SPD i2c: synquacer: Remove a clk reference from struct synquacer_i2c
2024-06-05Merge tag 'tpmdd-next-6.10-rc3' of ↵Linus Torvalds3-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "The bug fix for tpm_tis_core_init() is not that critical but still makes sense to get into release for the sake of better quality. I included the Intel CPU model define change mainly to help Tony just a bit, as for this subsystem it cannot realistically speaking cause any possible harm" * tag 'tpmdd-next-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Switch to new Intel CPU model defines tpm_tis: Do *not* flush uninitialized work
2024-06-05Merge branches 'acpi-ec', 'acpi-apei' and 'pnp'Rafael J. Wysocki4-3/+15
Merge ACPI EC driver fixes, an ACPI APEI fix and PNP fixes for 6.10-rc3: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf). - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams). - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko). * acpi-ec: ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error * acpi-apei: ACPI: APEI: EINJ: Fix einj_dev release leak * pnp: PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules
2024-06-05net/mlx5: Always stop health timer during driver removalShay Drory1-0/+3
Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continue with driver teardown. This may lead to a UAF bug, which results in page fault Oops[1], since the health timer invokes after resources were freed. Hence, stop the health monitor even if teardown_hca fails. [1] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: cleanup mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup BUG: unable to handle page fault for address: ffffa26487064230 PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 RIP: 0010:ioread32be+0x34/0x60 RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x175/0x180 ? asm_exc_page_fault+0x26/0x30 ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? ioread32be+0x34/0x60 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] poll_health+0x42/0x230 [mlx5_core] ? __next_timer_interrupt+0xbc/0x110 ? __pfx_poll_health+0x10/0x10 [mlx5_core] call_timer_fn+0x21/0x130 ? __pfx_poll_health+0x10/0x10 [mlx5_core] __run_timers+0x222/0x2c0 run_timer_softirq+0x1d/0x40 __do_softirq+0xc9/0x2c8 __irq_exit_rcu+0xa6/0xc0 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:cpuidle_enter_state+0xcc/0x440 ? cpuidle_enter_state+0xbd/0x440 cpuidle_enter+0x2d/0x40 do_idle+0x20d/0x270 cpu_startup_entry+0x2a/0x30 rest_init+0xd0/0xd0 arch_call_rest_init+0xe/0x30 start_kernel+0x709/0xa90 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x96/0xa0 secondary_startup_64_no_verify+0x18f/0x19b ---[ end trace 0000000000000000 ]--- Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net/mlx5: Stop waiting for PCI if pci channel is offlineMoshe Shemesh3-0/+16
In case pci channel becomes offline the driver should not wait for PCI reads during health dump and recovery flow. The driver has timeout for each of these loops trying to read PCI, so it would fail anyway. However, in case of recovery waiting till timeout may cause the pci error_detected() callback fail to meet pci_dpc_recovered() wait timeout. Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: ethernet: mtk_eth_soc: handle dma buffer size soc specificFrank Wunderlich2-36/+77
The mainline MTK ethernet driver suffers long time from rarly but annoying tx queue timeouts. We think that this is caused by fixed dma sizes hardcoded for all SoCs. We suspect this problem arises from a low level of free TX DMADs, the TX Ring alomost full. The transmit timeout is caused by the Tx queue not waking up. The Tx queue stops when the free counter is less than ring->thres, and it will wake up once the free counter is greater than ring->thres. If the CPU is too late to wake up the Tx queues, it may cause a transmit timeout. Therefore, we increased the TX and RX DMADs to improve this error situation. Use the dma-size implementation from SDK in a per SoC manner. In difference to SDK we have no RSS feature yet, so all RX/TX sizes should be raised from 512 to 2048 byte except fqdma on mt7988 to avoid the tx timeout issue. Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet") Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05vxlan: Fix regression when dropping packets due to invalid src addressesDaniel Borkmann1-4/+4
Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05ionic: fix kernel panic in XDP_TX actionTaehee Yoo1-0/+1
In the XDP_TX path, ionic driver sends a packet to the TX path with rx page and corresponding dma address. After tx is done, ionic_tx_clean() frees that page. But RX ring buffer isn't reset to NULL. So, it uses a freed page, which causes kernel panic. BUG: unable to handle page fault for address: ffff8881576c110c PGD 773801067 P4D 773801067 PUD 87f086067 PMD 87efca067 PTE 800ffffea893e060 Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN NOPTI CPU: 1 PID: 25 Comm: ksoftirqd/1 Not tainted 6.9.0+ #11 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f Code: 00 53 41 55 41 56 41 57 b8 01 00 00 00 48 8b 5f 08 4c 8b 77 00 4c 89 f7 48 83 c7 0e 48 39 d8 RSP: 0018:ffff888104e6fa28 EFLAGS: 00010283 RAX: 0000000000000002 RBX: ffff8881576c1140 RCX: 0000000000000002 RDX: ffffffffc0051f64 RSI: ffffc90002d33048 RDI: ffff8881576c110e RBP: ffff888104e6fa88 R08: 0000000000000000 R09: ffffed1027a04a23 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881b03a21a8 R13: ffff8881589f800f R14: ffff8881576c1100 R15: 00000001576c1100 FS: 0000000000000000(0000) GS:ffff88881ae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8881576c110c CR3: 0000000767a90000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x254/0x790 ? __pfx_page_fault_oops+0x10/0x10 ? __pfx_is_prefetch.constprop.0+0x10/0x10 ? search_bpf_extables+0x165/0x260 ? fixup_exception+0x4a/0x970 ? exc_page_fault+0xcb/0xe0 ? asm_exc_page_fault+0x22/0x30 ? 0xffffffffc0051f64 ? bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f ? do_raw_spin_unlock+0x54/0x220 ionic_rx_service+0x11ab/0x3010 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_clean+0x29b/0xc60 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_tx_clean+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_cq_service+0x25d/0xa00 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_cq_service+0x69/0x150 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_txrx_napi+0x11a/0x540 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] __napi_poll.constprop.0+0xa0/0x440 net_rx_action+0x7e7/0xc30 ? __pfx_net_rx_action+0x10/0x10 Fixes: 8eeed8373e1c ("ionic: Add XDP_TX support") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: phy: Micrel KSZ8061: fix errata solution not taking effect problemTristram Ha1-1/+41
KSZ8061 needs to write to a MMD register at driver initialization to fix an errata. This worked in 5.0 kernel but not in newer kernels. The issue is the main phylib code no longer resets PHY at the very beginning. Calling phy resuming code later will reset the chip if it is already powered down at the beginning. This wipes out the MMD register write. Solution is to implement a phy resume function for KSZ8061 to take care of this problem. Fixes: 232ba3a51cc2 ("net: phy: Micrel KSZ8061: link failure after cable connect") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05octeontx2-af: Always allocate PF entries from low prioriy zoneSubbaraya Sundeep1-11/+22
PF mcam entries has to be at low priority always so that VF can install longest prefix match rules at higher priority. This was taken care currently but when priority allocation wrt reference entry is requested then entries are allocated from mid-zone instead of low priority zone. Fix this and always allocate entries from low priority zone for PFs. Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05efi: Add missing __nocfi annotations to runtime wrappersArd Biesheuvel1-7/+6
The EFI runtime wrappers are a sandbox for calling into EFI runtime services, which are invoked using indirect calls. When running with kCFI enabled, the compiler will require the target of any indirect call to be type annotated. Given that the EFI runtime services prototypes and calling convention are governed by the EFI spec, not the Linux kernel, adding such type annotations for firmware routines is infeasible, and so the compiler must be informed that prototype validation should be omitted. Add the __nocfi annotation at the appropriate places in the EFI runtime wrapper code to achieve this. Note that this currently only affects 32-bit ARM, given that other architectures that support both kCFI and EFI use an asm wrapper to call EFI runtime services, and this hides the indirect call from the compiler. Fixes: 1a4fec49efe5 ("ARM: 9392/2: Support CLANG CFI") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-06-05tpm: Switch to new Intel CPU model definesTony Luck2-2/+2
New CPU #defines encode vendor and family as well as model. Link: https://lore.kernel.org/all/20240520224620.9480-4-tony.luck@intel.com/ Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-05tpm_tis: Do *not* flush uninitialized workJan Beulich1-1/+2
tpm_tis_core_init() may fail before tpm_tis_probe_irq_single() is called, in which case tpm_tis_remove() unconditionally calling flush_work() is triggering a warning for .func still being NULL. Cc: stable@vger.kernel.org # v6.5+ Fixes: 481c2d14627d ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-04Merge tag 'devicetree-fixes-for-6.10-1' of ↵Linus Torvalds4-71/+88
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix regression in 'interrupt-map' handling affecting Apple M1 mini (at least) - Fix binding example warning in stm32 st,mlahb binding - Fix schema error in Allwinner platform binding causing lots of spurious warnings - Add missing MODULE_DESCRIPTION() to DT kunit tests * tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: property: Fix fw_devlink handling of interrupt-map of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw() dt-bindings: arm: stm32: st,mlahb: Drop spurious "reg" property from example dt-bindings: arm: sunxi: Fix incorrect '-' usage of: of_test: add MODULE_DESCRIPTION()
2024-06-04Merge branch 'efi/next' into efi/urgentArd Biesheuvel2-4/+5
2024-06-03Merge tag 'wireless-2024-06-03' of ↵Jakub Kicinski25-125/+219
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.10-rc3 The first fixes for v6.10. And we have a big one, I suspect the biggest wireless pull request we ever had. There are fixes all over, both in stack and drivers. Likely the most important here are mt76 not working on mt7615 devices, ath11k not being able to connect to 6 GHz networks and rtlwifi suffering from packet loss. But of course there's much more. * tag 'wireless-2024-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (37 commits) wifi: rtlwifi: Ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS wifi: mt76: mt7615: add missing chanctx ops wifi: wilc1000: document SRCU usage instead of SRCU Revert "wifi: wilc1000: set atomic flag on kmemdup in srcu critical section" Revert "wifi: wilc1000: convert list management to RCU" wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan() wifi: mac80211: correctly parse Spatial Reuse Parameter Set element wifi: mac80211: fix Spatial Reuse element size check wifi: iwlwifi: mvm: don't read past the mfuart notifcation wifi: iwlwifi: mvm: Fix scan abort handling with HW rfkill wifi: iwlwifi: mvm: check n_ssids before accessing the ssids wifi: iwlwifi: mvm: properly set 6 GHz channel direct probe option wifi: iwlwifi: mvm: handle BA session teardown in RF-kill wifi: iwlwifi: mvm: Handle BIGTK cipher in kek_kck cmd wifi: iwlwifi: mvm: remove stale STA link data during restart wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef wifi: iwlwifi: mvm: set properly mac header wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 wifi: iwlwifi: mvm: d3: fix WoWLAN command version lookup wifi: iwlwifi: mvm: fix a crash on 7265 ... ==================== Link: https://lore.kernel.org/r/20240603115129.9494CC2BD10@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03vmxnet3: disable rx data ring on dma allocation failureMatthias Stocker1-1/+1
When vmxnet3_rq_create() fails to allocate memory for rq->data_ring.base, the subsequent call to vmxnet3_rq_destroy_all_rxdataring does not reset rq->data_ring.desc_size for the data ring that failed, which presumably causes the hypervisor to reference it on packet reception. To fix this bug, rq->data_ring.desc_size needs to be set to 0 to tell the hypervisor to disable this feature. [ 95.436876] kernel BUG at net/core/skbuff.c:207! [ 95.439074] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 95.440411] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.9.3-dirty #1 [ 95.441558] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 [ 95.443481] RIP: 0010:skb_panic+0x4d/0x4f [ 95.444404] Code: 4f 70 50 8b 87 c0 00 00 00 50 8b 87 bc 00 00 00 50 ff b7 d0 00 00 00 4c 8b 8f c8 00 00 00 48 c7 c7 68 e8 be 9f e8 63 58 f9 ff <0f> 0b 48 8b 14 24 48 c7 c1 d0 73 65 9f e8 a1 ff ff ff 48 8b 14 24 [ 95.447684] RSP: 0018:ffffa13340274dd0 EFLAGS: 00010246 [ 95.448762] RAX: 0000000000000089 RBX: ffff8fbbc72b02d0 RCX: 000000000000083f [ 95.450148] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f [ 95.451520] RBP: 000000000000002d R08: 0000000000000000 R09: ffffa13340274c60 [ 95.452886] R10: ffffffffa04ed468 R11: 0000000000000002 R12: 0000000000000000 [ 95.454293] R13: ffff8fbbdab3c2d0 R14: ffff8fbbdbd829e0 R15: ffff8fbbdbd809e0 [ 95.455682] FS: 0000000000000000(0000) GS:ffff8fbeefd80000(0000) knlGS:0000000000000000 [ 95.457178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.458340] CR2: 00007fd0d1f650c8 CR3: 0000000115f28000 CR4: 00000000000406f0 [ 95.459791] Call Trace: [ 95.460515] <IRQ> [ 95.461180] ? __die_body.cold+0x19/0x27 [ 95.462150] ? die+0x2e/0x50 [ 95.462976] ? do_trap+0xca/0x110 [ 95.463973] ? do_error_trap+0x6a/0x90 [ 95.464966] ? skb_panic+0x4d/0x4f [ 95.465901] ? exc_invalid_op+0x50/0x70 [ 95.466849] ? skb_panic+0x4d/0x4f [ 95.467718] ? asm_exc_invalid_op+0x1a/0x20 [ 95.468758] ? skb_panic+0x4d/0x4f [ 95.469655] skb_put.cold+0x10/0x10 [ 95.470573] vmxnet3_rq_rx_complete+0x862/0x11e0 [vmxnet3] [ 95.471853] vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3] [ 95.473185] __napi_poll+0x2b/0x160 [ 95.474145] net_rx_action+0x2c6/0x3b0 [ 95.475115] handle_softirqs+0xe7/0x2a0 [ 95.476122] __irq_exit_rcu+0x97/0xb0 [ 95.477109] common_interrupt+0x85/0xa0 [ 95.478102] </IRQ> [ 95.478846] <TASK> [ 95.479603] asm_common_interrupt+0x26/0x40 [ 95.480657] RIP: 0010:pv_native_safe_halt+0xf/0x20 [ 95.481801] Code: 22 d7 e9 54 87 01 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 93 ba 3b 00 fb f4 <e9> 2c 87 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 [ 95.485563] RSP: 0018:ffffa133400ffe58 EFLAGS: 00000246 [ 95.486882] RAX: 0000000000004000 RBX: ffff8fbbc1d14064 RCX: 0000000000000000 [ 95.488477] RDX: ffff8fbeefd80000 RSI: ffff8fbbc1d14000 RDI: 0000000000000001 [ 95.490067] RBP: ffff8fbbc1d14064 R08: ffffffffa0652260 R09: 00000000000010d3 [ 95.491683] R10: 0000000000000018 R11: ffff8fbeefdb4764 R12: ffffffffa0652260 [ 95.493389] R13: ffffffffa06522e0 R14: 0000000000000001 R15: 0000000000000000 [ 95.495035] acpi_safe_halt+0x14/0x20 [ 95.496127] acpi_idle_do_entry+0x2f/0x50 [ 95.497221] acpi_idle_enter+0x7f/0xd0 [ 95.498272] cpuidle_enter_state+0x81/0x420 [ 95.499375] cpuidle_enter+0x2d/0x40 [ 95.500400] do_idle+0x1e5/0x240 [ 95.501385] cpu_startup_entry+0x29/0x30 [ 95.502422] start_secondary+0x11c/0x140 [ 95.503454] common_startup_64+0x13e/0x141 [ 95.504466] </TASK> [ 95.505197] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables vsock_loopback vmw_vsock_virtio_transport_common qrtr vmw_vsock_vmci_transport vsock sunrpc binfmt_misc pktcdvd vmw_balloon pcspkr vmw_vmci i2c_piix4 joydev loop dm_multipath nfnetlink zram crct10dif_pclmul crc32_pclmul vmwgfx crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 vmxnet3 sha1_ssse3 drm_ttm_helper vmw_pvscsi ttm ata_generic pata_acpi serio_raw scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables fuse [ 95.516536] ---[ end trace 0000000000000000 ]--- Fixes: 6f4833383e85 ("net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()") Signed-off-by: Matthias Stocker <mstocker@barracuda.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Ronak Doshi <ronak.doshi@broadcom.com> Link: https://lore.kernel.org/r/20240531103711.101961-1-mstocker@barracuda.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-03Merge tag 'cxl-fixes-6.10-rc3' of ↵Linus Torvalds1-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dave Jiang: - Compile fix for cxl-test from missing linux/vmalloc.h - Fix for memregion leaks in devm_cxl_add_region() * tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix memregion leaks in devm_cxl_add_region() cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
2024-06-03cpufreq: intel_pstate: Fix unchecked HWP MSR accessSrinivas Pandruvada1-1/+2
Fix unchecked MSR access error for processors with no HWP support. On such processors, maximum frequency can be changed by the system firmware using ACPI event ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED. This results in accessing HWP MSR 0x771. Call Trace: <TASK> generic_exec_single+0x58/0x120 smp_call_function_single+0xbf/0x110 rdmsrl_on_cpu+0x46/0x60 intel_pstate_get_hwp_cap+0x1b/0x70 intel_pstate_update_limits+0x2a/0x60 acpi_processor_notify+0xb7/0x140 acpi_ev_notify_dispatch+0x3b/0x60 HWP MSR 0x771 can be only read on a CPU which supports HWP and enabled. Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is true. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/ Fixes: e8217b4bece3 ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently") Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-03net: phy: micrel: fix KSZ9477 PHY issues after suspend/resumeTristram Ha1-6/+56
When the PHY is powered up after powered down most of the registers are reset, so the PHY setup code needs to be done again. In addition the interrupt register will need to be setup again so that link status indication works again. Fixes: 26dd2974c5b5 ("net: phy: micrel: Move KSZ9477 errata fixes to PHY driver") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-03LoongArch: Fix entry point in kernel image headerJiaxun Yang1-1/+1
Currently kernel entry in head.S is in DMW address range, firmware is instructed to jump to this address after loading the kernel image. However kernel should not make any assumption on firmware's DMW setting, thus the entry point should be a physical address falls into direct translation region. Fix by converting entry address to physical and amend entry calculation logic in libstub accordingly. BTW, use ABSOLUTE() to calculate variables to make Clang/LLVM happy. Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>