summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-12-03Merge branch 'powercap'Rafael J. Wysocki1-5/+0
Merge DTPM fixes for 5.16-rc4. * powercap: powercap: DTPM: Drop unused local variable from init_dtpm() powercap/drivers/dtpm: Disable DTPM at boot time
2021-12-03x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qwordMichael Sterritt1-18/+39
Properly type the operands being passed to __put_user()/__get_user(). Otherwise, these routines truncate data for dependent instructions (e.g., INSW) and only read/write one byte. This has been tested by sending a string with REP OUTSW to a port and then reading it back in with REP INSW on the same port. Previous behavior was to only send and receive the first char of the size. For example, word operations for "abcd" would only read/write "ac". With change, the full string is now written and read back. Fixes: f980f9c31a923 (x86/sev-es: Compile early handler code into kernel image) Signed-off-by: Michael Sterritt <sterritt@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marc Orr <marcorr@google.com> Reviewed-by: Peter Gonda <pgonda@google.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20211119232757.176201-1-sterritt@google.com
2021-12-03powercap: DTPM: Drop unused local variable from init_dtpm()Rafael J. Wysocki1-2/+0
The dtpm_descr variable in init_dtpm() is not used after commit f751db8adaea ("powercap/drivers/dtpm: Disable DTPM at boot time"), so drop it. Fixes: f751db8adaea ("powercap/drivers/dtpm: Disable DTPM at boot time") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-03io-wq: don't retry task_work creation failure on fatal conditionsJens Axboe1-0/+7
We don't want to be retrying task_work creation failure if there's an actual signal pending for the parent task. If we do, then we can enter an infinite loop of perpetually retrying and each retry failing with -ERESTARTNOINTR because a signal is pending. Fixes: 3146cba99aa2 ("io-wq: make worker creation resilient against signals") Reported-by: Florian Fischer <florian.fl.fischer@fau.de> Link: https://lore.kernel.org/io-uring/20211202165606.mqryio4yzubl7ms5@pasture/ Tested-by: Florian Fischer <florian.fl.fischer@fau.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-03serial: 8250_bcm7271: UART errors after resuming from S2Al Cooper1-0/+13
There is a small window in time during resume where the hardware flow control signal RTS can be asserted (which allows a sender to resume sending data to the UART) but the baud rate has not yet been restored. This will cause corrupted data and FRAMING, OVERRUN and BREAK errors. This is happening because the MCTRL register is shadowed in uart_port struct and is later used during resume to set the MCTRL register during both serial8250_do_startup() and uart_resume_port(). Unfortunately, serial8250_do_startup() happens before the UART baud rate is restored. The fix is to clear the shadowed mctrl value at the end of suspend and restore it at the end of resume. Fixes: 41a469482de2 ("serial: 8250: Add new 8250-core based Broadcom STB driver") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Al Cooper <alcooperx@gmail.com> Link: https://lore.kernel.org/r/20211201201402.47446-1-alcooperx@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03usb: cdnsp: Fix a NULL pointer dereference in cdnsp_endpoint_init()Zhou Qingyang1-0/+3
In cdnsp_endpoint_init(), cdnsp_ring_alloc() is assigned to pep->ring and there is a dereference of it in cdnsp_endpoint_init(), which could lead to a NULL pointer dereference on failure of cdnsp_ring_alloc(). Fix this bug by adding a check of pep->ring. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_USB_CDNSP_GADGET=y show no new warnings, and our static analyzer no longer warns about this code. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@vger.kernel.org> Acked-by: Pawel Laszczak <pawell@cadence.com> Acked-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130172700.206650-1-zhou1615@umn.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03usb: cdns3: gadget: fix new urb never complete if ep cancel previous requestsFrank Li1-16/+4
This issue was found at android12 MTP. 1. MTP submit many out urb request. 2. Cancel left requests (>20) when enough data get from host 3. Send ACK by IN endpoint. 4. MTP submit new out urb request. 5. 4's urb never complete. TRACE LOG: MtpServer-2157 [000] d..3 1287.150391: cdns3_ep_dequeue: ep1out: req: 00000000299e6836, req buff 000000009df42287, length: 0/16384 zsi, status: -115, trb: [start:87, end:87: virt addr 0x80004000ffd50420], flags:1 SID: 0 MtpServer-2157 [000] d..3 1287.150410: cdns3_gadget_giveback: ep1out: req: 00000000299e6836, req buff 000000009df42287, length: 0/16384 zsi, status: -104, trb: [start:87, end:87: virt addr 0x80004000ffd50420], flags:0 SID: 0 MtpServer-2157 [000] d..3 1287.150433: cdns3_ep_dequeue: ep1out: req: 0000000080b7bde6, req buff 000000009ed5c556, length: 0/16384 zsi, status: -115, trb: [start:88, end:88: virt addr 0x80004000ffd5042c], flags:1 SID: 0 MtpServer-2157 [000] d..3 1287.150446: cdns3_gadget_giveback: ep1out: req: 0000000080b7bde6, req buff 000000009ed5c556, length: 0/16384 zsi, status: -104, trb: [start:88, end:88: virt addr 0x80004000ffd5042c], flags:0 SID: 0 .... MtpServer-2157 [000] d..1 1293.630410: cdns3_alloc_request: ep1out: req: 00000000afbccb7d, req buff 0000000000000000, length: 0/0 zsi, status: 0, trb: [start:0, end:0: virt addr (null)], flags:0 SID: 0 MtpServer-2157 [000] d..2 1293.630421: cdns3_ep_queue: ep1out: req: 00000000afbccb7d, req buff 00000000871caf90, length: 0/512 zsi, status: -115, trb: [start:0, end:0: virt addr (null)], flags:0 SID: 0 MtpServer-2157 [000] d..2 1293.630445: cdns3_wa1: WA1: ep1out set guard MtpServer-2157 [000] d..2 1293.630450: cdns3_wa1: WA1: ep1out restore cycle bit MtpServer-2157 [000] d..2 1293.630453: cdns3_prepare_trb: ep1out: trb 000000007317b3ee, dma buf: 0xffd5bc00, size: 512, burst: 128 ctrl: 0x00000424 (C=0, T=0, ISP, IOC, Normal) SID:0 LAST_SID:0 MtpServer-2157 [000] d..2 1293.630460: cdns3_doorbell_epx: ep1out, ep_trbaddr ffd50414 .... irq/241-5b13000-2154 [000] d..1 1293.680849: cdns3_epx_irq: IRQ for ep1out: 01000408 ISP , ep_traddr: ffd508ac ep_last_sid: 00000000 use_streams: 0 irq/241-5b13000-2154 [000] d..1 1293.680858: cdns3_complete_trb: ep1out: trb 0000000021a11b54, dma buf: 0xffd50420, size: 16384, burst: 128 ctrl: 0x00001810 (C=0, T=0, CHAIN, LINK) SID:0 LAST_SID:0 irq/241-5b13000-2154 [000] d..1 1293.680865: cdns3_request_handled: Req: 00000000afbccb7d not handled, DMA pos: 185, ep deq: 88, ep enq: 185, start trb: 184, end trb: 184 Actually DMA pos already bigger than previous submit request afbccb7d's TRB (184-184). The reason of (not handled) is that deq position is wrong. The TRB link is below when irq happen. DEQ LINK LINK LINK LINK LINK .... TRB(afbccb7d):START DMA(EP_TRADDR). Original code check LINK TRB, but DEQ just move one step. LINK DEQ LINK LINK LINK LINK .... TRB(afbccb7d):START DMA(EP_TRADDR). This patch skip all LINK TRB and sync DEQ to trb's start. LINK LINK LINK LINK LINK .... DEQ = TRB(afbccb7d):START DMA(EP_TRADDR). Acked-by: Peter Chen <peter.chen@kernel.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Jun Li <jun.li@nxp.com> Link: https://lore.kernel.org/r/20211130154239.8029-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03usb: typec: tcpm: Wait in SNK_DEBOUNCED until disconnectBadhri Jagan Sridharan1-4/+0
Stub from the spec: "4.5.2.2.4.2 Exiting from AttachWait.SNK State A Sink shall transition to Unattached.SNK when the state of both the CC1 and CC2 pins is SNK.Open for at least tPDDebounce. A DRP shall transition to Unattached.SRC when the state of both the CC1 and CC2 pins is SNK.Open for at least tPDDebounce." This change makes TCPM to wait in SNK_DEBOUNCED state until CC1 and CC2 pins is SNK.Open for at least tPDDebounce. Previously, TCPM resets the port if vbus is not present in PD_T_PS_SOURCE_ON. This causes TCPM to loop continuously when connected to a faulty power source that does not present vbus. Waiting in SNK_DEBOUNCED also ensures that TCPM is adherant to "4.5.2.2.4.2 Exiting from AttachWait.SNK State" requirements. [ 6169.280751] CC1: 0 -> 0, CC2: 0 -> 5 [state TOGGLING, polarity 0, connected] [ 6169.280759] state change TOGGLING -> SNK_ATTACH_WAIT [rev2 NONE_AMS] [ 6169.280771] pending state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED @ 170 ms [rev2 NONE_AMS] [ 6169.282427] CC1: 0 -> 0, CC2: 5 -> 5 [state SNK_ATTACH_WAIT, polarity 0, connected] [ 6169.450825] state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED [delayed 170 ms] [ 6169.450834] pending state change SNK_DEBOUNCED -> PORT_RESET @ 480 ms [rev2 NONE_AMS] [ 6169.930892] state change SNK_DEBOUNCED -> PORT_RESET [delayed 480 ms] [ 6169.931296] disable vbus discharge ret:0 [ 6169.931301] Setting usb_comm capable false [ 6169.932783] Setting voltage/current limit 0 mV 0 mA [ 6169.932802] polarity 0 [ 6169.933706] Requesting mux state 0, usb-role 0, orientation 0 [ 6169.936689] cc:=0 [ 6169.936812] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev2 NONE_AMS] [ 6169.937157] CC1: 0 -> 0, CC2: 5 -> 0 [state PORT_RESET, polarity 0, disconnected] [ 6170.036880] state change PORT_RESET -> PORT_RESET_WAIT_OFF [delayed 100 ms] [ 6170.036890] state change PORT_RESET_WAIT_OFF -> SNK_UNATTACHED [rev2 NONE_AMS] [ 6170.036896] Start toggling [ 6170.041412] CC1: 0 -> 0, CC2: 0 -> 0 [state TOGGLING, polarity 0, disconnected] [ 6170.042973] CC1: 0 -> 0, CC2: 0 -> 5 [state TOGGLING, polarity 0, connected] [ 6170.042976] state change TOGGLING -> SNK_ATTACH_WAIT [rev2 NONE_AMS] [ 6170.042981] pending state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED @ 170 ms [rev2 NONE_AMS] [ 6170.213014] state change SNK_ATTACH_WAIT -> SNK_DEBOUNCED [delayed 170 ms] [ 6170.213019] pending state change SNK_DEBOUNCED -> PORT_RESET @ 480 ms [rev2 NONE_AMS] [ 6170.693068] state change SNK_DEBOUNCED -> PORT_RESET [delayed 480 ms] [ 6170.693304] disable vbus discharge ret:0 [ 6170.693308] Setting usb_comm capable false [ 6170.695193] Setting voltage/current limit 0 mV 0 mA [ 6170.695210] polarity 0 [ 6170.695990] Requesting mux state 0, usb-role 0, orientation 0 [ 6170.701896] cc:=0 [ 6170.702181] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev2 NONE_AMS] [ 6170.703343] CC1: 0 -> 0, CC2: 5 -> 0 [state PORT_RESET, polarity 0, disconnected] Fixes: f0690a25a140b8 ("staging: typec: USB Type-C Port Manager (tcpm)") Cc: stable@vger.kernel.org Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Link: https://lore.kernel.org/r/20211130001825.3142830-1-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03USB: NO_LPM quirk Lenovo Powered USB-C Travel HubOle Ernst1-0/+3
This is another branded 8153 device that doesn't work well with LPM: r8152 2-2.1:1.0 enp0s13f0u2u1: Stop submitting intr, status -71 Disable LPM to resolve the issue. Signed-off-by: Ole Ernst <olebowle@gmx.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211127090546.52072-1-olebowle@gmx.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03xhci: Fix commad ring abort, write all 64 bits to CRCR register.Mathias Nyman1-7/+14
Turns out some xHC controllers require all 64 bits in the CRCR register to be written to execute a command abort. The lower 32 bits containing the command abort bit is written first. In case the command ring stops before we write the upper 32 bits then hardware may use these upper bits to set the commnd ring dequeue pointer. Solve this by making sure the upper 32 bits contain a valid command ring dequeue pointer. The original patch that only wrote the first 32 to stop the ring went to stable, so this fix should go there as well. Fixes: ff0e50d3564f ("xhci: Fix command ring pointer corruption while aborting a command") Cc: stable@vger.kernel.org Tested-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20211126122340.1193239-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03x86/64/mm: Map all kernel memory into trampoline_pgdJoerg Roedel1-1/+11
The trampoline_pgd only maps the 0xfffffff000000000-0xffffffffffffffff range of kernel memory (with 4-level paging). This range contains the kernel's text+data+bss mappings and the module mapping space but not the direct mapping and the vmalloc area. This is enough to get the application processors out of real-mode, but for code that switches back to real-mode the trampoline_pgd is missing important parts of the address space. For example, consider this code from arch/x86/kernel/reboot.c, function machine_real_restart() for a 64-bit kernel: #ifdef CONFIG_X86_32 load_cr3(initial_page_table); #else write_cr3(real_mode_header->trampoline_pgd); /* Exiting long mode will fail if CR4.PCIDE is set. */ if (boot_cpu_has(X86_FEATURE_PCID)) cr4_clear_bits(X86_CR4_PCIDE); #endif /* Jump to the identity-mapped low memory code */ #ifdef CONFIG_X86_32 asm volatile("jmpl *%0" : : "rm" (real_mode_header->machine_real_restart_asm), "a" (type)); #else asm volatile("ljmpl *%0" : : "m" (real_mode_header->machine_real_restart_asm), "D" (type)); #endif The code switches to the trampoline_pgd, which unmaps the direct mapping and also the kernel stack. The call to cr4_clear_bits() will find no stack and crash the machine. The real_mode_header pointer below points into the direct mapping, and dereferencing it also causes a crash. The reason this does not crash always is only that kernel mappings are global and the CR3 switch does not flush those mappings. But if theses mappings are not in the TLB already, the above code will crash before it can jump to the real-mode stub. Extend the trampoline_pgd to contain all kernel mappings to prevent these crashes and to make code which runs on this page-table more robust. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20211202153226.22946-5-joro@8bytes.org
2021-12-03objtool: Fix pv_ops noinstr validationPeter Zijlstra2-0/+5
Boris reported that in one of his randconfig builds, objtool got infinitely stuck. Turns out there's trivial list corruption in the pv_ops tracking when a function is both in a static table and in a code assignment. Avoid re-adding function to the pv_ops[] lists when they're already on it. Fixes: db2b0c5d7b6f ("objtool: Support pv_opsindirect calls for noinstr") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20211202204534.GA16608@worktop.programming.kicks-ass.net
2021-12-02Merge tag 'drm-fixes-2021-12-03-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds39-159/+244
Pull drm fixes from Dave Airlie: "Bit of an uptick in patch count this week, though it's all relatively small overall. I suspect msm has been queuing up a few fixes to skew it here. Otherwise amdgpu has a scattered bunch of small fixes, and then some vc4, i915. virtio-gpu changes an rc1 introduced uAPI mistake, and makes it operate more like other drivers. This should be fine as no userspace relies on the behaviour yet. Summary: dma-buf: - memory leak fix msm: - kasan found memory overwrite - mmap flags - fencing error bug - ioctl NULL ptr - uninit var - devfreqless devices fix - dsi lanes fix - dp: avoid unpowered aux xfers amdgpu: - IP discovery based enumeration fixes - vkms fixes - DSC fixes for DP MST - Audio fix for hotplug with tiled displays - Misc display fixes - DP tunneling fix - DP fix - Aldebaran fix amdkfd: - Locking fix - Static checker fix - Fix double free i915: - backlight regression - Intel HDR backlight detection fix - revert TGL workaround that caused hangs virtio-gpu: - switch back to drm_poll vc4: - memory leak - error check fix - HVS modesetting fixes" * tag 'drm-fixes-2021-12-03-1' of git://anongit.freedesktop.org/drm/drm: (41 commits) Revert "drm/i915: Implement Wa_1508744258" drm/amdkfd: process_info lock not needed for svm drm/amdgpu: adjust the kfd reset sequence in reset sriov function drm/amd/display: add connector type check for CRC source set drm/amdkfd: fix double free mem structure drm/amdkfd: set "r = 0" explicitly before goto drm/amd/display: Add work around for tunneled MST. drm/amd/display: Fix for the no Audio bug with Tiled Displays drm/amd/display: Clear DPCD lane settings after repeater training drm/amd/display: Allow DSC on supported MST branch devices drm/amdgpu: Don't halt RLC on GFX suspend drm/amdgpu: fix the missed handling for SDMA2 and SDMA3 drm/amdgpu: check atomic flag to differeniate with legacy path drm/amdgpu: cancel the correct hrtimer on exit drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID drm/i915/dp: Perform 30ms delay after source OUI write dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow drm/i915: Add support for panels with VESA backlights with PWM enable/disable drm/vc4: kms: Fix previous HVS commit wait drm/vc4: kms: Don't duplicate pending commit ...
2021-12-03Merge tag 'drm-intel-fixes-2021-12-02' of ↵Dave Airlie5-13/+42
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fixing a regression where the backlight brightness control stopped working. - Fix the Intel HDR backlight support detection. - Reverting a w/a to fix a gpu Hang in TGL. The w/a itself was also for a hang, but in a much rarer scenario. The proper solution need to be done with help from user space and it will be addressed later. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Yakf9hdnR5or+zNP@intel.com
2021-12-03Merge tag 'drm-misc-fixes-2021-12-02' of ↵Dave Airlie6-66/+28
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Switch back to drm_poll for virtio, multiple fixes (memory leak, improper error check, some functional fixes too) for vc4, memory leak fix in dma-buf, Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20211202084440.u3b7lbeulj7k3ltg@houat
2021-12-02Merge tag 'net-5.16-rc4' of ↵Linus Torvalds104-412/+1000
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless, and wireguard. Mostly scattered driver changes this week, with one big clump in mv88e6xxx. Nothing of note, really. Current release - regressions: - smc: keep smc_close_final()'s error code during active close Current release - new code bugs: - iwlwifi: various static checker fixes (int overflow, leaks, missing error codes) - rtw89: fix size of firmware header before transfer, avoid crash - mt76: fix timestamp check in tx_status; fix pktid leak; - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set() Previous releases - regressions: - smc: fix list corruption in smc_lgr_cleanup_early - ipv4: convert fib_num_tclassid_users to atomic_t Previous releases - always broken: - tls: fix authentication failure in CCM mode - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent incorrect processing - dsa: mv88e6xxx: fixes for various device errata - rds: correct socket tunable error in rds_tcp_tune() - ipv6: fix memory leak in fib6_rule_suppress - wireguard: reset peer src endpoint when netns exits - wireguard: improve resilience to DoS around incoming handshakes - tcp: fix page frag corruption on page fault which involves TCP - mpls: fix missing attributes in delete notifications - mt7915: fix NULL pointer dereference with ad-hoc mode Misc: - rt2x00: be more lenient about EPROTO errors during start - mlx4_en: update reported link modes for 1/10G" * tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: dsa: b53: Add SPI ID table gro: Fix inconsistent indenting selftests: net: Correct case name net/rds: correct socket tunable error in rds_tcp_tune() mctp: Don't let RTM_DELROUTE delete local routes net/smc: Keep smc_close_final rc during active close ibmvnic: drop bad optimization in reuse_tx_pools() ibmvnic: drop bad optimization in reuse_rx_pools() net/smc: fix wrong list_del in smc_lgr_cleanup_early Fix Comment of ETH_P_802_3_MIN ethernet: aquantia: Try MAC address from device tree ipv4: convert fib_num_tclassid_users to atomic_t net: avoid uninit-value from tcp_conn_request net: annotate data-races on txq->xmit_lock_owner octeontx2-af: Fix a memleak bug in rvu_mbox_init() net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources() vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings() net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family ...
2021-12-02Merge tag 'trace-v5.16-rc3' of ↵Linus Torvalds4-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Three tracing fixes: - Allow compares of strings when using signed and unsigned characters - Fix kmemleak false positive for histogram entries - Handle negative numbers for user defined kretprobe data sizes" * tag 'trace-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: kprobes: Limit max data_size of the kretprobe instances tracing: Fix a kmemleak false positive in tracing_map tracing/histograms: String compares should not care about signed values
2021-12-02Merge tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmiLinus Torvalds1-8/+33
Pull IPMI fixes from Corey Minyard: "Some changes that went in 5.16 had issues. When working on the design a piece was redesigned and things got missed. And the message type was not being initialized when it was allocated, resulting in crashes. In addition, the IPMI driver has had a shutdown issue where it could still have an item in a system workqueue after it had been shutdown. Move to a private workqueue to avoid that problem" * tag 'for-linus-5.16-2' of git://github.com/cminyard/linux-ipmi: ipmi:ipmb: Fix unknown command response ipmi: fix IPMI_SMI_MSG_TYPE_IPMB_DIRECT response length checking ipmi: fix oob access due to uninit smi_msg type ipmi: msghandler: Make symbol 'remove_work_wq' static ipmi: Move remove_work to dedicated workqueue
2021-12-02s390: update defconfigsHeiko Carstens3-3/+16
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-02Revert "drm/i915: Implement Wa_1508744258"José Roberto de Souza1-7/+0
This workarounds are causing hangs, because I missed the fact that it needs to be enabled for all cases and disabled when doing a resolve pass. So KMD only needs to whitelist it and UMD will be the one setting it on per case. This reverts commit 28ec02c9cbebf3feeaf21a59df9dfbc02bda3362. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4145 Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Fixes: 28ec02c9cbeb ("drm/i915: Implement Wa_1508744258") Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211119140931.32791-1-jose.souza@intel.com (cherry picked from commit f3799ff16fcfacd44aee55db162830df461b631f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-12-02sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_fullFrederic Weisbecker2-5/+12
getrusage(RUSAGE_THREAD) with nohz_full may return shorter utime/stime than the actual time. task_cputime_adjusted() snapshots utime and stime and then adjust their sum to match the scheduler maintained cputime.sum_exec_runtime. Unfortunately in nohz_full, sum_exec_runtime is only updated once per second in the worst case, causing a discrepancy against utime and stime that can be updated anytime by the reader using vtime. To fix this situation, perform an update of cputime.sum_exec_runtime when the cputime snapshot reports the task as actually running while the tick is disabled. The related overhead is then contained within the relevant situations. Reported-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Hasegawa Hitomi <hasegawa-hitomi@fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Acked-by: Phil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20211026141055.57358-3-frederic@kernel.org
2021-12-02timers/nohz: Last resort update jiffies on nohz_full IRQ entryFrederic Weisbecker2-1/+9
When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU is guaranteed to stay online and to never stop its tick. Meanwhile on some rare case, the dedicated timekeeper may be running with interrupts disabled for a while, such as in stop_machine. If jiffies stop being updated, a nohz_full CPU may end up endlessly programming the next tick in the past, taking the last jiffies update monotonic timestamp as a stale base, resulting in an tick storm. Here is a scenario where it matters: 0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU. 1) A stop machine callback is queued to execute somewhere. 2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1 can still take IRQs. 3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward. 4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking last_jiffies_update as a base. But last_jiffies_update hasn't been updated for 2 jiffies since the timekeeper has interrupts disabled. 5) clockevents_program_event(), which relies on ktime_get(), observes that the expiration is in the past and therefore programs the min delta event on the clock. 6) The tick fires immediately, goto 3) 7) Tick storm, the nohz_full CPU is drown and takes ages to reach MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation. Solve this with unconditionally updating jiffies if the value is stale on nohz_full IRQ entry. IRQs and other disturbances are expected to be rare enough on nohz_full for the unconditional call to ktime_get() to actually matter. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org
2021-12-02net: dsa: b53: Add SPI ID tableFlorian Fainelli1-0/+14
Currently autoloading for SPI devices does not use the DT ID table, it uses SPI modalises. Supporting OF modalises is going to be difficult if not impractical, an attempt was made but has been reverted, so ensure that module autoloading works for this driver by adding an id_table listing the SPI IDs for everything. Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02gro: Fix inconsistent indentingJiapeng Chong1-3/+3
Eliminate the follow smatch warning: net/ipv6/ip6_offload.c:249 ipv6_gro_receive() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02selftests: net: Correct case nameLi Zhijian1-2/+2
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test would not be run by default due to the wrong case names Fixes: 34d0302ab861 ("selftests: Add ipv6 address bind tests to fcnal-test") Fixes: 75b2b2b3db4c ("selftests: Add ipv4 address bind tests to fcnal-test") Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02net/rds: correct socket tunable error in rds_tcp_tune()William Kucharski1-1/+1
Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched. Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02mctp: Don't let RTM_DELROUTE delete local routesMatt Johnston1-4/+5
We need to test against the existing route type, not the rtm_type in the netlink request. Fixes: 83f0a0b7285b ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02net/smc: Keep smc_close_final rc during active closeTony Lu1-2/+6
When smc_close_final() returns error, the return code overwrites by kernel_sock_shutdown() in smc_close_active(). The return code of smc_close_final() is more important than kernel_sock_shutdown(), and it will pass to userspace directly. Fix it by keeping both return codes, if smc_close_final() raises an error, return it or kernel_sock_shutdown()'s. Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02ibmvnic: drop bad optimization in reuse_tx_pools()Sukadev Bhattiprolu1-11/+3
When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02ibmvnic: drop bad optimization in reuse_rx_pools()Sukadev Bhattiprolu1-11/+3
When trying to decide whether or not reuse existing rx/tx pools we tried to allow a range of values for the pool parameters rather than exact matches. This was intended to reuse the resources for instance when switching between two VIO servers with different default parameters. But this optimization is incomplete and breaks when we try to change the number of queues for instance. The optimization needs to be updated, so drop it for now and simplify the code. Fixes: 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") Reported-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02net/smc: fix wrong list_del in smc_lgr_cleanup_earlyDust Li1-4/+3
smc_lgr_cleanup_early() meant to delete the link group from the link group list, but it deleted the list head by mistake. This may cause memory corruption since we didn't remove the real link group from the list and later memseted the link group structure. We got a list corruption panic when testing: [  231.277259] list_del corruption. prev->next should be ffff8881398a8000, but was 0000000000000000 [  231.278222] ------------[ cut here ]------------ [  231.278726] kernel BUG at lib/list_debug.c:53! [  231.279326] invalid opcode: 0000 [#1] SMP NOPTI [  231.279803] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.10.46+ #435 [  231.280466] Hardware name: Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014 [  231.281248] Workqueue: events smc_link_down_work [  231.281732] RIP: 0010:__list_del_entry_valid+0x70/0x90 [  231.282258] Code: 4c 60 82 e8 7d cc 6a 00 0f 0b 48 89 fe 48 c7 c7 88 4c 60 82 e8 6c cc 6a 00 0f 0b 48 89 fe 48 c7 c7 c0 4c 60 82 e8 5b cc 6a 00 <0f> 0b 48 89 fe 48 c7 c7 00 4d 60 82 e8 4a cc 6a 00 0f 0b cc cc cc [  231.284146] RSP: 0018:ffffc90000033d58 EFLAGS: 00010292 [  231.284685] RAX: 0000000000000054 RBX: ffff8881398a8000 RCX: 0000000000000000 [  231.285415] RDX: 0000000000000001 RSI: ffff88813bc18040 RDI: ffff88813bc18040 [  231.286141] RBP: ffffffff8305ad40 R08: 0000000000000003 R09: 0000000000000001 [  231.286873] R10: ffffffff82803da0 R11: ffffc90000033b90 R12: 0000000000000001 [  231.287606] R13: 0000000000000000 R14: ffff8881398a8000 R15: 0000000000000003 [  231.288337] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [  231.289160] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  231.289754] CR2: 0000000000e72058 CR3: 000000010fa96006 CR4: 00000000003706f0 [  231.290485] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [  231.291211] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [  231.291940] Call Trace: [  231.292211]  smc_lgr_terminate_sched+0x53/0xa0 [  231.292677]  smc_switch_conns+0x75/0x6b0 [  231.293085]  ? update_load_avg+0x1a6/0x590 [  231.293517]  ? ttwu_do_wakeup+0x17/0x150 [  231.293907]  ? update_load_avg+0x1a6/0x590 [  231.294317]  ? newidle_balance+0xca/0x3d0 [  231.294716]  smcr_link_down+0x50/0x1a0 [  231.295090]  ? __wake_up_common_lock+0x77/0x90 [  231.295534]  smc_link_down_work+0x46/0x60 [  231.295933]  process_one_work+0x18b/0x350 Fixes: a0a62ee15a829 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02Fix Comment of ETH_P_802_3_MINXiayu Zhang1-1/+1
The description of ETH_P_802_3_MIN is misleading. The value of EthernetType in Ethernet II frame is more than 0x0600, the value of Length in 802.3 frame is less than 0x0600. Signed-off-by: Xiayu Zhang <Xiayu.Zhang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02ethernet: aquantia: Try MAC address from device treeTianhao Chai1-10/+14
Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the card, but instead need to obtain MAC addresses from the device tree. In this case the hardware will report an invalid MAC. Currently atlantic driver does not query the DT for MAC address and will randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in. This patch causes the driver to perfer a valid MAC address from OF (if present) over HW self-reported MAC and only fall back to a random MAC address when neither of them is valid. Signed-off-by: Tianhao Chai <cth451@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02ipv4: convert fib_num_tclassid_users to atomic_tEric Dumazet5-7/+7
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL. After the change, this is no longer the case, as free_fib_info_rcu() runs after rcu grace period, without rtnl being held. Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02gfs2: gfs2_create_inode reworkAndreas Gruenbacher1-6/+10
When gfs2_lookup_by_inum() calls gfs2_inode_lookup() for an uncached inode, gfs2_inode_lookup() will place a new tentative inode into the inode cache before verifying that there is a valid inode at the given address. This can race with gfs2_create_inode() which doesn't check for duplicates inodes. gfs2_create_inode() will try to assign the new inode to the corresponding inode glock, and glock_set_object() will complain that the glock is still in use by gfs2_inode_lookup's tentative inode. We noticed this bug after adding commit 486408d690e1 ("gfs2: Cancel remote delete work asynchronously") which allowed delete_work_func() to race with gfs2_create_inode(), but the same race exists for open-by-handle. Fix that by switching from insert_inode_hash() to insert_inode_locked4(), which does check for duplicate inodes. We know we've just managed to to allocate the new inode, so an inode tentatively created by gfs2_inode_lookup() will eventually go away and insert_inode_locked4() will always succeed. In addition, don't flush the inode glock work anymore (this can now only make things worse) and clean up glock_{set,clear}_object for the inode glock somewhat. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-12-02gfs2: gfs2_inode_lookup reworkAndreas Gruenbacher1-51/+33
Rework gfs2_inode_lookup() to only set up the new inode's glocks after verifying that the new inode is valid. There is no need for flushing the inode glock work queue anymore now, so remove that as well. While at it, get rid of the useless wrapper around iget5_locked() and its unnecessary is_bad_inode() check. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-12-02gfs2: gfs2_inode_lookup cleanupAndreas Gruenbacher1-7/+2
In gfs2_inode_lookup, once the inode has been looked up, we check if the inode generation (no_formal_ino) is the one we're looking for. If it isn't and the inode wasn't in the inode cache, we discard the newly looked up inode. This is unnecessary, complicates the code, and makes future changes to gfs2_inode_lookup harder, so change the code to retain newly looked up inodes instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-12-02gfs2: Fix remote demote of weak glock holdersAndreas Gruenbacher1-3/+7
When we mock up a temporary holder in gfs2_glock_cb to demote weak holders in response to a remote locking conflict, we don't set the HIF_HOLDER flag. This causes function may_grant to BUG. Fix by setting the missing HIF_HOLDER flag in the mock glock holder. In addition, define the mock glock holder where it is used. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-12-02arm64: ftrace: add missing BTIsMark Rutland1-0/+6
When branch target identifiers are in use, code reachable via an indirect branch requires a BTI landing pad at the branch target site. When building FTRACE_WITH_REGS atop patchable-function-entry, we miss BTIs at the start start of the `ftrace_caller` and `ftrace_regs_caller` trampolines, and when these are called from a module via a PLT (which will use a `BR X16`), we will encounter a BTI failure, e.g. | # insmod lkdtm.ko | lkdtm: No crash points registered, enable through debugfs | # echo function_graph > /sys/kernel/debug/tracing/current_tracer | # cat /sys/kernel/debug/provoke-crash/DIRECT | Unhandled 64-bit el1h sync exception on CPU0, ESR 0x34000001 -- BTI | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3 | Hardware name: linux,dummy-virt (DT) | pstate: 60400405 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=jc) | pc : ftrace_caller+0x0/0x3c | lr : lkdtm_debugfs_open+0xc/0x20 [lkdtm] | sp : ffff800012e43b00 | x29: ffff800012e43b00 x28: 0000000000000000 x27: ffff800012e43c88 | x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000c171f200 | x23: ffff0000c27b1e00 x22: ffff0000c2265240 x21: ffff0000c23c8c30 | x20: ffff8000090ba380 x19: 0000000000000000 x18: 0000000000000000 | x17: 0000000000000000 x16: ffff80001002bb4c x15: 0000000000000000 | x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000900ff0 | x11: ffff0000c4166310 x10: ffff800012e43b00 x9 : ffff8000104f2384 | x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000003f | x5 : 0000000000000040 x4 : ffff800012e43af0 x3 : 0000000000000001 | x2 : ffff8000090b0000 x1 : ffff0000c171f200 x0 : ffff0000c23c8c30 | Kernel panic - not syncing: Unhandled exception | CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x1a4 | show_stack+0x24/0x30 | dump_stack_lvl+0x68/0x84 | dump_stack+0x1c/0x38 | panic+0x168/0x360 | arm64_exit_nmi.isra.0+0x0/0x80 | el1h_64_sync_handler+0x68/0xd4 | el1h_64_sync+0x78/0x7c | ftrace_caller+0x0/0x3c | do_dentry_open+0x134/0x3b0 | vfs_open+0x38/0x44 | path_openat+0x89c/0xe40 | do_filp_open+0x8c/0x13c | do_sys_openat2+0xbc/0x174 | __arm64_sys_openat+0x6c/0xbc | invoke_syscall+0x50/0x120 | el0_svc_common.constprop.0+0xdc/0x100 | do_el0_svc+0x84/0xa0 | el0_svc+0x28/0x80 | el0t_64_sync_handler+0xa8/0x130 | el0t_64_sync+0x1a0/0x1a4 | SMP: stopping secondary CPUs | Kernel Offset: disabled | CPU features: 0x0,00000f42,da660c5f | Memory Limit: none | ---[ end Kernel panic - not syncing: Unhandled exception ]--- Fix this by adding the required `BTI C`, as we only require these to be reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these are open-coded in the function prologue, matching the style of the `__hwasan_tag_mismatch` trampoline. In future we may wish to consider adding a new SYM_CODE_START_*() variant which has an implicit BTI. When ftrace is built atop mcount, the trampolines are marked with SYM_FUNC_START(), and so get an implicit BTI. We may need to change these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in case we need to apply special care aroud the return address being rewritten. Fixes: 97fed779f2a6 ("arm64: bti: Provide Kconfig for kernel mode BTI") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-02arm64: kexec: use __pa_symbol(empty_zero_page)Mark Rutland1-1/+1
In machine_kexec_post_load() we use __pa() on `empty_zero_page`, so that we can use the physical address during arm64_relocate_new_kernel() to switch TTBR1 to a new set of tables. While `empty_zero_page` is part of the old kernel, we won't clobber it until after this switch, so using it is benign. However, `empty_zero_page` is part of the kernel image rather than a linear map address, so it is not correct to use __pa(x), and we should instead use __pa_symbol(x) or __pa(lm_alias(x)). Otherwise, when the kernel is built with DEBUG_VIRTUAL, we'll encounter splats as below, as I've seen when fuzzing v5.16-rc3 with Syzkaller: | ------------[ cut here ]------------ | virt_to_phys used for non-linear address: 000000008492561a (empty_zero_page+0x0/0x1000) | WARNING: CPU: 3 PID: 11492 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | CPU: 3 PID: 11492 Comm: syz-executor.0 Not tainted 5.16.0-rc3-00001-g48bd452a045c #1 | Hardware name: linux,dummy-virt (DT) | pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | lr : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | sp : ffff80001af17bb0 | x29: ffff80001af17bb0 x28: ffff1cc65207b400 x27: ffffb7828730b120 | x26: 0000000000000e11 x25: 0000000000000000 x24: 0000000000000001 | x23: ffffb7828963e000 x22: ffffb78289644000 x21: 0000600000000000 | x20: 000000000000002d x19: 0000b78289644000 x18: 0000000000000000 | x17: 74706d6528206131 x16: 3635323934383030 x15: 303030303030203a | x14: 1ffff000035e2eb8 x13: ffff6398d53f4f0f x12: 1fffe398d53f4f0e | x11: 1fffe398d53f4f0e x10: ffff6398d53f4f0e x9 : ffffb7827c6f76dc | x8 : ffff1cc6a9fa7877 x7 : 0000000000000001 x6 : ffff6398d53f4f0f | x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff1cc66f2a99c0 | x2 : 0000000000040000 x1 : d7ce7775b09b5d00 x0 : 0000000000000000 | Call trace: | __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | machine_kexec_post_load+0x284/0x670 arch/arm64/kernel/machine_kexec.c:150 | do_kexec_load+0x570/0x670 kernel/kexec.c:155 | __do_sys_kexec_load kernel/kexec.c:250 [inline] | __se_sys_kexec_load kernel/kexec.c:231 [inline] | __arm64_sys_kexec_load+0x1d8/0x268 kernel/kexec.c:231 | __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] | invoke_syscall+0x90/0x2e0 arch/arm64/kernel/syscall.c:52 | el0_svc_common.constprop.2+0x1e4/0x2f8 arch/arm64/kernel/syscall.c:142 | do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:181 | el0_svc+0x60/0x248 arch/arm64/kernel/entry-common.c:603 | el0t_64_sync_handler+0x90/0xb8 arch/arm64/kernel/entry-common.c:621 | el0t_64_sync+0x180/0x184 arch/arm64/kernel/entry.S:572 | irq event stamp: 2428 | hardirqs last enabled at (2427): [<ffffb7827c6f2308>] __up_console_sem+0xf0/0x118 kernel/printk/printk.c:255 | hardirqs last disabled at (2428): [<ffffb7828223df98>] el1_dbg+0x28/0x80 arch/arm64/kernel/entry-common.c:375 | softirqs last enabled at (2424): [<ffffb7827c411c00>] softirq_handle_end kernel/softirq.c:401 [inline] | softirqs last enabled at (2424): [<ffffb7827c411c00>] __do_softirq+0xa28/0x11e4 kernel/softirq.c:587 | softirqs last disabled at (2417): [<ffffb7827c59015c>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] invoke_softirq kernel/softirq.c:439 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] __irq_exit_rcu kernel/softirq.c:636 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] irq_exit_rcu+0x53c/0x688 kernel/softirq.c:648 | ---[ end trace 0ca578534e7ca938 ]--- With or without DEBUG_VIRTUAL __pa() will fall back to __kimg_to_phys() for non-linear addresses, and will happen to do the right thing in this case, even with the warning. But we should not depend upon this, and to keep the warning useful we should fix this case. Fix this issue by using __pa_symbol(), which handles kernel image addresses (and checks its input is a kernel image address). This matches what we do elsewhere, e.g. in arch/arm64/include/asm/pgtable.h: | #define ZERO_PAGE(vaddr) phys_to_page(__pa_symbol(empty_zero_page)) Fixes: 3744b5280e67 ("arm64: kexec: install a copy of the linear-map") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/r/20211130121849.3319010-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-02arm64: update PAC description for kernelKuan-Ying Lee1-5/+4
Remove the paragraph which has nothing to do with the kernel and add PAC description related to kernel. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Link: https://lore.kernel.org/r/20211201034014.20048-1-Kuan-Ying.Lee@mediatek.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-02KVM: x86/mmu: Retry page fault if root is invalidated by memslot updateSean Christopherson2-3/+23
Bail from the page fault handler if the root shadow page was obsoleted by a memslot update. Do the check _after_ acuiring mmu_lock, as the TDP MMU doesn't rely on the memslot/MMU generation, and instead relies on the root being explicit marked invalid by kvm_mmu_zap_all_fast(), which takes mmu_lock for write. For the TDP MMU, inserting a SPTE into an obsolete root can leak a SP if kvm_tdp_mmu_zap_invalidated_roots() has already zapped the SP, i.e. has moved past the gfn associated with the SP. For other MMUs, the resulting behavior is far more convoluted, though unlikely to be truly problematic. Installing SPs/SPTEs into the obsolete root isn't directly problematic, as the obsolete root will be unloaded and dropped before the vCPU re-enters the guest. But because the legacy MMU tracks shadow pages by their role, any SP created by the fault can can be reused in the new post-reload root. Again, that _shouldn't_ be problematic as any leaf child SPTEs will be created for the current/valid memslot generation, and kvm_mmu_get_page() will not reuse child SPs from the old generation as they will be flagged as obsolete. But, given that continuing with the fault is pointess (the root will be unloaded), apply the check to all MMUs. Fixes: b7cccd397f31 ("KVM: x86/mmu: Fast invalidation for TDP MMU") Cc: stable@vger.kernel.org Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211120045046.3940942-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-02KVM: VMX: Set failure code in prepare_vmcs02()Dan Carpenter1-1/+3
The error paths in the prepare_vmcs02() function are supposed to set *entry_failure_code but this path does not. It leads to using an uninitialized variable in the caller. Fixes: 71f7347025bf ("KVM: nVMX: Load GUEST_IA32_PERF_GLOBAL_CTRL MSR on VM-Entry") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20211130125337.GB24578@kili> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-02KVM: ensure APICv is considered inactive if there is no APICPaolo Bonzini4-4/+8
kvm_vcpu_apicv_active() returns false if a virtual machine has no in-kernel local APIC, however kvm_apicv_activated might still be true if there are no reasons to disable APICv; in fact it is quite likely that there is none because APICv is inhibited by specific configurations of the local APIC and those configurations cannot be programmed. This triggers a WARN: WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu)); To avoid this, introduce another cause for APICv inhibition, namely the absence of an in-kernel local APIC. This cause is enabled by default, and is dropped by either KVM_CREATE_IRQCHIP or the enabling of KVM_CAP_IRQCHIP_SPLIT. Reported-by: Ignat Korchagin <ignat@cloudflare.com> Fixes: ee49a8932971 ("KVM: x86: Move SVM's APICv sanity check to common x86", 2021-10-22) Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Ignat Korchagin <ignat@cloudflare.com> Message-Id: <20211130123746.293379-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-02KVM: x86/pmu: Fix reserved bits for AMD PerfEvtSeln registerLike Xu1-1/+1
If we run the following perf command in an AMD Milan guest: perf stat \ -e cpu/event=0x1d0/ \ -e cpu/event=0x1c7/ \ -e cpu/umask=0x1f,event=0x18e/ \ -e cpu/umask=0x7,event=0x18e/ \ -e cpu/umask=0x18,event=0x18e/ \ ./workload dmesg will report a #GP warning from an unchecked MSR access error on MSR_F15H_PERF_CTLx. This is because according to APM (Revision: 4.03) Figure 13-7, the bits [35:32] of AMD PerfEvtSeln register is a part of the event select encoding, which extends the EVENT_SELECT field from 8 bits to 12 bits. Opportunistically update pmu->reserved_bits for reserved bit 19. Reported-by: Jim Mattson <jmattson@google.com> Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20211118130320.95997-1-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-02ata: replace snprintf in show functions with sysfs_emitYang Guang1-1/+1
coccinelle report: ./drivers/ata/libata-sata.c:830:8-16: WARNING: use scnprintf or sprintf Use sysfs_emit instead of scnprintf or sprintf makes more sense. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Yang Guang <yang.guang5@zte.com.cn> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-12-01net: avoid uninit-value from tcp_conn_requestEric Dumazet2-3/+16
A recent change triggers a KMSAN warning, because request sockets do not initialize @sk_rx_queue_mapping field. Add sk_rx_queue_update() helper to make our intent clear. BUG: KMSAN: uninit-value in sk_rx_queue_set include/net/sock.h:1922 [inline] BUG: KMSAN: uninit-value in tcp_conn_request+0x3bcc/0x4dc0 net/ipv4/tcp_input.c:6922 sk_rx_queue_set include/net/sock.h:1922 [inline] tcp_conn_request+0x3bcc/0x4dc0 net/ipv4/tcp_input.c:6922 tcp_v4_conn_request+0x218/0x2a0 net/ipv4/tcp_ipv4.c:1528 tcp_rcv_state_process+0x2c5/0x3290 net/ipv4/tcp_input.c:6406 tcp_v4_do_rcv+0xb4e/0x1330 net/ipv4/tcp_ipv4.c:1738 tcp_v4_rcv+0x468d/0x4ed0 net/ipv4/tcp_ipv4.c:2100 ip_protocol_deliver_rcu+0x760/0x10b0 net/ipv4/ip_input.c:204 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ip_local_deliver+0x584/0x8c0 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:460 [inline] ip_sublist_rcv_finish net/ipv4/ip_input.c:551 [inline] ip_list_rcv_finish net/ipv4/ip_input.c:601 [inline] ip_sublist_rcv+0x11fd/0x1520 net/ipv4/ip_input.c:609 ip_list_rcv+0x95f/0x9a0 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5505 [inline] __netif_receive_skb_list_core+0xe34/0x1240 net/core/dev.c:5553 __netif_receive_skb_list+0x7fc/0x960 net/core/dev.c:5605 netif_receive_skb_list_internal+0x868/0xde0 net/core/dev.c:5696 gro_normal_list net/core/dev.c:5850 [inline] napi_complete_done+0x579/0xdd0 net/core/dev.c:6587 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0x17b6/0x2350 drivers/net/virtio_net.c:1557 __napi_poll+0x14e/0xbc0 net/core/dev.c:7020 napi_poll net/core/dev.c:7087 [inline] net_rx_action+0x824/0x1880 net/core/dev.c:7174 __do_softirq+0x1fe/0x7eb kernel/softirq.c:558 invoke_softirq+0xa4/0x130 kernel/softirq.c:432 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x76/0x130 kernel/softirq.c:648 common_interrupt+0xb6/0xd0 arch/x86/kernel/irq.c:240 asm_common_interrupt+0x1e/0x40 smap_restore arch/x86/include/asm/smap.h:67 [inline] get_shadow_origin_ptr mm/kmsan/instrumentation.c:31 [inline] __msan_metadata_ptr_for_load_1+0x28/0x30 mm/kmsan/instrumentation.c:63 tomoyo_check_acl+0x1b0/0x630 security/tomoyo/domain.c:173 tomoyo_path_permission security/tomoyo/file.c:586 [inline] tomoyo_check_open_permission+0x61f/0xe10 security/tomoyo/file.c:777 tomoyo_file_open+0x24f/0x2d0 security/tomoyo/tomoyo.c:311 security_file_open+0xb1/0x1f0 security/security.c:1635 do_dentry_open+0x4e4/0x1bf0 fs/open.c:809 vfs_open+0xaf/0xe0 fs/open.c:957 do_open fs/namei.c:3426 [inline] path_openat+0x52f1/0x5dd0 fs/namei.c:3559 do_filp_open+0x306/0x760 fs/namei.c:3586 do_sys_openat2+0x263/0x8f0 fs/open.c:1212 do_sys_open fs/open.c:1228 [inline] __do_sys_open fs/open.c:1236 [inline] __se_sys_open fs/open.c:1232 [inline] __x64_sys_open+0x314/0x380 fs/open.c:1232 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: __alloc_pages+0xbc7/0x10a0 mm/page_alloc.c:5409 alloc_pages+0x8a5/0xb80 alloc_slab_page mm/slub.c:1810 [inline] allocate_slab+0x287/0x1c20 mm/slub.c:1947 new_slab mm/slub.c:2010 [inline] ___slab_alloc+0xbdf/0x1e90 mm/slub.c:3039 __slab_alloc mm/slub.c:3126 [inline] slab_alloc_node mm/slub.c:3217 [inline] slab_alloc mm/slub.c:3259 [inline] kmem_cache_alloc+0xbb3/0x11c0 mm/slub.c:3264 reqsk_alloc include/net/request_sock.h:91 [inline] inet_reqsk_alloc+0xaf/0x8b0 net/ipv4/tcp_input.c:6712 tcp_conn_request+0x910/0x4dc0 net/ipv4/tcp_input.c:6852 tcp_v4_conn_request+0x218/0x2a0 net/ipv4/tcp_ipv4.c:1528 tcp_rcv_state_process+0x2c5/0x3290 net/ipv4/tcp_input.c:6406 tcp_v4_do_rcv+0xb4e/0x1330 net/ipv4/tcp_ipv4.c:1738 tcp_v4_rcv+0x468d/0x4ed0 net/ipv4/tcp_ipv4.c:2100 ip_protocol_deliver_rcu+0x760/0x10b0 net/ipv4/ip_input.c:204 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ip_local_deliver+0x584/0x8c0 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:460 [inline] ip_sublist_rcv_finish net/ipv4/ip_input.c:551 [inline] ip_list_rcv_finish net/ipv4/ip_input.c:601 [inline] ip_sublist_rcv+0x11fd/0x1520 net/ipv4/ip_input.c:609 ip_list_rcv+0x95f/0x9a0 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5505 [inline] __netif_receive_skb_list_core+0xe34/0x1240 net/core/dev.c:5553 __netif_receive_skb_list+0x7fc/0x960 net/core/dev.c:5605 netif_receive_skb_list_internal+0x868/0xde0 net/core/dev.c:5696 gro_normal_list net/core/dev.c:5850 [inline] napi_complete_done+0x579/0xdd0 net/core/dev.c:6587 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0x17b6/0x2350 drivers/net/virtio_net.c:1557 __napi_poll+0x14e/0xbc0 net/core/dev.c:7020 napi_poll net/core/dev.c:7087 [inline] net_rx_action+0x824/0x1880 net/core/dev.c:7174 __do_softirq+0x1fe/0x7eb kernel/softirq.c:558 Fixes: 342159ee394d ("net: avoid dirtying sk->sk_rx_queue_mapping") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20211130182939.2584764-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01net: annotate data-races on txq->xmit_lock_ownerEric Dumazet2-7/+17
syzbot found that __dev_queue_xmit() is reading txq->xmit_lock_owner without annotations. No serious issue there, let's document what is happening there. BUG: KCSAN: data-race in __dev_queue_xmit / __dev_queue_xmit write to 0xffff888139d09484 of 4 bytes by interrupt on cpu 0: __netif_tx_unlock include/linux/netdevice.h:4437 [inline] __dev_queue_xmit+0x948/0xf70 net/core/dev.c:4229 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_hh_output include/net/neighbour.h:511 [inline] neigh_output include/net/neighbour.h:525 [inline] ip6_finish_output2+0x995/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 read to 0xffff888139d09484 of 4 bytes by interrupt on cpu 1: __dev_queue_xmit+0x5e3/0xf70 net/core/dev.c:4213 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_resolve_output+0x3db/0x410 net/core/neighbour.c:1523 neigh_output include/net/neighbour.h:527 [inline] ip6_finish_output2+0x9be/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x8d/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 kcsan_setup_watchpoint+0x94/0x420 kernel/kcsan/core.c:443 folio_test_anon include/linux/page-flags.h:581 [inline] PageAnon include/linux/page-flags.h:586 [inline] zap_pte_range+0x5ac/0x10e0 mm/memory.c:1347 zap_pmd_range mm/memory.c:1467 [inline] zap_pud_range mm/memory.c:1496 [inline] zap_p4d_range mm/memory.c:1517 [inline] unmap_page_range+0x2dc/0x3d0 mm/memory.c:1538 unmap_single_vma+0x157/0x210 mm/memory.c:1583 unmap_vmas+0xd0/0x180 mm/memory.c:1615 exit_mmap+0x23d/0x470 mm/mmap.c:3170 __mmput+0x27/0x1b0 kernel/fork.c:1113 mmput+0x3d/0x50 kernel/fork.c:1134 exit_mm+0xdb/0x170 kernel/exit.c:507 do_exit+0x608/0x17a0 kernel/exit.c:819 do_group_exit+0xce/0x180 kernel/exit.c:929 get_signal+0xfc3/0x1550 kernel/signal.c:2852 arch_do_signal_or_restart+0x8c/0x2e0 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x113/0x190 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0xffffffff Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20211130170155.2331929-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01octeontx2-af: Fix a memleak bug in rvu_mbox_init()Zhou Qingyang1-1/+1
In rvu_mbox_init(), mbox_regions is not freed or passed out under the switch-default region, which could lead to a memory leak. Fix this bug by changing 'return err' to 'goto free_regions'. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_OCTEONTX2_AF=y show no new warnings, and our static analyzer no longer warns about this code. Fixes: 98c561116360 (“octeontx2-af: cn10k: Add mbox support for CN10K platform”) Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130165039.192426-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-01net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()Zhou Qingyang1-2/+7
In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv(). After that mlx4_en_alloc_resources() is called and there is a dereference of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to a use after free problem on failure of mlx4_en_copy_priv(). Fix this bug by adding a check of mlx4_en_copy_priv() This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_MLX4_EN=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org>