summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)AuthorFilesLines
2024-09-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-9/+19
Pull x86 kvm updates from Paolo Bonzini: "x86: - KVM currently invalidates the entirety of the page tables, not just those for the memslot being touched, when a memslot is moved or deleted. This does not traditionally have particularly noticeable overhead, but Intel's TDX will require the guest to re-accept private pages if they are dropped from the secure EPT, which is a non starter. Actually, the only reason why this is not already being done is a bug which was never fully investigated and caused VM instability with assigned GeForce GPUs, so allow userspace to opt into the new behavior. - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10 functionality that is on the horizon) - Rework common MSR handling code to suppress errors on userspace accesses to unsupported-but-advertised MSRs This will allow removing (almost?) all of KVM's exemptions for userspace access to MSRs that shouldn't exist based on the vCPU model (the actual cleanup is non-trivial future work) - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the 64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv) stores the entire 64-bit value at the ICR offset - Fix a bug where KVM would fail to exit to userspace if one was triggered by a fastpath exit handler - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when there's already a pending wake event at the time of the exit - Fix a WARN caused by RSM entering a nested guest from SMM with invalid guest state, by forcing the vCPU out of guest mode prior to signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the nested guest) - Overhaul the "unprotect and retry" logic to more precisely identify cases where retrying is actually helpful, and to harden all retry paths against putting the guest into an infinite retry loop - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in the shadow MMU - Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for adding multi generation LRU support in KVM - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled, i.e. when the CPU has already flushed the RSB - Trace the per-CPU host save area as a VMCB pointer to improve readability and cleanup the retrieval of the SEV-ES host save area - Remove unnecessary accounting of temporary nested VMCB related allocations - Set FINAL/PAGE in the page fault error code for EPT violations if and only if the GVA is valid. If the GVA is NOT valid, there is no guest-side page table walk and so stuffing paging related metadata is nonsensical - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of emulating posted interrupt delivery to L2 - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures - Harden eVMCS loading against an impossible NULL pointer deref (really truly should be impossible) - Minor SGX fix and a cleanup - Misc cleanups Generic: - Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed - Enable virtualization when KVM is loaded, not right before the first VM is created Together with the previous change, this simplifies a lot the logic of the callbacks, because their very existence implies virtualization is enabled - Fix a bug that results in KVM prematurely exiting to userspace for coalesced MMIO/PIO in many cases, clean up the related code, and add a testcase - Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_ the gpa+len crosses a page boundary, which thankfully is guaranteed to not happen in the current code base. Add WARNs in more helpers that read/write guest memory to detect similar bugs Selftests: - Fix a goof that caused some Hyper-V tests to be skipped when run on bare metal, i.e. NOT in a VM - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest - Explicitly include one-off assets in .gitignore. Past Sean was completely wrong about not being able to detect missing .gitignore entries - Verify userspace single-stepping works when KVM happens to handle a VM-Exit in its fastpath - Misc cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) Documentation: KVM: fix warning in "make htmldocs" s390: Enable KVM_S390_UCONTROL config in debug_defconfig selftests: kvm: s390: Add VM run test case KVM: SVM: let alternatives handle the cases when RSB filling is required KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range() KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps() KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range() KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure() KVM: x86: Update retry protection fields when forcing retry on emulation failure KVM: x86: Apply retry protection to "unprotect on failure" path KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn ...
2024-09-28Merge tag 's390-6.12-2' of ↵Linus Torvalds2-50/+40
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Clean up and improve vdso code: use SYM_* macros for function and data annotations, add CFI annotations to fix GDB unwinding, optimize the chacha20 implementation - Add vfio-ap driver feature advertisement for use by libvirt and mdevctl * tag 's390-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vfio-ap: Driver feature advertisement s390/vdso: Use one large alternative instead of an alternative branch s390/vdso: Use SYM_DATA_START_LOCAL()/SYM_DATA_END() for data objects tools: Add additional SYM_*() stubs to linkage.h s390/vdso: Use macros for annotation of asm functions s390/vdso: Add CFI annotations to __arch_chacha20_blocks_nostack() s390/vdso: Fix comment within __arch_chacha20_blocks_nostack() s390/vdso: Get rid of permutation constants
2024-09-27[tree-wide] finally take no_llseek outAl Viro6-6/+0
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-26Merge tag 'asm-generic-6.12' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are only two small patches, one cleanup for arch/alpha and a preparation patch cleaning up the handling of runtime constants in the linker scripts" * tag 'asm-generic-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: runtime constants: move list of constants to vmlinux.lds.h alpha: no need to include asm/xchg.h twice
2024-09-25Merge tag 'memblock-v6.12-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - new memblock_estimated_nr_free_pages() helper to replace totalram_pages() which is less accurate when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set - fixes for memblock tests * tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: s390/mm: get estimated free pages by memblock api kernel/fork.c: get estimated free pages by memblock api mm/memblock: introduce a new helper memblock_estimated_nr_free_pages() memblock test: fix implicit declaration of function 'strscpy' memblock test: fix implicit declaration of function 'isspace' memblock test: fix implicit declaration of function 'memparse' memblock test: add the definition of __setup() memblock test: fix implicit declaration of function 'virt_to_phys' tools/testing: abstract two init.h into common include directory memblock tests: include export.h in linkage.h as kernel dose memblock tests: include memory_hotplug.h in mmzone.h as kernel dose
2024-09-24Merge tag 'v6.12-p2' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Disable buggy p10 aes-gcm code on powerpc - Fix module aliases in paes_s390 - Fix buffer overread in caam * tag 'v6.12-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: powerpc/p10-aes-gcm - Disable CRYPTO_AES_GCM_P10 crypto: s390/paes - Fix module aliases crypto: caam - Pad SG length when allocating hash edesc
2024-09-23Merge tag 'pci-v6.12-changes' of ↵Linus Torvalds4-13/+14
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Wait for device readiness after reset by polling Vendor ID and looking for Configuration RRS instead of polling the Command register and looking for non-error completions, to avoid hardware retries done for RRS on non-Vendor ID reads (Bjorn Helgaas) - Rename CRS Completion Status to RRS ('Request Retry Status') to match PCIe r6.0 spec usage (Bjorn Helgaas) - Clear LBMS bit after a manual link retrain so we don't try to retrain a link when there's no downstream device anymore (Maciej W. Rozycki) - Revert to the original link speed after retraining fails instead of leaving it restricted to 2.5GT/s, so a future device has a chance to use higher speeds (Maciej W. Rozycki) - Wait for each level of downstream bus, not just the first, to become accessible before restoring devices on that bus (Ilpo Järvinen) - Add ARCH_PCI_DEV_GROUPS so s390 can add its own attribute_groups without having to stomp on the core's pdev->dev.groups (Lukas Wunner) Driver binding: - Export pcim_request_region(), a managed counterpart of pci_request_region(), for use by drivers (Philipp Stanner) - Export pcim_iomap_region() and deprecate pcim_iomap_regions() (Philipp Stanner) - Request the PCI BAR used by xboxvideo (Philipp Stanner) - Request and map drm/ast BARs with pcim_iomap_region() (Philipp Stanner) MSI: - Add MSI_FLAG_NO_AFFINITY flag for devices that mux MSIs onto a single IRQ line and cannot set the affinity of each MSI to a specific CPU core (Marek Vasut) - Use MSI_FLAG_NO_AFFINITY and remove unnecessary .irq_set_affinity() implementations in aardvark, altera, brcmstb, dwc, mediatek-gen3, mediatek, mobiveil, plda, rcar, tegra, vmd, xilinx-nwl, xilinx-xdma, and xilinx drivers to avoid 'IRQ: set affinity failed' warnings (Marek Vasut) Power management: - Add pwrctl support for ATH11K inside the WCN6855 package (Konrad Dybcio) PCI device hotplug: - Remove unnecessary hpc_ops struct from shpchp (ngn) - Check for PCI_POSSIBLE_ERROR(), not 0xffffffff, in cpqphp (weiyufeng) Virtualization: - Mark Creative Labs EMU20k2 INTx masking as broken (Alex Williamson) - Add an ACS quirk for Qualcomm SA8775P, which doesn't advertise ACS but does provide ACS-like features (Subramanian Ananthanarayanan) IOMMU: - Add function 0 DMA alias quirk for Glenfly Arise audio function, which uses the function 0 Requester ID (WangYuli) NPEM: - Add Native PCIe Enclosure Management (NPEM) support for sysfs control of NVMe RAID storage indicators (ok/fail/locate/ rebuild/etc) (Mariusz Tkaczyk) - Add support for the ACPI _DSM PCIe SSD status LED management, which is functionally similar to NPEM but mediated by platform firmware (Mariusz Tkaczyk) Device trees: - Drop minItems and maxItems from ranges in PCI generic host binding since host bridges may have several MMIO and I/O port apertures (Frank Li) - Add kirin, rcar-gen2, uniphier DT binding top-level constraints for clocks (Krzysztof Kozlowski) Altera PCIe controller driver: - Convert altera DT bindings from text to YAML (Matthew Gerlach) - Replace TLP_REQ_ID() with macro PCI_DEVID(), which does the same thing and is what other drivers use (Jinjie Ruan) Broadcom STB PCIe controller driver: - Add DT binding maxItems for reset controllers (Jim Quinlan) - Use the 'bridge' reset method if described in the DT (Jim Quinlan) - Use the 'swinit' reset method if described in the DT (Jim Quinlan) - Add 'has_phy' so the existence of a 'rescal' reset controller doesn't imply software control of it (Jim Quinlan) - Add support for many inbound DMA windows (Jim Quinlan) - Rename SoC 'type' to 'soc_base' express the fact that SoCs come in families of multiple similar devices (Jim Quinlan) - Add Broadcom 7712 DT description and driver support (Jim Quinlan) - Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings for maintainability (Bjorn Helgaas) Freescale i.MX6 PCIe controller driver: - Add imx6q-pcie 'dbi2' and 'atu' reg-names for i.MX8M Endpoints (Richard Zhu) - Fix a code restructuring error that caused i.MX8MM and i.MX8MP Endpoints to fail to establish link (Richard Zhu) - Fix i.MX8MP Endpoint occasional failure to trigger MSI by enforcing outbound alignment requirement (Richard Zhu) - Call phy_power_off() in the .probe() error path (Frank Li) - Rename internal names from imx6_* to imx_* since i.MX7/8/9 are also supported (Frank Li) - Manage Refclk by using SoC-specific callbacks instead of switch statements (Frank Li) - Manage core reset by using SoC-specific callbacks instead of switch statements (Frank Li) - Expand comments for erratum ERR010728 workaround (Frank Li) - Use generic PHY APIs to configure mode, speed, and submode, which is harmless for devices that implement their own internal PHY management and don't set the generic imx_pcie->phy (Frank Li) - Add i.MX8Q (i.MX8QM, i.MX8QXP, and i.MX8DXL) DT binding and driver Root Complex support (Richard Zhu) Freescale Layerscape PCIe controller driver: - Replace layerscape-pcie DT binding compatible fsl,lx2160a-pcie with fsl,lx2160ar2-pcie (Frank Li) - Add layerscape-pcie DT binding deprecated 'num-viewport' property to address a DT checker warning (Frank Li) - Change layerscape-pcie DT binding 'fsl,pcie-scfg' to phandle-array (Frank Li) Loongson PCIe controller driver: - Increase max PCI hosts to 8 for Loongson-3C6000 and newer chipsets (Huacai Chen) Marvell Aardvark PCIe controller driver: - Fix issue with emulating Configuration RRS for two-byte reads of Vendor ID; previously it only worked for four-byte reads (Bjorn Helgaas) MediaTek PCIe Gen3 controller driver: - Add per-SoC struct mtk_gen3_pcie_pdata to support multiple SoC types (Lorenzo Bianconi) - Use reset_bulk APIs to manage PHY reset lines (Lorenzo Bianconi) - Add DT and driver support for Airoha EN7581 PCIe controller (Lorenzo Bianconi) Qualcomm PCIe controller driver: - Update qcom,pcie-sc7280 DT binding with eight interrupts (Rayyan Ansari) - Add back DT 'vddpe-3v3-supply', which was incorrectly removed earlier (Johan Hovold) - Drop endpoint redundant masking of global IRQ events (Manivannan Sadhasivam) - Clarify unknown global IRQ message and only log it once to avoid a flood (Manivannan Sadhasivam) - Add 'linux,pci-domain' property to endpoint DT binding (Manivannan Sadhasivam) - Assign PCI domain number for endpoint controllers (Manivannan Sadhasivam) - Add 'qcom_pcie_ep' and the PCI domain number to IRQ names for endpoint controller (Manivannan Sadhasivam) - Add global SPI interrupt for PCIe link events to DT binding (Manivannan Sadhasivam) - Add global RC interrupt handler to handle 'Link up' events and automatically enumerate hot-added devices (Manivannan Sadhasivam) - Avoid mirroring of DBI and iATU register space so it doesn't overlap BAR MMIO space (Prudhvi Yarlagadda) - Enable controller resources like PHY only after PERST# is deasserted to partially avoid the problem that the endpoint SoC crashes when accessing things when Refclk is absent (Manivannan Sadhasivam) - Add 16.0 GT/s equalization and RX lane margining settings (Shashank Babu Chinta Venkata) - Pass domain number to pci_bus_release_domain_nr() explicitly to avoid a NULL pointer dereference (Manivannan Sadhasivam) Renesas R-Car PCIe controller driver: - Make the read-only const array 'check_addr' static (Colin Ian King) - Add R-Car V4M (R8A779H0) PCIe host and endpoint to DT binding (Yoshihiro Shimoda) TI DRA7xx PCIe controller driver: - Request IRQF_ONESHOT for 'dra7xx-pcie-main' IRQ since the primary handler is NULL (Siddharth Vadapalli) - Handle IRQ request errors during root port and endpoint probe (Siddharth Vadapalli) TI J721E PCIe driver: - Add DT 'ti,syscon-acspcie-proxy-ctrl' and driver support to enable the ACSPCIE module to drive Refclk for the Endpoint (Siddharth Vadapalli) - Extract the cadence link setup from cdns_pcie_host_setup() so link setup can be done separately during resume (Thomas Richard) - Add T_PERST_CLK_US definition for the mandatory delay between Refclk becoming stable and PERST# being deasserted (Thomas Richard) - Add j721e suspend and resume support (Théo Lebrun) TI Keystone PCIe controller driver: - Fix NULL pointer checking when applying MRRS limitation quirk for AM65x SR 1.0 Errata #i2037 (Dan Carpenter) Xilinx NWL PCIe controller driver: - Fix off-by-one error in INTx IRQ handler that caused INTx interrupts to be lost or delivered as the wrong interrupt (Sean Anderson) - Rate-limit misc interrupt messages (Sean Anderson) - Turn off the clock on probe failure and device removal (Sean Anderson) - Add DT binding and driver support for enabling/disabling PHYs (Sean Anderson) - Add PCIe phy bindings for the ZCU102 (Sean Anderson) Xilinx XDMA PCIe controller driver: - Add support for Xilinx QDMA Soft IP PCIe Root Port Bridge to DT binding and xilinx-dma-pl driver (Thippeswamy Havalige) Miscellaneous: - Fix buffer overflow in kirin_pcie_parse_port() (Alexandra Diupina) - Fix minor kerneldoc issues and typos (Bjorn Helgaas) - Use PCI_DEVID() macro in aer_inject() instead of open-coding it (Jinjie Ruan) - Check pcie_find_root_port() return in x86 fixups to avoid NULL pointer dereferences (Samasth Norway Ananda) - Make pci_bus_type constant (Kunwu Chan) - Remove unused declarations of __pci_pme_wakeup() and pci_vpd_release() (Yue Haibing) - Remove any leftover .*.cmd files with make clean (zhang jiao) - Remove unused BILLION macro (zhang jiao)" * tag 'pci-v6.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (132 commits) PCI: Fix typos dt-bindings: PCI: qcom: Allow 'vddpe-3v3-supply' again tools: PCI: Remove unused BILLION macro tools: PCI: Remove .*.cmd files with make clean PCI: Pass domain number to pci_bus_release_domain_nr() explicitly PCI: dra7xx: Fix error handling when IRQ request fails in probe PCI: dra7xx: Fix threaded IRQ request for "dra7xx-pcie-main" IRQ PCI: qcom: Add RX lane margining settings for 16.0 GT/s PCI: qcom: Add equalization settings for 16.0 GT/s PCI: dwc: Always cache the maximum link speed value in dw_pcie::max_link_speed PCI: dwc: Rename 'dw_pcie::link_gen' to 'dw_pcie::max_link_speed' PCI: qcom-ep: Enable controller resources like PHY only after refclk is available PCI: Mark Creative Labs EMU20k2 INTx masking as broken dt-bindings: PCI: imx6q-pcie: Add reg-name "dbi2" and "atu" for i.MX8M PCIe Endpoint dt-bindings: PCI: altera: msi: Convert to YAML PCI: imx6: Add i.MX8Q PCIe Root Complex (RC) support PCI: Rename CRS Completion Status to RRS PCI: aardvark: Correct Configuration RRS checking PCI: Wait for device readiness with Configuration RRS PCI: brcmstb: Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings ...
2024-09-23s390/vdso: Use one large alternative instead of an alternative branchHeiko Carstens1-19/+16
Replace the alternative branch with a larger alternative that contains both paths. That way the two paths are closer together and it is easier to change both paths if the need should arise. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-23s390/vdso: Use SYM_DATA_START_LOCAL()/SYM_DATA_END() for data objectsHeiko Carstens1-2/+3
Use SYM_DATA_START_LOCAL()/SYM_DATA_END() in vgetrandom-chacha.S so that the constants end up in an object with correct size: readelf -Ws vgetrandom-chacha.o Num: Value Size Type Bind Vis Ndx Name ... 5: 0000000000000000 32 OBJECT LOCAL DEFAULT 5 chacha20_constants Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-23s390/vdso: Use macros for annotation of asm functionsJens Remus1-10/+4
Use the macros SYM_FUNC_START() and SYM_FUNC_END() to annotate the functions in assembly. Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-23s390/vdso: Add CFI annotations to __arch_chacha20_blocks_nostack()Jens Remus1-0/+3
This allows proper unwinding, for instance when using a debugger such as GDB. Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-23s390/vdso: Fix comment within __arch_chacha20_blocks_nostack()Heiko Carstens1-4/+4
Fix comment within __arch_chacha20_blocks_nostack() so the comment reflects what the code is doing. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-23s390/vdso: Get rid of permutation constantsHeiko Carstens1-15/+10
The three byte masks for VECTOR PERMUTE are not needed, since the instruction VECTOR SHIFT LEFT DOUBLE BY BYTE can be used to implement the required "rotate left" easily. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-21Merge tag 's390-6.12-1' of ↵Linus Torvalds73-860/+2387
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Optimize ftrace and kprobes code patching and avoid stop machine for kprobes if sequential instruction fetching facility is available - Add hiperdispatch feature to dynamically adjust CPU capacity in vertical polarization to improve scheduling efficiency and overall performance. Also add infrastructure for handling warning track interrupts (WTI), allowing for graceful CPU preemption - Rework crypto code pkey module and split it into separate, independent modules for sysfs, PCKMO, CCA, and EP11, allowing modules to load only when the relevant hardware is available - Add hardware acceleration for HMAC modes and the full AES-XTS cipher, utilizing message-security assist extensions (MSA) 10 and 11. It introduces new shash implementations for HMAC-SHA224/256/384/512 and registers the hardware-accelerated AES-XTS cipher as the preferred option. Also add clear key token support - Add MSA 10 and 11 processor activity instrumentation counters to perf and update PAI Extension 1 NNPA counters - Cleanup cpu sampling facility code and rework debug/WARN_ON_ONCE statements - Add support for SHA3 performance enhancements introduced with MSA 12 - Add support for the query authentication information feature of MSA 13 and introduce the KDSA CPACF instruction. Provide query and query authentication information in sysfs, enabling tools like cpacfinfo to present this data in a human-readable form - Update kernel disassembler instructions - Always enable EXPOLINE_EXTERN if supported by the compiler to ensure kpatch compatibility - Add missing warning handling and relocated lowcore support to the early program check handler - Optimize ftrace_return_address() and avoid calling unwinder - Make modules use kernel ftrace trampolines - Strip relocs from the final vmlinux ELF file to make it roughly 2 times smaller - Dump register contents and call trace for early crashes to the console - Generate ptdump address marker array dynamically - Fix rcu_sched stalls that might occur when adding or removing large amounts of pages at once to or from the CMM balloon - Fix deadlock caused by recursive lock of the AP bus scan mutex - Unify sync and async register save areas in entry code - Cleanup debug prints in crypto code - Various cleanup and sanitizing patches for the decompressor - Various small ftrace cleanups * tag 's390-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (84 commits) s390/crypto: Display Query and Query Authentication Information in sysfs s390/crypto: Add Support for Query Authentication Information s390/crypto: Rework RRE and RRF CPACF inline functions s390/crypto: Add KDSA CPACF Instruction s390/disassembler: Remove duplicate instruction format RSY_RDRU s390/boot: Move boot_printk() code to own file s390/boot: Use boot_printk() instead of sclp_early_printk() s390/boot: Rename decompressor_printk() to boot_printk() s390/boot: Compile all files with the same march flag s390: Use MARCH_HAS_*_FEATURES defines s390: Provide MARCH_HAS_*_FEATURES defines s390/facility: Disable compile time optimization for decompressor code s390/boot: Increase minimum architecture to z10 s390/als: Remove obsolete comment s390/sha3: Fix SHA3 selftests failures s390/pkey: Add AES xts and HMAC clear key token support s390/cpacf: Add MSA 10 and 11 new PCKMO functions s390/mm: Add cond_resched() to cmm_alloc/free_pages() s390/pai_ext: Update PAI extension 1 counters s390/pai_crypto: Add support for MSA 10 and 11 pai counters ...
2024-09-21Merge tag 'mm-stable-2024-09-20-02-31' of ↵Linus Torvalds9-51/+38
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. - "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. - "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ...
2024-09-21crypto: s390/paes - Fix module aliasesHerbert Xu1-1/+4
The paes_s390 module didn't declare the correct aliases for the algorithms that it registered. Instead it declared an alias for the non-existent paes algorithm. The Crypto API will eventually try to load the paes algorithm, to construct the cbc(paes) instance. But because the module does not actually contain a "paes" algorithm, this will fail. Previously this failure was hidden and the the cbc(paes) lookup will be retried. This was fixed recently, thus exposing the buggy alias in paes_s390. Replace the bogus paes alias with aliases for the actual algorithms. Reported-by: Ingo Franzki <ifranzki@linux.ibm.com> Fixes: e7a4142b35ce ("crypto: api - Fix generic algorithm self-test races") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-09-19Merge tag 'dma-mapping-6.12-2024-09-19' of ↵Linus Torvalds2-2/+2
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - support DMA zones for arm64 systems where memory starts at > 4GB (Baruch Siach, Catalin Marinas) - support direct calls into dma-iommu and thus obsolete dma_map_ops for many common configurations (Leon Romanovsky) - add DMA-API tracing (Sean Anderson) - remove the not very useful return value from various dma_set_* APIs (Christoph Hellwig) - misc cleanups and minor optimizations (Chen Y, Yosry Ahmed, Christoph Hellwig) * tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: reflow dma_supported dma-mapping: reliably inform about DMA support for IOMMU dma-mapping: add tracing for dma-mapping API calls dma-mapping: use IOMMU DMA calls for common alloc/free page calls dma-direct: optimize page freeing when it is not addressable dma-mapping: clearly mark DMA ops as an architecture feature vdpa_sim: don't select DMA_OPS arm64: mm: keep low RAM dma zone dma-mapping: don't return errors from dma_set_max_seg_size dma-mapping: don't return errors from dma_set_seg_boundary dma-mapping: don't return errors from dma_set_min_align_mask scsi: check that busses support the DMA API before setting dma parameters arm64: mm: fix DMA zone when dma-ranges is missing dma-mapping: direct calls for dma-iommu dma-mapping: call ->unmap_page and ->unmap_sg unconditionally arm64: support DMA zone above 4GB dma-mapping: replace zone_dma_bits by zone_dma_limit dma-mapping: use bit masking to check VM_DMA_COHERENT
2024-09-17Merge tag 'kvm-s390-next-6.12-1' of ↵Paolo Bonzini2-9/+19
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD * New ucontrol selftest * Inline assembly touchups
2024-09-17s390/pci_mmio: use follow_pfnmap APIPeter Xu1-10/+12
Use the new API that can understand huge pfn mappings. Link: https://lkml.kernel.org/r/20240826204353.2228736-12-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-17mm: always define pxx_pgprot()Peter Xu1-0/+1
There're: - 8 archs (arc, arm64, include, mips, powerpc, s390, sh, x86) that support pte_pgprot(). - 2 archs (x86, sparc) that support pmd_pgprot(). - 1 arch (x86) that support pud_pgprot(). Always define them to be used in generic code, and then we don't need to fiddle with "#ifdef"s when doing so. Link: https://lkml.kernel.org/r/20240826204353.2228736-9-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-16s390: Enable KVM_S390_UCONTROL config in debug_defconfigChristoph Schlameuss1-0/+1
To simplify testing enable UCONTROL KVM by default in debug kernels. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20240807154512.316936-11-schlameuss@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20240807154512.316936-11-schlameuss@linux.ibm.com>
2024-09-13s390/vdso: Wire up getrandom() vdso implementationHeiko Carstens10-8/+289
Provide the s390 specific vdso getrandom() architecture backend. _vdso_rng_data required data is placed within the _vdso_data vvar page, by using a hardcoded offset larger than vdso_data. As required the chacha20 implementation does not write to the stack. The implementation follows more or less the arm64 implementations and makes use of vector instructions. It has a fallback to the getrandom() system call for machines where the vector facility is not installed. The check if the vector facility is installed, as well as an optimization for machines with the vector-enhancements facility 2, is implemented with alternatives, avoiding runtime checks. Note that __kernel_getrandom() is implemented without the vdso user wrapper which would setup a stack frame for odd cases (aka very old glibc variants) where the caller has not done that. All callers of __kernel_getrandom() are required to setup a stack frame, like the C ABI requires it. The vdso testcases vdso_test_getrandom and vdso_test_chacha pass. Benchmark on a z16: $ ./vdso_test_getrandom bench-single vdso: 25000000 times in 0.493703559 seconds syscall: 25000000 times in 6.584025337 seconds Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/vdso: Move vdso symbol handling to separate header fileHeiko Carstens4-14/+19
The vdso.h header file, which is included at many places, includes generated header files. This can easily lead to recursive header file inclusions if the vdso code is changed. Therefore move the vdso symbol code, which requires the generated header files, to a separate header file, and include it at the two locations which require it. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/vdso: Allow alternatives in vdso codeHeiko Carstens2-0/+23
Implement the infrastructure required to allow alternatives in vdso code. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/module: Provide find_section() helperHeiko Carstens1-0/+14
Provide find_section() helper function which can be used to find a section by name, similar to other architectures. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/facility: Let test_facility() generate static branch if possibleHeiko Carstens1-8/+29
Let test_facility() generate a branch instruction if the tested facility is a constant, and where the result cannot be evaluated during compile time. The branch instruction defaults to "false" and is patched to nop (branch not taken) if the tested facility is available. This avoids runtime checks and is similar to x86's static_cpu_has() and arm64's alternative_has_cap_likely(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/alternatives: Remove ALT_FACILITY_EARLYHeiko Carstens2-6/+2
Patch all alternatives which depend on facilities from the decompressor. There is no technical reason which enforces to split patching of such alternatives to the decompressor and the kernel. This simplifies alternative handling a bit, since one alternative type is removed. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13s390/facility: Disable compile time optimization for decompressor codeHeiko Carstens1-2/+4
Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions. Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-12s390/crypto: Display Query and Query Authentication Information in sysfsFinn Callies2-0/+120
Displays the query (fc=0) and query authentication information (fc=127) as binary in sysfs per CPACF instruction. Files are located in /sys/devices/system/cpu/cpacf/. These information can be fetched via asm already except for PCKMO because this instruction is privileged. To offer a unified interface all CPACF instructions will have this information displayed in sysfs in files <instruction>_query_raw and <instruction>_query_auth_info_raw. A new tool introduced into s390-tools called cpacfinfo will use this information to convert and display in human readable form. Suggested-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Finn Callies <fcallies@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-12s390/crypto: Add Support for Query Authentication InformationFinn Callies1-1/+30
Introduce functions __cpacf_qai() and wrapper cpacf_qai() to the respective existing functions __cpacf_query() and cpacf_query() are introduced to support the Query Authentication Information feature of MSA 13. Suggested-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Finn Callies <fcallies@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-12s390/crypto: Rework RRE and RRF CPACF inline functionsFinn Callies1-33/+41
Rework of the __cpacf_query_rre() and __cpacf_query_rrf() functions to support additional function codes. A function code is passed as a new parameter to specify which subfunction of the supplied Instruction is to be called. Suggested-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Finn Callies <fcallies@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-12s390/crypto: Add KDSA CPACF InstructionFinn Callies1-0/+23
Add the function code definitions for using the KDSA function to the CPACF header file. Suggested-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Finn Callies <fcallies@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-12s390/disassembler: Remove duplicate instruction format RSY_RDRUJens Remus2-5/+4
Instruction format RSY_RDRU is a duplicate of RSY_RURD2. Use the latter, as it follows the s390-specific conventions for instruction format naming used in binutils. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-09-09mm: make arch_get_unmapped_area() take vm_flags by defaultMark Brown1-2/+2
Patch series "mm: Care about shadow stack guard gap when getting an unmapped area", v2. As covered in the commit log for c44357c2e76b ("x86/mm: care about shadow stack guard gap during placement") our current mmap() implementation does not take care to ensure that a new mapping isn't placed with existing mappings inside it's own guard gaps. This is particularly important for shadow stacks since if two shadow stacks end up getting placed adjacent to each other then they can overflow into each other which weakens the protection offered by the feature. On x86 there is a custom arch_get_unmapped_area() which was updated by the above commit to cover this case by specifying a start_gap for allocations with VM_SHADOW_STACK. Both arm64 and RISC-V have equivalent features and use the generic implementation of arch_get_unmapped_area() so let's make the equivalent change there so they also don't get shadow stack pages placed without guard pages. The arm64 and RISC-V shadow stack implementations are currently on the list: https://lore.kernel.org/r/20240829-arm64-gcs-v12-0-42fec94743 https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/ Given the addition of the use of vm_flags in the generic implementation we also simplify the set of possibilities that have to be dealt with in the core code by making arch_get_unmapped_area() take vm_flags as standard. This is a bit invasive since the prototype change touches quite a few architectures but since the parameter is ignored the change is straightforward, the simplification for the generic code seems worth it. This patch (of 3): When we introduced arch_get_unmapped_area_vmflags() in 961148704acd ("mm: introduce arch_get_unmapped_area_vmflags()") we did so as part of properly supporting guard pages for shadow stacks on x86_64, which uses a custom arch_get_unmapped_area(). Equivalent features are also present on both arm64 and RISC-V, both of which use the generic implementation of arch_get_unmapped_area() and will require equivalent modification there. Rather than continue to deal with having two versions of the functions let's bite the bullet and have all implementations of arch_get_unmapped_area() take vm_flags as a parameter. The new parameter is currently ignored by all implementations other than x86. The only caller that doesn't have a vm_flags available is mm_get_unmapped_area(), as for the x86 implementation and the wrapper used on other architectures this is modified to supply no flags. No functional changes. Link: https://lkml.kernel.org/r/20240904-mm-generic-shadow-stack-guard-v2-0-a46b8b6dc0ed@kernel.org Link: https://lkml.kernel.org/r/20240904-mm-generic-shadow-stack-guard-v2-1-a46b8b6dc0ed@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Helge Deller <deller@gmx.de> [parisc] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N Rao <naveen@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-07s390/boot: Move boot_printk() code to own fileHeiko Carstens3-113/+126
Keep the printk code separate from the program check code and move boot_printk() and helper functions to own printk.c file. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/boot: Use boot_printk() instead of sclp_early_printk()Heiko Carstens5-40/+12
Consistently use boot_printk() everywhere instead of sclp_early_printk() at some places. For some places it was required (e.g. als.c), in order to stay in code compiled for the same architecture level, for other places it is not obvious why sclp_early_printk() was used instead of decompressor_printk(). Given that the whole decompressor code is compiled for the same architecture level, there is no requirement left to use different printk functions. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/boot: Rename decompressor_printk() to boot_printk()Heiko Carstens4-41/+37
Rename decompressor_printk() to boot_printk() just to have a shorter function name, which also makes the code more readable. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/boot: Compile all files with the same march flagHeiko Carstens1-14/+5
Only a couple of files of the decompressor are compiled with the minimum architecture level. This is problematic for potential function calls between compile units, especially if a target function is within a compile until compiled for a higher architecture level, since that may lead to an unexpected operation exception. Therefore compile all files of the decompressor for the same (minimum) architecture level. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390: Use MARCH_HAS_*_FEATURES definesHeiko Carstens6-19/+26
Replace CONFIG_HAVE_MARCH_*_FEATURES with MARCH_HAS_*_FEATURES everywhere so code gets compiled correctly depending on if the target is the kernel or the decompressor. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390: Provide MARCH_HAS_*_FEATURES definesHeiko Carstens1-0/+38
Provide MARCH_HAS_*_FEATURES defines which are supposed to be used everywhere instead of the CONFIG_HAVE_MARCH_*_FEATURES defines. Various header files contain code which depend on the CONFIG_HAVE_MARCH_*_FEATURES defines, allowing for compile time optimizations. If such code is used within the decompressor wrong code may be generated (the compiler may generate instructions which are not available for the minimum architecture level of the decompressor). Therefore provide a new header file with MARCH_HAS_*_FEATURES defines, which are only available if __DECOMPRESSOR is not defined. This way code generation for the kernel image is still optimized depending on CONFIG_HAVE_MARCH_*_FEATURES, while code generated for the decompressor is compiled for the minimum architecture level. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/facility: Disable compile time optimization for decompressor codeHeiko Carstens1-2/+4
Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions. Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/boot: Increase minimum architecture to z10Heiko Carstens1-4/+0
The decompressor code is partially compiled with march=z900 so it is possible to print an error message in case a kernel is booted on a machine which misses facilities to execute the kernel. Given that the decompressor code also includes header files from the core kernel this causes problems for inline assemblies and other code where the minimum assumed architecture level is set to z10 in the meantime. If such code is also used in the decompressor (e.g. inline functions) z900 support must be implemented again. In order to avoid this and to keep things simple just raise the minimum architecture level to z10 for the decompressor just like for the kernel. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-07s390/als: Remove obsolete commentHeiko Carstens1-8/+0
The bss section of the decompressor is part of the compressed kernel image since commit 980d5f9ab36b ("s390/boot: enable .bss section for compressed kernel"). Remove a now incorrect comment that states that the bss section must not be accessed. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/sha3: Fix SHA3 selftests failuresIngo Franzki3-0/+7
Since commit "s390/sha3: Support sha3 performance enhancements" the selftests of the sha3_256_s390 and sha3_512_s390 kernel digests sometimes fail with: alg: shash: sha3-256-s390 test failed (wrong result) on test vector 3, cfg="import/export" alg: self-tests for sha3-256 using sha3-256-s390 failed (rc=-22) or with alg: ahash: sha3-256-s390 test failed (wrong result) on test vector 3, cfg="digest misaligned splits crossing pages" alg: self-tests for sha3-256 using sha3-256-s390 failed (rc=-22) The first failure is because the newly introduced context field 'first_message_part' is not copied during export and import operations. Because of that the value of 'first_message_part' is more or less random after an import into a newly allocated context and may or may not fit to the state of the imported SHA3 operation, causing an invalid hash when it does not fit. Save the 'first_message_part' field in the currently unused field 'partial' of struct sha3_state, even though the meaning of 'partial' is not exactly the same as 'first_message_part'. For the caller the returned state blob is opaque and it must only be ensured that the state can be imported later on by the module that exported it. The second failure is when on entry of s390_sha_update() the flag 'first_message_part' is on, and kimd is called in the first 'if (index)' block as well as in the second 'if (len >= bsize)' block. In this case, the 'first_message_part' is turned off after the first kimd, but the function code incorrectly retains the NIP flag. Reset the NIP flag after the first kimd unconditionally besides turning 'first_message_part' off. Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Fixes: 88c02b3f79a6 ("s390/sha3: Support sha3 performance enhancements") Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Joerg Schmidbauer <jschmidb@de.ibm.com> Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/pkey: Add AES xts and HMAC clear key token supportHarald Freudenberger1-0/+4
Add support for deriving protected keys from clear key token for AES xts and HMAC keys via PCKMO instruction. Add support for protected key generation and unwrap of protected key tokens for these key types. Furthermore 4 new sysfs attributes are introduced: - /sys/devices/virtual/misc/pkey/protkey/protkey_aes_xts_128 - /sys/devices/virtual/misc/pkey/protkey/protkey_aes_xts_256 - /sys/devices/virtual/misc/pkey/protkey/protkey_hmac_512 - /sys/devices/virtual/misc/pkey/protkey/protkey_hmac_1024 Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/cpacf: Add MSA 10 and 11 new PCKMO functionsHarald Freudenberger1-12/+16
Add the defines for the new PCKMO functions covering MSA 10 (AES XTS "double" keys) and MSA 11 (HMAC keys) support. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/mm: Add cond_resched() to cmm_alloc/free_pages()Gerald Schaefer1-1/+17
Adding/removing large amount of pages at once to/from the CMM balloon can result in rcu_sched stalls or workqueue lockups, because of busy looping w/o cond_resched(). Prevent this by adding a cond_resched(). cmm_free_pages() holds a spin_lock while looping, so it cannot be added directly to the existing loop. Instead, introduce a wrapper function that operates on maximum 256 pages at once, and add it there. Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/pai_ext: Update PAI extension 1 countersThomas Richter1-0/+9
Update the internal array of PAI extension 1 NNPA counter string table to support specialized processor instrumentation assist instructions. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-05s390/pai_crypto: Add support for MSA 10 and 11 pai countersThomas Richter1-0/+16
Update the internal array of PAI crypto counter string table with new counters supported with Message Security Assist extension (MSA) 10 and MSA 11. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Finn Callies <fcallies@linux.ibm.com> Tested-by: Finn Callies <fcallies@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-09-03arch, mm: move definition of node_data to generic codeMike Rapoport (Microsoft)3-20/+1
Every architecture that supports NUMA defines node_data in the same way: struct pglist_data *node_data[MAX_NUMNODES]; No reason to keep multiple copies of this definition and its forward declarations, especially when such forward declaration is the only thing in include/asm/mmzone.h for many architectures. Add definition and declaration of node_data to generic code and drop architecture-specific versions. Link: https://lkml.kernel.org/r/20240807064110.1003856-8-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64 Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU] Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Rob Herring (Arm) <robh@kernel.org> Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>