path: root/drivers/iommu/arm
2024-03-26  iommu/arm-smmu-v3: Fix access for STE.SHCFG  (Mostafa Saleh; 2 files, -12/+25)

The STE attributes (NSCFG, PRIVCFG, INSTCFG) use the value 0 for "Use Incoming"; for some reason SHCFG doesn't follow that, and "Use Incoming" is instead defined as 0b01.

Currently the driver sets SHCFG to "Use Incoming" for stage-2 and bypass domains. However, according to the User Manual (ARM IHI 0070 F.b):

    When SMMU_IDR1.ATTR_TYPES_OVR == 0, this field is RES0 and the
    incoming Shareability attribute is used.

This patch makes the SHCFG "Use Incoming" write conditional so the driver is compliant with the architecture, and defines ATTR_TYPES_OVR as a new feature discovered from IDR1. This also required propagating the SMMU pointer through some function arguments.

There is no need to add a similar condition to the newly introduced arm_smmu_get_ste_used(), as the values of the STE are the same before and after any transition, so this will not trigger any change (we already do the same for the VMID).

Although this is a misconfiguration by the driver, it has been there for a long time, so probably no HW running Linux is affected by it.

Reported-by: Will Deacon <will@kernel.org>
Closes: https://lore.kernel.org/all/20240215134952.GA690@willie-the-truck/
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240323134658.464743-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
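As a rough illustration (a sketch, not the literal patch): the guard described above amounts to a feature bit discovered at probe time plus a conditional field write in the STE builder. ARM_SMMU_FEAT_ATTR_TYPES_OVR is the feature flag this patch introduces; the exact IDR1 bit name is an assumption.

    /* Probe: discover the attribute-override capability from IDR1 */
    if (reg & IDR1_ATTR_TYPES_OVR)              /* assumed bit name */
            smmu->features |= ARM_SMMU_FEAT_ATTR_TYPES_OVR;

    /* STE builder: SHCFG is RES0 unless the override is implemented,
     * in which case "Use Incoming" must be written as 0b01. */
    if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR)
            target->data[1] |= cpu_to_le64(
                    FIELD_PREP(STRTAB_STE_1_SHCFG,
                               STRTAB_STE_1_SHCFG_INCOMING));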
2024-03-26  iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V  (Jason Gunthorpe; 1 file, -1/+2)

STRTAB_STE_0_V is a CPU value; it needs conversion for sparse to be clean. The missing annotation was a mistake introduced by splitting the ops out from the STE writer.

Fixes: 7da51af9125c ("iommu/arm-smmu-v3: Make STE programming independent of the callers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403011441.5WqGrYjp-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-98b23ebb0c84+9f-smmu_cputole_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
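Illustratively, the fix is the usual pattern for mixing a CPU-endian constant into a __le64 field (a sketch of the shape of the change, not the exact hunk):

    -       target->data[0] = STRTAB_STE_0_V;
    +       target->data[0] = cpu_to_le64(STRTAB_STE_0_V);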
2024-03-13  Merge tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu  (Linus Torvalds; 6 files, -318/+529)

Pull iommu updates from Joerg Roedel:

 "Core changes:
   - Constification of bus_type pointer
   - Preparations for user-space page-fault delivery
   - Use a named kmem_cache for IOVA magazines

  Intel VT-d changes from Lu Baolu:
   - Add RBTree to track iommu probed devices
   - Add Intel IOMMU debugfs document
   - Cleanup and refactoring

  ARM-SMMU Updates from Will Deacon:
   - Device-tree binding updates for a bunch of Qualcomm SoCs
   - SMMUv2: Support for Qualcomm X1E80100 MDSS
   - SMMUv3: Significant rework of the driver's STE manipulation and
     domain handling code. This is the initial part of a larger scale
     rework aiming to improve the driver's implementation of the
     IOMMU-API in preparation for hooking up IOMMUFD support.

  AMD-Vi Updates:
   - Refactor GCR3 table support for SVA
   - Cleanups

  Some smaller cleanups and fixes"

* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu: Fix compilation without CONFIG_IOMMU_INTEL
  iommu/amd: Fix sleeping in atomic context
  iommu/dma: Document min_align_mask assumption
  iommu/vt-d: Remove scalable mode in domain_context_clear_one()
  iommu/vt-d: Remove scalable mode context entry setup from attach_dev
  iommu/vt-d: Setup scalable mode context entry in probe path
  iommu/vt-d: Fix NULL domain on device release
  iommu: Add static iommu_ops->release_domain
  iommu/vt-d: Improve ITE fault handling if target device isn't present
  iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
  PCI: Make pci_dev_is_disconnected() helper public for other drivers
  iommu/vt-d: Use device rbtree in iopf reporting path
  iommu/vt-d: Use rbtree to track iommu probed devices
  iommu/vt-d: Merge intel_svm_bind_mm() into its caller
  iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
  iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
  iommu/vt-d: Add the document for Intel IOMMU debugfs
  iommu/vt-d: Use kcalloc() instead of kzalloc()
  iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
  iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
  ...
2024-03-11  Merge tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 1 file, -2/+3)

Pull MSI updates from Thomas Gleixner:
 "Updates for the MSI interrupt subsystem and initial RISC-V MSI support.

  The core changes have been adopted from previous work which converted
  ARM[64] to the new per-device MSI domain model, which was merged to
  support multiple MSI domains per device. The ARM[64] changes are being
  worked on too, but are not ready yet. The core and platform-MSI changes
  have been split out to not hold up RISC-V and to avoid RISC-V building
  on interfaces which are scheduled for removal.

  The core support provides new interfaces to handle wire-to-MSI bridges
  in a straightforward way and introduces new platform-MSI interfaces
  which are built on top of the per-device MSI domain model. Once ARM[64]
  is converted over, the old platform-MSI interfaces and the related
  ugliness in the MSI core code will be removed.

  The actual MSI parts for RISC-V were finalized late and have been
  postponed to the next merge window.

  Drivers:
   - Add a new driver for the Andes hart-level interrupt controller
   - Rework the SiFive PLIC driver to prepare for MSI support
   - Expand the RISC-V INTC driver to support the new RISC-V AIA
     controller which provides the basis for MSI on RISC-V
   - A few fixups for the fallout of the core changes"

* tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  irqchip/riscv-intc: Fix low-level interrupt handler setup for AIA
  x86/apic/msi: Use DOMAIN_BUS_GENERIC_MSI for HPET/IO-APIC domain search
  genirq/matrix: Dynamic bitmap allocation
  irqchip/riscv-intc: Add support for RISC-V AIA
  irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore
  irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()
  irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure
  irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode
  irqchip/sifive-plic: Use devm_xyz() for managed allocation
  irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()
  irqchip/sifive-plic: Convert PLIC driver into a platform driver
  irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
  irqchip/riscv-intc: Allow large non-standard interrupt number
  genirq/irqdomain: Don't call ops->select for DOMAIN_BUS_ANY tokens
  irqchip/imx-intmux: Handle pure domain searches correctly
  genirq/msi: Provide MSI_FLAG_PARENT_PM_DEV
  genirq/irqdomain: Reroute device MSI create_mapping
  genirq/msi: Provide allocation/free functions for "wired" MSI interrupts
  genirq/msi: Optionally use dev->fwnode for device domain
  genirq/msi: Provide DOMAIN_BUS_WIRED_TO_MSI
  ...
2024-03-08  Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next  (Joerg Roedel; 6 files, -318/+529)
2024-03-01  iommu: constify of_phandle_args in xlate  (Krzysztof Kozlowski; 3 files, -3/+6)

The xlate callbacks are supposed to translate of_phandle_args to a proper provider without modifying the of_phandle_args. Make the argument a pointer to const for code safety and readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
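For the ARM SMMU drivers this amounts to a signature change along these lines (a sketch of the shape of the change):

    -static int arm_smmu_of_xlate(struct device *dev,
    -                             struct of_phandle_args *args)
    +static int arm_smmu_of_xlate(struct device *dev,
    +                             const struct of_phandle_args *args)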
2024-02-29  iommu/arm-smmu-v3: Convert to domain_alloc_paging()  (Jason Gunthorpe; 1 file, -5/+17)

Now that the BLOCKED and IDENTITY behaviors are managed with their own domains, change to the domain_alloc_paging() op.

For now SVA remains on the old interface; eventually it will get its own op that can pass in the device and mm_struct, which will let us have a sane lifetime for the mmu_notifier.

Call arm_smmu_domain_finalise() early if dev is available.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/16-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize  (Jason Gunthorpe; 1 file, -17/+18)

Instead of putting container_of() casts in the internals, use the proper type in this call chain. This makes it easier to check that the two global static domains are not leaking into call chains they should not.

Passing the smmu avoids the only caller from having to set it and unset it in the error path.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/15-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Use the identity/blocked domain during release  (Jason Gunthorpe; 1 file, -5/+2)

Consolidate some more code by having release call arm_smmu_attach_dev_identity/blocked() instead of open coding this.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/14-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Add a global static BLOCKED domain  (Jason Gunthorpe; 1 file, -0/+19)

Using the same design as the IDENTITY domain, install an STRTAB_STE_0_CFG_ABORT STE.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/13-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
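The pattern is a static iommu_domain with its own attach op; a minimal sketch, assuming an arm_smmu_make_abort_ste()/arm_smmu_attach_dev_ste() split along the lines of the surrounding STE rework (helper names are assumptions):

    static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
                                           struct device *dev)
    {
            struct arm_smmu_ste ste;

            /* STRTAB_STE_0_CFG_ABORT: all DMA from the device faults */
            arm_smmu_make_abort_ste(&ste);
            arm_smmu_attach_dev_ste(dev, &ste);
            return 0;
    }

    static const struct iommu_domain_ops arm_smmu_blocked_ops = {
            .attach_dev = arm_smmu_attach_dev_blocked,
    };

    static struct iommu_domain arm_smmu_blocked_domain = {
            .type = IOMMU_DOMAIN_BLOCKED,
            .ops = &arm_smmu_blocked_ops,
    };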
2024-02-29  iommu/arm-smmu-v3: Add a global static IDENTITY domain  (Jason Gunthorpe; 2 files, -25/+58)

Move to the new static global for identity domains. Move all the logic out of arm_smmu_attach_dev into an identity-only function.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/12-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Check that the RID domain is S1 in SVA  (Jason Gunthorpe; 1 file, -1/+7)

The SVA code only works if the RID domain is a S1 domain and has already installed the cdtable.

Originally the check for this was in arm_smmu_sva_bind(), but when that op was removed the test didn't get copied over to the new arm_smmu_sva_set_dev_pasid(). Without the test, wrong usage will usually hit a WARN_ON() in arm_smmu_write_ctx_desc() due to a missing ctx table.

However, the next patches will change things so that an IDENTITY domain is not a struct arm_smmu_domain, and this will turn into memory corruption if the struct is wrongly cast.

Fail in arm_smmu_sva_set_dev_pasid() if the STE does not have a S1, which is a proxy for the STE having a pointer to the CD table. Write it in a way that will be compatible with the next patches.

Fixes: 386fa64fd52b ("arm-smmu-v3/sva: Add SVA domain support")
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Closes: https://lore.kernel.org/linux-iommu/2a828e481416405fb3a4cceb9e075a59@huawei.com/
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
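A sketch of the check, assuming the RID domain is looked up via iommu_get_domain_for_dev() (as the master->domain removal below also does); the exact placement and condition are an illustration:

    struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

    /* The STE must point at a CD table for SVA to install a CD */
    if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING) ||
        to_smmu_domain(domain)->stage != ARM_SMMU_DOMAIN_S1)
            return -ENODEV;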
2024-02-29  iommu/arm-smmu-v3: Remove arm_smmu_master->domain  (Jason Gunthorpe; 2 files, -17/+10)

Introducing global statics which are of type struct iommu_domain, not struct arm_smmu_domain, makes it difficult to retain arm_smmu_master->domain, as it can no longer point to an IDENTITY or BLOCKED domain.

The only place that uses the value is arm_smmu_detach_dev(). Change things to work like other drivers and call iommu_get_domain_for_dev() to obtain the current domain.

The master->domain was subtly protecting master->domain_head against being unused, as only PAGING domains will set master->domain and only PAGING domains use master->domain_head. To keep it simple, leave master->domain_head initialized so that the list_del() logic just does nothing for attached non-PAGING domains.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
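After this change the detach path can recover the domain the way other drivers do; a minimal sketch (the list_del_init() keeps domain_head usable afterwards, matching the "does nothing" behavior described above):

    static void arm_smmu_detach_dev(struct arm_smmu_master *master)
    {
            struct iommu_domain *domain =
                    iommu_get_domain_for_dev(master->dev);
            struct arm_smmu_domain *smmu_domain;
            unsigned long flags;

            /* Only PAGING domains track masters on a devices list */
            if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
                    return;
            smmu_domain = to_smmu_domain(domain);

            spin_lock_irqsave(&smmu_domain->devices_lock, flags);
            list_del_init(&master->domain_head);
            spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
    }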
2024-02-29  iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()  (Jason Gunthorpe; 1 file, -7/+6)

The caller already has the domain, just pass it in. A following patch will remove master->domain.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/9-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Put writing the context descriptor in the right order  (Jason Gunthorpe; 1 file, -9/+20)

Get closer to the IOMMU API ideal that changes between domains can be hitless. The ordering for the CD table entry is not entirely clean from this perspective.

When switching away from a STE with a CD table programmed in it, we should write the new STE first, then clear any old data in the CD entry. If we are programming a CD table for the first time to a STE, the CD entry should be programmed before the STE is loaded. If we are replacing a CD table entry when the STE already points at the CD entry, we just need to do the make/break sequence.

Lift this code out of arm_smmu_detach_dev() so it can all be sequenced properly. The only other caller is arm_smmu_release_device(), and it is going to free the cdtable anyhow, so it doesn't matter what is in it.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()  (Jason Gunthorpe; 1 file, -6/+9)

This was needed because the STE code required the STE to be in ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code can automatically handle all transitions, we can remove this step from the attach_dev flow.

A few small bugs exist because of this:

1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false then there will be a moment where the STE points at BYPASS. Since this can be done by VFIO/IOMMUFD it is a small security race.

2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT regions will temporarily become BLOCKED. We'd like drivers to work in a way that allows IOMMU_RESV_DIRECT to be continuously functional during these transitions.

Make arm_smmu_release_device() put the STE back to the correct ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT was ignored on this path.

As noted before, the reordering of the linked list/STE/CD changes is OK against concurrent arm_smmu_share_asid() because of the arm_smmu_asid_lock.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Compute the STE only once for each master  (Jason Gunthorpe; 1 file, -35/+22)

Currently arm_smmu_install_ste_for_dev() iterates over every SID and computes an identical STE from scratch, even though every SID should have the same STE contents. Turn this inside out so that the STE is supplied by the caller and arm_smmu_install_ste_for_dev() simply installs it to every SID. This is possible now that the STE generation does not inform what sequence should be used to program it.

This allows splitting the STE calculation up according to the call site, which following patches will make use of, and removes the confusing NULL-domain special case that only supported arm_smmu_detach_dev().

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev  (Jason Gunthorpe; 1 file, -9/+13)

The BTM support wants to be able to change the ASID of any smmu_domain. When it goes to do this it holds the arm_smmu_asid_lock and iterates over the target domain's devices list.

During attach of a S1 domain we must ensure that the devices list and CD are in sync, otherwise we could miss CD updates or a parallel CD update could push an out-of-date CD.

This is pretty complicated, and it almost works today because arm_smmu_detach_dev() removes the master from the linked list before working on the CD entries, preventing parallel update of the CD. However, it does have an issue where the CD can remain programmed while the domain appears to be unattached. arm_smmu_share_asid() will then not clear any CD entries and will install its own CD entry with the same ASID concurrently. This creates a small race window where the IOMMU can see two ASIDs pointing to different translations:

    CPU0                                 CPU1
    arm_smmu_attach_dev()
      arm_smmu_detach_dev()
        spin_lock_irqsave(&smmu_domain->devices_lock, flags);
        list_del(&master->domain_head);
        spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

                                         arm_smmu_mmu_notifier_get()
                                           arm_smmu_alloc_shared_cd()
                                           arm_smmu_share_asid():
                                             // Does nothing due to the
                                             // list_del above
                                             arm_smmu_update_ctx_desc_devices()
                                             arm_smmu_tlb_inv_asid()
                                           arm_smmu_write_ctx_desc()
                                             ** Now the ASID is in two CDs
                                                with different translation
      arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);

Solve this by wrapping most of the attach flow in the arm_smmu_asid_lock. This locks more than strictly needed, to prepare for the next patch, which will reorganize the order of the linked list, STE and CD changes.

Move arm_smmu_detach_dev() until after we have initialized the domain, so the lock can be held for less time.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
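In outline the serialized attach flow looks like this (a sketch using function names that appear elsewhere in this log; error handling omitted):

    mutex_lock(&arm_smmu_asid_lock);

    arm_smmu_detach_dev(master);

    spin_lock_irqsave(&smmu_domain->devices_lock, flags);
    list_add(&master->domain_head, &smmu_domain->devices);
    spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

    /* The CD write is now serialized against arm_smmu_share_asid() */
    if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
            arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
                                    &smmu_domain->cd);
    arm_smmu_install_ste_for_dev(master);

    mutex_unlock(&arm_smmu_asid_lock);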
2024-02-29  iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()  (Jason Gunthorpe; 2 files, -14/+15)

Half the code was living in arm_smmu_domain_finalise_s2(); just move it here and take the values directly from the pgtbl_ops instead of storing copies.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions  (Jason Gunthorpe; 1 file, -55/+83)

This is preparation to move the STE calculation higher up into the call chain and remove arm_smmu_write_strtab_ent(). These new functions will be called directly from attach_dev.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass  (Jason Gunthorpe; 1 file, -42/+55)

This allows writing the flow of arm_smmu_write_strtab_ent() around abort and bypass domains more naturally.

Note that the core code no longer supplies NULL domains, though there is still a flow in the driver that ends up in arm_smmu_write_strtab_ent() with NULL. A later patch will remove it.

Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes() and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now simply invoke arm_smmu_make_bypass_ste() directly.

Rename arm_smmu_init_bypass_stes() to arm_smmu_init_initial_stes() to better reflect its purpose.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29  iommu/arm-smmu-v3: Make STE programming independent of the callers  (Jason Gunthorpe; 1 file, -64/+211)

As the comment in arm_smmu_write_strtab_ent() explains, this routine has been limited to only work correctly in certain scenarios that the caller must ensure. Generally the caller must put the STE into ABORT or BYPASS before attempting to program it to something else.

The iommu core APIs would ideally expect the driver to do a hitless change of iommu_domain in a number of cases:

 - RESV_DIRECT support wants IDENTITY -> DMA -> IDENTITY to be hitless
   for the RESV ranges

 - PASID upgrade has IDENTITY on the RID with no PASID, then a PASID
   paging domain installed. The RID should not be impacted

 - PASID downgrade has IDENTITY on the RID and all PASIDs removed.
   The RID should not be impacted

 - RID does PAGING -> BLOCKING with an active PASID; PASIDs should not
   be impacted

 - NESTING -> NESTING for carrying all the above hitless cases in a VM
   into the hypervisor. To comprehensively emulate the HW in a VM we
   should assume the VM OS is running logic like this and expecting
   hitless updates to be relayed to real HW.

For CD updates arm_smmu_write_ctx_desc() has a similar comment explaining how limited it is, and the driver does have a need for hitless CD updates:

 - SMMUv3 BTM S1 ASID re-label

 - SVA mm release should change the CD to answer not-present to all
   requests without allowing logging (EPD0)

The next patches/series are going to start removing some of this logic from the callers and add more complex state combinations than currently exist. At the end, everything that can be hitless will be hitless, including all of the above.

Introduce arm_smmu_write_ste() which will run through the multi-qword programming sequence to avoid creating an incoherent 'torn' STE in the HW caches. It automatically detects which of two algorithms to use:

1) The disruptive V=0 update described in the spec, which disrupts the entry and does three syncs to make the change:
   - Write V=0 to QWORD 0
   - Write the entire STE except QWORD 0
   - Write QWORD 0

2) A hitless update algorithm that follows the same rationale the driver already uses. It is safe to change IGNORED bits that HW doesn't use:
   - Write the target value into all currently unused bits
   - Write a single QWORD; this makes the new STE live atomically
   - Ensure now-unused bits are 0

The detection of which path to use and the implementation of the hitless update rely on a "used bitmask" describing what bits the HW is actually using based on the V/CFG/etc bits. This flows from the spec language, typically indicated as IGNORED. Knowing which bits the HW is using, we can update the bits it does not use and then compute how many QWORDS need to be changed. If only one qword needs to be updated, the hitless algorithm is possible.

Later patches will include CD updates in this mechanism, so make the implementation generic using a struct arm_smmu_entry_writer and struct arm_smmu_entry_writer_ops to abstract the differences between STE and CD to be plugged in.

At this point it generates the same sequence of updates as the current code, except that zeroing the VMID on entry to BYPASS/ABORT will do an extra sync (this seems to be an existing bug).

Going forward this will use a V=0 transition instead of cycling through ABORT if a hitful change is required. This seems more appropriate, as ABORT will fail DMAs without any logging, but dropping a DMA due to transient V=0 is probably signaling a bug, so the C_BAD_STE is valuable.

Add STRTAB_STE_1_SHCFG_INCOMING to s2_cfg; this was editing the STE in place and subtly inherited the value of data[1] from abort/bypass.

Signed-off-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
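In pseudo-C, the core of the algorithm looks roughly like this. This is a sketch of the idea rather than the committed code: the real version re-derives the used bits after the critical flip so it can clear newly-unused bits, and NUM_ENTRY_QWORDS is an assumed name for the 8-qword STE size.

    static void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer,
                                     __le64 *entry, const __le64 *target)
    {
            __le64 used[NUM_ENTRY_QWORDS];
            unsigned int i, num_changed = 0;

            /* Which bits does HW read, given the entry's V/CFG/etc? */
            writer->ops->get_used(entry, used);
            for (i = 0; i != NUM_ENTRY_QWORDS; i++)
                    if ((entry[i] ^ target[i]) & used[i])
                            num_changed++;

            if (num_changed <= 1) {
                    /* Hitless: stage the target in IGNORED bits, then
                     * make it live with a single-qword store. */
                    for (i = 0; i != NUM_ENTRY_QWORDS; i++)
                            entry[i] = (entry[i] & used[i]) |
                                       (target[i] & ~used[i]);
                    writer->ops->sync(writer);
                    for (i = 0; i != NUM_ENTRY_QWORDS; i++)
                            entry[i] = target[i]; /* flips one live qword */
                    writer->ops->sync(writer);
            } else {
                    /* Disruptive: V=0, write the tail, then qword 0 */
                    entry[0] = 0;
                    writer->ops->sync(writer);
                    for (i = 1; i != NUM_ENTRY_QWORDS; i++)
                            entry[i] = target[i];
                    writer->ops->sync(writer);
                    entry[0] = target[0];
                    writer->ops->sync(writer);
            }
    }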
2024-02-22  iommu/arm-smmu-qcom: Add X1E80100 MDSS compatible  (Abel Vesa; 1 file, -0/+1)

Add the X1E80100 MDSS compatible to the clients compatible list, as it also needs the workarounds.

Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240131-x1e80100-iommu-arm-smmu-qcom-v1-1-c1240419c718@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-22  iommu/arm-smmu-v3: Do not use GFP_KERNEL under a spinlock  (Jason Gunthorpe; 1 file, -26/+12)

If the SMMU is configured to use a two-level CD table then arm_smmu_write_ctx_desc() allocates a CD table leaf internally using GFP_KERNEL. Due to recent changes this is being done under a spinlock taken to iterate over the device list, and thus triggers a sleeping-while-atomic warning:

    arm_smmu_sva_set_dev_pasid()
      mutex_lock(&sva_lock);
      __arm_smmu_sva_bind()
        arm_smmu_mmu_notifier_get()
          spin_lock_irqsave()
          arm_smmu_write_ctx_desc()
            arm_smmu_get_cd_ptr()
              arm_smmu_alloc_cd_leaf_table()
                dmam_alloc_coherent(GFP_KERNEL)

This is a 64K high-order allocation and really should not be done atomically.

At the moment the rework of the SVA code to follow the new API is half finished. Recently the CD table memory was moved from the domain to the master; however, we have the confusing situation where the SVA code is wrongly using the RID domain's devices list to track which CD tables the SVA is installed in.

Remove the logic to replicate the CD across all the domain's masters during attach. We know which master and which CD table the PASID should be installed in.

Right now SVA only works when dma-iommu.c is in control of the RID translation, which means we have a single iommu_domain shared across the entire group and that iommu_domain is not shared outside the group. Critically this means that the iommu_group->devices list and the RID's smmu_domain->devices list describe the same set of masters.

For PCI cases the core code also insists on singleton groups, so there is only one entry in the smmu_domain->devices list, and it is equal to the master being passed in to arm_smmu_sva_set_dev_pasid(). Only non-PCI cases may have multi-device groups. However, the core code will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire iommu_group->devices list.

Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the devices in the group via the RID's smmu_domain, rely on __arm_smmu_sva_bind() being called for each device in the group and install the repeated CD entry that way. This avoids taking the spinlock to access the devices list and permits arm_smmu_write_ctx_desc() to use a sleeping allocation.

Leave arm_smmu_mm_release() as a confusing situation; fixing it requires tracking attached masters inside the SVA domain.

Removing the loop allows arm_smmu_write_ctx_desc() to be called outside the spinlock, making it safe to use GFP_KERNEL. Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that arm_smmu_mmu_notifier_get/put() remain paired functions.

Fixes: 24503148c545 ("iommu/arm-smmu-v3: Refactor write_ctx_desc")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/4e25d161-0cf8-4050-9aa3-dfa21cd63e56@moroto.mountain/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Link: https://lore.kernel.org/r/0-v3-11978fc67151+112-smmu_cd_atomic_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
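Schematically, the per-device bind then does the CD write in blocking context (a sketch; 'pasid', 'cd' and 'smmu_mn' stand in for values __arm_smmu_sva_bind() already has):

    /*
     * __arm_smmu_sva_bind(), called once per device in the group by
     * the core code: write only this master's CD. No devices_lock is
     * held, so the GFP_KERNEL leaf-table allocation inside
     * arm_smmu_write_ctx_desc() is safe.
     */
    ret = arm_smmu_write_ctx_desc(master, pasid, cd);
    if (ret)
            arm_smmu_mmu_notifier_put(smmu_mn); /* keep get/put paired */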
2024-02-16  iommu: Make iommu_report_device_fault() return void  (Lu Baolu; 1 file, -2/+2)

As iommu_report_device_fault() has been converted to auto-respond to a page fault if it fails to enqueue it, there's no need to return a code in any case. Make it return void.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Make iopf_group_response() return void  (Lu Baolu; 1 file, -32/+18)

iopf_group_response() should return void, as nothing can do anything with the failure. This implies that ops->page_response() must also return void; this is consistent with what the drivers do. The failure paths, which are all integrity validations of the fault, should be WARN_ON'd, not return codes.

If the iommu core fails to enqueue the fault, it should respond to the fault directly by calling ops->page_response() instead of returning an error number and relying on the iommu drivers to do so. Consolidate the error fault handling code in the core.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-16-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Separate SVA and IOPF  (Lu Baolu; 2 files, -2/+0)

Add CONFIG_IOMMU_IOPF for the page fault handling framework and select it from its real consumers. Move the iopf function declarations from iommu-sva.h to iommu.h and remove iommu-sva.h, as it's empty now.

Consolidate all SVA-related code into iommu-sva.c:
 - Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c.
 - Move the SVA iopf handling code from io-pgfault.c to iommu-sva.c.

Consolidate iommu_report_device_fault() and iommu_page_response() into io-pgfault.c.

Export iopf_free_group() and iopf_group_response() for iopf handlers implemented in modules. Some functions are renamed with more meaningful names. No other intentional functionality changes.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Merge iommu_fault_event and iopf_fault  (Lu Baolu; 1 file, -2/+2)

The iommu_fault_event and iopf_fault data structures store the same information about an iopf fault and are used in the same way. Merge these two data structures into a single one to make the code more concise and easier to maintain.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu: Remove iommu_[un]register_device_fault_handler()  (Lu Baolu; 1 file, -12/+1)

The individual iommu drivers report iommu page faults by calling iommu_report_device_fault(), where a pre-registered device fault handler is called to route the fault to another fault handler installed on the corresponding iommu domain.

The pre-registered device fault handler is static and won't become dynamic, as the fault handler is eventually per iommu domain. Replace calling the device fault handler with iommu_queue_iopf(). After this replacement, the registering and unregistering fault handler interfaces are not needed anywhere. Remove the interfaces and the related data structures to avoid dead code.

Convert the cookie parameter of iommu_queue_iopf() into a device pointer, which is what is really passed.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16  iommu/arm-smmu-v3: Remove unrecoverable faults reporting  (Lu Baolu; 1 file, -33/+13)

No device driver registers a fault handler to handle the reported unrecoverable faults. Remove this code to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-15  irqchip: Convert all platform MSI users to the new API  (Thomas Gleixner; 1 file, -2/+3)

Switch all the users of the platform MSI domain over to invoke the new interfaces, which branch to the original platform MSI functions when the irqdomain associated with the caller's device does not yet provide MSI parent functionality.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240127161753.114685-7-apatel@ventanamicro.com
2024-02-13  Revert "iommu/arm-smmu: Convert to domain_alloc_paging()"  (Dmitry Baryshkov; 1 file, -11/+6)

This reverts commit 9b3febc3a3da ("iommu/arm-smmu: Convert to domain_alloc_paging()"). It breaks the Qualcomm MSM8996 platform: calling arm_smmu_write_context_bank() from the new codepath results in the platform being reset because of the unclocked hardware access.

Fixes: 9b3febc3a3da ("iommu/arm-smmu: Convert to domain_alloc_paging()")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20240213-iommu-revert-domain-alloc-v1-1-325ff55dece4@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-01-18  Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu  (Linus Torvalds; 7 files, -132/+173)

Pull iommu updates from Joerg Roedel:

 "Core changes:
   - Fix race conditions in device probe path
   - Retire IOMMU bus_ops
   - Support for passing custom allocators to page table drivers
   - Clean up Kconfig around IOMMU_SVA
   - Support for sharing SVA domains with all devices bound to a mm
   - Firmware data parsing cleanup
   - Tracing improvements for iommu-dma code
   - Some smaller fixes and cleanups

  ARM-SMMU drivers:
   - Device-tree binding updates:
      - Add additional compatible strings for Qualcomm SoCs
      - Document Adreno clocks for Qualcomm's SM8350 SoC
   - SMMUv2:
      - Implement support for the ->domain_alloc_paging() callback
      - Ensure Secure context is restored following suspend of Qualcomm
        SMMU implementation
   - SMMUv3:
      - Disable stalling mode for the "quiet" context descriptor
      - Minor refactoring and driver cleanups

  Intel VT-d driver:
   - Cleanup and refactoring

  AMD IOMMU driver:
   - Improve IO TLB invalidation logic
   - Small cleanups and improvements

  Rockchip IOMMU driver:
   - DT binding update to add Rockchip RK3588

  Apple DART driver:
   - Apple M1 USB4/Thunderbolt DART support
   - Cleanups

  Virtio IOMMU driver:
   - Add support for iotlb_sync_map
   - Enable deferred IO TLB flushes"

* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-08  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER  (Kirill A. Shutemov; 1 file, -1/+1)

Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition.

To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER.

Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-03  Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next  (Joerg Roedel; 7 files, -132/+173)
2023-12-13  iommu/arm-smmu: Convert to domain_alloc_paging()  (Jason Gunthorpe; 1 file, -6/+11)

Now that the BLOCKED and IDENTITY behaviors are managed with their own domains, change to the domain_alloc_paging() op.

The check for using_legacy_binding is now redundant; arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this mode, so the core code will never attempt to create a DMA domain in the first place.

Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will not pass in IDENTITY or BLOCKED if the global statics exist, so the test for DMA is also redundant now.

Call arm_smmu_init_domain_context() early if a dev is available.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com
[will: Simplify arm_smmu_domain_alloc_paging() since 'cfg' cannot be NULL]
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu: Pass arm_smmu_domain to internal functions  (Jason Gunthorpe; 1 file, -12/+10)

Keep the types consistent: all the callers of these functions have already obtained a struct arm_smmu_domain, so don't needlessly go to/from an iommu_domain through the internal call chains.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED  (Jason Gunthorpe; 1 file, -3/+25)

Using the same design as IDENTITY, set up an S2CR_TYPE_FAULT s2cr for the device.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu: Convert to a global static identity domain  (Jason Gunthorpe; 2 files, -29/+53)

Create a global static identity domain with its own arm_smmu_attach_dev_identity() that simply calls arm_smmu_master_install_s2crs() with the identity parameters; the attach path for identity thus gets its own unique implementation.

Remove ARM_SMMU_DOMAIN_BYPASS and all checks of IOMMU_DOMAIN_IDENTITY.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com
[will: Move duplicated autosuspend logic into a helper function]
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()  (Jason Gunthorpe; 1 file, -13/+10)

Make arm_smmu_domain_add_master() not use the smmu_domain to detect the s2cr configuration; instead, pass it in as a parameter. It always returns zero, so make it return void. Since it no longer really does anything to do with a domain, call it arm_smmu_master_install_s2crs().

This is done so the next two patches can re-use this code without forcing the creation of a struct arm_smmu_domain.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
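A sketch of the reorganized helper, under the assumption that it loops the driver's existing SME iterator over the s2cr array (the exact parameter list is an illustration):

    static void arm_smmu_master_install_s2crs(struct arm_smmu_master_cfg *cfg,
                                              enum arm_smmu_s2cr_type type,
                                              u8 cbndx,
                                              struct iommu_fwspec *fwspec)
    {
            struct arm_smmu_device *smmu = cfg->smmu;
            struct arm_smmu_s2cr *s2cr = smmu->s2crs;
            int i, idx;

            /* Program every stream-match entry owned by this master */
            for_each_cfg_sme(cfg, fwspec, i, idx) {
                    s2cr[idx].type = type;
                    s2cr[idx].privcfg = S2CR_PRIVCFG_DEFAULT;
                    s2cr[idx].cbndx = cbndx;
                    arm_smmu_write_s2cr(smmu, idx);
            }
    }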
2023-12-13  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED  (Jason Gunthorpe; 2 files, -4/+1)

Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove it. The ongoing work to add nesting support through iommufd will do something a little different.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()  (Jason Gunthorpe; 1 file, -7/+2)

The only caller is arm_smmu_install_ste_for_dev(), which never has a NULL master. Remove the confusing if.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13  iommu/arm-smmu-v3: Add a type for the STE  (Jason Gunthorpe; 2 files, -31/+35)

Instead of passing a naked __le64 * around to represent a STE, wrap it in a "struct arm_smmu_ste" with an array of the correct size. This makes it much clearer which functions will comprise the "STE API".

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
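The type itself is tiny; a sketch (an STE is 8 qwords, so STRTAB_STE_DWORDS is 8):

    struct arm_smmu_ste {
            __le64 data[STRTAB_STE_DWORDS];
    };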
2023-12-12  iommu/arm-smmu-v3: disable stall for quiet_cd  (Wenkai Lin; 1 file, -0/+3)

In the stall model, invalid transactions are expected to be stalled and aborted by the IOPF handler. However, when killing a test case with a huge amount of data, the accelerator's pipeline cannot stop until all data is consumed, even if the page fault handler reports errors. As a result, the kill may take a long time, about 10 seconds, with numerous iopf interrupts.

So disable stall for quiet_cd in the non-force stall model, since the force stall model (STALL_MODEL == 0b10) requires CD.S to be 1.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20231206005727.46150-1-zhangfei.gao@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12  iommu/qcom: restore IOMMU state if needed  (Vladimir Lypak; 1 file, -1/+9)

If the IOMMU has a power domain then some state will be lost in qcom_iommu_suspend, and TZ will reset the device if we don't call qcom_scm_restore_sec_cfg before accessing it again.

Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
[luca@z3ntu.xyz: reword commit message a bit]
Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Link: https://lore.kernel.org/r/20231011-msm8953-iommu-restore-v1-1-48a0c93809a2@z3ntu.xyz
Signed-off-by: Will Deacon <will@kernel.org>
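Schematically, the resume path then looks like this (a sketch; qcom_scm_restore_sec_cfg() is the existing SCM call, the rest of the shape is an assumption based on the driver's clk handling):

    static int __maybe_unused qcom_iommu_resume(struct device *dev)
    {
            struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev);
            int ret;

            ret = clk_bulk_prepare_enable(CLK_NUM, qcom_iommu->clks);
            if (ret)
                    return ret;

            /* TZ wiped the secure config while the power domain was off */
            if (dev->pm_domain)
                    return qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, 0);

            return 0;
    }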
2023-12-12  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible  (Konrad Dybcio; 1 file, -0/+1)

Add the QCM2290 MDSS compatible to the clients compatible list, as it also needs the workarounds.

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231125-topic-rb1_feat-v3-5-4cbb567743bb@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12  iommu/arm-smmu-qcom: Add missing GMU entry to match table  (Rob Clark; 1 file, -0/+1)

In some cases the firmware expects cbndx 1 to be assigned to the GMU, so we also want the default domain for the GMU to be an identity domain. This way it does not get a context bank assigned.

Without this, both of_dma_configure() and drm/msm's iommu_domain_attach() will trigger allocating and configuring a context bank, so the GMU ends up attached to both cbndx 1 and later cbndx 2. This arrangement seemingly confounds and surprises the firmware if the GPU later triggers a translation fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU getting wedged and the GPU stuck without memory access.

Cc: stable@vger.kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20231210180655.75542-1-robdclark@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12  iommu: Mark dev_iommu_priv_set() with a lockdep  (Jason Gunthorpe; 2 files, -2/+0)

A perfect driver would only call dev_iommu_priv_set() from its probe callback. We've made it functionally correct to call it from of_xlate by adding a lock around that call.

Add a lockdep assertion that iommu_probe_device_lock is held, to discourage misuse. Exclude PPC kernels with CONFIG_FSL_PAMU turned on, because FSL_PAMU uses a global static for its priv and abuses priv for its domain.

Remove the pointless stores of NULL; all of these are on paths where the core code will free dev->iommu after the op returns.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
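The setter then becomes, in outline (a sketch of the described change):

    void dev_iommu_priv_set(struct device *dev, void *priv)
    {
            /* FSL_PAMU does something odd here with a global static */
            if (!IS_ENABLED(CONFIG_FSL_PAMU))
                    lockdep_assert_held(&iommu_probe_device_lock);
            dev->iommu->priv = priv;
    }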
2023-12-12  iommu: Add mm_get_enqcmd_pasid() helper function  (Tina Zhang; 1 file, -8/+15)

mm_get_enqcmd_pasid() should be used by architecture code and closely related code to learn the PASID value that the x86 ENQCMD operation should use for the mm.

For the moment SMMUv3 uses this without any connection to ENQCMD; it will be cleaned up similarly to how the prior patch made VT-d use the PASID argument of set_dev_pasid().

The motivation is to replace mm->pasid with an iommu-private data structure that is introduced in a later patch.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231027000525.1278806-4-tina.zhang@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
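For now the helper is a trivial accessor; a sketch of the interim form, before mm->pasid moves into the iommu-private data structure:

    static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
    {
            return mm->pasid;
    }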
2023-11-27  iommu: Clean up open-coded ownership checks  (Robin Murphy; 3 files, -24/+4)

Some drivers already implement their own defence against the possibility of being given someone else's device. Since this is now taken care of by the core code (and via a slightly different path from the original fwspec-based idea), let's clean them up.

Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/58a9879ce3f03562bb061e6714fe6efb554c3907.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>