summaryrefslogtreecommitdiffstats
path: root/drivers/iommu
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'soc-drivers-6.3' of ↵Linus Torvalds2023-02-273-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, there are lots of minor driver changes across SoC platforms from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung. These usually add support for additional chip variations in existing drivers, but also add features or bugfixes. The SCMI firmware subsystem gains a unified raw userspace interface through debugfs, which can be used for validation purposes. Newly added drivers include: - New power management drivers for StarFive JH7110, Allwinner D1 and Renesas RZ/V2M - A driver for Qualcomm battery and power supply status - A SoC device driver for identifying Nuvoton WPCM450 chips - A regulator coupler driver for Mediatek MT81xxv" * tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) power: supply: Introduce Qualcomm PMIC GLINK power supply soc: apple: rtkit: Do not copy the reg state structure to the stack soc: sunxi: SUN20I_PPU should depend on PM memory: renesas-rpc-if: Remove redundant division of dummy soc: qcom: socinfo: Add IDs for IPQ5332 and its variant dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ MAINTAINERS: Update qcom CPR maintainer entry dt-bindings: firmware: document Qualcomm SM8550 SCM dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants soc: qcom: socinfo: Add support for new field in revision 17 soc: qcom: smd-rpm: Add IPQ9574 compatible soc: qcom: pmic_glink: remove redundant calculation of svid soc: qcom: stats: Populate all subsystem debugfs files dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies soc: qcom: pmic_glink: Introduce altmode support ...
| * firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/Elliot Berman2023-02-083-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h. This removes 1 of a few remaining Qualcomm-specific headers into a more approciate subdirectory under include/. Suggested-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Guru Das Srinagesh <quic_gurus@quicinc.com> Acked-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230203210956.3580811-1-quic_eberman@quicinc.com
* | Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2023-02-2410-36/+122
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "Some polishing and small fixes for iommufd: - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem - Use GFP_KERNEL_ACCOUNT inside the iommu_domains - Support VFIO_NOIOMMU mode with iommufd - Various typos - A list corruption bug if HWPTs are used for attach" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Do not add the same hwpt to the ioas->hwpt_list twice iommufd: Make sure to zero vfio_iommu_type1_info before copying to user vfio: Support VFIO_NOIOMMU with iommufd iommufd: Add three missing structures in ucmd_buffer selftests: iommu: Fix test_cmd_destroy_access() call in user_copy iommu: Remove IOMMU_CAP_INTR_REMAP irq/s390: Add arch_is_isolated_msi() for s390 iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code iommufd: Convert to msi_device_has_isolated_msi() vfio/type1: Convert to iommu_group_has_isolated_msi() iommu: Add iommu_group_has_isolated_msi() genirq/msi: Add msi_device_has_isolated_msi()
| * \ Merge tag 'v6.2' into iommufd.git for-nextJason Gunthorpe2023-02-215-17/+35
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Resolve conflicts from the signature change in iommu_map: - drivers/infiniband/hw/usnic/usnic_uiom.c Switch iommu_map_atomic() to iommu_map(.., GFP_ATOMIC) - drivers/vfio/vfio_iommu_type1.c Following indenting change for GFP_KERNEL Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | iommufd: Do not add the same hwpt to the ioas->hwpt_list twiceJason Gunthorpe2023-02-151-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hwpt is added to the hwpt_list only during its creation, it is never added again. This hunk is some missed leftover from rework. Adding it twice will corrupt the linked list in some cases. It effects HWPT specific attachment, which is something the test suite cannot cover until we can create a legitimate struct device with a non-system iommu "driver" (ie we need the bus removed from the iommu code) Cc: stable@vger.kernel.org Fixes: e8d57210035b ("iommufd: Add kAPI toward external drivers for physical devices") Link: https://lore.kernel.org/r/1-v1-4336b5cb2fe4+1d7-iommufd_hwpt_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | iommufd: Make sure to zero vfio_iommu_type1_info before copying to userJason Gunthorpe2023-02-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Missed a zero initialization here. Most of the struct is filled with a copy_from_user(), however minsz for that copy is smaller than the actual struct by 8 bytes, thus we don't fill the padding. Cc: stable@vger.kernel.org # 6.1+ Fixes: d624d6652a65 ("iommufd: vfio container FD ioctl compatibility") Link: https://lore.kernel.org/r/0-v1-a74499ece799+1a-iommufd_get_info_leak_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: syzbot+cb1e0978f6bf46b83a58@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | Merge branch 'vfio-no-iommu' into iommufd.git for-nextJason Gunthorpe2023-02-033-20/+89
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | Shared branch with VFIO for the no-iommu support. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| | * | | vfio: Support VFIO_NOIOMMU with iommufdJason Gunthorpe2023-02-033-20/+89
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a small amount of emulation to vfio_compat to accept the SET_IOMMU to VFIO_NOIOMMU_IOMMU and have vfio just ignore iommufd if it is working on a no-iommu enabled device. Move the enable_unsafe_noiommu_mode module out of container.c into vfio_main.c so that it is always available even if VFIO_CONTAINER=n. This passes Alex's mini-test: https://github.com/awilliam/tests/blob/master/vfio-noiommu-pci-device-open.c Link: https://lore.kernel.org/r/0-v3-480cd64a16f7+1ad0-iommufd_noiommu_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | Merge branch 'iommu-memory-accounting' of ↵Jason Gunthorpe2023-01-307-63/+69
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joro/iommu intoiommufd/for-next Jason Gunthorpe says: ==================== iommufd follows the same design as KVM and uses memory cgroups to limit the amount of kernel memory a iommufd file descriptor can pin down. The various internal data structures already use GFP_KERNEL_ACCOUNT to charge its own memory. However, one of the biggest consumers of kernel memory is the IOPTEs stored under the iommu_domain and these allocations are not tracked. This series is the first step in fixing it. The iommu driver contract already includes a 'gfp' argument to the map_pages op, allowing iommufd to specify GFP_KERNEL_ACCOUNT and then having the driver allocate the IOPTE tables with that flag will capture a significant amount of the allocations. Update the iommu_map() API to pass in the GFP argument, and fix all call sites. Replace iommu_map_atomic(). Audit the "enterprise" iommu drivers to make sure they do the right thing. Intel and S390 ignore the GFP argument and always use GFP_ATOMIC. This is problematic for iommufd anyhow, so fix it. AMD and ARM SMMUv2/3 are already correct. A follow up series will be needed to capture the allocations made when the iommu_domain itself is allocated, which will complete the job. ==================== * 'iommu-memory-accounting' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/s390: Use GFP_KERNEL in sleepable contexts iommu/s390: Push the gfp parameter to the kmem_cache_alloc()'s iommu/intel: Use GFP_KERNEL in sleepable contexts iommu/intel: Support the gfp argument to the map_pages op iommu/intel: Add a gfp parameter to alloc_pgtable_page() iommufd: Use GFP_KERNEL_ACCOUNT for iommu_map() iommu/dma: Use the gfp parameter in __iommu_dma_alloc_noncontiguous() iommu: Add a gfp parameter to iommu_map_sg() iommu: Remove iommu_map_atomic() iommu: Add a gfp parameter to iommu_map() Link: https://lore.kernel.org/linux-iommu/0-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | iommufd: Add three missing structures in ucmd_bufferYi Liu2023-01-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct iommu_ioas_copy, struct iommu_option and struct iommu_vfio_ioas are missed in ucmd_buffer. Although they are smaller than the size of ucmd_buffer, it is safer to list them in ucmd_buffer explicitly. Fixes: aad37e71d5c4 ("iommufd: IOCTLs for the io_pagetable") Fixes: d624d6652a65 ("iommufd: vfio container FD ioctl compatibility") Link: https://lore.kernel.org/r/20230120122040.280219-1-yi.l.liu@intel.com Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | iommu: Remove IOMMU_CAP_INTR_REMAPJason Gunthorpe2023-01-111-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No iommu driver implements this any more, get rid of it. Link: https://lore.kernel.org/r/9-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | irq/s390: Add arch_is_isolated_msi() for s390Jason Gunthorpe2023-01-111-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | s390 doesn't use irq_domains, so it has no place to set IRQ_DOMAIN_FLAG_ISOLATED_MSI. Instead of continuing to abuse the iommu subsystem to convey this information add a simple define which s390 can make statically true. The define will cause msi_device_has_isolated() to return true. Remove IOMMU_CAP_INTR_REMAP from the s390 iommu driver. Link: https://lore.kernel.org/r/8-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSIJason Gunthorpe2023-01-113-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x86 platforms when the HW can support interrupt remapping the iommu driver creates an irq_domain for the IR hardware and creates a child MSI irq_domain. When the global irq_remapping_enabled is set, the IR MSI domain is assigned to the PCI devices (by intel_irq_remap_add_device(), or amd_iommu_set_pci_msi_domain()) making those devices have the isolated MSI property. Due to how interrupt domains work, setting IRQ_DOMAIN_FLAG_ISOLATED_MSI on the parent IR domain will cause all struct devices attached to it to return true from msi_device_has_isolated_msi(). This replaces the IOMMU_CAP_INTR_REMAP flag as all places using IOMMU_CAP_INTR_REMAP also call msi_device_has_isolated_msi() Set the flag and delete the cap. Link: https://lore.kernel.org/r/7-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | iommufd: Convert to msi_device_has_isolated_msi()Jason Gunthorpe2023-01-111-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trivially use the new API. Link: https://lore.kernel.org/r/4-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | | iommu: Add iommu_group_has_isolated_msi()Jason Gunthorpe2023-01-111-0/+26
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compute the isolated_msi over all the devices in the IOMMU group because iommufd and vfio both need to know that the entire group is isolated before granting access to it. Link: https://lore.kernel.org/r/2-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | | | Merge tag 'iommu-updates-v6.3' of ↵Linus Torvalds2023-02-2433-542/+2192
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Consolidate iommu_map/unmap functions. There have been blocking and atomic variants so far, but that was problematic as this approach does not scale with required new variants which just differ in the GFP flags used. So Jason consolidated this back into single functions that take a GFP parameter. - Retire the detach_dev() call-back in iommu_ops - Arm SMMU updates from Will: - Device-tree binding updates: - Cater for three power domains on SM6375 - Document existing compatible strings for Qualcomm SoCs - Tighten up clocks description for platform-specific compatible strings - Enable Qualcomm workarounds for some additional platforms that need them - Intel VT-d updates from Lu Baolu: - Add Intel IOMMU performance monitoring support - Set No Execute Enable bit in PASID table entry - Two performance optimizations - Fix PASID directory pointer coherency - Fix missed rollbacks in error path - Cleanups - Apple t8110 DART support - Exynos IOMMU: - Implement better fault handling - Error handling fixes - Renesas IPMMU: - Add device tree bindings for r8a779g0 - AMD IOMMU: - Various fixes for handling on SNP-enabled systems and handling of faults with unknown request-ids - Cleanups and other small fixes - Various other smaller fixes and cleanups * tag 'iommu-updates-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (71 commits) iommu/amd: Skip attach device domain is same as new domain iommu: Attach device group to old domain in error path iommu/vt-d: Allow to use flush-queue when first level is default iommu/vt-d: Fix PASID directory pointer coherency iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode iommu/vt-d: Fix error handling in sva enable/disable paths iommu/amd: Improve page fault error reporting iommu/amd: Do not identity map v2 capable device when snp is enabled iommu: Fix error unwind in iommu_group_alloc() iommu/of: mark an unused function as __maybe_unused iommu: dart: DART_T8110_ERROR range should be 0 to 5 iommu/vt-d: Enable IOMMU perfmon support iommu/vt-d: Add IOMMU perfmon overflow handler support iommu/vt-d: Support cpumask for IOMMU perfmon iommu/vt-d: Add IOMMU perfmon support iommu/vt-d: Support Enhanced Command Interface iommu/vt-d: Retrieve IOMMU perfmon capability information iommu/vt-d: Support size of the register set in DRHD iommu/vt-d: Set No Execute Enable bit in PASID table entry iommu/vt-d: Remove sva from intel_svm_dev ...
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| | \ \ \
| *-----------. \ \ \ Merge branches 'apple/dart', 'arm/exynos', 'arm/renesas', 'arm/smmu', ↵Joerg Roedel2023-02-1833-542/+2192
| |\ \ \ \ \ \ \ \ \ \ | | | | | |_|_|_|/ / / | | | | |/| | | | | / | | |_|_|_|_|_|_|_|/ | |/| | | | | | | | 'x86/vt-d', 'x86/amd' and 'core' into next
| | | | | | | | * | iommu: Attach device group to old domain in error pathVasant Hegde2023-02-181-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iommu_attach_group() attaches all devices in a group to domain and then sets group domain (group->domain). Current code (__iommu_attach_group()) does not handle error path. This creates problem as devices to domain attachment is in inconsistent state. Flow: - During boot iommu attach devices to default domain - Later some device driver (like amd/iommu_v2 or vfio) tries to attach device to new domain. - In iommu_attach_group() path we detach device from current domain. Then it tries to attach devices to new domain. - If it fails to attach device to new domain then device to domain link is broken. - iommu_attach_group() returns error. - At this stage iommu_attach_group() caller thinks, attaching device to new domain failed and devices are still attached to old domain. - But in reality device to old domain link is broken. It will result in all sort of failures (like IO page fault) later. To recover from this situation, we need to attach all devices back to the old domain. Also log warning if it fails attach device back to old domain. Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Reported-by: Matt Fagnani <matt.fagnani@bell.net> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Matt Fagnani <matt.fagnani@bell.net> Link: https://lore.kernel.org/r/20230215052642.6016-1-vasant.hegde@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865 Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/ Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * | iommu: Fix error unwind in iommu_group_alloc()Jason Gunthorpe2023-02-161-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If either iommu_group_grate_file() fails then the iommu_group is leaked. Destroy it on these error paths. Found by kselftest/iommu/iommufd_fail_nth Fixes: bc7d12b91bd3 ("iommu: Implement reserved_regions iommu-group sysfs file") Fixes: c52c72d3dee8 ("iommu: Add sysfs attribyte for domain type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/0-v1-8f616bee028d+8b-iommu_group_alloc_leak_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * | iommu/of: mark an unused function as __maybe_unusedRandy Dunlap2023-02-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_OF_ADDRESS is not set, there is a build warning/error about an unused function. Annotate the function to quieten the warning/error. ../drivers/iommu/of_iommu.c:176:29: warning: 'iommu_resv_region_get_type' defined but not used [-Wunused-function] 176 | static enum iommu_resv_type iommu_resv_region_get_type(struct device *dev, struct resource *phys, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: a5bf3cfce8cb ("iommu: Implement of_iommu_get_resv_regions()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Will Deacon <will@kernel.org> Cc: iommu@lists.linux.dev Reviewed-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230209010359.23831-1-rdunlap@infradead.org [joro: Improve code formatting] Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * | iommu/exynos: Add missing set_platform_dma_ops callbackMarek Szyprowski2023-02-031-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add set_platform_dma_ops() required for proper driver operation on ARM 32bit arch after recent changes in the IOMMU framework (detach ops removal). Fixes: c1fe9119ee70 ("iommu: Add set_platform_dma_ops callbacks") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20230123093102.12392-1-m.szyprowski@samsung.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * | Merge branch 'iommu-memory-accounting' into coreJoerg Roedel2023-01-257-63/+69
| | | | | | | | |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge patch-set from Jason: "Let iommufd charge IOPTE allocations to the memory cgroup" Description: IOMMUFD follows the same design as KVM and uses memory cgroups to limit the amount of kernel memory a iommufd file descriptor can pin down. The various internal data structures already use GFP_KERNEL_ACCOUNT to charge its own memory. However, one of the biggest consumers of kernel memory is the IOPTEs stored under the iommu_domain and these allocations are not tracked. This series is the first step in fixing it. The iommu driver contract already includes a 'gfp' argument to the map_pages op, allowing iommufd to specify GFP_KERNEL_ACCOUNT and then having the driver allocate the IOPTE tables with that flag will capture a significant amount of the allocations. Update the iommu_map() API to pass in the GFP argument, and fix all call sites. Replace iommu_map_atomic(). Audit the "enterprise" iommu drivers to make sure they do the right thing. Intel and S390 ignore the GFP argument and always use GFP_ATOMIC. This is problematic for iommufd anyhow, so fix it. AMD and ARM SMMUv2/3 are already correct. A follow up series will be needed to capture the allocations made when the iommu_domain itself is allocated, which will complete the job. Link: https://lore.kernel.org/linux-iommu/0-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com/
| | | | | | | | | * iommu/s390: Use GFP_KERNEL in sleepable contextsJason Gunthorpe2023-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These contexts are sleepable, so use the proper annotation. The GFP_ATOMIC was added mechanically in the prior patches. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/s390: Push the gfp parameter to the kmem_cache_alloc()'sJason Gunthorpe2023-01-251-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dma_alloc_cpu_table() and dma_alloc_page_table() are eventually called by iommufd through s390_iommu_map_pages() and it should not be forced to atomic. Thread the gfp parameter through the call chain starting from s390_iommu_map_pages(). Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/9-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/intel: Use GFP_KERNEL in sleepable contextsJason Gunthorpe2023-01-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These contexts are sleepable, so use the proper annotation. The GFP_ATOMIC was added mechanically in the prior patches. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/intel: Support the gfp argument to the map_pages opJason Gunthorpe2023-01-251-9/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flow it down to alloc_pgtable_page() via pfn_to_dma_pte() and __domain_mapping(). Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/intel: Add a gfp parameter to alloc_pgtable_page()Jason Gunthorpe2023-01-253-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is eventually called by iommufd through intel_iommu_map_pages() and it should not be forced to atomic. Push the GFP_ATOMIC to all callers. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommufd: Use GFP_KERNEL_ACCOUNT for iommu_map()Jason Gunthorpe2023-01-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iommufd follows the same design as KVM and uses memory cgroups to limit the amount of kernel memory a iommufd file descriptor can pin down. The various internal data structures already use GFP_KERNEL_ACCOUNT. However, one of the biggest consumers of kernel memory is the IOPTEs stored under the iommu_domain. Many drivers will allocate these at iommu_map() time and will trivially do the right thing if we pass in GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/dma: Use the gfp parameter in __iommu_dma_alloc_noncontiguous()Jason Gunthorpe2023-01-251-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function does an allocation of a buffer to return to the caller and then goes on to allocate some internal memory, eg the scatterlist and IOPTEs. Instead of hard wiring GFP_KERNEL and a wrong GFP_ATOMIC, continue to use the passed in gfp flags for all of the allocations. Clear the zone and policy bits that are only relevant for the buffer allocation before re-using them for internal allocations. Auditing says this is never called from an atomic context, so the GFP_ATOMIC is the incorrect flag. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu: Add a gfp parameter to iommu_map_sg()Jason Gunthorpe2023-01-252-18/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow the pattern for iommu_map() and remove iommu_map_sg_atomic(). This allows __iommu_dma_alloc_noncontiguous() to use a GFP_KERNEL allocation here, based on the provided gfp flags. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu: Remove iommu_map_atomic()Jason Gunthorpe2023-01-252-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is only one call site and it can now just pass the GFP_ATOMIC to the normal iommu_map(). Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu: Add a gfp parameter to iommu_map()Jason Gunthorpe2023-01-253-14/+16
| | | | | |_|_|_|/ | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The internal mechanisms support this, but instead of exposting the gfp to the caller it wrappers it into iommu_map() and iommu_map_atomic() Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/1-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: dma: Use of_iommu_get_resv_regions()Thierry Reding2023-01-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For device tree nodes, use the standard of_iommu_get_resv_regions() implementation to obtain the reserved memory regions associated with a device. Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: devicetree@vger.kernel.org Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230120174251.4004100-5-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Implement of_iommu_get_resv_regions()Thierry Reding2023-01-251-0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is an implementation that IOMMU drivers can use to obtain reserved memory regions from a device tree node. It uses the reserved-memory DT bindings to find the regions associated with a given device. If these regions are marked accordingly, identity mappings will be created for them in the IOMMU domain that the devices will be attached to. Cc: Frank Rowand <frowand.list@gmail.com> Cc: devicetree@vger.kernel.org Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20230120174251.4004100-4-thierry.reding@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu/fsl_pamu: Fix compile error after adding set_platform_dma_opsJoerg Roedel2023-01-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct initializer for set_platform_dma_ops uses a semicolon as separator where a comma is required. Fix the compile error by using the correct separator. Fixes: c1fe9119ee70 ("iommu: Add set_platform_dma_ops callbacks") Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230113191528.23638-1-joro@8bytes.org
| | | | | | | | * iommu/ipmmu-vmsa: Remove ipmmu_utlb_disable()Joerg Roedel2023-01-131-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function is unused after commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") and so compilation fails with drivers/iommu/ipmmu-vmsa.c:305:13: error: ‘ipmmu_utlb_disable’ defined but not used [-Werror=unused-function] 305 | static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain *domain, | ^~~~~~~~~~~~~~~~~~ Remove the function to fix the compile error. Fixes: 1b932ceddd19 ("iommu: Remove detach_dev callbacks") Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230113185640.8050-1-joro@8bytes.org
| | | | | | | | * iommu: Tidy up io-pgtable dependenciesRobin Murphy2023-01-131-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some io-pgtable implementations, and thus their users too, carry a slightly odd dependency to get around the GENERIC_ATOMIC64 version of cmpxchg64() often failing to compile. Since this is a functional dependency, it's a bit misleading and untidy to tie it explicitly to COMPILE_TEST while assuming that it's also implied by the other platform/architecture options. Make things clearer by separating these functional dependencies into distinct statements from those controlling visibility, and since they do look a bit non-obvious to the uninitiated, also commenting them for good measure. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/51d8c78e2ecc6696ac5907526580209ea6da167f.1673553587.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Remove detach_dev callbackLu Baolu2023-01-132-33/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The detach_dev callback of domain ops is not called in the IOMMU core. Remove this callback to avoid dead code. The trace event for detaching domain from device is removed accordingly. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Remove deferred attach check from __iommu_detach_device()Jason Gunthorpe2023-01-131-34/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the current moment, __iommu_detach_device() is only called via call chains that are after the device driver is attached - eg via explicit attach APIs called by the device driver. Commit bd421264ed30 ("iommu: Fix deferred domain attachment") has removed deferred domain attachment check from __iommu_attach_device() path, so it should just unconditionally work in the __iommu_detach_device() path. It actually looks like a bug that we were blocking detach on these paths since the attach was unconditional and the caller is going to free the (probably) UNAMANGED domain once this returns. The only place we should be testing for deferred attach is during the initial point the dma device is linked to the group, and then again during the dma api calls. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Add set_platform_dma_ops callbacksLu Baolu2023-01-137-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For those IOMMU drivers that don't provide default domain support, add an implementation of set_platform_dma_ops callback so that the IOMMU core could return the DMA control to platform DMA ops. At the same time, with the set_platform_dma_ops implemented, there is no need for detach_dev. Remove it to avoid dead code. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Add set_platform_dma_ops iommu opsLu Baolu2023-01-131-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When VFIO finishes assigning a device to user space and calls iommu_group_release_dma_owner() to return the device to kernel, the IOMMU core will attach the default domain to the device. Unfortunately, some IOMMU drivers don't support default domain, hence in the end, the core calls .detach_dev instead. This adds set_platform_dma_ops iommu ops to make it clear that what it does is returning control back to the platform DMA ops. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Remove detach_dev callbacksLu Baolu2023-01-139-117/+0
| | | | | |_|_|/ | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The iommu core calls the driver's detach_dev domain op callback only when a device is finished assigning to user space and iommu_group_release_dma_owner() is called to return the device to the kernel, where iommu core wants to set the default domain to the device but the driver didn't provide one. In other words, if any iommu driver provides default domain support, the .detach_dev callback will never be called. This removes the detach_dev callbacks in those IOMMU drivers that support default domain. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Sven Peter <sven@svenpeter.dev> # apple-dart Acked-by: Chunyan Zhang <zhang.lyra@gmail.com> # sprd Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> # amd Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230110025408.667767-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Skip attach device domain is same as new domainVasant Hegde2023-02-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If device->domain is same as new domain then we can skip the device attach process. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230215052642.6016-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Improve page fault error reportingVasant Hegde2023-02-161-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If IOMMU domain for device group is not setup properly then we may hit IOMMU page fault. Current page fault handler assumes that domain is always setup and it will hit NULL pointer derefence (see below sample log). Lets check whether domain is setup or not and log appropriate message. Sample log: ---------- amdgpu 0000:00:01.0: amdgpu: SE 1, SH per SE 1, CU per SH 8, active_cu_number 6 BUG: kernel NULL pointer dereference, address: 0000000000000058 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 56 Comm: irq/24-AMD-Vi Not tainted 6.2.0-rc2+ #89 Hardware name: xxx RIP: 0010:report_iommu_fault+0x11/0x90 [...] Call Trace: <TASK> amd_iommu_int_thread+0x60c/0x760 ? __pfx_irq_thread_fn+0x10/0x10 irq_thread_fn+0x1f/0x60 irq_thread+0xea/0x1a0 ? preempt_count_add+0x6a/0xa0 ? __pfx_irq_thread_dtor+0x10/0x10 ? __pfx_irq_thread+0x10/0x10 kthread+0xe9/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> Reported-by: Matt Fagnani <matt.fagnani@bell.net> Suggested-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216865 Link: https://lore.kernel.org/lkml/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/ Link: https://lore.kernel.org/r/20230215052642.6016-3-vasant.hegde@amd.com Cc: stable@vger.kernel.org [joro: Edit commit message] Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Do not identity map v2 capable device when snp is enabledVasant Hegde2023-02-161-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flow: - Booted system with SNP enabled, memory encryption off and IOMMU DMA translation mode - AMD driver detects v2 capable device and amd_iommu_def_domain_type() returns identity mode - amd_iommu_domain_alloc() returns NULL an SNP is enabled - System will fail to register device On SNP enabled system, passthrough mode is not supported. IOMMU default domain is set to translation mode. We need to return zero from amd_iommu_def_domain_type() so that it allocates translation domain. Fixes: fb2accadaa94 ("iommu/amd: Introduce function to check and enable SNP") CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230207091752.7656-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameterGavrilov Ilia2023-02-031-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'acpiid' buffer in the parse_ivrs_acpihid function may overflow, because the string specifier in the format string sscanf() has no width limitation. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru> Reviewed-by: Kim Phillips <kim.phillips@amd.com> Link: https://lore.kernel.org/r/20230202082719.1513849-1-Ilia.Gavrilov@infotecs.ru Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Do not clear event/ppr log buffer when snp is enabledTom Lendacky2023-01-201-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code clears event log and ppr log entry after processing it due to hardware errata ([1] erratum #732, #733). We do not have hardware issue on SNP enabled system. When SNP is enabled, the event logs, PPR log and completion wait buffer are read-only to the host (see SNP FW ABI spec [2]). Clearing those entry will result in a kernel #PF for an RMP violation. Hence do not clear event and ppr log entry after processing it. [1] http://developer.amd.com/wordpress/media/2012/10/48931_15h_Mod_10h-1Fh_Rev_Guide.pdf [2] https://www.amd.com/system/files/TechDocs/56860.pdf Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20230117044038.5728-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Fix error handling for pdev_pri_ats_enable()Vasant Hegde2023-01-131-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code throws kernel warning if it fails to enable pasid/pri [1]. Do not call pci_disable_[pasid/pri] if pci_enable_[pasid/pri] failed. [1] https://lore.kernel.org/linux-iommu/15d0f9ff-2a56-b3e9-5b45-e6b23300ae3b@leemhuis.info/ Reported-by: Matt Fagnani <matt.fagnani@bell.net> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20230111121503.5931-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/amd: Do not allocate io_pgtable_ops for passthrough domainVasant Hegde2023-01-131-0/+4
| | | | | |_|/ | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In passthrough mode we do not use IOMMU page table. Hence we don't need to allocate io_pgtable_ops. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20230105091728.42469-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * iommu/vt-d: Allow to use flush-queue when first level is defaultTina Zhang2023-02-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 29b32839725f ("iommu/vt-d: Do not use flush-queue when caching-mode is on") forced default domains to be strict mode as long as IOMMU caching-mode is flagged. The reason for doing this is that when vIOMMU uses VT-d caching mode to synchronize shadowing page tables, the strict mode shows better performance. However, this optimization is orthogonal to the first-level page table because the Intel VT-d architecture does not define the caching mode of the first-level page table. Refer to VT-d spec, section 6.1, "When the CM field is reported as Set, any software updates to remapping structures other than first-stage mapping (including updates to not- present entries or present entries whose programming resulted in translation faults) requires explicit invalidation of the caches." Exclude the first-level page table from this optimization. Generally using first-stage translation in vIOMMU implies nested translation enabled in the physical IOMMU. In this case the first-stage page table is wholly captured by the guest. The vIOMMU only needs to transfer the cache invalidations on vIOMMU to the physical IOMMU. Forcing the default domain to strict mode will cause more frequent cache invalidations, resulting in performance degradation. In a real performance benchmark test measured by iperf receive, the performance result on Sapphire Rapids 100Gb NIC shows: w/ this fix ~51 Gbits/s, w/o this fix ~39.3 Gbits/s. Theoretically a first-stage IOMMU page table can still be shadowed in absence of the caching mode, e.g. with host write-protecting guest IOMMU page table to synchronize changed PTEs with the physical IOMMU page table. In this case the shadowing overhead is decoupled from emulating IOTLB invalidation then the overhead of the latter part is solely decided by the frequency of IOTLB invalidations. Hence allowing guest default dma domain to be lazy can also benefit the overall performance by reducing the total VM-exit numbers. Fixes: 29b32839725f ("iommu/vt-d: Do not use flush-queue when caching-mode is on") Reported-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20230214025618.2292889-1-tina.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>