path: root/drivers/iommu
Commit message  [Author, Age, Files, Lines]
* iommu: Fix wrong freeing of iommu_device->dev  [Joerg Roedel, 2017-08-15, 3 files, -14/+26]

    The struct iommu_device has a 'struct device' embedded into it, not
    as a pointer, but the whole struct. In the conversion of the iommu
    drivers to use struct iommu_device it was forgotten that the release
    function for that struct device simply calls kfree() on the pointer.

    This frees memory that was never allocated and causes memory
    corruption.

    To fix this issue, use a pointer to struct device instead of
    embedding the whole struct. This needs some updates in the iommu
    sysfs code as well as the Intel VT-d and AMD IOMMU driver.

    Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
    Fixes: 39ab9555c241 ('iommu: Add sysfs bindings for struct iommu_device')
    Cc: stable@vger.kernel.org # >= v4.11
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
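The ownership problem described above can be modelled in userspace. The following is a minimal sketch, not the kernel code: `struct device` here is a stand-in type, `iommu_device_model` and its helpers are hypothetical names, and `free()` plays the role of `kfree()`. It shows the fixed layout, where the container holds a pointer to a separately allocated device so that the release callback frees exactly what was allocated.

```c
#include <stddef.h>
#include <stdlib.h>

/* Counts how often the release callback ran (for illustration only). */
static int release_calls;

/* Stand-in for the kernel's struct device; only the release hook is modelled. */
struct device {
    void (*release)(struct device *dev);
};

/* Release callback in the style the commit describes: it frees the pointer. */
static void device_release(struct device *dev)
{
    release_calls++;
    free(dev); /* only valid because dev was allocated on its own */
}

/* Hypothetical container: holds a pointer, not an embedded struct device. */
struct iommu_device_model {
    struct device *dev;
};

static int iommu_device_model_init(struct iommu_device_model *iommu)
{
    iommu->dev = calloc(1, sizeof(*iommu->dev));
    if (!iommu->dev)
        return -1;
    iommu->dev->release = device_release;
    return 0;
}

/* Drops the device; the callback frees a pointer that malloc really returned. */
static void iommu_device_model_put(struct iommu_device_model *iommu)
{
    struct device *dev = iommu->dev;

    iommu->dev = NULL;
    if (dev)
        dev->release(dev);
}
```

With an embedded `struct device`, the same callback would hand `free()` an address inside the container rather than the start of an allocation, which is the corruption the commit fixes.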
* iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device  [Artem Savkov, 2017-08-11, 1 file, -0/+7]

    Commit c54451a "iommu/arm-smmu: Fix the error path in
    arm_smmu_add_device" removed the fwspec assignment in the
    legacy_binding path as redundant, which is wrong. It needs to be
    updated after fwspec initialisation in
    arm_smmu_register_legacy_master() as it is dereferenced later.
    Without this there is a NULL-pointer dereference panic during boot
    on some hosts.

    Signed-off-by: Artem Savkov <asavkov@redhat.com>
    Reviewed-by: Robin Murphy <robin.murphy@arm.com>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/amd: Fix schedule-while-atomic BUG in initialization code  [Joerg Roedel, 2017-07-26, 1 file, -1/+1]

    The register_syscore_ops() function takes a mutex and might sleep.
    In the IOMMU initialization code it is invoked during irq-remapping
    setup already, where irqs are disabled.

    This causes a schedule-while-atomic bug:

     BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
     in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
     no locks held by swapper/0/1.
     irq event stamp: 304
     hardirqs last enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60
     hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196
     softirqs last enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec
     softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120
     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1
     Hardware name: PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014
     Call Trace:
      dump_stack+0x85/0xca
      ___might_sleep+0x22a/0x260
      __might_sleep+0x4a/0x80
      __mutex_lock+0x58/0x960
      ? iommu_completion_wait.part.17+0xb5/0x160
      ? register_syscore_ops+0x1d/0x70
      ? iommu_flush_all_caches+0x120/0x150
      mutex_lock_nested+0x1b/0x20
      register_syscore_ops+0x1d/0x70
      state_next+0x119/0x910
      iommu_go_to_state+0x29/0x30
      amd_iommu_enable+0x13/0x23

    Fix it by moving the register_syscore_ops() call to the next
    initialization step, which runs with irqs enabled.

    Reported-by: Artem Savkov <asavkov@redhat.com>
    Tested-by: Artem Savkov <asavkov@redhat.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Fixes: 2c0ae1720c09 ('iommu/amd: Convert iommu initialization to state machine')
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/amd: Enable ga_log_intr when enabling guest_mode  [Suravee Suthikulpanit, 2017-07-25, 1 file, -0/+1]

    The IRTE[GALogIntr] bit should be set when enabling guest_mode. This
    enables the IOMMU to generate an entry in the GALog when
    IRTE[IsRun] is not set, and to send an interrupt to notify the
    IOMMU driver.

    Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
    Cc: Joerg Roedel <jroedel@suse.de>
    Cc: stable@vger.kernel.org # v4.9+
    Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode by default')
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/io-pgtable: Sanitise map/unmap addresses  [Robin Murphy, 2017-07-20, 2 files, -0/+13]

    It may be an egregious error to attempt to use addresses outside the
    range of the pagetable format, but that still doesn't mean we should
    merrily wreak havoc by silently mapping/unmapping whatever truncated
    portions of them might happen to correspond to real addresses.

    Add some up-front checks to sanitise our inputs so that buggy
    callers don't invite potential memory corruption.

    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
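An up-front range check of the kind described above might look like the following userspace sketch. The struct and function names are hypothetical, and the field names `ias` and `oas` (input and output address sizes in bits) are an assumption about how the pagetable format's limits are expressed; the real checks live in the io-pgtable map/unmap paths.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical config: address widths, in bits, supported by the format. */
struct pgtable_cfg_model {
    unsigned int ias; /* input (IOVA) address size */
    unsigned int oas; /* output (physical) address size */
};

/* True if addr is representable in 'bits' bits. */
static int addr_fits(uint64_t addr, unsigned int bits)
{
    return bits >= 64 || addr < (UINT64_C(1) << bits);
}

/* Reject out-of-range addresses instead of silently truncating them. */
static int map_checked(const struct pgtable_cfg_model *cfg,
                       uint64_t iova, uint64_t paddr)
{
    if (!addr_fits(iova, cfg->ias) || !addr_fits(paddr, cfg->oas))
        return -ERANGE;

    /* A real implementation would install the mapping here. */
    return 0;
}
```

Failing early keeps a buggy caller from mapping whatever the truncated low bits of a bad address happen to name.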
* iommu/arm-smmu: Fix the error path in arm_smmu_add_device  [Vivek Gautam, 2017-07-20, 1 file, -4/+3]

    fwspec->iommu_priv is available only after arm_smmu_master_cfg
    instance has been allocated. We shouldn't free it before that.
    Also it's logical to free the master cfg itself without checking for
    fwspec.

    Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
    [will: remove redundant assignment to fwspec]
    Signed-off-by: Will Deacon <will.deacon@arm.com>
* Revert "iommu/io-pgtable: Avoid redundant TLB syncs"  [Robin Murphy, 2017-07-20, 1 file, -8/+1]

    The tlb_sync_pending flag was necessary for correctness in the
    Mediatek M4U driver, but since it offered a small theoretical
    optimisation for all io-pgtable users it was implemented as a
    high-level thing. However, now that some users may not be using a
    synchronising lock, there are several ways this flag can go wrong
    for them, and at worst it could result in incorrect behaviour.

    Since we've addressed the correctness issue within the Mediatek
    driver itself, and fixing the optimisation aspect to be
    concurrency-safe would be quite a headache (and impose extra
    overhead on every operation for the sake of slightly helping one
    case which will virtually never happen in typical usage), let's just
    retire it.

    This reverts commit 88492a4700360a086e55d8874ad786105a5e8b0f.

    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/mtk: Avoid redundant TLB syncs locally  [Robin Murphy, 2017-07-20, 2 files, -0/+7]

    Under certain circumstances, the io-pgtable code may end up issuing
    two TLB sync operations without any intervening invalidations. This
    goes badly for the M4U hardware, since it means the second sync ends
    up polling for a non-existent operation to finish, and as a result
    times out and warns. The io_pgtable_tlb_* helpers implement a
    high-level optimisation to avoid issuing the second sync at all in
    such cases, but in order to work correctly that requires all
    pagetable operations to be serialised under a lock, thus is no
    longer applicable to all io-pgtable users.

    Since we're the only user actually relying on this flag for
    correctness, let's reimplement it locally to avoid the headache of
    trying to make the high-level version concurrency-safe for other
    users.

    CC: Yong Wu <yong.wu@mediatek.com>
    CC: Matthias Brugger <matthias.bgg@gmail.com>
    Tested-by: Yong Wu <yong.wu@mediatek.com>
    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
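The driver-local flag described above can be illustrated with a small single-threaded model. This is a sketch under assumptions: `m4u_model`, `tlb_flush_active` and the helper names are hypothetical, and the counter merely stands in for the hardware poll that would otherwise time out.

```c
#include <stdbool.h>

/* Hypothetical driver state: one flag plus a counter of real hardware syncs. */
struct m4u_model {
    bool tlb_flush_active;  /* set when an invalidation has been issued */
    unsigned int hw_syncs;  /* how many times the hardware was polled */
};

/* Issuing an invalidation means a later sync has something to wait for. */
static void m4u_model_invalidate(struct m4u_model *m)
{
    m->tlb_flush_active = true;
}

/* Only poll the hardware if an invalidation is actually outstanding. */
static void m4u_model_sync(struct m4u_model *m)
{
    if (!m->tlb_flush_active)
        return; /* nothing pending: polling would time out and warn */

    m->hw_syncs++; /* stands in for polling the hardware for completion */
    m->tlb_flush_active = false;
}
```

Two back-to-back syncs after one invalidation then touch the hardware only once, which is the behaviour the M4U needs for correctness.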
* iommu/arm-smmu: Reintroduce locking around TLB sync operations  [Will Deacon, 2017-07-20, 1 file, -1/+10]

    Commit 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
    removed the locking used to serialise map/unmap calls into the
    io-pgtable code from the ARM SMMU driver. This is good for
    performance, but opens us up to a nasty race with TLB syncs because
    the TLB sync register is shared within a context bank (or even
    globally for stage-2 on SMMUv1).

    There are two cases to consider:

    1. A CPU can be spinning on the completion of a TLB sync, take an
       interrupt which issues a subsequent TLB sync, and then report a
       timeout on return from the interrupt.

    2. A CPU can be spinning on the completion of a TLB sync, but other
       CPUs can continuously issue additional TLB syncs in such a way
       that the backoff logic reports a timeout.

    Rather than fix this by spinning for completion of prior TLB syncs
    before issuing a new one (which may suffer from fairness issues on
    large systems), instead reintroduce locking around TLB sync
    operations in the ARM SMMU driver.

    Fixes: 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
    Cc: Robin Murphy <robin.murphy@arm.com>
    Reported-by: Ray Jui <ray.jui@broadcom.com>
    Tested-by: Ray Jui <ray.jui@broadcom.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
* Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu  [Linus Torvalds, 2017-07-12, 18 files, -527/+1134]
|\

    Pull IOMMU updates from Joerg Roedel:
     "This update comes with:

       - Support for lockless operation in the ARM io-pgtable code.

         This is an important step to solve the scalability problems in
         the common dma-iommu code for ARM

       - Some Errata workarounds for ARM SMMU implementations

       - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU
         driver.

         The code suffered from very high flush rates, with the new
         implementation the flush rate is down to ~1% of what it was
         before

       - Support for amd_iommu=off when booting with kexec.

         The problem here was that the IOMMU driver bailed out early
         without disabling the iommu hardware, if it was enabled in the
         old kernel

       - The Rockchip IOMMU driver is now available on ARM64

       - Align the return value of the iommu_ops->device_group
         call-backs to not miss error values

       - Preempt-disable optimizations in the Intel VT-d and common IOVA
         code to help Linux-RT

       - Various other small cleanups and fixes"

    * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits)
      iommu/vt-d: Constify intel_dma_ops
      iommu: Warn once when device_group callback returns NULL
      iommu/omap: Return ERR_PTR in device_group call-back
      iommu: Return ERR_PTR() values from device_group call-backs
      iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device()
      iommu/vt-d: Don't disable preemption while accessing deferred_flush()
      iommu/iova: Don't disable preempt around this_cpu_ptr()
      iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
      iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701)
      iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74
      ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model
      iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions
      iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
      iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE
      iommu/arm-smmu-v3: Remove io-pgtable spinlock
      iommu/arm-smmu: Remove io-pgtable spinlock
      iommu/io-pgtable-arm-v7s: Support lockless operation
      iommu/io-pgtable-arm: Support lockless operation
      iommu/io-pgtable: Introduce explicit coherency
      iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
      ...
| *-------------. Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next  [Joerg Roedel, 2017-06-28, 18 files, -527/+1134]
| |\ \ \ \ \ \ \ \
| | | | | | | | | * iommu: Warn once when device_group callback returns NULL  [Joerg Roedel, 2017-06-28, 1 file, -0/+3]

    This callback should never return NULL. Print a warning if that
    happens so that we notice and can fix it.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu/omap: Return ERR_PTR in device_group call-back  [Joerg Roedel, 2017-06-28, 1 file, -1/+1]

    Make sure that the device_group callback returns an ERR_PTR instead
    of NULL.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | | * iommu: Return ERR_PTR() values from device_group call-backs  [Joerg Roedel, 2017-06-28, 1 file, -12/+2]

    The generic device_group call-backs in iommu.c return NULL in case
    of error. Since they are getting ERR_PTR values from
    iommu_group_alloc(), just pass them up instead.

    Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
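The error-pointer convention these call-backs rely on can be sketched in userspace. The helpers below are local reimplementations of the idea behind the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), not the kernel headers themselves, and `group_model`, `group_alloc_model()` and `device_group_model()` are hypothetical stand-ins for the group type, iommu_group_alloc() and a device_group call-back.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Local models of the kernel helpers: encode a negative errno in a pointer. */
static void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct group_model {
    int id;
};

/* Stand-in allocator: returns an error pointer, never NULL, on failure. */
static struct group_model *group_alloc_model(int simulate_failure)
{
    struct group_model *group;

    if (simulate_failure)
        return ERR_PTR(-ENOMEM);

    group = malloc(sizeof(*group));
    if (!group)
        return ERR_PTR(-ENOMEM);
    group->id = 1;
    return group;
}

/* Stand-in call-back: passes the error pointer up instead of mapping it to NULL. */
static struct group_model *device_group_model(int simulate_failure)
{
    return group_alloc_model(simulate_failure);
}
```

A caller then tests the result with IS_ERR() and recovers the errno with PTR_ERR(), so the reason for the failure is not lost the way it is with a bare NULL.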
| | | | | | | | | * iommu/iova: Don't disable preempt around this_cpu_ptr()  [Sebastian Andrzej Siewior, 2017-06-28, 1 file, -6/+3]

    Commit 583248e6620a ("iommu/iova: Disable preemption around use of
    this_cpu_ptr()") disables preemption while accessing a per-CPU
    variable. This does keep lockdep quiet. However I don't see why it
    is bad if we get migrated to another CPU after the access.
    __iova_rcache_insert() and __iova_rcache_get() immediately lock the
    variable after obtaining it - before accessing its members.
    _If_ we get migrated away after retrieving the address of cpu_rcache
    before taking the lock then the *other* task on the same CPU will
    retrieve the same address of cpu_rcache and will spin on the lock.

    alloc_iova_fast() disables preemption while invoking
    free_cpu_cached_iovas() on each CPU. The function itself uses
    per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr()
    does). It _could_ make sense to use get_online_cpus() instead but
    then we have a hotplug notifier for CPU down (and none for up) so we
    are good.

    Cc: Joerg Roedel <joro@8bytes.org>
    Cc: iommu@lists.linux-foundation.org
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * | iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device()  [Joerg Roedel, 2017-06-28, 1 file, -10/+5]
| | | | | | | | |/

    The iommu_group_get_for_dev() function also attaches the device to
    its group, so this code doesn't need to be in the iommu driver.

    Further by using this function the driver can make use of default
    domains in the future.

    Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Free already flushed ring-buffer entries before full-check  [Joerg Roedel, 2017-06-22, 1 file, -3/+9]

    To benefit from IOTLB flushes on other CPUs we have to free the
    already flushed IOVAs from the ring-buffer before we do the
    queue_ring_full() check.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Remove amd_iommu_disabled check from amd_iommu_detect()  [Joerg Roedel, 2017-06-22, 1 file, -3/+0]

    This check needs to happen later now, when all previously enabled
    IOMMUs have been disabled.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Free IOMMU resources when disabled on command line  [Joerg Roedel, 2017-06-22, 1 file, -0/+7]

    After we made sure that all IOMMUs have been disabled we need to
    make sure that all resources we allocated are released again.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Set global pointers to NULL after freeing them  [Joerg Roedel, 2017-06-22, 1 file, -0/+5]

    Avoid any attempts to double-free these pointers.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Check for error states first in iommu_go_to_state()  [Joerg Roedel, 2017-06-22, 1 file, -2/+2]

    Check if we are in an error state already before calling into
    state_next().

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Add new init-state IOMMU_CMDLINE_DISABLED  [Joerg Roedel, 2017-06-22, 1 file, -2/+5]

    This will be used when we detect during initialization that the
    iommu should be disabled.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Rename free_on_init_error()  [Joerg Roedel, 2017-06-22, 1 file, -2/+2]

    The function will also be used to free iommu resources when
    amd_iommu=off was specified on the kernel command line. So rename
    the function to reflect that.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Disable IOMMUs at boot if they are enabled  [Joerg Roedel, 2017-06-22, 1 file, -0/+3]

    When booting, make sure the IOMMUs are disabled. They could be
    previously enabled if we boot into a kexec or kdump kernel. So make
    sure they are off.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel  [Joerg Roedel, 2017-06-16, 3 files, -1/+12]

    When booting into a kdump kernel, suppress IO_PAGE_FAULTs by default
    for all devices. But allow the faults again when a domain is
    assigned to a device.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Remove queue_release() function  [Joerg Roedel, 2017-06-08, 1 file, -20/+8]

    We can use queue_ring_free_flushed() instead, so remove this
    redundancy.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Add per-domain timer to flush per-cpu queues  [Joerg Roedel, 2017-06-08, 1 file, -17/+67]

    Add a timer to each dma_ops domain so that we flush unused IOTLB
    entries regularly, even if the queues don't get full all the time.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Add flush counters to struct dma_ops_domain  [Joerg Roedel, 2017-06-08, 1 file, -0/+52]

    The counters are increased every time the TLB for a given domain is
    flushed. We also store the current value of that counter into newly
    added entries of the flush-queue, so that we can tell whether this
    entry is already flushed.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
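The counter-stamping idea can be shown with a deliberately simplified model. The sketch below uses a single counter per domain and hypothetical names (`flush_domain_model`, `flush_entry_model` and the helpers); the commit speaks of counters in the plural, so the real bookkeeping is likely more detailed than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-domain state: incremented on every TLB flush. */
struct flush_domain_model {
    uint64_t flush_cnt;
};

/* Hypothetical queue entry: remembers the counter value at insertion time. */
struct flush_entry_model {
    uint64_t iova_pfn;
    uint64_t stamp;
};

/* Stamp a new entry with the domain's current flush counter. */
static struct flush_entry_model queue_add(const struct flush_domain_model *dom,
                                          uint64_t iova_pfn)
{
    struct flush_entry_model entry = {
        .iova_pfn = iova_pfn,
        .stamp = dom->flush_cnt,
    };

    return entry;
}

/* A flush of the domain's TLB bumps the counter. */
static void domain_flush(struct flush_domain_model *dom)
{
    dom->flush_cnt++;
}

/* An entry is covered once a flush has happened after it was queued. */
static bool entry_flushed(const struct flush_domain_model *dom,
                          const struct flush_entry_model *entry)
{
    return entry->stamp < dom->flush_cnt;
}
```

Because the comparison needs only the stamp and the domain counter, any CPU that triggers a flush makes earlier entries on every queue of that domain eligible for freeing.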
| | | | | | | * | iommu/amd: Add locking to per-domain flush-queue  [Joerg Roedel, 2017-06-08, 1 file, -0/+11]

    With locking we can safely access the flush-queues of other cpus.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Make use of the per-domain flush queue  [Joerg Roedel, 2017-06-08, 1 file, -4/+56]

    Fill the flush-queue on unmap and only flush the IOMMU and device
    TLBs when a per-cpu queue gets full.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Add per-domain flush-queue data structures  [Joerg Roedel, 2017-06-08, 1 file, -0/+69]

    Make the flush-queue per dma-ops domain and add code to allocate and
    free the flush-queues.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Rip out old queue flushing code  [Joerg Roedel, 2017-06-08, 1 file, -137/+6]

    The queue flushing is pretty inefficient when it flushes the queues
    for all cpus at once. Further it flushes all domains from all IOMMUs
    for all CPUs, which is overkill as well.

    Rip it out to make room for something more efficient.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Reduce delay waiting for command buffer space  [Tom Lendacky, 2017-06-08, 1 file, -20/+13]

    Currently if there is no room to add a command to the command
    buffer, the driver performs a "completion wait" which only returns
    when all commands on the queue have been processed. There is no need
    to wait for the entire command queue to be executed before adding
    the next command.

    Update the driver to perform the same udelay() loop that the
    "completion wait" performs, but instead re-read the head pointer to
    determine if sufficient space is available. The very first time it
    is found that there is no space available, the udelay() will be
    skipped to immediately perform the opportunistic read of the head
    pointer. If it is still found that there is not sufficient space,
    then the udelay() will be performed.

    Signed-off-by: Leo Duran <leo.duran@amd.com>
    Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Reduce amount of MMIO when submitting commands  [Tom Lendacky, 2017-06-08, 3 files, -13/+26]

    As newer, higher speed devices are developed, perf data shows that
    the amount of MMIO that is performed when submitting commands to the
    IOMMU causes performance issues.

    Currently, the command submission path reads the command buffer head
    and tail pointers and then writes the tail pointer once the command
    is ready.

    The tail pointer is only ever updated by the driver so it can be
    tracked by the driver without having to read it from the hardware.

    The head pointer is updated by the hardware, but can be read
    opportunistically. Reading the head pointer only when it appears
    that there might not be room in the command buffer and then
    re-checking the available space reduces the number of times the head
    pointer has to be read.

    Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
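The head/tail shadowing described above can be modelled with a small ring in userspace. Everything here is an illustrative assumption: the ring size, the `cmd_ring_model` struct, and the `head_reads` counter that stands in for MMIO reads. Unlike the driver, which waits for space, this sketch simply returns -EBUSY when the ring is full.

```c
#include <errno.h>

#define CMD_RING_ENTRIES 4u

/* Hypothetical model of the command ring and the driver's cached pointers. */
struct cmd_ring_model {
    unsigned int hw_head;    /* advanced by the "hardware" as it consumes */
    unsigned int head;       /* driver's cached copy of the head pointer */
    unsigned int tail;       /* owned by the driver, never read back */
    unsigned int head_reads; /* counts simulated MMIO reads of the head */
};

/* Stands in for an MMIO read of the hardware head pointer. */
static unsigned int read_hw_head(struct cmd_ring_model *ring)
{
    ring->head_reads++;
    return ring->hw_head;
}

/* Queue one command, refreshing the cached head only when the ring looks full. */
static int submit_command(struct cmd_ring_model *ring)
{
    unsigned int next_tail = (ring->tail + 1) % CMD_RING_ENTRIES;

    if (next_tail == ring->head) {
        /* Possibly stale: re-read the real head and check again. */
        ring->head = read_hw_head(ring);
        if (next_tail == ring->head)
            return -EBUSY;
    }

    ring->tail = next_tail; /* the driver would write the tail register here */
    return 0;
}
```

In this model the first three submissions complete without a single head read; the hardware pointer is only consulted once the cached view says the ring is full.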
| | | | | | | * | iommu/amd: Constify irq_domain_ops  [Tobias Klauser, 2017-05-30, 1 file, -1/+1]

    struct irq_domain_ops is not modified, so it can be made const.

    Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | iommu/amd: Ratelimit io-page-faults per device  [Joerg Roedel, 2017-05-30, 1 file, -7/+33]

    Misbehaving devices can cause an endless chain of io-page-faults,
    flooding dmesg and making the system-log unusable or even preventing
    the system from booting.

    So ratelimit the error messages about io-page-faults on a
    per-device basis.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
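Per-device rate limiting of this kind is commonly built on a burst-per-interval window. The following userspace sketch shows that idea only; `ratelimit_model` and `ratelimit_ok()` are hypothetical, time is passed in explicitly as an abstract tick count, and the kernel's own ratelimit helpers are not reproduced here. One such state would be kept per device.

```c
#include <stdint.h>

/* Hypothetical per-device state: allow 'burst' messages per 'interval' ticks. */
struct ratelimit_model {
    uint64_t interval;     /* window length in ticks */
    unsigned int burst;    /* messages allowed per window */
    uint64_t begin;        /* start of the current window */
    unsigned int printed;  /* messages emitted in the current window */
};

/* Returns 1 if a message may be logged at time 'now', 0 if it is suppressed. */
static int ratelimit_ok(struct ratelimit_model *rs, uint64_t now)
{
    if (now - rs->begin >= rs->interval) {
        /* Window expired: start a fresh one. */
        rs->begin = now;
        rs->printed = 0;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return 1;
    }

    return 0;
}
```

Keeping the state per device means one misbehaving device exhausts only its own budget and does not silence fault reports from others.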
| | | | | | * | | iommu/vt-d: Constify intel_dma_ops  [Arvind Yadav, 2017-06-28, 1 file, -1/+1]

    Most dma_map_ops structures are never modified. Constify these
    structures such that these can be write-protected.

    Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/vt-d: Don't disable preemption while accessing deferred_flush()  [Sebastian Andrzej Siewior, 2017-06-28, 1 file, -6/+2]

    get_cpu() disables preemption and returns the current CPU number.
    The CPU number is only used once while retrieving the address of the
    local CPU's deferred_flush pointer.
    We can instead use raw_cpu_ptr() while we remain preemptible. The
    worst thing that can happen is that flush_unmaps_timeout() is
    invoked multiple times: once by taskA after seeing HIGH_WATER_MARK
    and then preempted to another CPU and then by taskB which saw
    HIGH_WATER_MARK on the same CPU as taskA. It is also likely that
    ->size got from HIGH_WATER_MARK to 0 right after its read because
    another CPU invoked flush_unmaps_timeout() for this CPU.
    The access to flush_data is protected by a spinlock so even if we
    get migrated to another CPU or preempted - the data structure is
    protected.

    While at it, I marked deferred_flush static since I can't find a
    reference to it outside of this file.

    Cc: David Woodhouse <dwmw2@infradead.org>
    Cc: Joerg Roedel <joro@8bytes.org>
    Cc: iommu@lists.linux-foundation.org
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/vt-d: Constify irq_domain_ops  [Tobias Klauser, 2017-05-30, 1 file, -2/+2]

    struct irq_domain_ops is not modified, so it can be made const.

    Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/vt-d: Unwrap __get_valid_domain_for_dev()  [Peter Xu, 2017-05-30, 1 file, -14/+2]

    We do find_domain() in __get_valid_domain_for_dev(), while we do the
    same thing in get_valid_domain_for_dev(). No need to do it twice.

    Signed-off-by: Peter Xu <peterx@redhat.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/vt-d: Helper function to query if a pasid has any active users  [CQ Tang, 2017-05-17, 1 file, -0/+30]
| | | | | | |/ /

    A driver would need to know if there are any active references to a
    PASID before cleaning up its resources. This function helps check if
    there are any active users of a PASID before it can perform any
    recovery on that device.

    To: Joerg Roedel <joro@8bytes.org>
    To: linux-kernel@vger.kernel.org
    To: David Woodhouse <dwmw2@infradead.org>
    Cc: Jean-Phillipe Brucker <jean-philippe.brucker@arm.com>
    Cc: iommu@lists.linux-foundation.org
    Signed-off-by: CQ Tang <cq.tang@intel.com>
    Signed-off-by: Ashok Raj <ashok.raj@intel.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | * / / iommu/iova: Sort out rbtree limit_pfn handling  [Robin Murphy, 2017-05-17, 2 files, -13/+10]
| | | | | |/ /

    When walking the rbtree, the fact that iovad->start_pfn and
    limit_pfn are both inclusive limits creates an ambiguity once
    limit_pfn reaches the bottom of the address space and they overlap.
    Commit 5016bdb796b3 ("iommu/iova: Fix underflow bug in
    __alloc_and_insert_iova_range") fixed the worst side-effect of this,
    that of underflow wraparound leading to bogus allocations, but the
    remaining fallout is that any attempt to allocate start_pfn itself
    erroneously fails.

    The cleanest way to resolve the ambiguity is to simply make
    limit_pfn an exclusive limit when inside the guts of the rbtree.
    Since we're working with PFNs, representing one past the top of the
    address space is always possible without fear of overflow, and
    elsewhere it just makes life a little more straightforward.

    Reported-by: Aaron Sierra <asierra@xes-inc.com>
    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
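The benefit of an exclusive upper limit can be seen in a stripped-down model with no rbtree at all. `alloc_top_down()` below is a hypothetical helper that only picks the highest range of `size` PFNs beneath an exclusive limit; it is meant to show why allocating start_pfn itself becomes unambiguous, not to mirror the real allocator.

```c
#include <stdint.h>

/*
 * Pick the highest range of 'size' PFNs below limit_pfn_excl (exclusive)
 * that does not fall below start_pfn (inclusive).
 * Returns 0 and stores the first PFN in *pfn_lo, or -1 if nothing fits.
 */
static int alloc_top_down(uint64_t start_pfn, uint64_t limit_pfn_excl,
                          uint64_t size, uint64_t *pfn_lo)
{
    if (size == 0 || limit_pfn_excl < size)
        return -1; /* would underflow below PFN 0 */

    if (limit_pfn_excl - size < start_pfn)
        return -1; /* range would start below the domain's first PFN */

    *pfn_lo = limit_pfn_excl - size;
    return 0;
}
```

A caller holding an inclusive limit passes `limit_pfn + 1`. A request for one PFN with the inclusive limit equal to start_pfn then succeeds and returns start_pfn, and a limit that has walked down to start_pfn means "no room" rather than "maybe one PFN left".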
| | | | * | | Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu  [Joerg Roedel, 2017-06-28, 6 files, -228/+445]
| | | | |\ \ \
| | | | | |_|/
| | | | |/| |
| | | | | * | iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126  [Geetha Sowjanya, 2017-06-23, 1 file, -24/+70]

    Cavium ThunderX2 SMMU doesn't support MSI and also doesn't have
    unique irq lines for gerror, eventq and cmdq-sync.

    A new named irq "combined" is set as an erratum workaround, which
    allows the irq line to be shared by registering a single irq handler
    for all the interrupts.

    Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    Signed-off-by: Geetha sowjanya <gakula@caviumnetworks.com>
    [will: reworked irq equality checking and added SPI check]
    Signed-off-by: Will Deacon <will.deacon@arm.com>
| | | | | * | iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701)  [shameer, 2017-06-23, 1 file, -1/+11]

    HiSilicon SMMUv3 on Hip06/Hip07 platforms doesn't support
    CMD_PREFETCH command. The dt based support for this quirk is already
    present in the driver(hisilicon,broken-prefetch-cmd). This adds ACPI
    support for the quirk using the IORT smmu model number.

    Signed-off-by: shameer <shameerali.kolothum.thodi@huawei.com>
    Signed-off-by: hanjun <guohanjun@huawei.com>
    [will: rewrote patch]
    Signed-off-by: Will Deacon <will.deacon@arm.com>
| | | | | * | iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74  Linu Cherian  2017-06-23  1  -18/+50
| | | | | | |
| | | | | | |   Cavium ThunderX2 SMMU implementation doesn't support page 1 register
| | | | | | |   space and PAGE0_REGS_ONLY option is enabled as an erratum workaround.
| | | | | | |   This option, when turned on, replaces all page 1 offsets used for
| | | | | | |   EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets.
| | | | | | |
| | | | | | |   SMMU resource size checks are now based on SMMU option
| | | | | | |   PAGE0_REGS_ONLY, since resource size can be either 64k/128k. For
| | | | | | |   this, arm_smmu_device_dt_probe/acpi_probe has been moved before
| | | | | | |   platform_get_resource call, so that SMMU options are set beforehand.
| | | | | | |
| | | | | | |   Signed-off-by: Linu Cherian <linu.cherian@cavium.com>
| | | | | | |   Signed-off-by: Geetha Sowjanya <geethasowjanya.akula@cavium.com>
| | | | | | |   Signed-off-by: Will Deacon <will.deacon@arm.com>
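The offset replacement described in the message above could look roughly like the following. This is a sketch under stated assumptions, not the driver's code: the struct, the option bit, and both function names are hypothetical, and the only facts taken from the message are that page 1 offsets are replaced by page 0 offsets and that the resource is either 64k or 128k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_64K              0x10000u
#define OPT_PAGE0_REGS_ONLY (1u << 0)   /* hypothetical option bit */

/* Hypothetical stand-in for the per-device state that holds the options. */
struct smmu_sketch {
    uint32_t options;
};

/* Fold a page 1 register offset back into page 0 when the quirk is set. */
static uint32_t page1_fixup(uint32_t offset, const struct smmu_sketch *smmu)
{
    assert(smmu != NULL);

    if ((smmu->options & OPT_PAGE0_REGS_ONLY) && offset >= SZ_64K)
        offset -= SZ_64K;
    return offset;
}

/* Register space to expect: one 64k page with the quirk, two without. */
static uint32_t expected_resource_size(const struct smmu_sketch *smmu)
{
    assert(smmu != NULL);

    return (smmu->options & OPT_PAGE0_REGS_ONLY) ? SZ_64K : 2 * SZ_64K;
}
```

This also shows why the options have to be known before the resource size check: expected_resource_size() depends on the option bit, so the probe step that sets it must run first.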
| | | | | * | iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions  Robert Richter  2017-06-23  1  -0/+5
| | | | | | |
| | | | | | |   The model number is already defined in acpica and we are actually
| | | | | | |   waiting for the acpi maintainers to include it:
| | | | | | |
| | | | | | |    https://github.com/acpica/acpica/commit/d00a4eb86e64
| | | | | | |
| | | | | | |   Adding those temporary definitions until the change makes it into
| | | | | | |   include/acpi/actbl2.h. Once that is done this patch can be reverted.
| | | | | | |
| | | | | | |   Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
| | | | | | |   Signed-off-by: Robert Richter <rrichter@cavium.com>
| | | | | | |   Signed-off-by: Will Deacon <will.deacon@arm.com>
| | | | | * | iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table  Will Deacon  2017-06-23  2  -4/+12
| | | | | | |
| | | | | | |   When writing a new table entry, we must ensure that the contents of
| | | | | | |   the table are made visible to the SMMU page table walker before the
| | | | | | |   updated table entry itself.
| | | | | | |
| | | | | | |   This is currently achieved using wmb(), which expands to an expensive
| | | | | | |   and unnecessary DSB instruction. Ideally, we'd just use
| | | | | | |   cmpxchg64_release when writing the table entry, but this doesn't have
| | | | | | |   memory ordering semantics on !SMP systems.
| | | | | | |
| | | | | | |   Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this
| | | | | | |   does more than we require (since it targets the outer-shareable
| | | | | | |   domain), but it's likely to be significantly faster than the DSB
| | | | | | |   approach.
| | | | | | |
| | | | | | |   Reported-by: Linu Cherian <linu.cherian@cavium.com>
| | | | | | |   Suggested-by: Robin Murphy <robin.murphy@arm.com>
| | | | | | |   Signed-off-by: Will Deacon <will.deacon@arm.com>
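The ordering requirement in the message above (table contents visible before the entry that points at them) can be illustrated in userspace with C11 atomics. This is an analogy only, assuming CPU readers rather than an SMMU table walker: dma_wmb() is a kernel primitive and is not used here, a release fence stands in for it, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_TABLE 512

typedef struct {
    uint64_t ptes[PTES_PER_TABLE];
} table_t;

/*
 * Publish an already-initialised table into an empty slot.
 * Returns the table that ends up installed: ours if we won the race,
 * otherwise the one another caller installed first.
 */
static table_t *install_table(_Atomic(table_t *) *slot, table_t *new_table)
{
    table_t *expected = NULL;

    assert(new_table != NULL);

    /* Order every earlier write to *new_table before the publishing store. */
    atomic_thread_fence(memory_order_release);

    /* Acquire ordering lets the caller safely read a competing winner's table. */
    if (atomic_compare_exchange_strong_explicit(slot, &expected, new_table,
                                                memory_order_acquire,
                                                memory_order_acquire))
        return new_table;

    return expected;
}
```

The design point mirrors the message: the barrier only has to order the two writes against each other, so a lighter ordering primitive is enough and a full synchronising barrier is not required for correctness.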
| | | | | * | iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE  Will Deacon  2017-06-23  1  -1/+1
| | | | | | |
| | | | | | |   The LPAE/ARMv8 page table format relies on the ability to read and
| | | | | | |   write 64-bit page table entries in an atomic fashion. With the move to
| | | | | | |   a lockless implementation, we also need support for cmpxchg64 to
| | | | | | |   resolve races when installing table entries concurrently.
| | | | | | |
| | | | | | |   Unfortunately, not all architectures support cmpxchg64, so the code
| | | | | | |   can fail to compile when building for these architectures using
| | | | | | |   COMPILE_TEST. Rather than disable COMPILE_TEST altogether, instead
| | | | | | |   check that GENERIC_ATOMIC64 is not selected, which is a reasonable
| | | | | | |   indication that the architecture has support for 64-bit cmpxchg.
| | | | | | |
| | | | | | |   Reported-by: kbuild test robot <fengguang.wu@intel.com>
| | | | | | |   Signed-off-by: Will Deacon <will.deacon@arm.com>
| | | | | * | iommu/arm-smmu-v3: Remove io-pgtable spinlock  Robin Murphy  2017-06-23  1  -27/+6
| | | | | | |
| | | | | | |   As for SMMUv2, take advantage of io-pgtable's newfound tolerance for
| | | | | | |   concurrency. Unfortunately in this case the command queue lock remains
| | | | | | |   a point of serialisation for the unmap path, but there may be a little
| | | | | | |   more we can do to ameliorate that in future.
| | | | | | |
| | | | | | |   Signed-off-by: Robin Murphy <robin.murphy@arm.com>
| | | | | | |   Signed-off-by: Will Deacon <will.deacon@arm.com>