path: root/drivers/iommu
Commit message | Author | Age | Files | Lines
* Add braces to avoid "ambiguous ‘else’" compiler warnings | Linus Torvalds | 2016-09-24 | 2 files, -2/+4
    commit 194dc870a5890e855ecffb30f3b80ba7c88f96d6 upstream. Some of our "for_each_xyz()" macro constructs make gcc unhappy about lack of braces around if-statements inside or outside the loop, because the loop construct itself has an "if-then-else" statement inside of it. The resulting warnings look something like this:
        drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
        drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
           if (ctx != dev_priv->kernel_context)
           ^
    even if the code itself is fine. Since the warning is fairly easy to avoid by adding braces around the if-statement near the for_each_xyz() construct, do so, rather than disabling the otherwise potentially useful warning. (The if-then-else statements used in the "for_each_xyz()" constructs are designed to be inherently safe even with no braces, but in this case it's quite understandable that gcc isn't really able to tell that). This finally leaves the standard "allmodconfig" build with just a handful of remaining warnings, so new and valid warnings hopefully will stand out. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
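    A minimal sketch of the pattern being silenced, with hypothetical names (this is not the actual i915 or VT-d code): a for_each-style macro that hides an "if () {} else" expands next to a braceless if, and the added braces make the binding explicit.

        /* Hypothetical for_each-style helper hiding an "if () {} else". */
        #define for_each_active_unit(u, list, n)                        \
                for ((u) = (list); (u) < (list) + (n); (u)++)           \
                        if ((u)->ignored) { /* skip */ } else

        struct unit { int ignored; int reenable_done; };

        static void reenable_all(struct unit *units, int n, int resuming)
        {
                struct unit *u;

                /*
                 * Without the braces, the macro's hidden "else" sits right
                 * after the braceless "if (resuming)" and gcc emits
                 * -Wparentheses, even though the else really binds to the
                 * macro's own inner if. The braces change nothing at
                 * runtime; they only make the intent explicit.
                 */
                if (resuming) {
                        for_each_active_unit(u, units, n)
                                u->reenable_done = 1;
                }
        }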
* iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass | Will Deacon | 2016-09-07 | 1 file, -0/+3
    commit 5bc0a11664e17e9f9551983f5b660bd48b57483c upstream. The disable_bypass cmdline option changes the SMMUv3 driver to put down faulting stream table entries by default, as opposed to bypassing transactions from unconfigured devices. In this mode of operation, it is entirely expected to see aborting entries in the stream table if and when we come to installing a valid translation, so don't trigger a BUG() as a result of misdiagnosing these entries as stream table corruption. Fixes: 48ec83bcbcf5 ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Tested-by: Robin Murphy <robin.murphy@arm.com> Reported-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/arm-smmu: Disable stalling faults for all endpoints | Will Deacon | 2016-09-07 | 1 file, -27/+7
    commit 3714ce1d6655098ee69ede632883e5874d67e4ab upstream. Enabling stalling faults can result in hardware deadlock on poorly designed systems, particularly those with a PCI root complex upstream of the SMMU. Although it's not really Linux's job to save hardware integrators from their own misfortune, it *is* our job to stop userspace (e.g. VFIO clients) from hosing the system for everybody else, even if they might already be required to have elevated privileges. Given that the fault handling code currently executes entirely in IRQ context, there is nothing that can sensibly be done to recover from things like page faults anyway, so let's rip this code out for now and avoid the potential for deadlock. Fixes: 48ec83bcbcf5 ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Reported-by: Matt Evans <matt.evans@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/arm-smmu: Fix CMDQ error handling | Will Deacon | 2016-09-07 | 1 file, -2/+2
    commit aea2037e0d3e23c3be1498feae29f71ca997d9e6 upstream. In the unlikely event of a global command queue error, the ARM SMMUv3 driver attempts to convert the problematic command into a CMD_SYNC and resume the command queue. Unfortunately, this code is pretty badly broken:
      1. It uses the index into the error string table as the CMDQ index, so we probably read the wrong entry out of the queue
      2. The arguments to queue_write are the wrong way round, so we end up writing from the queue onto the stack.
    These happily cancel out, so the kernel is likely to stay alive, but the command queue will probably fault again when we resume. This patch fixes the error handling code to use the correct queue index and write back the CMD_SYNC to the faulting entry. Fixes: 48ec83bcbcf5 ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Reported-by: Diwakar Subraveti <Diwakar.Subraveti@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
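    A rough sketch of the corrected flow described above. The queue layout and helper here are assumptions made for illustration, not the driver's real internals: the point is simply to index the queue with the consumer pointer and make the queue entry the destination of the copy.

        #include <stdint.h>
        #include <string.h>

        /* Illustration only: hypothetical command-queue layout. */
        struct cmdq {
                uint64_t *entries;      /* nr_entries commands, n_dwords words each */
                uint32_t nr_entries;    /* power of two */
                uint32_t cons;          /* consumer index: the faulting command */
        };

        static void cmdq_sync_faulting_entry(struct cmdq *q, const uint64_t *cmd_sync,
                                             unsigned int n_dwords)
        {
                /* Bug 1 fixed: index with the consumer pointer, not the
                 * position in the error-string table used for the message. */
                uint32_t idx = q->cons & (q->nr_entries - 1);

                /* Bug 2 fixed: the queue entry is the destination; the old
                 * code had the arguments reversed and copied from the queue
                 * onto the on-stack command. */
                memcpy(&q->entries[idx * n_dwords], cmd_sync,
                       n_dwords * sizeof(*cmd_sync));
        }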
* iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks | Robin Murphy | 2016-09-07 | 1 file, -1/+3
    commit e633fc7a1347528c3b4a6bbdeb41f5d63988242c upstream. Due to the attribute bits being all over the place in the different types of short-descriptor PTEs, when remapping an existing entry, e.g. splitting a section into pages, we take the approach of decomposing the PTE attributes back to the IOMMU API flags to start from scratch. On inspection, though, the existing code seems to have got the read-only bit backwards and ignored the XN bit. How embarrassing... Fortunately the primary user so far, the Mediatek IOMMU, both never splits blocks (because it only serves non-overlapping DMA API calls) and also ignores permissions anyway, but let's put things right before any future users trip up. Fixes: e5fc9753b1a8 ("iommu/io-pgtable: Add ARMv7 short descriptor support") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
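    A hedged sketch of the decomposition being fixed. The PTE bit positions below are placeholders (the real short-descriptor layouts differ between section, large-page and small-page entries); only the logic matters: read-only must clear IOMMU_WRITE rather than set it, and XN must feed IOMMU_NOEXEC instead of being dropped.

        #include <linux/iommu.h>

        /* Placeholder bit positions, for illustration only. */
        #define PTE_AP_RDONLY   (1 << 5)
        #define PTE_XN          (1 << 0)

        static int pte_attrs_to_prot(u32 pte)
        {
                int prot = IOMMU_READ;

                if (!(pte & PTE_AP_RDONLY))     /* the read-only test was inverted */
                        prot |= IOMMU_WRITE;
                if (pte & PTE_XN)               /* XN was previously ignored */
                        prot |= IOMMU_NOEXEC;

                return prot;
        }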
* iommu/dma: Don't put uninitialised IOVA domains | Robin Murphy | 2016-09-07 | 1 file, -1/+2
    commit 3ec60043f7c02e1f79e4a90045ff2d2e80042941 upstream. Due to the limitations of having to wait until we see a device's DMA restrictions before we know how we want an IOVA domain initialised, there is a window for error if a DMA ops domain is allocated but later freed without ever being used. In that case, init_iova_domain() was never called, so calling put_iova_domain() from iommu_put_dma_cookie() ends up trying to take an uninitialised lock and crashing. Make things robust by skipping the call unless the IOVA domain actually has been initialised, as we probably should have done from the start. Fixes: 0db2e5d18f76 ("iommu: Implement common IOMMU ops for DMA mapping") Reported-by: Nate Watterson <nwatters@codeaurora.org> Reviewed-by: Nate Watterson <nwatters@codeaurora.org> Tested-by: Nate Watterson <nwatters@codeaurora.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
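    A minimal sketch of the guard described above, assuming the cookie embeds a struct iova_domain and that init_iova_domain() setting a non-zero granule is a reliable "was this ever initialised?" marker (the cookie structure here is hypothetical; the real one is private to dma-iommu.c).

        #include <linux/iova.h>
        #include <linux/slab.h>

        /* Hypothetical cookie layout, for illustration only. */
        struct dma_cookie {
                struct iova_domain iovad;
        };

        static void put_dma_cookie(struct dma_cookie *cookie)
        {
                /* Only tear down the allocator if init_iova_domain() ever
                 * ran; otherwise its lock and rbtree are uninitialised. */
                if (cookie->iovad.granule)
                        put_iova_domain(&cookie->iovad);

                kfree(cookie);
        }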
* iommu/amd: Update Alias-DTE in update_device_table() | Joerg Roedel | 2016-08-20 | 1 file, -1/+8
    commit 3254de6bf74fe94c197c9f819fe62a3a3c36f073 upstream. Not doing so might cause IO-Page-Faults when a device uses an alias request-id and the alias-dte is left in a lower page-mode which does not cover the address allocated from the iova-allocator. Fixes: 492667dacc0a ('x86/amd-iommu: Remove amd_iommu_pd_table') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Init unity mappings only for dma_ops domains | Joerg Roedel | 2016-08-20 | 1 file, -2/+4
    commit b548e786ce47017107765bbeb0f100202525ea83 upstream. The default domain for a device might also be identity-mapped. In this case the kernel would crash when unity mappings are defined for the device. Fix that by making sure the domain is a dma_ops domain. Fixes: 0bb6e243d7fb ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back | Joerg Roedel | 2016-08-20 | 1 file, -8/+17
    commit cda7005ba2cbd0744fea343dd5b2aa637eba5b9e upstream. This domain type is not yet handled in the iommu_ops->domain_free() call-back. Fix that. Fixes: 0bb6e243d7fb ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/io-pgtable-arm: Fix iova_to_phys for block entries | Will Deacon | 2016-08-20 | 1 file, -1/+1
    commit 7c6d90e2bb1a98b86d73b9e8ab4d97ed5507e37c upstream. The implementation of iova_to_phys for the long-descriptor ARM io-pgtable code always masks with the granule size when inserting the low virtual address bits into the physical address determined from the page tables. In cases where the leaf entry is found before the final level of table (i.e. due to a block mapping), this results in rounding down to the bottom page of the block mapping. Consequently, the physical address range batching in the vfio_unmap_unpin is defeated and we end up taking the long way home. This patch fixes the problem by masking the virtual address with the appropriate mask for the level at which the leaf descriptor is located. The short-descriptor code already gets this right, so no change is needed there. Reported-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
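    A hedged sketch of the masking change (the shift arithmetic and level numbering below are simplified assumptions, not the driver's real walk code): the low bits folded back into the physical address must come from the size of the entry at the level where the leaf lives, not from the page granule.

        #include <stdint.h>

        #define MAX_LEVELS      4       /* illustrative 4-level long-descriptor walk */

        static uint64_t leaf_to_phys(uint64_t pte_paddr, uint64_t iova, int level,
                                     unsigned int granule_shift,
                                     unsigned int bits_per_level)
        {
                /* Range covered by one entry at this level: a page at the
                 * last level, a block (e.g. 2MB/1GB) at earlier levels. */
                uint64_t entry_size = 1ULL << (granule_shift +
                                bits_per_level * (MAX_LEVELS - 1 - level));

                /* The old code masked with the granule instead:
                 *     pte_paddr | (iova & ((1ULL << granule_shift) - 1))
                 * which rounds a block mapping down to its first page. */
                return pte_paddr | (iova & (entry_size - 1));
        }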
* iommu/vt-d: Return error code in domain_context_mapping_one() | Wei Yang | 2016-08-20 | 1 file, -1/+1
    commit 5c365d18a73d3979db37006eaacefc0008869c0f upstream. In commit 55d940430ab9 ("iommu/vt-d: Get rid of domain->iommu_lock"), the error handling path was changed a little, which makes the function always return 0. This patch fixes this. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Fixes: 55d940430ab9 ('iommu/vt-d: Get rid of domain->iommu_lock') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/exynos: Suppress unbinding to prevent system failure | Marek Szyprowski | 2016-08-20 | 1 file, -0/+1
    commit b54b874fbaf5e024723e50dfb035a9916d6752b4 upstream. Removal of the IOMMU driver cannot be done reliably, so the Exynos IOMMU driver doesn't support this operation. It is essential for system operation, so it makes sense to prevent unbinding by disabling the bind/unbind sysfs feature for the SYSMMU controller driver, to avoid kernel oops or trashing memory caused by such an operation. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Fix unity mapping initialization race | Joerg Roedel | 2016-07-06 | 1 file, -2/+12
    There is a race condition in the AMD IOMMU init code that causes requested unity mappings to be blocked by the IOMMU for a short period of time. This results in boot failures and IO_PAGE_FAULTs on some machines. Fix this by making sure the unity mappings are installed before all other DMA is blocked. Fixes: aafd8ba0ca74 ('iommu/amd: Implement add_device and remove_device') Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas | Aaron Campbell | 2016-07-04 | 1 file, -2/+2
    Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum number of possible domains is 64K; indeed this is the maximum value that the cap_ndoms() macro will expand to. Since the value 65536 will not fit in a u16, the 'did' variable must be promoted to an int, otherwise the test for < 65536 will always be true and the loop will never end. The symptom, in my case, was a hung machine during suspend. Fixes: 3bd4f9112f87 ("iommu/vt-d: Fix overflow of iommu->domains array") Signed-off-by: Aaron Campbell <aaron@monkey.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
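    A self-contained illustration of the type promotion described above (the names are generic, not the driver's):

        #define NR_DOMAINS 65536        /* cap_ndoms() can legitimately expand to 64K */

        void free_cached_iovas_for_all_domains(void)
        {
                /*
                 * Broken version:
                 *     u16 did;
                 *     for (did = 0; did < NR_DOMAINS; did++) ...
                 * A u16 can never reach 65536, so the condition is always
                 * true and the loop never terminates.
                 */
                int did;        /* promoted to int, as described above */

                for (did = 0; did < NR_DOMAINS; did++) {
                        /* free the IOVAs cached for domain 'did' here */
                }
        }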
* iommu/amd: Initialize devid variable before using it | Nicolas Iooss | 2016-06-27 | 1 file, -1/+1
    Commit 2a0cb4e2d423 ("iommu/amd: Add new map for storing IVHD dev entry type HID") added a call to DUMP_printk in init_iommu_from_acpi() which used the value of devid before this variable was initialized. Fixes: 2a0cb4e2d423 ('iommu/amd: Add new map for storing IVHD dev entry type HID') Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/vt-d: Fix overflow of iommu->domains array | Jan Niehusmann | 2016-06-27 | 1 file, -1/+1
    The valid range of 'did' in get_iommu_domain(*iommu, did) is 0..cap_ndoms(iommu->cap), so don't exceed that range in free_all_cpu_cached_iovas(). The user-visible impact of the out-of-bounds access is the machine hanging on suspend-to-ram. It is, in fact, a kernel panic, but due to already suspended devices, that's often not visible to the user. Fixes: 22e2f9fa63b0 ("iommu/vt-d: Use per-cpu IOVA caching") Signed-off-by: Jan Niehusmann <jan@gondor.com> Tested-By: Marius Vlad <marius.c.vlad@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/iova: Disable preemption around use of this_cpu_ptr() | Chris Wilson | 2016-06-27 | 1 file, -2/+6
    Between acquiring the this_cpu_ptr() and using it, ideally we don't want to be preempted and work on another CPU's private data. this_cpu_ptr() checks whether or not preemption is disabled, and get_cpu_ptr() provides a convenient wrapper for operating on the cpu ptr inside a preemption disabled critical section (which currently is provided by the spinlock).
        [ 167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
        [ 167.997940] caller is debug_smp_processor_id+0x17/0x20
        [ 167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G U 4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
        [ 167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
        [ 167.997951] 0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
        [ 167.997958] ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
        [ 167.997965] ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
        [ 167.997971] Call Trace:
        [ 167.997977] [<ffffffff8140dca5>] dump_stack+0x67/0x92
        [ 167.997981] [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
        [ 167.997985] [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
        [ 167.997990] [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
        [ 167.997994] [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
        [ 167.997998] [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
        [ 167.998002] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
        [ 167.998009] [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
        [ 167.998013] [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
        [ 167.998017] [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
        [ 167.998022] [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
        [ 167.998025] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
        [ 167.998028] [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
        [ 167.998032] [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
        [ 167.998035] [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
        [ 167.998039] [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
        [ 167.998042] [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
        [ 167.998045] [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
        [ 167.998049] [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
        [ 167.998052] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
        [ 167.998058] [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
        [ 167.998061] [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
        [ 167.998064] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
        [ 167.998067] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
        [ 167.998071] [<ffffffff8109ddfa>] kthread+0xea/0x100
        [ 167.998078] [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
        [ 167.998081] [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Fixes: 9257b4a206fc ('iommu/iova: introduce per-cpu caching to iova allocation') Signed-off-by: Joerg Roedel <jroedel@suse.de>
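    A small sketch of the pattern described above (the per-CPU structure is made up for illustration): get_cpu_ptr()/put_cpu_ptr() bracket the access in a preemption-disabled section, whereas a bare this_cpu_ptr() in preemptible context trips the smp_processor_id() debug check shown in the splat.

        #include <linux/percpu.h>
        #include <linux/spinlock.h>

        /* Hypothetical per-CPU cache, for illustration only. */
        struct iova_cpu_cache {
                spinlock_t lock;
                unsigned long cached[16];
                int nr;
        };
        static DEFINE_PER_CPU(struct iova_cpu_cache, iova_cache);

        static void cache_freed_iova(unsigned long pfn)
        {
                struct iova_cpu_cache *c;

                /* this_cpu_ptr() here would trigger the debug splat in
                 * preemptible context; get_cpu_ptr() disables preemption so
                 * we stay on the CPU whose data we are touching. */
                c = get_cpu_ptr(&iova_cache);
                spin_lock(&c->lock);
                if (c->nr < 16)
                        c->cached[c->nr++] = pfn;
                spin_unlock(&c->lock);
                put_cpu_ptr(&iova_cache);       /* re-enables preemption */
        }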
* iommu/vt-d: Enable QI on all IOMMUs before setting root entry | Joerg Roedel | 2016-06-17 | 1 file, -5/+12
    This seems to be required on some X58 chipsets on systems with more than one IOMMU. QI does not work until it is enabled on all IOMMUs in the system. Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Fixes: 5f0a7f7614a9 ('iommu/vt-d: Make root entry visible for hardware right after allocation') Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/rockchip: Fix zap cache during device attach | John Keeping | 2016-06-15 | 1 file, -1/+1
    rk_iommu_command() takes a struct rk_iommu and iterates over the slave MMUs, so this is doubly wrong in that we're passing in the wrong pointer and talking to MMUs that we shouldn't be. Fixes: cd6438c5f844 ("iommu/rockchip: Reconstruct to support multi slaves") Cc: stable@vger.kernel.org Signed-off-by: John Keeping <john@metanate.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/arm-smmu: Wire up map_sg for arm-smmu-v3 | Jean-Philippe Brucker | 2016-06-13 | 1 file, -0/+1
    The map_sg callback is missing from arm_smmu_ops, but is required by iommu.h. Similarly to most other IOMMU drivers, connect it to default_iommu_map_sg. Cc: <stable@vger.kernel.org> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
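    The wiring amounts to one line in the ops table. An excerpt-style sketch is shown below; it reflects the 4.x-era iommu_ops API where default_iommu_map_sg was the common fallback, and the surrounding callbacks are the driver's existing ones, elided here rather than reproduced.

        static struct iommu_ops arm_smmu_ops = {
                .capable        = arm_smmu_capable,
                .domain_alloc   = arm_smmu_domain_alloc,
                .map            = arm_smmu_map,
                .unmap          = arm_smmu_unmap,
                .map_sg         = default_iommu_map_sg,  /* the one-line addition */
                /* ... remaining callbacks unchanged ... */
        };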
* remove lots of IS_ERR_VALUE abuses | Arnd Bergmann | 2016-05-27 | 2 files, -13/+13
    Most users of IS_ERR_VALUE() in the kernel are wrong, as they pass an 'int' into a function that takes an 'unsigned long' argument. This happens to work because the type is sign-extended on 64-bit architectures before it gets converted into an unsigned type. However, anything that passes an 'unsigned short' or 'unsigned int' argument into IS_ERR_VALUE() is guaranteed to be broken, as are 8-bit integers and types that are wider than 'unsigned long'. Andrzej Hajda has already fixed a lot of the worst abusers that were causing actual bugs, but it would be nice to prevent any users that are not passing 'unsigned long' arguments. This patch changes all users of IS_ERR_VALUE() that I could find on 32-bit ARM randconfig builds and x86 allmodconfig. For the moment, this doesn't change the definition of IS_ERR_VALUE() because there are probably still architecture specific users elsewhere. Almost all the warnings I got are for files that are better off using 'if (err)' or 'if (err < 0)'. The only legitimate user I could find that we get a warning for is the (32-bit only) freescale fman driver, so I did not remove the IS_ERR_VALUE() there but changed the type to 'unsigned long'. For 9pfs, I just worked around one user whose calling conventions are so obscure that I did not dare change the behavior. I was using this definition for testing:
        #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
                unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))
    which ends up making all 16-bit or wider types work correctly with the most plausible interpretation of what IS_ERR_VALUE() was supposed to return according to its users, but also causes a compile-time warning for any users that do not pass an 'unsigned long' argument. I suggested this approach earlier this year, but back then we ended up deciding to just fix the users that are obviously broken. After the initial warning that caused me to get involved in the discussion (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus asked me to send the whole thing again. [ Updated the 9p parts as per Al Viro - Linus ] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andrzej Hajda <a.hajda@samsung.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.org/lkml/2016/1/7/363 Link: https://lkml.org/lkml/2016/5/27/486 Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
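    A short illustration of the breakage described above (do_setup() is a hypothetical helper returning 0 or a negative errno): with a narrow unsigned type the error value can never compare as an error, so the plain sign check is the right tool.

        #include <linux/err.h>
        #include <linux/errno.h>

        static int do_setup(void);      /* hypothetical: returns 0 or -errno */

        static int example(void)
        {
                int ret = do_setup();
                unsigned short status = do_setup();

                if (IS_ERR_VALUE(ret))          /* only "works" via sign extension on 64-bit */
                        return ret;

                if (IS_ERR_VALUE(status))       /* -ENOMEM truncated to 65524: can never be true */
                        return -EIO;

                return ret < 0 ? ret : 0;       /* the check most call sites actually want */
        }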
* Merge git://git.infradead.org/intel-iommu | Linus Torvalds | 2016-05-27 | 2 files, -118/+617
    Pull intel IOMMU updates from David Woodhouse: "This patchset improves the scalability of the Intel IOMMU code by resolving two spinlock bottlenecks and eliminating the linearity of the IOVA allocator, yielding up to ~5x performance improvement and approaching 'iommu=off' performance"
    * git://git.infradead.org/intel-iommu:
      iommu/vt-d: Use per-cpu IOVA caching
      iommu/iova: introduce per-cpu caching to iova allocation
      iommu/vt-d: change intel-iommu to use IOVA frame numbers
      iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs
      iommu/vt-d: only unmap mapped entries
      iommu/vt-d: correct flush_unmaps pfn usage
      iommu/vt-d: per-cpu deferred invalidation queues
      iommu/vt-d: refactoring of deferred flush entries
| * iommu/vt-d: Use per-cpu IOVA caching | Omer Peleg | 2016-04-20 | 1 file, -12/+35
    Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation') introduced per-CPU IOVA caches to massively improve scalability. Use them. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/iova: introduce per-cpu caching to iova allocation | Omer Peleg | 2016-04-20 | 1 file, -24/+393
    IOVA allocation has two problems that impede high-throughput I/O. First, it can do a linear search over the allocated IOVA ranges. Second, the rbtree spinlock that serializes IOVA allocations becomes contended. Address these problems by creating an API for caching allocated IOVA ranges, so that the IOVA allocator isn't accessed frequently. This patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs without taking the rbtree spinlock. The per-CPU caches are backed by a global cache, to avoid invoking the (linear-time) IOVA allocator without needing to make the per-CPU cache size excessive. This design is based on magazines, as described in "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources" (currently available at https://www.usenix.org/legacy/event/usenix01/bonwick.html). Adding caching on top of the existing rbtree allocator maintains the property that IOVAs are densely packed in the IO virtual address space, which is important for keeping IOMMU page table usage low. To keep the cache size reasonable, we bound the IOVA space a CPU can cache by 32 MiB (we cache a bounded number of IOVA ranges, and only ranges of size <= 128 KiB). The shared global cache is bounded at 4 MiB of IOVA space. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
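    A conceptual sketch of the magazine scheme described above. The structure and constant names here are illustrative, not the driver's: each CPU keeps a small bounded stack of recently freed IOVA ranges and only falls back to the shared rbtree allocator when its magazines are empty or full.

        #include <linux/spinlock.h>

        #define MAG_SIZE        128     /* IOVA ranges per magazine (illustrative) */

        struct iova_magazine {
                unsigned long pfns[MAG_SIZE];
                int nr;
        };

        struct cpu_iova_cache {
                spinlock_t lock;                /* cheap: same-CPU contention only */
                struct iova_magazine *loaded;   /* magazine currently in use */
                struct iova_magazine *prev;     /* swapped in when 'loaded' runs dry or fills up */
        };

        struct global_iova_cache {
                spinlock_t lock;
                struct iova_magazine *depot[8]; /* bounded backing store of full magazines */
                int depot_nr;
        };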
| * iommu/vt-d: change intel-iommu to use IOVA frame numbers | Omer Peleg | 2016-04-20 | 1 file, -32/+29
    Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of pointers to "struct iova". This avoids using the iova struct from the IOVA red-black tree and the resulting explicit find_iova() on unmap. This patch will allow us to cache IOVAs in the next patch, in order to avoid rbtree operations for the majority of map/unmap operations. Note: In eliminating the find_iova() operation, we have also eliminated the sanity check previously done in the unmap flow. Arguably, this was overhead that is better avoided in production code, but it could be brought back as a debug option for driver development. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs | Omer Peleg | 2016-04-20 | 1 file, -0/+34
    This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb() for domains with no dev iotlb devices. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [gvdl@google.com: fixed locking issues] Signed-off-by: Godfrey van der Linden <gvdl@google.com> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/vt-d: only unmap mapped entries | Omer Peleg | 2016-04-20 | 1 file, -11/+25
    Current unmap implementation unmaps the entire area covered by the IOVA range, which is a power-of-2 aligned region. The corresponding map, however, only maps those pages originally mapped by the user. This discrepancy can lead to unmapping of already unmapped entries, which is unneeded work. With this patch, only mapped pages are unmapped. This is also a baseline for a map/unmap implementation based on IOVAs and not iova structures, which will allow caching. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/vt-d: correct flush_unmaps pfn usage | Omer Peleg | 2016-04-20 | 1 file, -1/+2
    Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi() dma addresses. (x86_64 mm and dma have the same size for pages at the moment, but this usage improves consistency.) Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/vt-d: per-cpu deferred invalidation queues | Omer Peleg | 2016-04-20 | 1 file, -41/+92
    The IOMMU's IOTLB invalidation is a costly process. When iommu mode is not set to "strict", it is done asynchronously. Current code amortizes the cost of invalidating IOTLB entries by batching all the invalidations in the system and performing a single global invalidation instead. The code queues pending invalidations in a global queue that is accessed under the global "async_umap_flush_lock" spinlock, which can result in significant spinlock contention. This patch splits this deferred queue into multiple per-cpu deferred queues, and thus gets rid of the "async_umap_flush_lock" and its contention. To keep existing deferred invalidation behavior, it still invalidates the pending invalidations of all CPUs whenever a CPU reaches its watermark or a timeout occurs. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
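    A structural sketch of the split described above (names and sizes are illustrative, not the driver's): the single global table guarded by async_umap_flush_lock becomes one table per CPU with its own lock, flushed when that CPU's watermark or the timeout hits.

        #include <linux/percpu.h>
        #include <linux/spinlock.h>
        #include <linux/timer.h>

        struct dmar_domain;                     /* opaque here; defined by the driver */

        #define DEFER_WATERMARK 250             /* illustrative per-CPU high-water mark */

        struct deferred_unmap {
                struct dmar_domain *domain;
                unsigned long iova_pfn;
                unsigned long nrpages;
        };

        struct deferred_flush_queue {
                spinlock_t lock;                /* per-CPU, rarely contended */
                int nr;
                struct deferred_unmap entries[DEFER_WATERMARK];
                struct timer_list timer;        /* flush on timeout as well */
        };
        static DEFINE_PER_CPU(struct deferred_flush_queue, flush_queues);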
| * iommu/vt-d: refactoring of deferred flush entries | Omer Peleg | 2016-04-20 | 1 file, -19/+29
    Currently, deferred flushes' info is striped between several lists in the flush tables. Instead, move all information about a specific flush to a single entry in this table. This patch does not introduce any functional change. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | Merge tag 'devicetree-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux | Linus Torvalds | 2016-05-20 | 1 file, -8/+30
    Pull devicetree updates from Rob Herring:
      - Rewrite of the unflattening code to avoid recursion and lessen the stack usage.
      - Rewrite of the phandle args parsing code to get rid of the fixed args size. This is needed for IOMMU code.
      - Sync to latest dtc which adds more dts style checking. These warnings are enabled with "W=1" compiles.
      - Tegra documentation updates related to the above warnings.
      - A bunch of spelling and other doc fixes.
      - Various vendor prefix additions.
    * tag 'devicetree-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (52 commits)
      devicetree: Add Creative Technology vendor id
      gpio: dt-bindings: add ibm,ppc4xx-gpio binding
      of/unittest: Remove unnecessary module.h header inclusion
      drivers/of: Fix build warning in populate_node()
      drivers/of: Fix depth when unflattening devicetree
      of: dynamic: changeset prop-update revert fix
      drivers/of: Export of_detach_node()
      drivers/of: Return allocated memory from of_fdt_unflatten_tree()
      drivers/of: Specify parent node in of_fdt_unflatten_tree()
      drivers/of: Rename unflatten_dt_node()
      drivers/of: Avoid recursively calling unflatten_dt_node()
      drivers/of: Split unflatten_dt_node()
      of: include errno.h in of_graph.h
      of: document refcount incrementation of of_get_cpu_node()
      Documentation: dt: soc: fix spelling mistakes
      Documentation: dt: power: fix spelling mistake
      Documentation: dt: pinctrl: fix spelling mistake
      Documentation: dt: opp: fix spelling mistake
      Documentation: dt: net: fix spelling mistakes
      Documentation: dt: mtd: fix spelling mistake
      ...
| * | iommu/arm-smmu: Make use of phandle iterators in device-tree parsing | Joerg Roedel | 2016-04-19 | 1 file, -8/+30
    Remove the usage of of_parse_phandle_with_args() and replace it by the phandle-iterator implementation so that we can parse out all of the potentially present 128 stream-ids. Signed-off-by: Joerg Roedel <jroedel@suse.de> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rob Herring <robh@kernel.org>
* | | Merge tag 'iommu-updates-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu | Linus Torvalds | 2016-05-19 | 19 files, -393/+891
    Pull IOMMU updates from Joerg Roedel: "The updates include:
      - rate limiting for the VT-d fault handler
      - remove statistics code from the AMD IOMMU driver. It is unused and should be replaced by something more generic if needed
      - per-domain pagesize-bitmaps in IOMMU core code to support systems with different types of IOMMUs
      - support for ACPI devices in the AMD IOMMU driver
      - 4GB mode support for Mediatek IOMMU driver
      - ARM-SMMU updates from Will Deacon:
          - support for 64k pages with SMMUv1 implementations (e.g MMU-401)
          - remove open-coded 64-bit MMIO accessors
          - initial support for 16-bit VMIDs, as supported by some ThunderX SMMU implementations
          - a couple of errata workarounds for silicon in the field
      - various fixes here and there"
    * tag 'iommu-updates-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (44 commits)
      iommu/arm-smmu: Use per-domain page sizes.
      iommu/amd: Remove statistics code
      iommu/dma: Finish optimising higher-order allocations
      iommu: Allow selecting page sizes per domain
      iommu: of: enforce const-ness of struct iommu_ops
      iommu: remove unused priv field from struct iommu_ops
      iommu/dma: Implement scatterlist segment merging
      iommu/arm-smmu: Clear cache lock bit of ACR
      iommu/arm-smmu: Support SMMUv1 64KB supplement
      iommu/arm-smmu: Decouple context format from kernel config
      iommu/arm-smmu: Tidy up 64-bit/atomic I/O accesses
      io-64-nonatomic: Add relaxed accessor variants
      iommu/arm-smmu: Work around MMU-500 prefetch errata
      iommu/arm-smmu: Convert ThunderX workaround to new method
      iommu/arm-smmu: Differentiate specific implementations
      iommu/arm-smmu: Workaround for ThunderX erratum #27704
      iommu/arm-smmu: Add support for 16 bit VMID
      iommu/amd: Move get_device_id() and friends to beginning of file
      iommu/amd: Don't use IS_ERR_VALUE to check integer values
      iommu/amd: Signedness bug in acpihid_device_group()
      ...
| *---------. Merge branches 'arm/io-pgtable', 'arm/rockchip', 'arm/omap', 'x86/vt-d', 'ppc/pamu', 'core' and 'x86/amd' into next | Joerg Roedel | 2016-05-09 | 15 files, -401/+938
| | | | | | | * | | iommu/amd: Remove statistics code | Joerg Roedel | 2016-05-09 | 3 files, -131/+0
    The statistics are not really used for anything and should be replaced by generic and per-device statistic counters. Remove the code for now. Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Move get_device_id() and friends to beginning of file | Joerg Roedel | 2016-04-21 | 1 file, -54/+54
    They will be needed there later. Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Don't use IS_ERR_VALUE to check integer values | Joerg Roedel | 2016-04-21 | 1 file, -10/+10
    Use the better 'var < 0' check. Fixes: 7aba6cb9ee9d ('iommu/amd: Make call-sites of get_device_id aware of its return value') Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Signedness bug in acpihid_device_group() | Dan Carpenter | 2016-04-15 | 1 file, -1/+1
    "devid" needs to be signed for the error handling to work. Fixes: b097d11a0fa3f ('iommu/amd: Manage iommu_group for ACPI HID devices') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Set AMD iommu callbacks for amba bus | Wan Zongshun | 2016-04-07 | 1 file, -1/+12
    The AMD UART DMA controller is an ACPI HID type device whose driver is based on the AMBA bus, and it also needs IOMMU support. This patch just sets the AMD IOMMU callbacks for the AMBA bus. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Manage iommu_group for ACPI HID devices | Wan Zongshun | 2016-04-07 | 1 file, -1/+32
    This patch creates a new function for finding or creating an IOMMU group for an acpihid (ACPI Hardware ID) device. The acpihid devices with the same devid will be put into the same group; they will have the same domain id and share the same page table. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Add iommu support for ACPI HID devices | Wan Zongshun | 2016-04-07 | 1 file, -9/+60
    The current IOMMU driver makes the assumption that the downstream devices are PCI. With the newly added ACPI-HID IVHD device entry support, this is no longer true. This patch adds a device type check to distinguish the PCI and acpihid device code paths. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Make call-sites of get_device_id aware of its return value | Wan Zongshun | 2016-04-07 | 1 file, -10/+41
    This patch makes the call-sites of get_device_id() aware of its return value. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Introduces ivrs_acpihid kernel parameter | Suravee Suthikulpanit | 2016-04-07 | 1 file, -0/+33
    This patch introduces a new kernel parameter, ivrs_acpihid. This is used to override an existing ACPI-HID IVHD device entry, or to add an entry in case it is missing from the IVHD. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
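    For reference, the parameter pairs a hardware ID (and UID) with the PCI device it should apply to; the form documented alongside this series looks roughly like the following, with the exact values here being illustrative only:

        ivrs_acpihid[00:14.5]=AMD0020:0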
| | | | | | | * | | iommu/amd: Add new map for storing IVHD dev entry type HID | Wan Zongshun | 2016-04-07 | 3 files, -0/+137
    This patch introduces acpihid_map, which is used to store the new IVHD device entry extracted from the BIOS IVRS table. It also provides a utility function, add_acpi_hid_device(), to add these types of devices to the map. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Use the most comprehensive IVHD type that the driver can support | Suravee Suthikulpanit | 2016-04-07 | 1 file, -29/+78
    The IVRS in more recent AMD systems usually contains multiple IVHD block types (e.g. 0x10, 0x11, and 0x40) for each IOMMU. The newer IVHD types provide more information (e.g. new features specified in the IOMMU spec), while maintaining compatibility with the older IVHD type. Having multiple IVHD types allows older IOMMU drivers to still function (e.g. using the older IVHD type 0x10) while the newer IOMMU driver can use the newer IVHD types (e.g. 0x11 and 0x40). Therefore, the IOMMU driver should only make use of the newest IVHD type that it can support. This patch adds new logic to determine the highest level of IVHD type it can support, and uses it throughout the driver initialization. This requires adding another pass to the IVRS parsing to determine the appropriate IVHD type (see function get_highest_supported_ivhd_type()) before parsing the contents. [Vincent: fix the build error of IVHD_DEV_ACPI_HID flag not found] Signed-off-by: Wan Zongshun <vincent.wan@amd.com> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Modify ivhd_header structure to support type 11h and 40h | Suravee Suthikulpanit | 2016-04-07 | 1 file, -2/+33
    This patch modifies the existing struct ivhd_header, which currently only supports IVHD type 0x10, to add new fields from IVHD types 11h and 40h. It also modifies the pointer calculation to allow support for IVHD types 11h and 40h. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * | | iommu/amd: Adding Extended Feature Register check for PC support | Suravee Suthikulpanit | 2016-04-07 | 1 file, -8/+24
    The IVHD header types 11h and 40h introduce the PCSup bit in the EFR Register Image bit fields. This should be used to determine the IOMMU performance support instead of relying on the PNCounters and PNBanks. Note also that the PNCounters and PNBanks bits in the IOMMU attributes field of IVHD headers type 11h are incorrectly programmed on some systems. So, we should not rely on them to determine the performance counter/banks size. Instead, these values should be read from the MMIO Offset 0030h IOMMU Extended Feature Register. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/arm-smmu: Use per-domain page sizes. | Robin Murphy | 2016-05-09 | 2 files, -21/+25
    Now that we can accurately reflect the context format we choose for each domain, do that instead of imposing the global lowest-common-denominator restriction and potentially ending up with nothing. We currently have a strict 1:1 correspondence between domains and context banks, so we don't need to entertain the possibility of multiple formats _within_ a domain. Signed-off-by: Will Deacon <will.deacon@arm.com> [rm: split from original patch, added SMMUv3] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu/dma: Finish optimising higher-order allocations | Robin Murphy | 2016-05-09 | 1 file, -21/+39
    Now that we know exactly which page sizes our caller wants to use in the given domain, we can restrict higher-order allocation attempts to just those sizes, if any, and avoid wasting any time or effort on other sizes which offer no benefit. In the same vein, this also lets us accommodate a minimum order greater than 0 for special cases. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | * | | iommu: Allow selecting page sizes per domain | Robin Murphy | 2016-05-09 | 3 files, -12/+14
    Many IOMMUs support multiple page table formats, meaning that any given domain may only support a subset of the hardware page sizes presented in iommu_ops->pgsize_bitmap. There are also certain use-cases where the creator of a domain may want to control which page sizes are used, for example to force the use of hugepage mappings to reduce pagetable walk depth. To this end, add a per-domain pgsize_bitmap to represent the subset of page sizes actually in use, to make it possible for domains with different requirements to coexist. Signed-off-by: Will Deacon <will.deacon@arm.com> [rm: hijacked and rebased original patch with new commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>