Commit message (author, date, files changed, lines removed/added)
* Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (Linus Torvalds, 2016-12-16, 259 files, -2634/+8698)
|\
| |   Pull powerpc updates from Michael Ellerman:
| |    "Highlights include:
| |
| |      - Support for the kexec_file_load() syscall, which is a prereq for
| |        secure and trusted boot.
| |      - Prevent kernel execution of userspace on P9 Radix (similar to
| |        SMEP/PXN).
| |      - Sort the exception tables at build time, to save time at boot,
| |        and store them as relative offsets to save space in the kernel
| |        image & memory.
| |      - Allow building the kernel with thin archives, which should allow
| |        us to build an allyesconfig once some other fixes land.
| |      - Build fixes to allow us to correctly rebuild when changing the
| |        kernel endian from big to little or vice versa.
| |      - Plumbing so that we can avoid doing a full mm TLB flush on P9
| |        Radix.
| |      - Initial stack protector support (-fstack-protector).
| |      - Support for dumping the radix (aka. Linux) and hash page tables
| |        via debugfs.
| |      - Fix an oops in cxl coredump generation when cxl_get_fd() is used.
| |      - Freescale updates from Scott: "Highlights include 8xx hugepage
| |        support, qbman fixes/cleanup, device tree updates, and some misc
| |        cleanup."
| |      - Many and varied fixes and minor enhancements as always.
| |
| |     Thanks to:
| |       Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
| |       Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
| |       Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
| |       Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
| |       Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
| |       Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
| |       Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
| |       Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
| |       Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"
| |
| |   [ And thanks to Michael, who took time off from a new baby to get this
| |     pull request done. - Linus ]
| |
| |   * tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
| |     powerpc/fsl/dts: add FMan node for t1042d4rdb
| |     powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
| |     powerpc/fsl/dts: add QMan and BMan nodes on t1024
| |     powerpc/fsl/dts: add QMan and BMan nodes on t1023
| |     soc/fsl/qman: test: use DEFINE_SPINLOCK()
| |     powerpc/fsl-lbc: use DEFINE_SPINLOCK()
| |     powerpc/8xx: Implement support of hugepages
| |     powerpc: get hugetlbpage handling more generic
| |     powerpc: port 64 bits pgtable_cache to 32 bits
| |     powerpc/boot: Request no dynamic linker for boot wrapper
| |     soc/fsl/bman: Use resource_size instead of computation
| |     soc/fsl/qe: use builtin_platform_driver
| |     powerpc/fsl_pmc: use builtin_platform_driver
| |     powerpc/83xx/suspend: use builtin_platform_driver
| |     powerpc/ftrace: Fix the comments for ftrace_modify_code
| |     powerpc/perf: macros for power9 format encoding
| |     powerpc/perf: power9 raw event format encoding
| |     powerpc/perf: update attribute_group data structure
| |     powerpc/perf: factor out the event format field
| |     powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
| |     ...
| * Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next (Michael Ellerman, 2016-12-16, 47 files, -580/+1040)
| |\
| | |   Freescale updates from Scott:
| | |
| | |   "Highlights include 8xx hugepage support, qbman fixes/cleanup, device
| | |   tree updates, and some misc cleanup."
| | * powerpc/fsl/dts: add FMan node for t1042d4rdb (Madalin Bucur, 2016-12-09, 1 file, -0/+52)
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb (Madalin Bucur, 2016-12-09, 1 file, -0/+4)
| | |   The alias is used by the boot loader to perform a device tree fixup.
| | |
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl/dts: add QMan and BMan nodes on t1024 (Madalin Bucur, 2016-12-09, 2 files, -0/+58)
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl/dts: add QMan and BMan nodes on t1023 (Madalin Bucur, 2016-12-09, 2 files, -0/+132)
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/fsl/qman: test: use DEFINE_SPINLOCK() (Fabian Frederick, 2016-12-09, 1 file, -1/+1)
| | |   Signed-off-by: Fabian Frederick <fabf@skynet.be>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl-lbc: use DEFINE_SPINLOCK() (Fabian Frederick, 2016-12-09, 1 file, -1/+1)
| | |   Signed-off-by: Fabian Frederick <fabf@skynet.be>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/8xx: Implement support of hugepages (Christophe Leroy, 2016-12-09, 11 files, -30/+225)
| | |   8xx uses a two-level page table and supports two different Linux
| | |   page sizes (4k and 16k). 8xx also supports two different hugepage
| | |   sizes, 512k and 8M. In order to support them on Linux we define two
| | |   different page table layouts.
| | |
| | |   The size of pages is in the PGD entry, using the PS field (bits 28-29):
| | |     00 : Small pages (4k or 16k)
| | |     01 : 512k pages
| | |     10 : reserved
| | |     11 : 8M pages
| | |
| | |   For the 512K hugepage size, a pgd entry has the format
| | |   [<hugepte address>0101]. The hugepte table allocated will contain
| | |   8 entries pointing to 512K huge ptes in 4k pages mode and 64 entries
| | |   in 16k pages mode.
| | |
| | |   For 8M in 16k mode, a pgd entry has the format
| | |   [<hugepte address>1101]. The hugepte table allocated will contain
| | |   8 entries pointing to 8M huge ptes.
| | |
| | |   For 8M in 4k mode, multiple pgd entries point to the same hugepte
| | |   address, and each pgd entry has the format [<hugepte address>1101].
| | |   The hugepte table allocated will only have one entry.
| | |
| | |   For the time being, we do not support CPU15 ERRATA when HUGETLB is
| | |   selected.
| | |
| | |   Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| | |   Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
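The PS encoding described in the commit message above can be illustrated with a small decoder. This is only a sketch under stated assumptions: the enum and function names are hypothetical (they are not the kernel's definitions), and it assumes a 32-bit PGD entry whose low four bits are flags, as suggested by the `[<hugepte address>0101]` notation.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not the kernel's definitions. */
enum pgd_page_kind {
    PGD_SMALL_PAGES = 0x0, /* 00: small pages (4k or 16k) */
    PGD_HUGE_512K   = 0x1, /* 01: 512k pages */
    PGD_RESERVED    = 0x2, /* 10: reserved */
    PGD_HUGE_8M     = 0x3  /* 11: 8M pages */
};

/*
 * PowerPC numbers bits from the most significant end, so bits 28-29 of a
 * 32-bit word are the two bits directly above the lowest two (mask 0xC).
 */
static inline enum pgd_page_kind pgd_page_kind(uint32_t pgd)
{
    return (enum pgd_page_kind)((pgd >> 2) & 0x3);
}

/*
 * Assumed layout: everything above the low four flag bits is the
 * hugepte table address.
 */
static inline uint32_t pgd_hugepte_table(uint32_t pgd)
{
    return pgd & ~UINT32_C(0xF);
}
```

With this layout, an entry ending in `0101` decodes as a 512k hugepage entry and one ending in `1101` as an 8M hugepage entry.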
| | * powerpc: get hugetlbpage handling more generic (Christophe Leroy, 2016-12-09, 1 file, -114/+81)
| | |   Today there are two implementations of hugetlbpages which are managed
| | |   by exclusive #ifdefs:
| | |   * FSL_BOOKE: several directory entries point to the same single hugepage
| | |   * BOOK3S: one upper level directory entry points to a table of hugepages
| | |
| | |   In preparation for the implementation of hugepage support on the 8xx,
| | |   we need a mix of the two above solutions, because the 8xx needs both
| | |   cases depending on the size of pages:
| | |   * In 4k page size mode, each PGD entry covers a 4M bytes area. It means
| | |     that 2 PGD entries will be necessary to cover an 8M hugepage while a
| | |     single PGD entry will cover 8x 512k hugepages.
| | |   * In 16k page size mode, each PGD entry covers a 64M bytes area. It means
| | |     that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
| | |     hugepages will be covered by one PGD entry.
| | |
| | |   This patch:
| | |   * removes #ifdefs in favor of if/else based on the range sizes
| | |   * merges the two huge_pte_alloc() functions as they are pretty similar
| | |   * merges the two hugetlbpage_init() functions as they are pretty similar
| | |
| | |   Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| | |   Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc: port 64 bits pgtable_cache to 32 bits (Christophe Leroy, 2016-12-09, 11 files, -174/+227)
| | |   Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
| | |   standard pages when using 4k pages and a single pgtable_cache
| | |   if using other size pages.
| | |
| | |   In preparation of implementing huge pages on the 8xx, this patch
| | |   replaces the specific powerpc32 handling by the 64 bits approach.
| | |
| | |   This is done by:
| | |   * moving 64 bits pgtable_cache_add() and pgtable_cache_init()
| | |     in a new file called init-common.c
| | |   * modifying pgtable_cache_init() to also handle the case
| | |     without PMD
| | |   * removing the 32 bits version of pgtable_cache_add() and
| | |     pgtable_cache_init()
| | |   * copying related header contents from 64 bits into both the
| | |     book3s/32 and nohash/32 header files
| | |
| | |   On the 8xx, the following cache sizes will be used:
| | |   * 4k pages mode:
| | |     - PGT_CACHE(10) for PGD
| | |     - PGT_CACHE(3) for 512k hugepage tables
| | |   * 16k pages mode:
| | |     - PGT_CACHE(6) for PGD
| | |     - PGT_CACHE(7) for 512k hugepage tables
| | |     - PGT_CACHE(3) for 8M hugepage tables
| | |
| | |   Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
| | |   Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
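To make the PGT_CACHE(n) shifts listed above more concrete, here is a minimal sketch of how a shift could translate into a table size. It assumes that a cache of shift n holds 2^n pointer-sized entries and that pointers are 4 bytes on 32-bit powerpc; the helper names and the constant are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Assumed pointer size on 32-bit powerpc; hypothetical constant for this sketch. */
#define PPC32_POINTER_BYTES 4u

/* Number of entries in a table of the given shift (2^shift). */
static inline size_t pgt_cache_entries(unsigned int shift)
{
    return (size_t)1 << shift;
}

/* Bytes per table, assuming size = pointer size << shift. */
static inline size_t pgt_cache_bytes(unsigned int shift)
{
    return (size_t)PPC32_POINTER_BYTES << shift;
}
```

Under these assumptions, PGT_CACHE(10) would be a 4096-byte table of 1024 entries, and PGT_CACHE(3) a 32-byte table of 8 entries.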
| | * soc/fsl/bman: Use resource_size instead of computation (Wei Yongjun, 2016-12-04, 1 file, -2/+1)
| | |   Use resource_size function on resource object
| | |   instead of explicit computation.
| | |
| | |   Generated by: scripts/coccinelle/api/resource_size.cocci
| | |
| | |   Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
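As a userspace illustration of what the helper replaces: a resource's bounds are inclusive, so its size is `end - start + 1`, and the open-coded form is easy to get wrong by dropping the `+ 1`. The struct and function below are simplified stand-ins with hypothetical names, not the kernel's `struct resource` or `resource_size()`.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's struct resource; only the bounds are modelled. */
struct resource_sketch {
    uint64_t start; /* first address of the region */
    uint64_t end;   /* last address of the region, inclusive */
};

/* Size of an inclusive [start, end] range. */
static inline uint64_t resource_size_sketch(const struct resource_sketch *res)
{
    return res->end - res->start + 1;
}
```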
| | * soc/fsl/qe: use builtin_platform_driver (Geliang Tang, 2016-12-04, 1 file, -5/+1)
| | |   Use builtin_platform_driver() helper to simplify the code.
| | |
| | |   Signed-off-by: Geliang Tang <geliangtang@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl_pmc: use builtin_platform_driver (Geliang Tang, 2016-12-04, 1 file, -5/+1)
| | |   Use builtin_platform_driver() helper to simplify the code.
| | |
| | |   Signed-off-by: Geliang Tang <geliangtang@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/83xx/suspend: use builtin_platform_driver (Geliang Tang, 2016-12-04, 1 file, -5/+1)
| | |   Use builtin_platform_driver() helper to simplify the code.
| | |
| | |   Signed-off-by: Geliang Tang <geliangtang@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Handle endianness of h/w descriptors (Claudiu Manoil, 2016-11-23, 5 files, -62/+70)
| | |   The hardware descriptors have big endian (BE) format.
| | |   Provide proper endianness handling for the remaining
| | |   descriptor fields, to ensure they are correctly
| | |   accessed by non-BE CPUs too.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
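The idea of endian-safe access to a big-endian descriptor field can be sketched in portable userspace C. In the kernel, helpers such as `be32_to_cpu()` and `cpu_to_be32()` serve this purpose; the byte-wise functions below are hypothetical stand-ins that give the same result on both big- and little-endian CPUs, and are not taken from the patch.

```c
#include <stdint.h>

/* Read a 32-bit big-endian field from raw descriptor bytes. */
static inline uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value into raw descriptor bytes in big-endian order. */
static inline void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Because the conversion is expressed in terms of byte positions rather than the CPU's native layout, the same code is correct regardless of host endianness.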
| | * soc/qman: Clean up CGR CSCN target update operations (Claudiu Manoil, 2016-11-23, 2 files, -16/+25)
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Change remaining contextB into context_b (Claudiu Manoil, 2016-11-23, 2 files, -12/+12)
| | |   There are multiple occurrences of both contextB and context_b
| | |   in different h/w descriptors, referring to the same descriptor
| | |   field known as "Context B". Stick with the "context_b" naming,
| | |   for obvious reasons including consistency (see also context_a).
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qbman: Handle endianness of qm/bm_in/out() (Claudiu Manoil, 2016-11-23, 2 files, -6/+6)
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Drop unused field from eqcr/dqrr descriptors (Claudiu Manoil, 2016-11-23, 2 files, -4/+2)
| | |   ORP ("Order Restoration Point") mechanism not supported.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Fix accesses to fqid, cleanup (Claudiu Manoil, 2016-11-23, 2 files, -16/+21)
| | |   Preventively mask every access to the 'fqid' h/w field,
| | |   since it is defined as a 24-bit field, for every h/w
| | |   descriptor. Add generic accessors for this field to
| | |   ensure correct access.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
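A minimal sketch of masked accessors for a 24-bit field stored in a 32-bit word, in the spirit of the change described above. All names here are hypothetical, and the sketch deliberately leaves out the endianness conversion that real hardware descriptors would also need.

```c
#include <stdint.h>

/* Low 24 bits of the word carry the fqid; hypothetical name for this sketch. */
#define FQID_MASK_SKETCH UINT32_C(0x00FFFFFF)

/* Simplified descriptor: a 24-bit fqid held in a 32-bit word. */
struct fq_desc_sketch {
    uint32_t fqid;
};

/* Read the fqid, discarding anything in the upper 8 bits. */
static inline uint32_t fqid_get(const struct fq_desc_sketch *d)
{
    return d->fqid & FQID_MASK_SKETCH;
}

/* Store the fqid, never writing beyond the 24-bit field. */
static inline void fqid_set(struct fq_desc_sketch *d, uint32_t v)
{
    d->fqid = v & FQID_MASK_SKETCH;
}
```

Routing every access through one pair of accessors keeps the mask in a single place instead of repeating it at each call site.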
| | * soc/qman: Remove unused struct qm_mcc* layouts (Claudiu Manoil, 2016-11-23, 2 files, -46/+18)
| | |   1. qm_mcc_querywq layout not used for now, so drop it;
| | |   2. queryfq, queryfq_np and alterfq are used only for accesses
| | |      to the 'fqid' field, so replace these with a generic 'fq'
| | |      layout.
| | |
| | |   As a consequence, 'querycgr' turns into 'cgr' following the
| | |   same reasoning above and for consistent naming.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Remove redundant checks from qman_create_cgr() (Claudiu Manoil, 2016-11-23, 1 file, -4/+4)
| | |   opts is checked redundantly.
| | |   Move local_opts declaration inside its usage scope.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: test: Don't use dummy platform device for dma mapping (Claudiu Manoil, 2016-11-23, 3 files, -11/+22)
| | |   Replace dummy platform device hack with a reference to a
| | |   portal's platform device, in order to dma map the test frame
| | |   for this small unit test. The 2 qman symbols need to be
| | |   exported because this self test is a kernel module.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Don't add a new platform device for dma mapping (Claudiu Manoil, 2016-11-23, 3 files, -21/+22)
| | |   The qman portals are platform devices themselves, so they
| | |   should handle dma mappings. Creating a dummy platform device
| | |   in order to support dma mapping operations is not justified
| | |   (and not portable). Instead, do the mapping against the first
| | |   portal that has been initialised.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: test: Fix implementation of fd_cmp() (Claudiu Manoil, 2016-11-23, 1 file, -15/+8)
| | |   This function must only return the truth value of whether
| | |   two frame descriptors are different or not. It does NOT
| | |   have to compute some obscure difference between fd fields
| | |   and return it as an int, making sparse complain about type
| | |   conversions in the process.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
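The point of the commit message above, returning a plain truth value rather than an arithmetic difference, might look roughly like this. The struct layout and names are hypothetical placeholders; the real frame descriptor has different fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified frame descriptor used only for this sketch. */
struct fd_sketch {
    uint64_t addr;
    uint32_t length;
    uint32_t status;
};

/* Returns true when the two descriptors differ in any modelled field. */
static inline bool fd_differs(const struct fd_sketch *a,
                              const struct fd_sketch *b)
{
    return a->addr != b->addr ||
           a->length != b->length ||
           a->status != b->status;
}
```

Comparing field by field with `!=` avoids subtracting values of different widths and squeezing the result into an `int`, which is the kind of implicit conversion the message says sparse complained about.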
| | * soc/qman: Fix direct access to fd's addr_lo, use proper accesor (Claudiu Manoil, 2016-11-23, 1 file, -2/+2)
| | |   Use the proper accessor to get the FD address.
| | |   Accessing the internal field "addr_lo" directly
| | |   is not portable and error prone.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Fix struct qm_fqd set accessor for context_a (Claudiu Manoil, 2016-11-23, 1 file, -1/+1)
| | |   context_a.hi is 32bit
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qbman: Fix resource leak on portal probing error path (Claudiu Manoil, 2016-11-23, 2 files, -10/+24)
| | |   In case init_pcfg() returns with error the CI region
| | |   must be unmapped too.
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Fix h/w resource cleanup error path handling (Claudiu Manoil, 2016-11-23, 1 file, -5/+11)
| | |   qman_query_fq*() may return other error codes apart from
| | |   -ERANGE, in which cases the error handling done by the
| | |   resource cleanup callers would be wrong.
| | |   The patch fixes the handling of those cases, and cleans
| | |   up related code inside the resource cleanup & release
| | |   handlers (i.e. replace hardcoded fqid value with
| | |   corresponding define).
| | |
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Replace of_get_property() with portable equivalent (Madalin Bucur, 2016-11-23, 2 files, -6/+7)
| | |   Use arch portable of_property_read_u32() instead, which takes
| | |   care of endianness conversions.
| | |
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/qman: Check ioremap return value (Madalin Bucur, 2016-11-23, 1 file, -0/+3)
| | |   Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
| | |   Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/85xx: Enable gpio power/reset driver (Andy Fleming, 2016-11-23, 1 file, -0/+6)
| | |   These config changes build:
| | |     drivers/power/reset/gpio-poweroff.c
| | |     drivers/power/reset/gpio-restart.c
| | |
| | |   Signed-off-by: Andy Fleming <afleming@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl_soc: improve and simplify get_baudrate (Heiner Kallweit, 2016-11-23, 1 file, -8/+2)
| | |   Use of_property_read_u32 instead of the generic of_get_property
| | |   to simplify the code. In addition move the declaration of
| | |   fs_baudrate into get_baudrate because it's private to this function.
| | |
| | |   Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl_soc: improve and simplify get_brgfreq (Heiner Kallweit, 2016-11-23, 1 file, -17/+7)
| | |   Use of_property_read_u32 instead of the generic of_get_property
| | |   to simplify the code. In addition move the declaration of
| | |   brgfreq into get_brgfreq because it's private to this function.
| | |
| | |   Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
| | |   [scottwood: minor whitespace fixes]
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/fsl_soc: improve and simplify fsl_get_sys_freq (Heiner Kallweit, 2016-11-23, 1 file, -10/+4)
| | |   Use of_property_read_u32 instead of the generic of_get_property
| | |   to simplify the code. In addition move the declaration of
| | |   sysfreq into fsl_get_sys_freq because it's private to this function.
| | |
| | |   Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/85xx/qemu: Enable CONFIG_E500 and CONFIG_PPC_E500MC (David Engraf, 2016-11-22, 1 file, -0/+2)
| | |   The QEMU e500 board needs to enable CONFIG_E500 to correctly boot.
| | |   QEMU for ppc64 uses e5500/e6500 emulation, thus CONFIG_PPC_E500MC
| | |   is required as well.
| | |
| | |   Signed-off-by: David Engraf <david.engraf@sysgo.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * soc/fsl: fix spelling mistakes in critical error messages (Colin Ian King, 2016-11-22, 2 files, -2/+2)
| | |   Trivial fix to spelling mistake "uncommited" to "uncommitted" in
| | |   critical error messages.
| | |
| | |   Signed-off-by: Colin Ian King <colin.king@canonical.com>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * powerpc/dts: add device tree entry for W83793 on T4240RDB (Florian Larysch, 2016-11-22, 1 file, -0/+4)
| | |   The T4240RDB contains a W83793 hardware monitoring chip. Add a device
| | |   tree entry to make the driver attach to it, as the i2c-mpc bus driver
| | |   dropped support for class-based instantiation of devices a long time
| | |   ago.
| | |
| | |   Signed-off-by: Florian Larysch <fl@n621.de>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| | * DT: i2c: W83793 is a trivial device (Florian Larysch, 2016-11-22, 1 file, -0/+1)
| | |   Signed-off-by: Florian Larysch <fl@n621.de>
| | |   Acked-by: Rob Herring <robh@kernel.org>
| | |   Signed-off-by: Scott Wood <oss@buserror.net>
| * | powerpc/boot: Request no dynamic linker for boot wrapper (Nicholas Piggin, 2016-12-05, 1 file, -1/+23)
| | |   The boot wrapper performs its own relocations and does not require
| | |   PT_INTERP segment. However currently we don't tell the linker that.
| | |
| | |   Prior to binutils 2.28 that works OK. But since binutils commit
| | |   1a9ccd70f9a7 ("Fix the linker so that it will not silently generate ELF
| | |   binaries with invalid program headers. Fix readelf to report such
| | |   invalid binaries.") binutils tries to create a program header segment
| | |   due to PT_INTERP, and the link fails because there is no space for it:
| | |
| | |     ld: arch/powerpc/boot/zImage.pseries: Not enough room for program headers, try linking with -N
| | |     ld: final link failed: Bad value
| | |
| | |   So tell the linker not to do that, by passing --no-dynamic-linker.
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Reported-by: Anton Blanchard <anton@samba.org>
| | |   Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
| | |   [mpe: Drop dependency on ld-version.sh and massage change log]
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/ftrace: Fix the comments for ftrace_modify_code (Libin, 2016-12-03, 1 file, -7/+5)
| | |   There is no need to worry about module and __init text disappearing,
| | |   because ftrace has a module notifier that is called when a module is
| | |   being unloaded, before the text goes away. This code grabs the
| | |   ftrace_lock mutex and removes the module functions from the ftrace
| | |   list, so that it will no longer make any modifications to that
| | |   module's text. The update to make functions be traced or not is done
| | |   under the ftrace_lock mutex as well.
| | |
| | |   And by now, __init section code should not be modified by ftrace,
| | |   because it is blacklisted in recordmcount.c and ignored by ftrace.
| | |
| | |   Suggested-by: Steven Rostedt <rostedt@goodmis.org>
| | |   Signed-off-by: Li Bin <huawei.libin@huawei.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/perf: macros for power9 format encoding (Madhavan Srinivasan, 2016-12-02, 2 files, -8/+79)
| | |   Patch to add macros and constants to support the power9 raw event
| | |   encoding format. A couple of functions are added since some of the
| | |   bit fields, like PMCxCOMB and THRESH_CMP, have different widths and
| | |   locations within MMCR* in power9.
| | |
| | |   Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/perf: power9 raw event format encoding (Madhavan Srinivasan, 2016-12-02, 1 file, -0/+134)
| | |   Patch to update the power9 raw event encoding format
| | |   information and add support for the same in power9-pmu.c.
| | |
| | |   Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/perf: update attribute_group data structure (Madhavan Srinivasan, 2016-12-02, 1 file, -5/+8)
| | |   Rename the power_pmu and attribute_group variables that
| | |   support PowerISA v2.07. Add a cpu feature flag check to pick
| | |   the PowerISA v2.07 format structures to support.
| | |
| | |   Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/perf: factor out the event format field (Madhavan Srinivasan, 2016-12-02, 3 files, -70/+42)
| | |   Factor out the format field structure for PowerISA v2.07.
| | |
| | |   Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown (Alexey Kardashevskiy, 2016-12-02, 3 files, -15/+61)
| | |   At the moment the userspace tool is expected to request pinning of
| | |   the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
| | |   When the userspace process finishes, all the pinned pages need to
| | |   be put; this is done as a part of the userspace memory context (MM)
| | |   destruction which happens on the very last mmdrop().
| | |
| | |   This approach has a problem: the MM of the userspace process may live
| | |   longer than the userspace process itself, as a kernel thread uses the
| | |   MM of the userspace process which was running on the CPU where the
| | |   kernel thread was scheduled. If this happens, the MM remains
| | |   referenced until this exact kernel thread wakes up again and releases
| | |   the very last reference to the MM; on an idle system this can even
| | |   take hours.
| | |
| | |   This moves preregistered regions tracking from MM to VFIO; instead of
| | |   using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
| | |   added so each container releases regions which it has pre-registered.
| | |
| | |   This changes the userspace interface to return EBUSY if a memory
| | |   region is already registered in a container. However it should not
| | |   have any practical effect as the only userspace tool available now
| | |   registers a memory region once per container anyway.
| | |
| | |   As tce_iommu_register_pages/tce_iommu_unregister_pages are called
| | |   under container->lock, this does not need additional locking.
| | |
| | |   Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| | |   Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
| | |   Acked-by: Alex Williamson <alex.williamson@redhat.com>
| | |   Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | vfio/spapr: Reference mm in tce_container (Alexey Kardashevskiy, 2016-12-02, 1 file, -60/+100)
| | |   In some situations the userspace memory context may live longer than
| | |   the userspace process itself so if we need to do proper memory context
| | |   cleanup, we better have tce_container take a reference to mm_struct and
| | |   use it later when the process is gone (@current or @current->mm is NULL).
| | |
| | |   This references mm and stores the pointer in the container; this is done
| | |   in a new helper - tce_iommu_mm_set() - when one of the following happens:
| | |   - a container is enabled (IOMMU v1);
| | |   - a first attempt to pre-register memory is made (IOMMU v2);
| | |   - a DMA window is created (IOMMU v2).
| | |   The @mm stays referenced till the container is destroyed.
| | |
| | |   This replaces current->mm with container->mm everywhere except debug
| | |   prints.
| | |
| | |   This adds a check that current->mm is the same as the one stored in
| | |   the container to prevent userspace from making changes to a memory
| | |   context of other processes.
| | |
| | |   DMA map/unmap ioctls() do not check for @mm as they already check
| | |   for @enabled which is set after tce_iommu_mm_set() is called.
| | |
| | |   This does not reference a task as multiple threads within the same mm
| | |   are allowed to ioctl() to vfio and supposedly they will have same limits
| | |   and capabilities and if they do not, we'll just fail with no harm made.
| | |
| | |   Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| | |   Acked-by: Alex Williamson <alex.williamson@redhat.com>
| | |   Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | vfio/spapr: Postpone default window creation (Alexey Kardashevskiy, 2016-12-02, 1 file, -15/+25)
| | |   We are going to allow the userspace to configure container in
| | |   one memory context and pass container fd to another so
| | |   we are postponing memory allocations accounted against
| | |   the locked memory limit. One of previous patches took care of
| | |   it_userspace.
| | |
| | |   At the moment we create the default DMA window when the first group is
| | |   attached to a container; this is done for the userspace which is not
| | |   DDW-aware but familiar with the SPAPR TCE IOMMU v2 in the part of memory
| | |   pre-registration - such client expects the default DMA window to exist.
| | |
| | |   This postpones the default DMA window allocation till one of
| | |   the following happens:
| | |   1. first map/unmap request arrives;
| | |   2. new window is requested;
| | |   This adds noop for the case when the userspace requested removal
| | |   of the default window which has not been created yet.
| | |
| | |   Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| | |   Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
| | |   Acked-by: Alex Williamson <alex.williamson@redhat.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | vfio/spapr: Add a helper to create default DMA window (Alexey Kardashevskiy, 2016-12-02, 1 file, -45/+42)
| | |   There is already a helper to create a DMA window which allocates
| | |   a table and programs it to the IOMMU group. However
| | |   tce_iommu_take_ownership_ddw() did not use it and did these 2 calls
| | |   itself to simplify error path.
| | |
| | |   Since we are going to delay the default window creation till
| | |   the default window is accessed/removed or new window is added,
| | |   we need a helper to create a default window from all these cases.
| | |
| | |   This adds tce_iommu_create_default_window(). Since it relies on
| | |   a VFIO container to have at least one IOMMU group (for future use),
| | |   this changes tce_iommu_attach_group() to add a group to the container
| | |   first and then call the new helper.
| | |
| | |   Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
| | |   Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
| | |   Acked-by: Alex Williamson <alex.williamson@redhat.com>
| | |   Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>