| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Support for the kexec_file_load() syscall, which is a prereq for
secure and trusted boot.
- Prevent kernel execution of userspace on P9 Radix (similar to
SMEP/PXN).
- Sort the exception tables at build time, to save time at boot, and
store them as relative offsets to save space in the kernel image &
memory.
- Allow building the kernel with thin archives, which should allow us
to build an allyesconfig once some other fixes land.
- Build fixes to allow us to correctly rebuild when changing the
kernel endian from big to little or vice versa.
- Plumbing so that we can avoid doing a full mm TLB flush on P9
Radix.
- Initial stack protector support (-fstack-protector).
- Support for dumping the radix (aka. Linux) and hash page tables via
debugfs.
- Fix an oops in cxl coredump generation when cxl_get_fd() is used.
- Freescale updates from Scott: "Highlights include 8xx hugepage
support, qbman fixes/cleanup, device tree updates, and some misc
cleanup."
- Many and varied fixes and minor enhancements as always.
Thanks to:
Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"
[ And thanks to Michael, who took time off from a new baby to get this
pull request done. - Linus ]
* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
powerpc/fsl/dts: add FMan node for t1042d4rdb
powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
powerpc/fsl/dts: add QMan and BMan nodes on t1024
powerpc/fsl/dts: add QMan and BMan nodes on t1023
soc/fsl/qman: test: use DEFINE_SPINLOCK()
powerpc/fsl-lbc: use DEFINE_SPINLOCK()
powerpc/8xx: Implement support of hugepages
powerpc: get hugetlbpage handling more generic
powerpc: port 64 bits pgtable_cache to 32 bits
powerpc/boot: Request no dynamic linker for boot wrapper
soc/fsl/bman: Use resource_size instead of computation
soc/fsl/qe: use builtin_platform_driver
powerpc/fsl_pmc: use builtin_platform_driver
powerpc/83xx/suspend: use builtin_platform_driver
powerpc/ftrace: Fix the comments for ftrace_modify_code
powerpc/perf: macros for power9 format encoding
powerpc/perf: power9 raw event format encoding
powerpc/perf: update attribute_group data structure
powerpc/perf: factor out the event format field
powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
...
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:
"Highlights include 8xx hugepage support, qbman fixes/cleanup, device
tree updates, and some misc cleanup."
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The alias is used by the boot loader to perform a device tree
fixup.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
8xx uses a two level page table with two different linux page size
support (4k and 16k). 8xx also support two different hugepage sizes
512k and 8M. In order to support them on linux we define two different
page table layout.
The size of pages is in the PGD entry, using PS field (bits 28-29):
00 : Small pages (4k or 16k)
01 : 512k pages
10 : reserved
11 : 8M pages
For 512K hugepage size a pgd entry have the below format
[<hugepte address >0101] . The hugepte table allocated will contain 8
entries pointing to 512K huge pte in 4k pages mode and 64 entries in
16k pages mode.
For 8M in 16k mode, a pgd entry have the below format
[<hugepte address >1101] . The hugepte table allocated will contain 8
entries pointing to 8M huge pte.
For 8M in 4k mode, multiple pgd entries point to the same hugepte
address and pgd entry will have the below format
[<hugepte address>1101]. The hugepte table allocated will only have one
entry.
For the time being, we do not support CPU15 ERRATA when HUGETLB is
selected
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Today there are two implementations of hugetlbpages which are managed
by exclusive #ifdefs:
* FSL_BOOKE: several directory entries points to the same single hugepage
* BOOK3S: one upper level directory entry points to a table of hugepages
In preparation of implementation of hugepage support on the 8xx, we
need a mix of the two above solutions, because the 8xx needs both cases
depending on the size of pages:
* In 4k page size mode, each PGD entry covers a 4M bytes area. It means
that 2 PGD entries will be necessary to cover an 8M hugepage while a
single PGD entry will cover 8x 512k hugepages.
* In 16 page size mode, each PGD entry covers a 64M bytes area. It means
that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
hugepages will be covers by one PGD entry.
This patch:
* removes #ifdefs in favor of if/else based on the range sizes
* merges the two huge_pte_alloc() functions as they are pretty similar
* merges the two hugetlbpage_init() functions as they are pretty similar
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
standard pages when using 4k pages and a single pgtable_cache
if using other size pages.
In preparation of implementing huge pages on the 8xx, this patch
replaces the specific powerpc32 handling by the 64 bits approach.
This is done by:
* moving 64 bits pgtable_cache_add() and pgtable_cache_init()
in a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case
without PMD
* removing the 32 bits version of pgtable_cache_add() and
pgtable_cache_init()
* copying related header contents from 64 bits into both the
book3s/32 and nohash/32 header files
On the 8xx, the following cache sizes will be used:
* 4k pages mode:
- PGT_CACHE(10) for PGD
- PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
- PGT_CACHE(6) for PGD
- PGT_CACHE(7) for 512k hugepage tables
- PGT_CACHE(3) for 8M hugepage tables
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use resource_size function on resource object
instead of explicit computation.
Generated by: scripts/coccinelle/api/resource_size.cocci
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use builtin_platform_driver() helper to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use builtin_platform_driver() helper to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use builtin_platform_driver() helper to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The hardware descriptors have big endian (BE) format.
Provide proper endianness handling for the remaining
descriptor fields, to ensure they are correctly
accessed by non-BE CPUs too.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There are multiple occurences of both contextB and context_b
in different h/w descriptors, referring to the same descriptor
field known as "Context B". Stick with the "context_b" naming,
for obvious reasons including consistency (see also context_a).
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
ORP ("Order Restoration Point") mechanism not supported.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Preventively mask every access to the 'fqid' h/w field,
since it is defined as a 24-bit field, for every h/w
descriptor. Add generic accessors for this field to
ensure correct access.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
1. qm_mcc_querywq layout not used for now, so drop it;
2. queryfq, queryfq_np and alterfq are used only for accesses to
the 'fqid' field, so replace these with a generic 'fq' layout.
As a consequence, 'querycgr' turns into 'cgr' following the
same reasoning above and for consistent naming.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
opts is checked redundantly.
Move local_opts declaration inside its usage scope.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Replace dummy platform device hack with a reference to a portal's
platform device, in order to dma map the test frame for this
small unit test. The 2 qman symbols need to be exported because
this self test is a kernel module.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The qman portals are platform devices themselves, so they should
handle dma mappings. Creating a dummy platform device in order to
support dma mapping operations is not justified (and not portable).
Instead, do the mapping against the first portal that has been
initialised.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This function must only return the truth value of whether
two frame descriptors are different or not.
It does NOT have to compute some obscure difference between
fd fields and return it as an int, making sparse complain
about type conversions in the process.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use the proper accessor to get the FD address.
Accessing the internal field "addr_lo" directly is not portable
and error prone.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
context_a.hi is 32bit
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In case init_pcfg() returns with error the CI region
must be unmapped too.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
qman_query_fq*() may return other error codes apart from
-ERANGE, in which cases the error handling done by the
resource cleanup callers would be wrong. The patch
fixes the handling of those cases, and cleans up related
code inside the resource cleanup & release handlers (i.e.
replace hardcoded fqid value with corresponding define).
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use arch portable of_property_read_u32() instead, which takes
care of endianness conversions.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
These config changes build:
drivers/power/reset/gpio-poweroff.c
drivers/power/reset/gpio-restart.c
Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of fs_baudrate
into get_baudrate because it's private to this function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of brgfreq
into get_brgfreq because it's private to this function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
[scottwood: minor whitespace fixes]
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of sysfreq
into fsl_get_sys_freq because it's private to this function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The QEMU e500 board needs to enable CONFIG_E500 to correctly boot. QEMU
for ppc64 uses e5500/e6500 emulation, thus CONFIG_PPC_E500MC is required
as well.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Trivial fix to spelling mistake "uncommited" to "uncommitted" in
critical error messages.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The T4240RDB contains a W83793 hardware monitoring chip. Add a device
tree entry to make the driver attach to it, as the i2c-mpc bus driver
dropped support for class-based instantiation of devices a long time
ago.
Signed-off-by: Florian Larysch <fl@n621.de>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Florian Larysch <fl@n621.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Scott Wood <oss@buserror.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The boot wrapper performs its own relocations and does not require
PT_INTERP segment. However currently we don't tell the linker that.
Prior to binutils 2.28 that works OK. But since binutils commit
1a9ccd70f9a7 ("Fix the linker so that it will not silently generate ELF
binaries with invalid program headers. Fix readelf to report such
invalid binaries.") binutils tries to create a program header segment
due to PT_INTERP, and the link fails because there is no space for it:
ld: arch/powerpc/boot/zImage.pseries: Not enough room for program headers, try linking with -N
ld: final link failed: Bad value
So tell the linker not to do that, by passing --no-dynamic-linker.
Cc: stable@vger.kernel.org
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop dependency on ld-version.sh and massage change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when a
module is being unloaded and before the text goes away and this code
grabs the ftrace_lock mutex and removes the module functions from the
ftrace list, such that it will no longer do any modifications to that
module's text, the update to make functions be traced or not is done
under the ftrace_lock mutex as well. And by now, __init section codes
should not been modified by ftrace, because it is black listed in
recordmcount.c and ignored by ftrace.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Patch to add macros and contants to support the power9 raw
event encoding format. Couple of functions added since some of the
bits fields like PMCxCOMB and THRESH_CMP has different width and location
within MMCR* in power9.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Patch to update the power9 raw event encoding format
information and add support for the same in power9-pmu.c.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Rename the power_pmu and attribute_group variables that
support PowerISA v2.07. Add a cpu feature flag check to pick
the PowerISA v2.07 format structures to support.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Factor out the format field structure for PowerISA v2.07.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().
This approach has a problem that a MM of the userspace process
may live longer than the userspace process itself as kernel threads
use userspace process MMs which was runnning on a CPU where
the kernel thread was scheduled to. If this happened, the MM remains
referenced until this exact kernel thread wakes up again
and releases the very last reference to the MM, on an idle system this
can take even hours.
This moves preregistered regions tracking from MM to VFIO; insteads of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.
This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However it should not
have any practical effect as the only userspace tool available now
does register memory region once per container anyway.
As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).
This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.
This replaces current->mm with container->mm everywhere except debug
prints.
This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.
DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.
This does not reference a task as multiple threads within the same mm
are allowed to ioctl() to vfio and supposedly they will have same limits
and capabilities and if they do not, we'll just fail with no harm made.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We are going to allow the userspace to configure container in
one memory context and pass container fd to another so
we are postponing memory allocations accounted against
the locked memory limit. One of previous patches took care of
it_userspace.
At the moment we create the default DMA window when the first group is
attached to a container; this is done for the userspace which is not
DDW-aware but familiar with the SPAPR TCE IOMMU v2 in the part of memory
pre-registration - such client expects the default DMA window to exist.
This postpones the default DMA window allocation till one of
the folliwing happens:
1. first map/unmap request arrives;
2. new window is requested;
This adds noop for the case when the userspace requested removal
of the default window which has not been created yet.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There is already a helper to create a DMA window which does allocate
a table and programs it to the IOMMU group. However
tce_iommu_take_ownership_ddw() did not use it and did these 2 calls
itself to simplify error path.
Since we are going to delay the default window creation till
the default window is accessed/removed or new window is added,
we need a helper to create a default window from all these cases.
This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|