diff options
author | Alexey Kardashevskiy <aik@ozlabs.ru> | 2022-06-29 16:06:14 +1000 |
---|---|---|
committer | Michael Ellerman <mpe@ellerman.id.au> | 2022-07-28 16:22:13 +1000 |
commit | b1fc44eaa9ba31e28c4125d6b9205a3582b47b5d (patch) | |
tree | 4d5d2d25d7a6755f123f93a5d7abc1873baeed8c /arch/powerpc/kexec | |
parent | c7255058b5430b5c42932383bd8887d591e7973a (diff) | |
download | linux-b1fc44eaa9ba31e28c4125d6b9205a3582b47b5d.tar.gz linux-b1fc44eaa9ba31e28c4125d6b9205a3582b47b5d.tar.bz2 linux-b1fc44eaa9ba31e28c4125d6b9205a3582b47b5d.zip |
pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window
The pseries platform uses 32bit default DMA window (always 4K pages) and
optional 64bit DMA window available via DDW ("Dynamic DMA Windows"),
64K or 2M pages. For ages the default one was not removed and a huge
window was created in addition. Things changed with SRIOV-enabled
PowerVM which creates a default-and-bigger DMA window in 64bit space
(still using 4K pages) for IOV VFs so certain OSes do not need to use
the DDW API in order to utilize all available TCE budget.
Linux on the other hand removes the default window and creates a bigger
one (with more TCEs or/and a bigger page size - 64K/2M) in a bid to map
the entire RAM, and if the new window size is smaller than that - it
still uses this new bigger window. The result is that the default window
is removed but the "ibm,dma-window" property is not.
When kdump is invoked, the existing code tries reusing the existing 64bit
DMA window which location and parameters are stored in the device tree but
this fails as the new property does not make it to the kdump device tree
blob. So the code falls back to the default window which does not exist
anymore although the device tree says that it does. The result of that
is that PCI devices become unusable and cannot be used for kdumping.
This preserves the DMA64 and DIRECT64 properties in the device tree blob
for the crash kernel. Since the crash kernel setup is done after device
drivers are loaded and probed, the proper DMA config is stored at least
for boot time devices.
Because DDW window is optional and the code configures the default window
first, the existing code creates an IOMMU table descriptor for
the non-existing default DMA window. It is harmless for kdump as it does
not touch the actual window (only reads what is mapped and marks those IO
pages as used) but it is bad for kexec which clears it thinking it is
a smaller default window rather than a bigger DDW window.
This removes the "ibm,dma-window" property from the device tree after
a bigger window is created and the crash kernel setup picks it up.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220629060614.1680476-1-aik@ozlabs.ru
Diffstat (limited to 'arch/powerpc/kexec')
-rw-r--r-- | arch/powerpc/kexec/file_load_64.c | 54 |
1 files changed, 54 insertions, 0 deletions
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c index b4981b651d9a..5d2c22aa34fb 100644 --- a/arch/powerpc/kexec/file_load_64.c +++ b/arch/powerpc/kexec/file_load_64.c @@ -1038,6 +1038,48 @@ out: return ret; } +static int copy_property(void *fdt, int node_offset, const struct device_node *dn, + const char *propname) +{ + const void *prop, *fdtprop; + int len = 0, fdtlen = 0, ret; + + prop = of_get_property(dn, propname, &len); + fdtprop = fdt_getprop(fdt, node_offset, propname, &fdtlen); + + if (fdtprop && !prop) + ret = fdt_delprop(fdt, node_offset, propname); + else if (prop) + ret = fdt_setprop(fdt, node_offset, propname, prop, len); + + return ret; +} + +static int update_pci_dma_nodes(void *fdt, const char *dmapropname) +{ + struct device_node *dn; + int pci_offset, root_offset, ret = 0; + + if (!firmware_has_feature(FW_FEATURE_LPAR)) + return 0; + + root_offset = fdt_path_offset(fdt, "/"); + for_each_node_with_property(dn, dmapropname) { + pci_offset = fdt_subnode_offset(fdt, root_offset, of_node_full_name(dn)); + if (pci_offset < 0) + continue; + + ret = copy_property(fdt, pci_offset, dn, "ibm,dma-window"); + if (ret < 0) + break; + ret = copy_property(fdt, pci_offset, dn, dmapropname); + if (ret < 0) + break; + } + + return ret; +} + /** * setup_new_fdt_ppc64 - Update the flattend device-tree of the kernel * being loaded. @@ -1099,6 +1141,18 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, if (ret < 0) goto out; +#define DIRECT64_PROPNAME "linux,direct64-ddr-window-info" +#define DMA64_PROPNAME "linux,dma64-ddr-window-info" + ret = update_pci_dma_nodes(fdt, DIRECT64_PROPNAME); + if (ret < 0) + goto out; + + ret = update_pci_dma_nodes(fdt, DMA64_PROPNAME); + if (ret < 0) + goto out; +#undef DMA64_PROPNAME +#undef DIRECT64_PROPNAME + /* Update memory reserve map */ ret = get_reserved_memory_ranges(&rmem); if (ret) |