summaryrefslogtreecommitdiffstats
path: root/drivers/pci
Commit message (Collapse)AuthorAgeFilesLines
* PCI: Use static const struct, not const static structKrzysztof Wilczynski2019-10-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 8050f3f6645ae0f7e4c1304593f6f7eb2ee7d85c ] Move the static keyword to the front of declarations of pci_regs_behavior[] and pcie_cap_regs_behavior[], which resolves compiler warnings when building with "W=1": drivers/pci/pci-bridge-emul.c:41:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] const static struct pci_bridge_reg_behavior pci_regs_behavior[] = { ^ drivers/pci/pci-bridge-emul.c:176:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] const static struct pci_bridge_reg_behavior pcie_cap_regs_behavior[] = { ^ Link: https://lore.kernel.org/r/20190826151436.4672-1-kw@linux.com Link: https://lore.kernel.org/r/20190828131733.5817-1-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: exynos: Propagate errors for optional PHYsThierry Reding2019-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit ddd6960087d4b45759434146d681a94bbb1c54ad ] devm_of_phy_get() can fail for a number of reasons besides probe deferral. It can for example return -ENOMEM if it runs out of memory as it tries to allocate devres structures. Propagating only -EPROBE_DEFER is problematic because it results in these legitimately fatal errors being treated as "PHY not specified in DT". What we really want is to ignore the optional PHYs only if they have not been specified in DT. devm_of_phy_get() returns -ENODEV in this case, so that's the special case that we need to handle. So we propagate all errors, except -ENODEV, so that real failures will still cause the driver to fail probe. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Kukjin Kim <kgene@kernel.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: imx6: Propagate errors for optional regulatorsThierry Reding2019-10-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2170a09fb4b0f66e06e5bcdcbc98c9ccbf353650 ] regulator_get_optional() can fail for a number of reasons besides probe deferral. It can for example return -ENOMEM if it runs out of memory as it tries to allocate data structures. Propagating only -EPROBE_DEFER is problematic because it results in these legitimately fatal errors being treated as "regulator not specified in DT". What we really want is to ignore the optional regulators only if they have not been specified in DT. regulator_get_optional() returns -ENODEV in this case, so that's the special case that we need to handle. So we propagate all errors, except -ENODEV, so that real failures will still cause the driver to fail probe. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Cc: Richard Zhu <hongxing.zhu@nxp.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: kernel@pengutronix.de Cc: linux-imx@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: histb: Propagate errors for optional regulatorsThierry Reding2019-10-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 8f9e1641ba445437095411d9fda2324121110d5d ] regulator_get_optional() can fail for a number of reasons besides probe deferral. It can for example return -ENOMEM if it runs out of memory as it tries to allocate data structures. Propagating only -EPROBE_DEFER is problematic because it results in these legitimately fatal errors being treated as "regulator not specified in DT". What we really want is to ignore the optional regulators only if they have not been specified in DT. regulator_get_optional() returns -ENODEV in this case, so that's the special case that we need to handle. So we propagate all errors, except -ENODEV, so that real failures will still cause the driver to fail probe. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Cc: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: rockchip: Propagate errors for optional regulatorsThierry Reding2019-10-071-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 0e3ff0ac5f71bdb6be2a698de0ed0c7e6e738269 ] regulator_get_optional() can fail for a number of reasons besides probe deferral. It can for example return -ENOMEM if it runs out of memory as it tries to allocate data structures. Propagating only -EPROBE_DEFER is problematic because it results in these legitimately fatal errors being treated as "regulator not specified in DT". What we really want is to ignore the optional regulators only if they have not been specified in DT. regulator_get_optional() returns -ENODEV in this case, so that's the special case that we need to handle. So we propagate all errors, except -ENODEV, so that real failures will still cause the driver to fail probe. Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: Shawn Lin <shawn.lin@rock-chips.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: linux-rockchip@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: Add pci_info_ratelimited() to ratelimit PCI separatelyKrzysztof Wilczynski2019-10-071-2/+2
| | | | | | | | | | | | | | | | [ Upstream commit 7f1c62c443a453deb6eb3515e3c05650ffe0dcf0 ] Do not use printk_ratelimit() in drivers/pci/pci.c as it shares the rate limiting state with all other callers to the printk_ratelimit(). Add pci_info_ratelimited() (similar to pci_notice_ratelimited() added in the commit a88a7b3eb076 ("vfio: Use dev_printk() when possible")) and use it instead of printk_ratelimit() + pci_info(). Link: https://lore.kernel.org/r/20190825224616.8021-1-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: layerscape: Add the bar_fixed_64bit property to the endpoint driverXiaowei Bao2019-10-071-0/+1
| | | | | | | | | | | | | | | | | [ Upstream commit fd5d16531a39322c3d7433d9f8a36203c9aaeddc ] The layerscape PCIe controller have 4 BARs. BAR0 and BAR1 are 32bit, BAR2 and BAR4 are 64bit and that's a fixed hardware configuration. Set the bar_fixed_64bit variable accordingly. Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: pci-hyperv: Fix build errors on non-SYSFS configRandy Dunlap2019-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f58ba5e3f6863ea4486952698898848a6db726c2 ] Fix build errors when building almost-allmodconfig but with SYSFS not set (not enabled). Fixes these build errors: ERROR: "pci_destroy_slot" [drivers/pci/controller/pci-hyperv.ko] undefined! ERROR: "pci_create_slot" [drivers/pci/controller/pci-hyperv.ko] undefined! drivers/pci/slot.o is only built when SYSFS is enabled, so pci-hyperv.o has an implicit dependency on SYSFS. Make that explicit. Also, depending on X86 && X86_64 is not needed, so just change that to depend on X86_64. Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jake Oshins <jakeo@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Cc: linux-hyperv@vger.kernel.org Cc: Dexuan Cui <decui@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: tegra: Fix OF node reference leakNishka Dasgupta2019-10-071-7/+15
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 9e38e690ace3e7a22a81fc02652fc101efb340cf ] Each iteration of for_each_child_of_node() executes of_node_put() on the previous node, but in some return paths in the middle of the loop of_node_put() is missing thus causing a reference leak. Hence stash these mid-loop return values in a variable 'err' and add a new label err_node_put which executes of_node_put() on the previous node and returns 'err' on failure. Change mid-loop return statements to point to jump to this label to fix the reference leak. Issue found with Coccinelle. Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com> [lorenzo.pieralisi@arm.com: rewrote commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: rpaphp: Avoid a sometimes-uninitialized warningNathan Chancellor2019-10-071-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 0df3e42167caaf9f8c7b64de3da40a459979afe8 ] When building with -Wsometimes-uninitialized, clang warns: drivers/pci/hotplug/rpaphp_core.c:243:14: warning: variable 'fndit' is used uninitialized whenever 'for' loop exits because its condition is false [-Wsometimes-uninitialized] for (j = 0; j < entries; j++) { ^~~~~~~~~~~ drivers/pci/hotplug/rpaphp_core.c:256:6: note: uninitialized use occurs here if (fndit) ^~~~~ drivers/pci/hotplug/rpaphp_core.c:243:14: note: remove the condition if it is always true for (j = 0; j < entries; j++) { ^~~~~~~~~~~ drivers/pci/hotplug/rpaphp_core.c:233:14: note: initialize the variable 'fndit' to silence this warning int j, fndit; ^ = 0 fndit is only used to gate a sprintf call, which can be moved into the loop to simplify the code and eliminate the local variable, which will fix this warning. Fixes: 2fcf3ae508c2 ("hotplug/drc-info: Add code to search ibm,drc-info property") Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Joel Savitz <jsavitz@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/504 Link: https://lore.kernel.org/r/20190603221157.58502-1-natechancellor@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: Always allow probing with driver_overrideAlex Williamson2019-09-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2d2f4273cbe9058d1f5a518e5e880d27d7b3b30f upstream. Commit 0e7df22401a3 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") introduced the sriov_drivers_autoprobe attribute which allows users to prevent the kernel from automatically probing a driver for new VFs as they are created. This allows VFs to be spawned without automatically binding the new device to a host driver, such as in cases where the user intends to use the device only with a meta driver like vfio-pci. However, the current implementation prevents any use of drivers_probe with the VF while sriov_drivers_autoprobe=0. This blocks the now current general practice of setting driver_override followed by using drivers_probe to bind a device to a specified driver. The kernel never automatically sets a driver_override therefore it seems we can assume a driver_override reflects the intent of the user. Also, probing a device using a driver_override match seems outside the scope of the 'auto' part of sriov_drivers_autoprobe. Therefore, let's allow driver_override matches regardless of sriov_drivers_autoprobe, which we can do by simply testing if a driver_override is set for a device as a 'can probe' condition. Fixes: 0e7df22401a3 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") Link: https://lore.kernel.org/lkml/155742996741.21878.569845487290798703.stgit@gimli.home Link: https://lore.kernel.org/linux-pci/155672991496.20698.4279330795743262888.stgit@gimli.home/T/#u Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Revert "PCI: Add missing link delays required by the PCIe spec"Mika Westerberg2019-08-163-86/+10
| | | | | | | | | | | | | | | | | | | | | | commit 0617bdede5114a0002298b12cd0ca2b0cfd0395d upstream. Commit c2bf1fc212f7 ("PCI: Add missing link delays required by the PCIe spec") turned out causing issues with some systems either by making them unresponsive or slowing down runtime and system wide resume of PCIe devices. While root cause for the unresponsiveness is still under investigation given the amount of issues reported better to revert it for now. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204413 Link: https://lore.kernel.org/linux-pci/SL2P216MB01878BBCD75F21D882AEEA2880C60@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM/ Link: https://lore.kernel.org/linux-pci/2857501d-c167-547d-c57d-d5d24ea1f1dc@molgen.mpg.de/ Reported-by: Matthias Andree <matthias.andree@gmx.de> Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* PCI: OF: Initialize dev->fwnode appropriatelyJean-Philippe Brucker2019-08-061-0/+8
| | | | | | | | | | | | | [ Upstream commit 59b099a6c75e4ddceeaf9676422d8d91d0049755 ] For PCI devices that have an OF node, set the fwnode as well. This way drivers that rely on fwnode don't need the special case described by commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately"). Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIBYueHaibing2019-07-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 381ed79c8655a40268ee7391f716edd90c5c3a97 ] If CONFIG_GPIOLIB is not selected the compilation results in the following build errors: drivers/pci/controller/dwc/pci-dra7xx.c: In function dra7xx_pcie_probe: drivers/pci/controller/dwc/pci-dra7xx.c:777:10: error: implicit declaration of function devm_gpiod_get_optional; did you mean devm_regulator_get_optional? [-Werror=implicit-function-declaration] reset = devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH); drivers/pci/controller/dwc/pci-dra7xx.c:778:45: error: ‘GPIOD_OUT_HIGH’ undeclared (first use in this function); did you mean ‘GPIOF_INIT_HIGH’? reset = devm_gpiod_get_optional(dev, NULL, GPIOD_OUT_HIGH); ^~~~~~~~~~~~~~ GPIOF_INIT_HIGH Fix them by including the appropriate header file. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: mobiveil: Use the 1st inbound window for MEM inbound transactionsHou Zhiqiang2019-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f7fee1b42fe4f8171a4b1cad05c61907c33c53f6 ] The inbound and outbound windows have completely separate control registers sets in the host controller MMIO space. Windows control register are accessed through an MMIO base address and an offset that depends on the window index. Since inbound and outbound windows control registers are completely separate there is no real need to use different window indexes in the inbound/outbound windows initialization routines to prevent clashing. To fix this inconsistency, change the MEM inbound window index to 0, mirroring the outbound window set-up. Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> [lorenzo.pieralisi@arm.com: update commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com> Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: mobiveil: Initialize Primary/Secondary/Subordinate bus numbersHou Zhiqiang2019-07-311-0/+6
| | | | | | | | | | | | | | | | [ Upstream commit 6f3ab451aa5c2cbff33197d82fe8489cbd55ad91 ] The reset value of Primary, Secondary and Subordinate bus numbers is zero which is a broken setup. Program a sensible default value for Primary/Secondary/Subordinate bus numbers. Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com> Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: mobiveil: Fix the Class Code fieldHou Zhiqiang2019-07-311-3/+6
| | | | | | | | | | | | | | | | | [ Upstream commit 0122af0a08243f344a438f924e5c2486486555b3 ] Fix up the Class Code field in PCI configuration space and set it to PCI_CLASS_BRIDGE_PCI. Move the Class Code fixup to function mobiveil_host_init() where it belongs. Fixes: 9af6bcb11e12 ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com> Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: mobiveil: Fix PCI base address in MEM/IO outbound windowsHou Zhiqiang2019-07-311-2/+3
| | | | | | | | | | | | | | | | | | [ Upstream commit f99536e9d2f55996038158a6559d4254a7cc1693 ] The outbound memory windows PCI base addresses should be taken from the 'ranges' property of DT node to setup MEM/IO outbound windows decoding correctly instead of being hardcoded to zero. Update the code to retrieve the PCI base address for each range and use it to program the outbound windows address decoders Fixes: 9af6bcb11e12 ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Minghuan Lian <Minghuan.Lian@nxp.com> Reviewed-by: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: xilinx-nwl: Fix Multi MSI data programmingBharat Kumar Gogada2019-07-311-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d35ef11 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: sysfs: Ignore lockdep for remove attributeMarek Vasut2019-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit dc6b698a86fe40a50525433eb8e92a267847f6f9 ] With CONFIG_PROVE_LOCKING=y, using sysfs to remove a bridge with a device below it causes a lockdep warning, e.g., # echo 1 > /sys/class/pci_bus/0000:00/device/0000:00:00.0/remove ============================================ WARNING: possible recursive locking detected ... pci_bus 0000:01: busn_res: [bus 01] is released The remove recursively removes the subtree below the bridge. Each call uses a different lock so there's no deadlock, but the locks were all created with the same lockdep key so the lockdep checker can't tell them apart. Mark the "remove" sysfs attribute with __ATTR_IGNORE_LOCKDEP() as it is safe to ignore the lockdep check between different "remove" kernfs instances. There's discussion about a similar issue in USB at [1], which resulted in 356c05d58af0 ("sysfs: get rid of some lockdep false positives") and e9b526fe7048 ("i2c: suppress lockdep warning on delete_device"), which do basically the same thing for USB "remove" and i2c "delete_device" files. [1] https://lore.kernel.org/r/Pine.LNX.4.44L0.1204251436140.1206-100000@iolanthe.rowland.org Link: https://lore.kernel.org/r/20190526225151.3865-1-marek.vasut@gmail.com Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com> [bhelgaas: trim commit log, details at above links] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Tejun Heo <tj@kernel.org> Cc: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: endpoint: Allocate enough space for fixed size BARAlan Mikhak2019-07-311-1/+7
| | | | | | | | | | | | | | | | [ Upstream commit f16fb16ed16c7f561e9c41c9ae4107c7f6aa553c ] PCI endpoint test function code should honor the .bar_fixed_size parameter from underlying endpoint controller drivers or results may be unexpected. In pci_epf_test_alloc_space(), check if BAR being used for test register space is a fixed size BAR. If so, allocate the required fixed size. Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: Return error if cannot probe VFAlex Williamson2019-07-311-6/+7
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 76002d8b48c4b08c9bd414517dd295e132ad910b ] Commit 0e7df22401a3 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") allows the user to specify that drivers for VFs of a PF should not be probed, but it actually causes pci_device_probe() to return success back to the driver core in this case. Therefore by all sysfs appearances the device is bound to a driver, the driver link from the device exists as does the device link back from the driver, yet the driver's probe function is never called on the device. We also fail to do any sort of cleanup when we're prohibited from probing the device, the IRQ setup remains in place and we even hold a device reference. Instead, abort with errno before any setup or references are taken when pci_device_can_probe() prevents us from trying to probe the device. Link: https://lore.kernel.org/lkml/155672991496.20698.4279330795743262888.stgit@gimli.home Fixes: 0e7df22401a3 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: qcom: Ensure that PERST is asserted for at least 100 msNiklas Cassel2019-07-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 64adde31c8e996a6db6f7a1a4131180e363aa9f2 upstream. Currently, there is only a 1 ms sleep after asserting PERST. Reading the datasheets for different endpoints, some require PERST to be asserted for 10 ms in order for the endpoint to perform a reset, others require it to be asserted for 50 ms. Several SoCs using this driver uses PCIe Mini Card, where we don't know what endpoint will be plugged in. The PCI Express Card Electromechanical Specification r2.0, section 2.2, "PERST# Signal" specifies: "On power up, the deassertion of PERST# is delayed 100 ms (TPVPERL) from the power rails achieving specified operating limits." Add a sleep of 100 ms before deasserting PERST, in order to ensure that we are compliant with the spec. Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com> Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* PCI: Do not poll for PME if the device is in D3coldMika Westerberg2019-07-261-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 000dd5316e1c756a1c028f22e01d06a38249dd4d upstream. PME polling does not take into account that a device that is directly connected to the host bridge may go into D3cold as well. This leads to a situation where the PME poll thread reads from a config space of a device that is in D3cold and gets incorrect information because the config space is not accessible. Here is an example from Intel Ice Lake system where two PCIe root ports are in D3cold (I've instrumented the kernel to log the PMCSR register contents): [ 62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff [ 62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff Since 0xffff is interpreted so that PME is pending, the root ports will be runtime resumed. This repeats over and over again essentially blocking all runtime power management. Prevent this from happening by checking whether the device is in D3cold before its PME status is read. Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: 3.6+ <stable@vger.kernel.org> # v3.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* PCI: hv: Fix a use-after-free bug in hv_eject_device_work()Dexuan Cui2019-07-261-6/+9
| | | | | | | | | | | | | | commit 4df591b20b80cb77920953812d894db259d85bd7 upstream. Fix a use-after-free in hv_eject_device_work(). Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* PCI: Add missing link delays required by the PCIe specMika Westerberg2019-07-263-10/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ] Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints: +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller \-04.0-[38-6b]-- DS hotplug port The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds. We wait for the PCIe hierarchy to enter D3cold (runtime): pcieport 0000:00:1b.0: power state changed by ACPI to D3cold When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1. For the PCIe gen3 ports we are dealing with here, the following applies: With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3). Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active): pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0 I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs: pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires. Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle: pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays. In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained). For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: PM: Avoid skipping bus-level PM on platforms without ACPIRafael J. Wysocki2019-06-261-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | There are platforms that do not call pm_set_suspend_via_firmware(), so pm_suspend_via_firmware() returns 'false' on them, but the power states of PCI devices (PCIe ports in particular) are changed as a result of powering down core platform components during system-wide suspend. Thus the pm_suspend_via_firmware() checks in pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to- idle") are not sufficient to determine that devices left in D0 during suspend will remain in D0 during resume and so the bus-level power management can be skipped for them. For this reason, introduce a new global suspend flag, PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only and replace the pm_suspend_via_firmware() checks mentioned above with checks against this flag. Fixes: 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle") Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
* Merge tag 'pci-v5.2-fixes-1' of ↵Linus Torvalds2019-06-221-0/+4
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "If an IOMMU is present, ignore the P2PDMA whitelist we added for v5.2 because we don't yet know how to support P2PDMA in that case (Logan Gunthorpe)" * tag 'pci-v5.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present
| * PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is presentLogan Gunthorpe2019-06-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Presently, there is no path to DMA map P2PDMA memory, so if a TLP targeting this memory hits the root complex and an IOMMU is present, the IOMMU will reject the transaction, even if the RC would support P2PDMA. So until the kernel knows to map these DMA addresses in the IOMMU, we should not enable the whitelist when an IOMMU is present. Link: https://lore.kernel.org/linux-pci/20190522201252.2997-1-logang@deltatee.com/ Fixes: 0f97da831026 ("PCI/P2PDMA: Allow P2P DMA between any devices under AMD ZEN Root Complex") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de>
* | Merge tag 'pm-5.2-rc6' of ↵Linus Torvalds2019-06-191-12/+35
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Prevent PCI bridges in general (and PCIe ports in particular) from being put into low-power states during system-wide suspend transitions if there are any devices in D0 below them and refine the handling of PCI devices in D0 during suspend-to-idle cycles" * tag 'pm-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PCI: PM: Skip devices in D0 for suspend-to-idle
| * | PCI: PM: Skip devices in D0 for suspend-to-idleRafael J. Wysocki2019-06-141-12/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue") attempted to avoid a problem with devices whose drivers want them to stay in D0 over suspend-to-idle and resume, but it did not go as far as it should with that. Namely, first of all, the power state of a PCI bridge with a downstream device in D0 must be D0 (based on the PCI PM spec r1.2, sec 6, table 6-1, if the bridge is not in D0, there can be no PCI transactions on its secondary bus), but that is not actively enforced during system-wide PM transitions, so use the skip_bus_pm flag introduced by commit d491f2b75237 for that. Second, the configuration of devices left in D0 (whatever the reason) during suspend-to-idle need not be changed and attempting to put them into D0 again by force is pointless, so explicitly avoid doing that. Fixes: d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue") Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
* | | mm/devm_memremap_pages: fix final page put raceDan Williams2019-06-131-14/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Logan noticed that devm_memremap_pages_release() kills the percpu_ref drops all the page references that were acquired at init and then immediately proceeds to unplug, arch_remove_memory(), the backing pages for the pagemap. If for some reason device shutdown actually collides with a busy / elevated-ref-count page then arch_remove_memory() should be deferred until after that reference is dropped. As it stands the "wait for last page ref drop" happens *after* devm_memremap_pages_release() returns, which is obviously too late and can lead to crashes. Fix this situation by assigning the responsibility to wait for the percpu_ref to go idle to devm_memremap_pages() with a new ->cleanup() callback. Implement the new cleanup callback for all devm_memremap_pages() users: pmem, devdax, hmm, and p2pdma. Link: http://lkml.kernel.org/r/155727339156.292046.5432007428235387859.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: 41e94a851304 ("add devm_memremap_pages") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | PCI/P2PDMA: track pgmap references per resource, not globallyDan Williams2019-06-131-43/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for fixing a race between devm_memremap_pages_release() and the final put of a page from the device-page-map, allocate a percpu-ref per p2pdma resource mapping. Link: http://lkml.kernel.org/r/155727338646.292046.9922678317501435597.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | PCI/P2PDMA: fix the gen_pool_add_virt() failure pathDan Williams2019-06-131-1/+3
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pci_p2pdma_add_resource() implementation immediately frees the pgmap if gen_pool_add_virt() fails. However, that means that when @dev triggers a devres release devm_memremap_pages_release() will crash trying to access the freed @pgmap. Use the new devm_memunmap_pages() to manually free the mapping in the error path. Link: http://lkml.kernel.org/r/155727337603.292046.13101332703665246702.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Fixes: 52916982af48 ("PCI/P2PDMA: Support peer-to-peer memory") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | PCI: PM: Avoid possible suspend-to-idle issueRafael J. Wysocki2019-05-271-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a PCI driver leaves the device handled by it in D0 and calls pci_save_state() on the device in its ->suspend() or ->suspend_late() callback, it can expect the device to stay in D0 over the whole s2idle cycle. However, that may not be the case if there is a spurious wakeup while the system is suspended, because in that case pci_pm_suspend_noirq() will run again after pci_pm_resume_noirq() which calls pci_restore_state(), via pci_pm_default_resume_early(), so state_saved is cleared and the second iteration of pci_pm_suspend_noirq() will invoke pci_prepare_to_sleep() which may change the power state of the device. To avoid that, add a new internal flag, skip_bus_pm, that will be set by pci_pm_suspend_noirq() when it runs for the first time during the given system suspend-resume cycle if the state of the device has been saved already and the device is still in D0. Setting that flag will cause the next iterations of pci_pm_suspend_noirq() to set state_saved for pci_pm_resume_noirq(), so that it always restores the device state from the originally saved data, and avoid calling pci_prepare_to_sleep() for the device. Fixes: 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
* | ACPI/PCI: PM: Add missing wakeup.flags.valid checksRafael J. Wysocki2019-05-271-1/+2
|/ | | | | | | | | | | | | | | | Both acpi_pci_need_resume() and acpi_dev_needs_resume() check if the current ACPI wakeup configuration of the device matches what is expected as far as system wakeup from sleep states is concerned, as reflected by the device_may_wakeup() return value for the device. However, they only should do that if wakeup.flags.valid is set for the device's ACPI companion, because otherwise the wakeup.prepare_count value for it is meaningless. Add the missing wakeup.flags.valid checks to these functions. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
* Merge tag 'pci-v5.2-changes' of ↵Linus Torvalds2019-05-1461-1317/+2375
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration changes: - Add _HPX Type 3 settings support, which gives firmware more influence over device configuration (Alexandru Gagniuc) - Support fixed bus numbers from bridge Enhanced Allocation capabilities (Subbaraya Sundeep) - Add "external-facing" DT property to identify cases where we require IOMMU protection against untrusted devices (Jean-Philippe Brucker) - Enable PCIe services for host controller drivers that use managed host bridge alloc (Jean-Philippe Brucker) - Log PCIe port service messages with pci_dev, not the pcie_device (Frederick Lawler) - Convert pciehp from pciehp_debug module parameter to generic dynamic debug (Frederick Lawler) Peer-to-peer DMA: - Add whitelist of Root Complexes that support peer-to-peer DMA between Root Ports (Christian König) Native controller drivers: - Add PCI host bridge DMA ranges for bridges that can't DMA everywhere, e.g., iProc (Srinath Mannam) - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan Chocron) - Fix Tegra MSI target allocation so DMA doesn't generate unwanted MSIs (Vidya Sagar) - Fix of_node reference leaks (Wen Yang) - Fix Hyper-V module unload & device removal issues (Dexuan Cui) - Cleanup R-Car driver (Marek Vasut) - Cleanup Keystone driver (Kishon Vijay Abraham I) - Cleanup i.MX6 driver (Andrey Smirnov) Significant bug fixes: - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude Paul) - Fix Switchtec firmware update performance issue (Wesley Sheng) - Work around Pericom switch link retraining erratum (Stefan Mätje)" * tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits) MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI PCI: pciehp: Remove pointless MY_NAME definition PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition PCI: pciehp: Remove unused dbg/err/info/warn() wrappers PCI: pciehp: Log messages with pci_dev, not pcie_device PCI: pciehp: Replace pciehp_debug module param with dyndbg PCI: pciehp: Remove pciehp_debug uses PCI/AER: Log messages with pci_dev, not pcie_device PCI/DPC: Log messages with pci_dev, not pcie_device PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info() PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info() PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc PCI: Replace printk(KERN_INFO) with pr_info(), etc PCI: Use dev_printk() when possible PCI: Cleanup setup-bus.c comments and whitespace PCI: imx6: Allow asynchronous probing PCI: dwc: Save root bus for driver remove hooks PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code PCI: dwc: Free MSI in dw_pcie_host_init() error path PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi() ...
| * Merge branch 'pci/trivial'Bjorn Helgaas2019-05-134-394/+408
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Cleanup PCI register definitions, typos, etc (Bjorn Helgaas) - Remove unnecessary use of user-space types in CPER (Bjorn Helgaas) - Cleanup setup-bus.c comments & whitespace (Nicholas Johnson) * pci/trivial: PCI: Cleanup setup-bus.c comments and whitespace CPER: Remove unnecessary use of user-space types CPER: Add UEFI spec references PCI: Fix comment typos PCI: Cleanup register definition width and whitespace # Conflicts: # drivers/pci/pci.c # drivers/pci/setup-bus.c
| | * PCI: Cleanup setup-bus.c comments and whitespaceNicholas Johnson2019-05-071-247/+249
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup comments, kernel-doc, coding style. No functional changes intended; comment and whitespace changes only. Link: https://lore.kernel.org/lkml/PS2P216MB06427E290A68CDB921FB4B2980250@PS2P216MB0642.KORP216.PROD.OUTLOOK.COM Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au> [bhelgaas: tidy related things throughout the file] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
| | * PCI: Fix comment typosBjorn Helgaas2019-04-133-160/+172
| | | | | | | | | | | | | | | | | | | | | | | | Fix spelling errors and format function comments consistently. Changes whitespace and comments only; no functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
| * | Merge branch 'pci/printk-portdrv'Bjorn Helgaas2019-05-139-89/+78
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Change some desirable KERN_DEBUG messages to KERN_INFO/KERN_ERR (Frederick Lawler) - Log PCIe port service messages with pci_dev, not the pcie_device (Frederick Lawler) - Convert pciehp from pciehp_debug module parameter to generic dynamic debug (Frederick Lawler) * pci/printk-portdrv: PCI: pciehp: Remove pointless MY_NAME definition PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition PCI: pciehp: Remove unused dbg/err/info/warn() wrappers PCI: pciehp: Log messages with pci_dev, not pcie_device PCI: pciehp: Replace pciehp_debug module param with dyndbg PCI: pciehp: Remove pciehp_debug uses PCI/AER: Log messages with pci_dev, not pcie_device PCI/DPC: Log messages with pci_dev, not pcie_device PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info() PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
| | * | PCI: pciehp: Remove pointless MY_NAME definitionBjorn Helgaas2019-05-092-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MY_NAME is only used once and offers no benefit, so remove it. Link: https://lore.kernel.org/lkml/20190509141456.223614-11-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI: pciehp: Remove pointless PCIE_MODULE_NAME definitionBjorn Helgaas2019-05-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCIE_MODULE_NAME is only used once and offers no benefit, so remove it. Link: https://lore.kernel.org/lkml/20190509141456.223614-10-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI: pciehp: Remove unused dbg/err/info/warn() wrappersFrederick Lawler2019-05-093-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the last uses of dbg() with the equivalent pr_debug(), then remove unused dbg(), err(), info(), and warn() wrappers. Link: https://lore.kernel.org/lkml/20190509141456.223614-9-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI: pciehp: Log messages with pci_dev, not pcie_deviceFrederick Lawler2019-05-095-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt(). Example output change: - pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press + pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press Link: https://lore.kernel.org/lkml/20190509141456.223614-8-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI: pciehp: Replace pciehp_debug module param with dyndbgFrederick Lawler2019-05-092-13/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously pciehp debug messages were enabled by the pciehp_debug module parameter, e.g., by booting with this kernel command line option: pciehp.pciehp_debug=1 Convert this mechanism to use the generic dynamic debug (dyndbg) feature. After this commit, pciehp debug messages are enabled by building the kernel with CONFIG_DYNAMIC_DEBUG=y and booting with this command line option: dyndbg="file pciehp* +p" The dyndbg facility is much more flexible: messages can be enabled at boot- or run-time based on the file name, function name, line number, message test, etc. See Documentation/admin-guide/dynamic-debug-howto.rst for more details. Link: https://lore.kernel.org/lkml/20190509141456.223614-7-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> [bhelgaas: commit log, comment, remove pciehp_debug parameter] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI: pciehp: Remove pciehp_debug usesBjorn Helgaas2019-05-091-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're about to convert pciehp to the dyndbg mechanism, which means we can eventually remove pciehp_debug. Replace uses of pciehp_debug with dbg() and ctrl_dbg(), which check pciehp_debug internally. Link: https://lore.kernel.org/lkml/20190509141456.223614-6-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI/AER: Log messages with pci_dev, not pcie_deviceFrederick Lawler2019-05-092-17/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt(). Example output change: - aer 0000:00:00.0:pci002: AER enabled with IRQ ... + pcieport 0000:00:00.0: AER: enabled with IRQ ... Link: https://lore.kernel.org/lkml/20190509141456.223614-5-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
| | * | PCI/DPC: Log messages with pci_dev, not pcie_deviceFrederick Lawler2019-05-091-19/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt(). Example output change: - dpc 0000:00:01.1:pcie008: DPC error containment capabilities... + pcieport 0000:00:01.1: DPC: error containment capabilities... Link: https://lore.kernel.org/lkml/20190509141456.223614-4-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
| | * | PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()Frederick Lawler2019-05-091-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace dev_printk(KERN_DEBUG) with dev_info() or dev_err() to be more consistent with other logging. These could be converted to dev_dbg(), but that depends on CONFIG_DYNAMIC_DEBUG and DEBUG, and we want most of these messages to *always* be in the dmesg log. Also, use dev_fmt() to add the service name. Example output change: - pcieport 0000:80:10.0: Signaling PME with IRQ ... + pcieport 0000:80:10.0: PME: Signaling with IRQ ... Link: https://lore.kernel.org/lkml/20190509141456.223614-3-helgaas@kernel.org Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>