summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/platforms/pseries/msi.c
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: Enable support for 32 bit MSI-X vectorsBrian King2024-03-031-3/+8
| | | | | | | | | | | | Some devices are not capable of addressing 64 bits via DMA, which includes MSI-X vectors. This allows us to ensure these devices use MSI-X vectors in 32 bit space. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240117214632.134539-1-brking@linux.vnet.ibm.com
* powerpc/rtas: arch-wide function token lookup conversionsNathan Lynch2023-02-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the tokens for all implemented RTAS functions now available via rtas_function_token(), which is optimal and safe for arbitrary contexts, there is no need to use rtas_token() or cache its result. Most conversions are trivial, but a few are worth describing in more detail: * Error injection token comparisons for lockdown purposes are consolidated into a simple predicate: token_is_restricted_errinjct(). * A couple of special cases in block_rtas_call() do not use rtas_token() but perform string comparisons against names in the function table. These are converted to compare against token values instead, which is logically equivalent but less expensive. * The lookup for the ibm,os-term token can be deferred until needed, instead of caching it at boot to avoid device tree traversal during panic. * Since rtas_function_token() accesses a read-only data structure without taking any locks, xmon's lookup of set-indicator can be performed as needed instead of cached at startup. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-20-26929c8cce78@linux.ibm.com
* powerpc/pseries/msi: Use msi_domain_ops:: Msi_post_free()Thomas Gleixner2022-11-171-5/+2
| | | | | | | | | | | | | Use the new msi_post_free() callback which is invoked after the interrupts have been freed to tell the hypervisor about the shutdown. This allows to remove the exposure of __msi_domain_free_irqs(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221111122014.120489922@linutronix.de
* powerpc: Add missing headersChristophe Leroy2022-05-081-0/+1
| | | | | | | | | | | | | Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h, asm/pci.h etc... Include the needed headers, and remove asm/prom.h when it was needed exclusively for pulling necessary headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
* powerpc/pseries/msi: Let core code check for contiguous entriesThomas Gleixner2021-12-161-25/+8
| | | | | | | | | | | Set the domain info flag and remove the check. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20211210221814.720998720@linutronix.de
* PCI/MSI: Use msi_desc::msi_indexThomas Gleixner2021-12-161-2/+2
| | | | | | | | | | | | | | | | | | The usage of msi_desc::pci::entry_nr is confusing at best. It's the index into the MSI[X] descriptor table. Use msi_desc::msi_index which is shared between all MSI incarnations instead of having a PCI specific storage for no value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Nishanth Menon <nm@ti.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211210221814.602911509@linutronix.de
* powerpc/pseries/msi: Use PCI device propertiesThomas Gleixner2021-12-161-2/+1
| | | | | | | | | instead of fiddling with MSI descriptors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20211210221813.556202506@linutronix.de
* genirq/msi, treewide: Use a named struct for PCI/MSI attributesThomas Gleixner2021-12-091-3/+3
| | | | | | | | | | | | | | | The unnamed struct sucks and is in the way of further cleanups. Stick the PCI related MSI data into a real data structure and cleanup all users. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211206210224.374863119@linutronix.de
* powerpc/pseries/msi: Add an empty irq_write_msi_msg() handlerCédric Le Goater2021-10-071-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPR drivers tests for MSI support at probe time with MSI vector 0 and when done, frees the IRQ with free_irq(). This test was introduced by 95fecd90397e ("ipr: add test for MSI interrupt support") as an improvement of commit 5a9ef25b14d3 ("[SCSI] ipr: add MSI support") because a boot failure was reported on a Bimini PowerPC system: https://lore.kernel.org/r/1242926159.3007.5.camel@localhost.localdomain It was finally decided to remove MSI support on Bimini systems in 6eb0ac03899a ("powerpc/maple: Add a quirk to disable MSI for IPR on Bimini"). Linux 5.15-rc1 added MSI domain support to the pseries machine and when free_irq is called() in the driver, msi_domain_deactivate() also is. This resets the MSI table entry of the associate vector by calling __pci_write_msi_msg() with an empty message and breaks any further activation of the same vector. In the case of the IPR driver, it breaks the initialization sequence of the IOA. Introduce an empty irq_write_msi_msg() handler in the MSI domain of the pseries machine to avoid clearing the MSI vector entry. Updating the entry is not strictly necessary since it is initialized by the underlying hypervisor, PowerVM or QEMU/KVM. Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains") Signed-off-by: Cédric Le Goater <clg@kaod.org> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Tested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> [mpe: Tweak comment wording and formatting slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210930102535.1047230-1-clg@kaod.org
* powerpc/pseries/pci: Drop unused MSI codeCédric Le Goater2021-08-101-87/+0
| | | | | | | | | MSIs should be fully managed by the PCI and IRQ subsystems now. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-25-clg@kaod.org
* powerpc/pci: Drop XIVE restriction on MSI domainsCédric Le Goater2021-08-101-4/+0
| | | | | | | | | | The PowerNV and pSeries platforms now have support for both the XICS and XIVE IRQ domains. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-23-clg@kaod.org
* powerpc/pseries/pci: Add support of MSI domains to PHB hotplugCédric Le Goater2021-08-101-0/+10
| | | | | | | | | | Simply allocate or release the MSI domains when a PHB is inserted in or removed from the machine. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-11-clg@kaod.org
* powerpc/pseries/pci: Add a msi_free() handler to clear XIVE dataCédric Le Goater2021-08-101-1/+15
| | | | | | | | | | | | | | | | The MSI domain clears the IRQ with msi_domain_free(), which calls irq_domain_free_irqs_top(), which clears the handler data. This is a problem for the XIVE controller since we need to unmap MMIO pages and free a specific XIVE structure. The 'msi_free()' handler is called before irq_domain_free_irqs_top() when the handler data is still available. Use that to clear the XIVE controller data. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-10-clg@kaod.org
* powerpc/pseries/pci: Add a domain_free_irqs() handlerCédric Le Goater2021-08-101-0/+16
| | | | | | | | | | The RTAS firmware can not disable one MSI at a time. It's all or nothing. We need a custom free IRQ handler for that. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-9-clg@kaod.org
* powerpc/pseries/pci: Add MSI domainsCédric Le Goater2021-08-101-0/+185
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Two IRQ domains are added on top of default machine IRQ domain. First, the top level "pSeries-PCI-MSI" domain deals with the MSI specificities. In this domain, the HW IRQ numbers are generated by the PCI MSI layer, they compose a unique ID for an MSI source with the PCI device identifier and the MSI vector number. These numbers can be quite large on a pSeries machine running under the IBM Hypervisor and /sys/kernel/irq/ and /proc/interrupts will require small fixes to show them correctly. Second domain is the in-the-middle "pSeries-MSI" domain which acts as a proxy between the PCI MSI subsystem and the machine IRQ subsystem. It usually allocate the MSI vector numbers but, on pSeries machines, this is done by the RTAS FW and RTAS returns IRQ numbers in the IRQ number space of the machine. This is why the in-the-middle "pSeries-MSI" domain has the same HW IRQ numbers as its parent domain. Only the XIVE (P9/P10) parent domain is supported for now. We still need to add support for IRQ domain hierarchy under XICS. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-6-clg@kaod.org
* powerpc/pseries/pci: Introduce rtas_prepare_msi_irqs()Cédric Le Goater2021-08-101-4/+19
| | | | | | | | | | | | | This splits the routine setting the MSIs in two parts: allocation of MSIs for the PCI device at the FW level (RTAS) and the actual mapping and activation of the IRQs. rtas_prepare_msi_irqs() will serve as a handler for the PCI MSI domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-3-clg@kaod.org
* powerpc/pseries/pci: Introduce __find_pe_total_msi()Cédric Le Goater2021-08-101-2/+7
| | | | | | | | | It will help to size the PCI MSI domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701132750.1475580-2-clg@kaod.org
* powerpc/pseries: Don't enforce MSI affinity with kdumpGreg Kurz2021-03-011-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per-design with managed IRQs. This can be a problem with multi-queue block devices driven by blk-mq : such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point. This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter. The issue comes from a combination of factors: - discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue) - CPU#0 offline (i.e. kdump always succeed with CPU#0) Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach : fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only. Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org
* powerpc/pseries: Pass MSI affinity to irq_create_mapping()Laurent Vivier2020-11-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With virtio multiqueue, normally each queue IRQ is mapped to a CPU. Commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity") exposed an existing shortcoming of the arch code by moving virtio_scsi to the automatic IRQ affinity assignment. The affinity is correctly computed in msi_desc but this is not applied to the system IRQs. It appears the affinity is correctly passed to rtas_setup_msi_irqs() but lost at this point and never passed to irq_domain_alloc_descs() (see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation")) because irq_create_mapping() doesn't take an affinity parameter. Use the new irq_create_mapping_affinity() function, which allows to forward the affinity setting from rtas_setup_msi_irqs() to irq_domain_alloc_descs(). With this change, the virtqueues are correctly dispatched between the CPUs on pseries. Fixes: e75eafb9b039 ("genirq/msi: Switch to new irq spreading infrastructure") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201126082852.1178497-3-lvivier@redhat.com
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441Thomas Gleixner2019-06-051-6/+1
| | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 of the license extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 315 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Armijn Hemel <armijn@tjaldur.nl> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* powerpc/eeh: Cleanup list_head field namesSam Bobroff2018-10-131-1/+2
| | | | | | | | | | | | | | | | Instances of struct eeh_pe are placed in a tree structure using the fields "child_list" and "child", so place these next to each other in the definition. The field "child" is a list entry, so remove the unnecessary and misleading use of the list initializer, LIST_HEAD(), on it. The eeh_dev struct contains two list entry fields, called "list" and "rmv_list". Rename them to "entry" and "rmv_entry" and, as above, stop initializing them with LIST_HEAD(). Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/pci: Remove OF node back pointer from pci_dnAlexey Kardashevskiy2017-08-311-9/+2
| | | | | | | | | | | | | | | | | | The check_req() helper uses pci_get_pdn() to get an OF node pointer. pci_get_pdn() returns a pci_dn pointer which either: 1) from the OF node returned by pci_device_to_OF_node(); 2) from the parent child_list where entries don't have OF node pointers. Since check_req() does not care about 2), it can call pci_device_to_OF_node() directly, hence the change. The find_pe_dn() helper uses embedded pci_dn to get an OF node which is also stored in edev->pdev so let's take a shortcut and call pci_device_to_OF_node() directly. With these 2 changes, we can finally get rid of the OF node back pointer. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Convert to using %pOF instead of full_nameRob Herring2017-08-231-6/+6
| | | | | | | | | | | | | | | | | | Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anatolij Gustschin <agust@denx.de> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Remove all usages of NO_IRQMichael Ellerman2016-09-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NO_IRQ has been == 0 on powerpc for just over ten years (since commit 0ebfff1491ef ("[POWERPC] Add new interrupt mapping core and change platforms to use it")). It's also 0 on most other arches. Although it's fairly harmless, every now and then it causes confusion when a driver is built on powerpc and another arch which doesn't define NO_IRQ. There's at least 6 definitions of NO_IRQ in drivers/, at least some of which are to work around that problem. So we'd like to remove it. This is fairly trivial in the arch code, we just convert: if (irq == NO_IRQ) to if (!irq) if (irq != NO_IRQ) to if (irq) irq = NO_IRQ; to irq = 0; return NO_IRQ; to return 0; And a few other odd cases as well. At least for now we keep the #define NO_IRQ, because there is driver code that uses NO_IRQ and the fixes to remove those will go via other trees. Note we also change some occurrences in PPC sound drivers, drivers/ps3, and drivers/macintosh. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/pci: Export pci_traverse_device_nodes()Gavin Shan2016-05-111-2/+2
| | | | | | | | | | | | | | This renames traverse_pci_devices() to pci_traverse_device_nodes(). The function traverses all subordinate device nodes of the specified one. Also, below cleanup applied to the function. No logical changes introduced. * Rename "pre" to "fn". * Avoid assignment in if condition reported from checkpatch.pl. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/PCI: Use for_pci_msi_entry() to access MSI device listJiang Liu2015-07-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use accessor for_each_pci_msi_entry() to access MSI device list, so we could easily move msi_list from struct pci_dev into struct device later. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Stuart Yoder <stuart.yoder@freescale.com> Cc: Yijing Wang <wangyijing@huawei.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Olof Johansson <olof@lixom.net> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Daniel Axtens <dja@axtens.net> Cc: Wei Yang <weiyang@linux.vnet.ibm.com> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Cc: Tudor Laurentiu <b10716@freescale.com> Cc: Hongtao Jia <hongtao.jia@freescale.com> Link: http://lkml.kernel.org/r/1436428847-8886-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* powerpc/pseries: Move MSI-related ops to pci_controller_opsDaniel Axtens2015-06-021-3/+13
| | | | | | | | | | | Move the pseries platform to use the pci_controller_ops structure rather than ppc_md for MSI related PCI controller operations We need to iterate all PHBs because the MSI setup happens later than find_and_init_phbs() - mpe. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/eeh: Remove device_node dependencyGavin Shan2015-03-241-2/+4
| | | | | | | | | | The patch removes struct eeh_dev::dn and the corresponding helper functions: eeh_dev_to_of_node() and of_node_to_eeh_dev(). Instead, eeh_dev_to_pdn() and pdn_to_eeh_dev() should be used to get the pdn, which might contain device_node on PowerNV platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'irq-irqdomain-for-linus' of ↵Linus Torvalds2014-12-101-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq domain updates from Thomas Gleixner: "The real interesting irq updates: - Support for hierarchical irq domains: For complex interrupt routing scenarios where more than one interrupt related chip is involved we had no proper representation in the generic interrupt infrastructure so far. That made people implement rather ugly constructs in their nested irq chip implementations. The main offenders are x86 and arm/gic. To distangle that mess we have now hierarchical irqdomains which seperate the various interrupt chips and connect them via the hierarchical domains. That keeps the domain specific details internal to the particular hierarchy level and removes the criss/cross referencing of chip internals. The resulting hierarchy for a complex x86 system will look like this: vector mapped: 74 msi-0 mapped: 2 dmar-ir-1 mapped: 69 ioapic-1 mapped: 4 ioapic-0 mapped: 20 pci-msi-2 mapped: 45 dmar-ir-0 mapped: 3 ioapic-2 mapped: 1 pci-msi-1 mapped: 2 htirq mapped: 0 Neither ioapic nor pci-msi know about the dmar interrupt remapping between themself and the vector domain. If interrupt remapping is disabled ioapic and pci-msi become direct childs of the vector domain. In hindsight we should have done that years ago, but in hindsight we always know better :) - Support for generic MSI interrupt domain handling We have more and more non PCI related MSI interrupts, so providing a generic infrastructure for this is better than having all affected architectures implementing their own private hacks. - Support for PCI-MSI interrupt domain handling, based on the generic MSI support. This part carries the pci/msi branch from Bjorn Helgaas pci tree to avoid a massive conflict. The PCI/MSI parts are acked by Bjorn. I have two more branches on top of this. The full conversion of x86 to hierarchical domains and a partial conversion of arm/gic" * 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) genirq: Move irq_chip_write_msi_msg() helper to core PCI/MSI: Allow an msi_controller to be associated to an irq domain PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain PCI/MSI: Enhance core to support hierarchy irqdomain PCI/MSI: Move cached entry functions to irq core genirq: Provide default callbacks for msi_domain_ops genirq: Introduce msi_domain_alloc/free_irqs() asm-generic: Add msi.h genirq: Add generic msi irq domain support genirq: Introduce callback irq_chip.irq_write_msi_msg genirq: Work around __irq_set_handler vs stacked domains ordering issues irqdomain: Introduce helper function irq_domain_add_hierarchy() irqdomain: Implement a method to automatically call parent domains alloc/free genirq: Introduce helper irq_domain_set_info() to reduce duplicated code genirq: Split out flow handler typedefs into seperate header file genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip genirq: Add more helper functions to support stacked irq_chip genirq: Introduce helper functions to support stacked irq_chip irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF ...
| * PCI/MSI: Rename __read_msi_msg() to __pci_read_msi_msg()Jiang Liu2014-11-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename __read_msi_msg() to __pci_read_msi_msg() and kill unused read_msi_msg(). It's a preparation to separate generic MSI code from PCI core. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | powerpc/pseries: Honor the generic "no_64bit_msi" flagBenjamin Herrenschmidt2014-11-241-1/+1
|/ | | | | | | Instead of the arch specific quirk which we are deprecating Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org>
* MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()Yijing Wang2014-10-011-1/+1
| | | | | | | | | | | | | | rtas_setup_msi_irqs() already has the struct msi_desc pointer required by __read_msi_msg(), so call it directly instead of having read_msi_msg() look it up from the IRQ. No functional change. [bhelgaas: changelog] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: linuxppc-dev@lists.ozlabs.org
* PCI/MSI/PPC: Remove arch_msi_check_device()Alexander Gordeev2014-10-011-26/+16
| | | | | | | | | Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs(). This makes the code more compact and allows removing arch_msi_check_device() from generic MSI code. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall()Michael Ellerman2014-07-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hcall_inst_init - Checking FW_FEATURE_LPAR which is set on ps3 & celleb. * mobility_sysfs_init - created sysfs files unconditionally - but no effect due to ENOSYS from rtas_ibm_suspend_me() * apo_pm_init - created sysfs, allows write - nothing checks the value written to though * alloc_dispatch_log_kmem_cache - creating kmem_cache on non-pseries machines Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Fix endian issues in MSI codeAnton Blanchard2013-12-131-13/+15
| | | | | | | | | | | The MSI code is miscalculating quotas in little endian mode. Add required byteswaps to fix this. Before we claimed a quota of 65536, after the patch we see the correct value of 256. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Make 32-bit MSI quirk work on systems lacking firmware supportBrian King2013-05-241-3/+37
| | | | | | | | | | | | | | Recent commit e61133dda480062d221f09e4fc18f66763f8ecd0 added support for a new firmware feature to force an adapter to use 32 bit MSIs. However, this firmware is not available for all systems. The hack below allows devices needing 32 bit MSIs to work on these systems as well. It is careful to only enable this on Gen2 slots, which should limit this to configurations where this hack is needed and tested to work. [Small change to factor out the hack into a separate function -- BenH] Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Make radeon 32-bit MSI quirk work on powernvBenjamin Herrenschmidt2013-05-241-32/+3
| | | | | | | | | This moves the quirk itself to pci_64.c as to get built on all ppc64 platforms (the only ones with a pci_dn), factors the two implementations of get_pdn() into a single pci_get_dn() and use the quirk to do 32-bit MSIs on IODA based powernv platforms. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Force 32 bit MSIs for devices that require itBrian King2013-05-061-3/+18
| | | | | | | | | | | The following patch implements a new PAPR change which allows the OS to force the use of 32 bit MSIs, regardless of what the PCI capabilities indicate. This is required for some devices that advertise support for 64 bit MSIs but don't actually support them. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Fix oops with MSIs when missing EEH PEsAlexey Kardashevskiy2012-11-231-1/+2
| | | | | | | | | The new EEH code introduced a small regression, if the EEH PEs are missin (which happens currently in qemu for example), it will deref a NULL pointer in the MSI code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Trace error based on PE from beginningGavin Shan2012-09-101-1/+5
| | | | | | | | | | | | | | | There're 2 conditions to trigger EEH error detection: invalid value returned from reading I/O or config space. On each case, the function eeh_dn_check_failure will be called to initialize EEH event and put it into the poll for further processing. The patch changes the function for a little bit so that the EEH error will be traced based on PE instead of EEH device any more. Also, the function eeh_find_device_pe() has been removed since the eeh device is tracing the PE by struct eeh_dev::pe. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Round up MSI-X requestsAnton Blanchard2012-09-071-1/+19
| | | | | | | | | | | | | | The pseries firmware currently refuses any non power of two MSI-X request. Unfortunately most network drivers end up asking for that because they want a power of two for RX queues and one or two extra for everything else. This patch rounds up the firmware request to the next power of two if the quota allows it. If this fails we fall back to using the original request size. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/eeh: Cleanup function names in the EEH coreGavin Shan2012-03-091-1/+1
| | | | | | | | | | | | | The EEH has been implemented on pSeries platform. The original code looks a little bit nasty. The patch does cleanup on the current EEH implementation so that it looks more clean. * Try adding prefix "eeh" for functions. * Some function names have been adjusted so that they looks shorter and meaningful. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Convert to new irq_* function namesThomas Gleixner2011-03-291-2/+2
| | | | | | Scripted with coccinelle. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* powerpc/pseries: Disable MSI using new interface if possibleNishanth Aravamudan2011-03-111-2/+12
| | | | | | | | | | | | | | | On upcoming hardware, we have a PCI adapter with two functions, one of which uses MSI and the other uses MSI-X. This adapter, when MSI is disabled using the "old" firmware interface (RTAS_CHANGE_FN), still signals an MSI-X interrupt and triggers an EEH. We are working with the vendor to ensure that the hardware is not at fault, but if we use the "new" interface (RTAS_CHANGE_MSI_FN) to disable MSI, we also automatically disable MSI-X and the adapter does not appear to signal any stray MSI-X interrupt. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pci: Fix regression in powerpc MSI-XAndre Detsch2009-11-051-2/+0
| | | | | | | | | | | | | | Patch f598282f5145036312d90875d0ed5c14b49fd8a7 exposed a problem in powerpc MSI-X functionality, making network interfaces such as ixgbe and cxgb3 stop to work when MSI-X is enabled. RX interrupts were not being generated. The problem was caused because MSI irq was not being effectively unmasked after device initialization. Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Reject discontiguous/non-zero based MSI-X requestsMichael Ellerman2009-03-111-0/+24
| | | | | | | | | There's no way for us to express to firmware that we want a discontiguous, or non-zero based, range of MSI-X entries. So we must reject such requests. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Implement a quota system for MSIsMichael Ellerman2009-02-231-2/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are hardware limitations on the number of available MSIs, which firmware expresses using a property named "ibm,pe-total-#msi". This property tells us how many MSIs are available for devices below the point in the PCI tree where we find the property. For old firmwares which don't have the property, we assume there are 8 MSIs available per "partitionable endpoint" (PE). The PE can be found using existing EEH code, which uses the methods described in PAPR. For our purposes we want the parent of the node that's identified using this method. When a driver requests n MSIs for a device, we first establish where the "ibm,pe-total-#msi" property above that device is, or we find the PE if the property is not found. In both cases we call this node the "pe_dn". We then count all non-bridge devices below the pe_dn, to establish how many devices in total may need MSIs. The quota is then simply the total available divided by the number of devices, if the request is less than or equal to the quota, the request is fine and we're done. If the request is greater than the quota, we try to determine if there are any "spare" MSIs which we can give to this device. Spare MSIs are found by looking for other devices which can never use their full quota, because their "req#msi(-x)" property is less than the quota. If we find any spare, we divide the spares by the number of devices that could request more than their quota. This ensures the spare MSIs are spread evenly amongst all over-quota requestors. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Return req#msi(-x) if request is largerMichael Ellerman2009-02-231-1/+5
| | | | | | | | | | | | If a driver asks for more MSIs than the devices "req#msi(-x)" property, we currently return -ENOSPC. This doesn't give the driver any chance to make a new request with a number that might work. So if "req#msi(-x)" is less than the request, return its value. To be 100% safe, make sure we return an error if req_msi == 0. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Return the number of MSIs we could allocateMichael Ellerman2009-02-111-7/+9
| | | | | | | | | | If we can't allocate the requested number of MSIs, we can still tell the generic code how many we were able to allocate. That can then be passed onto the driver, allowing it to request that many in future, and probably succeeed. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/pseries: Check for MSI-X also in rtas_msi_pci_irq_fixup()Michael Ellerman2009-02-111-2/+2
| | | | | | | | We also need to check that the device isn't using MSI-X in the irq fixup routine, otherwise we might leave MSI-Xs configured at boot. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>