summaryrefslogtreecommitdiffstats
path: root/drivers/perf
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'arm64-upstream' of ↵Linus Torvalds2022-05-2310-144/+920
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Initial support for the ARMv9 Scalable Matrix Extension (SME). SME takes the approach used for vectors in SVE and extends this to provide architectural support for matrix operations. No KVM support yet, SME is disabled in guests. - Support for crashkernel reservations above ZONE_DMA via the 'crashkernel=X,high' command line option. - btrfs search_ioctl() fix for live-lock with sub-page faults. - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring coherent I/O traffic, support for Arm's CMN-650 and CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup. - Kselftest updates for SME, BTI, MTE. - Automatic generation of the system register macros from a 'sysreg' file describing the register bitfields. - Update the type of the function argument holding the ESR_ELx register value to unsigned long to match the architecture register size (originally 32-bit but extended since ARMv8.0). - stacktrace cleanups. - ftrace cleanups. - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(), avoid executable mappings in kexec/hibernate code, drop TLB flushing from get_clear_flush() (and rename it to get_clear_contig()), ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits) arm64/sysreg: Generate definitions for FAR_ELx arm64/sysreg: Generate definitions for DACR32_EL2 arm64/sysreg: Generate definitions for CSSELR_EL1 arm64/sysreg: Generate definitions for CPACR_ELx arm64/sysreg: Generate definitions for CONTEXTIDR_ELx arm64/sysreg: Generate definitions for CLIDR_EL1 arm64/sve: Move sve_free() into SVE code section arm64: Kconfig.platforms: Add comments arm64: Kconfig: Fix indentation and add comments arm64: mm: avoid writable executable mappings in kexec/hibernate code arm64: lds: move special code sections out of kernel exec segment arm64/hugetlb: Implement arm64 specific huge_ptep_get() arm64/hugetlb: Use ptep_get() to get the pte value of a huge page arm64: kdump: Do not allocate crash low memory if not needed arm64/sve: Generate ZCR definitions arm64/sme: Generate defintions for SVCR arm64/sme: Generate SMPRI_EL1 definitions arm64/sme: Automatically generate SMPRIMAP_EL2 definitions arm64/sme: Automatically generate SMIDR_EL1 defines arm64/sme: Automatically generate defines for SMCR ...
| * perf/arm-cmn: Decode CAL devices properly in debugfsRobin Murphy2022-05-121-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The debugfs code is lazy, and since it only keeps the bottom byte of each connect_info register to save space, it also treats the whole thing as the device_type since the other bits were reserved anyway. Upon closer inspection, though, this is no longer true on newer IP versions, so let's be good and decode the exact field properly. This should help it not get confused when a Component Aggregation Layer is present (which is already implied if Node IDs are found for both device addresses represented by the next two lines of the table). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/6a13a6128a28cfe2eec6d09cf372a167ec9c3b65.1652274773.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Fix filter_sel lookupRobin Murphy2022-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | Carefully considering the bounds of an array is all well and good, until you forget that that array also contains a NULL sentinel at the end and dereference it. So close... Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/bebba768156aa3c0757140457bdd0fec10819388.1652217788.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type firstTanmay Jagdale2022-05-101-3/+3
| | | | | | | | | | | | | | | | | | | | Make sure to check the pmu type first and then check event->attr.disabled. Doing so would avoid reading the disabled attribute of an event that is not handled by TAD PMU. Signed-off-by: Tanmay Jagdale <tanmay@marvell.com> Link: https://lore.kernel.org/r/20220510102657.487539-1-tanmay@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: hisi: Add Support for CPA PMUQi Liu2022-05-062-1/+410
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On HiSilicon Hip09 platform, there is a CPA (Coherency Protocol Agent) on each SICL (Super IO Cluster) which implements packet format translation, route parsing and traffic statistics. CPA PMU has 8 PMU counters and interrupt is supported to handle counter overflow. Let's support its driver under the framework of HiSilicon PMU driver. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220415102352.6665-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: hisi: Associate PMUs in SICL with CPUs onlineQi Liu2022-05-063-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a PMU is in a SICL (Super IO cluster), it is not appropriate to associate this PMU with a CPU die. So we associate it with all CPUs online, rather than CPUs in the nearest SCCL. As the firmware of Hip09 platform hasn't been published yet, change of PMU driver will not influence backwards compatibility between driver and firmware. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20220415102352.6665-2-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: arm_spe: Expose saturating counter to 16-bitShaokun Zhang2022-05-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to acquire more accurate latency, Armv8.8[1] has defined the CountSize field to 16-bit saturating counters when it's 0b0011. Let's support this new feature and expose its to user under sysfs. [1] https://developer.arm.com/documentation/ddi0487/latest Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20220429063307.63251-1-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Add CMN-700 supportRobin Murphy2022-05-061-16/+220
| | | | | | | | | | | | | | | | | | | | | | | | | | Add the identifiers, events, and subtleties for CMN-700. Highlights include yet more options for doubling up CHI channels, which finally grows event IDs beyond 8 bits for XPs, and a new set of CML gateway nodes adding support for CXL as well as CCIX, where the Link Agent is now internal to the CMN mesh so we gain regular PMU events for that too. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/cf892baa0d0258ea6cd6544b15171be0069a083a.1650320598.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Refactor occupancy filter selectorRobin Murphy2022-05-061-72/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far, DNs and HN-Fs have each had one event ralated to occupancy trackers which are filtered by a separate field. CMN-700 raises the stakes by introducing two more sets of HN-F events with corresponding additional filter fields. Prepare for this by refactoring our filter selection and tracking logic to account for multiple filter types coexisting on the same node. This need not affect the uAPI, which can just continue to encode any per-event filter setting in the "occupid" config field, even if it's technically not the most accurate name for some of them. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/1aa47ba0455b144c416537f6b0e58dc93b467a00.1650320598.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Add CMN-650 supportRobin Murphy2022-05-061-46/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the identifiers and events for CMN-650, which slots into its evolutionary position between CMN-600 and the 700-series products. Imagine CMN-600 made bigger, and with most of the rough edges smoothed off, but that then balanced out by some bonkers PMU functionality for the new HN-P enhancement in CMN-650r2. Most of the CXG events are actually common to newer revisions of CMN-600 too, so they're arguably a little late; oh well. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/b0adc5824db53f71a2b561c293e2120390106536.1650320598.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: check return value of armpmu_request_irq()Ren Yu2022-05-061-1/+3
| | | | | | | | | | | | | | | | When the function armpmu_request_irq() failed, goto err Signed-off-by: Ren Yu <renyu@nfschina.com> Link: https://lore.kernel.org/r/20220425100436.4881-1-renyu@nfschina.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: RISC-V: Remove non-kernel-doc ** commentsPalmer Dabbelt2022-05-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will presumably trip up some tools that try to parse the comments as kernel doc when they're not. Reported-by: kernel test robot <lkp@intel.com> Fixes: 4905ec2fb7e6 ("RISC-V: Add sscofpmf extension support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> -- These recently landed in for-next, but I'm trying to avoid rewriting history as there's a lot in flight right now. Reviewed-by: Atish Patra <atishp@rivosinc.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20220322220147.11407-1-palmer@rivosinc.com Signed-off-by: Will Deacon <will@kernel.org>
* | arm_pmu: Validate single/group leader eventsRob Herring2022-04-131-6/+4
|/ | | | | | | | | | | | | | | | | In the case where there is only a cycle counter available (i.e. PMCR_EL0.N is 0) and an event other than CPU cycles is opened, the open should fail as the event can never possibly be scheduled. However, the event validation when an event is opened is skipped when the group leader is opened. Fix this by always validating the group leader events. Reported-by: Al Grant <al.grant@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220408203330.4014015-1-robh@kernel.org Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* perf/imx_ddr: Fix undefined behavior due to shift overflowing the constantBorislav Petkov2022-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix: In file included from <command-line>:0:0: In function ‘ddr_perf_counter_enable’, inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2: ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \ declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ... See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory details as to why it triggers with older gccs only. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Frank Li <Frank.li@nxp.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de Signed-off-by: Will Deacon <will@kernel.org>
* perf: MARVELL_CN10K_DDR_PMU should depend on ARCH_THUNDERGeert Uytterhoeven2022-04-041-1/+1
| | | | | | | | | | | | The Marvell CN10K DRAM Subsystem (DSS) performance monitor is only present on Marvell CN10K SoCs. Hence add a dependency on ARCH_THUNDER, to prevent asking the user about this driver when configuring a kernel without Cavium Thunder (incl. Marvell CN10K) SoC support, Fixes: 68fa55f0e05c ("perf/marvell: cn10k DDR perf event core ownership") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/18bfd6e1bcf67db7ea656d684a8bbb68261eeb54.1648559364.git.geert+renesas@glider.be Signed-off-by: Will Deacon <will@kernel.org>
* perf: qcom_l2_pmu: fix an incorrect NULL check on list iteratorXiaomeng Tong2022-04-041-3/+3
| | | | | | | | | | | | | | | | | | The bug is here: return cluster; The list iterator value 'cluster' will *always* be set and non-NULL by list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, return 'cluster' when found, otherwise return NULL. Cc: stable@vger.kernel.org Fixes: 21bdbb7102ed ("perf: add qcom l2 cache perf events driver") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Link: https://lore.kernel.org/r/20220327055733.4070-1-xiam0nd.tong@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
* Merge tag 'riscv-for-linus-5.18-mw0' of ↵Linus Torvalds2022-03-255-0/+1289
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for Sv57-based virtual memory. - Various improvements for the MicroChip PolarFire SOC and the associated Icicle dev board, which should allow upstream kernels to boot without any additional modifications. - An improved memmove() implementation. - Support for the new Ssconfpmf and SBI PMU extensions, which allows for a much more useful perf implementation on RISC-V systems. - Support for restartable sequences. * tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits) rseq/selftests: Add support for RISC-V RISC-V: Add support for restartable sequence MAINTAINERS: Add entry for RISC-V PMU drivers Documentation: riscv: Remove the old documentation RISC-V: Add sscofpmf extension support RISC-V: Add perf platform driver based on SBI PMU extension RISC-V: Add RISC-V SBI PMU extension definitions RISC-V: Add a simple platform driver for RISC-V legacy perf RISC-V: Add a perf core library for pmu drivers RISC-V: Add CSR encodings for all HPMCOUNTERS RISC-V: Remove the current perf implementation RISC-V: Improve /proc/cpuinfo output for ISA extensions RISC-V: Do no continue isa string parsing without correct XLEN RISC-V: Implement multi-letter ISA extension probing framework RISC-V: Extract multi-letter extension names from "riscv, isa" RISC-V: Minimal parser for "riscv, isa" strings RISC-V: Correctly print supported extensions riscv: Fixed misaligned memory access. Fixed pointer comparison. MAINTAINERS: update riscv/microchip entry riscv: dts: microchip: add new peripherals to icicle kit device tree ...
| * RISC-V: Add sscofpmf extension supportAtish Patra2022-03-211-5/+217
| | | | | | | | | | | | | | | | | | | | | | | | The sscofpmf extension allows counter overflow and filtering for programmable counters. Enable the perf driver to handle the overflow interrupt. The overflow interrupt is a hart local interrupt. Thus, per cpu overflow interrupts are setup as a child under the root INTC irq domain. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| * RISC-V: Add perf platform driver based on SBI PMU extensionAtish Patra2022-03-214-0/+591
| | | | | | | | | | | | | | | | | | | | | | | | | | | | RISC-V SBI specification added a PMU extension that allows to configure start/stop any pmu counter. The RISC-V perf can use most of the generic perf features except interrupt overflow and event filtering based on privilege mode which will be added in future. It also allows to monitor a handful of firmware counters that can provide insights into firmware activity during a performance analysis. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| * RISC-V: Add a simple platform driver for RISC-V legacy perfAtish Patra2022-03-213-0/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The old RISC-V perf implementation allowed counting of only cycle/instruction counters using perf. Restore that feature by implementing a simple platform driver under a separate config to provide backward compatibility. Any existing software stack will continue to work as it is. However, it provides an easy way out in future where we can remove the legacy driver. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| * RISC-V: Add a perf core library for pmu driversAtish Patra2022-03-213-0/+333
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a perf core library that can support all the essential perf features in future. It can also accommodate any type of PMU implementation in future. Currently, both SBI based perf driver and legacy driver implemented uses the library. Most of the common perf functionalities are kept in this core library wile PMU specific driver can implement PMU specific features. For example, the SBI specific functionality will be implemented in the SBI specific driver. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* | Merge tag 'iommu-updates-v5.18' of ↵Linus Torvalds2022-03-241-1/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...
| * | perf/smmuv3: Don't cast parameter in bit operationsAndy Shevchenko2022-02-151-1/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While in this particular case it would not be a (critical) issue, the pattern itself is bad and error prone in case somebody blindly copies to their code. Don't cast parameter to unsigned long pointer in the bit operations. Instead copy to a local variable on stack of a proper type and use. Note, new compilers might warn on this line for potential outbound access. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220209184758.56578-1-andriy.shevchenko@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driverWill Deacon2022-03-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When compiling the Marvell CN10K DDR PMU driver with CONFIG_OF=n, the build fails: | drivers/perf/marvell_cn10k_ddr_pmu.c:723:35: error: 'cn10k_ddr_pmu_of_match' undeclared here (not in a function); did you mean 'cn10k_ddr_pmu_driver'? Use `of_match_ptr()` to avoid referencing the non-existent match table in this configuration. Link: https://lore.kernel.org/r/202203091424.Vfe8J4W9-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
* | Merge branch 'for-next/perf-m1' into for-next/perfWill Deacon2022-03-084-0/+594
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support for the CPU PMUs on the Apple M1. * for-next/perf-m1: drivers/perf: Add Apple icestorm/firestorm CPU PMU driver drivers/perf: arm_pmu: Handle 47 bit counters irqchip/apple-aic: Move PMU-specific registers to their own include file arm64: dts: apple: Add t8303 PMU nodes arm64: dts: apple: Add t8103 PMU interrupt affinities irqchip/apple-aic: Wire PMU interrupts irqchip/apple-aic: Parse FIQ affinities from device-tree dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts dt-bindings: arm-pmu: Document Apple PMU compatible strings
| * | drivers/perf: Add Apple icestorm/firestorm CPU PMU driverMarc Zyngier2022-03-083-0/+592
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new, weird and wonderful driver for the equally weird Apple PMU HW. Although the PMU itself is functional, we don't know much about the events yet, so this can be considered as yet another random number generator... Nonetheless, it can reliably count at least cycles and instructions in the usually wonky big-little way. For anything else, it of course supports raw event numbers. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: arm_pmu: Handle 47 bit countersMarc Zyngier2022-03-081-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | The current ARM PMU framework can only deal with 32 or 64bit counters. Teach it about a 47bit flavour. Yes, this is odd. Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
* | perf/marvell: cn10k DDR perf event core ownershipBharat Bhushan2022-03-082-2/+55
| | | | | | | | | | | | | | | | | | | | | | As DDR perf event counters are not per core, so they should be accessed only by one core at a time. Select new core when previously owning core is going offline. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-5-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/marvell: cn10k DDR perfmon event overflow handlingBharat Bhushan2022-03-081-0/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CN10k DSS h/w perfmon does not support event overflow interrupt, so periodic timer is being used. Each event counter is 48bit, which in worst case scenario can increment at maximum 5.6 GT/s. At this rate it may take many hours to overflow these counters. Therefore polling period for overflow is set to 100 sec, which can be changed using sysfs parameter. Two fixed event counters starts counting from zero on overflow, so overflow condition is when new count less than previous count. While eight programmable event counters freezes at maximum value. Also individual counter cannot be restarted, so need to restart all eight counters. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-4-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/marvell: CN10k DDR performance monitor supportBharat Bhushan2022-03-082-0/+602
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marvell CN10k DRAM Subsystem (DSS) supports eight event counters for monitoring performance and software can program each counter to monitor any of the defined performance event. Performance events are for interface between the DDR controller and the PHY, interface between the DDR Controller and the CHI interconnect, or within the DDR Controller. Additionally DSS also supports two fixed performance event counters, one for number of ddr reads and other for ddr writes. This patch add basic support for these performance monitoring events on CN10k. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-3-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/arm-cmn: Update watchpoint formatRobin Murphy2022-03-081-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | From CMN-650 onwards, some of the fields in the watchpoint config registers moved subtly enough to easily overlook. Watchpoint events are still only partially supported on newer IPs - which in itself deserves noting - but were not intended to become any *less* functional than on CMN-600. Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e1ce4c2f1e4f73ab1c60c3a85e4037cd62dd6352.1645727871.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/arm-cmn: Hide XP PUB events for CMN-600Robin Murphy2022-03-081-0/+3
| | | | | | | | | | | | | | | | | | | | CMN-600 doesn't have XP events for the PUB channel, but we missed the appropriate check to avoid exposing them. Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/4c108d39a0513def63acccf09ab52b328f242aeb.1645727871.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf: replace bitmap_weight with bitmap_empty where appropriateYury Norov2022-02-154-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | In some places, drivers/perf code calls bitmap_weight() to check if any bit of a given bitmap is set. It's better to use bitmap_empty() in that case because bitmap_empty() stops traversing the bitmap as soon as it finds first set bit, while bitmap_weight() counts all bits unconditionally. Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220210224933.379149-13-yury.norov@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf: Replace acpi_bus_get_device()Rafael J. Wysocki2022-02-082-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/10025610.nUPlyArG6x@kreacher Signed-off-by: Will Deacon <will@kernel.org>
* | perf/marvell_cn10k: Fix unused variable warning when W=1 and CONFIG_OF=nWill Deacon2022-02-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The kbuild helpfully reports that the Marvell CN10K TAD PMU driver emits a warning when building with W=1 and CONFIG_OF=n: | >> drivers/perf/marvell_cn10k_tad_pmu.c:371:34: warning: unused variable 'tad_pmu_of_match' [-Wunused-const-variable] static const struct of_device_id tad_pmu_of_match[] = { Guard the match table with CONFIG_OF to squash the warning. Link: https://lore.kernel.org/r/202201292349.zRQLcDDD-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
* | perf/arm-cmn: Make arm_cmn_debugfs staticRobin Murphy2022-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | Indeed our debugfs directory is driver-internal so should be static. Link: https://lore.kernel.org/r/202202030812.II1K2ZXf-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/ca9248caaae69b5134f69e085fe78905dfe74378.1643911278.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf: MARVELL_CN10K_TAD_PMU should depend on ARCH_THUNDERGeert Uytterhoeven2022-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The Marvell CN10K Last-Level cache Tag-and-data Units (LLC-TAD) performance monitor is only present on Marvell CN10K SoCs. Hence add a dependency on ARCH_THUNDER, to prevent asking the user about this driver when configuring a kernel without Cavium Thunder (incl. Marvell CN10K) SoC support. Fixes: 036a7584bede ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/b4662a2c767d04cca19417e0c845edea2da262ad.1641995941.git.geert+renesas@glider.be Signed-off-by: Will Deacon <will@kernel.org>
* | perf/arm-ccn: Use platform_get_irq() to get the interruptLad Prabhakar2022-02-081-6/+4
|/ | | | | | | | | | | | | | | platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Link: https://lore.kernel.org/r/20211224161334.31123-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Will Deacon <will@kernel.org>
* Merge tag 'irq-msi-2022-01-13' of ↵Linus Torvalds2022-01-131-4/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI irq updates from Thomas Gleixner: "Rework of the MSI interrupt infrastructure. This is a treewide cleanup and consolidation of MSI interrupt handling in preparation for further changes in this area which are necessary to: - address existing shortcomings in the VFIO area - support the upcoming Interrupt Message Store functionality which decouples the message store from the PCI config/MMIO space" * tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) genirq/msi: Populate sysfs entry only once PCI/MSI: Unbreak pci_irq_get_affinity() genirq/msi: Convert storage to xarray genirq/msi: Simplify sysfs handling genirq/msi: Add abuse prevention comment to msi header genirq/msi: Mop up old interfaces genirq/msi: Convert to new functions genirq/msi: Make interrupt allocation less convoluted platform-msi: Simplify platform device MSI code platform-msi: Let core code handle MSI descriptors bus: fsl-mc-msi: Simplify MSI descriptor handling soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation NTB/msi: Convert to msi_on_each_desc() PCI: hv: Rework MSI handling powerpc/mpic_u3msi: Use msi_for_each-desc() powerpc/fsl_msi: Use msi_for_each_desc() powerpc/pasemi/msi: Convert to msi_on_each_dec() powerpc/cell/axon_msi: Convert to msi_on_each_desc() powerpc/4xx/hsta: Rework MSI handling ...
| * perf/smmuv3: Use msi_get_virq()Thomas Gleixner2021-12-161-4/+1
| | | | | | | | | | | | | | | | | | | | Let the core code fiddle with the MSI descriptor retrieval. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20211210221815.029143589@linutronix.de
* | drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL checkDan Carpenter2022-01-041-1/+1
| | | | | | | | | | | | | | | | | | | | The devm_ioremap() function does not return error pointers. It returns NULL. Fixes: 036a7584bede ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211217145907.GA16611@kili Signed-off-by: Will Deacon <will@kernel.org>
* | perf/smmuv3: Fix unused variable warning when CONFIG_OF=nWill Deacon2022-01-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kbuild robot reports that building the SMMUv3 PMU driver with CONFIG_OF=n results in a warning for W=1 builds: >> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable] static const struct of_device_id smmu_pmu_of_match[] = { ^ Guard the match table with #ifdef CONFIG_OF. Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com Fixes: 3f7be4356176 ("perf/smmuv3: Add devicetree support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
* | Merge branch 'for-next/perf-smmu' into for-next/perfWill Deacon2021-12-141-2/+64
|\ \ | | | | | | | | | | | | | | | | | | * for-next/perf-smmu: perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding
| * | perf/smmuv3: Synthesize IIDR from CoreSight ID registersRobin Murphy2021-12-141-1/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SMMU_PMCG_IIDR register was not present in older revisions of the Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of fields from several PIDR registers, allowing us to present a standardized identifier to userspace. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/smmuv3: Add devicetree supportJean-Philippe Brucker2021-12-141-1/+10
| |/ | | | | | | | | | | | | | | | | | | Add device-tree support to the SMMUv3 PMCG driver. Signed-off-by: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
* | Merge branch 'for-next/perf-hisi' into for-next/perfWill Deacon2021-12-143-0/+959
|\ \ | | | | | | | | | | | | | | | * for-next/perf-hisi: drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver
| * | drivers/perf: hisi: Add driver for HiSilicon PCIe PMUQi Liu2021-12-143-0/+959
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported to sample bandwidth, latency, buffer occupation etc. Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is registered as a PMU in /sys/bus/event_source/devices, so users can select target PMU, and use filter to do further sets. Filtering options contains: event - select the event. port - select target Root Ports. Information of Root Ports are shown under sysfs. bdf - select requester_id of target EP device. trig_len - set trigger condition for starting event statistics. trig_mode - set trigger mode. 0 means starting to statistic when bigger than trigger condition, and 1 means smaller. thr_len - set threshold for statistics. thr_mode - set threshold mode. 0 means count when bigger than threshold, and 1 means smaller. Acked-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
* | Merge branch 'for-next/perf-cn10k' into for-next/perfWill Deacon2021-12-143-0/+437
|\ \ | | | | | | | | | | | | | | | * for-next/perf-cn10k: dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support
| * | drivers: perf: Add LLC-TAD perf counter supportBhaskara Budiredla2021-12-143-0/+437
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This driver adds support for Last-level cache tag-and-data unit (LLC-TAD) PMU that is featured in some of the Marvell's CN10K infrastructure silicons. The LLC is divided into 2N slices distributed across N Mesh tiles in a single-socket configuration. The driver always configures the same counter for all of the TADs. The user would end up effectively reserving one of eight counters in every TAD to look across all TADs. The occurrences of events are aggregated and presented to the user at the end of an application run. The driver does not provide a way for the user to partition TADs so that different TADs are used for different applications. The event counters are zeroed to start event counting to avoid any rollover issues. TAD perf counters are 64-bit, so it's not currently possible to overflow event counters at current mesh and core frequencies. To measure tad pmu events use perf tool stat command. For instance: perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application> perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application> Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/arm-cmn: Add debugfs topology infoRobin Murphy2021-12-141-3/+148
| | | | | | | | | | | | | | | | | | | | | | | | In general, detailed performance analysis will require knoweldge of the the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc. are connected to each node. However for certain development and bringup tasks it can be useful to have a quick overview of the CMN internal topology to hand too. Add a debugfs file to map this out. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>