summaryrefslogtreecommitdiffstats
path: root/drivers/perf
Commit message (Collapse)AuthorAgeFilesLines
...
| * | perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is usedIlkka Koskinen2023-06-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Don't try to set irq affinity if PMU doesn't have an overflow interrupt. Fixes: e37dfd65731d ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230608203742.3503486-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi: Don't migrate perf to the CPU going to teardownJunhao He2023-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver needs to migrate the perf context if the current using CPU going to teardown. By the time calling the cpuhp::teardown() callback the cpu_online_mask() hasn't updated yet and still includes the CPU going to teardown. In current driver's implementation we may migrate the context to the teardown CPU and leads to the below calltrace: ... [ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008 [ 368.113699][ T932] Call trace: [ 368.116834][ T932] __switch_to+0x7c/0xbc [ 368.120924][ T932] __schedule+0x338/0x6f0 [ 368.125098][ T932] schedule+0x50/0xe0 [ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24 [ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc [ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30 [ 368.144573][ T932] mutex_lock+0x50/0x60 [ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0 [ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu] [ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650 [ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190 [ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0 [ 368.175099][ T932] kthread+0x108/0x13c [ 368.179012][ T932] ret_from_fork+0x10/0x18 ... Use function cpumask_any_but() to find one correct active cpu to fixes this issue. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230608114326.27649-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: apple_m1: Force 63bit counters for M2 CPUsMarc Zyngier2023-06-052-6/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sidharth reports that on M2, the PMU never generates any interrupt when using 'perf record', which is a annoying as you get no sample. I'm temped to say "no sample, no problem", but others may have a different opinion. Upon investigation, it appears that the counters on M2 are significantly different from the ones on M1, as they count on 64 bits instead of 48. Which of course, in the fine M1 tradition, means that we can only use 63 bits, as the top bit is used to signal the interrupt... This results in having to introduce yet another flag to indicate yet another odd counter width. Who knows what the next crazy implementation will do... With this, perf can work out the correct offset, and 'perf record' works as intended. Tested on M2 and M2-Pro CPUs. Cc: Janne Grunau <j@jannau.net> Cc: Hector Martin <marcan@marcan.st> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Fixes: 7d0bfb7c9977 ("drivers/perf: apple_m1: Add Apple M2 support") Reported-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Tested-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230528080205.288446-1-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Fix DTC resetRobin Murphy2023-06-051-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out that my naive DTC reset logic fails to work as intended, since, after checking with the hardware designers, the PMU actually needs to be fully enabled in order to correctly clear any pending overflows. Therefore, invert the sequence to start with turning on both enables so that we can reliably get the DTCs into a known state, then moving to our normal counters-stopped state from there. Since all the DTM counters have already been unpaired during the initial discovery pass, we just need to additionally reset the cycle counters to ensure that no other unexpected overflows occur during this period. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Reported-by: Geoff Blake <blakgeof@amazon.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0ea4559261ea394f827c9aee5168c77a60aaee03.1684946389.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robustChristophe JAILLET2023-06-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an error occurs after calling list_add(), the &l2cache_pmu->clusters list will reference some memory that will be freed when the managed resources will be released. Move the list_add() at the end of the function when everything is in fine. This is harmless because if l2_cache_pmu_probe_cluster() fails, then l2_cache_pmu_probe() will fail as well and 'l2cache_pmu' will be released as well. But it looks cleaner and could silence static checker warning. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/6a0f5bdb6b7b2ed4ef194fc49693e902ad5b95ea.1684397879.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cci: Slightly optimize cci_pmu_sync_counters()Christophe JAILLET2023-06-051-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the 'mask' bitmap is cleared, it is better to use its full maximum size instead of only the needed size. This lets the compiler optimize it because the size is now known at compile time. HW_CNTRS_MAX is small (i.e. currently 9), so a call to memset() is saved. Also, as 'mask' is local to the function, the non-atomic __set_bit() can also safely be used here. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/88d4e20d595f771396e9d558c1587eb4494057db.1682022422.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>
* / KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loadedReiji Watanabe2023-06-041-3/+18
|/ | | | | | | | | | | | | | | | | | | | | | | | | | Currently, with VHE, KVM sets ER, CR, SW and EN bits of PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores the register value for the host on vcpu_load() and vcpu_put(). If the value of those bits are cleared on a pCPU with a vCPU loaded (armv8pmu_start() would do that when PMU counters are programmed for the guest), PMU access from the guest EL0 might be trapped to the guest EL1 directly regardless of the current PMUSERENR_EL0 value of the vCPU. Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0 on the pCPU where PMUSERENR_EL0 for the guest is loaded, and instead updating the saved shadow register value for the host so that the value can be restored on vcpu_put() later. While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable IRQs to prevent a race condition between these processes and IPIs that attempt to update PMUSERENR_EL0 for the host EL0. Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Marc Zyngier <maz@kernel.org> Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230603025035.3781797-3-reijiw@google.com
* RISC-V: Align SBI probe implementation with specAndrew Jones2023-04-291-1/+1
| | | | | | | | | | | | | | | | sbi_probe_extension() is specified with "Returns 0 if the given SBI extension ID (EID) is not available, or 1 if it is available unless defined as any other non-zero value by the implementation." Additionally, sbiret.value is a long. Fix the implementation to ensure any nonzero long value is considered a success, rather than only positive int values. Fixes: b9dcd9e41587 ("RISC-V: Add basic support for SBI v0.2") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230427163626.101042-1-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* Merge tag 'arm64-upstream' of ↵Linus Torvalds2023-04-2518-66/+1517
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "ACPI: - Improve error reporting when failing to manage SDEI on AGDI device removal Assembly routines: - Improve register constraints so that the compiler can make use of the zero register instead of moving an immediate #0 into a GPR - Allow the compiler to allocate the registers used for CAS instructions CPU features and system registers: - Cleanups to the way in which CPU features are identified from the ID register fields - Extend system register definition generation to handle Enum types when defining shared register fields - Generate definitions for new _EL2 registers and add new fields for ID_AA64PFR1_EL1 - Allow SVE to be disabled separately from SME on the kernel command-line Tracing: - Support for "direct calls" in ftrace, which enables BPF tracing for arm64 Kdump: - Don't bother unmapping the crashkernel from the linear mapping, which then allows us to use huge (block) mappings and reduce TLB pressure when a crashkernel is loaded. Memory management: - Try again to remove data cache invalidation from the coherent DMA allocation path - Simplify the fixmap code by mapping at page granularity - Allow the kfence pool to be allocated early, preventing the rest of the linear mapping from being forced to page granularity Perf and PMU: - Move CPU PMU code out to drivers/perf/ where it can be reused by the 32-bit ARM architecture when running on ARMv8 CPUs - Fix race between CPU PMU probing and pKVM host de-privilege - Add support for Apple M2 CPU PMU - Adjust the generic PERF_COUNT_HW_BRANCH_INSTRUCTIONS event dynamically, depending on what the CPU actually supports - Minor fixes and cleanups to system PMU drivers Stack tracing: - Use the XPACLRI instruction to strip PAC from pointers, rather than rolling our own function in C - Remove redundant PAC removal for toolchains that handle this in their builtins - Make backtracing more resilient in the face of instrumentation Miscellaneous: - Fix single-step with KGDB - Remove harmless warning when 'nokaslr' is passed on the kernel command-line - Minor fixes and cleanups across the board" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits) KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege arm64: kexec: include reboot.h arm64: delete dead code in this_cpu_set_vectors() arm64/cpufeature: Use helper macro to specify ID register for capabilites drivers/perf: hisi: add NULL check for name drivers/perf: hisi: Remove redundant initialized of pmu->name arm64/cpufeature: Consistently use symbolic constants for min_field_value arm64/cpufeature: Pull out helper for CPUID register definitions arm64/sysreg: Convert HFGITR_EL2 to automatic generation ACPI: AGDI: Improve error reporting for problems during .remove() arm64: kernel: Fix kernel warning when nokaslr is passed to commandline perf/arm-cmn: Fix port detection for CMN-700 arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step arm64: move PAC masks to <asm/pointer_auth.h> arm64: use XPACLRI to strip PAC arm64: avoid redundant PAC stripping in __builtin_return_address() arm64/sme: Fix some comments of ARM SME arm64/signal: Alloc tpidr2 sigframe after checking system_supports_tpidr2() arm64/signal: Use system_supports_tpidr2() to check TPIDR2 arm64/idreg: Don't disable SME when disabling SVE ...
| * drivers/perf: hisi: add NULL check for nameJunhao He2023-04-173-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When allocations fails that can be NULL now. If the name provided is NULL, then the initialization process of the PMU type and dev will be skipped in function perf_pmu_register(). Consequently, the PMU will not be able to register into the kernel. Moreover, in the case of unregister the PMU, the function device_del() will need to handle NULL pointers, which potentially can cause issues. So move this allocation above the cpuhp_state_add_instance() and directly return if it does fail. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: hisi: Remove redundant initialized of pmu->nameJunhao He2023-04-178-11/+8
| | | | | | | | | | | | | | | | | | | | "pmu->name" is initialized by perf_pmu_register() function, so remove the redundant initialized in hisi_pmu_init(). Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Fix port detection for CMN-700Robin Murphy2023-04-141-27/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the "extra device ports" configuration was first added, the additional mxp_device_port_connect_info registers were added around the existing mxp_mesh_port_connect_info registers. What I missed about CMN-700 is that it shuffled them around to remove this discontinuity. As such, tweak the definitions and factor out a helper for reading these registers so we can deal with this discrepancy easily, which does at least allow nicely tidying up the callsites. With this we can then also do the nice thing and skip accesses completely rather than relying on RES0 behaviour where we know the extra registers aren't defined. Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/71d129241d4d7923cde72a0e5b4c8d2f6084525f.1681295193.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * arm64: pmuv3: dynamically map PERF_COUNT_HW_BRANCH_INSTRUCTIONSStephane Eranian2023-04-111-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mapping of perf_events generic hardware events to actual PMU events on ARM PMUv3 may not always be correct. This is in particular true for the PERF_COUNT_HW_BRANCH_INSTRUCTIONS event. Although the mapping points to an architected event, it may not always be available. This can be seen with a simple: $ perf stat -e branches sleep 0 Performance counter stats for 'sleep 0': <not supported> branches 0.001401081 seconds time elapsed Yet the hardware does have an event that could be used for branches. Dynamically check for a supported hardware event which can be used for PERF_COUNT_HW_BRANCH_INSTRUCTIONS at mapping time. And with that: $ perf stat -e branches sleep 0 Performance counter stats for 'sleep 0': 166,739 branches 0.000832163 seconds time elapsed Co-developed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Co-developed-by: Peter Newman <peternewman@google.com> Signed-off-by: Peter Newman <peternewman@google.com> Link: https://lore.kernel.org/all/YvunKCJHSXKz%2FkZB@FVFF77S0Q05N Link: https://lore.kernel.org/r/20230411093809.657501-1-peternewman@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Validate cycles events fullyRobin Murphy2023-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | DTC cycle count events don't have anything to validate or initialise in themselves, but we should not forget to still validate their whole group context. Otherwise, we may fail to correctly reject a contrived group containing an impossible number of cycles events. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/3124e8c276a1f513c1a415dc839ca4181b3c8bc8.1680522545.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: apple_m1: Add Apple M2 supportJanne Grunau2023-03-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | The PMU itself is compatible with the one found on M1. We still know next to nothing about the counters so keep using CPU uarch specific compatibles/PMU names. Signed-off-by: Janne Grunau <j@jannau.net> Acked-by: Mark Rutland <mark.rutland@arm.com. Reviewed-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/20230214-apple_m2_pmu-v1-2-9c9213ab9b63@jannau.net Signed-off-by: Will Deacon <will@kernel.org>
| * perf: arm_cspmu: Fix variable dereference warningBesar Wicaksono2023-03-271-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix warning message from smatch tool: | smatch warnings: | drivers/perf/arm_cspmu/arm_cspmu.c:1075 arm_cspmu_find_cpu_container() | warn: variable dereferenced before check 'cpu_dev' (see line 1073) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202302191227.kc0V8fM7-lkp@intel.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230302205701.35323-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/amlogic: Fix config1/config2 parsing issueJiucheng Xu2023-03-271-2/+6
| | | | | | | | | | | | | | | | | | The 3th argument of for_each_set_bit is incorrect, fix them. Fixes: 2016e2113d35 ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Link: https://lore.kernel.org/r/20230209115403.521868-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers/perf: Use devm_platform_get_and_ioremap_resource()Yang Li2023-03-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | Convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20230216063403.9753-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
| * kbuild, drivers/perf: remove MODULE_LICENSE in non-modulesNick Alcock2023-03-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230217141059.392471-9-nick.alcock@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: qcom: Use devm_platform_get_and_ioremap_resource()Yang Li2023-03-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | According to commit 890cc39a8799 ("drivers: provide devm_platform_get_and_ioremap_resource()"), convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230315023108.36953-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: arm: Use devm_platform_get_and_ioremap_resource()Yang Li2023-03-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | According to commit 890cc39a8799 ("drivers: provide devm_platform_get_and_ioremap_resource()"), convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230315023017.35789-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm-cmn: Move overlapping wp_combine fieldIlkka Koskinen2023-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | As eventid field was expanded to support new mesh versions, it started to overlap with wp_combine field. Move wp_combine to fix the issue. Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230301175540.19891-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * ARM: perf: Allow the use of the PMUv3 driver on 32bit ARMMarc Zyngier2023-03-272-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only thing stopping the PMUv3 driver from compiling on 32bit is the lack of defined system registers names and the handful of required helpers. This is easily solved by providing the sysreg accessors and updating the Kconfig entry. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Co-developed-by: Zaid Al-Bassam <zalbassam@google.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-8-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: pmuv3: Change GENMASK to GENMASK_ULLZaid Al-Bassam2023-03-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | GENMASK macro uses "unsigned long" (32-bit wide on arm and 64-bit on arm64), This causes build issues when enabling PMUv3 on arm as it tries to access bits > 31. This patch switches the GENMASK to GENMASK_ULL, which uses "unsigned long long" (64-bit on both arm and arm64). Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-6-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: pmuv3: Move inclusion of kvm_host.h to the arch-specific helperZaid Al-Bassam2023-03-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | KVM host support is available only on arm64. By moving the inclusion of kvm_host.h to an arm64-specific file, the 32bit architecture will be able to implement dummy helpers. Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-5-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: pmuv3: Abstract PMU version checksZaid Al-Bassam2023-03-271-4/+3
| | | | | | | | | | | | | | | | | | | | | | The current PMU version definitions are available for arm64 only, As we want to add PMUv3 support to arm (32-bit), abstracts these definitions by using arch-specific helpers. Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-4-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * arm64: perf: Abstract system register accesses awayMarc Zyngier2023-03-271-92/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we want to enable 32bit support, we need to distanciate the PMUv3 driver from the AArch64 system register names. This patch moves all system register accesses to an architecture specific include file, allowing the 32bit counterpart to be slotted in at a later time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Co-developed-by: Zaid Al-Bassam <zalbassam@google.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-3-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
| * arm64: perf: Move PMUv3 driver to drivers/perfMarc Zyngier2023-03-273-0/+1480
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having the ARM PMUv3 driver sitting in arch/arm64/kernel is getting in the way of being able to use perf on ARMv8 cores running a 32bit kernel, such as 32bit KVM guests. This patch moves it into drivers/perf/arm_pmuv3.c, with an include file in include/linux/perf/arm_pmuv3.h. The only thing left in arch/arm64 is some mundane perf stuff. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-2-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>
* | perf/amlogic: adjust register offsetsMarc Gonzalez2023-03-271-17/+17
|/ | | | | | | | | | | | Commit "perf/amlogic: resolve conflict between canvas & pmu" changed the base address. Fixes: 2016e2113d35 ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230327120932.2158389-4-mgonzalez@freebox.fr Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
* Merge tag 'riscv-for-linus-6.3-mw2' of ↵Linus Torvalds2023-03-031-5/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Some cleanups and fixes for the Zbb-optimized string routines - Support for custom (vendor or implementation defined) perf events - COMMAND_LINE_SIZE has been increased to 1024 * tag 'riscv-for-linus-6.3-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Bump COMMAND_LINE_SIZE value to 1024 drivers/perf: RISC-V: Allow programming custom firmware events riscv, lib: Fix Zbb strncmp RISC-V: improve string-function assembly
| * drivers/perf: RISC-V: Allow programming custom firmware eventsMayuresh Chitale2023-03-011-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | Applications need to be able to program the SBI implementation specific or custom firmware events in addition to the standard firmware events. Remove a check in the driver that prohibits the programming of the custom firmware events. Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230208074314.3661406-1-mchitale@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2023-02-251-7/+57
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company) - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Sort out confusion between virtual and physical addresses, which currently are the same on s390 - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits) KVM: SVM: hyper-v: placate modpost section mismatch error KVM: x86/mmu: Make tdp_mmu_allowed static KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table ...
| * perf: RISC-V: Improve privilege mode filtering for perfAtish Patra2023-02-071-5/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the host driver doesn't have any method to identify if the requested perf event is from kvm or bare metal. As KVM runs in HS mode, there are no separate hypervisor privilege mode to distinguish between the attributes for guest/host. Improve the privilege mode filtering by using the event specific config1 field. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>
| * perf: RISC-V: Define helper functions expose hpm counter width and countAtish Patra2023-02-071-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM module needs to know how many hardware counters and the counter width that the platform supports. Otherwise, it will not be able to show optimal value of virtual counters to the guest. The virtual hardware counters also need to have the same width as the logical hardware counters for simplicity. However, there shouldn't be mapping between virtual hardware counters and logical hardware counters. As we don't support hetergeneous harts or counters with different width as of now, the implementation relies on the counter width of the first available programmable counter. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>
* | Merge tag 'arm64-upstream' of ↵Linus Torvalds2023-02-2113-92/+143
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore - Include TPIDR2 in the signal context and add the corresponding kselftests - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values) - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone - Further arm64 sysreg conversion and some fixes - arm64 kselftest fixes and improvements - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation - Pseudo-NMI code generation optimisations - Minor fixes for SME and TPIDR2 handling - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits) arm64: fix .idmap.text assertion for large kernels kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context arm64: kprobes: Drop ID map text from kprobes blacklist perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 arm64/sme: Fix __finalise_el2 SMEver check drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable arm64/signal: Only read new data when parsing the ZT context arm64/signal: Only read new data when parsing the ZA context arm64/signal: Only read new data when parsing the SVE context arm64/signal: Avoid rereading context frame sizes arm64/signal: Make interface for restore_fpsimd_context() consistent arm64/signal: Remove redundant size validation from parse_user_sigframe() arm64/signal: Don't redundantly verify FPSIMD magic arm64/cpufeature: Use helper macros to specify hwcaps arm64/cpufeature: Always use symbolic name for feature value in hwcaps arm64/sysreg: Initial unsigned annotations for ID registers arm64/sysreg: Initial annotation of signed ID registers ...
| * | perf: arm_spe: Print the version of SPE detectedRob Herring2023-02-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | There's up to 4 versions of SPE now. Let's add the version that's been detected to the driver's informational print out. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230206204746.1452942-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_spe: Add support for SPEv1.2 inverted event filteringRob Herring2023-02-071-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arm SPEv1.2 (Arm v8.7/v9.2) adds a new feature called Inverted Event Filter which excludes samples matching the event filter. The feature mirrors the existing event filter in PMSEVFR_EL1 adding a new register, PMSNEVFR_EL1, which has the same event bit assignments. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-8-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variableSascha Hauer2023-02-031-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | active_events is set but not used, remove it. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/r/20230203121509.3580245-1-s.hauer@pengutronix.de Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' eventRob Herring2023-01-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arm SPEv1.2 (Armv8.7/v9.2) adds a new event, 'not taken', in bit 6 of the PMSEVFR_EL1 register. Update arm_spe_pmsevfr_res0() to support the additional event. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-6-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_spe: Use new PMSIDR_EL1 register enumsRob Herring2023-01-191-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the SPE register definitions include enums for some PMSIDR_EL1 fields, use them in the driver in place of magic values. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-5-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessorsRob Herring2023-01-191-36/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that generated sysregs are in place, update the register field accesses. The use of BIT() is no longer needed with the new defines. Use FIELD_GET and FIELD_PREP instead of open coding masking and shifting. No functional change. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-4-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | arm64: Drop SYS_ from SPE register definesRob Herring2023-01-191-43/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently have a non-standard SYS_ prefix in the constants generated for the SPE register bitfields. Drop this in preparation for automatic register definition generation. The SPE mask defines were unshifted, and the SPE register field enumerations were shifted. The autogenerated defines are the opposite, so make the necessary adjustments. No functional changes. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-2-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_spe: Use feature numbering for PMSEVFR_EL1 definesRob Herring2023-01-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to commit 121a8fc088f1 ("arm64/sysreg: Use feature numbering for PMU and SPE revisions") use feature numbering instead of architecture versions for the PMSEVFR_EL1 Res0 defines. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-1-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/marvell: Add ACPI support to TAD uncore driverGowthami Thiagarajan2023-01-191-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for ACPI based device registration so that the driver can be also enabled through ACPI table. While at that change the DT specific API's to device_* API's so that both DT based and ACPI based probing works. Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com> Link: https://lore.kernel.org/r/20221209053715.3930071-1-gthiagarajan@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/marvell: Add ACPI support to DDR uncore driverGowthami Thiagarajan2023-01-191-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for ACPI based device registration so that the driver can be also enabled through ACPI table. Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com> Link: https://lore.kernel.org/r/20221209053607.3929964-1-gthiagarajan@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Reset DTM_PMU_CONFIG at probeRobin Murphy2023-01-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although we treat the DTM counters as free-running such that we're not too concerned about the initial DTM state, it's possible for a previous user to have left DTM counters enabled and paired with DTC counters. Thus if the first events are scheduled using some, but not all, DTMs, the as-yet-unused ones could end up adding spurious increments to the event counts at the DTC. Make sure we sync our initial DTM_PMU_CONFIG state to all the DTMs at probe time to avoid that possibility. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ba5f38b3dc733cd06bfb5e659b697e76d18c2183.1670269572.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu"Junhao He2023-01-191-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | Use hisi_pmu_init() function to simplify initialization of "cpa_pmu->pmu". Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi: Simplify the parameters of hisi_pmu_init()Junhao He2023-01-197-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Use "hisi_pmu" to simplify the parameter list for the hisi_pmu_init() function. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capabilityJunhao He2023-01-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Missed initialization the variable of pmu::capabilities when extract the initialization code of hisi_pmu->pmu into a function. HISI UNCORE PMU drivers counters that not support context exclusion. So we have to advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will prevent us from handling events where any exclusion flags are set. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
* | | Merge tag 'sched-core-2023-02-20' of ↵Linus Torvalds2023-02-202-17/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve the scalability of the CFS bandwidth unthrottling logic with large number of CPUs. - Fix & rework various cpuidle routines, simplify interaction with the generic scheduler code. Add __cpuidle methods as noinstr to objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks. - Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query previously issued registrations. - Limit scheduler slice duration to the sysctl_sched_latency period, to improve scheduling granularity with a large number of SCHED_IDLE tasks. - Debuggability enhancement on sys_exit(): warn about disabled IRQs, but also enable them to prevent a cascade of followup problems and repeat warnings. - Fix the rescheduling logic in prio_changed_dl(). - Micro-optimize cpufreq and sched-util methods. - Micro-optimize ttwu_runnable() - Micro-optimize the idle-scanning in update_numa_stats(), select_idle_capacity() and steal_cookie_task(). - Update the RSEQ code & self-tests - Constify various scheduler methods - Remove unused methods - Refine __init tags - Documentation updates - Misc other cleanups, fixes * tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits) sched/rt: pick_next_rt_entity(): check list_entry sched/deadline: Add more reschedule cases to prio_changed_dl() sched/fair: sanitize vruntime of entity being placed sched/fair: Remove capacity inversion detection sched/fair: unlink misfit task from cpu overutilized objtool: mem*() are not uaccess safe cpuidle: Fix poll_idle() noinstr annotation sched/clock: Make local_clock() noinstr sched/clock/x86: Mark sched_clock() noinstr x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read() x86/atomics: Always inline arch_atomic64*() cpuidle: tracing, preempt: Squash _rcuidle tracing cpuidle: tracing: Warn about !rcu_is_watching() cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG cpuidle: drivers: firmware: psci: Dont instrument suspend code KVM: selftests: Fix build of rseq test exit: Detect and fix irq disabled state in oops cpuidle, arm64: Fix the ARM64 cpuidle logic cpuidle: mvebu: Fix duplicate flags assignment sched/fair: Limit sched slice duration ...