summaryrefslogtreecommitdiffstats
path: root/drivers/perf
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'arm64-upstream' of ↵Linus Torvalds6 days29-44/+497
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for running Linux in a protected VM under the Arm Confidential Compute Architecture (CCA) - Guarded Control Stack user-space support. Current patches follow the x86 ABI of implicitly creating a shadow stack on clone(). Subsequent patches (already on the list) will add support for clone3() allowing finer-grained control of the shadow stack size and placement from libc - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are getting close with the upcoming dpISA support) - Other arch features: - In-kernel use of the memcpy instructions, FEAT_MOPS (previously only exposed to user; uaccess support not merged yet) - MTE: hugetlbfs support and the corresponding kselftests - Optimise CRC32 using the PMULL instructions - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG - Optimise the kernel TLB flushing to use the range operations - POE/pkey (permission overlays): further cleanups after bringing the signal handler in line with the x86 behaviour for 6.12 - arm64 perf updates: - Support for the NXP i.MX91 PMU in the existing IMX driver - Support for Ampere SoCs in the Designware PCIe PMU driver - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC - Support for Samsung's 'Mongoose' CPU PMU - Support for PMUv3.9 finer-grained userspace counter access control - Switch back to platform_driver::remove() now that it returns 'void' - Add some missing events for the CXL PMU driver - Miscellaneous arm64 fixes/cleanups: - Page table accessors cleanup: type updates, drop unused macros, reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity check addresses before runtime P4D/PUD folding - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the FEAT_ECV for the generic timers) allowing Linux to boot with firmware deployments that don't set SCTLR_EL3.ECVEn - ACPI/arm64: tighten the check for the array of platform timer structures and adjust the error handling procedure in gtdt_parse_timer_block() - Optimise the cache flush for the uprobes xol slot (skip if no change) and other uprobes/kprobes cleanups - Fix the context switching of tpidrro_el0 when kpti is enabled - Dynamic shadow call stack fixes - Sysreg updates - Various arm64 kselftest improvements * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits) arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled kselftest/arm64: Try harder to generate different keys during PAC tests kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() arm64/ptrace: Clarify documentation of VL configuration via ptrace kselftest/arm64: Corrupt P0 in the irritator when testing SSVE acpi/arm64: remove unnecessary cast arm64/mm: Change protval as 'pteval_t' in map_range() kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c kselftest/arm64: Add FPMR coverage to fp-ptrace kselftest/arm64: Expand the set of ZA writes fp-ptrace does kselftets/arm64: Use flag bits for features in fp-ptrace assembler code kselftest/arm64: Enable build of PAC tests with LLVM=1 kselftest/arm64: Check that SVCR is 0 in signal handlers selftests/mm: Fix unused function warning for aarch64_write_signal_pkey() kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests kselftest/arm64: Fix build with stricter assemblers arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux() arm64/scs: Deal with 64-bit relative offsets in FDE frames ...
| * perf: Switch back to struct platform_driver::remove()Uwe Kleine-König2024-11-0623-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/perf to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20241027180313.410964-2-u.kleine-koenig@baylibre.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf: arm_pmuv3: Add support for Samsung Mongoose PMUMarkuss Broks2024-10-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Add support for the Samsung Mongoose CPU core PMU. This just adds the names and links to DT compatible strings. Co-developed-by: Maksym Holovach <nergzd@nergzd723.xyz> Signed-off-by: Maksym Holovach <nergzd@nergzd723.xyz> Signed-off-by: Markuss Broks <markuss.broks@gmail.com> Link: https://lore.kernel.org/r/20241026-mongoose-pmu-v1-2-f1a7448054be@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/dwc_pcie: Fix typos in event namesIlkka Koskinen2024-10-291-3/+3
| | | | | | | | | | | | | | | | | | | | Fix a few typos in event names Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Reviewed-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20241008231824.5102-4-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/dwc_pcie: Add support for Ampere SoCsIlkka Koskinen2024-10-291-0/+1
| | | | | | | | | | | | | | | | | | Add support for Ampere SoCs by adding Ampere's vendor ID to the vendor list. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20241008231824.5102-2-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/marvell: Marvell PEM performance monitor supportGowthami Thiagarajan2024-10-283-0/+433
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCI Express Interface PMU includes various performance counters to monitor the data that is transmitted over the PCIe link. The counters track various inbound and outbound transactions which includes separate counters for posted/non-posted/completion TLPs. Also, inbound and outbound memory read requests along with their latencies can also be monitored. Address Translation Services(ATS)events such as ATS Translation, ATS Page Request, ATS Invalidation along with their corresponding latencies are also supported. The performance counters are 64 bits wide. For instance, perf stat -e ib_tlp_pr <workload> tracks the inbound posted TLPs for the workload. Co-developed-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com> Link: https://lore.kernel.org/r/20241028055309.17893-1-gthiagarajan@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access controlRob Herring (Arm)2024-10-281-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Armv8.9/9.4 PMUv3.9 adds per counter EL0 access controls. Per counter access is enabled with the UEN bit in PMUSERENR_EL1 register. Individual counters are enabled/disabled in the PMUACR_EL1 register. When UEN is set, the CR/ER bits control EL0 write access and must be set to disable write access. With the access controls, the clearing of unused counters can be skipped. KVM also configures PMUSERENR_EL1 in order to trap to EL2. UEN does not need to be set for it since only PMUv3.5 is exposed to guests. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241002184326.1105499-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * perf/dwc_pcie: Convert the events with mixed case to lowercaseIlkka Koskinen2024-10-241-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Group #1 events had both upper and lower case characters in their names. Trying to count such events with perf tool results in an error: $ perf stat -e dwc_rootport_10008/Tx_PCIe_TLP_Data_Payload/ sleep 1 event syntax error: 'dwc_rootport_10008/Tx_PCIe_TLP_Data_Payload/' \___ Bad event or PMU Unable to find PMU or event on a PMU of 'dwc_rootport_10008' event syntax error: '..port_10008/Tx_PCIe_TLP_Data_Payload/' \___ unknown term 'Tx_PCIe_TLP_Data_Payload' for pmu 'dwc_rootport_10008' valid terms: eventid,type,lane,config,config1,config2,config3,name,period,percore,metric-id Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Perf tool assumes the event names are either in lower or upper case. This is also mentioned in Documentation/ABI/testing/sysfs-bus-event_source-devices-events "As performance monitoring event names are case insensitive in the perf tool, the perf tool only looks for lower or upper case event names in sysfs to avoid scanning the directory. It is therefore required the name of the event here is either lower or upper case." Change the Group #1 events names to lower case. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20241016210136.65452-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * perf/cxlpmu: Support missing events in 3.1 specDavidlohr Bueso2024-10-241-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the CXL PMU driver to support the new events introduced in the latest revision. These are: - read/write accesses with TEE constraints. - S2M indicating Modified state. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20241010025208.180458-1-dave@stgolabs.net Signed-off-by: Will Deacon <will@kernel.org>
| * perf: imx_perf: add support for i.MX91 platformXu Yang2024-10-241-0/+5
| | | | | | | | | | | | | | | | | | This will add compatible and identifier for i.MX91 platform. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240924061251.3387850-2-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
| * drivers perf: remove unused field pmu_nodeYunhui Cui2024-10-141-1/+0
| | | | | | | | | | | | | | | | | | The driver does not use the pmu_node field, so remove it. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20240919034601.2453-1-cuiyunhui@bytedance.com Signed-off-by: Will Deacon <will@kernel.org>
* | drivers: perf: Fix wrong put_cpu() placementAlexandre Ghiti12 days1-2/+2
|/ | | | | | | | | | | | | Unfortunately, the wrong patch version was merged which places the put_cpu() after enabling a static key, which is not safe as pointed by Will [1], so move put_cpu() before to avoid this. Fixes: 2840dadf0dde ("drivers: perf: Fix smp_processor_id() use in preemptible code") Reported-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/all/20240827125335.GD4772@willie-the-truck/ [1] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241112113422.617954-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* drivers/perf: riscv: Align errno for unsupported perf eventPu Lehui2024-10-012-4/+4
| | | | | | | | | | | | | | | | | | RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It would be more appropriate to return -EOPNOTSUPP or -ENOENT for this type in pmu_sbi_event_map. Considering that other implementations return -ENOENT for unsupported perf types, let's synchronize this behavior. Due to this reason, a riscv bpf testcases perf_skip fail. Meanwhile, align that behavior to the rest of proper place. Signed-off-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf") Fixes: 16d3b1af0944 ("perf: RISC-V: Check standard event availability") Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240831071520.1630360-1-pulehui@huaweicloud.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* Merge tag 'riscv-for-linus-6.12-mw1' of ↵Linus Torvalds2024-09-242-11/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support using Zkr to seed KASLR - Support IPI-triggered CPU backtracing - Support for generic CPU vulnerabilities reporting to userspace - A few cleanups for missing licenses - The size limit on the XIP kernel has been removed - Support for tracing userspace stacks - Support for the Svvptc extension - Various cleanups and fixes throughout the tree * tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits) crash: Fix riscv64 crash memory reserve dead loop perf/riscv-sbi: Add platform specific firmware event handling tools: Optimize ring buffer for riscv tools: Add riscv barrier implementation RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE riscv: Enable bitops instrumentation riscv: Omit optimized string routines when using KASAN ACPI: RISCV: Make acpi_numa_get_nid() to be static riscv: Randomize lower bits of stack address selftests: riscv: Allow mmap test to compile on 32-bit riscv: Make riscv_isa_vendor_ext_andes array static riscv: Use LIST_HEAD() to simplify code riscv: defconfig: Disable RZ/Five peripheral support RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup riscv: avoid Imbalance in RAS riscv: cacheinfo: Add back init_cache_level() function riscv: Remove unused _TIF_WORK_MASK drivers/perf: riscv: Remove redundant macro check riscv: define ILLEGAL_POINTER_VALUE for 64bit ...
| * perf/riscv-sbi: Add platform specific firmware event handlingMayuresh Chitale2024-09-201-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SBI v2.0 specification pointed to by the link below reserves the event code 0xffff for platform specific firmware events. Update the driver to be able to parse and program such events. The platform specific firmware events must now be specified in the perf command as below: perf stat -e rCxxx ... where bits[63:62] = 0x3 of the event config indicate a platform specific firmware event and xxx indicate the actual event code which is passed as the event data. Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v2.0/riscv-sbi.pdf Link: https://lore.kernel.org/r/20240812051109.6496-1-mchitale@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| * drivers/perf: riscv: Remove redundant macro checkXiao Wang2024-09-151-2/+0
| | | | | | | | | | | | | | | | | | | | The macro CONFIG_RISCV_PMU must have been defined when riscv_pmu.c gets compiled, so this patch removes the redundant check. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240708121224.1148154-1-xiao.w.wang@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* | Merge tag 'arm64-upstream' of ↵Linus Torvalds2024-09-1615-369/+1243
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The highlights are support for Arm's "Permission Overlay Extension" using memory protection keys, support for running as a protected guest on Android as well as perf support for a bunch of new interconnect PMUs. Summary: ACPI: - Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms. - Ensure arm64-specific IORT header is covered by MAINTAINERS. CPU Errata: - Enable workaround for hardware access/dirty issue on Ampere-1A cores. Memory management: - Define PHYSMEM_END to fix a crash in the amdgpu driver. - Avoid tripping over invalid kernel mappings on the kexec() path. - Userspace support for the Permission Overlay Extension (POE) using protection keys. Perf and PMUs: - Add support for the "fixed instruction counter" extension in the CPU PMU architecture. - Extend and fix the event encodings for Apple's M1 CPU PMU. - Allow LSM hooks to decide on SPE permissions for physical profiling. - Add support for the CMN S3 and NI-700 PMUs. Confidential Computing: - Add support for booting an arm64 kernel as a protected guest under Android's "Protected KVM" (pKVM) hypervisor. Selftests: - Fix vector length issues in the SVE/SME sigreturn tests - Fix build warning in the ptrace tests. Timers: - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with non-determinism arising from the architected counter. Miscellaneous: - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs don't succeed. - Minor fixes and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits) perf: arm-ni: Fix an NULL vs IS_ERR() bug arm64: hibernate: Fix warning for cast from restricted gfp_t arm64: esr: Define ESR_ELx_EC_* constants as UL arm64: pkeys: remove redundant WARN perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled MAINTAINERS: List Arm interconnect PMUs as supported perf: Add driver for Arm NI-700 interconnect PMU dt-bindings/perf: Add Arm NI-700 PMU perf/arm-cmn: Improve format attr printing perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check arm64/mm: use lm_alias() with addresses passed to memblock_free() mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags() arm64: Expose the end of the linear map in PHYSMEM_END arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() arm64/mm: Delete __init region from memblock.reserved perf/arm-cmn: Support CMN S3 dt-bindings: perf: arm-cmn: Add CMN S3 perf/arm-cmn: Refactor DTC PMU register access perf/arm-cmn: Make cycle counts less surprising perf/arm-cmn: Improve build-time assertion ...
| * | perf: arm-ni: Fix an NULL vs IS_ERR() bugDan Carpenter2024-09-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The devm_ioremap() function never returns error pointers, it returns a NULL pointer if there is an error. Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/04d6ccc3-6d31-4f0f-ab0f-7a88342cc09a@stanley.mountain Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabledIlkka Koskinen2024-09-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PMU driver attempts to use PC_WRITE_RETIRED for the HW branch event, if enabled. However, PC_WRITE_RETIRED counts only taken branches, whereas BR_RETIRED counts also non-taken ones. Furthermore, perf uses HW branch event to calculate branch misses ratio, implying BR_RETIRED is the correct event to count. We keep PC_WRITE_RETIRED still as an option in case BR_RETIRED isn't implemented. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20240906191539.4847-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: Add driver for Arm NI-700 interconnect PMURobin Murphy2024-09-063-0/+789
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Arm NI-700 Network-on-Chip Interconnect has a relatively straightforward design with a hierarchy of voltage, power, and clock domains, where each clock domain then contains a number of interface units and a PMU which can monitor events thereon. As such, it begets a relatively straightforward driver to interface those PMUs with perf. Even more so than with arm-cmn, users will require detailed knowledge of the wider system topology in order to meaningfully analyse anything, since the interconnect itself cannot know what lies beyond the boundary of each inscrutably-numbered interface. Given that, for now they are also expected to refer to the NI-700 documentation for the relevant event IDs to provide as well. An identifier is implemented so we can come back and add jevents if anyone really wants to. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9933058d0ab8138c78a61cd6852ea5d5ff48e393.1725470837.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Improve format attr printingRobin Murphy2024-09-061-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take full advantage of our formats being stored in bitfield form, and make the printing even more robust and simple by letting printk do all the hard work of formatting bitlists. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/50459f2d48fc62310a566863dbf8a7c14361d363.1725474584.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE checkRobin Murphy2024-09-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking for NUMA_NO_NODE is a misleading and, on reflection, entirely unnecessary micro-optimisation. If it ever did happen that an incoming CPU has no NUMA affinity while the current CPU does, a questionably- useful PMU migration isn't the biggest thing wrong with that picture... Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/00634da33c21269a00844140afc7cc3a2ac1eb4d.1725474584.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Support CMN S3Robin Murphy2024-09-041-43/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CMN S3 is the latest and greatest evolution for 2024, although most of the new features don't impact the PMU, so from our point of view it ends up looking a lot like CMN-700 r3 still. We have some new device types to ignore, a mildly irritating rearrangement of the register layouts, and a scary new configuration option that makes it potentially unsafe to even walk the full discovery tree, let alone attempt to use the PMU. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/2ec9eec5b6bf215a9886f3b69e3b00e4cd85095c.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Refactor DTC PMU register accessRobin Murphy2024-09-041-28/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Annoyingly, we're soon going to have to cope with PMU registers moving about. This will mostly be straightforward, except for the hard-coding of CMN_PMU_OFFSET for the DTC PMU registers. As a first step, refactor those accessors to allow for encapsulating a variable offset without making a big mess all over. As a bonus, we can repack the arm_cmn_dtc structure to accommodate the new pointer without growing any larger, since irq_friend only encodes a range of +/-3. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/fc677576fae7b5b55780e5b245a4ef6ea1b30daf.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Make cycle counts less surprisingRobin Murphy2024-09-041-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, CMN has automatic clock-gating with the implication that a DTC's cycle counter may not increment while the DTC is sufficiently idle. Given that we may have up to 4 DTCs to choose from when scheduling a cycles event, this may potentially lead to surprising results if trying to measure metrics based on activity in a different DTC domain from where cycles end up being counted. Furthermore, since the details of internal clock gating are not documented, we can't even reason about what "active" cycles for a DTC actually mean relative to the activity of other nodes within the same nominal DTC domain. Make the reasonable assumption that if the user wants to count cycles, they almost certainly want to count all of the cycles, and disable clock gating while a DTC's cycle counter is in use. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/c47cfdc09e907b1d7753d142a7e659982cceb246.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Improve build-time assertionRobin Murphy2024-09-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | These days we can use static_assert() in the logical place rather than jamming a BUILD_BUG_ON() into the nearest function scope. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/224ee8286f299100f1c768edb254edc898539f50.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Ensure dtm_idx is big enoughRobin Murphy2024-09-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While CMN_MAX_DIMENSION was bumped to 12 for CMN-650, that only supports up to a 10x10 mesh, so bumping dtm_idx to 256 bits at the time worked out OK in practice. However CMN-700 did finally support up to 144 XPs, and thus needs a worst-case 288 bits of dtm_idx for an aggregated XP event on a maxed-out config. Oops. Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e771b358526a0d7fc06efee2c3a2fdc0c9f51d44.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Fix CCLA register offsetRobin Murphy2024-09-041-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apparently pmu_event_sel is offset by 8 for all CCLA nodes, not just the CCLA_RNI combination type. Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/6e7bb06fef6046f83e7647aad0e5be544139763f.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/arm-cmn: Refactor node ID handling. Again.Robin Murphy2024-09-041-54/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The scope of the "extra device ports" configuration is not made clear by the CMN documentation - so far we've assumed it applies globally, based on the sole example which suggests as much. However it transpires that this is incorrect, and the format does in fact vary based on each individual XP's port configuration. As a consequence, we're currenly liable to decode the port/device indices from a node ID incorrectly, thus program the wrong event source in the DTM leading to bogus event counts, and also show device topology on the wrong ports in debugfs. To put this right, rework node IDs yet again to carry around the additional data necessary to decode them properly per-XP. At this point the notion of fully decomposing an ID becomes more impractical than it's worth, so unabstracting the XY mesh coordinates (where 2/3 users were just debug anyway) ends up leaving things a bit simpler overall. Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/5195f990152fc37adba5fbf5929a6b11063d9f09.1725296395.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]Yicong Yang2024-08-301-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently users can get the Root Ports supported by the PCIe PMU by "bus" sysfs attributes which indicates the PCIe bus number where Root Ports are located. This maybe insufficient since Root Ports supported by different PCIe PMUs may be located on the same PCIe bus. So export the BDF range the Root Ports additionally. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi_pcie: Fix TLP headers bandwidth countingYicong Yang2024-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We make the initial value of event ctrl register as HISI_PCIE_INIT_SET and modify according to the user options. This will make TLP headers bandwidth only counting never take effect since HISI_PCIE_INIT_SET configures to count the TLP payloads bandwidth. Fix this by making the initial value of event ctrl register as 0. Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: hisi_pcie: Record hardware counts correctlyYicong Yang2024-08-301-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we set the period and record it as the initial value of the counter without checking it's set to the hardware successfully or not. However the counter maybe unwritable if the target event is unsupported by the device. In such case we will pass user a wrong count: [start counts when setting the period] hwc->prev_count = 0x8000000000000000 device.counter_value = 0 // the counter is not set as the period [when user reads the counter] event->count = device.counter_value - hwc->prev_count = 0x8000000000000000 // wrong. should be 0. Fix this by record the hardware counter counts correctly when setting the period. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: arm_spe: Use perf_allow_kernel() for permissionsJames Clark2024-08-301-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use perf_allow_kernel() for 'pa_enable' (physical addresses), 'pct_enable' (physical timestamps) and context IDs. This means that perf_event_paranoid is now taken into account and LSM hooks can be used, which is more consistent with other perf_event_open calls. For example PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just perfmon_capable(). This also indirectly fixes the following error message which is misleading because perf_event_paranoid is not taken into account by perfmon_capable(): $ perf record -e arm_spe/pa_enable/ Error: Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting ... Suggested-by: Al Grant <al.grant@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240827145113.1224604-1-james.clark@linaro.org Link: https://lore.kernel.org/all/20240807120039.GD37996@noisy.programming.kicks-ass.net/ Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/dwc_pcie: Add support for QCOM vendor devicesKrishna chaitanya chundru2024-08-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Update the vendor table with QCOM PCIe vendorid. Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-4-198b8ab1077c@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/dwc_pcie: Always register for PCIe bus notifierKrishna chaitanya chundru2024-08-231-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the PCIe devices are discovered late, the driver can't find the PCIe devices and returns in the init without registering with the bus notifier. Due to that the devices which are discovered late the driver can't register for this. Register for bus notifier & driver even if the device is not found as part of init. Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver") Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-3-198b8ab1077c@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf/dwc_pcie: Fix registration issue in multi PCIe controller instancesKrishna chaitanya chundru2024-08-231-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there are multiple of instances of PCIe controllers, registration to perf driver fails with this error. sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0' CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty Hardware name: Qualcomm SA8775P Ride (DT) Call trace: dump_backtrace.part.8+0x98/0xf0 show_stack+0x14/0x1c dump_stack_lvl+0x74/0x88 dump_stack+0x14/0x1c sysfs_warn_dup+0x60/0x78 sysfs_create_dir_ns+0xe8/0x100 kobject_add_internal+0x94/0x224 kobject_add+0xa8/0x118 device_add+0x298/0x7b4 platform_device_add+0x1a0/0x228 platform_device_register_full+0x11c/0x148 dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu] dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu] do_one_initcall+0x58/0x1c0 do_init_module+0x58/0x208 load_module+0x1804/0x188c __do_sys_init_module+0x18c/0x1f0 __arm64_sys_init_module+0x14/0x1c invoke_syscall+0x40/0xf8 el0_svc_common.constprop.1+0x70/0xf4 do_el0_svc+0x18/0x20 el0_svc+0x28/0xb0 el0t_64_sync_handler+0x9c/0xc0 el0t_64_sync+0x160/0x164 kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST, don't try to register things with the same name in the same directory. This is because of having same bdf value for devices under two different controllers. Update the logic to use sbdf which is a unique number in case of multi instance also. Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver") Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-1-198b8ab1077c@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: Fix ali_drw_pmu driver interrupt status clearingJing Zhang2024-08-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The alibaba_uncore_pmu driver forgot to clear all interrupt status in the interrupt processing function. After the PMU counter overflow interrupt occurred, an interrupt storm occurred, causing the system to hang. Therefore, clear the correct interrupt status in the interrupt handling function to fix it. Fixes: cf7b61073e45 ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC") Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1724297611-20686-1-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
| * | drivers/perf: apple_m1: add known PMU eventsYangyu Chen2024-08-231-73/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds known PMU events that can be found on /usr/share/kpep in macOS. The m1_pmu_events and m1_pmu_event_affinity are generated from the script [1], which consumes the plist file from Apple. And then added these events to m1_pmu_perf_map and m1_pmu_event_attrs with Apple's documentation [2]. Link: https://github.com/cyyself/m1-pmu-gen [1] Link: https://developer.apple.com/download/apple-silicon-cpu-optimization-guide/ [2] Signed-off-by: Yangyu Chen <cyy@cyyself.name> Acked-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/tencent_C5DA658E64B8D13125210C8D707CD8823F08@qq.com Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_pmuv3: Add support for Armv9.4 PMU instruction counterRob Herring (Arm)2024-08-161-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Armv9.4/8.9 PMU adds optional support for a fixed instruction counter similar to the fixed cycle counter. Support for the feature is indicated in the ID_AA64DFR1_EL1 register PMICNTR field. The counter is not accessible in AArch32. Existing userspace using direct counter access won't know how to handle the fixed instruction counter, so we have to avoid using the counter when user access is requested. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-7-280a8d7ff465@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | arm64: perf/kvm: Use a common PMU cycle counter defineRob Herring (Arm)2024-08-161-12/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PMUv3 and KVM code each have a define for the PMU cycle counter index. Move KVM's define to a shared location and use it for PMUv3 driver. Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-5-280a8d7ff465@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_pmuv3: Prepare for more than 32 countersRob Herring (Arm)2024-08-161-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various PMUv3 registers which are a mask of counters are 64-bit registers, but the accessor functions take a u32. This has been fine as the upper 32-bits have been RES0 as there has been a maximum of 32 counters prior to Armv9.4/8.9. With Armv9.4/8.9, a 33rd counter is added. Update the accessor functions to use a u64 instead. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-2-280a8d7ff465@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_pmu: Remove event index to counter remappingRob Herring (Arm)2024-08-166-102/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xscale and Armv6 PMUs defined the cycle counter at 0 and event counters starting at 1 and had 1:1 event index to counter numbering. On Armv7 and later, this changed the cycle counter to 31 and event counters start at 0. The drivers for Armv7 and PMUv3 kept the old event index numbering and introduced an event index to counter conversion. The conversion uses masking to convert from event index to a counter number. This operation relies on having at most 32 counters so that the cycle counter index 0 can be transformed to counter number 31. Armv9.4 adds support for an additional fixed function counter (instructions) which increases possible counters to more than 32, and the conversion won't work anymore as a simple subtract and mask. The primary reason for the translation (other than history) seems to be to have a contiguous mask of counters 0-N. Keeping that would result in more complicated index to counter conversions. Instead, store a mask of available counters rather than just number of events. That provides more information in addition to the number of events. No (intended) functional changes. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-1-280a8d7ff465@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
| * | perf: arm_pmu: Use of_property_present()Rob Herring (Arm)2024-08-161-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | Use of_property_present() to test for property presence rather than of_find_property(). This is part of a larger effort to remove callers of of_find_property() and similar functions. of_find_property() leaks the DT struct property and data pointers which is a problem for dynamically allocated nodes which may be freed. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240731191312.1710417-15-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
* / drivers: perf: Fix smp_processor_id() use in preemptible codeAlexandre Ghiti2024-09-101-1/+6
|/ | | | | | | | | | | | | | | | | As reported in [1], the use of smp_processor_id() in pmu_sbi_device_probe() must be protected by disabling the preemption, so simple use get_cpu()/put_cpu() instead. Reported-by: Nam Cao <namcao@linutronix.de> Closes: https://lore.kernel.org/linux-riscv/20240820074925.ReMKUPP3@linutronix.de/ [1] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Nam Cao <namcao@linutronix.de> Fixes: a8625217a054 ("drivers/perf: riscv: Implement SBI PMU snapshot function") Reported-by: Andrea Parri <parri.andrea@gmail.com> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20240826165210.124696-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* perf: riscv: Fix selecting counters in legacy modeShifrin Dmitry2024-08-011-1/+1
| | | | | | | | | | | | | | | | | | | | It is required to check event type before checking event config. Events with the different types can have the same config. This check is missed for legacy mode code For such perf usage: sysctl -w kernel.perf_user_access=2 perf stat -e cycles,L1-dcache-loads -- driver will try to force both events to CYCLE counter. This commit implements event type check before forcing events on the special counters. Signed-off-by: Shifrin Dmitry <dmitry.shifrin@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend") Link: https://lore.kernel.org/r/20240729125858.630653-1-dmitry.shifrin@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* Merge tag 'riscv-for-linus-6.11-mw2' of ↵Linus Torvalds2024-07-271-3/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Support for NUMA (via SRAT and SLIT), console output (via SPCR), and cache info (via PPTT) on ACPI-based systems. - The trap entry/exit code no longer breaks the return address stack predictor on many systems, which results in an improvement to trap latency. - Support for HAVE_ARCH_STACKLEAK. - The sv39 linear map has been extended to support 128GiB mappings. - The frequency of the mtime CSR is now visible via hwprobe. * tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits) RISC-V: Provide the frequency of time CSR via hwprobe riscv: Extend sv39 linear mapping max size to 128G riscv: enable HAVE_ARCH_STACKLEAK riscv: signal: Remove unlikely() from WARN_ON() condition riscv: Improve exception and system call latency RISC-V: Select ACPI PPTT drivers riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init() RISC-V: ACPI: Enable SPCR table for console output on RISC-V riscv: boot: remove duplicated targets line trace: riscv: Remove deprecated kprobe on ftrace support riscv: cpufeature: Extract common elements from extension checking riscv: Introduce vendor variants of extension helpers riscv: Add vendor extensions to /proc/cpuinfo riscv: Extend cpufeature.c to detect vendor extensions RISC-V: run savedefconfig for defconfig RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init ACPI: NUMA: change the ACPI_NUMA to a hidden option ACPI: NUMA: Add handler for SRAT RINTC affinity structure ...
| * Merge patch series "riscv: Separate vendor extensions from standard extensions"Palmer Dabbelt2024-07-221-3/+8
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Charlie Jenkins <charlie@rivosinc.com> says: All extensions, both standard and vendor, live in one struct "riscv_isa_ext". There is currently one vendor extension, xandespmu, but it is likely that more vendor extensions will be added to the kernel in the future. As more vendor extensions (and standard extensions) are added, riscv_isa_ext will become more bloated with a mix of vendor and standard extensions. This also allows each vendor to be conditionally enabled through Kconfig. * b4-shazam-merge: riscv: cpufeature: Extract common elements from extension checking riscv: Introduce vendor variants of extension helpers riscv: Add vendor extensions to /proc/cpuinfo riscv: Extend cpufeature.c to detect vendor extensions Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-0-0af7587bbec0@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| | * riscv: Introduce vendor variants of extension helpersCharlie Jenkins2024-07-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vendor extensions are maintained in per-vendor structs (separate from standard extensions which live in riscv_isa). Create vendor variants for the existing extension helpers to interface with the riscv_isa_vendor bitmaps. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-3-0af7587bbec0@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
| | * riscv: Extend cpufeature.c to detect vendor extensionsCharlie Jenkins2024-07-221-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of grouping all vendor extensions into the same riscv_isa_ext that standard instructions use, create a struct "riscv_isa_vendor_ext_data_list" that allows each vendor to maintain their vendor extensions independently of the standard extensions. xandespmu is currently the only vendor extension so that is the only extension that is affected by this change. An additional benefit of this is that the extensions of each vendor can be conditionally enabled. A config RISCV_ISA_VENDOR_EXT_ANDES has been added to allow for that. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Yu Chien Peter Lin <peterlin@andestech.com> Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-1-0af7587bbec0@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* | | sysctl: treewide: constify the ctl_table argument of proc_handlersJoel Granados2024-07-242-2/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script: ``` virtual patch @r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos); @r2@ identifier func, ctl, write, buffer, lenp, ppos; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... } @r3@ identifier func; @@ int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *); @r4@ identifier func, ctl; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *); @r5@ identifier func, write, buffer, lenp, ppos; @@ int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos); ``` * Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted. * The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration. Co-developed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Co-developed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>