summaryrefslogtreecommitdiffstats
path: root/tools/perf
Commit message (Collapse)AuthorAgeFilesLines
* perf build: Ensure sysreg-defs Makefile respects output dirOliver Upton2023-11-222-10/+16
| | | | | | | | | | | | | Currently the sysreg-defs are written out to the source tree unconditionally, ignoring the specified output directory. Correct the build rule to emit the header to the output directory. Opportunistically reorganize the rules to avoid interleaving with the set of beauty make rules. Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20231121192956.919380-3-oliver.upton@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* tools perf: Add arm64 sysreg files to MANIFESTOliver Upton2023-11-221-0/+2
| | | | | | | | | | | | | Ian pointed out that source tarballs are incomplete as of commit e2bdd172e665 ("perf build: Generate arm64's sysreg-defs.h and add to include path"), since the source files needed from the kernel tree do not appear in the manifest. Add them. Reported-by: Ian Rogers <irogers@google.com> Fixes: e2bdd172e665 ("perf build: Generate arm64's sysreg-defs.h and add to include path") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20231121192956.919380-2-oliver.upton@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* tools/perf: Update tools's copy of mips syscall tableNamhyung Kim2023-11-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-14-namhyung@kernel.org
* tools/perf: Update tools's copy of s390 syscall tableNamhyung Kim2023-11-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-13-namhyung@kernel.org
* tools/perf: Update tools's copy of powerpc syscall tableNamhyung Kim2023-11-221-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-12-namhyung@kernel.org
* tools/perf: Update tools's copy of x86 syscall tableNamhyung Kim2023-11-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-11-namhyung@kernel.org
* tools headers: Update tools's copy of socket.h headerNamhyung Kim2023-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/*.sh | head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char *fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: netdev@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-7-namhyung@kernel.org
* perf lock contention: Fix a build error on 32-bitYang Jihong2023-11-211-1/+2
| | | | | | | | | | | | | | | | | | | Fix a build error on 32-bit system: util/bpf_lock_contention.c: In function 'lock_contention_get_name': util/bpf_lock_contention.c:253:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u64 {aka long long unsigned int}' [-Werror=format=] snprintf(name_buf, sizeof(name_buf), "cgroup:%lu", cgrp_id); ~~^ %llu cc1: all warnings being treated as errors Fixes: d0c502e46e97 ("perf lock contention: Prepare to handle cgroups") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: avagin@google.com Cc: daniel.diaz@linaro.org Link: https://lore.kernel.org/r/20231118024858.1567039-3-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* perf kwork: Fix a build error on 32-bitYang Jihong2023-11-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | lkft reported a build error for 32-bit system: builtin-kwork.c: In function 'top_print_work': builtin-kwork.c:1646:28: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=] 1646 | ret += printf(" %*ld ", PRINT_PID_WIDTH, work->id); | ~~~^ ~~~~~~~~ | | | | long int u64 {aka long long unsigned int} | %*lld cc1: all warnings being treated as errors make[3]: *** [/builds/linux/tools/build/Makefile.build:106: /home/tuxbuild/.cache/tuxmake/builds/1/build/builtin-kwork.o] Error 1 Fix it. Fixes: 55c40e505234 ("perf kwork top: Introduce new top utility") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: avagin@google.com Cc: daniel.diaz@linaro.org Link: https://lore.kernel.org/r/20231118024858.1567039-2-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of ↵Linus Torvalds2023-11-03242-1801/+31495
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Build: - Compile BPF programs by default if clang (>= 12.0.1) is available to enable more features like kernel lock contention, off-cpu profiling, kwork, sample filtering and so on. This can be disabled by passing BUILD_BPF_SKEL=0 to make. - Produce better error messages for bison on debug build (make DEBUG=1) by defining YYDEBUG symbol internally. perf record: - Track sideband events (like FORK/MMAP) from all CPUs even if perf record targets a subset of CPUs only (using -C option). Otherwise it may lose some information happened on a CPU out of the target list. - Fix checking raw sched_switch tracepoint argument using system BTF. This affects off-cpu profiling which attaches a BPF program to the raw tracepoint. perf lock contention: - Add --lock-cgroup option to see contention by cgroups. This should be used with BPF only (using -b option). $ sudo perf lock con -ab --lock-cgroup -- sleep 1 contended total wait max wait avg wait cgroup 835 14.06 ms 41.19 us 16.83 us /system.slice/led.service 25 122.38 us 13.77 us 4.89 us / 44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope 1 491 ns 491 ns 491 ns /system.slice/connectd.service - Add -G/--cgroup-filter option to see contention only for given cgroups. This can be useful when you identified a cgroup in the above command and want to investigate more on it. It also works with other output options like -t/--threads and -l/--lock-addr. $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1 contended total wait max wait avg wait type caller 8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8 2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25 1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a - Use per-cpu array for better spinlock tracking. This is to improve performance of the BPF program and to avoid nested contention on a lock in the BPF hash map. - Update callstack check for PowerPC. To find a representative caller of a lock, it needs to look up the call stacks. It ends the lookup when it sees 0 in the call stack buffer. However, PowerPC call stacks can have 0 values in the beginning so skip them when it expects valid call stacks after. perf kwork: - Support 'sched' class (for -k option) so that it can see task scheduling event (using sched_switch tracepoint) as well as irq and workqueue items. - Add perf kwork top subcommand to show more accurate cpu utilization with sched class above. It works both with a recorded data (using perf kwork record command) and BPF (using -b option). Unlike perf top command, it does not support interactive mode (yet). $ sudo perf kwork top -b -k sched Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [|||||||||||||||||| 61.66%] %Cpu1 [|||||||||||||||||| 61.27%] %Cpu2 [||||||||||||||||||| 66.40%] %Cpu3 [|||||||||||||||||| 61.28%] %Cpu4 [|||||||||||||||||| 61.82%] %Cpu5 [||||||||||||||||||||||| 77.41%] %Cpu6 [|||||||||||||||||| 61.73%] %Cpu7 [|||||||||||||||||| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> - Add hard/soft-irq statistics to perf kwork top. This will show the total CPU utilization with IRQ stats like below: $ sudo perf kwork top -b -k sched,irq,softirq Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 12554.889 ms, 8 cpus %Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here %Cpu0 [| 4.60%] %Cpu1 [| 4.59%] %Cpu2 [ 2.73%] %Cpu3 [| 3.81%] <SNIP> perf bench: - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is good to measure context switch overhead. With this option, it puts the reader and writer tasks in separate cgroups to enforce context switch between two different cgroups. Also it needs to set CPU affinity of the tasks in a CPU to accurately measure the impact of cgroup context switches. $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.307 [sec] 3.078180 usecs/op 324867 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000': 200,026 context-switches 63 cgroup-switches 0.321637922 seconds time elapsed You can see small number of cgroup-switches because both write and read tasks are in the same cgroup. $ sudo mkdir /sys/fs/cgroup/{AAA,BBB} $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.351 [sec] 3.512990 usecs/op 284657 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB': 200,020 context-switches 200,019 cgroup-switches 0.365034567 seconds time elapsed Now context-switches and cgroup-switches are almost same. And you can see the pipe operation took little more. - Kill child processes when perf bench sched messaging exited abnormally. Otherwise it'd leave the child doing unnecessary work. perf test: - Fix various shellcheck issues on the tests written in shell script. - Skip tests when condition is not satisfied: - object code reading test for non-text section addresses. - CoreSight test if cs_etm// event is not available. - lock contention test if not enough CPUs. Event parsing: - Make PMU alias name loading lazy to reduce the startup time in the event parsing code for perf record, stat and others in the general case. - Lazily compute PMU default config. In the same sense, delay PMU initialization until it's really needed to reduce the startup cost. - Fix event term values that are raw events. The event specification can have several terms including event name. But sometimes it clashes with raw event encoding which starts with 'r' and has hex-digits. For example, an event named 'read' should be processed as a normal event but it was mis-treated as a raw encoding and caused a failure. $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Event metrics: - Add "Compat" regex to match event with multiple identifiers. - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne. Misc: - Assorted memory leak fixes and footprint reduction. - Add "bpf_skeletons" to perf version --build-options so that users can check whether their perf tools have BPF support easily. - Fix unaligned access in Intel-PT packet decoder found by undefined-behavior sanitizer. - Avoid frequency mode for the dummy event. Surprisingly it'd impact kernel timer tick handler performance by force iterating all PMU events. - Update bash shell completion for events and metrics" * tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits) perf vendor events intel: Update tsx_cycles_per_elision metrics perf vendor events intel: Update bonnell version number to v5 perf vendor events intel: Update westmereex events to v4 perf vendor events intel: Update meteorlake events to v1.06 perf vendor events intel: Update knightslanding events to v16 perf vendor events intel: Add typo fix for ivybridge FP perf vendor events intel: Update a spelling in haswell/haswellx perf vendor events intel: Update emeraldrapids to v1.01 perf vendor events intel: Update alderlake/alderlake events to v1.23 perf build: Disable BPF skeletons if clang version is < 12.0.1 perf callchain: Fix spelling mistake "statisitcs" -> "statistics" perf report: Fix spelling mistake "heirachy" -> "hierarchy" perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit() perf tests: test_arm_coresight: Simplify source iteration perf vendor events intel: Add tigerlake two metrics perf vendor events intel: Add broadwellde two metrics perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit perf callchain: Minor layout changes to callchain_list perf callchain: Make brtype_stat in callchain_list optional ...
| * Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-nextNamhyung Kim2023-10-304-23/+41
| |\ | | | | | | | | | | | | | | | | | | To get the latest fixes in the perf tools including perf stat output, dlfilter and LLVM feature detection. Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update tsx_cycles_per_elision metricsIan Rogers2023-10-288-14/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update tsx_cycles_per_elision as per: https://github.com/intel/perfmon/pull/116 Prefer the el-start event rather than cycles-t for detecting whether the metric will work as HLE may be disabled. Remove the metric from sapphirerapids that has no el-start event. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update bonnell version number to v5Ian Rogers2023-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Spelling fixes were already incorporated in the Linux perf tree, update the version number to reflect this. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update westmereex events to v4Ian Rogers2023-10-282-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update westmereex events from v3 to v4 fixing a spelling issue. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update meteorlake events to v1.06Ian Rogers2023-10-287-6/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update meteorlake from v1.04 to v1.06 adding the changes from: https://github.com/intel/perfmon/commit/bc84df043091ec7c98c0629f3d074d9d7a108194 https://github.com/intel/perfmon/commit/405d3ee987d756b5b5d9a64d8a8fa77559822ecf Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update knightslanding events to v16Ian Rogers2023-10-286-57/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update knightslanding from v10 to v16 adding the changes from: https://github.com/intel/perfmon/commit/6c1f169f6ed63ee1fd75ebb303d0fd06d71196f5 https://github.com/intel/perfmon/commit/b22ca587ec8b5ac20471ea2f14924f63e63afe9d https://github.com/intel/perfmon/commit/e685286f083ee81cb7dafd0cd8546c79ee433187 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Add typo fix for ivybridge FPIan Rogers2023-10-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a missed space. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update a spelling in haswell/haswellxIan Rogers2023-10-282-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spelling of "in-flight" was switched to "inflight". Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update emeraldrapids to v1.01Ian Rogers2023-10-282-1/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update emeraldrapids to v1.01 from v1.00 adding the changes from: https://github.com/intel/perfmon/commit/3993b600e032a9fd443ffd828aab73de7cb167e5 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Update alderlake/alderlake events to v1.23Ian Rogers2023-10-289-14/+146
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update alderlake and alderlaken events from v1.21 to v1.23 adding the changes from: https://github.com/intel/perfmon/commit/8df4db9433a2aab59dbbac1a70281032d1af7734 https://github.com/intel/perfmon/commit/846bd247c6e04acc572ca56c992e9e65852bbe63 The tsx_cycles_per_elision metric is updated from PR: https://github.com/intel/perfmon/pull/116 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20231026003149.3287633-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf build: Disable BPF skeletons if clang version is < 12.0.1Arnaldo Carvalho de Melo2023-10-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While building on a wide range of distros and clang versions it was noticed that at least version 12.0.1 (noticed on Alpine 3.15 with "Alpine clang version 12.0.1") is needed to not fail with BTF generation errors such as: Debian:10 Debian clang version 11.0.1-2~deb10u1: CLANG /tmp/build/perf/util/bpf_skel/.tmp/sample_filter.bpf.o <SNIP> GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2 Error: failed to open BPF object file: No such file or directory make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254 make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/sample_filter.skel.h' Amazon Linux 2: clang version 11.1.0 (Amazon Linux 2 11.1.0-1.amzn2.0.2) GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h libbpf: elf: skipping unrecognized data section(18) .eh_frame libbpf: elf: skipping relo section(19) .rel.eh_frame for section(18) .eh_frame libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2 Error: failed to open BPF object file: No such file or directory make[2]: *** [/tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254 make[2]: *** Deleting file `/tmp/build/perf/util/bpf_skel/sample_filter.skel.h' Ubuntu 20.04: clang version 10.0.0-4ubuntu1 CLANG /tmp/build/perf/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o GENSKEL /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h GENSKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h libbpf: sec '.reluprobe': corrupted symbol #27 pointing to invalid section #65522 for relo #0 GENSKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h Error: failed to open BPF object file: BPF object format invalid make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h] Error 95 make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h' So check if the version is at least 12.0.1 otherwise disable building BPF skels and provide a message about it, continuing the build. The message, when running on amazonlinux:2: Makefile.config:698: Warning: Disabled BPF skeletons as reliable BTF generation needs at least clang version 12.0.1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/ZTvGx/Ou6BVnYBqi@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf callchain: Fix spelling mistake "statisitcs" -> "statistics"Colin Ian King2023-10-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | There are a couple of spelling mistakes in perror messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231027084633.1167530-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf report: Fix spelling mistake "heirachy" -> "hierarchy"Colin Ian King2023-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | There is a spelling mistake in a ui error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231027084011.1167091-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf python: Fix binding linkage due to rename and move of ↵Arnaldo Carvalho de Melo2023-10-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | evsel__increase_rlimit() The changes in ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") ended up breaking the python binding that now references the rlimit__increase_nofile function, add the util/rlimit.o to the tools/perf/util/python-ext-sources to cure that. This was detected by the 'perf test python' regression test: $ perf test python 14: 'import perf' in python : FAILED! $ perf test -v python Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 14: 'import perf' in python : --- start --- test child forked, pid 2912462 python usage test: "echo "import sys ; sys.path.insert(0, '/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: rlimit__increase_nofile test child finished with -1 ---- end ---- 'import perf' in python: FAILED! $ Fixes: e093a222d7cba1eb ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/lkml/ZTrCS5Z3PZAmfPdV@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf tests: test_arm_coresight: Simplify source iterationJames Clark2023-10-261-14/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two reasons to do this, firstly there is a shellcheck warning in cs_etm_dev_name(), which can be completely deleted. And secondly the current iteration method doesn't support systems with both ETE and ETM because it picks one or the other. There isn't a known system with this configuration, but it could happen in the future. Iterating over all the sources for each CPU can be done by going through /sys/bus/event_source/devices/cs_etm/cpu* and following the symlink back to the Coresight device in /sys/bus/coresight/devices. This will work whether the device is ETE, ETM or any future name, and is much simpler and doesn't require any hard coded version numbers Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: tianruidong@linux.alibaba.com Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: atrajeev@linux.vnet.ibm.com Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231023131550.487760-1-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Add tigerlake two metricsIan Rogers2023-10-261-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add tma_info_system_socket_clks and uncore_freq metrics. The associated converter script fix is in: https://github.com/intel/perfmon/pull/112 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926205948.1399594-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Add broadwellde two metricsIan Rogers2023-10-261-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add tma_info_system_socket_clks and uncore_freq metrics that require a broadwellx style uncore event for UNC_CLOCK. The associated converter script fix is in: https://github.com/intel/perfmon/pull/112 Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926205948.1399594-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metricIan Rogers2023-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Broadwell-de has a consumer core and server uncore. The uncore_arb PMU isn't present and the broadwellx style cbox PMU should be used instead. Fix the tma_info_system_dram_bw_use metric to use the server metric rather than client. The associated converter script fix is in: https://github.com/intel/perfmon/pull/111 Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Link: https://lore.kernel.org/r/20230926031034.1201145-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exitIan Rogers2023-10-257-39/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix leak where mem_info__put wouldn't release the maps/map as used by perf mem. Add exit functions and use elsewhere that the maps and map are released. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf callchain: Minor layout changes to callchain_listIan Rogers2023-10-251-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid 6 byte hole for padding. Place more frequently used fields first in an attempt to use just 1 cacheline in the common case. Before: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ const char * srcline; /* 104 8 */ struct list_head list; /* 112 16 */ /* size: 128, cachelines: 2, members: 13 */ /* sum members: 122, holes: 1, sum holes: 6 */ }; ``` After: ``` struct callchain_list { struct list_head list; /* 0 16 */ u64 ip; /* 16 8 */ struct map_symbol ms; /* 24 24 */ const char * srcline; /* 48 8 */ u64 branch_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 from_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ u64 predicted_count; /* 104 8 */ u64 abort_count; /* 112 8 */ struct { _Bool unfolded; /* 120 1 */ _Bool has_children; /* 121 1 */ }; /* 120 2 */ /* size: 128, cachelines: 2, members: 13 */ /* padding: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf callchain: Make brtype_stat in callchain_list optionalIan Rogers2023-10-252-9/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct callchain_list is 352bytes in size, 232 of which are brtype_stat. brtype_stat is only used for certain callchain_list items so make it optional, allocating when necessary. So that printing doesn't need to deal with an optional brtype_stat, pass an empty/zero version. Before: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat brtype_stat; /* 96 232 */ /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */ const char * srcline; /* 328 8 */ struct list_head list; /* 336 16 */ /* size: 352, cachelines: 6, members: 13 */ /* sum members: 346, holes: 1, sum holes: 6 */ /* last cacheline: 32 bytes */ }; ``` After: ``` struct callchain_list { u64 ip; /* 0 8 */ struct map_symbol ms; /* 8 24 */ struct { _Bool unfolded; /* 32 1 */ _Bool has_children; /* 33 1 */ }; /* 32 2 */ /* XXX 6 bytes hole, try to pack */ u64 branch_count; /* 40 8 */ u64 from_count; /* 48 8 */ u64 predicted_count; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 abort_count; /* 64 8 */ u64 cycles_count; /* 72 8 */ u64 iter_count; /* 80 8 */ u64 iter_cycles; /* 88 8 */ struct branch_type_stat * brtype_stat; /* 96 8 */ const char * srcline; /* 104 8 */ struct list_head list; /* 112 16 */ /* size: 128, cachelines: 2, members: 13 */ /* sum members: 122, holes: 1, sum holes: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf callchain: Make display use of branch_type_stat constIan Rogers2023-10-253-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Display code doesn't modify the branch_type_stat so switch uses to const. This is done to aid refactoring struct callchain_list where current the branch_type_stat is embedded even if not used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf offcpu: Add missed btf_freeIan Rogers2023-10-251-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caught by address/leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf threads: Remove unused dead thread listIan Rogers2023-10-252-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 40826c45eb0b ("perf thread: Remove notion of dead threads") removed dead threads but the list head wasn't removed. Remove it here. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf hist: Add missing puts to hist__account_cyclesIan Rogers2023-10-251-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caught using reference count checking on perf top with "--call-graph=lbr". After this no memory leaks were detected. Fixes: 57849998e2cd ("perf report: Add processing for cycle histograms") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | libperf rc_check: Add RC_CHK_EQUALIan Rogers2023-10-258-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Comparing pointers with reference count checking is tricky to avoid a SEGV. Add a convenience macro to simplify and use. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf machine: Avoid out of bounds LBR memory readIan Rogers2023-10-251-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running perf top with address sanitizer and "--call-graph=lbr" fails due to reading sample 0 when no samples exist. Add a guard to prevent this. Fixes: e2b23483eb1d ("perf machine: Factor out lbr_callchain_add_lbr_ip()") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf rwsem: Add debug mode that uses a mutexIan Rogers2023-10-252-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mutex error check will capture trying to take the lock recursively and other problems that rwlock won't. At the expense of concurrency, adda debug mode that uses a mutex in place of a rwsem. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf build: Address stray '\' before # that is warned about since grep 3.8Arnaldo Carvalho de Melo2023-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To address this grep 3.8 warning: grep: warning: stray \ before # We needed to remove the '' around the grep expression and keep the \ before # so that it is escaped by the $(shell grep ...) and thus doesn't get to grep. We need that \ before the #, otherwise we get this: Makefile.perf:364: *** unterminated call to function 'shell': missing ')'. Stop. As everything after the # will be considered a comment. Removing the single quotes needs some more escaping so that _some_ of the escaped chars gets to grep, like the '\|' that becomes '\\\|´. Running on debian:10, where there is no libtraceevent-devel available, we get: Makefile.perf:367: *** PYTHON_EXT_SRCS= util/python.c ../lib/ctype.c util/cap.c util/evlist.c util/evsel.c util/evsel_fprintf.c util/perf_event_attr_fprintf.c util/cpumap.c util/memswap.c util/mmap.c util/namespaces.c ../lib/bitmap.c ../lib/find_bit.c ../lib/list_sort.c ../lib/hweight.c ../lib/string.c ../lib/vsprintf.c util/thread_map.c util/util.c util/cgroup.c util/parse-branch-options.c util/rblist.c util/counts.c util/print_binary.c util/strlist.c ../lib/rbtree.c util/string.c util/symbol_fprintf.c util/units.c util/affinity.c util/rwsem.c util/hashmap.c util/perf_regs.c util/fncache.c util/perf-regs-arch/perf_regs_aarch64.c util/perf-regs-arch/perf_regs_arm.c util/perf-regs-arch/perf_regs_csky.c util/perf-regs-arch/perf_regs_loongarch.c util/perf-regs-arch/perf_regs_mips.c util/perf-regs-arch/perf_regs_powerpc.c util/perf-regs-arch/perf_regs_riscv.c util/perf-regs-arch/perf_regs_s390.c util/perf-regs-arch/perf_regs_x86.c. Stop. make[1]: *** [Makefile.perf:242: sub-make] Error 2 I.e. both the comments and the util/trace-event.c were removed. When using: msg := $(error PYTHON_EXT_SRCS=$(PYTHON_EXT_SRCS)) While on the more recent fedora:38, with the new grep and make packages and libtraceevent-devel installed: Makefile.perf:367: *** PYTHON_EXT_SRCS= util/python.c ../lib/ctype.c util/cap.c util/evlist.c util/evsel.c util/evsel_fprintf.c util/perf_event_attr_fprintf.c util/cpumap.c util/memswap.c util/mmap.c util/namespaces.c ../lib/bitmap.c ../lib/find_bit.c ../lib/list_sort.c ../lib/hweight.c ../lib/string.c ../lib/vsprintf.c util/thread_map.c util/util.c util/cgroup.c util/parse-branch-options.c util/rblist.c util/counts.c util/print_binary.c util/strlist.c util/trace-event.c ../lib/rbtree.c util/string.c util/symbol_fprintf.c util/units.c util/affinity.c util/rwsem.c util/hashmap.c util/perf_regs.c util/fncache.c util/perf-regs-arch/perf_regs_aarch64.c util/perf-regs-arch/perf_regs_arm.c util/perf-regs-arch/perf_regs_csky.c util/perf-regs-arch/perf_regs_loongarch.c util/perf-regs-arch/perf_regs_mips.c util/perf-regs-arch/perf_regs_powerpc.c util/perf-regs-arch/perf_regs_riscv.c util/perf-regs-arch/perf_regs_s390.c util/perf-regs-arch/perf_regs_x86.c. Stop. make[1]: *** [Makefile.perf:242: sub-make] Error 2 make: *** [Makefile:113: install-bin] Error 2 make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf' $ I.e. only the comments were removed. If we build it on the same fedora:38 system, but using NO_LIBTRACEEVENT=1 $ make NO_LIBTRACEEVENT=1 CORESIGHT=1 O=/tmp/build/$(basename $PWD) -C tools/perf install-bin Makefile.perf:367: *** PYTHON_EXT_SRCS= util/python.c ../lib/ctype.c util/cap.c util/evlist.c util/evsel.c util/evsel_fprintf.c util/perf_event_attr_fprintf.c util/cpumap.c util/memswap.c util/mmap.c util/namespaces.c ../lib/bitmap.c ../lib/find_bit.c ../lib/list_sort.c ../lib/hweight.c ../lib/string.c ../lib/vsprintf.c util/thread_map.c util/util.c util/cgroup.c util/parse-branch-options.c util/rblist.c util/counts.c util/print_binary.c util/strlist.c ../lib/rbtree.c util/string.c util/symbol_fprintf.c util/units.c util/affinity.c util/rwsem.c util/hashmap.c util/perf_regs.c util/fncache.c util/perf-regs-arch/perf_regs_aarch64.c util/perf-regs-arch/perf_regs_arm.c util/perf-regs-arch/perf_regs_csky.c util/perf-regs-arch/perf_regs_loongarch.c util/perf-regs-arch/perf_regs_mips.c util/perf-regs-arch/perf_regs_powerpc.c util/perf-regs-arch/perf_regs_riscv.c util/perf-regs-arch/perf_regs_s390.c util/perf-regs-arch/perf_regs_x86.c. Stop. make[1]: *** [Makefile.perf:242: sub-make] Error 2 make: *** [Makefile:113: install-bin] Error 2 make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf' $ Both comments and the util/trace-event.c file removed. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/ZTj6mfM9UqY2DggC@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf report: Fix hierarchy mode on pipe inputNamhyung Kim2023-10-251-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hierarchy mode needs to setup output formats for each evsel. Normally setup_sorting() handles this at the beginning, but it cannot do that if data comes from a pipe since there's no evsel info before reading the data. And then perf report cannot process the samples in hierarchy mode and think as if there's no sample. Let's check the condition and setup the output formats after reading data so that it can find evsels. Before: $ ./perf record -o- true | ./perf report -i- --hierarchy -q [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] Error: The - data has no samples! After: $ ./perf record -o- true | ./perf report -i- --hierarchy -q [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 94.76% true 94.76% [kernel.kallsyms] 94.76% [k] filemap_fault 5.24% perf-ex 5.24% [kernel.kallsyms] 5.06% [k] __memset 0.18% [k] native_write_msr Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20231025003121.2811738-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf lock contention: Use per-cpu array map for spinlocksNamhyung Kim2023-10-251-17/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently lock contention timestamp is maintained in a hash map keyed by pid. That means it needs to get and release a map element (which is proctected by spinlock!) on each contention begin and end pair. This can impact on performance if there are a lot of contention (usually from spinlocks). It used to go with task local storage but it had an issue on memory allocation in some critical paths. Although it's addressed in recent kernels IIUC, the tool should support old kernels too. So it cannot simply switch to the task local storage at least for now. As spinlocks create lots of contention and they disabled preemption during the spinning, it can use per-cpu array to keep the timestamp to avoid overhead in hashmap update and delete. In contention_begin, it's easy to check the lock types since it can see the flags. But contention_end cannot see it. So let's try to per-cpu array first (unconditionally) if it has an active element (lock != 0). Then it should be used and per-task tstamp map should not be used until the per-cpu array element is cleared which means nested spinlock contention (if any) was finished and it nows see (the outer) lock. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-3-namhyung@kernel.org
| * | perf lock contention: Check race in tstamp elem creationNamhyung Kim2023-10-251-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When pelem is NULL, it'd create a new entry with zero data. But it might be preempted by IRQ/NMI just before calling bpf_map_update_elem() then there's a chance to call it twice for the same pid. So it'd be better to use BPF_NOEXIST flag and check the return value to prevent the race. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-2-namhyung@kernel.org
| * | perf lock contention: Clear lock addr after useNamhyung Kim2023-10-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It checks the current lock to calculated the delta of contention time. The address is saved in the tstamp map which is allocated at begining of contention and released at end of contention. But it's possible for bpf_map_delete_elem() to fail. In that case, the element in the tstamp map kept for the current lock and it makes the next contention for the same lock tracked incorrectly. Specificially the next contention begin will see the existing element for the task and it'd just return. Then the next contention end will see the element and calculate the time using the timestamp for the previous begin. This can result in a large value for two small contentions happened from time to time. Let's clear the lock address so that it can be updated next time even if the bpf_map_delete_elem() failed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-1-namhyung@kernel.org
| * | perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofileYang Jihong2023-10-255-34/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | evsel__increase_rlimit() helper does nothing with evsel, and description of the functionality is inaccurate, rename it and move to util/rlimit.c. By the way, fix a checkppatch warning about misplaced license tag: WARNING: Misplaced SPDX-License-Identifier tag - use line 1 instead #160: FILE: tools/perf/util/rlimit.h:3: /* SPDX-License-Identifier: LGPL-2.1 */ No functional change. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231023033144.1011896-1-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf bench sched pipe: Add -G/--cgroups optionNamhyung Kim2023-10-252-4/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The -G/--cgroups option is to put sender and receiver in different cgroups in order to measure cgroup context switch overheads. Users need to make sure the cgroups exist and accessible. The following example should the effect of this change. Please don't forget taskset before the perf bench to measure cgroup switches properly. Otherwise each task would run on a different CPU and generate cgroup switches regardless of this change. # perf stat -e context-switches,cgroup-switches \ > taskset -c 0 perf bench sched pipe -l 10000 > /dev/null Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 10000': 20,001 context-switches 2 cgroup-switches 0.053449651 seconds time elapsed 0.011286000 seconds user 0.041869000 seconds sys # perf stat -e context-switches,cgroup-switches \ > taskset -c 0 perf bench sched pipe -l 10000 -G AAA,BBB > /dev/null Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 10000 -G AAA,BBB': 20,001 context-switches 20,001 cgroup-switches 0.052768627 seconds time elapsed 0.006284000 seconds user 0.046266000 seconds sys Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20231017202342.1353124-1-namhyung@kernel.org
| * | perf test: Skip CoreSight tests if cs_etm// event is not availableMichael Petlan2023-10-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CoreSight might be not available, in such case, skip the tests. Signed-off-by: Michael Petlan <mpetlan@redhat.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Carsten Haitzler <carsten.haitzler@arm.com> Cc: vmolnaro@redhat.com Link: https://lore.kernel.org/r/20231019091137.22525-1-mpetlan@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf data: Increase RLIMIT_NOFILE limit when open too many files in ↵Yang Jihong2023-10-191-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_data__create_dir() If using parallel threads to collect data, perf record needs at least 6 fds per CPU. (one for sys_perf_event_open, four for pipe msg and ack of the pipe, see record__thread_data_open_pipes(), and one for open perf.data.XXX) For an environment with more than 100 cores, if perf record uses both `-a` and `--threads` options, it is easy to exceed the upper limit of the file descriptor number, when we run out of them try to increase the limits. Before: $ ulimit -n 1024 $ lscpu | grep 'On-line CPU(s)' On-line CPU(s) list: 0-159 $ perf record --threads -a sleep 1 Failed to create data directory: Too many open files After: $ ulimit -n 1024 $ lscpu | grep 'On-line CPU(s)' On-line CPU(s) list: 0-159 $ perf record --threads -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.394 MB perf.data (1576 samples) ] Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231013075945.698874-1-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf vendor events: Update PMC used in PM_RUN_INST_CMPL event for power10 ↵Kajol Jain2023-10-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | platform The CPI_STALL_RATIO metric group can be used to present the high level CPI stall breakdown metrics in powerpc, which will show: - DISPATCH_STALL_CPI ( Dispatch stall cycles per insn ) - ISSUE_STALL_CPI ( Issue stall cycles per insn ) - EXECUTION_STALL_CPI ( Execution stall cycles per insn ) - COMPLETION_STALL_CPI ( Completion stall cycles per insn ) Commit cf26e043c2a9 ("perf vendor events power10: Add JSON metric events to present CPI stall cycles in powerpc)" which added the CPI_STALL_RATIO metric group, also modified the PMC value used in PM_RUN_INST_CMPL event from PMC4 to PMC5, to avoid multiplexing of events. But that got revert in recent changes. Fix this issue by changing back the PMC value used in PM_RUN_INST_CMPL to PMC5. Result with the fix: ./perf stat --metric-no-group -M CPI_STALL_RATIO <workload> Performance counter stats for 'workload': 68,745,426 PM_CMPL_STALL # 0.21 COMPLETION_STALL_CPI 7,692,827 PM_ISSUE_STALL # 0.02 ISSUE_STALL_CPI 322,638,223 PM_RUN_INST_CMPL # 0.05 DISPATCH_STALL_CPI # 0.48 EXECUTION_STALL_CPI 16,858,553 PM_DISP_STALL_CYC 153,880,133 PM_EXEC_STALL 0.089774592 seconds time elapsed "--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled in all group for more accuracy. Fixes: 7d473f475b2a ("perf vendor events: Move JSON/events to appropriate files for power10 platform") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel<disgoel@linux.ibm.com> Cc: maddy@linux.ibm.com Link: https://lore.kernel.org/r/20231016143110.244255-1-kjain@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf trace: Use the right bpf_probe_read(_str) variant for reading user dataThomas Richter2023-10-191-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf test case 111 Check open filename arg using perf trace + vfs_getname fails on s390. This is caused by a failing function bpf_probe_read() in file util/bpf_skel/augmented_raw_syscalls.bpf.c. The root cause is the lookup by address. Function bpf_probe_read() is used. This function works only for architectures with ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. On s390 is not possible to determine from the address to which address space the address belongs to (user or kernel space). Replace bpf_probe_read() by bpf_probe_read_kernel() and bpf_probe_read_str() by bpf_probe_read_user_str() to explicity specify the address space the address refers to. Output before: # ./perf trace -eopen,openat -- touch /tmp/111 libbpf: prog 'sys_enter': BPF program load failed: Invalid argument libbpf: prog 'sys_enter': -- BEGIN PROG LOAD LOG -- reg type unsupported for arg#0 function sys_enter#75 0: R1=ctx(off=0,imm=0) R10=fp0 ; int sys_enter(struct syscall_enter_args *args) 0: (bf) r6 = r1 ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0) ; return bpf_get_current_pid_tgid(); 1: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 2: (63) *(u32 *)(r10 -8) = r0 ; R0_w=scalar() R10=fp0 fp-8=????mmmm 3: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 ; ..... lines deleted here ..... 23: (bf) r3 = r6 ; R3_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0) 24: (85) call bpf_probe_read#4 unknown func bpf_probe_read#4 processed 23 insns (limit 1000000) max_states_per_insn 0 \ total_states 2 peak_states 2 mark_read 2 -- END PROG LOAD LOG -- libbpf: prog 'sys_enter': failed to load: -22 libbpf: failed to load object 'augmented_raw_syscalls_bpf' libbpf: failed to load BPF skeleton 'augmented_raw_syscalls_bpf': -22 .... Output after: # ./perf test -Fv 111 111: Check open filename arg using perf trace + vfs_getname : --- start --- 1.085 ( 0.011 ms): touch/320753 openat(dfd: CWD, filename: \ "/tmp/temporary_file.SWH85", \ flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3 ---- end ---- Check open filename arg using perf trace + vfs_getname: Ok # Test with the sleep command shows: Output before: # ./perf trace -e *sleep sleep 1.234567890 0.000 (1234.681 ms): sleep/63114 clock_nanosleep(rqtp: \ { .tv_sec: 0, .tv_nsec: 0 }, rmtp: 0x3ffe0979720) = 0 # Output after: # ./perf trace -e *sleep sleep 1.234567890 0.000 (1234.686 ms): sleep/64277 clock_nanosleep(rqtp: \ { .tv_sec: 1, .tv_nsec: 234567890 }, rmtp: 0x3fff3df9ea0) = 0 # Fixes: 14e4b9f4289a ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Co-developed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: gor@linux.ibm.com Cc: hca@linux.ibm.com Cc: sumanthk@linux.ibm.com Cc: svens@linux.ibm.com Link: https://lore.kernel.org/r/20231019082642.3286650-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * | perf tools: Do not ignore the default vmlinux.hNamhyung Kim2023-10-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent change made it possible to generate vmlinux.h from BTF and to ignore the file. But we also have a minimal vmlinux.h that will be used by default. It should not be ignored by GIT. Fixes: b7a2d774c9c5 ("perf build: Add ability to build with a generated vmlinux.h") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310110451.rvdUZJEY-lkp@intel.com/ Cc: oe-kbuild-all@lists.linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>