summaryrefslogtreecommitdiffstats
path: root/tools
Commit message (Collapse)AuthorAgeFilesLines
* perf symbols: Fix ownership of string in dso__load_vmlinux()James Clark2024-05-091-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The linked commit updated dso__load_vmlinux() to call dso__set_long_name() before loading the symbols. Loading the symbols may not succeed but dso__set_long_name() takes ownership of the string. The two callers of this function free the string themselves on failure cases, resulting in the following error: $ perf record -- ls $ perf report free(): double free detected in tcache 2 Fix it by always taking ownership of the string, even on failure. This means the string is either freed at the very first early exit condition, or later when the dso is deleted or the long name is replaced. Now no special return value is needed to signify that the caller needs to free the string. Fixes: e59fea47f83e8a9a ("perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-5-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf symbols: Update kcore map before merging in remaining symbolsJames Clark2024-05-091-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When loading kcore, the main vmlinux map is updated in the same loop that merges the remaining maps. If a map that overlaps is merged in before kcore, the list can become unsortable when the main map addresses are updated. This will later trigger the check_invariants() assert: $ perf record $ perf report util/maps.c:96: check_invariants: Assertion `map__end(prev) <= map__start(map) || map__start(prev) == map__start(map)' failed. Aborted Fix it by moving the main map update prior to the loop so that maps__merge_in() can split it if necessary. Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf maps: Re-use __maps__free_maps_by_name()James Clark2024-05-091-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | maps__merge_in() hard codes the steps to free the maps_by_name list. It seems to not map__put() each element before freeing, and it sets maps_by_name_sorted to true after freeing, which may be harmless but is inconsistent with maps__init() and other functions. maps__maps_by_name_addr() is also quite hard to read because we already have maps__maps_by_name() and maps__maps_by_address(), but the function is only used in that place so delete it. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf symbols: Remove map from list before updating addressesJames Clark2024-05-091-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Make the order of operations remove, update, add. Updating addresses before the map is removed causes the ordering check to fail when the map is removed. This can be reproduced when running Perf on an Arm system with a static kernel and Perf uses kcore rather than other sources: $ perf record -- ls $ perf report util/maps.c:96: check_invariants: Assertion `map__end(prev) <= map__start(map) || map__start(prev) == map__start(map)' failed Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tracepoint: Don't scan all tracepoints to test if one existsIan Rogers2024-05-092-35/+24
| | | | | | | | | | | | | | | | | | | | | | | | In is_valid_tracepoint, rather than scanning "/sys/kernel/tracing/events/*/*" skipping any path where "/sys/kernel/tracing/events/*/*/id" doesn't exist, and then testing if "*:*" matches the tracepoint name, just use the given tracepoint name replace the ':' with '/' and see if the id file exists. This turns a nested directory search into a single file available test. Rather than return 1 for valid and 0 for invalid, return true and false. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240509153245.1990426-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORTJames Clark2024-05-091-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | check_allowed_ops() is used from both HAVE_DWARF_GETLOCATIONS_SUPPORT and HAVE_DWARF_CFI_SUPPORT sections, so move it into the right place so that it's available when either are defined. This shows up when doing a static cross compile for arm64: $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- LDFLAGS="-static" \ EXTRA_PERFLIBS="-lexpat" util/dwarf-aux.c:1723:6: error: implicit declaration of function 'check_allowed_ops' Fixes: 55442cc2f22d0727 ("perf dwarf-aux: Check allowed DWARF Ops") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508141458.439017-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf thread: Fixes to thread__new() related to initializing commIan Rogers2024-05-091-9/+5
| | | | | | | | | | | | | | | | | | | | | | Freeing the thread on failure won't work with reference count checking, use thread__delete(). Don't allocate the comm_str, use a stack allocation instead. Fixes: f6005cafebab72f8 ("perf thread: Add reference count checking") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Avoid SEGV in report__setup_sample_type()Ian Rogers2024-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | In some cases evsel->name is lazily initialized in evsel__name(). If not initialized passing NULL to strstr() leads to a SEGV. Fixes: ccb17caecfbd542f ("perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf comm: Fix comm_str__put() for reference count checkingIan Rogers2024-05-091-14/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | Searching for the entry in the array needs to avoid the intermediate pointer with reference count checking. Refactor the array removal to binary search for the entry. Change the array to hold an entry with a reference count (so the intermediate pointer can work) and remove from the array when the reference count on a comm_str falls to 1. Fixes: 13ca628716c6f2c3 ("perf comm: Add reference count checking to 'struct comm_str'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ui browser: Avoid SEGV on titleIan Rogers2024-05-091-1/+1
| | | | | | | | | | | | | | | | | | | If the title is NULL then it can lead to a SEGV. Fixes: 769e6a1e15bdbbaf ("perf ui browser: Don't save pointer to stack memory") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dwarf-aux: Print array type name with "[]"Namhyung Kim2024-05-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | It's confusing both pointers and arrays are printed as *. Let's print array types with [] so that we can identify them easily. Although it's interchangable, sometimes it can cause confusion with size like in the below example. Note that it is not the same with C syntax where it goes to the variable names, but we want to have it in the type names (like in Go language). Before: mov [20] 0x68(reg5) -> reg0 type='struct page**' size=0x80 (die:0x4e61d32) After: mov [20] 0x68(reg5) -> reg0 type='struct page*[]' size=0x80 (die:0x4e61d32) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507041338.2081775-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hist: Avoid 'struct hist_entry_iter' mem_info memory leakIan Rogers2024-05-072-28/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'struct mem_info' is reference counted while 'struct branch_info' and he_cache (struct hist_entry **) are not. Break apart the priv field in 'struct hist_entry_iter' so that we can know which values are owned by the iter and do the appropriate free or put. Move hide_unresolved to marginally shrink the size of the now grown struct. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf mem-info: Add reference count checkingIan Rogers2024-05-0711-88/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add reference count checking and switch 'struct mem_info' usage to use accessor functions. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf mem-info: Move mem-info out of mem-events and symbolIan Rogers2024-05-0715-63/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move mem-info to its own header rather than having it split between mem-events and symbol. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf comm: Add reference count checking to 'struct comm_str'Ian Rogers2024-05-071-70/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reference count checking of an rbtree is troublesome as each pointer should have a reference, switch to using a sorted array. Remove an indirection by embedding the reference count with the string. Use pthread_once to safely initialize the comm_strs and reader writer mutex. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf cpumap: Remove refcnt from 'struct cpu_aggr_map'Ian Rogers2024-05-073-17/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is assigned a value of 1 and never incremented. Remove and replace puts with delete. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf block-info: Remove unused refcountIan Rogers2024-05-073-33/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | block_info__get() has no callers so the refcount is only ever one. As such remove the reference counting logic and turn puts to deletes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Fix memory leak in annotated_sourceIan Rogers2024-05-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing hash map doesn't free the entries added to the hashmap, add the missing free(). Fixes: d3e7cad6f36d9e80 ("perf annotate: Add a hashmap for symbol histogram") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ui browser: Don't save pointer to stack memoryIan Rogers2024-05-072-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 #20 0x55c9400e53ad in main tools/perf/perf.c:561 #21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b0804ec4 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf bench internals inject-build-id: Fix trap divide when collecting just ↵He Zhe2024-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | one DSO 'perf bench internals inject-build-id' suffers from the following error when only one DSO is collected. # perf bench internals inject-build-id -v Collected 1 DSOs traps: internals-injec[2305] trap divide error ip:557566ba6394 sp:7ffd4de97fe0 error:0 in perf[557566b2a000+23d000] Build-id injection benchmark Iteration #1 Floating point exception This patch removes the unnecessary minus one from the divisor which also corrects the randomization range. Signed-off-by: He Zhe <zhe.he@windriver.com> Fixes: 0bf02a0d80427f26 ("perf bench: Add build-id injection benchmark") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240507065026.2652929-1-zhe.he@windriver.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf probe: Use zfree() to avoid possibly accessing dangling pointersArnaldo Carvalho de Melo2024-05-071-1/+1
| | | | | | | | | | | | | | | | | When freeing a->b it is good practice to set a->b to NULL using zfree(&a->b) so that when we have a bug where a reference to a freed 'a' pointer is kept somewhere, we can more quickly cause a segfault if some code tries to use a->b. Convert one such case in the 'perf probe' codebase. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZjpBnkL2wO3QJa5W@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Allow number of queues to be specifiedJames Clark2024-05-072-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it's only possible to initialize with the default number of queues and then use auxtrace_queues__add_event() to grow the array. But that's problematic if you don't have a real event to pass into that function yet. The queues hold a void *priv member to store custom state, and for Coresight we want to create decoders upfront before receiving data, so add a new function that allows pre-allocating queues. One reason to do this is because we might need to store metadata (HW_ID events) that effects other queues, but never actually receive auxtrace data on that queue. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Clevenger <scclevenger@os.amperecomputing.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240429152207.479221-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versionsJames Clark2024-05-071-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | The likely fix for this is to update perf so print a helpful message. Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Acked-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Clevenger <scclevenger@os.amperecomputing.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240429152207.479221-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Fix a comment about multi_regs in extract_reg_offset functionAthira Rajeev2024-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | | Fix a comment in function which explains how multi_regs field gets set for an instruction. In the example, "mov %rsi, 8(%rbx,%rcx,4)", the comment mistakenly referred to "dst_multi_regs = 0". Correct it to use "src_multi_regs = 0" Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/r/20240506121906.76639-4-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf kwork: Use zfree() to avoid possibly accessing dangling pointersArnaldo Carvalho de Melo2024-05-071-1/+1
| | | | | | | | | | | | | | | | | | When freeing a->b it is good practice to set a->b to NULL using zfree(&a->b) so that when we have a bug where a reference to a freed 'a' pointer is kept somewhere, we can more quickly cause a segfault if some code tries to use a->b. Convert one such case in the 'perf kwork' codebase. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/lkml/Zjmc5EiN6zmWZj4r@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf callchain: Use zfree() to avoid possibly accessing dangling pointersArnaldo Carvalho de Melo2024-05-071-1/+1
| | | | | | | | | | | | | | | | | When freeing a->b it is good practice to set a->b to NULL using zfree(&a->b) so that when we have a bug where a reference to a freed 'a' pointer is kept somewhere, we can more quickly cause a segfault if some code tries to use a->b. Convert one such case in the callchain code. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZjmcGobQ8E52EyjJ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Use zfree() to avoid possibly accessing dangling pointersArnaldo Carvalho de Melo2024-05-073-11/+13
| | | | | | | | | | | | | | | | | | When freeing a->b it is good practice to set a->b to NULL using zfree(&a->b) so that when we have a bug where a reference to a freed 'a' pointer is kept somewhere, we can more quickly cause a segfault if some code tries to use a->b. This is mostly done but some new cases were introduced recently, convert them to zfree(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZjmbHHrjIm5YRIBv@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dso: Use container_of() to avoid a pointer in 'struct dso_data'Ian Rogers2024-05-063-32/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The dso pointer in 'struct dso_data' is necessary for reference count checking to account for the dso_data forming a global list of open dso's with references to the dso. The dso pointer also allows for the indirection that reference count checking needs. Outside of reference count checking the indirection isn't needed and container_of() is more efficient and saves space. The reference count won't be increased by placing items onto the global list, matching how things were before the reference count checking change, but we assert the dso is in dsos holding it live (and that the set of open dsos is a subset of all dsos for the machine). Update the DSO data tests so that they use a dsos struct to make the invariant true. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20240506180104.485674-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf symbol-elf: dso__load_sym_internal() reference count fixesIan Rogers2024-05-061-26/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dso__load_sym_internal() passed curr_mapp as an out argument to dso__process_kernel_symbol(). The out argument was never used so remove it to simplify the reference counting logic. Simplify reference counting issues with curr_dso by ensuring the value it points to has a +1 reference count, and then putting as necessary. This avoids some reference counting games when the dso is created making the code more obviously correct with some possible introduced overhead due to the reference counting get/puts. This, however, silences reference count checking and we can always optimize from a seemingly correct point. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20240506180104.485674-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf symbol-elf: Ensure dso__put() in machine__process_ksymbol_register()Ian Rogers2024-05-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | The dso__put() after the map creation causes a use after put in dso__set_loaded(). To ensure there is a +1 reference count on both sides of the if-else, do a dso__get() on the found map's dso. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20240506180104.485674-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf map: Add missing dso__put() in map__new()Ian Rogers2024-05-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | A dso__put() is needed for the dsos__find() when the map is created and a buildid is sought. Fixes: f649ed80f3cabbf1 ("perf dsos: Tidy reference counting and locking") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20240506180104.485674-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dso: Add reference count checking and accessor functionsIan Rogers2024-05-0657-739/+1169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso. The majority of the change is mechanical in nature and not easy to split up. Committer testing: 'perf test' up to this patch shows no regressions. But: util/symbol.c: In function ‘dso__load_bfd_symbols’: util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’ 1683 | dso__set_adjust_symbols(dso); | ^~~~~~~~~~~~~~~~~~~~~~~ In file included from util/symbol.c:21: util/dso.h:268:20: note: declared here 268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val) | ^~~~~~~~~~~~~~~~~~~~~~~ make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1 MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/ make[6]: *** Waiting for unfinished jobs.... This was updated: - symbols__fixup_end(&dso->symbols, false); - symbols__fixup_duplicate(&dso->symbols); - dso->adjust_symbols = 1; + symbols__fixup_end(dso__symbols(dso), false); + symbols__fixup_duplicate(dso__symbols(dso)); + dso__set_adjust_symbols(dso); But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora). Add the missing argument: symbols__fixup_end(dso__symbols(dso), false); symbols__fixup_duplicate(dso__symbols(dso)); - dso__set_adjust_symbols(dso); + dso__set_adjust_symbols(dso, true); Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dsos: Switch hand crafted code to bsearch()Ian Rogers2024-05-061-19/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to using the bsearch library function rather than having a hand written binary search. Const-ify some static functions to avoid compiler warnings. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dsos: Remove __dsos__findnew_link_by_longname_id()Ian Rogers2024-05-062-47/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function was only called in dsos.c with the dso parameter as NULL. Remove the function and specialize for the dso being NULL case removing other unused functions along the way. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dsos: Remove __dsos__addnew()Ian Rogers2024-05-062-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function no longer used so remove. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf dsos: Switch backing storage to array from rbtree/listIan Rogers2024-05-064-109/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DSOs were held on a list for fast iteration and in an rbtree for fast finds. Switch to using a lazily sorted array where iteration is just iterating through the array and binary searches are the same complexity as searching the rbtree. The find may need to sort the array first which does increase the complexity, but add operations have lower complexity and overall the complexity should remain about the same. The set name operations on the dso just records that the array is no longer sorted, avoiding complexity in rebalancing the rbtree. Tighter locking discipline is enforced to avoid the array being resorted while long and short names or ids are changed. The array is smaller in size, replacing 6 pointers with 2, and so even with extra allocated space in the array, the array may be 50% unoccupied, the memory saving should be at least 2x. Committer testing: On a previous version of this patchset we were getting a lot of warnings about deleting a DSO still on a list, now it is ok: root@x1:~# perf probe -l root@x1:~# perf probe finish_task_switch Added new event: probe:finish_task_switch (on finish_task_switch) You can now use it in all perf tools, such as: perf record -e probe:finish_task_switch -aR sleep 1 root@x1:~# perf probe -l probe:finish_task_switch (on finish_task_switch@kernel/sched/core.c) root@x1:~# perf trace -e probe:finish_task_switch/max-stack=8/ --max-events=1 0.000 migration/0/19 probe:finish_task_switch(__probe_ip: -1894408688) finish_task_switch.isra.0 ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) smpboot_thread_fn ([kernel.kallsyms]) kthread ([kernel.kallsyms]) ret_from_fork ([kernel.kallsyms]) ret_from_fork_asm ([kernel.kallsyms]) root@x1:~# root@x1:~# perf probe -d probe:* Removed event: probe:finish_task_switch root@x1:~# perf probe -l root@x1:~# I also ran the full 'perf test' suite after applying this one, no regressions. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf vendor events amd: Add Zen 5 mappingSandipan Das2024-05-041-0/+1
| | | | | | | | | | | | | | | | | | | | Add a regular expression in the map file so that appropriate JSON event files are used for AMD Zen 5 processors belonging to Family 1Ah. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/862a6b683755601725f9081897a850127d085ace.1714717230.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf vendor events amd: Add Zen 5 metricsSandipan Das2024-05-042-0/+444
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add metrics taken from Section 1.2 "Performance Measurement" of the Performance Monitor Counters for AMD Family 1Ah Model 00h-0Fh Processors document available at the link below. The recommended metrics are sourced from Table 1 "Guidance for Common Performance Statistics with Complex Event Selects". The pipeline utilization metrics are sourced from Table 2 "Guidance for Pipeline Utilization Analysis Statistics". These are useful for finding performance bottlenecks by analyzing activity at different stages of the pipeline. There are metric groups available for Level 1 and Level 2 analysis. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://bugzilla.kernel.org/attachment.cgi?id=305974 Link: https://lore.kernel.org/r/ee21ff77d89efa99997d3c2ebeeae22ddb6e7e12.1714717230.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf vendor events amd: Add Zen 5 uncore eventsSandipan Das2024-05-042-0/+278
| | | | | | | | | | | | | | | | | | | | | | | | | Add uncore events taken from Section 1.5 "L3 Cache Performance Monitor Counters" and Section 2 "UMC Performance Monitors" of the Performance Monitor Counters for AMD Family 1Ah Model 00h-0Fh Processors document available at the link below. This constitutes events which capture L3 cache and UMC command activity. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://bugzilla.kernel.org/attachment.cgi?id=305974 Link: https://lore.kernel.org/r/e11e8d9d1af34a0fb565fc9d1c4a05f569c39ddc.1714717230.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf vendor events amd: Add Zen 5 core eventsSandipan Das2024-05-047-0/+1983
| | | | | | | | | | | | | | | | | | | | | | | | | | Add core events taken from Section 1.4 "Core Performance Monitor Counters" of the Performance Monitor Counters for AMD Family 1Ah Model 00h-0Fh Processors document available at the link below. This constitutes events which capture information on op dispatch, execution and retirement, branch prediction, L1 and L2 cache activity, TLB activity, etc. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://bugzilla.kernel.org/attachment.cgi?id=305974 Link: https://lore.kernel.org/r/668d194241bf0d42dc37f1c5af8131069a0bd82c.1714717230.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Disable syscall augmentation with recordIan Rogers2024-05-041-0/+5
| | | | | | | | | | | | | | | | | | | Syscall augmentation is causing samples not to be written to the perf.data file with "perf trace record". Disabling augmentation is sub-optimal, but it beats having a totally broken perf trace record. Closes: https://lore.kernel.org/lkml/CAP-5=fV9Gd1Teak+EOcUSxe13KqSyfZyPNagK97GbLiOQRgGaw@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240216172357.65037-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Assume sysfs events are always the same caseIan Rogers2024-05-031-5/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf event names aren't case sensitive. For sysfs events the entire directory of events is read then iterated comparing names in a case insensitive way, most often to see if an event is present. Consider: $ perf stat -e inst_retired.any true The event inst_retired.any may be present in any PMU, so every PMU's sysfs events are loaded and then searched with strcasecmp to see if any match. This event is only present on the cpu PMU as a JSON event so a lot of events were loaded from sysfs unnecessarily just to prove an event didn't exist there. This change avoids loading all the events by assuming sysfs event names are always either lower or uppercase. It uses file exists and only loads the events when the desired event is present. For the example above, the number of openat calls measured by 'perf trace' on a tigerlake laptop goes from 325 down to 255. The reduction will be larger for machines with many PMUs, particularly replicated uncore PMUs. Ensure pmu_aliases_parse() is called before all uses of the aliases list, but remove some "pmu->sysfs_aliases_loaded" tests as they are now part of the function. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu: Test all sysfs PMU event names are the same caseIan Rogers2024-05-031-0/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Being either lower or upper case means event name probes can avoid scanning the directory doing case insensitive comparisons, just the lower or upper case version of the name can be checked for existence. For the majority of PMUs event names are all lower case, upper case names are present on S390. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu: Add an eagerly loaded event testIan Rogers2024-05-032-21/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow events/aliases to be eagerly loaded for a PMU. Factor out the pmu_aliases_parse to allow this. Parse a test event and check it configures the attribute as expected. There is overlap with the parse-events tests, but this test is done with a PMU created in a temp directory and doesn't rely on PMUs in sysfs. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu: Refactor format test and exposed test APIsIan Rogers2024-05-037-179/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In tests/pmu.c, make a common utility that creates a PMU in a mkdtemp directory and uses regular PMU parsing logic to load that PMU. Formats must still be eagerly loaded as by default the PMU code assumes devices are going to be in sysfs. In util/pmu.[ch], hide perf_pmu__format_parse but add the eager argument to perf_pmu__lookup called by perf_pmus__add_test_pmu. Later patches will eagerly load other non-sysfs files when eager loading is enabled. In tests/pmu.c, rather than manually constructing a list of term arguments, just use the term parsing code from a string. Add more comments and debug logging. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu-events: Make it clearer that pmu-events tests JSON eventsIan Rogers2024-05-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | Add JSON to the test name. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf maps: Remove check_invariants() from maps__lock()Namhyung Kim2024-05-021-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | I found that the debug build was a slowed down a lot by the maps lock code since it checks the invariants whenever it gets the pointer to the lock. This means it checks twice the invariants before and after the access. Instead, let's move the checking code within the lock area but after any modification and remove it from the read paths. This would remove (more than) half of the maps lock overhead. The time for perf report with a huge data file (200k+ of MMAP2 events). Non-debug Before After --------- -------- -------- 2m 43s 6m 45s 4m 21s Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240429225738.1491791-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf cs-etm: Improve version detection and error reportingJames Clark2024-05-021-18/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the config validation functions are warning about ETMv3, they do it based on "not ETMv4". If the drivers aren't all loaded or the hardware doesn't support Coresight it will appear as "not ETMv4" and then Perf will print the error message "... not supported in ETMv3 ..." which is wrong and confusing. cs_etm_is_etmv4() is also misnamed because it also returns true for ETE because ETE has a superset of the ETMv4 metadata files. Although this was always done in the correct order so it wasn't a bug. Improve all this by making a single get version function which also handles not present as a separate case. Change the ETMv3 error message to only print when ETMv3 is detected, and add a new error message for the not present case. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Leo Yan <leo.yan@linux.dev> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240501135753.508022-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf cs-etm: Remove repeated fetches of the ETM PMUJames Clark2024-05-021-33/+27
| | | | | | | | | | | | | | | | | | | | | | | Most functions already have cs_etm_pmu, so it's a bit neater to pass it through rather than itr only to convert it again. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240501135753.508022-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf cs-etm: Use struct perf_cpu as much as possibleJames Clark2024-05-021-116/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf_cpu struct makes some iterators simpler and avoids some mistakes with interchanging CPU IDs with indexes etc. At the moment in this file the conversion to an integer is done somewhere in the middle of the call tree. Change it to delay the conversion to an int until the leaf functions. Some of the usage patterns are duplicated, so instead of changing them all, make cs_etm_get_ro() more reusable and use that everywhere. cs_etm_get_ro() didn't return an error before, but return one now so that it can also be used where an error is needed. Continue to ignore the error where it was already ignored. Use cs_etm_pmu_path_exists() instead of cs_etm_get_ro() in cs_etm_is_etmv4() because cs_etm_get_ro() prints a warning, but path exists is sufficient for this use case. Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240501135753.508022-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>