summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/pmu.c
Commit message (Collapse)AuthorAgeFilesLines
* perf: pmus: Remove unneeded semicolonYang Li2024-06-281-1/+1
| | | | | | | | | | | ./tools/perf/util/pmu.c:1776:49-50: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9443 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20240628053049.44521-1-yang.lee@linux.alibaba.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* perf pmu: Don't de-duplicate core PMUsJames Clark2024-06-271-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arm PMUs have a suffix, either a single decimal (armv8_pmuv3_0) or 3 hex digits which (armv8_cortex_a53) which Perf assumes are both strippable suffixes for the purposes of deduplication. S390 "cpum_cf" is a similarly suffixed core PMU but is only two characters so is not treated as strippable because the rules are a minimum of 3 hex characters or 1 decimal character. There are two paths involved in listing PMU events: * HW/cache event printing assumes core PMUs don't have suffixes so doesn't try to strip. * Sysfs PMU events share the printing function with uncore PMUs which strips. This results in slightly inconsistent Perf list behavior if a core PMU has a suffix: # perf list ... armv8_pmuv3_0/branch-load-misses/ armv8_pmuv3/l3d_cache_wb/ [Kernel PMU event] ... Fix it by partially reverting back to the old list behavior where stripping was only done for uncore PMUs. For example commit 8d9f5146f5da ("perf pmus: Sort pmus by name then suffix") mentions that only PMUs starting 'uncore_' are considered to have a potential suffix. This change doesn't go back that far, but does only strip PMUs that are !is_core. This keeps the desirable behavior where the many possibly duplicated uncore PMUs aren't repeated, but it doesn't break listing for core PMUs. Searching for a PMU continues to use the new stripped comparison functions, meaning that it's still possible to request an event by specifying the common part of a PMU name, or even open events on multiple similarly named PMUs. For example: # perf stat -e armv8_cortex/inst_retired/ 5777173628 armv8_cortex_a53/inst_retired/ (99.93%) 7469626951 armv8_cortex_a57/inst_retired/ (49.88%) Fixes: 3241d46f5f54 ("perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu") Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: robin.murphy@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240626145448.896746-3-james.clark@arm.com
* perf pmu: Restore full PMU name wildcard supportJames Clark2024-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b2b9d3a3f021 ("perf pmu: Support wildcards on pmu name in dynamic pmu events") gives the following example for wildcarding a subset of PMUs: E.g., in a system with the following dynamic pmus: mypmu_0 mypmu_1 mypmu_2 mypmu_4 perf stat -e mypmu_[01]/<config>/ Since commit f91fa2ae6360 ("perf pmu: Refactor perf_pmu__match()"), only "*" has been supported, removing the ability to subset PMUs, even though parse-events.l still supports ? and [] characters. Fix it by using fnmatch() when any glob character is detected and add a test which covers that and other scenarios of perf_pmu__match_ignoring_suffix(). Fixes: f91fa2ae6360 ("perf pmu: Refactor perf_pmu__match()") Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: robin.murphy@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240626145448.896746-2-james.clark@arm.com
* perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmuIan Rogers2024-05-281-13/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mrvl_ddr_pmu is uncore and has a hexadecimal address suffix while the previous PMU sorting/merging code assumes uncore PMU names start with uncore_ and have a decimal suffix. Because of the previous assumption it isn't possible to wildcard the mrvl_ddr_pmu. Modify pmu_name_len_no_suffix but also remove the suffix number out argument, this is because we don't know if a suffix number of say 100 is in hexadecimal or decimal. As the only use of the suffix number is in comparisons, it is safe there to compare the values as hexadecimal. Modify perf_pmu__match_ignoring_suffix so that hexadecimal suffixes are ignored. Only allow hexadecimal suffixes to be greater than length 2 (ie 3 or more) so that S390's cpum_cf PMU doesn't lose its suffix. Change the return type of pmu_name_len_no_suffix to size_t to workaround GCC incorrectly determining the result could be negative. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Bharat Bhushan <bbhushan2@marvell.com> Cc: Bhaskara Budiredla <bbudiredla@marvell.com> Cc: Tuan Phan <tuanphan@os.amperecomputing.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240515060114.3268149-2-irogers@google.com
* perf pmu: Count sys and cpuid JSON events separatelyIan Rogers2024-05-111-21/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | Sys events are eagerly loaded as each event has a compat option that may mean the event is or isn't associated with the PMU. These shouldn't be counted as loaded_json_events as that is used for JSON events matching the CPUID that may or may not have been loaded. The mismatch causes issues on ARM64 that uses sys events. Fixes: e6ff1eed3584362d ("perf pmu: Lazily add JSON events") Closes: https://lore.kernel.org/lkml/20240510024729.1075732-1-justin.he@arm.com/ Reported-by: Jia He <justin.he@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240511003601.2666907-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Assume sysfs events are always the same caseIan Rogers2024-05-031-5/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf event names aren't case sensitive. For sysfs events the entire directory of events is read then iterated comparing names in a case insensitive way, most often to see if an event is present. Consider: $ perf stat -e inst_retired.any true The event inst_retired.any may be present in any PMU, so every PMU's sysfs events are loaded and then searched with strcasecmp to see if any match. This event is only present on the cpu PMU as a JSON event so a lot of events were loaded from sysfs unnecessarily just to prove an event didn't exist there. This change avoids loading all the events by assuming sysfs event names are always either lower or uppercase. It uses file exists and only loads the events when the desired event is present. For the example above, the number of openat calls measured by 'perf trace' on a tigerlake laptop goes from 325 down to 255. The reduction will be larger for machines with many PMUs, particularly replicated uncore PMUs. Ensure pmu_aliases_parse() is called before all uses of the aliases list, but remove some "pmu->sysfs_aliases_loaded" tests as they are now part of the function. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu: Add an eagerly loaded event testIan Rogers2024-05-031-21/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow events/aliases to be eagerly loaded for a PMU. Factor out the pmu_aliases_parse to allow this. Parse a test event and check it configures the attribute as expected. There is overlap with the parse-events tests, but this test is done with a PMU created in a temp directory and doesn't rely on PMUs in sysfs. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf test pmu: Refactor format test and exposed test APIsIan Rogers2024-05-031-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In tests/pmu.c, make a common utility that creates a PMU in a mkdtemp directory and uses regular PMU parsing logic to load that PMU. Formats must still be eagerly loaded as by default the PMU code assumes devices are going to be in sysfs. In util/pmu.[ch], hide perf_pmu__format_parse but add the eager argument to perf_pmu__lookup called by perf_pmus__add_test_pmu. Later patches will eagerly load other non-sysfs files when eager loading is enabled. In tests/pmu.c, rather than manually constructing a list of term arguments, just use the term parsing code from a string. Add more comments and debug logging. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240502213507.2339733-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Refactor perf_pmu__match()Ian Rogers2024-04-261-8/+19
| | | | | | | | | | | | | | | | | | | | | Move all implementation to pmu code. Don't allocate a fnmatch wildcard pattern, matching ignoring the suffix already handles this, and only use fnmatch if the given PMU name has a '*' in it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add/use PMU reverse lookup from config to nameIan Rogers2024-03-211-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add perf_pmu__name_from_config that does a reverse lookup from a config number to an alias name. The lookup is expensive as the config is computed for every alias by filling in a perf_event_attr, but this is only done when verbose output is enabled. The lookup also only considers config, and not config1, config2 or config3. An example of the output: $ perf stat -vv -e data_read true ... perf_event_attr: type 24 (uncore_imc_free_running_0) size 136 config 0x20ff (data_read) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ... Committer notes: Fix the python binding build by adding dummies for not strictly needed perf_pmu__name_from_config() and perf_pmus__find_by_type(). Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20240308001915.4060155-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf list: Give more details about raw event encodingsIan Rogers2024-03-211-1/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | List all the PMUs, not just the first core one, and list real format specifiers with value ranges. Before: $ perf list ... rNNN [Raw hardware event descriptor] cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor] [(see 'man perf-list' on how to encode it)] mem:<addr>[/len][:access] [Hardware breakpoint] ... After: $ perf list ... rNNN [Raw event descriptor] cpu/event=0..255,pc,edge,.../modifier [Raw event descriptor] [(see 'man perf-list' or 'man perf-record' on how to encode it)] breakpoint//modifier [Raw event descriptor] cstate_core/event=0..0xffffffffffffffff/modifier [Raw event descriptor] cstate_pkg/event=0..0xffffffffffffffff/modifier [Raw event descriptor] i915/i915_eventid=0..0x1fffff/modifier [Raw event descriptor] intel_bts//modifier [Raw event descriptor] intel_pt/ptw,event,cyc_thresh=0..15,.../modifier [Raw event descriptor] kprobe/retprobe/modifier [Raw event descriptor] msr/event=0..0xffffffffffffffff/modifier [Raw event descriptor] power/event=0..255/modifier [Raw event descriptor] software//modifier [Raw event descriptor] tracepoint//modifier [Raw event descriptor] uncore_arb/event=0..255,edge,inv,.../modifier [Raw event descriptor] uncore_cbox/event=0..255,edge,inv,.../modifier [Raw event descriptor] uncore_clock/event=0..255/modifier [Raw event descriptor] uncore_imc_free_running/event=0..255,umask=0..255/modifier[Raw event descriptor] uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier[Raw event descriptor] mem:<addr>[/len][:access] [Hardware breakpoint] ... With '--details' provide more details on the formats encoding: cpu/event=0..255,pc,edge,.../modifier [Raw event descriptor] [(see 'man perf-list' or 'man perf-record' on how to encode it)] cpu/event=0..255,pc,edge,offcore_rsp=0..0xffffffffffffffff,ldlat=0..0xffff,inv, umask=0..255,frontend=0..0xffffff,cmask=0..255,config=0..0xffffffffffffffff, config1=0..0xffffffffffffffff,config2=0..0xffffffffffffffff,config3=0..0xffffffffffffffff, name=string,period=number,freq=number,branch_type=(u|k|hv|any|...),time, call-graph=(fp|dwarf|lbr),stack-size=number,max-stack=number,nr=number,inherit,no-inherit, overwrite,no-overwrite,percore,aux-output,aux-sample-size=number/modifier breakpoint//modifier [Raw event descriptor] breakpoint//modifier cstate_core/event=0..0xffffffffffffffff/modifier [Raw event descriptor] cstate_core/event=0..0xffffffffffffffff/modifier cstate_pkg/event=0..0xffffffffffffffff/modifier [Raw event descriptor] cstate_pkg/event=0..0xffffffffffffffff/modifier i915/i915_eventid=0..0x1fffff/modifier [Raw event descriptor] i915/i915_eventid=0..0x1fffff/modifier intel_bts//modifier [Raw event descriptor] intel_bts//modifier intel_pt/ptw,event,cyc_thresh=0..15,.../modifier [Raw event descriptor] intel_pt/ptw,event,cyc_thresh=0..15,pt,notnt,branch,tsc,pwr_evt,fup_on_ptw,cyc,noretcomp, mtc,psb_period=0..15,mtc_period=0..15/modifier kprobe/retprobe/modifier [Raw event descriptor] kprobe/retprobe/modifier msr/event=0..0xffffffffffffffff/modifier [Raw event descriptor] msr/event=0..0xffffffffffffffff/modifier power/event=0..255/modifier [Raw event descriptor] power/event=0..255/modifier software//modifier [Raw event descriptor] software//modifier tracepoint//modifier [Raw event descriptor] tracepoint//modifier uncore_arb/event=0..255,edge,inv,.../modifier [Raw event descriptor] uncore_arb/event=0..255,edge,inv,umask=0..255,cmask=0..31/modifier uncore_cbox/event=0..255,edge,inv,.../modifier [Raw event descriptor] uncore_cbox/event=0..255,edge,inv,umask=0..255,cmask=0..31/modifier uncore_clock/event=0..255/modifier [Raw event descriptor] uncore_clock/event=0..255/modifier uncore_imc_free_running/event=0..255,umask=0..255/modifier[Raw event descriptor] uncore_imc_free_running/event=0..255,umask=0..255/modifier uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier[Raw event descriptor] uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier Committer notes: Address this build error in various distros: 55 58.44 ubuntu:24.04 : FAIL gcc version 13.2.0 (Ubuntu 13.2.0-17ubuntu2) util/pmu.c:1638:70: error: '_Static_assert' with no message is a C2x extension [-Werror,-Wc2x-extensions] 1638 | _Static_assert(ARRAY_SIZE(terms) == __PARSE_EVENTS__TERM_TYPE_NR - 6); | ^ | , "" 1 error generated. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20240308001915.4060155-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Drop "default_core" from alias namesIan Rogers2024-03-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "default_core" is used by jevents.py for json events' PMU name when none is specified. On x86 the "default_core" is typically the PMU "cpu". When creating an alias see if the event's PMU name is "default_core" in which case don't record it. This means in places like "perf list" the PMU's name will be used in its place. Before: $ perf list --details ... cache: l1d.replacement [Counts the number of cache lines replaced in L1 data cache] default_core/event=0x51,period=0x186a3,umask=0x1/ ... After: $ perf list --details ... cache: l1d.replacement [Counts the number of cache lines replaced in L1 data cache. Unit: cpu] cpu/event=0x51,period=0x186a3,umask=0x1/ ... Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20240308001915.4060155-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Fix a potential memory leak in perf_pmu__lookup()Christophe JAILLET2024-02-261-4/+3
| | | | | | | | | | | | | | The commit in Fixes has reordered some code, but missed an error handling path. 'goto err' now, in order to avoid a memory leak in case of error. Fixes: f63a536f03a2 ("perf pmu: Merge JSON events with sysfs at load time") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ian Rogers <irogers@google.com> Cc: kernel-janitors@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/9538b2b634894c33168dfe9d848d4df31fd4d801.1693085544.git.christophe.jaillet@wanadoo.fr
* perf parse-events: Improve error location of terms cloned from an eventIan Rogers2024-02-021-4/+5
| | | | | | | | | | | | | | | | | | | | | | A PMU event/alias will have a set of format terms that replace it when an event is parsed. The location of the terms is their position when parsed for the event/alias either from sysfs or json. This location is of little use when an event fails to parse as the error will be given in terms of the location in the string of events parsed not the json or sysfs string. Fix this by making the cloned terms location that of the event/alias. If a cloned term from an event/alias is invalid the bad format is hard to determine from the error string. Add the name of the bad format into the error string. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: tchen168@asu.edu Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240131134940.593788-2-irogers@google.com
* perf pmu: Treat the msr pmu as softwareIan Rogers2024-01-251-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The msr PMU is a software one, meaning msr events may be grouped with events in a hardware context. As the msr PMU isn't marked as a software PMU by perf_pmu__is_software, groups with the msr PMU in are broken and the msr events placed in a different group. This may lead to multiplexing errors where a hardware event isn't counted while the msr event, such as tsc, is. Fix all of this by marking the msr PMU as software, which agrees with the driver. Before: ``` $ perf stat -e '{slots,tsc}' -a true WARNING: events were regrouped to match PMUs Performance counter stats for 'system wide': 1,750,335 slots 4,243,557 tsc 0.001456717 seconds time elapsed ``` After: ``` $ perf stat -e '{slots,tsc}' -a true Performance counter stats for 'system wide': 12,526,380 slots 3,415,163 tsc 0.001488360 seconds time elapsed ``` Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20240124234200.1510417-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* perf mem: Add mem_events into the supported perf_pmuKan Liang2024-01-241-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | With the mem_events, perf doesn't need to read sysfs for each PMU to find the mem-events-supported PMU. The patch also makes it possible to clean up the related __weak functions later. The patch is only to add the mem_events into the perf_pmu for all ARCHs. It will be used in the later cleanup patches. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Tested-by: Leo Yan <leo.yan@linaro.org> Tested-by: Kajol Jain <kjain@linux.ibm.com> Suggested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: will@kernel.org Cc: mike.leach@linaro.org Cc: renyu.zj@linux.alibaba.com Cc: yuhaixin.yhx@linux.alibaba.com Cc: tmricht@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Cc: linux-arm-kernel@lists.infradead.org Cc: john.g.garry@oracle.com Link: https://lore.kernel.org/r/20240123185036.3461837-2-kan.liang@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* perf parse-events: Make legacy events lower priority than sysfs/JSONIan Rogers2023-11-271-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf tool has previously made legacy events the priority so with or without a PMU the legacy event would be opened: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 833967 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs capable of opening legacy events on each, removing hard coded "cpu_core" and "cpu_atom" Intel PMU names, etc. caused a behavioral difference on Apple/ARM due to latent issues in the PMU driver reported in: https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/ As part of that report Mark Rutland <mark.rutland@arm.com> requested that legacy events not be higher in priority when a PMU is specified reversing what has until this change been perf's default behavior. With this change the above becomes: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 Attempt to add: cpu/cpu-cycles=0/ ..after resolving event: cpu/event=0x3c/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 827628 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 4 (PERF_TYPE_RAW) size 136 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... So the second event has become a raw event as /sys/devices/cpu/events/cpu-cycles exists. A fix was necessary to config_term_pmu in parse-events.c as check_alias expansion needs to happen after config_term_pmu, and config_term_pmu may need calling a second time because of this. config_term_pmu is updated to not use the legacy event when the PMU has such a named event (either from JSON or sysfs). The bulk of this change is updating all of the parse-events test expectations so that if a sysfs/JSON event exists for a PMU the test doesn't fail - a further sign, if it were needed, that the legacy event priority was a known and tested behavior of the perf tool. Reported-by: Hector Martin <marcan@marcan.st> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Hector Martin <marcan@marcan.st> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com [ Initialize the 'alias_rewrote_terms' variable to false to address a clang warning ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-nextNamhyung Kim2023-10-301-4/+4
|\ | | | | | | | | | | | | To get the latest fixes in the perf tools including perf stat output, dlfilter and LLVM feature detection. Signed-off-by: Namhyung Kim <namhyung@kernel.org>
| * perf pmu: Fix perf stat output with correct scale and unitWyes Karny2023-09-261-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf_pmu__parse_* functions for the sysfs files of pmu event’s scale, unit, per-pkg and snapshot were updated in commit 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs"). However, the paths for these sysfs files were incorrect. This resulted in perf stat reporting values with wrong scaling and missing units. This is fixed by correcting the paths for these sysfs files. Before this fix: $sudo perf stat -e power/energy-pkg/ -- sleep 2 Performance counter stats for 'system wide': 351,217,188,864 power/energy-pkg/ 2.004127961 seconds time elapsed After this fix: $sudo perf stat -e power/energy-pkg/ -- sleep 2 Performance counter stats for 'system wide': 80.58 Joules power/energy-pkg/ 2.004009749 seconds time elapsed Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Wyes Karny <wyes.karny@amd.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: ravi.bangoria@amd.com Cc: sandipan.das@amd.com Cc: james.clark@arm.com Cc: kan.liang@linux.intel.com Link: https://lore.kernel.org/r/20230920122349.418673-1-wyes.karny@amd.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Lazily compute default configIan Rogers2023-10-171-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change default_config to perf_event_attr_init_default, a callback that is used when a default config needs initializing. This avoids holding onto the memory for a perf_event_attr and copying. On a tigerlake laptop running the pmu-scan benchmark: Before: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 28.780 usec (+- 0.503 usec) Average PMU scanning took: 283.480 usec (+- 18.471 usec) Number of openat syscalls: 30,227 After: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 27.880 usec (+- 0.169 usec) Average PMU scanning took: 245.260 usec (+- 15.758 usec) Number of openat syscalls: 28,914 Over 3 runs it is a nearly 12% reduction in execution time and a 4.3% of openat calls. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Const-ify perf_pmu__config_termsIan Rogers2023-10-171-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add const to related APIs, this is so they can be used to default initialize a perf_event_attr from a const pmu. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Const-ify file APIsIan Rogers2023-10-171-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | File APIs don't alter the struct pmu so allow const ones to be passed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_initIan Rogers2023-10-171-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Assign default_config as part of the init. perf_pmu__get_default_config was doing more than just getting the default config and so this is intended to better align with the code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | Merge tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' into perf-tools-nextArnaldo Carvalho de Melo2023-10-101-1/+1
|\| | | | | | | | | | | | | | | | | | | To pick up the 'perf bench sched-seccomp-notify' changes to allow us to continue build testing perf-tools-next with the set of distro containers, where some older ones don't have a recent enough seccomp.h UAPI header that contains defines needed by this new 'perf bench' workload. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf pmu: Ensure all alias variables are initializedIan Rogers2023-09-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix an error detected by memory sanitizer: ``` ==4033==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6 #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2 #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9 #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32 #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6 #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8 #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8 #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9 #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8 #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15 #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9 #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8 #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5 #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9 #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11 #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8 #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2 #17 0x55fb0f941dab in main tools/perf/perf.c:535:3 ``` Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230914022425.1489035-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmus: Make PMU alias name loading lazyIan Rogers2023-09-291-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PMU alias names were computed when the first perf_pmu is created, scanning all PMUs in event sources for a file called alias that generally doesn't exist. Switch to trying to load the file when all PMU related files are loaded in lookup. This would cause a PMU name lookup of an alias name to fail if no PMUs were loaded, so in that case all PMUs are loaded and the find repeated. The overhead is similar but in the (very) general case not all PMUs are scanned for the alias file. As the overhead occurs once per invocation it doesn't show in perf bench internals pmu-scan. On a tigerlake machine, the number of openat system calls for an event of cpu/cycles/ with perf stat reduces from 94 to 69 (ie 25 fewer openat calls). Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230925062323.840799-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: "Compat" supports regular expression matching identifiersJing Zhang2023-09-271-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The jevent "Compat" is used for uncore PMU alias or metric definitions. The same PMU driver has different PMU identifiers due to different hardware versions and types, but they may have some common PMU event. Since a Compat value can only match one identifier, when adding the same event alias to PMUs with different identifiers, each identifier needs to be defined once, which is not streamlined enough. So let "Compat" support using regular expression to match identifiers for uncore PMU alias. For example, if the "Compat" value is set to "43401|43c01", it would be able to match PMU identifiers such as "43401" or "43c01", which correspond to CMN600_r0p0 or CMN700_r0p0. Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Zhuo Song <zhuo.song@linux.alibaba.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/1695794391-34817-2-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Remove unused functionJames Clark2023-09-151-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pmu_events_table__find() is no longer used so remove it and its Arm specific version. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230913153355.138331-4-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf pmu: Move pmu__find_core_pmu() to pmus.cJames Clark2023-09-151-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file. list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated. Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230913153355.138331-2-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
* | perf parse-events: Introduce 'struct parse_events_terms'Ian Rogers2023-09-111-24/+25
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | parse_events_terms() existed in function names but was passed a 'struct list_head'. As many parse_events functions take an evsel_config list as well as a parse_event_term list, and the naming head_terms and head_config is inconsistent, there's a potential to switch the lists and get errors. Introduce a 'struct parse_events_terms', that just wraps a list_head, to avoid this. Add the regular init/exit functions and transition the code to use them. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230901233949.2930562-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf parse-events: Fix propagation of term's no_value when cloningIan Rogers2023-08-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The no_value field in 'struct parse_events_term' indicates that the val variable isn't used, the case for an event name. Cloning wasn't propagating this, making cloned event name terms appearing to have a constant assinged to them. Working around the bug would check for a value of 1 assigned to value, but then this meant a user value of 1 couldn't be differentiated causing the value to be lost in debug printing and perf list. The change fixes the cloning and updates the "val.num ==/!= 1" tests to use no_value instead. To better check the no_value is set appropriately parameter comments are added for constant values. This found that no_value wasn't set correctly in parse_events_multi_pmu_add, which matters now that no_value is used to indicate an event name. Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper") Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230831071421.2201358-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Remove str from perf_pmu_aliasIan Rogers2023-08-301-23/+10
| | | | | | | | | | | | | | | | | | | | Currently the value is only used in perf list. Compute the value just when needed to avoid unnecessary overhead. Recycle the strbuf to avoid memory allocation overhead. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230830070753.1821629-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf parse-events: Make common term list to strbuf helperIan Rogers2023-08-301-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A term list is turned into a string for debug output and for the str value in the alias. Add a helper to do this based on existing code, but then fix for situations like events being identified. Use strbuf to manage the dynamic memory allocation and remove the 256 byte limit. Use in various places the string of the term list is required. Before: $ sudo perf stat -vv -e inst_retired.any true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Attempting to add event pmu 'cpu' with 'inst_retired.any,' that may result in non-fatal errors After aliases, add event pmu 'cpu' with 'event,period,' that may result in non-fatal errors inst_retired.any -> cpu/inst_retired.any/ ... After: $ sudo perf stat -vv -e inst_retired.any true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Attempt to add: cpu/inst_retired.any/ ..after resolving event: cpu/event=0xc0,period=0x1e8483/ inst_retired.any -> cpu/event=0xc0,period=0x1e8483/ ... Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230830070753.1821629-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Avoid uninitialized use of alias->strIan Rogers2023-08-301-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alias is allocated with malloc allowing uninitialized memory to be accessed. The initialization of str was moved late after it could have been updated by a JSON event, however, this create a potential for an uninitialized use. Fix this by assigning str to NULL early. Testing on ARM (Raspberry Pi) showed a memory leak in the same code so add a zfree. Fixes: f63a536f03a2f64f ("perf pmu: Merge JSON events with sysfs at load time") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230830000545.1638964-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf jevents: Use "default_core" for events with no UnitIan Rogers2023-08-291-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The JSON Unit field encodes the name of the PMU to match the events to. When no name is given it has meant the "cpu" core PMU except for tests. On ARM, Intel hybrid and s390 the core PMU is named differently which means that using "cpu" for this case causes the events not to get matched to the PMU. Introduce a new "default_core" string for this case and in the pmu__name_match force all core PMUs to match this name. Fixes: 2e255b4f9f41f137 ("perf jevents: Group events by PMU") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230826062203.1058041-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmus: Skip duplicate PMUs and don't print list suffix by defaultIan Rogers2023-08-291-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a PMUs scan that ignores duplicates. When there are multiple PMUs that differ only by suffix, by default just list the first one and skip all others. The scan routine checks that the PMU names match but doesn't enforce that the numbers are consecutive as for some PMUs there are gaps. If "-v" is passed to "perf list" then list all PMUs. With the previous change duplicate PMUs are no longer printed but the suffix of the first is printed. When duplicate PMUs are being skipped avoid printing the suffix. Before: $ perf list ... uncore_imc_free_running_0/data_read/ [Kernel PMU event] uncore_imc_free_running_0/data_total/ [Kernel PMU event] uncore_imc_free_running_0/data_write/ [Kernel PMU event] uncore_imc_free_running_1/data_read/ [Kernel PMU event] uncore_imc_free_running_1/data_total/ [Kernel PMU event] uncore_imc_free_running_1/data_write/ [Kernel PMU event] After: $ perf list ... uncore_imc_free_running/data_read/ [Kernel PMU event] uncore_imc_free_running/data_total/ [Kernel PMU event] uncore_imc_free_running/data_write/ [Kernel PMU event] ... $ perf list -v uncore_imc_free_running_0/data_read/ [Kernel PMU event] uncore_imc_free_running_0/data_total/ [Kernel PMU event] uncore_imc_free_running_0/data_write/ [Kernel PMU event] uncore_imc_free_running_1/data_read/ [Kernel PMU event] uncore_imc_free_running_1/data_total/ [Kernel PMU event] uncore_imc_free_running_1/data_write/ [Kernel PMU event] ... Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20230825135237.921058-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Make id const and add missing freeIan Rogers2023-08-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct pmu id is initialized from pmu_id that is read into allocated memory from a file, as such it needs free-ing in pmu__delete(). Make the id value const so that we can remove casts in tests. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf parse-events: Make term's config constIan Rogers2023-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This avoids casts in tests. Use zfree in a few places to avoid warnings about a freeing a const pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230825024002.801955-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Remove logic for PMU name being NULLIan Rogers2023-08-251-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Wei Li <liwei391@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Lazily load sysfs aliasesIan Rogers2023-08-241-39/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Don't load sysfs aliases for a PMU when the PMU is first created, defer until an alias needs to be found. For the pmu-scan benchmark, average core PMU scanning is reduced by 30.8%, and average PMU scanning by 12.6%. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Be lazy about loading event info files from sysfsIan Rogers2023-08-241-45/+83
| | | | | | | | | | | | | | | | | | | | | | | | | Event info is only needed when an event is parsed or when merging data from an JSON and sysfs event. Be lazy in its loading to reduce file accesses. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Scan type early to fail an invalid PMU quicklyIan Rogers2023-08-241-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Scan sysfs PMU's type early so that format and aliases aren't attempted to be loaded if the PMU name is invalid. This is the case for event_pmu tokens in parse-events.y where a wildcard name is first assumed to be a PMU name. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Lazily add JSON eventsIan Rogers2023-08-241-12/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than scanning all JSON events and adding them when a PMU is created, add the alias when the JSON event is needed. Average core PMU scanning run time reduced by 60.2%. Average PMU scanning run time reduced by 15%. Page faults with no events reduced by 74 page faults, 4% of total. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Cache JSON events tableIan Rogers2023-08-241-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cache the JSON events table so that finding it isn't done per event/alias. Change the events table find so that when the PMU is given, if the PMU has no JSON events return null. Update usage to always use the PMU variable. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Merge JSON events with sysfs at load timeIan Rogers2023-08-241-89/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than load all sysfs events then parsing all JSON events and merging with ones that already exist. When a sysfs event is loaded, look for a corresponding JSON event and merge immediately. To simplify the logic, early exit the perf_pmu__new_alias function if an alias is attempted to be added twice - as merging has already been explicitly handled. Fix the copying of terms to a merged alias and some ENOMEM paths. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Prefer passing pmu to aliases listIan Rogers2023-08-241-28/+16
| | | | | | | | | | | | | | | | | | | | | | | | The aliases list is part of the PMU. Rather than pass the aliases list, pass the full PMU simplifying some callbacks. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu: Parse sysfs events directly from a fileIan Rogers2023-08-241-31/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than read a sysfs events file into a 256 byte char buffer, pass the FILE* directly to the lex/yacc parser. This avoids there being a maximum events file size. While changing the API, constify some arguments to remove unnecessary casts. Allocating the read buffer decreases the performance of pmu-scan by around 3%. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu-events: Reduce processed events by passing PMUIan Rogers2023-08-241-25/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the PMU to pmu_events_table__for_each_event so that entries that don't match don't need to be processed by callback. If a NULL PMU is passed then all PMUs are processed. 'perf bench internals pmu-scan's "Average PMU scanning" performance is reduced by about 5% on an Intel tigerlake. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf parse-events: Improve error message for double settingIan Rogers2023-08-241-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Double setting information for an event would produce an error message associated with the PMU rather than the term that was double setting. Improve the error message to be on the term. Before: $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true event syntax error: 'cpu/inst_retired.any,inst_retired.any/' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'cpu' Run 'perf list' for a list of valid events $ After: $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true event syntax error: '..etired.any,inst_retired.any/' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'cpu' Initial error: event syntax error: '..etired.any,inst_retired.any/' \___ Attempt to set event's scale twice Run 'perf list' for a list of valid events Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf pmu-events: Add extra underscore to function namesIan Rogers2023-08-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Add extra underscore before "for" of pmu_events_table_for_each_event and pmu_metrics_table_for_each_metric. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gaosheng Cui <cuigaosheng1@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>