summaryrefslogtreecommitdiffstats
path: root/tools/perf/util
Commit message (Collapse)AuthorAgeFilesLines
...
* perf kvm: Move arch specific code into arch/Alexander Yarygin2014-07-161-0/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parts of a 'perf kvm stat' code make sense only for x86. Let's move this code into the arch/x86/kvm-stat.c file and add util/kvm-stat.h for generic structure definitions. Add a global array 'kvm_reg_events_ops' for accessing the arch-specific 'kvm_events_ops' from generic code. Since the several global arrays (i.e. 'kvm_events_tp') have been moved to arch/*, we can not know their sizes and use them directly in builtin-kvm.c. This patch fixes that problem by adding trimming NULL element to each array and changing the behavior of their handlers in generic code. Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1404397747-20939-3-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar2014-07-162-39/+135
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core Pull perf/core improvements and fixes from Jiri Olsa: * Add IO mode into timechart command (Stanislav Fomichev) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf timechart: Implement IO modeStanislav Fomichev2014-07-102-4/+100
| | | | | | | | | | | | | | | | | | | | | | | | Currently, timechart records only scheduler and CPU events (task switches, running times, CPU power states, etc); this commit adds IO mode which makes it possible to record IO (disk, network) activity. In this mode perf timechart will generate SVG with IO charts (writes, reads, tx, rx, polls). Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/1404835423-23098-3-git-send-email-stfomichev@yandex-team.ru Signed-off-by: Jiri Olsa <jolsa@kernel.org>
| * perf timechart: Fix rendering in FirefoxStanislav Fomichev2014-07-101-36/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | Firefox doesn't correctly handle cases where we specify number in quotes and have some padding around the number, like the following: <rect ... height=" 3.1" ...> In this case, it doesn't draw the figure. This patch removes 'field width' component from fprintf strings to fix it. Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/1404835423-23098-2-git-send-email-stfomichev@yandex-team.ru Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Suggest using -f to override perf.data file ownership messageArnaldo Carvalho de Melo2014-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # id uid=0(root) gid=0(root) groups=0(root) # ls -la perf.data -rw-------. 1 acme acme 20720 Jul 8 11:35 perf.data Previously: # perf report file perf.data not owned by current user or root Now: # perf report File perf.data not owned by current user or root (use -f to override) Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0j2wuuegnhv3gljbil8ld6kx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Convert open coded equivalents to asprintf()Andy Shevchenko2014-07-072-16/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following snippet V = malloc(S); if (!V) { } sprintf(V, ...) Can be easily changed to a one line: if (asprintf(&V, ...) < 0) { } Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1404474229-15272-1-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists browser: Left justify column headersArnaldo Carvalho de Melo2014-07-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Looks better and avoids it moving to the end of the screen as the column width changes over time in 'perf top'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-yc144ai5jye3yl3h5yxw0scd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists browser: Add ui.show-headers config file optionJiri Olsa2014-07-073-1/+16
|/ | | | | | | | | | | | | | | | | | | | | | | Adding ui.show-headers config file option to define if the histogram entries headers will start visible or not. Currently columns headers are displayed by default, following lines in ~/.perfconfig file will disable that: [ui] show-headers = false Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1403886418-5556-4-git-send-email-jolsa@kernel.org [ renamed symbol_conf.show_headers to .show_hist_headers ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Handle the num array type in python properlySebastian Andrzej Siewior2014-06-271-12/+31
| | | | | | | | | | | | | | | | The raw_syscalls:sys_enter tracer for instance passes has one argument named 'arg' which is an array of 6 integers. Right the python scripts gets only 0 passed as an argument. The reason is that pevent_read_number() can not handle data types of 48 and returns always 0. This patch changes this by passing num array as list of nums which fit the description. As a result python will now see a list named arg which contains 6 (integer) items. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/1401207274-8170-2-git-send-email-bigeasy@linutronix.de Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf script: Move the number processing into its own functionSebastian Andrzej Siewior2014-06-271-15/+23
| | | | | | | | | | | | | I was going to change something here and the result was so much on the right side of the screen that I decided to move that piece into its own function. This patch should make no function change except the moving the code into its own function. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/1401207274-8170-1-git-send-email-bigeasy@linutronix.de Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools powerpc: Adjust callchain based on DWARF debug infoSukadev Bhattiprolu2014-06-272-2/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When saving the callchain on Power, the kernel conservatively saves excess entries in the callchain. A few of these entries are needed in some cases but not others. We should use the DWARF debug information to determine when the entries are needed. Eg: the value in the link register (LR) is needed only when it holds the return address of a function. At other times it must be ignored. If the unnecessary entries are not ignored, we end up with duplicate arcs in the call-graphs. Use the DWARF debug information to determine if any callchain entries should be ignored when building call-graphs. Callgraph before the patch: 14.67% 2234 sprintft libc-2.18.so [.] __random | --- __random | |--61.12%-- __random | | | |--97.15%-- rand | | do_my_sprintf | | main | | generic_start_main.isra.0 | | __libc_start_main | | 0x0 | | | --2.85%-- do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --38.88%-- rand | |--94.01%-- rand | do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --5.99%-- do_my_sprintf main generic_start_main.isra.0 __libc_start_main 0x0 Callgraph after the patch: 14.67% 2234 sprintft libc-2.18.so [.] __random | --- __random | |--95.93%-- rand | do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --4.07%-- do_my_sprintf main generic_start_main.isra.0 __libc_start_main 0x0 TODO: For split-debug info objects like glibc, we can only determine the call-frame-address only when both .eh_frame and .debug_info sections are available. We should be able to determin the CFA even without the .eh_frame section. Fix suggested by Anton Blanchard. Thanks to valuable input on DWARF debug information from Ulrich Weigand. Reported-by: Maynard Johnson <maynard@us.ibm.com> Tested-by: Maynard Johnson <maynard@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140625154903.GA29607@us.ibm.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar2014-06-251-2/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: * Add --repeat global option to 'perf bench' to be used in benchmarks such as the existing 'futex' one, that was modified to use it instead of a local option. (Davidlohr Bueso) * Fix fd -> pathname resolution in 'trace', be it using /proc or a vfs_getname probe point. (Arnaldo Carvalho de Melo) * Add suggestion of how to set perf_event_paranoid sysctl, to help non-root users trying tools like 'trace' to get a working environment. (Arnaldo Carvalho de Melo) Fixes: * Fix memory leak in the 'sched-messaging' perf bench test. (Davidlohr Bueso) * The -o and -n 'perf bench mem' options are mutually exclusive, emit error when both are specified. (Davidlohr Bueso) * Fix scrollbar refresh row index in the ui browser, problem exposed now that headers will be added and will be allowed to be switched on/off. (Jiri Olsa) Cleanups: * Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo) * Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de Melo) * No need to reimplement err() in 'perf bench sched-messaging', drop barf(). (Davidlohr Bueso). * Remove ev_name argument from perf_evsel__hists_browse, can be obtained from the other parameters. (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf evlist: Add suggestion of how to set perf_event_paranoid sysctlArnaldo Carvalho de Melo2014-06-191-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor hint to speed up problem resolution and get 'trace' working for non root users. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-abdqi8km4fj9osrn70q2zj9v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf symbols: Get kernel start address by symbol nameSimon Que2014-06-201-32/+22
|/ | | | | | | | | | | | | | | | | The function machine__get_kernel_start_addr() was taking the first symbol of kallsyms as the start address. This is incorrect in certain cases where the first symbol is something at 0, while the actual kernel functions begin at a later point (e.g. 0x80200000). This patch fixes machine__get_kernel_start_addr() to search for the symbol "_text" or "_stext", which marks the beginning of kernel mapping. This was already being done in machine__create_kernel_maps(). Thus, this patch is just a refactor, to move that code into machine__get_kernel_start_addr(). Signed-off-by: Simon Que <sque@chromium.org> Link: http://lkml.kernel.org/r/1402943529-13244-1-git-send-email-sque@chromium.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Add dso__data_* interface descriptonsJiri Olsa2014-06-122-0/+97
| | | | | | | | | | | | | | | | | | Adding descriptions/explanations for dso__data_* interface functions. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-10-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Allow to close dso fd in case of open failureJiri Olsa2014-06-121-1/+22
| | | | | | | | | | | | | | | | | | | Adding do_open function that tries to close opened dso objects in case we fail to open the dso due to to crossing the allowed RLIMIT_NOFILE limit. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-9-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Add file size check and factor dso__data_read_offsetJiri Olsa2014-06-122-15/+50
| | | | | | | | | | | | | | | | | | | | | | Adding file size check, because the lseek will succeed for any offset behind file size and thus succeed when it was expected to fail. Factoring the code to check the offset against file size earlier in the flow. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-8-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Cache dso data file descriptorJiri Olsa2014-06-122-4/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caching dso data file descriptors to avoid expensive re-opens especially during DWARF unwind. We keep dsos data file descriptors open until their count reaches the half of the current fd open limit (RLIMIT_NOFILE). In this case we close file descriptor of the first opened dso object. We've got overall speedup (~27% for my workload) of report: 'perf report --stdio -i perf-test.data' (3 runs) (perf-test.data size was around 12GB) current code: 545,640,944,228 cycles ( +- 0.53% ) 785,255,798,320 instructions ( +- 0.03% ) 366.340910010 seconds time elapsed ( +- 3.65% ) after change: 435,895,036,114 cycles ( +- 0.26% ) 636,790,271,176 instructions ( +- 0.04% ) 266.481463387 seconds time elapsed ( +- 0.13% ) Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-7-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Add global count of opened dso objectsJiri Olsa2014-06-121-1/+7
| | | | | | | | | | | | | | | | | | | Adding global count of opened dso objects so we could properly limit the number of opened dso data file descriptors. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-6-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Add global list of opened dso objectsJiri Olsa2014-06-122-2/+40
| | | | | | | | | | | | | | | | | | | Adding global list of opened dso objects, so we can track them and use the list for caching dso data file descriptors. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-5-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Add data_fd into dso objectJiri Olsa2014-06-123-6/+24
| | | | | | | | | | | | | | | | | | | | | Adding data_fd into dso object so we could handle caching of opened dso file data descriptors coming int next patches. Adding dso__data_close interface to keep the data_fd updated when the descriptor is closed. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-4-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Separate dso data related variablesJiri Olsa2014-06-122-5/+10
| | | | | | | | | | | | | | | | | | | Add separated structure/namespace for data related variables. We are going to add mode of them, so this way they will be clearly separated. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-3-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf tools: Cache register accesses for unwind processingJiri Olsa2014-06-123-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caching registers value into an array. Got about 4% speed up of perf_reg_value function for report command processing dwarf unwind stacks. Output from report over 1.5 GB data with DWARF unwind stacks: (TODO fix perf diff) current code: 5.84% perf perf [.] perf_reg_value change: 1.94% perf perf [.] perf_reg_value And little bit of overall speed up: (perf stat -r 5 -e '{cycles,instructions}:u' ...) current code: 310,298,611,754 cycles ( +- 0.33% ) 439,669,689,341 instructions ( +- 0.03% ) 188.656753166 seconds time elapsed ( +- 0.82% ) change: 291,315,329,878 cycles ( +- 0.22% ) 391,763,485,304 instructions ( +- 0.03% ) 180.742249687 seconds time elapsed ( +- 0.64% ) Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1401892622-30848-2-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* perf record: Fix to honor user freq/interval properlyNamhyung Kim2014-06-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | When configuring event perf checked a wrong condition that user specified both of freq (-F) and period (-c) or the event has no default value. This worked because most of events don't have default value and only tracepoint events have default of 1 (and it's not desirable to change it for those events). However, Andi's downloadable event patch changes the situation so it cannot change the value for those events. Fix it by allowing override the default value if user gives one of the options. $ perf record -a -e uops_retired.all -F 4000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.325 MB perf.data (~14185 samples) ] $ perf evlist -F cpu/uops_retired.all/: sample_freq=4000 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1402292617-26278-1-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar2014-06-122-8/+16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements from Arnaldo Carvalho de Melo: User visible: * Improve 'perf probe' error messages, moving some diagnostic messages to only appear in --verbose mode and fixing up some error reporting related to variables and struct members. (Masami Hiramatsu) * Reflow 'perf timechart' man page. (Stanislav Fomichev) Developer stuff: * Be more precise when reporting missing libraries in a static tool build. (Arnaldo Carvalho de Melo) * Show error messages from the multiple make invoked from 'make build-test'. (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf probe: Improve error messages in --line optionMasami Hiramatsu2014-06-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve error messages of 'perf probe --line' mode. Currently 'perf probe' shows the "Debuginfo analysis failed" message with an error code when the given symbol is not found: ----- # perf probe -L page_cgroup_init_flatmem Debuginfo analysis failed. (-2) Error: Failed to show lines. ----- But -2 (-ENOENT) means that the given source line or function was not found. With this patch, 'perf probe' shows the correct error message: ----- # perf probe -L page_cgroup_init_flatmem Specified source line is not found. Error: Failed to show lines. ----- There is also another debug error code is shown in the same function after get_real_path(). This removes that too. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20140606071406.6788.47850.stgit@kbuild-fedora.novalocal Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf probe: Improve an error message of perf probe --vars modeMasami Hiramatsu2014-06-092-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix an error message when failed to find given address in --vars mode. Without this fix, perf probe -V doesn't show the final "Error" message if it fails to find given source line. Moreover, it tells it fails to find "variables" instead of the source line. ----- # perf probe -V foo@bar Failed to find variables at foo@bar (0) ----- The result also shows mysterious error code. Actually the error returns 0 or -ENOENT means that it just fails to find the address of given source line. (0 means there is no matching address, and -ENOENT means there is an entry(DIE) but it has no instance, e.g. an empty inlined function) This fixes it to show what happened and the final error message as below. ----- # perf probe -V foo@bar Failed to find the address of foo@bar Error: Failed to show vars. ----- Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20140606071359.6788.84716.stgit@kbuild-fedora.novalocal Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf probe: Improve error message for unknown member of data structureMasami Hiramatsu2014-06-091-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the error message if we can not find given member in the given structure. Currently perf probe shows a wrong error message as below. ----- # perf probe getname_flags:65 "result->BOGUS" result(type:filename) has no member BOGUS. Failed to find 'result' in this function. Error: Failed to add events. (-22) ----- The first message is correct, but the second one is not, since we didn't fail to find a variable but fails to find the member of given variable. ----- # perf probe getname_flags:65 "result->BOGUS" result(type:filename) has no member BOGUS. Error: Failed to add events. (-22) ----- With this patch, the error message shows only the first one. And if we really failed to find given variable, it tells us so. ----- # perf probe getname_flags:65 "BOGUS" Failed to find 'BOGUS' in this function. Error: Failed to add events. (-2) ----- Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20140606071345.6788.23744.stgit@kbuild-fedora.novalocal Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Add dcacheline sortDon Zickus2014-06-094-0/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In perf's 'mem-mode', one can get access to a whole bunch of details specific to a particular sample instruction. A bunch of those details relate to the data address. One interesting thing you can do with data addresses is to convert them into a unique cacheline they belong too. Organizing these data cachelines into similar groups and sorting them can reveal cache contention. This patch creates an alogorithm based on various sample details that can help group entries together into data cachelines and allows 'perf report' to sort on it. The algorithm relies on having proper mmap2 support in the kernel to help determine if the memory map the data address belongs to is private to a pid or globally shared. The alogortithm is as follows: o group cpumodes together o group entries with discovered maps together o sort on major, minor, inode and inode generation numbers o if userspace anon, then sort on pid o sort on cachelines based on data addresses The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'. Sample output: # # Samples: 206 of event 'cpu/mem-loads/pp' # Total weight : 2534 # Sort order : dcacheline,pid # # Overhead Samples Data Cacheline Command: Pid # ........ ............ ...................................................................... .................. # 13.22% 1 [k] 0xffff88042f08ebc0 swapper: 0 9.27% 1 [k] 0xffff88082e8cea80 swapper: 0 3.59% 2 [k] 0xffffffff819ba180 swapper: 0 0.32% 1 [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0 swapper: 0 0.32% 1 [k] timekeeper_seq+0xfffffffffffffff8 swapper: 0 Note: Added a '+1' to symlen size in hists__calc_col_len to prevent the next column from prematurely tabbing over and mis-aligning. Not sure what the problem is. Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Add support to dynamically get cacheline sizeDon Zickus2014-06-092-0/+2
| | | | | | | | | | | | | | | | | | Different arches may have different cacheline sizes. Look it up and set a global variable for reference. Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401480605-97442-1-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Add cpumode to struct hist_entryDon Zickus2014-06-092-3/+5
| | | | | | | | | | | | | | | | The next patch needs to sort on cpumode, so add it to hist_entry to be tracked. Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401208087-181977-6-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | Revert "perf: Disable PERF_RECORD_MMAP2 support"Don Zickus2014-06-092-14/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 3090ffb5a2515990182f3f55b0688a7817325488. Re-enable the mmap2 interface as we will have a user soon. Since things have changed since perf disabled mmap2, small tweaks to the revert had to be done: o commit 9d4ecc88 forced (n!=8) to become (n<7) o a new libunwind test needed updating to use mmap2 interface Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401461382-209586-1-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Update mmap2 interface with protection and flag bitsDon Zickus2014-06-095-5/+32
| | | | | | | | | | | | | | | | | | The kernel piece passes more info now. Update the perf tool to reflect that and adjust the synthesized maps to play along. Signed-off-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1400526833-141779-4-git-send-email-dzickus@redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf script/python: Print array argument as stringNamhyung Kim2014-06-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the Sebastian's change of handling num array argument (of raw syscall enter), the script still failed to work like this: $ perf record -e raw_syscalls:* sleep 1 $ perf script -g python $ perf script -s perf-script.py ... Traceback (most recent call last): File "perf-script.py", line 42, in raw_syscalls__sys_enter (id, args), TypeError: %u format: a number is required, not list Fatal Python error: problem in Python trace event handler Aborted (core dumped) This is because the generated script tries to print the array arg as unsigned integer (%u). Since the python seems to convert arguments to strings by default, just using %s solved the problem for me. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: http://lkml.kernel.org/r/1401338695-18837-1-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | tools lib traceevent: Added support for __get_bitmask() macroSteven Rostedt (Red Hat)2014-06-072-0/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | Coming in v3.16, trace events will be able to save bitmasks in raw format in the ring buffer and output it with the __get_bitmask() macro. In order for userspace tools to parse this, it must be able to handle the __get_bitmask() call and be able to convert the data that's in the ring buffer into a nice bitmask format. The output is similar to what the kernel uses to print bitmasks, with a comma separator every 4 bytes (8 characters). This allows for cpumasks to also be saved efficiently. The first user is the thermal:thermal_power_limit event which has the following output: thermal_power_limit: cpus=0000000f freq=1900000 cdev_state=0 power=5252 Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.home Suggested-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Javi Merino <javi.merino@arm.com> Link: http://lkml.kernel.org/r/20140603032224.229186537@goodmis.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* Merge branch 'perf/urgent' into perf/core, to resolve conflict and to ↵Ingo Molnar2014-06-062-4/+7
|\ | | | | | | | | | | | | | | | | prepare for new patches Conflicts: arch/x86/kernel/traps.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf probe: Fix perf probe to find correct variable DIEMasami Hiramatsu2014-06-041-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix perf probe to find correct variable DIE which has location or external instance by tracking down the lexical blocks. Current die_find_variable() expects that the all variable DIEs which has DW_TAG_variable have a location. However, since recent dwarf information may have declaration variable DIEs at the entry of function (subprogram), die_find_variable() returns it. To solve this problem, it must track down the DIE tree to find a DIE which has an actual location or a reference for external instance. e.g. finding a DIE which origin is <0xdc73>; <1><11496>: Abbrev Number: 95 (DW_TAG_subprogram) <11497> DW_AT_abstract_origin: <0xdc42> <1149b> DW_AT_low_pc : 0x1850 [...] <2><114cc>: Abbrev Number: 119 (DW_TAG_variable) <- this is a declaration <114cd> DW_AT_abstract_origin: <0xdc73> <2><114d1>: Abbrev Number: 119 (DW_TAG_variable) [...] <3><115a7>: Abbrev Number: 105 (DW_TAG_lexical_block) <115a8> DW_AT_ranges : 0xaa0 <4><115ac>: Abbrev Number: 96 (DW_TAG_variable) <- this has a location <115ad> DW_AT_abstract_origin: <0xdc73> <115b1> DW_AT_location : 0x486c (location list) Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20140529121930.30879.87092.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Jiri Olsa <jolsa@kernel.org>
| * perf probe: Fix a segfault if asked for variable it doesn't findMasami Hiramatsu2014-06-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a segfault bug by asking for variable it doesn't find. Since the convert_variable() didn't handle error code returned from convert_variable_location(), it just passed an incomplete variable field and then a segfault was occurred when formatting the field. This fixes that bug by handling success code correctly in convert_variable(). Other callers of convert_variable_location() are correctly checking the return code. This bug was introduced by following commit. But another hidden erroneous error handling has been there previously (-ENOMEM case). commit 3d918a12a1b3088ac16ff37fa52760639d6e2403 Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20140529105232.28251.30447.stgit@ltc230.yrl.intra.hitachi.co.jp Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Move elide bool into perf_hpp_fmt structJiri Olsa2014-06-033-36/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'. The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for: $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio Performance report: 9.70% perf [.] perf_hpp__should_skip Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read. Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types. Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% ) 150.943975206 seconds time elapsed ( +- 0.62% ) now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% ) 116.416670507 seconds time elapsed ( +- 0.31% ) Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Remove elide setup for SORT_MODE__MEMORY modeJiri Olsa2014-06-031-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no need to setup elide of sort_dso sort entry again with symbol_conf.dso_list list. The only difference were list names of memory mode data, which does not make much sense to me. Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1400858147-7155-2-git-send-email-jolsa@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Reset output/sort order to defaultNamhyung Kim2014-06-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | When reset_output_field() is called, also reset field/sort order to NULL so that it can have the default values. It's needed for testing. Signed-off-by: Namhyung Kim <namhyung@kernel.org> CC: Arun Sharma <asharma@fb.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-26-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Enable --children option by defaultNamhyung Kim2014-06-011-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Now perf top and perf report will show children column by default if it has callchain information. Requested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Tested-by: Arun Sharma <asharma@fb.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-23-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Add callback function to hist_entry_iterNamhyung Kim2014-06-012-43/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new ->add_entry_cb() will be called after an entry was added to the histogram. It's used for code sharing between perf report and perf top. Note that ops->add_*_entry() should set iter->he properly in order to call the ->add_entry_cb. Also pass @arg to the callback function. It'll be used by perf top later. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/87k393g999.fsf@sejong.aot.lge.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Add more hpp helper functionsNamhyung Kim2014-06-011-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Sometimes it needs to disable some columns at runtime. Add help functions to support that. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-15-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Apply percent-limit to cumulative percentageNamhyung Kim2014-06-011-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | If -g cumulative option is given, it needs to show entries which don't have self overhead. So apply percent-limit to accumulated overhead percentage in this case. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-14-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf ui/hist: Add support to accumulated hist statNamhyung Kim2014-06-012-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print accumulated stat of a hist entry if requested. To do that, add new HPP_PERCENT_ACC_FNS macro and generate a perf_hpp_fmt using it. The __hpp__sort_acc() function sorts entries by accumulated period value. When accumulated periods of two entries are same (i.e. single path callchain) put the caller above since accumulation tends to put callers on higher position for obvious reason. Also add "overhead_children" output field to be selected by user. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-11-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Save callchain info for each cumulative entryNamhyung Kim2014-06-011-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | When accumulating callchain entry, also save current snapshot of the chain so that it can show the rest of the chain. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-10-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf callchain: Add callchain_cursor_snapshot()Namhyung Kim2014-06-011-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | The callchain_cursor_snapshot() is for saving current status of the callchain. It'll be used to accumulate callchain information for each node. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-9-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf report: Cache cumulative callchainsNamhyung Kim2014-06-011-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possble that a callchain has cycles or recursive calls. In that case it'll end up having entries more than 100% overhead in the output. In order to prevent such entries, cache each callchain node and skip if same entry already cumulated. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-8-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
* | perf tools: Update cpumode for each cumulative entryNamhyung Kim2014-06-013-11/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cpumode and level in struct addr_localtion was set for a sample and but updated as cumulative callchains were added. This led to have non-matching symbol and cpumode in the output. Update it accordingly based on the fact whether the map is a part of the kernel or not. This is a reverse of what thread__find_addr_map() does. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arun Sharma <asharma@fb.com> Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-7-git-send-email-namhyung@kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>