summaryrefslogtreecommitdiffstats
path: root/tools
Commit message (Collapse)AuthorAgeFilesLines
* Merge commit 'v3.1-rc9' into perf/coreIngo Molnar2011-10-0614-58/+213
|\ | | | | | | | | | | Merge reason: pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf tools: Fix raw sample readingJiri Olsa2011-09-291-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Wrong pointer is being passed for raw data sanity checking, when parsing sample event. This ends up with invalid event and perf record being stuck in __perf_session__process_events function during processing build IDs (process_buildids function). Following command hangs up in my setup: ./perf record -e raw_syscalls:sys_enter ls The fix is to use proper pointer to the raw data instead of the 'u' union. Reviewed-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1317308709-9474-2-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf python: Add missing perf_event__parse_sample 'swapped' parmArnaldo Carvalho de Melo2011-09-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem introduced in 936be50, that missed one perf_event__parse_sample user, the python binding. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ja4phms9618ggi657plyuch2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Add support for disabling -Werror via WERROR=0Darren Hart2011-09-231-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC often introduces new warnings with lots of false positives - breaking -Werror builds. WERROR=0 allows one to build perf without much fuss - while still encouraging people to send patches to avoid the fuss of having to type WERROR=0. Bisecting back to commits that produce a (mostly harmless) warning on some compilers is more difficult. With WERROR=0 one could bisect without worrying about harmless warnings. Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/eac06c7cc4920e5d4830417d466161fb26c7359c.1315514559.git.dvhart@linux.intel.com Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf top: Fix userspace sample addr map offsetArnaldo Carvalho de Melo2011-09-231-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'perf top' tool came from the kernel where we had each DSO (vmlinux, modules) loaded just once at a time. But userspace may have DSOs loaded in multiple addresses (shared libraries), requiring that we use the just resolved map instead of the first one found. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ag53wz0yllpgers0n2w7hchp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Fix issue with binaries using 16-bytes buildids (v2)Stephane Eranian2011-09-231-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Buildid can vary in size. According to the man page of ld, buildid can be 160 bits (sha1) or 128 bits (md5, uuid). Perf assumes buildid size of 20 bytes (160 bits) regardless. When dealing with md5 buildids, it would thus read more than needed and that would cause mismatches and samples without symbols. This patch fixes this by taking into account the actual buildid size as encoded int he section header. The leftover bytes are also cleared. This second version fixes a minor issue with the memset() base position. Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@gmail.com> Link: http://lkml.kernel.org/r/4cc1af3c.8ee7d80a.5a28.ffff868e@mx.google.com Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tool: Fix endianness handling of u32 data in samplesDavid Ahern2011-09-234-14/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, analyzing PPC data files on x86 the cpu field is always 0 and the tid and pid are backwards. For example, analyzing a PPC file on PPC the pid/tid fields show: rsyslogd 1210/1212 and analyzing the same PPC file using an x86 perf binary shows: rsyslogd 1212/1210 The problem is that the swap_op method for samples is perf_event__all64_swap which assumes all elements in the sample_data struct are u64s. cpu, tid and pid are u32s and need to be handled individually. Given that the swap is done before the sample is parsed, the simplest solution is to undo the 64-bit swap of those elements when the sample is parsed and do the proper swap. The RAW data field is generic and perf cannot have programmatic knowledge of how to treat that data. Instead a warning is given to the user. Thanks to Anton Blanchard for providing a data file for a mult-CPU PPC system so I could verify the fix for the CPU fields. v3 -> v4: - fixed use of WARN_ONCE v2 -> v3: - used WARN_ONCE for message regarding raw data - removed struct wrapper around union - fixed whitespace issues v1 -> v2: - added a union for undoing the byte-swap on u64 and redoing swap on u32's to address compiler errors (see git commit 65014ab3) Cc: Anton Blanchard <anton@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1315321946-16993-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sort: Fix symbol sort output by separating unresolved samples by typeAnton Blanchard2011-09-231-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I took a profile that suggested 60% of total CPU time was in the hypervisor: ... 60.20% [H] 0x33d43c 4.43% [k] ._spin_lock_irqsave 1.07% [k] ._spin_lock Using perf stat to get the user/kernel/hypervisor breakdown contradicted this. The problem is we merge all unresolved samples into the one unknown bucket. If add a comparison by sample type to sort__sym_cmp we get the real picture: ... 57.11% [.] 0x80fbf63c 4.43% [k] ._spin_lock_irqsave 1.07% [k] ._spin_lock 0.65% [H] 0x33d43c So it was almost all userspace, not hypervisor as the initial profile suggested. I found another issue while adding this. Symbol sorting sometimes shows multiple entries for the unknown bucket: ... 16.65% [.] 0x6cd3a8 7.25% [.] 0x422460 5.37% [.] yylex 4.79% [.] malloc 4.78% [.] _int_malloc 4.03% [.] _int_free 3.95% [.] hash_source_code_string 2.82% [.] 0x532908 2.64% [.] 0x36b538 0.94% [H] 0x8000000000e132a4 0.82% [H] 0x800000000000e8b0 This happens because we aren't consistent with our sorting. On one hand we check to see if both symbols match and for two unresolved samples sym is NULL so we match: if (left->ms.sym == right->ms.sym) return 0; On the other hand we use sample IP for unresolved samples when comparing against a symbol: ip_l = left->ms.sym ? left->ms.sym->start : left->ip; ip_r = right->ms.sym ? right->ms.sym->start : right->ip; This means unresolved samples end up spread across the rbtree and we can't merge them all. If we use cmp_null all unresolved samples will end up in the one bucket and the output makes more sense: ... 39.12% [.] 0x36b538 5.37% [.] yylex 4.79% [.] malloc 4.78% [.] _int_malloc 4.03% [.] _int_free 3.95% [.] hash_source_code_string 2.26% [H] 0x800000000000e8b0 Acked-by: Eric B Munson <emunson@mgebm.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ian Munsie <imunsie@au1.ibm.com> Link: http://lkml.kernel.org/r/20110831115145.4f598ab2@kryten Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Synthesize anonymous mmap eventsAnton Blanchard2011-09-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_event__synthesize_mmap_events does not create anonymous mmap events even though the kernel does. As a result an already running application with dynamically created code will not get profiled - all samples end up in the unknown bucket. This patch skips any entries with '[' in the name to avoid adding events for special regions (eg the vsyscall page). All other executable mmaps are assumed to be anonymous and an event is synthesized. Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Link: http://lkml.kernel.org/r/20110830091506.60b51fe8@kryten Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf record: Create events initially disabled and enable after initDavid Ahern2011-09-233-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf-record currently creates events enabled. When doing a system wide collection (-a arg) this causes data collection for perf's initialization activities -- eg., perf_event__synthesize_threads(). For some events (e.g., context switch S/W event or tracepoints like syscalls) perf's initialization causes a lot of events to be captured frequently generating "Check IO/CPU overload!" warnings on larger systems (e.g., 2 socket, quad core, hyperthreading). perf's initialization phase can be skipped by creating events disabled and then enabling them once the initialization is done. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1314289075-14706-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Add some heuristics for choosing the best duplicate symbolAnton Blanchard2011-09-231-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Try and pick the best symbol based on a few heuristics: - Prefer a non weak symbol over a weak one - Prefer a global symbol over a non global one - Prefer a symbol with less underscores (idea taken from kallsyms.c) - If all else fails, choose the symbol with the longest name Cc: Eric B Munson <emunson@mgebm.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110824065243.161953371@samba.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Preserve symbol scope when parsing /proc/kallsymsAnton Blanchard2011-09-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | kallsyms__parse capitalises the symbol type, so every symbol is marked global. Remove this and fix symbol_type__is_a to handle both local and global symbols. Cc: Eric B Munson <emunson@mgebm.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110824065243.077125989@samba.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: /proc/kallsyms does not sort module symbolsAnton Blanchard2011-09-231-22/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kallsyms__parse assumes that /proc/kallsyms is sorted and sets the end of the previous symbol to the start of the current one. Unfortunately module symbols are not sorted, eg: ffffffffa0081f30 t e1000_clean_rx_irq [e1000e] ffffffffa00817a0 t e1000_alloc_rx_buffers [e1000e] Some symbols end up with a negative length and others have a length larger than they should. This results in confusing perf output. We already have a function to fixup the end of zero length symbols so use that instead. Cc: Eric B Munson <emunson@mgebm.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110824065242.969681349@samba.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf symbols: Fix ppc64 SEGV in dso__load_sym with debuginfo filesAnton Blanchard2011-09-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 64bit PowerPC debuginfo files have an empty function descriptor section. I hit a SEGV when perf tried to use this section for symbol resolution. To fix this we need to check the section is valid and we can do this by checking for type SHT_PROGBITS. Cc: <stable@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Eric B Munson <emunson@mgebm.net> Link: http://lkml.kernel.org/r/20110824065242.895239970@samba.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf probe: Fix regression of variable finderMasami Hiramatsu2011-09-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to call convert_variable() if previous call does not fail. To call convert_variable, it ensures "ret" is 0. However, since "ret" has the return value of synthesize_perf_probe_arg() which always returns positive value if it succeeded, perf probe doesn't call convert_variable(). This will cause a SEGV when we add an event with arguments. This has to be fixed as it ensures "ret" is greater than 0 (or not negative). This regression has been introduced by my previous patch, f182e3e1. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20110820053922.3286.65805.stgit@fedora15 Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf symbols: Treat all memory maps without dso file as loadedJiri Olsa2011-09-291-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stack/vdso/heap memory maps dont have any dso file. Setting the perf dso objects as 'loaded' for these maps, we avoid unnecessary warnings like: "Failed to open [stack], continuing without symbols" All map__find_* functions still return NULL when searching for symbols in these maps. Link: http://lkml.kernel.org/r/20110824131834.GA2007@jolsa.brq.redhat.com Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf sched: Fix script command documentationJiri Olsa2011-09-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | Fixed leftover from trace -> script rename. Link: http://lkml.kernel.org/r/1317114995-4534-1-git-send-email-jolsa@redhat.com Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf report: Fix stdio event name header printingArnaldo Carvalho de Melo2011-09-291-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the past we tried to avoid printing the name of the event when just one event was found in the perf.data file, after some refactorings it ended up not printing the event name if just one hist_entry was found in one of the events. Fix it by always printing the name of the event, even if just one is found. Reported-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf: Support setting the disassembler styleAndi Kleen2011-09-296-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Add -M option to report/annotate to pass directly to objdump. This allows to use -M intel for intel style disassembler syntax, which is useful for people who are very used to the Intel syntax. Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org [committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7] Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Make stat/record print fatal signals of the target programAndi Kleen2011-09-292-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a program crashes under perf there is no message about it, unlike when running it from bash. This can be confusing and lead to wrong actions during debugging. Print fatal signals in perf stat/record. Thanks to Furat Afram for finding the problem originally Link: http://lkml.kernel.org/r/1316122302-24306-1-git-send-email-andi@firstfloor.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf stat: Fix spelling in commentJim Cromie2011-09-291-1/+1
| | | | | | | | | | | | | | Link: http://lkml.kernel.org/r/1315437244-3788-6-git-send-email-jim.cromie@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf stat: Allow tab as cvs delimiterJim Cromie2011-09-291-2/+4
| | | | | | | | | | | | | | | | | | | | If option -x '\t' is given, convert '\t' to "\t". This makes cvs printing more flexible. Link: http://lkml.kernel.org/r/1315437244-3788-5-git-send-email-jim.cromie@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf stat: Suppress printing std-dev when its 0Jim Cromie2011-09-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | For pretty output only (preserve column for cvs output), dont print std-deviation when its 0.00. Do this based upon value, instead of checking for --no-aggr, since the stats could conceivably be computed over the runs on each CPU, and theres no reason to preclude that. Link: http://lkml.kernel.org/r/1315437244-3788-4-git-send-email-jim.cromie@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf stat: Fix +- nan% in --no-aggr runsJim Cromie2011-09-291-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this patch, running: $ sudo ./perf stat -r20 --no-aggr -a perl -e '$i++ for 1..100000' I get computations like this: CPU0 12.488247 task-clock # 1.224 CPUs utilized ( +- -nan% ) CPU1 12.488909 task-clock # 1.225 CPUs utilized ( +- -nan% ) CPU2 12.500221 task-clock # 1.226 CPUs utilized ( +- -nan% ) CPU3 12.481713 task-clock # 1.224 CPUs utilized ( +- -nan% ) but with patch, I get: CPU0 8.233682 task-clock # 0.754 CPUs utilized ( +- 0.00% ) CPU1 8.226318 task-clock # 0.754 CPUs utilized ( +- 0.00% ) CPU2 8.210737 task-clock # 0.752 CPUs utilized ( +- 0.00% ) CPU3 8.201691 task-clock # 0.751 CPUs utilized ( +- 0.00% ) Note that without --no-aggr, I get non-0 statistics both before and after patch: 231.986022 task-clock # 4.030 CPUs utilized ( +- 0.97% ) 212 context-switches # 0.001 M/sec ( +- 12.07% ) 9 CPU-migrations # 0.000 M/sec ( +- 25.80% ) 466 page-faults # 0.002 M/sec ( +- 3.23% ) 174,318,593 cycles # 0.751 GHz ( +- 1.06% ) I couldnt see anything wrong in the caller, so fixed it in stddev_stats(). ISTM that 0.00 is better than nan, since perf stat was passed -A (--no-aggr) so no standard deviation should be expected, and nan is suggestive of a deeper error. When running with --no-aggr, perhaps we should suppress the statistics printing entirely, or do so when they are 0.00. Link: http://lkml.kernel.org/r/1315437244-3788-3-git-send-email-jim.cromie@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf stat: Add --log-fd <N> option to redirect stderr elsewhereJim Cromie2011-09-292-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This perf stat option emulates valgrind's --log-fd option, allowing the user to send perf results elsewhere, and leaving stderr for use by the program under test. This complements --output file option, and is mutually exclusive with it. 3>results perf stat --log-fd 3 -- $cmd 3>>results perf stat --log-fd 3 --append -- $cmd The perl distro's make test.valgrind target uses valgrind's --log-fd option, I've adapted it to invoke perf also, and tested this patch there. Link: http://lkml.kernel.org/r/1315437244-3788-2-git-send-email-jim.cromie@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf top: Improve lost events warningArnaldo Carvalho de Melo2011-09-293-17/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now it warns everytime that new events are lost. And the TUI also warns now. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-w1n168yrvrppnq6887s4u0wx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf top browser: Fix up line width calculationArnaldo Carvalho de Melo2011-09-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixing an artifact where the last 3 chars of a long DSO name would remain on the screen sometimes. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-dkiakcl3z69dh1bt9uegaktv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf buildid-list: Support showing the build id in an ELF fileArnaldo Carvalho de Melo2011-09-292-4/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Try first reading the build id, validating that it is an ELF file, etc. Cheap as libelf will bail out as soon as the magic number check fails. Useful when investigating debuginfo packaging problems like this one: [root@emilia ~]# perf buildid-list -i /usr/lib/debug/lib/modules/`uname -r`/vmlinux 77bb4ea591a602d455ace759a377c9adfff1aba3 [root@emilia ~]# perf buildid-list -k 07b0c016a2b30004e86132d0239945b1e88f5d75 [root@emilia ~]# Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4elot9oxwa0rr0d90dshca3a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf buildid-list: Add option to show the running kernel build idArnaldo Carvalho de Melo2011-09-292-2/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [root@emilia ~]# perf buildid-list -k 07b0c016a2b30004e86132d0239945b1e88f5d75 Useful when diagnosing build id problems in debuginfo packages, etc. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-po1bl7acn6e1hhne90opmvtl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf script: Add drop monitor scriptNeil Horman2011-09-293-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A while back I created the dropmonitor protocol, which allowed users to get reports of dropped frames communicated to them via a netlink socket. While useful, several people have now asked that I integrate the ability to do drop monitoring with perf, so they don't have to run additional tools. This patch adds a drop monitor script to the perf suite, and provides the same output that the netlink socket does. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309801217-22450-1-git-send-email-nhorman@tuxdriver.com Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf symbols: Stop using 'self' in map_groups__ methodsArnaldo Carvalho de Melo2011-09-292-58/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop using this python/OOP convention, doesn't really helps. Will do more from time to time till we get it cleaned up in all of /perf. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-rl9e690y60vnuyng05yp1zd3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge commit 'v3.1-rc7' into perf/coreIngo Molnar2011-09-2622-289/+311
|\| | | | | | | | | | | Merge reason: Pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds2011-08-291-0/+3
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: pm: avoid writing the auxillary control register for ARMv7 ARM: pm: some ARMv7 requires a dsb in resume to ensure correctness ARM: pm: arm920/926: fix number of registers saved ARM: pm: CPU specific code should not overwrite r1 (v:p offset) ARM: 7066/1: proc-v7: disable SCTLR.TE when disabling MMU ARM: 7065/1: kexec: ensure new kernel is entered in ARM state ARM: 7003/1: vexpress: Add clock definition for the SP805. ARM: 7051/1: cpuimx* boards: fix mach-types errors ARM: 7019/1: Footbridge: select CLKEVT_I8253 for ARCH_NETWINDER ARM: 7015/1: ARM errata: Possible cache data corruption with hit-under-miss enabled ARM: 7014/1: cache-l2x0: Fix L2 Cache size calculation. ARM: 6967/1: ep93xx: ts72xx: fix board model detection ARM: 6965/1: ep93xx: add model detection for ts-7300 and ts-7400 boards ARM: cache: detect VIPT aliasing I-cache on ARMv6 ARM: twd: register clockevents device before enabling PPI ARM: realview: ensure visibility of writes during reset ARM: perf: make name of arm_pmu_type consistent ARM: perf: fix prototype of release_pmu ARM: fix perf build with uclibc toolchains
| | * Merge branch '3.1-fixes-for-rmk' of git://linux-arm.org/linux-2.6-wd into fixesRussell King2011-08-131-0/+3
| | |\
| | | * ARM: fix perf build with uclibc toolchainsFlorian Fainelli2011-08-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libio.h is not provided by uClibc, in order to be able to test the definition of __UCLIBC__ we need to include stdlib.h, which also includes stddef.h, providing the definition of 'NULL'. Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutilsLinus Torvalds2011-08-2521-289/+308
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils: cpupower: use man(1) when calling "cpupower help subcommand" cpupower: make NLS truly optional cpupower: fix Makefile typo cpupower: Make monitor command -c/--cpu aware cpupower: Better detect offlined CPUs cpupower: Do not show an empty Idle_Stats monitor if no idle driver is available cpupower: mperf monitor - Use TSC to calculate max frequency if possible cpupower: avoid using symlinks
| | * | | cpupower: use man(1) when calling "cpupower help subcommand"Dominik Brodowski2011-08-1911-228/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of printing something non-formatted to stdout, call man(1) to show the man page for the proper subcommand. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: make NLS truly optionalDominik Brodowski2011-08-192-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Loosely based on a patch for cpufrequtils, submittted by Sergey Dryabzhinsky <sergey.dryabzhinsky@gmail.com> and signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: fix Makefile typoDave Jones2011-08-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: Make monitor command -c/--cpu awareThomas Renninger2011-08-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows for example: cpupower -c 2-4,6 monitor -m Mperf |Mperf PKG |CORE|CPU | C0 | Cx | Freq 0| 8| 4| 2.42| 97.58| 1353 0| 16| 2| 14.38| 85.62| 1928 0| 24| 6| 1.76| 98.24| 1442 1| 16| 3| 15.53| 84.47| 1650 CPUs always get resorted for package, core then cpu id if it could get read out (or however you name these topology levels...). Still this is a nice way to keep the overview if a test binary is bound to a specific CPU or if one wants to show all CPUs inside a package or similar. Still missing: Do not measure not available cores to reduce the overhead and achieve better results. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: Better detect offlined CPUsThomas Renninger2011-08-155-4/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before, checking for offlined CPUs was done dirty and it was checked whether topology parsing returned -1 values. But this is a valid case on a Xen (and possibly other) kernels. Do proper online/offline checking, also take CONFIG_HOTPLUG_CPU option into account (no /sys/devices/../cpuX/online file). Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: Do not show an empty Idle_Stats monitor if no idle driver is availableThomas Renninger2011-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By taking error values of: sysfs_get_idlestate_count(..); into account. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: mperf monitor - Use TSC to calculate max frequency if possibleThomas Renninger2011-08-152-48/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Which makes the implementation independent from cpufreq drivers. Therefore this would also work on a Xen kernel where the hypervisor is doing frequency switching and idle entering. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | cpupower: avoid using symlinksAmerigo Wang2011-08-153-6/+4
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | Reference the source directly, don't create symlinks. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* | | | Merge branch 'perf/core' of ↵Ingo Molnar2011-08-188-66/+126
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Conflicts: tools/perf/builtin-stat.c
| * | | perf stat: Add -o and --append optionsStephane Eranian2011-08-183-64/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds an option (-o) to save the output of perf stat into a file. You could do this with perf record but not with perf stat. Instead, you had to fiddle with stderr to save the counts into a separate file. The patch also adds the --append option so that results can be concatenated into a single file across runs. Each run of the tool is clearly separated by a comment line starting with a hash mark. The -A option of perf record is already used by perf stat, so we only add a long option. $ perf stat -o res.txt date $ cat res.txt Performance counter stats for 'date': 0.791306 task-clock # 0.668 CPUs utilized 2 context-switches # 0.003 M/sec 0 CPU-migrations # 0.000 M/sec 197 page-faults # 0.249 M/sec 1878143 cycles # 2.373 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 1083367 instructions # 0.58 insns per cycle 193027 branches # 243.935 M/sec 9014 branch-misses # 4.67% of all branches 0.001184746 seconds time elapsed The option can be combined with -x to make the output file much easier to parse. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20110815202233.GA18535@quad Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf annotate: Add --symfs optionStephane Eranian2011-08-182-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If you have --symfs in perf report, then you also need it for perf annotate. This allows off-box assembly level analysis of perf.data samples. This patch complements: commit ec5761eab318e50e69fcf8e63e9edaef5949c067 Author: David Ahern <daahern@cisco.com> Date: Thu Dec 9 13:27:07 2010 -0700 perf symbols: Add symfs option for off-box analysis using specified tree Acked-by: David Ahern <daahern@cisco.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Ahern <daahern@cisco.com> Link: http://lkml.kernel.org/r/20110729232040.GA21838@quad Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf annotate: Make output more readableStephane Eranian2011-08-185-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds two new options to perf annotate: - --no-asm-raw : Do not display raw instruction encodings - --no-source : Do not interleave source code with assembly code We believe those options make the output of annotate more readable. Systematically displaying source can make it hard to follow code and especially optimized code. Raw encodings are not useful in most cases. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20110517153207.GA9834@quad Signed-off-by: Stephane Eranian <eranian@google.com> [committer note: Use the 'no-' option inverting logic] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf tools: Add group event scheduling option to perf record/statLin Ming2011-08-182-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Group event scheduling command line option is missing in perf record/stat. Add it to perf record/stat, which is same as in perf top. Reported-by: Andi Kleen <andi@firstfloor.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1313577727.2754.5.camel@hp6530s Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf tools: Fix build against newer glibcJosh Boyer2011-08-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream glibc commit 295e904 added a definition for __attribute_const__ to cdefs.h. This causes the following error when building perf: util/include/linux/compiler.h:8:0: error: "__attribute_const__" redefined [-Werror] /usr/include/sys/cdefs.h:226:0: note: this is the location of the previous definition Wrap __attribute_const__ in #ifndef as we do for __always_inline. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110818113720.GL2227@zod.bos.redhat.com Signed-off-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>