summaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation
Commit message (Collapse)AuthorAgeFilesLines
* perf diff: Support --pid/--tid filter optionsJin Yao2019-03-061-0/+6
| | | | | | | | | | | | | | | | | | | | | Using the existing symbol_conf.pid_list_str and symbol_conf.tid_list_str logic. For example: perf diff --tid 13965 It'll only diff the samples for thread 13965. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1551791143-10334-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf diff: Support --cpu filter optionJin Yao2019-03-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | To improve 'perf diff', implement a --cpu filter option. Multiple CPUs can be provided as a comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. Default is to report samples on all CPUs. For example, perf diff --cpu 0,1 It only diff the samples for CPU0 and CPU1. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1551791143-10334-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf diff: Support --time filter optionJin Yao2019-03-061-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To improve 'perf diff', implement a --time filter option to diff the samples within given time window. It supports time percent with multiple time ranges. The time string format is 'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. For example: Select the second 10% time slice to diff: perf diff --time 10%/2 Select from 0% to 10% time slice to diff: perf diff --time 0%-10% Select the first and the second 10% time slices to diff: perf diff --time 10%/1,10%/2 Select from 0% to 10% and 30% to 40% slices to diff: perf diff --time 0%-10%,30%-40% It also supports analysing samples within a given time window <start>,<stop>. Times have the format seconds.microseconds. If 'start' is not given (i.e., time string is ',x.y') then analysis starts at the beginning of the file. If the stop time is not given (i.e, time string is 'x.y,') then analysis goes to end of file. Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different perf.data files. For example, we get the timestamp information from perf script. perf script -i perf.data.old mgen 13940 [000] 3946.361400: ... perf script -i perf.data mgen 13940 [000] 3971.150589 ... perf diff --time 3946.361400,:3971.150589, It analyzes the perf.data.old from the timestamp 3946.361400 to the end of perf.data.old and analyzes the perf.data from the timestamp 3971.150589 to the end of perf.data. v4: --- Update abstime_str_dup(), let it return error if strdup is failed, and update __cmd_diff() accordingly. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1551791143-10334-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Allow +- operator for type specific fields optionJiri Olsa2019-02-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to add/remove fields for specific event types in -F option. It's now possible to use '+-' after event type, like: # cat > test.c #include <stdio.h> int main(void) { printf("Hello world\n"); while(1) {} } ^D # gcc -g -o test test.c # perf probe -x test 'test.c:5' # perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test ... # perf script -Ftrace:+period,-cpu test 3859 396291.117343: 10275 cpu/cpu-cycles,period=10000/: 7f.. test 3859 396291.118234: 11041 cpu/cpu-cycles,period=10000/: ffffff.. test 3859 396291.118234: 1 probe_test:main: test 3859 396291.118248: 8668 cpu/cpu-cycles,period=10000/: ffffff.. test 3859 396291.118263: 10139 cpu/cpu-cycles,period=10000/: ffffff.. Committer testing: Couldn't make the test above work, but tested it with: # perf probe -x hello main Added new event: probe_hello:main (on main in /home/acme/c/hello) You can now use it in all perf tools, such as: perf record -e probe_hello:main -aR sleep 1 # perf record -e probe_hello:main ./hello hello, world [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ] # perf script hello 21454 [002] 254116.874005: probe_hello:main: (401126) # # perf script -Ftrace:+period,-cpu hello 21454 254116.874005: 1 probe_hello:main: (401126) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow dumping a BPF map after setting up BPF eventsArnaldo Carvalho de Melo2019-02-191-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initial use case: Dumping the maps setup by tools/perf/examples/bpf/augmented_raw_syscalls.c, which so far are just booleans, showing just non-zeroed entries: # cat ~/.perfconfig [llvm] dump-obj = true clang-opt = -g [trace] #add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o add_events = /wb/augmented_raw_syscalls.o $ date Tue Feb 19 16:29:33 -03 2019 $ ls -la /wb/augmented_raw_syscalls.o -rwxr-xr-x. 1 root root 14048 Jan 24 12:09 /wb/augmented_raw_syscalls.o $ file /wb/augmented_raw_syscalls.o /wb/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped $ # trace -e recvmmsg,sendmmsg --map-dump foobar ERROR: BPF map "foobar" not found # trace -e recvmmsg,sendmmsg --map-dump filtered_pids ERROR: BPF map "filtered_pids" not found # trace -e recvmmsg,sendmmsg --map-dump pids_filtered [2583] = 1, [2267] = 1, ^Z [1]+ Stopped trace -e recvmmsg,sendmmsg --map-dump pids_filtered # pidof trace 2267 # ps ax|grep gnome-terminal|grep -v grep 2583 ? Ssl 58:33 /usr/libexec/gnome-terminal-server ^C # trace -e recvmmsg,sendmmsg --map-dump syscalls [299] = 1, [307] = 1, ^C # grep x64_recvmmsg arch/x86/entry/syscalls/syscall_64.tbl 299 64 recvmmsg __x64_sys_recvmmsg # grep x64_sendmmsg arch/x86/entry/syscalls/syscall_64.tbl 307 64 sendmmsg __x64_sys_sendmmsg # Next step probably will be something like 'perf stat's --interval-print and --interval-clear. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/n/tip-ztxj25rtx37ixo9cfajt8ocy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf doc: Fix documentation of the Flags section in perf.dataJonas Rabenstein2019-02-191-5/+4
| | | | | | | | | | | | | | | | | | According to the current documentation the flags section is placed after the file header itself but the code assumes to find the flags section after the data section. This change updates the documentation to that assumption. Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190219154515.3954-2-jonas.rabenstein@studium.uni-erlangen.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf doc: Fix HEADER_CMDLINE description in perf.data documentationJonas Rabenstein2019-02-191-1/+1
| | | | | | | | | | | | | | | | The content of the HEADER_CMDLINE feature header is a perf_header_string_list of the argument vector and not a perf_header_string of the commandline. Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20190219154515.3954-1-jonas.rabenstein@studium.uni-erlangen.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Implement --affinity=node|cpu optionAlexey Budankov2019-02-111-0/+5
| | | | | | | | | | | | | | Implement --affinity=node|cpu option for the record mode defaulting to system affinity mask bouncing. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-5.1-20190206' of ↵Ingo Molnar2019-02-092-0/+45
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Hardware tracing: Adrian Hunter: - Handle calls optimized into jumps to a different symbol in the thread stack routines used to process hardware traces (Adrian Hunter) Intel PT: Adrian Hunter: - Fix overlap calculation for padding. - Fix CYC timestamp calculation after OVF. - Packet splitting can only happen in 32-bit. - Add timestamp to auxtrace errors. ARM CoreSight: Leo Yan: - Add last instruction information in packet - Set sample flags for instruction range, exception and return packets and for a trace discontinuity. - Add exception number in exception packet - Change tuple from traceID-CPU# to traceID-metadata - Add traceID in packet Mathieu Poirier: - Add "sinks" group to PMU directory - Use event attributes to send sink information to kernel - Remove set_drv_config() API, no longer used. perf annotate: Jiri Olsa: - Delay symbol annotation to the resort phase, speeding up 'perf report' startup. perf record: Alexey Budankov: - Allow binding userspace buffers to NUMA nodes. Symbols: Adrian Hunter: - Fix calculating of symbol sizes when splitting kallsyms into maps for kcore processing. Vendor events: William Cohen: - Intel: Fix Load_Miss_Real_Latency on CLX Misc: Arnaldo Carvalho de Melo: - Streamline headers, removing includes when all that is needed are just forward declarations, fixup the fallout for cases where headers should have been explicitely included but were instead obtained indirectly, by sheer luck. - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it continue to build on older systems where those were not yet introduced or in systems using some other libc than the GNU one where those helpers aren't present. Documentation: Changbin Du: - Add documentation for BPF event selection. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf tools: Add documentation for BPF event selectionChangbin Du2019-02-062-0/+45
| | | | | | | | | | | | | | | | | | | | | | Add documentation for how to pass a BPF program as a perf event. Signed-off-by: Changbin Du <changbin.du@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190201134651.12373-1-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf mem/c2c: Fix perf_mem_events to support powerpcRavi Bangoria2019-02-042-5/+13
|/ | | | | | | | | | | | | | | | | | | | | | PowerPC hardware does not have a builtin latency filter (--ldlat) for the "mem-load" event and perf_mem_events by default includes "/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to support "perf mem/c2c" on PowerPC. This patch depends on kernel side changes done my Madhavan: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Dick Fowles <fowles@inreach.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow asking for not suppressing common string prefixesArnaldo Carvalho de Melo2018-12-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far we've been suppressing common stuff such as "MAP_" in the mmap flags, showing "SHARED" instead of "MAP_SHARED", allow for those prefixes (and a few suffixes) to be shown: # trace -e *map,open*,*seek sleep 1 openat("/etc/ld.so.cache", CLOEXEC) = 3 mmap(0, 109093, READ, PRIVATE, 3, 0) = 0x7ff61c695000 openat("/lib64/libc.so.6", CLOEXEC) = 3 lseek(3, 792, SET) = 792 mmap(0, 8192, READ|WRITE, PRIVATE|ANONYMOUS) = 0x7ff61c693000 lseek(3, 792, SET) = 792 lseek(3, 864, SET) = 864 mmap(0, 1857568, READ, PRIVATE|DENYWRITE, 3, 0) = 0x7ff61c4cd000 mmap(0x7ff61c4ef000, 1363968, EXEC|READ, PRIVATE|FIXED|DENYWRITE, 3, 139264) = 0x7ff61c4ef000 mmap(0x7ff61c63c000, 311296, READ, PRIVATE|FIXED|DENYWRITE, 3, 1503232) = 0x7ff61c63c000 mmap(0x7ff61c689000, 24576, READ|WRITE, PRIVATE|FIXED|DENYWRITE, 3, 1814528) = 0x7ff61c689000 mmap(0x7ff61c68f000, 14368, READ|WRITE, PRIVATE|FIXED|ANONYMOUS) = 0x7ff61c68f000 munmap(0x7ff61c695000, 109093) = 0 openat("/usr/lib/locale/locale-archive", CLOEXEC) = 3 mmap(0, 217749968, READ, PRIVATE, 3, 0) = 0x7ff60f523000 # # vim ~/.perfconfig # # perf config llvm.dump-obj=true trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o trace.show_zeros=yes trace.show_duration=no trace.no_inherit=yes trace.show_timestamp=no trace.show_arg_names=no trace.args_alignment=0 trace.string_quote=" trace.show_prefix=yes # # # trace -e *map,open*,*seek sleep 1 openat(AT_FDCWD, "/etc/ld.so.cache", O_CLOEXEC) = 3 mmap(0, 109093, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7ebbe59000 openat(AT_FDCWD, "/lib64/libc.so.6", O_CLOEXEC) = 3 lseek(3, 792, SEEK_SET) = 792 mmap(0, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7ebbe57000 lseek(3, 792, SEEK_SET) = 792 lseek(3, 864, SEEK_SET) = 864 mmap(0, 1857568, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7ebbc91000 mmap(0x7f7ebbcb3000, 1363968, PROT_EXEC|PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 139264) = 0x7f7ebbcb3000 mmap(0x7f7ebbe00000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1503232) = 0x7f7ebbe00000 mmap(0x7f7ebbe4d000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1814528) = 0x7f7ebbe4d000 mmap(0x7f7ebbe53000, 14368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f7ebbe53000 munmap(0x7f7ebbe59000, 109093) = 0 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_CLOEXEC) = 3 mmap(0, 217749968, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7eaece7000 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mtn1i4rjowjl72trtnbmvjd4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Make the alignment of the syscall args be configurableArnaldo Carvalho de Melo2018-12-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the start 'perf trace' aligns the parens enclosing the list of syscall args to align the syscall results, allow this to be configurable, keeping the default of 70. Using: # perf config llvm.dump-obj=true trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o trace.show_zeros=yes trace.show_duration=no trace.no_inherit=yes trace.show_timestamp=no trace.show_arg_names=no trace.args_alignment=0 # trace -e open*,close,*sleep sleep 1 openat(CWD, /etc/ld.so.cache, CLOEXEC) = 3 close(3) = 0 openat(CWD, /lib64/libc.so.6, CLOEXEC) = 3 close(3) = 0 openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC) = 3 close(3) = 0 nanosleep(0x7ffc00de66f0, 0) = 0 close(1) = 0 close(2) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r8cbhoz1lr5npq9tutpvoigr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow suppressing the syscall argument namesArnaldo Carvalho de Melo2018-12-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To show just the values: Default: # trace -e open*,close,*sleep sleep 1 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 close(fd: 3 ) = 0 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3 close(fd: 3 ) = 0 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 close(fd: 3 ) = 0 nanosleep(rqtp: 0x7ffc0c4ea0d0, rmtp: 0 ) = 0 close(fd: 1 ) = 0 close(fd: 2 ) = 0 # Remove it: # perf config trace.show_arg_names=no # trace -e open*,close,*sleep sleep 1 openat(CWD, /etc/ld.so.cache, CLOEXEC ) = 3 close(3 ) = 0 openat(CWD, /lib64/libc.so.6, CLOEXEC ) = 3 close(3 ) = 0 openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC ) = 3 close(3 ) = 0 nanosleep(0x7ffced3a8c40, 0 ) = 0 close(1 ) = 0 close(2 ) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ta9tbdwgodpw719sr2bjm8eb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow configuring if the syscall start timestamp should be printedArnaldo Carvalho de Melo2018-12-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # trace -e open*,close,*sleep sleep 1 0.000 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 0.016 close(fd: 3 ) = 0 0.024 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3 0.074 close(fd: 3 ) = 0 0.235 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 0.251 close(fd: 3 ) = 0 0.285 nanosleep(rqtp: 0x7ffd68e6d620, rmtp: 0 ) = 0 1000.386 close(fd: 1 ) = 0 1000.395 close(fd: 2 ) = 0 # # perf config trace.show_timestamp=no # trace -e open*,close,*sleep sleep 1 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 close(fd: 3 ) = 0 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3 close(fd: 3 ) = 0 openat(dfd: CWD, filename: , flags: CLOEXEC ) = 3 close(fd: 3 ) = 0 nanosleep(rqtp: 0x7fffa79c38e0, rmtp: 0 ) = 0 close(fd: 1 ) = 0 close(fd: 2 ) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-mjjnicy48367jah6ls4k0nk8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow configuring default for perf_event_attr.inheritArnaldo Carvalho de Melo2018-12-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I.f. if children should inherit the parent perf_event configuration, i.e. if we should trace children as well or just the parent. The default is to follow children, to disable this and have a behaviour similar to strace, set this config option or use the --no_inherit 'perf trace' option. E.g.: Default: # perf config trace.no_inherit # trace -e clone,*sleep time sleep 1 0.000 time/21107 clone(clone_flags: CHILD_CLEARTID|CHILD_SETTID|0x11, newsp: 0, child_tidptr: 0x7f7b8f9ae810) = 21108 (time) ? time/21108 ... [continued]: clone() 0.691 sleep/21108 nanosleep(rqtp: 0x7ffed01d0540, rmtp: 0 ) = 0 0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 1988maxresident)k 0inputs+0outputs (0major+76minor)pagefaults 0swaps # Disable it: # trace -e clone,*sleep time sleep 1 0.000 clone(clone_flags: CHILD_CLEARTID|CHILD_SETTID|0x11, newsp: 0, child_tidptr: 0x7ff41e100810) = 21414 (time) 0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 1964maxresident)k 0inputs+0outputs (0major+76minor)pagefaults 0swaps # Notice that since there is just one thread, the "comm/TID" column is suppressed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-thd8s16pagyza71ufi5vjlan@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow configuring if the syscall duration should be printedArnaldo Carvalho de Melo2018-12-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | # perf config trace.show_duration=no # perf config -l | grep trace trace.default_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o trace.show_zeros=true trace.show_duration=no # trace -e *sleep sleep 1 0.000 sleep/8729 nanosleep(rqtp: 0x7ffcb0b4c940, rmtp: 0) = 0 # perf config trace.show_duration=yes # trace -e *sleep sleep 1 0.000 (1000.212 ms): sleep/8735 nanosleep(rqtp: 0x7ffca15fa770, rmtp: 0) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2c7h1m8fhzb9puxtj9nlevi8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow configuring if zeroed syscall args should be printedArnaldo Carvalho de Melo2018-12-181-0/+2
| | | | | | | | | | | | | | | | The default so far, since we show argument names followed by its values, was to make the output more compact by suppressing most zeroed args. Make this configurable so that users can choose what best suit their needs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-q0gxws02ygodh94o0hzim5xd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow specifying a set of events to add in perfconfigArnaldo Carvalho de Melo2018-12-181-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | To add augmented_raw_syscalls to the events speficied by the user, or be the only one if no events were specified by the user, one can add this to perfconfig: # cat ~/.perfconfig [trace] add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o # I.e. pre-compile the augmented_raw_syscalls.c BPF program and make it always load, this way: # perf trace -e open* cat /etc/passwd > /dev/null 0.000 ( 0.013 ms): cat/31557 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 0.035 ( 0.007 ms): cat/31557 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3 0.353 ( 0.009 ms): cat/31557 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 0.424 ( 0.006 ms): cat/31557 openat(dfd: CWD, filename: /etc/passwd) = 3 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-0lgj7vh64hg3ce44gsmvj7ud@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Allow selecting use the use of the ordered_events codeArnaldo Carvalho de Melo2018-12-181-0/+6
| | | | | | | | | | | | | | | | | | | | | | I was trigger happy on this one, as using ordered_events as implemented by Jiri for use with the --block code under discussion on lkml incurs in delaying processing to form batches that then get ordered and then printed. With 'perf trace' we want to process the events as they go, without that delay, and doing it that way works well for the common case which is to trace a thread or a workload started by 'perf trace'. So revert back to not using ordered_events but add an option to select that mode so that users can experiment with their particular use case to see if works better, i.e. if the added delay is not a problem and the ordering helps. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8ki7sld6rusnjhhtaly26i5o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Support 'srccode' outputAndi Kleen2018-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When looking at PT or brstackinsn traces with 'perf script' it can be very useful to see the source code. This adds a simple facility to print them with 'perf script', if the information is available through dwarf % perf record ... % perf script -F insn,ip,sym,srccode ... 4004c6 main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004c6 main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004cd main 5 for (i = 0; i < 10000000; i++) 4004b3 main 6 v++; % perf record -b ... % perf script -F insn,ip,sym,srccode,brstackinsn ... main+22: 0000000000400543 insn: e8 ca ff ff ff # PRED |18 f1(); f1: 0000000000400512 insn: 55 |10 { 0000000000400513 insn: 48 89 e5 0000000000400516 insn: b8 00 00 00 00 |11 f2(); 000000000040051b insn: e8 d6 ff ff ff # PRED f2: 00000000004004f6 insn: 55 |5 { 00000000004004f7 insn: 48 89 e5 00000000004004fa insn: 8b 05 2c 0b 20 00 |6 c = a / b; 0000000000400500 insn: 8b 0d 2a 0b 20 00 0000000000400506 insn: 99 0000000000400507 insn: f7 f9 0000000000400509 insn: 89 05 29 0b 20 00 000000000040050f insn: 90 |7 } 0000000000400510 insn: 5d 0000000000400511 insn: c3 # PRED f1+14: 0000000000400520 insn: b8 00 00 00 00 |12 f2(); 0000000000400525 insn: e8 cc ff ff ff # PRED f2: 00000000004004f6 insn: 55 |5 { 00000000004004f7 insn: 48 89 e5 00000000004004fa insn: 8b 05 2c 0b 20 00 |6 c = a / b; Not supported for callchains currently, would need some layout changes there. Committer notes: Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this warning: In file included from util/srccode.c:19:0: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> ^~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Allow specifying proc-map-timeout in config fileMark Drayton2018-12-171-0/+6
| | | | | | | | | | | | | | | | | | | | The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <mbd@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools Documentation: Fix diverse typosIngo Molnar2018-12-173-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry picking and/or backporting, split this into multiple patches. In this particular case, it affects documentation, so may be interesting to cherry pick as it is information that is presented to the user. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Documentation average IPC and IPC coverageJin Yao2018-12-171-0/+8
| | | | | | | | | | | | | | | | | | | Add explanations for new columns "IPC" and "IPC coverage" in perf documentation. v5: --- Update the description according to Ingo's comments. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1543586097-27632-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Extend trace writing to multi AIOAlexey Budankov2018-12-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Enable asynchronous trace writingAlexey Budankov2018-12-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | | The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Allow passing a kallsyms fileArnaldo Carvalho de Melo2018-12-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | This basically replicates what was done for 'perf report' in: b226a5a72901 ("perf report: Allow user to specify path to kallsyms file") This should help with resolving eBPF symbols, that are in kallsyms but, of course, not in vmlinux. Reported-by: Ivan Babrou <ibobrik@gmail.com> Tested-by: Ivan Babrou <ibobrik@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-x52mx1ybq8128rtg9hjrj5qk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf bench: Add epoll_ctl(2) benchmarkDavidlohr Bueso2018-11-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Benchmark the various operations allowed for epoll_ctl(2). The idea is to concurrently stress a single epoll instance doing add/mod/del operations. Committer testing: # perf bench epoll ctl # Running 'epoll/ctl' benchmark: Run summary [PID 20344]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs. [thread 0] fdmap: 0x21a46b0 ... 0x21a47ac [ add: 1680960 ops; mod: 1680960 ops; del: 1680960 ops ] [thread 1] fdmap: 0x21a4960 ... 0x21a4a5c [ add: 1685440 ops; mod: 1685440 ops; del: 1685440 ops ] [thread 2] fdmap: 0x21a4c10 ... 0x21a4d0c [ add: 1674368 ops; mod: 1674368 ops; del: 1674368 ops ] [thread 3] fdmap: 0x21a4ec0 ... 0x21a4fbc [ add: 1677568 ops; mod: 1677568 ops; del: 1677568 ops ] Averaged 1679584 ADD operations (+- 0.14%) Averaged 1679584 MOD operations (+- 0.14%) Averaged 1679584 DEL operations (+- 0.14%) # Lets measure those calls with 'perf trace' to get a glympse at what this benchmark is doing in terms of syscalls: # perf trace -m32768 -s perf bench epoll ctl # Running 'epoll/ctl' benchmark: Run summary [PID 20405]: 4 threads doing epoll_ctl ops 64 file-descriptors for 8 secs. [thread 0] fdmap: 0x21764e0 ... 0x21765dc [ add: 1100480 ops; mod: 1100480 ops; del: 1100480 ops ] [thread 1] fdmap: 0x2176790 ... 0x217688c [ add: 1250176 ops; mod: 1250176 ops; del: 1250176 ops ] [thread 2] fdmap: 0x2176a40 ... 0x2176b3c [ add: 1022464 ops; mod: 1022464 ops; del: 1022464 ops ] [thread 3] fdmap: 0x2176cf0 ... 0x2176dec [ add: 705472 ops; mod: 705472 ops; del: 705472 ops ] Averaged 1019648 ADD operations (+- 11.27%) Averaged 1019648 MOD operations (+- 11.27%) Averaged 1019648 DEL operations (+- 11.27%) Summary of events: epoll-ctl (20405), 1264 events, 0.0% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ eventfd2 256 9.514 0.001 0.037 5.243 68.00% clone 4 1.245 0.204 0.311 0.531 24.13% mprotect 66 0.345 0.002 0.005 0.021 7.43% openat 45 0.313 0.004 0.007 0.073 21.93% mmap 88 0.302 0.002 0.003 0.013 5.02% futex 4 0.160 0.002 0.040 0.140 83.43% sched_setaffinity 4 0.124 0.005 0.031 0.070 49.39% read 44 0.103 0.001 0.002 0.013 15.54% fstat 40 0.052 0.001 0.001 0.003 5.43% close 39 0.039 0.001 0.001 0.001 1.48% stat 9 0.034 0.003 0.004 0.006 7.30% access 3 0.023 0.007 0.008 0.008 4.25% open 2 0.021 0.008 0.011 0.013 22.60% getdents 4 0.019 0.001 0.005 0.009 37.15% write 2 0.013 0.004 0.007 0.009 38.48% munmap 1 0.010 0.010 0.010 0.010 0.00% brk 3 0.006 0.001 0.002 0.003 26.34% rt_sigprocmask 2 0.004 0.001 0.002 0.003 43.95% rt_sigaction 3 0.004 0.001 0.001 0.002 16.07% prlimit64 3 0.004 0.001 0.001 0.001 5.39% prctl 1 0.003 0.003 0.003 0.003 0.00% epoll_create 1 0.003 0.003 0.003 0.003 0.00% lseek 2 0.002 0.001 0.001 0.001 11.42% sched_getaffinity 1 0.002 0.002 0.002 0.002 0.00% arch_prctl 1 0.002 0.002 0.002 0.002 0.00% set_tid_address 1 0.001 0.001 0.001 0.001 0.00% getpid 1 0.001 0.001 0.001 0.001 0.00% set_robust_list 1 0.001 0.001 0.001 0.001 0.00% execve 1 0.000 0.000 0.000 0.000 0.00% epoll-ctl (20406), 1245480 events, 14.6% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ epoll_ctl 619511 1034.927 0.001 0.002 6.691 0.67% nanosleep 3226 616.114 0.006 0.191 10.376 7.57% futex 2 11.336 0.002 5.668 11.334 99.97% set_robust_list 1 0.001 0.001 0.001 0.001 0.00% clone 1 0.000 0.000 0.000 0.000 0.00% epoll-ctl (20407), 1243151 events, 14.5% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ epoll_ctl 618350 1042.181 0.001 0.002 2.512 0.40% nanosleep 3220 366.261 0.012 0.114 18.162 9.59% futex 4 5.463 0.001 1.366 5.427 99.12% set_robust_list 1 0.002 0.002 0.002 0.002 0.00% epoll-ctl (20408), 1801690 events, 21.1% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ epoll_ctl 896174 1540.581 0.001 0.002 6.987 0.74% nanosleep 4667 783.393 0.006 0.168 10.419 7.10% futex 2 4.682 0.002 2.341 4.681 99.93% set_robust_list 1 0.002 0.002 0.002 0.002 0.00% clone 1 0.000 0.000 0.000 0.000 0.00% epoll-ctl (20409), 4254890 events, 49.8% syscall calls total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- --------- --------- --------- --------- ------ epoll_ctl 2116416 3768.097 0.001 0.002 9.956 0.41% nanosleep 11023 1141.778 0.006 0.104 9.447 4.95% futex 3 0.037 0.002 0.012 0.029 70.50% set_robust_list 1 0.008 0.008 0.008 0.008 0.00% madvise 1 0.005 0.005 0.005 0.005 0.00% clone 1 0.000 0.000 0.000 0.000 0.00% # Committer notes: Fix build on fedora:24-x-ARC-uClibc, debian:experimental-x-mips, debian:experimental-x-mipsel, ubuntu:16.04-x-arm and ubuntu:16.04-x-powerpc CC /tmp/build/perf/bench/epoll-ctl.o bench/epoll-ctl.c: In function 'init_fdmaps': bench/epoll-ctl.c:214:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] for (i = 0; i < nfds; i+=inc) { ^ bench/epoll-ctl.c: In function 'bench_epoll_ctl': bench/epoll-ctl.c:377:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] for (i = 0; i < nthreads; i++) { ^ bench/epoll-ctl.c:388:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] for (i = 0; i < nthreads; i++) { ^ cc1: all warnings being treated as errors Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Link: http://lkml.kernel.org/r/20181106152226.20883-3-dave@stgolabs.net [ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ] [ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf bench: Add epoll parallel epoll_wait benchmarkDavidlohr Bueso2018-11-211-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This program benchmarks concurrent epoll_wait(2) for file descriptors that are monitored with with EPOLLIN along various semantics, by a single epoll instance. Such conditions can be found when using single/combined or multiple queuing when load balancing. Each thread has a number of private, nonblocking file descriptors, referred to as fdmap. A writer thread will constantly be writing to the fdmaps of all threads, minimizing each threads's chances of epoll_wait not finding any ready read events and blocking as this is not what we want to stress. Full details in the start of the C file. Committer testing: # perf bench Usage: perf bench [<common options>] <collection> <benchmark> [<options>] # List of all available benchmark collections: sched: Scheduler and IPC benchmarks mem: Memory access benchmarks numa: NUMA scheduling and MM benchmarks futex: Futex stressing benchmarks epoll: Epoll stressing benchmarks all: All benchmarks # perf bench epoll # List of available benchmarks for collection 'epoll': wait: Benchmark epoll concurrent epoll_waits all: Run all futex benchmarks # perf bench epoll wait # Running 'epoll/wait' benchmark: Run summary [PID 19295]: 3 threads monitoring on 64 file-descriptors for 8 secs. [thread 0] fdmap: 0xdaa650 ... 0xdaa74c [ 328241 ops/sec ] [thread 1] fdmap: 0xdaa900 ... 0xdaa9fc [ 351695 ops/sec ] [thread 2] fdmap: 0xdaabb0 ... 0xdaacac [ 381423 ops/sec ] Averaged 353786 operations/sec (+- 4.35%), total secs = 8 # Committer notes: Fix the build on debian:experimental-x-mips, debian:experimental-x-mipsel and others: CC /tmp/build/perf/bench/epoll-wait.o bench/epoll-wait.c: In function 'writerfn': bench/epoll-wait.c:399:12: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' {aka 'unsigned int'} [-Werror=format=] printinfo("exiting writer-thread (total full-loops: %ld)\n", iter); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~ bench/epoll-wait.c:86:31: note: in definition of macro 'printinfo' do { if (__verbose) { printf(fmt, ## arg); fflush(stdout); } } while (0) ^~~ cc1: all warnings being treated as errors Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> <jbaron@akamai.com> Link: http://lkml.kernel.org/r/20181106152226.20883-2-dave@stgolabs.net Link: http://lkml.kernel.org/r/20181106182349.thdkpvshkna5vd7o@linux-r8p5> [ Applied above fixup as per Davidlohr's request ] [ Use inttypes.h to print rlim_t fields, fixing the build on Alpine Linux / musl libc ] [ Check if eventfd() is available, i.e. if HAVE_EVENTFD is defined ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Support weak groupsAndi Kleen2018-11-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | Implement a weak group fallback for 'perf record', similar to the existing 'perf stat' support. This allows to use groups that might be longer than the available counters without failing. Before: $ perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}' -a sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles). /bin/dmesg | grep -i perf may provide additional information. After: $ ./perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W' -a sleep 1 WARNING: No sample_id_all support, falling back to unordered processing [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 8.136 MB perf.data (134069 samples) ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181001195927.14211-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Do not use overwrite mode by defaultArnaldo Carvalho de Melo2018-10-311-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling --overwrite mode allows us to to use just the most recent records, which helps in high core count machines such as Knights Landing/Mill, but right now is being disabled by default as the pausing used in this technique is leading to loss of metadata events such as PERF_RECORD_MMAP which makes 'perf top' unable to resolve samples, leading to lots of unknown samples appearing on the UI. Enabling this may be useful if you are in such machines and profiling a workload that doesn't creates short lived threads and/or doesn't uses many executable mmap operations. Work is being planed to solve this situation, till then, this will remain disabled by default. Reported-by: David Miller <davem@davemloft.net> Acked-by: Kan Liang <kan.liang@intel.com> Link: https://lkml.kernel.org/r/4f84468f-37d9-cf1b-12c1-514ef74b6a48@linux.intel.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: https://lkml.kernel.org/n/tip-ehvf77vi1si9409r7p4wx788@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf top: Allow disabling the overwrite modeArnaldo Carvalho de Melo2018-10-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ebebbf082357 ("perf top: Switch default mode to overwrite mode") we forgot to leave a way to disable that new default, add a --overwrite option that can be disabled using --no-overwrite, since the code already in such a way that we can readily disable this mode. This is useful when investigating bugs with this mode like the recent report from David Miller where lots of unknown symbols appear due to disabling the events while processing them which disables all record types, not just PERF_RECORD_SAMPLE, which makes it impossible to resolve maps when we lose PERF_RECORD_MMAP records. This can be easily seen while building a kernel, when there are lots of short lived processes. Reported-by: David Miller <davem@davemloft.net> Acked-by: Kan Liang <kan.liang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ebebbf082357 ("perf top: Switch default mode to overwrite mode") Link: https://lkml.kernel.org/n/tip-oqgsz2bq4kgrnnajrafcdhie@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Implement --graph-functionAndi Kleen2018-10-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a ftrace style --graph-function argument to 'perf script' that allows to print itrace function calls only below a given function. This makes it easier to find the code of interest in a large trace. % perf record -e intel_pt//k -a sleep 1 % perf script --graph-function group_sched_in --call-trace perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107 perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71 perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__ perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) group_sched_in swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) event_sched_in.isra.107 swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_set_state.part.71 swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_time swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_pmu_disable swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_log_itrace_start swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_userpage swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) calc_timer_values swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_cpu swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) arch_perf_update_userpage swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __fentry__ swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) using_native_sched_clock swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_stable Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* tools script: Add --call-trace and --call-ret-traceAndi Kleen2018-10-241-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add short cut options to print PT call trace and call-ret-trace, for calls and call and returns. Roughly corresponds to ftrace function tracer and function graph tracer. Just makes these common use cases nicer to use. % perf record -a -e intel_pt// sleep 1 % perf script --call-trace perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107 perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71 perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage % perf script --call-ret-trace perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107 perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_txn perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_sched_in.isra.107 perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_event_set_state.part.71 Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Link: http://lkml.kernel.org/r/20180920180540.14039-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Make itrace script default to all callsAndi Kleen2018-10-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | By default 'perf script' for itrace outputs sampled instructions or branches. In my experience this is confusing to users because it's hard to correlate with real program behavior. The sampling makes sense for tools like 'perf report' that actually sample to reduce the run time, but run time is normally not a problem for 'perf script'. It's better to give an accurate representation of the program flow. Default 'perf script' to output all calls for itrace. That's a much saner default. The old behavior can be still requested with 'perf script' --itrace=ibxwpe100000 v2: Fix ETM build failure v3: Really fix ETM build failure (Kim Phillips) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Link: http://lkml.kernel.org/r/20180920180540.14039-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Add --insn-trace for instruction decodingAndi Kleen2018-10-242-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a --insn-trace short hand option for decoding and disassembling instruction streams for intel_pt. This automatically pipes the output into the xed disassembler to generate disassembled instructions. This just makes this use model much nicer to use. Before % perf record -e intel_pt// ... % perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64 swapper 0 [000] 17276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1) swapper 0 [000] 17276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp swapper 0 [000] 17276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx swapper 0 [000] 17276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp swapper 0 [000] 17276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12 swapper 0 [000] 17276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13 swapper 0 [000] 17276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14 swapper 0 [000] 17276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15 swapper 0 [000] 17276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq swapper 0 [000] 17276.429606186: ffffffff8101063e pt_event_add ([kernel.kallsyms]) cmpl $0x1, 0x1b0(%rbx) swapper 0 [000] 17276.429606186: ffffffff81010645 pt_event_add ([kernel.kallsyms]) mov $0xffffffea, %eax swapper 0 [000] 17276.429606186: ffffffff8101064a pt_event_add ([kernel.kallsyms]) mov $0x0, %edx swapper 0 [000] 17276.429606186: ffffffff8101064f pt_event_add ([kernel.kallsyms]) popq %rbx swapper 0 [000] 17276.429606186: ffffffff81010650 pt_event_add ([kernel.kallsyms]) cmovnz %edx, %eax swapper 0 [000] 17276.429606186: ffffffff81010653 pt_event_add ([kernel.kallsyms]) jmp 0xffffffff81010635 swapper 0 [000] 17276.429606186: ffffffff81010635 pt_event_add ([kernel.kallsyms]) retq swapper 0 [000] 17276.429606186: ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms]) test %eax, %eax Now: % perf record -e intel_pt// ... % perf script --insn-trace --xed ... same output ... XED needs to be installed with: $ git clone https://github.com/intelxed/mbuild.git mbuild $ git clone https://github.com/intelxed/xed $ cd xed $ ./mfile.py $ ./mfile.py examples $ sudo ./mfile.py --prefix=/usr/local install $ sudo cp obj/examples/xed /usr/local/bin $ xed | head -3 ERROR: required argument(s) were missing Copyright (C) 2017, Intel Corporation. All rights reserved. XED version: [v10.0-328-g7d62c8c49b7b] $ Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180920180540.14039-2-andi@firstfloor.org [ Fixed up whitespace damage, added the 'mfile.py examples + cp obj/examples/xed ... ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf scripts python: call-graph-from-sql.py: Rename to exported-sql-viewer.pyAdrian Hunter2018-10-231-1/+1
| | | | | | | | | | | Additional reports will be added to the script so rename to reflect the more general purpose. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181001062853.28285-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Introduce per-event maximum number of events propertyArnaldo Carvalho de Melo2018-10-221-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Call it 'nr', as in this context it should be expressive enough, i.e.: # perf trace -e sched:*waking/nr=8,call-graph=fp/ 0.000 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001 try_to_wake_up ([kernel.kallsyms]) sched_clock ([kernel.kallsyms]) 3.933 :0/0 sched:sched_waking:comm=rcu_sched pid=10 prio=120 target_cpu=001 try_to_wake_up ([kernel.kallsyms]) sched_clock ([kernel.kallsyms]) 3.970 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003 try_to_wake_up ([kernel.kallsyms]) __libc_write (/usr/lib64/libpthread-2.26.so) 20.069 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003 try_to_wake_up ([kernel.kallsyms]) __libc_write (/usr/lib64/libpthread-2.26.so) 37.170 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003 try_to_wake_up ([kernel.kallsyms]) __libc_write (/usr/lib64/libpthread-2.26.so) 53.267 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003 try_to_wake_up ([kernel.kallsyms]) __libc_write (/usr/lib64/libpthread-2.26.so) 70.365 IPDL Backgroun/3622 sched:sched_waking:comm=Gecko_IOThread pid=3569 prio=120 target_cpu=003 try_to_wake_up ([kernel.kallsyms]) __libc_write (/usr/lib64/libpthread-2.26.so) 75.781 Web Content/3649 sched:sched_waking:comm=JS Helper pid=3670 prio=120 target_cpu=000 try_to_wake_up ([kernel.kallsyms]) try_to_wake_up ([kernel.kallsyms]) wake_up_q ([kernel.kallsyms]) futex_wake ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) pthread_cond_signal@@GLIBC_2.3.2 (/usr/lib64/libpthread-2.26.so) # # perf trace -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/ 0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> trace:3367 [120] 0.046 :0/0 sched:sched_switch:swapper/1:0 [120] S ==> kworker/u16:58:2722 [120] 570.670 irq/50-iwlwifi/680 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff93498051ef00 len=66 __dev_queue_xmit ([kernel.kallsyms]) 1106.141 jbd2/dm-0-8/476 block:block_plug:[jbd2/dm-0-8] 1106.175 jbd2/dm-0-8/476 block:block_unplug:[jbd2/dm-0-8] 1 1618.088 kworker/u16:30/2694 block:block_plug:[kworker/u16:30] 1810.000 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051ef00 len=52 __dev_queue_xmit ([kernel.kallsyms]) 3857.974 :0/0 net:net_dev_queue:dev=vnet0 skbaddr=0xffff93498051f900 len=52 __dev_queue_xmit ([kernel.kallsyms]) 4790.277 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8] 4790.448 jbd2/dm-2-8/748 block:block_plug:[jbd2/dm-2-8] # The global --max-events has precendence: # trace --max-events 3 -e sched:*switch/nr=2/,block:*_plug/nr=4/,block:*_unplug/nr=1/,net:*dev_queue/nr=3,max-stack=16/ 0.000 :0/0 sched:sched_switch:swapper/0:0 [120] S ==> qemu-system-x86:2252 [120] 0.029 qemu-system-x8/2252 sched:sched_switch:qemu-system-x86:2252 [120] D ==> swapper/0:0 [120] 58.047 DNS Res~er #14/31661 net:net_dev_queue:dev=wlp3s0 skbaddr=0xffff9346966af100 len=84 __dev_queue_xmit ([kernel.kallsyms]) __libc_send (/usr/lib64/libpthread-2.26.so) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-s4jswltvh660ughvg9nwngah@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf trace: Introduce --max-eventsArnaldo Carvalho de Melo2018-10-191-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow stopping tracing after a number of events take place, considering strace-like syscalls formatting as one event per enter/exit pair or when in a multi-process tracing session a syscall is interrupted and printed ending with '...'. Examples included in the documentation: Trace the first 4 open, openat or open_by_handle_at syscalls (in the future more syscalls may match here): $ perf trace -e open* --max-events 4 [root@jouet perf]# trace -e open* --max-events 4 2272.992 ( 0.037 ms): gnome-shell/1370 openat(dfd: CWD, filename: /proc/self/stat) = 31 2277.481 ( 0.139 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65 3026.398 ( 0.076 ms): gnome-shell/3039 openat(dfd: CWD, filename: /proc/self/stat) = 65 4294.665 ( 0.015 ms): sed/15879 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 $ Trace the first minor page fault when running a workload: # perf trace -F min --max-stack=7 --max-events 1 sleep 1 0.000 ( 0.000 ms): sleep/18006 minfault [__clear_user+0x1a] => 0x5626efa56080 (?k) __clear_user ([kernel.kallsyms]) load_elf_binary ([kernel.kallsyms]) search_binary_handler ([kernel.kallsyms]) __do_execve_file.isra.33 ([kernel.kallsyms]) __x64_sys_execve ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) # Trace the next min page page fault to take place on the first CPU: # perf trace -F min --call-graph=dwarf --max-events 1 --cpu 0 0.000 ( 0.000 ms): Web Content/17136 minfault [js::gc::Chunk::fetchNextDecommittedArena+0x4b] => 0x7fbe6181b000 (?.) js::gc::FreeSpan::initAsEmpty (inlined) js::gc::Arena::setAsNotAllocated (inlined) js::gc::Chunk::fetchNextDecommittedArena (/usr/lib64/firefox/libxul.so) js::gc::Chunk::allocateArena (/usr/lib64/firefox/libxul.so) js::gc::GCRuntime::allocateArena (/usr/lib64/firefox/libxul.so) js::gc::ArenaLists::allocateFromArena (/usr/lib64/firefox/libxul.so) js::gc::GCRuntime::tryNewTenuredThing<JSString, (js::AllowGC)1> (inlined) js::AllocateString<JSString, (js::AllowGC)1> (/usr/lib64/firefox/libxul.so) js::Allocate<JSThinInlineString, (js::AllowGC)1> (inlined) JSThinInlineString::new_<(js::AllowGC)1> (inlined) AllocateInlineString<(js::AllowGC)1, unsigned char> (inlined) js::ConcatStrings<(js::AllowGC)1> (/usr/lib64/firefox/libxul.so) [0x18b26e6bc2bd] (/tmp/perf-17136.map) Tracing the next four ext4 operations on a specific CPU: # perf trace -e ext4:*/call-graph=fp/ --max-events 4 --cpu 3 0.000 mutt/3849 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 57277 lblk 0 ext4_es_lookup_extent ([kernel.kallsyms]) read (/usr/lib64/libc-2.26.so) 0.097 mutt/3849 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 57277 found 0 [0/0) 0 ext4_es_lookup_extent ([kernel.kallsyms]) read (/usr/lib64/libc-2.26.so) 0.141 mutt/3849 ext4:ext4_ext_map_blocks_enter:dev 253,2 ino 57277 lblk 0 len 1 flags ext4_ext_map_blocks ([kernel.kallsyms]) read (/usr/lib64/libc-2.26.so) 0.184 mutt/3849 ext4:ext4_ext_load_extent:dev 253,2 ino 57277 lblk 1516511 pblk 18446744071750013657 __read_extent_tree_block ([kernel.kallsyms]) __read_extent_tree_block ([kernel.kallsyms]) ext4_find_extent ([kernel.kallsyms]) ext4_ext_map_blocks ([kernel.kallsyms]) ext4_map_blocks ([kernel.kallsyms]) ext4_mpage_readpages ([kernel.kallsyms]) read_pages ([kernel.kallsyms]) __do_page_cache_readahead ([kernel.kallsyms]) ondemand_readahead ([kernel.kallsyms]) generic_file_read_iter ([kernel.kallsyms]) __vfs_read ([kernel.kallsyms]) vfs_read ([kernel.kallsyms]) ksys_read ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) read (/usr/lib64/libc-2.26.so) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rudá Moura <ruda.moura@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-sweh107bs7ol5bzls0m4tqdz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf Documentation: Fix out-of-tree asciidoctor man page generationBen Hutchings2018-09-181-1/+1
| | | | | | | | | | | | | | | | The dependency for the man page rule using asciidoctor incorrectly specifies a source file in $(OUTPUT). When building out-of-tree, the source file is not found, resulting in a fall-back to the following rule which uses xmlto. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180916151704.GF4765@decadent.org.uk Fixes: ffef80ecf89f ("perf Documentation: Support for asciidoctor") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Add --percent-type optionJiri Olsa2018-08-081-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Set annotation percent type from following choices: global-period, local-period, global-hits, local-hits With following report option setup the percent type will be passed to annotation browser: $ perf report --percent-type period-local The local/global keywords set if the percentage is computed in the scope of the function (local) or the whole data (global). The period/hits keywords set the base the percentage is computed on - the samples period or the number of samples (hits). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Add --percent-type optionJiri Olsa2018-08-081-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add --percent-type option to set annotation percent type from following choices: global-period, local-period, global-hits, local-hits Examples: $ perf annotate --percent-type period-local --stdio | head -1 Percent | Source code ... es, percent: local period) $ perf annotate --percent-type hits-local --stdio | head -1 Percent | Source code ... es, percent: local hits) $ perf annotate --percent-type hits-global --stdio | head -1 Percent | Source code ... es, percent: global hits) $ perf annotate --percent-type period-global --stdio | head -1 Percent | Source code ... es, percent: global period) The local/global keywords set if the percentage is computed in the scope of the function (local) or the whole data (global). The period/hits keywords set the base the percentage is computed on - the samples period or the number of samples (hits). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf list: Add missing documentation for --desc and --debug optionsSangwon Hong2018-07-241-1/+7
| | | | | | | | | | | | | Add missing documentation for --desc and --debug options to the 'perf list' man page. Signed-off-by: Sangwon Hong <qpakzk@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20180717110738.10779-1-qpakzk@gmail.com [ Clarify that --desc is by default active ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf, tools: Use correct articles in commentsTobias Tefke2018-07-162-3/+3
| | | | | | | | | | | | | | | | | | Some of the comments in the perf events code use articles incorrectly, using 'a' for words beginning with a vowel sound, where 'an' should be used. Signed-off-by: Tobias Tefke <tobias.tefke@tutanota.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: jolsa@redhat.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/20180709105715.22938-1-tobias.tefke@tutanota.com [ Fix a few more perf related 'a event' typo fixes from all around the kernel and tooling tree. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf stat: Add --interval-clear optionJiri Olsa2018-06-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Adding --interval-clear option to clear the screen before next interval. Committer testing: # perf stat -I 1000 --interval-clear And, as expected, it behaves almost like: # watch -n 0 perf stat -a sleep 1 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20180606221513.11302-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script python: Add dict fields introduction to DocumentationJin Yao2018-06-061-0/+26
| | | | | | | | | | | | | | | | | Add a brief introduction about fields to perf-script-python.txt. It should help python script developers in easily finding what fields are supported. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1527843663-32288-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf stat: Display user and system timeJiri Olsa2018-06-061-11/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding the support to read rusage data once the workload is finished and display the system/user time values: $ perf stat --null perf bench sched pipe ... Performance counter stats for 'perf bench sched pipe': 5.342599256 seconds time elapsed 2.544434000 seconds user 4.549691000 seconds sys It works only in non -r mode and only for workload target. So as of now, for workload targets, we display 3 types of timings. The time we meassure in perf stat from enable to disable+period: 5.342599256 seconds time elapsed The time spent in user and system lands, displayed only for workload session/target: 2.544434000 seconds user 4.549691000 seconds sys Those times are the very same displayed by 'time' tool. They are returned by wait4 call via the getrusage struct interface. Committer notes: Had to rename some variables to avoid this on older systems such as centos:6: builtin-stat.c: In function 'print_footer': builtin-stat.c:1831: warning: declaration of 'stime' shadows a global declaration /usr/include/time.h:297: warning: shadowed declaration is here Committer testing: # perf stat --null time perf bench sched pipe # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes Total time: 5.526 [sec] 5.526534 usecs/op 180945 ops/sec 1.00user 6.25system 0:05.52elapsed 131%CPU (0avgtext+0avgdata 8056maxresident)k 0inputs+0outputs (0major+606minor)pagefaults 0swaps Performance counter stats for 'time perf bench sched pipe': 5.530978744 seconds time elapsed 1.004037000 seconds user 6.259937000 seconds sys # Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180605121313.31337-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Enable arbitrary event names thru name= modifierAlexey Budankov2018-06-062-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable complex event names containing [.:=,] symbols to be encoded into Perf trace using name= modifier e.g. like this: perf record -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\ period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex Below is how it looks like in the report output. Please note explicit escaped quoting at cmdline string in the header so that thestring can be directly reused for another collection in shell: perf report --header # ======== ... # cmdline : /root/abudanko/kernel/tip/tools/perf/perf record -v -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex # event : name = OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM, , type = 4, size = 112, config = 0x100003c, { sample_period, sample_freq } = 3500000, sample_type = IP|TID|TIME, disabled = 1, inh ... # ======== # # # Total Lost Samples: 0 # # Samples: 24K of event 'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM' # Event count (approx.): 86492000000 # # Overhead Command Shared Object Symbol # ........ ....... ................ .............................................. # 14.75% futex [kernel.vmlinux] [k] __entry_trampoline_start ... perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex 10000000 process context switches in 16678890291ns (1667.9ns/ctxsw) Performance counter stats for './futex': 88,095,770,571 CPU_CLK_UNHALTED.THREAD:cmask=0x1 16.679542407 seconds time elapsed Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/c194b060-761d-0d50-3b21-bb4ed680002d@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo2018-06-041-1/+9
|\ | | | | | | | | | | To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Fix perf.data format description of NRCPUS headerArnaldo Carvalho de Melo2018-05-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the perf.data HEADER_CPUDESC feadure header we store first the number of available CPUs in the system, then the number of CPUs at the time of writing the header, not the other way around. Reported-by: Thomas-Mich Richter <tmricht@linux.ibm.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Lakshman Annadorai <lakshmana@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j7o92acm2vnxjv70y4o3swoc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>