diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-20 10:29:15 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-03-20 10:29:15 -0700 |
commit | 9c2b957db1772ebf942ae7a9346b14eba6c8ca66 (patch) | |
tree | 0dbb83e57260ea7fc0dc421f214d5f1b26262005 /tools/perf/Documentation | |
parent | 0bbfcaff9b2a69c71a95e6902253487ab30cb498 (diff) | |
parent | bea95c152dee1791dd02cbc708afbb115bb00f9a (diff) | |
download | linux-9c2b957db1772ebf942ae7a9346b14eba6c8ca66.tar.gz linux-9c2b957db1772ebf942ae7a9346b14eba6c8ca66.tar.bz2 linux-9c2b957db1772ebf942ae7a9346b14eba6c8ca66.zip |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)
This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.
The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:
$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf
This output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.
This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.
- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.
- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485
- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.
- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:
struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code
...
static_key_slow_inc();
...
static_key_slow_inc();
...
The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.
This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.
- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.
- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.
- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.
- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/Makefile | 86 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-lock.txt | 20 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 38 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 10 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-script.txt | 5 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 4 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-top.txt | 8 |
7 files changed, 127 insertions, 44 deletions
diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile index 4626a398836a..ca600e09c8d4 100644 --- a/tools/perf/Documentation/Makefile +++ b/tools/perf/Documentation/Makefile @@ -1,3 +1,10 @@ +OUTPUT := ./ +ifeq ("$(origin O)", "command line") + ifneq ($(O),) + OUTPUT := $(O)/ + endif +endif + MAN1_TXT= \ $(filter-out $(addsuffix .txt, $(ARTICLES) $(SP_ARTICLES)), \ $(wildcard perf-*.txt)) \ @@ -6,10 +13,11 @@ MAN5_TXT= MAN7_TXT= MAN_TXT = $(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT) -MAN_XML=$(patsubst %.txt,%.xml,$(MAN_TXT)) -MAN_HTML=$(patsubst %.txt,%.html,$(MAN_TXT)) +_MAN_XML=$(patsubst %.txt,%.xml,$(MAN_TXT)) +_MAN_HTML=$(patsubst %.txt,%.html,$(MAN_TXT)) -DOC_HTML=$(MAN_HTML) +MAN_XML=$(addprefix $(OUTPUT),$(_MAN_XML)) +MAN_HTML=$(addprefix $(OUTPUT),$(_MAN_HTML)) ARTICLES = # with their own formatting rules. @@ -18,11 +26,17 @@ API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technica SP_ARTICLES += $(API_DOCS) SP_ARTICLES += technical/api-index -DOC_HTML += $(patsubst %,%.html,$(ARTICLES) $(SP_ARTICLES)) +_DOC_HTML = $(_MAN_HTML) +_DOC_HTML+=$(patsubst %,%.html,$(ARTICLES) $(SP_ARTICLES)) +DOC_HTML=$(addprefix $(OUTPUT),$(_DOC_HTML)) -DOC_MAN1=$(patsubst %.txt,%.1,$(MAN1_TXT)) -DOC_MAN5=$(patsubst %.txt,%.5,$(MAN5_TXT)) -DOC_MAN7=$(patsubst %.txt,%.7,$(MAN7_TXT)) +_DOC_MAN1=$(patsubst %.txt,%.1,$(MAN1_TXT)) +_DOC_MAN5=$(patsubst %.txt,%.5,$(MAN5_TXT)) +_DOC_MAN7=$(patsubst %.txt,%.7,$(MAN7_TXT)) + +DOC_MAN1=$(addprefix $(OUTPUT),$(_DOC_MAN1)) +DOC_MAN5=$(addprefix $(OUTPUT),$(_DOC_MAN5)) +DOC_MAN7=$(addprefix $(OUTPUT),$(_DOC_MAN7)) # Make the path relative to DESTDIR, not prefix ifndef DESTDIR @@ -150,9 +164,9 @@ man1: $(DOC_MAN1) man5: $(DOC_MAN5) man7: $(DOC_MAN7) -info: perf.info perfman.info +info: $(OUTPUT)perf.info $(OUTPUT)perfman.info -pdf: user-manual.pdf +pdf: $(OUTPUT)user-manual.pdf install: install-man @@ -166,7 +180,7 @@ install-man: man install-info: info $(INSTALL) -d -m 755 $(DESTDIR)$(infodir) - $(INSTALL) -m 644 perf.info perfman.info $(DESTDIR)$(infodir) + $(INSTALL) -m 644 $(OUTPUT)perf.info $(OUTPUT)perfman.info $(DESTDIR)$(infodir) if test -r $(DESTDIR)$(infodir)/dir; then \ $(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perf.info ;\ $(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perfman.info ;\ @@ -176,7 +190,7 @@ install-info: info install-pdf: pdf $(INSTALL) -d -m 755 $(DESTDIR)$(pdfdir) - $(INSTALL) -m 644 user-manual.pdf $(DESTDIR)$(pdfdir) + $(INSTALL) -m 644 $(OUTPUT)user-manual.pdf $(DESTDIR)$(pdfdir) #install-html: html # '$(SHELL_PATH_SQ)' ./install-webdoc.sh $(DESTDIR)$(htmldir) @@ -189,14 +203,14 @@ install-pdf: pdf # # Determine "include::" file references in asciidoc files. # -doc.dep : $(wildcard *.txt) build-docdep.perl +$(OUTPUT)doc.dep : $(wildcard *.txt) build-docdep.perl $(QUIET_GEN)$(RM) $@+ $@ && \ $(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \ mv $@+ $@ --include doc.dep +-include $(OUPTUT)doc.dep -cmds_txt = cmds-ancillaryinterrogators.txt \ +_cmds_txt = cmds-ancillaryinterrogators.txt \ cmds-ancillarymanipulators.txt \ cmds-mainporcelain.txt \ cmds-plumbinginterrogators.txt \ @@ -205,32 +219,36 @@ cmds_txt = cmds-ancillaryinterrogators.txt \ cmds-synchelpers.txt \ cmds-purehelpers.txt \ cmds-foreignscminterface.txt +cmds_txt=$(addprefix $(OUTPUT),$(_cmds_txt)) -$(cmds_txt): cmd-list.made +$(cmds_txt): $(OUTPUT)cmd-list.made -cmd-list.made: cmd-list.perl ../command-list.txt $(MAN1_TXT) +$(OUTPUT)cmd-list.made: cmd-list.perl ../command-list.txt $(MAN1_TXT) $(QUIET_GEN)$(RM) $@ && \ $(PERL_PATH) ./cmd-list.perl ../command-list.txt $(QUIET_STDERR) && \ date >$@ clean: - $(RM) *.xml *.xml+ *.html *.html+ *.1 *.5 *.7 - $(RM) *.texi *.texi+ *.texi++ perf.info perfman.info - $(RM) howto-index.txt howto/*.html doc.dep - $(RM) technical/api-*.html technical/api-index.txt - $(RM) $(cmds_txt) *.made - -$(MAN_HTML): %.html : %.txt + $(RM) $(MAN_XML) $(addsuffix +,$(MAN_XML)) + $(RM) $(MAN_HTML) $(addsuffix +,$(MAN_HTML)) + $(RM) $(DOC_HTML) $(DOC_MAN1) $(DOC_MAN5) $(DOC_MAN7) + $(RM) $(OUTPUT)*.texi $(OUTPUT)*.texi+ $(OUTPUT)*.texi++ + $(RM) $(OUTPUT)perf.info $(OUTPUT)perfman.info + $(RM) $(OUTPUT)howto-index.txt $(OUTPUT)howto/*.html $(OUTPUT)doc.dep + $(RM) $(OUTPUT)technical/api-*.html $(OUTPUT)technical/api-index.txt + $(RM) $(cmds_txt) $(OUTPUT)*.made + +$(MAN_HTML): $(OUTPUT)%.html : %.txt $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ $(ASCIIDOC) -b xhtml11 -d manpage -f asciidoc.conf \ $(ASCIIDOC_EXTRA) -aperf_version=$(PERF_VERSION) -o $@+ $< && \ mv $@+ $@ -%.1 %.5 %.7 : %.xml +$(OUTPUT)%.1 $(OUTPUT)%.5 $(OUTPUT)%.7 : $(OUTPUT)%.xml $(QUIET_XMLTO)$(RM) $@ && \ - xmlto -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $< + xmlto -o $(OUTPUT) -m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $< -%.xml : %.txt +$(OUTPUT)%.xml : %.txt $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ $(ASCIIDOC) -b docbook -d manpage -f asciidoc.conf \ $(ASCIIDOC_EXTRA) -aperf_version=$(PERF_VERSION) -o $@+ $< && \ @@ -239,25 +257,25 @@ $(MAN_HTML): %.html : %.txt XSLT = docbook.xsl XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css -user-manual.html: user-manual.xml +$(OUTPUT)user-manual.html: $(OUTPUT)user-manual.xml $(QUIET_XSLTPROC)xsltproc $(XSLTOPTS) -o $@ $(XSLT) $< -perf.info: user-manual.texi - $(QUIET_MAKEINFO)$(MAKEINFO) --no-split -o $@ user-manual.texi +$(OUTPUT)perf.info: $(OUTPUT)user-manual.texi + $(QUIET_MAKEINFO)$(MAKEINFO) --no-split -o $@ $(OUTPUT)user-manual.texi -user-manual.texi: user-manual.xml +$(OUTPUT)user-manual.texi: $(OUTPUT)user-manual.xml $(QUIET_DB2TEXI)$(RM) $@+ $@ && \ - $(DOCBOOK2X_TEXI) user-manual.xml --encoding=UTF-8 --to-stdout >$@++ && \ + $(DOCBOOK2X_TEXI) $(OUTPUT)user-manual.xml --encoding=UTF-8 --to-stdout >$@++ && \ $(PERL_PATH) fix-texi.perl <$@++ >$@+ && \ rm $@++ && \ mv $@+ $@ -user-manual.pdf: user-manual.xml +$(OUTPUT)user-manual.pdf: $(OUTPUT)user-manual.xml $(QUIET_DBLATEX)$(RM) $@+ $@ && \ $(DBLATEX) -o $@+ -p /etc/asciidoc/dblatex/asciidoc-dblatex.xsl -s /etc/asciidoc/dblatex/asciidoc-dblatex.sty $< && \ mv $@+ $@ -perfman.texi: $(MAN_XML) cat-texi.perl +$(OUTPUT)perfman.texi: $(MAN_XML) cat-texi.perl $(QUIET_DB2TEXI)$(RM) $@+ $@ && \ ($(foreach xml,$(MAN_XML),$(DOCBOOK2X_TEXI) --encoding=UTF-8 \ --to-stdout $(xml) &&) true) > $@++ && \ @@ -265,7 +283,7 @@ perfman.texi: $(MAN_XML) cat-texi.perl rm $@++ && \ mv $@+ $@ -perfman.info: perfman.texi +$(OUTPUT)perfman.info: $(OUTPUT)perfman.texi $(QUIET_MAKEINFO)$(MAKEINFO) --no-split --no-validate $*.texi $(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt index d6b2a4f2108b..c7f5f55634ac 100644 --- a/tools/perf/Documentation/perf-lock.txt +++ b/tools/perf/Documentation/perf-lock.txt @@ -8,7 +8,7 @@ perf-lock - Analyze lock events SYNOPSIS -------- [verse] -'perf lock' {record|report|trace} +'perf lock' {record|report|script|info} DESCRIPTION ----------- @@ -20,10 +20,13 @@ and statistics with this 'perf lock' command. produces the file "perf.data" which contains tracing results of lock events. - 'perf lock trace' shows raw lock events. - 'perf lock report' reports statistical data. + 'perf lock script' shows raw lock events. + + 'perf lock info' shows metadata like threads or addresses + of lock instances. + COMMON OPTIONS -------------- @@ -47,6 +50,17 @@ REPORT OPTIONS Sorting key. Possible values: acquired (default), contended, wait_total, wait_max, wait_min. +INFO OPTIONS +------------ + +-t:: +--threads:: + dump thread list in perf.data + +-m:: +--map:: + dump map of lock instances (address:name table) + SEE ALSO -------- linkperf:perf[1] diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 2937f7e14bb7..a1386b2fff00 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -52,11 +52,15 @@ OPTIONS -p:: --pid=:: - Record events on existing process ID. + Record events on existing process ID (comma separated list). -t:: --tid=:: - Record events on existing thread ID. + Record events on existing thread ID (comma separated list). + +-u:: +--uid=:: + Record events in threads owned by uid. Name or number. -r:: --realtime=:: @@ -148,6 +152,36 @@ an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must ha corresponding events, i.e., they always refer to events defined earlier on the command line. +-b:: +--branch-any:: +Enable taken branch stack sampling. Any type of taken branch may be sampled. +This is a shortcut for --branch-filter any. See --branch-filter for more infos. + +-j:: +--branch-filter:: +Enable taken branch stack sampling. Each sample captures a series of consecutive +taken branches. The number of branches captured with each sample depends on the +underlying hardware, the type of branches of interest, and the executed code. +It is possible to select the types of branches captured by enabling filters. The +following filters are defined: + + - any: any type of branches + - any_call: any function call or system call + - any_ret: any function return or system call return + - any_ind: any indirect branch + - u: only when the branch target is at the user level + - k: only when the branch target is in the kernel + - hv: only when the target is at the hypervisor level + ++ +The option requires at least one branch type among any, any_call, any_ret, ind_call. +The privilege levels may be ommitted, in which case, the privilege levels of the associated +event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege +levels are subject to permissions. When sampling on multiple events, branch stack sampling +is enabled for all the sampling events. The sampled branch type is the same for all events. +The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k +Note that this feature may not be available on all processors. + SEE ALSO -------- linkperf:perf-stat[1], linkperf:perf-list[1] diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index 9b430e98712e..87feeee8b90c 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -153,6 +153,16 @@ OPTIONS information which may be very large and thus may clutter the display. It currently includes: cpu and numa topology of the host system. +-b:: +--branch-stack:: + Use the addresses of sampled taken branches instead of the instruction + address to build the histograms. To generate meaningful output, the + perf.data file must have been obtained using perf record -b or + perf record --branch-filter xxx where xxx is a branch filter option. + perf report is able to auto-detect whether a perf.data file contains + branch stacks and it will automatically switch to the branch view mode, + unless --no-branch-stack is used. + SEE ALSO -------- linkperf:perf-stat[1], linkperf:perf-annotate[1] diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index 2f6cef43da25..e9cbfcddfa3f 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -115,7 +115,7 @@ OPTIONS -f:: --fields:: Comma separated list of fields to print. Options are: - comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr. + comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff. Field list can be prepended with the type, trace, sw or hw, to indicate to which event type the field list applies. e.g., -f sw:comm,tid,time,ip,sym and -f trace:time,cpu,trace @@ -200,6 +200,9 @@ OPTIONS It currently includes: cpu and numa topology of the host system. It can only be used with the perf script report mode. +--show-kernel-path:: + Try to resolve the path of [kernel.kallsyms] + SEE ALSO -------- linkperf:perf-record[1], linkperf:perf-script-perl[1], diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 8966b9ab2014..2fa173b51970 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -35,11 +35,11 @@ OPTIONS child tasks do not inherit counters -p:: --pid=<pid>:: - stat events on existing process id + stat events on existing process id (comma separated list) -t:: --tid=<tid>:: - stat events on existing thread id + stat events on existing thread id (comma separated list) -a:: diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt index b1a5bbbfebef..4a5680cb242e 100644 --- a/tools/perf/Documentation/perf-top.txt +++ b/tools/perf/Documentation/perf-top.txt @@ -72,11 +72,15 @@ Default is to monitor all CPUS. -p <pid>:: --pid=<pid>:: - Profile events on existing Process ID. + Profile events on existing Process ID (comma separated list). -t <tid>:: --tid=<tid>:: - Profile events on existing thread ID. + Profile events on existing thread ID (comma separated list). + +-u:: +--uid=:: + Record events in threads owned by uid. Name or number. -r <priority>:: --realtime=<priority>:: |