summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* kprobes: Hide CONFIG_OPTPROBES and set if arch supports optimized kprobesMasami Hiramatsu2010-03-161-7/+2
| | | | | | | | | | | | | | | | | | | | | Hide CONFIG_OPTPROBES and set if the arch supports optimized kprobes (IOW, HAVE_OPTPROBES=y), since this option doesn't change the major behavior of kprobes, and workarounds for minor changes are documented. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Dieter Ries <mail@dieterries.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100315170054.31593.3153.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, perf: Fix comments in Pentium-4 PMU definitionsLin Ming2010-03-161-2/+3
| | | | | | | | | | | | Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1268705556.3379.8.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker2010-03-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Enable not tagged retired instruction counting on P4sCyrill Gorcunov2010-03-152-9/+7
| | | | | | | | | | | | | | | This should turn on instruction counting on P4s, which was missing in the first version of the new PMU driver. It's inaccurate for now, we still need dependant event to tag mops before we can count them precisely. The result is that the number of instruction may be lifted up. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1268629102.3355.11.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, perf: Unmask LVTPC only if we have APIC supportedCyrill Gorcunov2010-03-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | Ingo reported: | | There's a build failure on -tip with the P4 driver, on UP 32-bit, if | PERF_EVENTS is enabled but UP_APIC is disabled: | | arch/x86/built-in.o: In function `p4_pmu_handle_irq': | perf_event.c:(.text+0xa756): undefined reference to `apic' | perf_event.c:(.text+0xa76e): undefined reference to `apic' | So we have to unmask LVTPC only if we're configured to have one. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100313081116.GA5179@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'perf/x86' into perf/coreIngo Molnar2010-03-127-23/+1361
|\ | | | | | | | | | | | | Merge reason: The new P4 driver is stable and ready now for more testing. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, perf: Fix NULL deref on not assigned x86_pmuCyrill Gorcunov2010-03-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of not assigned x86_pmu and software events NULL dereference may being hit via x86_pmu::schedule_events method. Fix it by checking if x86_pmu is initialized at all. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100311215016.GG25162@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf, x86: Implement initial P4 PMU driverCyrill Gorcunov2010-03-117-23/+1358
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The netburst PMU is way different from the "architectural perfomance monitoring" specification that current CPUs use. P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle perfomance monitoring events. A few implementational details: 1) We need a separate x86_pmu::hw_config helper in struct x86_pmu since register bit-fields are quite different from P6, Core and later cpu series. 2) For the same reason is a x86_pmu::schedule_events helper introduced. 3) hw_perf_event::config consists of packed ESCR+CCCR values. It's allowed since in reality both registers only use a half of their size. Of course before making a real write into a particular MSR we need to unpack the value and extend it to a proper size. 4) The tuple of packed ESCR+CCCR in hw_perf_event::config doesn't describe the memory address of ESCR MSR register so that we need to keep a mapping between these tuples used and available ESCR (various P4 events may use same ESCRs but not simultaneously), for this sake every active event has a per-cpu map of hw_perf_event::idx <--> ESCR addresses. 5) Since hw_perf_event::idx is an offset to counter/control register we need to lift X86_PMC_MAX_GENERIC up, otherwise kernel strips it down to 8 registers and event armed may never be turned off (ie the bit in active_mask is set but the loop never reaches this index to check), thanks to Peter Zijlstra Restrictions: - No cascaded counters support (do we ever need them?) - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS doesn't work for now) - There are events with same counters which can't work simultaneously (need to use intersected ones due to broken counter 1) - No PERF_COUNT_HW_CACHE_ events yet Todo: - Implement dependent events - Need proper hashing for event opcodes (no linear search, good for debugging stage but not in real loads) - Some events counted during a clock cycle -- need to set threshold for them and count every clock cycle just to get summary statistics (ie to behave the same way as other PMUs do) - Need to swicth to use event_constraints - To support RAW events we need to encode a global list of P4 events into p4_templates - Cache events need to be added Event support status matrix: Event status ----------------------------- cycles works cache-references works cache-misses works branch-misses works bus-cycles partially (does not work on 64bit cpu with HT enabled) instruction doesnt work (needs dependent event [mop tagging]) branches doesnt work Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100311165439.GB5129@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-03-124-12/+34
|\ \ | |/ |/| | | | | | | Merge reason: We want to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf: export perf_trace_regs and perf_arch_fetch_caller_regsXiao Guangrong2010-03-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | Export perf_trace_regs and perf_arch_fetch_caller_regs since module will use these. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> [ use EXPORT_PER_CPU_SYMBOL_GPL() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf, x86: Fix hw_perf_enable() event assignmentPeter Zijlstra2010-03-111-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What happens is that we schedule badly like: <...>-1987 [019] 280.252808: x86_pmu_start: event-46/1300c0: idx: 0 <...>-1987 [019] 280.252811: x86_pmu_start: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252812: x86_pmu_start: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252813: x86_pmu_start: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252814: x86_pmu_start: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252825: x86_pmu_stop: event-46/1300c0: idx: 0 <...>-1987 [019] 280.252826: x86_pmu_stop: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252827: x86_pmu_stop: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252828: x86_pmu_stop: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252829: x86_pmu_stop: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252834: x86_pmu_start: event-47/1300c0: idx: 1 <...>-1987 [019] 280.252834: x86_pmu_start: event-48/1300c0: idx: 2 <...>-1987 [019] 280.252835: x86_pmu_start: event-49/1300c0: idx: 3 <...>-1987 [019] 280.252836: x86_pmu_start: event-50/1300c0: idx: 32 <...>-1987 [019] 280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL* This happens because we only iterate the n_running events in the first pass, and reset their index to -1 if they don't match to force a re-assignment. Now, in our RR example, n_running == 0 because we fully unscheduled, so event-50 will retain its idx==32, even though in scheduling it will have gotten idx=0, and we don't trigger the re-assign path. The easiest way to fix this is the below patch, which simply validates the full assignment in the second pass. Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1268311069.5037.31.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf, ppc: Fix compile error due to new cpu notifiersPeter Zijlstra2010-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix: arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function) arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.) arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier' arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier' Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf: Introduce new perf_fetch_caller_regs() for hot regs snapshotFrederic Weisbecker2010-03-102-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Events that trigger overflows by interrupting a context can use get_irq_regs() or task_pt_regs() to retrieve the state when the event triggered. But this is not the case for some other class of events like trace events as tracepoints are executed in the same context than the code that triggered the event. It means we need a different api to capture the regs there, namely we need a hot snapshot to get the most important informations for perf: the instruction pointer to get the event origin, the frame pointer for the callchain, the code segment for user_mode() tests (we always use __KERNEL_CS as trace events always occur from the kernel) and the eflags for further purposes. v2: rename perf_save_regs to perf_fetch_caller_regs as per Masami's suggestion. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Archs <linux-arch@vger.kernel.org>
| * perf/x86-64: Use frame pointer to walk on irq and process stacksFrederic Weisbecker2010-03-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were using the frame pointer based stack walker on every contexts in x86-32, but not in x86-64 where we only use the seven-league boots on the exception stacks. Use it also on irq and process stacks. This utterly accelerate the captures. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf, x86: Fix the !CONFIG_CPU_SUP_INTEL buildIngo Molnar2010-03-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | Fix typo. But the modularization here is ugly and should be improved. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Add INSTRUCTION_DECODER config flagIngo Molnar2010-03-102-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PEBS+LBR decoding magic needs the insn_get_length() infrastructure to be able to decode x86 instruction length. So split it out of KPROBES dependency and make it enabled when either KPROBES or PERF_EVENTS is enabled. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix LBR read-outPeter Zijlstra2010-03-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Don't decrement the TOS twice... Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fixup the PEBS handler for Core2 cpusPeter Zijlstra2010-03-101-14/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull the core handler in line with the nhm one, also make sure we always drain the buffer. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Remove checking_{wr,rd}msr() usagePeter Zijlstra2010-03-102-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need checking_{wr,rd}msr() calls, since we should know what cpu we're running on and not use blindly poke at msrs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Don't reset the LBR as frequentlyPeter Zijlstra2010-03-101-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we reset the LBR on each first counter, simple counter rotation which first deschedules all counters and then reschedules the new ones will lead to LBR reset, even though we're still in the same task context. Reduce this by not flushing on the first counter but only flushing on different task contexts. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix silly bug in intel_pmu_pebs_{enable,disable}Peter Zijlstra2010-03-101-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to use the actual cpuc->pebs_enabled value, not a local copy for the changes to take effect. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Deal with multiple state bits for pebs-fmt1Peter Zijlstra2010-03-101-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Its unclear if the PEBS state record will have only a single bit set, in case it does not and accumulates bits, deal with that by only processing each event once. Also, robustify some of the code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Reorder intel_pmu_enable_all()Peter Zijlstra2010-03-101-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The documentation says we have to enable PEBS before we enable the PMU proper. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix LBR enable/disable vs cpuc->enabledPeter Zijlstra2010-03-101-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We should never call ->enable with the pmu enabled, and we _can_ have ->disable called with the pmu enabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix PEBS enable/disable vs cpuc->enabledPeter Zijlstra2010-03-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We should never call ->enable with the pmu enabled, and we _can_ have ->disable called with the pmu enabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix pebs drainsPeter Zijlstra2010-03-101-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I overlooked the perf_disable()/perf_enable() calls in intel_pmu_handle_irq(), (pointed out by Markus) so we should not explicitly disable_all/enable_all pebs counters in the drain functions, these are already disabled and enabling them early is confusing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE)Peter Zijlstra2010-03-101-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result in a double disable, cure this by using x86_pmu_{start,stop} for throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Robustify PEBS fixupPeter Zijlstra2010-03-101-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out the LBR is massively unreliable on certain CPUs, so code the fixup a little more defensive to avoid crashing the kernel. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154129.042271287@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Clear the LBRs on initPeter Zijlstra2010-03-102-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Some CPUs have errata where the LBR is not cleared on Power-On. So always clear the LBRs before use. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.966563424@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Disable PEBS on clovertown chipsPeter Zijlstra2010-03-102-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | This CPU has just too many handycaps to be really useful. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.890278662@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Fix silly bug in data store buffer allocationPeter Zijlstra2010-03-101-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix up the ds allocation error path, where we could free @buffer before we used it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.813452402@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | x86: Move MAX_INSN_SIZE into asm/insn.hPeter Zijlstra2010-03-103-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since there's now two users for this, place it in a common header. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.923774125@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Expose the full PEBS record using PERF_SAMPLE_RAWPeter Zijlstra2010-03-101-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | Expose the full PEBS record using PERF_SAMPLE_RAW Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.847218224@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Clean up IA32_PERF_CAPABILITIES usagePeter Zijlstra2010-03-104-31/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Saner PERF_CAPABILITIES support, which also exposes pebs_trap. Use that latter to make PEBS's use of LBR conditional since a fault-like pebs should already report the correct IP. ( As of this writing there is no known hardware that implements !pebs_trap ) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.770650663@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: use LBR for PEBS IP+1 fixupPeter Zijlstra2010-03-104-39/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the LBR to fix up the PEBS IP+1 issue. As said, PEBS reports the next instruction, here we use the LBR to find the last branch and from that construct the actual IP. If the IP matches the LBR-TO, we use LBR-FROM, otherwise we use the LBR-TO address as the beginning of the last basic block and decode forward. Once we find a match to the current IP, we use the previous location. This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction that caused the event (barring CPU errata). The fixup can fail due to various reasons: 1) LBR contains invalid data (quite possible) 2) part of the basic block got paged out 3) the reported IP isn't part of the basic block (see 1) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.619375431@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Implement simple LBR supportPeter Zijlstra2010-03-103-0/+259
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement simple suport Intel Last-Branch-Record, it supports all hardware that implements FREEZE_LBRS_ON_PMI, but does not (yet) implement the LBR config register. The Intel LBR is a FIFO of From,To addresses describing the last few branches the hardware took. This patch does not add perf interface to the LBR, but merely provides an interface for internal use. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.544191154@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | perf, x86: Add PEBS infrastructurePeter Zijlstra2010-03-103-261/+669
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements support for Intel Precise Event Based Sampling, which is an alternative counter mode in which the counter triggers a hardware assist to collect information on events. The hardware assist takes a trap like snapshot of a subset of the machine registers. This data is written to the Intel Debug-Store, which can be programmed with a data threshold at which to raise a PMI. With the PEBS hardware assist being trap like, the reported IP is always one instruction after the actual instruction that triggered the event. This implements a simple PEBS model that always takes a single PEBS event at a time. This is done so that the interaction with the rest of the system is as expected (freq adjust, period randomization, lbr, callchains, etc.). It adds an ABI element: perf_event_attr::precise, which indicates that we wish to use this (constrained, but precise) mode. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.392111285@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Fix double enable callsPeter Zijlstra2010-03-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | hw_perf_enable() would enable already enabled events. This causes problems with code that assumes that ->enable/->disable calls are balanced (like the LBR code does). What happens is that events that were already running and left in place would get enabled again. Avoid this by only enabling new events that match their previous assignment. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Fix double disable callsPeter Zijlstra2010-03-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | hw_perf_enable() would disable events that were not yet enabled. This causes problems with code that assumes that ->enable/->disable calls are balanced (like the LBR code does). What happens is that we disable newly added counters that match their previous assignment, even though they are not yet programmed on the hardware. Avoid this by only doing the first pass over the existing events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Properly account n_addedPeter Zijlstra2010-03-101-2/+2
| | | | | | | | | | | | | | | Make sure n_added is properly accounted so that we can rely on the value to reflect the number of added counters. This is needed if its going to be used for more than a boolean check. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE)Peter Zijlstra2010-03-102-15/+7
| | | | | | | | | | | | | | | Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result in a double disable, cure this by using x86_pmu_{start,stop} for throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Fix x86_pmu_startPeter Zijlstra2010-03-101-13/+10
| | | | | | | | | | | | | pmu::start should undo pmu::stop, make it so. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Use unlocked bitopsPeter Zijlstra2010-03-103-6/+5
| | | | | | | | | | | | | | | | There is no concurrency on these variables, so don't use LOCK'ed ops. As to the intel_pmu_handle_irq() status bit clean, nobody uses that so remove it all together. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.240023029@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Change x86_pmu.{enable,disable} calling conventionPeter Zijlstra2010-03-103-33/+38
| | | | | | | | | | | | | | | Pass the full perf_event into the x86_pmu functions so that those may make use of more than the hw_perf_event, and while doing this, remove the superfluous second argument. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.165166129@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Remove superfluous arguments to x86_perf_event_update()Peter Zijlstra2010-03-102-13/+8
| | | | | | | | | | | | | | | The second and third argument to x86_perf_event_update() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.089468871@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86: Remove superfluous arguments to x86_perf_event_set_period()Peter Zijlstra2010-03-102-9/+8
| | | | | | | | | | | | | | | The second and third argument to x86_perf_event_set_period() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.006500906@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf, x86, Do not user perf_disable from NMI contextPeter Zijlstra2010-03-101-6/+5
| | | | | | | | | | | | | | Explicitly use intel_pmu_{disable,enable}_all() in intel_pmu_handle_irq() to avoid the NMI race conditions in perf_{disable,enable} Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Rework and fix the arch CPU-hotplug hooksPeter Zijlstra2010-03-105-66/+110
| | | | | | | | | | | | | | | | | | | Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug notifiers. This has the advantage of reducing the static weak interface as well as exposing all hotplug actions to the PMU. Use this to fix x86 hotplug usage where we did things in ONLINE which should have been done in UP_PREPARE or STARTING. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mundt <lethal@linux-sh.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100305154128.736225361@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf: Provide generic perf_sample_data initializationPeter Zijlstra2010-03-105-13/+10
| | | | | | | | | | | | | | | | | | | | This makes it easier to extend perf_sample_data and fixes a bug on arm and sparc, which failed to set ->raw to NULL, which can cause crashes when combined with PERF_SAMPLE_RAW. It also optimizes PowerPC and tracepoint, because the struct initialization is forced to zero out the whole structure. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jean Pihet <jpihet@mvista.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@kernel.org LKML-Reference: <20100304140100.315416040@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge commit 'v2.6.34-rc1' into perf/urgentIngo Molnar2010-03-091683-39874/+71228
|\ | | | | | | | | | | | | | | | | Conflicts: tools/perf/util/probe-event.c Merge reason: Pick up -rc1 and resolve the conflict as well. Signed-off-by: Ingo Molnar <mingo@elte.hu>