summaryrefslogtreecommitdiffstats
path: root/kernel/trace
Commit message (Collapse)AuthorAgeFilesLines
* uprobes: Support SDT markers having reference count (semaphore)Ravi Bangoria2018-09-242-4/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace Statically Defined Tracepoints[1] are dtrace style markers inside userspace applications. Applications like PostgreSQL, MySQL, Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc have these markers embedded in them. These markers are added by developer at important places in the code. Each marker source expands to a single nop instruction in the compiled code but there may be additional overhead for computing the marker arguments which expands to couple of instructions. In case the overhead is more, execution of it can be omitted by runtime if() condition when no one is tracing on the marker: if (reference_counter > 0) { Execute marker instructions; } Default value of reference counter is 0. Tracer has to increment the reference counter before tracing on a marker and decrement it when done with the tracing. Implement the reference counter logic in core uprobe. User will be able to use it from trace_uprobe as well as from kernel module. New trace_uprobe definition with reference counter will now be: <path>:<offset>[(ref_ctr_offset)] where ref_ctr_offset is an optional field. For kernel module, new variant of uprobe_register() has been introduced: uprobe_register_refctr(inode, offset, ref_ctr_offset, consumer) No new variant for uprobe_unregister() because it's assumed to have only one reference counter for one uprobe. [1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation Note: 'reference counter' is called as 'semaphore' in original Dtrace (or Systemtap, bcc and even in ELF) documentation and code. But the term 'semaphore' is misleading in this context. This is just a counter used to hold number of tracers tracing on a marker. This is not really used for any synchronization. So we are calling it a 'reference counter' in kernel / perf code. Link: http://lkml.kernel.org/r/20180820044250.11659-2-ravi.bangoria@linux.ibm.com Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> [Only trace_uprobe.c] Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Song Liu <songliubraving@fb.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* tracing/kprobe: Remove unneeded extra strchr() from create_trace_kprobe()Steven Rostedt (VMware)2018-09-241-3/+6
| | | | | | | | By utilizing a temporary variable, we can avoid adding another call to strchr(). Instead, save the first call to a temp variable, and then use that variable as the reference to set the event variable. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* ring-buffer: Allow for rescheduling when removing pagesVaibhav Nagarnaik2018-09-171-0/+2
| | | | | | | | | | | | | | | | | | | | | When reducing ring buffer size, pages are removed by scheduling a work item on each CPU for the corresponding CPU ring buffer. After the pages are removed from ring buffer linked list, the pages are free()d in a tight loop. The loop does not give up CPU until all pages are removed. In a worst case behavior, when lot of pages are to be freed, it can cause system stall. After the pages are removed from the list, the free() can happen while the work is rescheduled. Call cond_resched() in the loop to prevent the system hangup. Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com Cc: stable@vger.kernel.org Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Reported-by: Jason Behmer <jbehmer@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* Merge tag 'trace-v4.19-1' of ↵Linus Torvalds2018-08-233-1/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Masami found an off by one bug in the code that keeps "notrace" functions from being traced by kprobes. During my testing, I found that there's places that we may want to add kprobes to notrace, thus we may end up changing this code before 4.19 is released. The history behind this change is that we found that adding kprobes to various notrace functions caused the kernel to crashed. We took the safe route and decided not to allow kprobes to trace any notrace function. But because notrace is added to functions that just cause weird side effects to the function tracer, but are still safe, preventing kprobes for all notrace functios may be too much of a big hammer. One such place is __schedule() is marked notrace, to keep function tracer from doing strange recursive loops when it gets traced with NEED_RESCHED set. With this change, one can not add kprobes to the scheduler. Masami also added code to use gcov on ftrace" * tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Fix to check notrace function with correct range tracing: Allow gcov profiling on only ftrace subsystem
| * tracing/kprobes: Fix to check notrace function with correct rangeMasami Hiramatsu2018-08-211-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix within_notrace_func() to check notrace function correctly. Since the ftrace_location_range(start, end) function checks the range inclusively (start <= ftrace-loc <= end), the end address must not include the entry address of next function. However, within_notrace_func() uses kallsyms_lookup_size_offset() to get the function size and calculate the end address from adding the size to the entry address. This means the end address is the entry address of the next function. In the result, within_notrace_func() fails to find notrace function if the next function of the target function is ftraced. Let's subtract 1 from the end address so that ftrace_location_range() can check it correctly. Link: http://lkml.kernel.org/r/153485669706.16611.17726752296213785504.stgit@devbox Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function") Reported-by: Michael Rodin <michael@rodin.online> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * tracing: Allow gcov profiling on only ftrace subsystemMasami Hiramatsu2018-08-212-0/+17
| | | | | | | | | | | | | | | | | | | | | | Add GCOV_PROFILE_FTRACE to allow gcov profiling on only files in ftrace subsystem. This config option will be used for checking kselftest/ftrace coverage. Link: http://lkml.kernel.org/r/153483647755.32472.4746349899604275441.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | Merge tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-blockLinus Torvalds2018-08-221-0/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more block updates from Jens Axboe: - Set of bcache fixes and changes (Coly) - The flush warn fix (me) - Small series of BFQ fixes (Paolo) - wbt hang fix (Ming) - blktrace fix (Steven) - blk-mq hardware queue count update fix (Jianchao) - Various little fixes * tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block: (31 commits) block/DAC960.c: make some arrays static const, shrinks object size blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter blk-mq: init hctx sched after update ctx and hctx mapping block: remove duplicate initialization tracing/blktrace: Fix to allow setting same value pktcdvd: fix setting of 'ret' error return for a few cases block: change return type to bool block, bfq: return nbytes and not zero from struct cftype .write() method block, bfq: improve code of bfq_bfqq_charge_time block, bfq: reduce write overcharge block, bfq: always update the budget of an entity when needed block, bfq: readd missing reset of parent-entity service blk-wbt: fix IO hang in wbt_wait() block: don't warn for flush on read-only device bcache: add the missing comments for smp_mb()/smp_wmb() bcache: remove unnecessary space before ioctl function pointer arguments bcache: add missing SPDX header bcache: move open brace at end of function definitions to next line bcache: add static const prefix to char * array declarations bcache: fix code comments style ...
| * | tracing/blktrace: Fix to allow setting same valueSteven Rostedt (VMware)2018-08-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Masami Hiramatsu reported: Current trace-enable attribute in sysfs returns an error if user writes the same setting value as current one, e.g. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable bash: echo: write error: Invalid argument # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable bash: echo: write error: Device or resource busy But this is not a preferred behavior, it should ignore if new setting is same as current one. This fixes the problem as below. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable Link: http://lkml.kernel.org/r/20180816103802.08678002@gandalf.local.home Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: stable@vger.kernel.org Fixes: cd649b8bb830d ("blktrace: remove sysfs_blk_trace_enable_show/store()") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | Merge tag 'trace-v4.19' of ↵Linus Torvalds2018-08-2035-423/+472
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Restructure of lockdep and latency tracers This is the biggest change. Joel Fernandes restructured the hooks from irqs and preemption disabling and enabling. He got rid of a lot of the preprocessor #ifdef mess that they caused. He turned both lockdep and the latency tracers to use trace events inserted in the preempt/irqs disabling paths. But unfortunately, these started to cause issues in corner cases. Thus, parts of the code was reverted back to where lockdep and the latency tracers just get called directly (without using the trace events). But because the original change cleaned up the code very nicely we kept that, as well as the trace events for preempt and irqs disabling, but they are limited to not being called in NMIs. - Have trace events use SRCU for "rcu idle" calls. This was required for the preempt/irqs off trace events. But it also had to not allow them to be called in NMI context. Waiting till Paul makes an NMI safe SRCU API. - New notrace SRCU API to allow trace events to use SRCU. - Addition of mcount-nop option support - SPDX headers replacing GPL templates. - Various other fixes and clean ups. - Some fixes are marked for stable, but were not fully tested before the merge window opened. * tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix SPDX format headers to use C++ style comments tracing: Add SPDX License format tags to tracing files tracing: Add SPDX License format to bpf_trace.c blktrace: Add SPDX License format header s390/ftrace: Add -mfentry and -mnop-mcount support tracing: Add -mcount-nop option support tracing: Avoid calling cc-option -mrecord-mcount for every Makefile tracing: Handle CC_FLAGS_FTRACE more accurately Uprobe: Additional argument arch_uprobe to uprobe_write_opcode() Uprobes: Simplify uprobe_register() body tracepoints: Free early tracepoints after RCU is initialized uprobes: Use synchronize_rcu() not synchronize_sched() tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister() ftrace: Remove unused pointer ftrace_swapper_pid tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing/irqsoff: Handle preempt_count for different configs tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage" tracing: irqsoff: Account for additional preempt_disable trace: Use rcu_dereference_raw for hooks from trace-event subsystem tracing/kprobes: Fix within_notrace_func() to check only notrace functions ...
| * | tracing: Fix SPDX format headers to use C++ style commentsSteven Rostedt (VMware)2018-08-168-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Linux kernel adopted the SPDX License format headers to ease license compliance management, and uses the C++ '//' style comments for the SPDX header tags. Some files in the tracing directory used the C style /* */ comments for them. To be consistent across all files, replace the /* */ C style SPDX tags with the C++ // SPDX tags. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Add SPDX License format tags to tracing filesSteven Rostedt (VMware)2018-08-1620-102/+20
| | | | | | | | | | | | | | | | | | Add the SPDX License header to ease license compliance management. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Add SPDX License format to bpf_trace.cSteven Rostedt (VMware)2018-08-161-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | Add the SPDX License header to ease license compliance management. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | blktrace: Add SPDX License format headerSteven Rostedt (VMware)2018-08-161-13/+1
| | | | | | | | | | | | | | | | | | | | | Add the SPDX License header to ease license compliance management. Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Add -mcount-nop option supportVasily Gorbik2018-08-152-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -mcount-nop gcc option generates the calls to the profiling functions as nops which allows to avoid patching mcount jump with NOP instructions initially. -mcount-nop gcc option will be activated if platform selects HAVE_NOP_MCOUNT and gcc actually supports it. In addition to that CC_USING_NOP_MCOUNT is defined and could be used by architectures to adapt ftrace patching behavior. Link: http://lkml.kernel.org/r/patch-3.thread-aa7b8d.git-e02ed2dc082b.your-ad-here.call-01533557518-ext-9465@work.hours Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | uprobes: Use synchronize_rcu() not synchronize_sched()Steven Rostedt (VMware)2018-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While debugging another bug, I was looking at all the synchronize*() functions being used in kernel/trace, and noticed that trace_uprobes was using synchronize_sched(), with a comment to synchronize with {u,ret}_probe_trace_func(). When looking at those functions, the data is protected with "rcu_read_lock()" and not with "rcu_read_lock_sched()". This is using the wrong synchronize_*() function. Link: http://lkml.kernel.org/r/20180809160553.469e1e32@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 70ed91c6ec7f8 ("tracing/uprobes: Support ftrace_event_file base multibuffer") Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Fix synchronizing to event changes with ↵Steven Rostedt (VMware)2018-08-104-14/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tracepoint_synchronize_unregister() Now that some trace events can be protected by srcu_read_lock(tracepoint_srcu), we need to make sure all locations that depend on this are also protected. There were many places that did a synchronize_sched() thinking that it was enough to protect againts access to trace events. This use to be the case, but now that we use SRCU for _rcuidle() trace events, they may not be protected by synchronize_sched(), as they may be called in paths that RCU is not watching for preempt disable. Fixes: e6753f23d961d ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | ftrace: Remove unused pointer ftrace_swapper_pidColin Ian King2018-08-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pointer ftrace_swapper_pid is defined but is never used hence it is redundant and can be removed. The use of this variable was removed in commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events do"). Cleans up clang warning: warning: 'ftrace_swapper_pid' defined but not used [-Wunused-const-variable=] Link: http://lkml.kernel.org/r/20180809125609.13142-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: More reverting of "tracing: Centralize preemptirq tracepoints and ↵Steven Rostedt (VMware)2018-08-103-50/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unify their usage" Joel Fernandes created a nice patch that cleaned up the duplicate hooks used by lockdep and irqsoff latency tracer. It made both use tracepoints. But the latency tracer is triggering warnings when using tracepoints to call into the latency tracer's routines. Mainly, they can be called from NMI context. If that happens, then the SRCU may not work properly because on some architectures, SRCU is not safe to be called in both NMI and non-NMI context. This is a partial revert of the clean up patch c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") that adds back the direct calls into the latency tracer. It also only calls the trace events when not in NMI. Link: http://lkml.kernel.org/r/20180809210654.622445925@goodmis.org Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing/irqsoff: Handle preempt_count for different configsSteven Rostedt (VMware)2018-08-101-29/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I was hitting the following warning: WARNING: CPU: 0 PID: 1 at kernel/trace/trace_irqsoff.c:631 tracer_hardirqs_off+0x15/0x2a Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc6-test+ #13 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 EIP: tracer_hardirqs_off+0x15/0x2a Code: ff 85 c0 74 0e 8b 45 00 8b 50 04 8b 45 04 e8 35 ff ff ff 5d c3 55 64 a1 cc 37 51 c1 a9 ff ff ff 7f 89 e5 53 89 d3 89 ca 75 02 <0f> 0b e8 90 fc ff ff 85 c0 74 07 89 d8 e8 0c ff ff ff 5b 5d c3 55 EAX: 80000000 EBX: c04337f0 ECX: c04338e3 EDX: c04338e3 ESI: c04337f0 EDI: c04338e3 EBP: f2aa1d68 ESP: f2aa1d64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046 CR0: 80050033 CR2: 00000000 CR3: 01668000 CR4: 001406f0 Call Trace: trace_irq_disable_rcuidle+0x63/0x6c trace_hardirqs_off+0x26/0x30 default_send_IPI_mask_allbutself_logical+0x31/0x93 default_send_IPI_allbutself+0x37/0x48 native_send_call_func_ipi+0x4d/0x6a smp_call_function_many+0x165/0x19d ? add_nops+0x34/0x34 ? trace_hardirqs_on_caller+0x2d/0x2d ? add_nops+0x34/0x34 smp_call_function+0x1f/0x23 on_each_cpu+0x15/0x43 ? trace_hardirqs_on_caller+0x2d/0x2d ? trace_hardirqs_on_caller+0x2d/0x2d ? trace_irq_disable_rcuidle+0x1/0x6c text_poke_bp+0xa0/0xc2 ? trace_hardirqs_on_caller+0x2d/0x2d arch_jump_label_transform+0xa7/0xcb ? trace_irq_disable_rcuidle+0x5/0x6c __jump_label_update+0x3e/0x6d jump_label_update+0x7d/0x81 static_key_slow_inc_cpuslocked+0x58/0x6d static_key_slow_inc+0x19/0x20 tracepoint_probe_register_prio+0x19e/0x1d1 ? start_critical_timings+0x1c/0x1c tracepoint_probe_register+0xf/0x11 irqsoff_tracer_init+0x21/0xf2 tracer_init+0x16/0x1a trace_selftest_startup_irqsoff+0x25/0xc4 run_tracer_selftest+0xca/0x131 register_tracer+0xd5/0x172 ? trace_event_define_fields_preemptirq_template+0x45/0x45 init_irqsoff_tracer+0xd/0x11 do_one_initcall+0xab/0x1e8 ? rcu_read_lock_sched_held+0x3d/0x44 ? trace_initcall_level+0x52/0x86 kernel_init_freeable+0x195/0x21a ? rest_init+0xb4/0xb4 kernel_init+0xd/0xe4 ret_from_fork+0x2e/0x38 It is due to running a CONFIG_PREEMPT_VOLUNTARY kernel, which would trigger this warning every time: WARN_ON_ONCE(preempt_count()); Because on CONFIG_PREEMPT_VOLUNTARY, preempt_count() is always zero. This warning is to make sure preempt_count is set because tracer_hardirqs_on() does a preempt_enable_notrace() to make the preempt_trace() work properly, as being called by a trace event, the trace event code disables preemption, and the tracer wants to know what the preemption was before it was called. Instead of enabling preemption like this, just record the preempt_count, subtract PREEMPT_DISABLE_OFFSET from it (which is zero with !CONFIG_PREEMPT set), and pass that value to the necessary functions, which should use the passed in parameter instead of calling preempt_count() directly. Fixes: da5b3ebb45277 ("tracing: irqsoff: Account for additional preempt_disable") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and ↵Steven Rostedt (VMware)2018-08-101-16/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unify their usage" Joel Fernandes created a nice patch that cleaned up the duplicate hooks used by lockdep and irqsoff latency tracer. It made both use tracepoints. But it caused lockdep to trigger several false positives. We have not figured out why yet, but removing lockdep from using the trace event hooks and just call its helper functions directly (like it use to), makes the problem go away. This is a partial revert of the clean up patch c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") that adds direct calls for lockdep, but also keeps most of the clean up done to get rid of the horrible preprocessor if statements. Link: http://lkml.kernel.org/r/20180806155058.5ee875f4@gandalf.local.home Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: irqsoff: Account for additional preempt_disableJoel Fernandes (Google)2018-08-061-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently we tried to make the preemptirqsoff tracer to use irqsoff tracepoint probes. However this causes issues as reported by Masami: [2.271078] Testing tracer preemptirqsoff: .. no entries found ..FAILED! [2.381015] WARNING: CPU: 0 PID: 1 at /home/mhiramat/ksrc/linux/kernel/ trace/trace.c:1512 run_tracer_selftest+0xf3/0x154 This is due to the tracepoint code increasing the preempt nesting count by calling an additional preempt_disable before calling into the preemptoff tracer which messes up the preempt_count() check in tracer_hardirqs_off. To fix this, make the irqsoff tracer probes balance the additional outer preempt_disable with a preempt_enable_notrace. The other way to fix this is to just use SRCU for all tracepoints. However we can't do that because we can't use NMIs from RCU context. Link: http://lkml.kernel.org/r/20180806034049.67949-1-joel@joelfernandes.org Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage") Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | trace: Use rcu_dereference_raw for hooks from trace-event subsystemJoel Fernandes (Google)2018-08-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we switched to using SRCU for tracepoints used in the idle path, we can no longer use rcu_dereference_sched for dereferencing points in trace-event hooks. Since tracepoints can now use either SRCU or sched-RCU, just use rcu_dereference_raw for traceevents just like we're doing when dereferencing the tracepoint table. This prevents an RCU warning reported by Masami: [ 282.060593] WARNING: can't dereference registers at 00000000f3c7f62b [ 282.063200] ============================= [ 282.064082] WARNING: suspicious RCU usage [ 282.064963] 4.18.0-rc6+ #15 Tainted: G W [ 282.066048] ----------------------------- [ 282.066923] /home/mhiramat/ksrc/linux/kernel/trace/trace_events.c:242 suspicious rcu_dereference_check() usage! [ 282.068974] [ 282.068974] other info that might help us debug this: [ 282.068974] [ 282.070770] [ 282.070770] RCU used illegally from idle CPU! [ 282.070770] rcu_scheduler_active = 2, debug_locks = 1 [ 282.072938] RCU used illegally from extended quiescent state! [ 282.074183] no locks held by swapper/0/0. [ 282.075071] [ 282.075071] stack backtrace: [ 282.076121] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W [ 282.077782] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [ 282.079604] Call Trace: [ 282.080212] <IRQ> [ 282.080755] dump_stack+0x85/0xcb [ 282.081523] trace_event_ignore_this_pid+0x66/0x70 [ 282.082541] trace_event_raw_event_preemptirq_template+0xa2/0xb0 [ 282.083774] ? interrupt_entry+0xc4/0xe0 [ 282.084665] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 282.085669] trace_hardirqs_off_caller+0x90/0xd0 [ 282.086597] trace_hardirqs_off_thunk+0x1a/0x1c [ 282.087433] ? call_function_interrupt+0xa/0x20 [ 282.088201] interrupt_entry+0xc4/0xe0 [ 282.088848] ? call_function_interrupt+0xa/0x20 [ 282.089579] </IRQ> [ 282.090029] ? native_safe_halt+0x2/0x10 [ 282.090695] ? default_idle+0x1f/0x160 [ 282.091330] ? default_idle_call+0x24/0x40 [ 282.091997] ? do_idle+0x210/0x250 [ 282.092658] ? cpu_startup_entry+0x6f/0x80 [ 282.093338] ? start_kernel+0x49d/0x4bd [ 282.093987] ? secondary_startup_64+0xa5/0xb0 Link: http://lkml.kernel.org/r/20180803023407.225852-1-joel@joelfernandes.org Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing/kprobes: Fix within_notrace_func() to check only notrace functionsMasami Hiramatsu2018-08-021-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix within_notrace_func() to check only notrace functions and to ignore the kprobe-event which can not solve symbol addresses. within_notrace_func() returns true if the given kprobe events probe point seems to be out-of-range. But that is not the correct place to check for it, it should be done in kprobes afterwards. kprobe-events allow users to define a probe point on "currently unloaded module" so that it can trace the event during module load. In this case, the user will put a probe on a symbol which is not in kallsyms yet and it hits the within_notrace_func(). As a result, kprobe-events always refuses if user defines a probe on a "currenly unloaded module". Fixes: commit 45408c4f9250 ("tracing: kprobes: Prohibit probing on notrace function") Link: http://lkml.kernel.org/r/153319624799.29007.13604430345640129926.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | ftrace: Use true and false for boolean values in ops_references_rec()Gustavo A. R. Silva2018-08-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return statements in functions returning bool should use true or false instead of an integer value. This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20180802010056.GA31012@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | ring-buffer: Make ring_buffer_record_is_set_on() return boolSteven Rostedt (VMware)2018-08-011-1/+1
| | | | | | | | | | | | | | | | | | | | | The value of ring_buffer_record_is_set_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | ring-buffer: Make ring_buffer_record_is_on() return boolSteven Rostedt (VMware)2018-08-011-1/+1
| | | | | | | | | | | | | | | | | | | | | The value of ring_buffer_record_is_on() is either true or false, so have its return value be bool. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Make tracer_tracing_is_on() return boolSteven Rostedt (VMware)2018-08-012-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | There's code that expects tracer_tracing_is_on() to be either true or false, not some random number. Currently, it should only return one or zero, but just in case, change its return value to bool, to enforce it. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread existsSteven Rostedt (VMware)2018-08-011-1/+1
| | | | | | | | | | | | | | | | | | | | | The start function of the hwlat tracer should never be called when the hwlat thread already exists. If it is called, do a WARN_ON(). Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | ftrace: Add missing check for existing hwlat threadErica Bugden2018-08-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hwlat tracer uses a kernel thread to measure latencies. The function that creates this kernel thread, start_kthread(), can be called when the tracer is initialized and when the tracer is explicitly enabled. start_kthread() does not check if there is an existing hwlat kernel thread and will create a new one each time it is called. This causes the reference to the previous thread to be lost. Without the thread reference, the old kernel thread becomes unstoppable and continues to use CPU time even after the hwlat tracer has been disabled. This problem can be observed when a system is booted with tracing enabled and the hwlat tracer is configured like this: echo hwlat > current_tracer; echo 1 > tracing_on Add the missing check for an existing kernel thread in start_kthread() to prevent this problem. This function and the rest of the hwlat kernel thread setup and teardown are already serialized because they are called through the tracer core code with trace_type_lock held. [ Note, this only fixes the symptom. The real fix was not to call this function when tracing_on was already one. But this still makes the code more robust, so we'll add it. ] Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de Signed-off-by: Erica Bugden <erica.bugden@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Do not call start/stop() functions when tracing_on does not changeSteven Rostedt (VMware)2018-08-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when one echo's in 1 into tracing_on, the current tracer's "start()" function is executed, even if tracing_on was already one. This can lead to strange side effects. One being that if the hwlat tracer is enabled, and someone does "echo 1 > tracing_on" into tracing_on, the hwlat tracer's start() function is called again which will recreate another kernel thread, and make it unable to remove the old one. Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de Cc: stable@vger.kernel.org Fixes: 2df8f8a6a897e ("tracing: Fix regression with irqsoff tracer and tracing_on file") Reported-by: Erica Bugden <erica.bugden@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Centralize preemptirq tracepoints and unify their usageJoel Fernandes (Google)2018-07-314-180/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch detaches the preemptirq tracepoints from the tracers and keeps it separate. Advantages: * Lockdep and irqsoff event can now run in parallel since they no longer have their own calls. * This unifies the usecase of adding hooks to an irqsoff and irqson event, and a preemptoff and preempton event. 3 users of the events exist: - Lockdep - irqsoff and preemptoff tracers - irqs and preempt trace events The unification cleans up several ifdefs and makes the code in preempt tracer and irqsoff tracers simpler. It gets rid of all the horrific ifdeferry around PROVE_LOCKING and makes configuration of the different users of the tracepoints more easy and understandable. It also gets rid of the time_* function calls from the lockdep hooks used to call into the preemptirq tracer which is not needed anymore. The negative delta in lines of code in this patch is quite large too. In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS as a single point for registering probes onto the tracepoints. With this, the web of config options for preempt/irq toggle tracepoints and its users becomes: PREEMPT_TRACER PREEMPTIRQ_EVENTS IRQSOFF_TRACER PROVE_LOCKING | | \ | | \ (selects) / \ \ (selects) / TRACE_PREEMPT_TOGGLE ----> TRACE_IRQFLAGS \ / \ (depends on) / PREEMPTIRQ_TRACEPOINTS Other than the performance tests mentioned in the previous patch, I also ran the locking API test suite. I verified that all tests cases are passing. I also injected issues by not registering lockdep probes onto the tracepoints and I see failures to confirm that the probes are indeed working. This series + lockdep probes not registered (just to inject errors): [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok | [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok | [ 0.000000] sirq-safe-A => hirqs-on/12:FAILED|FAILED| ok | [ 0.000000] sirq-safe-A => hirqs-on/21:FAILED|FAILED| ok | [ 0.000000] hard-safe-A + irqs-on/12:FAILED|FAILED| ok | [ 0.000000] soft-safe-A + irqs-on/12:FAILED|FAILED| ok | [ 0.000000] hard-safe-A + irqs-on/21:FAILED|FAILED| ok | [ 0.000000] soft-safe-A + irqs-on/21:FAILED|FAILED| ok | [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok | [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok | With this series + lockdep probes registered, all locking tests pass: [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok | [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok | [ 0.000000] sirq-safe-A => hirqs-on/12: ok | ok | ok | [ 0.000000] sirq-safe-A => hirqs-on/21: ok | ok | ok | [ 0.000000] hard-safe-A + irqs-on/12: ok | ok | ok | [ 0.000000] soft-safe-A + irqs-on/12: ok | ok | ok | [ 0.000000] hard-safe-A + irqs-on/21: ok | ok | ok | [ 0.000000] soft-safe-A + irqs-on/21: ok | ok | ok | [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok | [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok | Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | selftest/ftrace: Move kprobe selftest function to separate compile unitFrancis Deslauriers2018-07-304-11/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move selftest function to its own compile unit so it can be compiled with the ftrace cflags (CC_FLAGS_FTRACE) allowing it to be probed during the ftrace startup tests. Link: http://lkml.kernel.org/r/153294604271.32740.16490677128630177030.stgit@devbox Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: kprobes: Prohibit probing on notrace functionMasami Hiramatsu2018-07-302-9/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prohibit kprobe-events probing on notrace functions. Since probing on a notrace function can cause a recursive event call. In most cases those are just skipped, but in some cases it falls into an infinite recursive call. This protection can be disabled by the kconfig CONFIG_KPROBE_EVENTS_ON_NOTRACE=y, but it is highly recommended to keep it "n" for normal kernel builds. Note that this is only available if "kprobes on ftrace" has been implemented on the target arch and CONFIG_KPROBES_ON_FTRACE=y. Link: http://lkml.kernel.org/r/153294601436.32740.10557881188933661239.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Francis Deslauriers <francis.deslauriers@efficios.com> [ Slight grammar and spelling fixes ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: preemptirq_delay_run() can be statickbuild test robot2018-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Automatically found by kbuild test robot. Fixes: ffdc73a3b2ad ("lib: Add module for testing preemptoff/irqsoff latency tracers") Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing/kprobes: Simplify the logic of enable_trace_kprobe()Steven Rostedt (VMware)2018-07-271-18/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function enable_trace_kprobe() performs slightly differently if the file parameter is passed in as NULL on non-NULL. Instead of checking file twice, move the code between the two tests into a static inline helper function to make the code easier to follow. Link: http://lkml.kernel.org/r/20180725224728.7b1d5db2@vmware.local.home Link: http://lkml.kernel.org/r/20180726121152.4dd54330@gandalf.local.home Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Remove orphaned function ftrace_nr_registered_ops()Masami Hiramatsu2018-07-261-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove ftrace_nr_registered_ops() because it is no longer used. ftrace_nr_registered_ops() has been introduced by commit ea701f11da44 ("ftrace: Add selftest to test function trace recursion protection"), but its caller has been removed by commit 05cbbf643b8e ("tracing: Fix selftest function recursion accounting"). So it is not called anymore. Link: http://lkml.kernel.org/r/153260907227.12474.5234899025934963683.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Remove orphaned function using_ftrace_ops_list_func().Masami Hiramatsu2018-07-262-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove using_ftrace_ops_list_func() since it is no longer used. Using ftrace_ops_list_func() has been introduced by commit 7eea4fce0246 ("tracing/stack_trace: Skip 4 instead of 3 when using ftrace_ops_list_func") as a helper function, but its caller has been removed by commit 72ac426a5bb0 ("tracing: Clean up stack tracing and fix fentry updates"). So it is not called anymore. Link: http://lkml.kernel.org/r/153260904427.12474.9952096317439329851.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing: Make unregister_trigger() staticSteven Rostedt (VMware)2018-07-262-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | Nothing uses unregister_trigger() outside of trace_events_trigger.c file, thus it should be static. Not sure why this was ever converted, because its counter part, register_trigger(), was always static. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | lib: Add module for testing preemptoff/irqsoff latency tracersJoel Fernandes (Google)2018-07-263-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we introduce a test module for introducing a long preempt or irq disable delay in the kernel which the preemptoff or irqsoff tracers can detect. This module is to be used only for test purposes and is default disabled. Following is the expected output (only briefly shown) that can be parsed to verify that the tracers are working correctly. We will use this from the kselftests in future patches. For the preemptoff tracer: echo preemptoff > /d/tracing/current_tracer sleep 1 insmod ./preemptirq_delay_test.ko test_mode=preempt delay=500000 sleep 1 bash-4.3# cat /d/tracing/trace preempt -1066 2...2 0us@: preemptirq_delay_run <-preemptirq_delay_run preempt -1066 2...2 500002us : preemptirq_delay_run <-preemptirq_delay_run preempt -1066 2...2 500004us : tracer_preempt_on <-preemptirq_delay_run preempt -1066 2...2 500012us : <stack trace> => kthread => ret_from_fork For the irqsoff tracer: echo irqsoff > /d/tracing/current_tracer sleep 1 insmod ./preemptirq_delay_test.ko test_mode=irq delay=500000 sleep 1 bash-4.3# cat /d/tracing/trace irq dis -1069 1d..1 0us@: preemptirq_delay_run irq dis -1069 1d..1 500001us : preemptirq_delay_run irq dis -1069 1d..1 500002us : tracer_hardirqs_on <-preemptirq_delay_run irq dis -1069 1d..1 500005us : <stack trace> => ret_from_fork Link: http://lkml.kernel.org/r/20180712213611.GA8743@joelaf.mtv.corp.google.com Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Julia Cartwright <julia@ni.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Glexiner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [ Erick is a co-developer of this commit ] Signed-off-by: Erick Reyes <erickreyes@google.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
| * | tracing/irqsoff: Split reset into separate functionsJoel Fernandes (Google)2018-07-261-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split reset functions into seperate functions in preparation of future patches that need to do tracer specific reset. Link: http://lkml.kernel.org/r/20180628182149.226164-4-joel@joelfernandes.org Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | | Merge tag 'printk-for-4.19' of ↵Linus Torvalds2018-08-151-1/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Different vendors have a different expectation about a console quietness. Make it configurable to reduce bike-shedding about the upstream default - Decide about the message visibility when the message is stored. It avoids races caused by a delayed console handling - Always store printk() messages into the per-CPU buffers again in NMI. The only exception is when flushing trace log in panic(). There the risk of loosing messages is worth an eventual reordering - Handle invalid %pO printf modifiers correctly - Better handle %p printf modifier tests before crng is initialized - Some clean up * tag 'printk-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: lib/vsprintf: Do not handle %pO[^F] as %px printk: Fix warning about unused suppress_message_printing printk/nmi: Prevent deadlock when accessing the main log buffer in NMI printk: Create helper function to queue deferred console handling printk: Split the code for storing a message into the log buffer printk: Clean up syslog_print_all() printk: Remove unnecessary kmalloc() from syslog during clear printk: Make CONSOLE_LOGLEVEL_QUIET configurable printk: make sure to print log on console. lib/test_printf.c: accept "ptrval" as valid result for plain 'p' tests
| * | | printk/nmi: Prevent deadlock when accessing the main log buffer in NMIPetr Mladek2018-07-091-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") brought back the possible deadlocks in printk() and NMI. The check of logbuf_lock is done only in printk_nmi_enter() to prevent mixed output. But another CPU might take the lock later, enter NMI, and: + Both NMIs might be serialized by yet another lock, for example, the one in nmi_cpu_backtrace(). + The other CPU might get stopped in NMI, see smp_send_stop() in panic(). The only safe solution is to use trylock when storing the message into the main log-buffer. It might cause reordering when some lines go to the main lock buffer directly and others are delayed via the per-CPU buffer. It means that it is not useful in general. This patch replaces the problematic NMI deferred context with NMI direct context. It can be used to mark a code that might produce many messages in NMI and the risk of losing them is more critical than problems with eventual reordering. The context is then used when dumping trace buffers on oops. It was the primary motivation for the original fix. Also the reordering is even smaller issue there because some traces have their own time stamps. Finally, nmi_cpu_backtrace() need not longer be serialized because it will always us the per-CPU buffers again. Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com To: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
* | | | Merge tag 'docs-4.19' of git://git.lwn.net/linuxLinus Torvalds2018-08-141-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation update from Jonathan Corbet: "This was a moderately busy cycle for docs, with the usual collection of small fixes and updates. We also have new ktime_get_*() docs from Arnd, some kernel-doc fixes, a new set of Italian translations (non so se vale la pena, ma non fa male - speriamo bene), and some extensive early memory-management documentation improvements from Mike Rapoport" * tag 'docs-4.19' of git://git.lwn.net/linux: (52 commits) Documentation: corrections to console/console.txt Documentation: add ioctl number entry for v4l2-subdev.h Remove gendered language from management style documentation scripts/kernel-doc: Escape all literal braces in regexes docs/mm: add description of boot time memory management docs/mm: memblock: add overview documentation docs/mm: memblock: add kernel-doc description for memblock types docs/mm: memblock: add kernel-doc comments for memblock_add[_node] docs/mm: memblock: update kernel-doc comments mm/memblock: add a name for memblock flags enumeration docs/mm: bootmem: add overview documentation docs/mm: bootmem: add kernel-doc description of 'struct bootmem_data' docs/mm: bootmem: fix kernel-doc warnings docs/mm: nobootmem: fixup kernel-doc comments mm/bootmem: drop duplicated kernel-doc comments Documentation: vm.txt: Adding 'nr_hugepages_mempolicy' parameter description. doc:it_IT: translation for kernel-hacking docs: Fix the reference labels in Locking.rst doc: tracing: Fix a typo of trace_stat mm: Introduce new type vm_fault_t ...
| * | | | doc: tracing: Fix a typo of trace_statMasami Hiramatsu2018-07-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The name of the directory for per-cpu function statistics is trace_stat, not trace_stats. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
| * | | | docs: histogram.txt: convert it to ReST file formatMauro Carvalho Chehab2018-07-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Despite being mentioned at Documentation/trace/ftrace.rst as a rst file, this file was still a text one, with several issues. Convert it to ReST and add it to the trace index: - Mark the document title as such; - Identify and indent the literal blocks; - Use the proper markups for table. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
* | | | | Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-blockLinus Torvalds2018-08-141-3/+3
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: "First pull request for this merge window, there will also be a followup request with some stragglers. This pull request contains: - Fix for a thundering heard issue in the wbt block code (Anchal Agarwal) - A few NVMe pull requests: * Improved tracepoints (Keith) * Larger inline data support for RDMA (Steve Wise) * RDMA setup/teardown fixes (Sagi) * Effects log suppor for NVMe target (Chaitanya Kulkarni) * Buffered IO suppor for NVMe target (Chaitanya Kulkarni) * TP4004 (ANA) support (Christoph) * Various NVMe fixes - Block io-latency controller support. Much needed support for properly containing block devices. (Josef) - Series improving how we handle sense information on the stack (Kees) - Lightnvm fixes and updates/improvements (Mathias/Javier et al) - Zoned device support for null_blk (Matias) - AIX partition fixes (Mauricio Faria de Oliveira) - DIF checksum code made generic (Max Gurtovoy) - Add support for discard in iostats (Michael Callahan / Tejun) - Set of updates for BFQ (Paolo) - Removal of async write support for bsg (Christoph) - Bio page dirtying and clone fixups (Christoph) - Set of bcache fix/changes (via Coly) - Series improving blk-mq queue setup/teardown speed (Ming) - Series improving merging performance on blk-mq (Ming) - Lots of other fixes and cleanups from a slew of folks" * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits) blkcg: Make blkg_root_lookup() work for queues in bypass mode bcache: fix error setting writeback_rate through sysfs interface null_blk: add lock drop/acquire annotation Blk-throttle: reduce tail io latency when iops limit is enforced block: paride: pd: mark expected switch fall-throughs block: Ensure that a request queue is dissociated from the cgroup controller block: Introduce blk_exit_queue() blkcg: Introduce blkg_root_lookup() block: Remove two superfluous #include directives blk-mq: count the hctx as active before allocating tag block: bvec_nr_vecs() returns value for wrong slab bcache: trivial - remove tailing backslash in macro BTREE_FLAG bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section bcache: set max writeback rate when I/O request is idle bcache: add code comments for bset.c bcache: fix mistaken comments in request.c bcache: fix mistaken code comments in bcache.h bcache: add a comment in super.c bcache: avoid unncessary cache prefetch bch_btree_node_get() bcache: display rate debug parameters to 0 when writeback is not running ...
| * | | | Merge tag 'v4.18-rc6' into for-4.19/block2Jens Axboe2018-08-053-7/+12
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a merge conflict down the line. Signed-of-by: Jens Axboe <axboe@kernel.dk>
| * | | | | Blktrace: bail out early if block debugfs is not configuredLiu Bo2018-07-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since @blk_debugfs_root couldn't be configured dynamically, we can save a few memory allocation if it's not there. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar2018-08-024-7/+48
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | tracing: Quiet gcc warning about maybe unused link variableSteven Rostedt (VMware)2018-07-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 57ea2a34adf4 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") added an if statement that depends on another if statement that gcc doesn't see will initialize the "link" variable and gives the warning: "warning: 'link' may be used uninitialized in this function" It is really a false positive, but to quiet the warning, and also to make sure that it never actually is used uninitialized, initialize the "link" variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler thinks it could be used uninitialized. Cc: stable@vger.kernel.org Fixes: 57ea2a34adf4 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>