linux-stable.git - Linux kernel stable tree

	Commit message	Author	Age	Files	Lines
*	workqueue: Track and monitor per-workqueue CPU time usage	Tejun Heo	2023-05-17	1	-19/+19
	Now that wq_worker_tick() is there, we can easily track the rough CPU time consumption of each workqueue by charging the whole tick whenever a tick hits an active workqueue. While not super accurate, it provides reasonable visibility into the workqueues that consume a lot of CPU cycles. wq_monitor.py is updated to report the per-workqueue CPU times.

	v2: wq_monitor.py was using "cputime" as the key when outputting in json format. Use "cpu_time" instead for consistency with other fields.

	Signed-off-by: Tejun Heo <tj@kernel.org>
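The whole-tick charging described above can be sketched as a small user-space model. This is only an illustration, not the kernel code: `charge_ticks`, the 4000us tick period, and the workqueue labels are assumptions chosen for the example.

```python
from collections import defaultdict

# Assumed tick period (250 Hz). The real value depends on the kernel config.
TICK_US = 4000


def charge_ticks(tick_samples, tick_us=TICK_US):
    """Charge one whole tick to whichever workqueue is active at each tick.

    tick_samples: iterable of workqueue labels, or None when no work item
    was running at the moment the tick fired.
    Returns a dict mapping label -> estimated CPU time in microseconds.
    """
    cpu_time = defaultdict(int)
    for wq in tick_samples:
        if wq is not None:
            # The whole tick is charged, even if the work item only ran
            # for part of it. This is why the figure is only approximate.
            cpu_time[wq] += tick_us
    return dict(cpu_time)


# Hypothetical sample: which workqueue (if any) each of five ticks hit.
samples = ["events", None, "events", "kblockd", None]
usage = charge_ticks(samples)
print(usage)
```

A workqueue that runs many short work items between ticks is undercounted by this scheme, while one that happens to be running when a tick fires is overcounted, which matches the "not super accurate" caveat in the message.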
*	workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE	Tejun Heo	2023-05-17	1	-19/+19
	If a per-cpu work item hogs the CPU, it can prevent other work items from starting through concurrency management. A per-cpu workqueue which intends to host such CPU-hogging work items can choose to not participate in concurrency management by setting %WQ_CPU_INTENSIVE; however, this can be error-prone and difficult to debug when missed.

	This patch adds an automatic CPU usage based detection. If a concurrency-managed work item consumes more CPU time than the threshold (10ms by default) continuously without intervening sleeps, wq_worker_tick() which is called from scheduler_tick() will detect the condition and automatically mark it CPU_INTENSIVE.

	The mechanism isn't foolproof:

	* Detection depends on tick hitting the work item. Getting preempted at the right timings may allow a violating work item to evade detection at least temporarily.
	* nohz_full CPUs may not be running ticks and thus can fail detection.
	* Even when detection is working, the 10ms detection delays can add up if many CPU-hogging work items are queued at the same time.

	However, in the vast majority of cases, this should be able to detect violations reliably and provide reasonable protection with a small increase in code complexity.

	If some work items trigger this condition repeatedly, the bigger problem likely is the CPU being saturated with such per-cpu work items and the solution would be making them UNBOUND. The next patch will add a debug mechanism to help spot such cases.

	v4: Documentation for workqueue.cpu_intensive_thresh_us added to kernel-parameters.txt.
	v3: Switch to use wq_worker_tick() instead of hooking into preemptions as suggested by Peter.
	v2: Lai pointed out that wq_worker_stopping() also needs to be called from preemption and rtlock paths and an earlier patch was updated accordingly. This patch adds a comment describing the risk of infinite recursions and how they're avoided.

	Signed-off-by: Tejun Heo <tj@kernel.org>
	Acked-by: Peter Zijlstra <peterz@infradead.org>
	Cc: Linus Torvalds <torvalds@linux-foundation.org>
	Cc: Lai Jiangshan <jiangshanlai@gmail.com>
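The detection rule above (continuous CPU use past a threshold, reset by sleeps, checked only at ticks) can be modelled roughly as follows. This is a simplified sketch and not the kernel implementation: `WorkItem`, `replay`, and the 4000us tick period are hypothetical, and the model charges whole ticks rather than reading real scheduler runtime. Only the 10ms default threshold comes from the commit message.

```python
from dataclasses import dataclass

THRESH_US = 10_000  # 10ms default threshold, as stated in the commit message


@dataclass
class WorkItem:
    """Hypothetical model of a running work item, not a kernel structure."""
    name: str
    runtime_us: int = 0          # CPU time consumed since the last sleep
    cpu_intensive: bool = False  # set once the threshold is exceeded


def replay(work, events, tick_us=4000, thresh_us=THRESH_US):
    """Replay "tick" and "sleep" events against one work item.

    Returns the number of ticks seen when the item was marked
    CPU_INTENSIVE, or None if it was never marked.
    """
    ticks = 0
    for ev in events:
        if ev == "sleep":
            # An intervening sleep restarts the continuous-runtime window.
            work.runtime_us = 0
        elif ev == "tick":
            ticks += 1
            work.runtime_us += tick_us
            # Detection can only happen when a tick hits the work item.
            if work.runtime_us >= thresh_us:
                work.cpu_intensive = True
                return ticks
    return None


hog = WorkItem("hog")
polite = WorkItem("polite")
hog_marked_at = replay(hog, ["tick"] * 5)
polite_marked_at = replay(polite, ["tick", "tick", "sleep", "tick", "tick", "sleep"])
print(hog_marked_at, hog.cpu_intensive, polite_marked_at, polite.cpu_intensive)
```

Because the check runs only on "tick" events, a CPU that delivers no ticks never marks anything in this model, which mirrors the nohz_full caveat in the list above.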
*	workqueue: Add pwq->stats[] and a monitoring script	Tejun Heo	2023-05-17	1	-0/+32
	Currently, the only way to peer into workqueue operations is through tracing. While possible, it isn't easy or convenient to monitor per-workqueue behaviors over time this way. Let's add pwq->stats[] that track relevant events and a drgn monitoring script - tools/workqueue/wq_monitor.py.

	It's arguable whether this needs to be configurable. However, it currently only has several counters and the runtime overhead shouldn't be noticeable given that they're on pwq's which are per-cpu on per-cpu workqueues and per-numa-node on unbound ones. Let's keep it simple for the time being.

	v2: Patch reordered to earlier with fewer fields. Fields will be added back gradually. Help message improved.

	Signed-off-by: Tejun Heo <tj@kernel.org>
	Cc: Lai Jiangshan <jiangshanlai@gmail.com>
*	docs: ftrace: always use canonical ftrace path	Ross Zwisler	2023-01-31	1	-2/+2
	The canonical location for the tracefs filesystem is at /sys/kernel/tracing.

	But, from Documentation/trace/ftrace.rst:

	  Before 4.1, all ftrace tracing control files were within the debugfs file system, which is typically located at /sys/kernel/debug/tracing. For backward compatibility, when mounting the debugfs file system, the tracefs file system will be automatically mounted at:

	  /sys/kernel/debug/tracing

	Many parts of Documentation still reference this older debugfs path, so let's update them to avoid confusion.

	Signed-off-by: Ross Zwisler <zwisler@google.com>
	Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
	Link: https://lore.kernel.org/r/20230125213251.2013791-1-zwisler@google.com
	Signed-off-by: Jonathan Corbet <corbet@lwn.net>
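A script that has to cope with both locations could prefer the canonical path and fall back to the debugfs one. The sketch below assumes nothing beyond the two paths quoted above; `find_tracefs` and its `is_dir` parameter are hypothetical names introduced for the example.

```python
import os

CANONICAL = "/sys/kernel/tracing"        # canonical tracefs mount point
LEGACY = "/sys/kernel/debug/tracing"     # older debugfs automount location


def find_tracefs(is_dir=os.path.isdir):
    """Return the first tracefs directory that exists, canonical first.

    is_dir is injectable so the lookup can be exercised without a real
    tracefs mount. Returns None when neither path is present.
    """
    for path in (CANONICAL, LEGACY):
        if is_dir(path):
            return path
    return None


print(find_tracefs())
```

On a machine without tracefs mounted, or where the caller lacks permission to see the mount, this prints `None`.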
*	workqueue: doc: Call out the non-reentrance conditions	Boqun Feng	2021-10-25	1	-4/+17
	The current doc of workqueue API suggests that work items are non-reentrant: any work item is guaranteed to be executed by at most one worker system-wide at any given time. However, this is not true. The following case can cause a work item W to be executed by two workers at the same time:

	        queue_work_on(0, WQ1, W);
	        // after a worker picks up W and clears the pending bit
	        queue_work_on(1, WQ2, W);
	        // workers on CPU0 and CPU1 will execute W at the same time.

	This means the non-reentrance of a work item is conditional, and Lai Jiangshan provided a nice summary[1] of the conditions. Therefore, use it to describe a work item instance and improve the doc.

	[1]: https://lore.kernel.org/lkml/CAJhGHyDudet_xyNk=8xnuO2==o-u06s0E0GZVP4Q67nmQ84Ceg@mail.gmail.com/

	Suggested-by: Matthew Wilcox <willy@infradead.org>
	Suggested-by: Tejun Heo <tj@kernel.org>
	Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
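The sequence above can be replayed in a toy model to see why the pending bit alone does not prevent concurrent execution. Everything here is an assumption made for illustration: `Work`, `queue_work_on`, and `worker_pick_up` are simplified Python stand-ins, not the kernel functions, and the model ignores the kernel's other checks.

```python
class Work:
    """Toy work item: one pending flag plus the set of places running it."""

    def __init__(self):
        self.pending = False
        self.running_on = set()  # (cpu, workqueue label) pairs


def queue_work_on(cpu, wq, work, queue):
    """Queue the work item unless it is already pending (simplified)."""
    if work.pending:
        return False
    work.pending = True
    queue.append((cpu, wq, work))
    return True


def worker_pick_up(queue):
    """A worker takes the oldest entry and clears the pending flag."""
    cpu, wq, work = queue.pop(0)
    work.pending = False
    work.running_on.add((cpu, wq))


queue = []
w = Work()
first = queue_work_on(0, "WQ1", w, queue)
worker_pick_up(queue)                      # W now runs on CPU0, pending cleared
second = queue_work_on(1, "WQ2", w, queue)  # accepted, since pending is clear
worker_pick_up(queue)                      # W now also runs on CPU1
print(sorted(w.running_on))
```

In this model the second queueing succeeds only because the first worker has already cleared the pending flag. Queueing twice before any pick-up would be rejected.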
*	docs: basics.rst: move kernel-doc workqueue markups to workqueue.rst	Mauro Carvalho Chehab	2020-10-15	1	-0/+2
	As there's already a rst file with workqueue markups, containing part of them, move the other definitions, in order to avoid warnings with Sphinx.

	Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
*	Documentation: core-api: minor workqueue.rst cleanups	Randy Dunlap	2017-09-18	1	-6/+6
	Clean up workqueue.rst:
	- fix minor typos
	- put '@' after `` instead of preceding them (one place)
	- use "CPU" instead of "cpu" in text consistently
	- quote one function name

	Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
	Cc: Tejun Heo <tj@kernel.org>
	Cc: Florian Mickler <florian@mickler.org>
	Signed-off-by: Tejun Heo <tj@kernel.org>
*	workqueue: doc change for ST behavior on NUMA systems	Alexei Potashnik	2017-07-18	1	-3/+7
	NUMA rework of workqueue made the combination of max_active of 1 and WQ_UNBOUND insufficient to guarantee ST behavior system wide. alloc_ordered_queue should now be used instead.

	Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
	Signed-off-by: Tejun Heo <tj@kernel.org>
*	Documentation/workqueue.txt: convert to ReST markup	Silvio Fricke	2016-10-28	1	-0/+394
	... and move to Documentation/core-api folder.

	Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
	Signed-off-by: Jonathan Corbet <corbet@lwn.net>