| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward. Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS. This code will go to file watchdog_hld.c.
Also update the makefile accordigly.
Link: http://lkml.kernel.org/r/1478034826-43888-3-git-send-email-babu.moger@oracle.com
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Josh Hunt <johunt@akamai.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch series "Clean up watchdog handlers", v2.
This is an attempt to cleanup watchdog handlers. Right now,
kernel/watchdog.c implements both softlockup and hardlockup detectors.
Softlockup code is generic. Hardlockup code is arch specific. Some
architectures don't use hardlockup detectors. They use their own
watchdog detectors. To make both these combination work, we have
numerous #ifdefs in kernel/watchdog.c.
We are trying here to make these handlers independent of each other.
Also provide an interface for architectures to implement their own
handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined
as weak such that architectures can override its definitions.
Thanks to Don Zickus for his suggestions.
Here are our previous discussions
http://www.spinics.net/lists/sparclinux/msg16543.html
http://www.spinics.net/lists/sparclinux/msg16441.html
This patch (of 3):
Move shared macros and definitions to nmi.h so that watchdog.c, new file
watchdog_hld.c or any other architecture specific handler can use those
definitions.
Link: http://lkml.kernel.org/r/1478034826-43888-2-git-send-email-babu.moger@oracle.com
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Josh Hunt <johunt@akamai.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OpenRISC compiler (so far) fails to optimize away a large portion of
code containing a reference to posix_timer_event in alarmtimer.c when
CONFIG_POSIX_TIMERS is unset. Let's give it a direct clue to let the
build succeed.
This fixes
[linux-next:master 6682/7183] alarmtimer.c:undefined reference to `posix_timer_event'
reported by kbuild test robot.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kdb_trap_printk allows to pass normal printk() messages to kdb via
vkdb_printk(). For example, it is used to get backtrace using the
classic show_stack(), see kdb_show_stack().
vkdb_printf() tries to avoid a potential infinite loop by disabling the
trap. But this approach is racy, for example:
CPU1 CPU2
vkdb_printf()
// assume that kdb_trap_printk == 0
saved_trap_printk = kdb_trap_printk;
kdb_trap_printk = 0;
kdb_show_stack()
kdb_trap_printk++;
Problem1: Now, a nested printk() on CPU0 calls vkdb_printf()
even when it should have been disabled. It will not
cause a deadlock but...
// using the outdated saved value: 0
kdb_trap_printk = saved_trap_printk;
kdb_trap_printk--;
Problem2: Now, kdb_trap_printk == -1 and will stay like this.
It means that all messages will get passed to kdb from
now on.
This patch removes the racy saved_trap_printk handling. Instead, the
recursion is prevented by a check for the locked CPU.
The solution is still kind of racy. A non-related printk(), from
another process, might get trapped by vkdb_printf(). And the wanted
printk() might not get trapped because kdb_printf_cpu is assigned. But
this problem existed even with the original code.
A proper solution would be to get_cpu() before setting kdb_trap_printk
and trap messages only from this CPU. I am not sure if it is worth the
effort, though.
In fact, the race is very theoretical. When kdb is running any of the
commands that use kdb_trap_printk there is a single active CPU and the
other CPUs should be in a holding pen inside kgdb_cpu_enter().
The only time this is violated is when there is a timeout waiting for
the other CPUs to report to the holding pen.
Finally, note that the situation is a bit schizophrenic. vkdb_printf()
explicitly allows recursion but only from KDB code that calls
kdb_printf() directly. On the other hand, the generic printk()
recursion is not allowed because it might cause an infinite loop. This
is why we could not hide the decision inside vkdb_printf() easily.
Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kdb_printf_lock does not prevent other CPUs from entering the critical
section because it is ignored when KDB_STATE_PRINTF_LOCK is set.
The problematic situation might look like:
CPU0 CPU1
vkdb_printf()
if (!KDB_STATE(PRINTF_LOCK))
KDB_STATE_SET(PRINTF_LOCK);
spin_lock_irqsave(&kdb_printf_lock, flags);
vkdb_printf()
if (!KDB_STATE(PRINTF_LOCK))
BANG: The PRINTF_LOCK state is set and CPU1 is entering the critical
section without spinning on the lock.
The problem is that the code tries to implement locking using two state
variables that are not handled atomically. Well, we need a custom
locking because we want to allow reentering the critical section on the
very same CPU.
Let's use solution from Petr Zijlstra that was proposed for a similar
scenario, see
https://lkml.kernel.org/r/20161018171513.734367391@infradead.org
This patch uses the same trick with cmpxchg(). The only difference is
that we want to handle only recursion from the same context and
therefore we disable interrupts.
In addition, KDB_STATE_PRINTF_LOCK is removed. In fact, we are not able
to set it a non-racy way.
Link: http://lkml.kernel.org/r/1480412276-16690-3-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kdb_event state variable is only set but never checked in the kernel
code.
http://www.spinics.net/lists/kdb/msg01733.html suggests that this
variable affected WARN_CONSOLE_UNLOCKED() in the original
implementation. But this check never went upstream.
The semantic is unclear and racy. The value is updated after the
kdb_printf_lock is acquired and after it is released. It should be
symmetric at minimum. The value should be manipulated either inside or
outside the locked area.
Fortunately, it seems that the original function is gone and we could
simply remove the state variable.
Link: http://lkml.kernel.org/r/1480412276-16690-2-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've got a delay loop waiting for secondary CPUs. That loop uses
loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures. It is quite
common to see things like this in the boot log:
Calibrating delay loop (skipped), value calculated using timer
frequency.. 48.00 BogoMIPS (lpj=24000)
In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.
Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout. We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.
[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is fragile that some definitions acquired via transitive
dependencies, as shown in below:
atomic_* (<linux/atomic.h>)
ENOMEM/EN* (<linux/errno.h>)
EXPORT_SYMBOL (<linux/export.h>)
device_initcall (<linux/init.h>)
preempt_* (<linux/preempt.h>)
Include them to prevent possible issues.
Link: http://lkml.kernel.org/r/1481163221-40170-1-git-send-email-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Smatch complains that we started using the array offset before we
checked that it was valid.
Fixes: 017c59c042d0 ('relay: Use per CPU constructs for the relay channel buffer pointers')
Link: http://lkml.kernel.org/r/20161013084947.GC16198@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not break lines while printk()ing values.
kernel: warning: process `tomoyo_file_tes' used the deprecated sysctl system call with
kernel: 3.
kernel: 5.
kernel: 56.
kernel:
Link: http://lkml.kernel.org/r/1480814833-4976-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A soft lookup will occur when I run trinity in syscall kexec_load. the
corresponding stack information is as follows.
BUG: soft lockup - CPU#6 stuck for 22s! [trinity-c6:13859]
Kernel panic - not syncing: softlockup: hung tasks
CPU: 6 PID: 13859 Comm: trinity-c6 Tainted: G O L ----V------- 3.10.0-327.28.3.35.zhongjiang.x86_64 #1
Hardware name: Huawei Technologies Co., Ltd. Tecal BH622 V2/BC01SRSA0, BIOS RMIBV386 06/30/2014
Call Trace:
<IRQ> dump_stack+0x19/0x1b
panic+0xd8/0x214
watchdog_timer_fn+0x1cc/0x1e0
__hrtimer_run_queues+0xd2/0x260
hrtimer_interrupt+0xb0/0x1e0
? call_softirq+0x1c/0x30
local_apic_timer_interrupt+0x37/0x60
smp_apic_timer_interrupt+0x3f/0x60
apic_timer_interrupt+0x6d/0x80
<EOI> ? kimage_alloc_control_pages+0x80/0x270
? kmem_cache_alloc_trace+0x1ce/0x1f0
? do_kimage_alloc_init+0x1f/0x90
kimage_alloc_init+0x12a/0x180
SyS_kexec_load+0x20a/0x260
system_call_fastpath+0x16/0x1b
the first time allocation of control pages may take too much time
because crash_res.end can be set to a higher value. we need to add
cond_resched to avoid the issue.
The patch have been tested and above issue is not appear.
Link: http://lkml.kernel.org/r/1481164674-42775-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Xunlei Pang <xpang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently in x86_64, the symbol address of phys_base is exported to
vmcoreinfo. Dave Anderson complained this is really useless for his
Crash implementation. Because in user-space utility Crash and
Makedumpfile which exported vmcore information is mainly used for, value
of phys_base is needed to covert virtual address of exported kernel
symbol to physical address. Especially init_level4_pgt, if we want to
access and go over the page table to look up a PA corresponding to VA,
firstly we need calculate
page_dir = SYMBOL(init_level4_pgt) - __START_KERNEL_map + phys_base;
Now in Crash and Makedumpfile, we have to analyze the vmcore elf program
header to get value of phys_base. As Dave said, it would be preferable
if it were readily availabl in vmcoreinfo rather than depending upon the
PT_LOAD semantics.
Hence in this patch change to export the value of phys_base instead of
its virtual address.
And people also complained that KERNEL_IMAGE_SIZE exporting is x86_64
only, should be moved into arch dependent function
arch_crash_save_vmcoreinfo. Do the moving in this patch.
Link: http://lkml.kernel.org/r/1478568596-30060-2-git-send-email-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Xunlei Pang <xlpang@redhat.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Eugene Surovegin <surovegin@google.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was amused to find "unsafe core_pattern" warning having these lines in
/etc/sysctl.conf:
fs.suid_dumpable=2
kernel.core_pattern=/core/core-%e-%p-%E
kernel.core_uses_pid=0
Turns out kernel is formally right. Default core_pattern is just "core",
which doesn't qualify for secure path while setting suid.dumpable.
Hint admins about solution, clarify sysctl names, delete unnecessary '\'
characters (string literals are concatenated regardless) and reformat for
easier grepping.
Link: http://lkml.kernel.org/r/20161029152124.GA1258@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.
9.30% 9.30% 905387 dataserver /proc/kcore 0x7fff8163f4d2
[k] _raw_spin_lock_irq
|
---_raw_spin_lock_irq
|
|--99.34%-- __set_current_blocked
| sigprocmask
| sys_rt_sigprocmask
| system_call_fastpath
| |
| |--50.63%-- __swapcontext
| | |
| | |--99.91%-- upsleepgeneric
| |
| |--49.36%-- __setcontext
| | ktskRun
Looking further into the swapcontext function in glibc, it was found that
the function always call sigprocmask() without checking if there are
changes in the signal mask.
A check was added to the __set_current_blocked() function to avoid taking
the sighand->siglock spinlock if there is no change in the signal mask.
This will prevent unneeded spinlock contention when many threads are
trying to call sigprocmask().
With this patch applied, the spinlock contention in sigprocmask() was
gone.
Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ.
Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP
pointing to the code interrupted by IRQ which was interrupted by NMI.
NULL isn't a problem: in this case watchdog calls dump_stack() and
prints full stack trace including NMI. But if we're stuck in IRQ
handler then NMI watchlog will print stack trace without IRQ part at
all.
This patch uses registers snapshot passed into NMI handler as arguments:
these registers point exactly to the instruction interrupted by NMI.
Fixes: 55537871ef66 ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull workqueue updates from Tejun Heo:
"Mostly patches to initialize workqueue subsystem earlier and get rid
of keventd_up().
The patches were headed for the last merge cycle but got delayed due
to a bug found late minute, which is fixed now.
Also, to help debugging, destroy_workqueue() is more chatty now on a
sanity check failure."
* 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: move wq_numa_init() to workqueue_init()
workqueue: remove keventd_up()
debugobj, workqueue: remove keventd_up() usage
slab, workqueue: remove keventd_up() usage
power, workqueue: remove keventd_up() usage
tty, workqueue: remove keventd_up() usage
mce, workqueue: remove keventd_up() usage
workqueue: make workqueue available early during boot
workqueue: dump workqueue state on sanity check failures in destroy_workqueue()
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
While splitting up workqueue initialization into two parts,
ac8f73400782 ("workqueue: make workqueue available early during boot")
put wq_numa_init() into workqueue_init_early(). Unfortunately, on
some archs including power and arm64, cpu to node mapping isn't yet
established by the time the early init is called leading to incorrect
NUMA initialization and subsequently the following oops due to zero
cpumask on node-specific unbound pools.
Unable to handle kernel paging request for data at address 0x00000038
Faulting instruction address: 0xc0000000000fc0cc
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-compiler_gcc-6.2.0-next-20161005 #94
task: c0000007f5400000 task.stack: c000001ffc084000
NIP: c0000000000fc0cc LR: c0000000000ed928 CTR: c0000000000fbfd0
REGS: c000001ffc087780 TRAP: 0300 Not tainted (4.8.0-compiler_gcc-6.2.0-next-20161005)
MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 48000424 XER: 00000000
CFAR: c0000000000089dc DAR: 0000000000000038 DSISR: 40000000 SOFTE: 0
GPR00: c0000000000ed928 c000001ffc087a00 c000000000e63200 c000000010d6d600
GPR04: c0000007f5409200 0000000000000021 000000000748e08c 000000000000001f
GPR08: 0000000000000000 0000000000000021 000000000748f1f8 0000000000000000
GPR12: 0000000028000422 c00000000fb80000 c00000000000e0c8 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000021 0000000000000001
GPR20: ffffffffafb50401 0000000000000000 c000000010d6d600 000000000000ba7e
GPR24: 000000000000ba7e c000000000d8bc58 afb504000afb5041 0000000000000001
GPR28: 0000000000000000 0000000000000004 c0000007f5409280 0000000000000000
NIP [c0000000000fc0cc] enqueue_task_fair+0xfc/0x18b0
LR [c0000000000ed928] activate_task+0x78/0xe0
Call Trace:
[c000001ffc087a00] [c0000007f5409200] 0xc0000007f5409200 (unreliable)
[c000001ffc087b10] [c0000000000ed928] activate_task+0x78/0xe0
[c000001ffc087b50] [c0000000000ede58] ttwu_do_activate+0x68/0xc0
[c000001ffc087b90] [c0000000000ef1b8] try_to_wake_up+0x208/0x4f0
[c000001ffc087c10] [c0000000000d3484] create_worker+0x144/0x250
[c000001ffc087cb0] [c000000000cd72d0] workqueue_init+0x124/0x150
[c000001ffc087d00] [c000000000cc0e74] kernel_init_freeable+0x158/0x360
[c000001ffc087dc0] [c00000000000e0e4] kernel_init+0x24/0x160
[c000001ffc087e30] [c00000000000bfa0] ret_from_kernel_thread+0x5c/0xbc
Instruction dump:
62940401 3b800000 3aa00000 7f17c378 3a600001 3b600001 60000000 60000000
60420000 72490021 ebfe0150 2f890001 <ebbf0038> 419e0de0 7fbee840 419e0e58
---[ end trace 0000000000000000 ]---
Fix it by moving wq_numa_init() to workqueue_init(). As this means
that the early intialization may not have full NUMA info for per-cpu
pools and ignores NUMA affinity for unbound pools, fix them up from
workqueue_init() after wq_numa_init().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/87twck5wqo.fsf@concordia.ellerman.id.au
Fixes: ac8f73400782 ("workqueue: make workqueue available early during boot")
Signed-off-by: Tejun Heo <tj@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
keventd_up() no longer has in-kernel users. Remove it and make
wq_online static.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Now that workqueue can handle work item queueing/cancelling from very
early during boot, there is no need to gate cancel_delayed_work_sync()
while !keventd_up(). Remove it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Qiao Zhou <qiaozhou@asrmicro.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Workqueue is currently initialized in an early init call; however,
there are cases where early boot code has to be split and reordered to
come after workqueue initialization or the same code path which makes
use of workqueues is used both before workqueue initailization and
after. The latter cases have to gate workqueue usages with
keventd_up() tests, which is nasty and easy to get wrong.
Workqueue usages have become widespread and it'd be a lot more
convenient if it can be used very early from boot. This patch splits
workqueue initialization into two steps. workqueue_init_early() which
sets up the basic data structures so that workqueues can be created
and work items queued, and workqueue_init() which actually brings up
workqueues online and starts executing queued work items. The former
step can be done very early during boot once memory allocation,
cpumasks and idr are initialized. The latter right after kthreads
become available.
This allows work item queueing and canceling from very early boot
which is what most of these use cases want.
* As systemd_wq being initialized doesn't indicate that workqueue is
fully online anymore, update keventd_up() to test wq_online instead.
The follow-up patches will get rid of all its usages and the
function itself.
* Flushing doesn't make sense before workqueue is fully initialized.
The flush functions trigger WARN and return immediately before fully
online.
* Work items are never in-flight before fully online. Canceling can
always succeed by skipping the flush step.
* Some code paths can no longer assume to be called with irq enabled
as irq is disabled during early boot. Use irqsave/restore
operations instead.
v2: Watchdog init, which requires timer to be running, moved from
workqueue_init_early() to workqueue_init().
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/CA+55aFx0vPuMuxn00rBSM192n-Du5uxy+4AvKa0SBSOVJeuCGg@mail.gmail.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
destroy_workqueue() performs a number of sanity checks to ensure that
the workqueue is empty before proceeding with destruction. However,
it's not always easy to tell what's going on just from the warning
message. Let's dump workqueue state after sanity check failures to
help debugging.
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/CACT4Y+Zs6vkjHo9qHb4TrEiz3S4+quvvVQ9VWvj2Mx6pETGb9Q@mail.gmail.com
Cc: Dmitry Vyukov <dvyukov@google.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Again, cpufreq gets more changes than the other parts this time (one
new driver, one old driver less, a bunch of enhancements of the
existing code, new CPU IDs, fixes, cleanups)
There also are some changes in cpuidle (idle injection rework, a
couple of new CPU IDs, online/offline rework in intel_idle, fixes and
cleanups), in the generic power domains framework (mostly related to
supporting power domains containing CPUs), and in the Operating
Performance Points (OPP) library (mostly related to supporting devices
with multiple voltage regulators)
In addition to that, the system sleep state selection interface is
modified to make it easier for distributions with unchanged user space
to support suspend-to-idle as the default system suspend method, some
issues are fixed in the PM core, the latency tolerance PM QoS
framework is improved a bit, the Intel RAPL power capping driver is
cleaned up and there are some fixes and cleanups in the devfreq
subsystem
Specifics:
- New cpufreq driver for Broadcom STB SoCs and a Device Tree binding
for it (Markus Mayer)
- Support for ARM Integrator/AP and Integrator/CP in the generic DT
cpufreq driver and elimination of the old Integrator cpufreq driver
(Linus Walleij)
- Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier,
and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie,
Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik)
- cpufreq core fix to eliminate races that may lead to using inactive
policy objects and related cleanups (Rafael Wysocki)
- cpufreq schedutil governor update to make it use SCHED_FIFO kernel
threads (instead of regular workqueues) for doing delayed work (to
reduce the response latency in some cases) and related cleanups
(Viresh Kumar)
- New cpufreq sysfs attribute for resetting statistics (Markus Mayer)
- cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis,
Viresh Kumar)
- Support for using generic cpufreq governors in the intel_pstate
driver (Rafael Wysocki)
- Support for per-logical-CPU P-state limits and the EPP/EPB (Energy
Performance Preference/Energy Performance Bias) knobs in the
intel_pstate driver (Srinivas Pandruvada)
- New CPU ID for Knights Mill in intel_pstate (Piotr Luc)
- intel_pstate driver modification to use the P-state selection
algorithm based on CPU load on platforms with the system profile in
the ACPI tables set to "mobile" (Srinivas Pandruvada)
- intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki,
Srinivas Pandruvada)
- cpufreq powernv driver updates including fast switching support
(for the schedutil governor), fixes and cleanus (Akshay Adiga,
Andrew Donnellan, Denis Kirjanov)
- acpi-cpufreq driver rework to switch it over to the new CPU
offline/online state machine (Sebastian Andrzej Siewior)
- Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth
Prakash)
- Idle injection rework (to make it use the regular idle path instead
of a home-grown custom one) and related powerclamp thermal driver
updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej
Siewior)
- New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy
Shevchenko, Piotr Luc)
- intel_idle driver cleanups and switch over to using the new CPU
offline/online state machine (Anna-Maria Gleixner, Sebastian
Andrzej Siewior)
- cpuidle DT driver update to support suspend-to-idle properly
(Sudeep Holla)
- cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian,
Rafael Wysocki)
- Preliminary support for power domains including CPUs in the generic
power domains (genpd) framework and related DT bindings (Lina Iyer)
- Assorted fixes and cleanups in the generic power domains (genpd)
framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven)
- Preliminary support for devices with multiple voltage regulators
and related fixes and cleanups in the Operating Performance Points
(OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd)
- System sleep state selection interface rework to make it easier to
support suspend-to-idle as the default system suspend method
(Rafael Wysocki)
- PM core fixes and cleanups, mostly related to the interactions
between the system suspend and runtime PM frameworks (Ulf Hansson,
Sahitya Tummala, Tony Lindgren)
- Latency tolerance PM QoS framework imorovements (Andrew Lutomirski)
- New Knights Mill CPU ID for the Intel RAPL power capping driver
(Piotr Luc)
- Intel RAPL power capping driver fixes, cleanups and switch over to
using the new CPU offline/online state machine (Jacob Pan, Thomas
Gleixner, Sebastian Andrzej Siewior)
- Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc,
rockchip-dfi devfreq drivers and the devfreq core (Axel Lin,
Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar)
- Fix for false-positive KASAN warnings during resume from ACPI S3
(suspend-to-RAM) on x86 (Josh Poimboeuf)
- Memory map verification during resume from hibernation on x86 to
ensure a consistent address space layout (Chen Yu)
- Wakeup sources debugging enhancement (Xing Wei)
- rockchip-io AVS driver cleanup (Shawn Lin)"
* tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits)
devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks
devfreq: rk3399_dmc: Remove dangling rcu_read_unlock()
devfreq: exynos: Don't use OPP structures outside of RCU locks
Documentation: intel_pstate: Document HWP energy/performance hints
cpufreq: intel_pstate: Support for energy performance hints with HWP
cpufreq: intel_pstate: Add locking around HWP requests
PM / sleep: Print active wakeup sources when blocking on wakeup_count reads
PM / core: Fix bug in the error handling of async suspend
PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend
PM / Domains: Fix compatible for domain idle state
PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators()
PM / OPP: Allow platform specific custom set_opp() callbacks
PM / OPP: Separate out _generic_set_opp()
PM / OPP: Add infrastructure to manage multiple regulators
PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage()
PM / OPP: Manage supply's voltage/current in a separate structure
PM / OPP: Don't use OPP structure outside of rcu protected section
PM / OPP: Reword binding supporting multiple regulators per device
PM / OPP: Fix incorrect cpu-supply property in binding
cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
..
|
| | \ \ | |
| | \ \ | |
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
* pm-sleep:
PM / sleep: Print active wakeup sources when blocking on wakeup_count reads
x86/suspend: fix false positive KASAN warning on suspend/resume
PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag
PM / sleep: System sleep state selection interface rework
PM / hibernate: Verify the consistent of e820 memory map by md5 digest
* powercap:
powercap / RAPL: Add Knights Mill CPUID
powercap/intel_rapl: fix and tidy up error handling
powercap/intel_rapl: Track active CPUs internally
powercap/intel_rapl: Cleanup duplicated init code
powercap/intel rapl: Convert to hotplug state machine
powercap/intel_rapl: Propagate error code when registration fails
powercap/intel_rapl: Add missing domain data update on hotplug
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Modify the ACPI system sleep support setup code to select
suspend-to-idle as the default system sleep state if the
ACPI_FADT_LOW_POWER_S0 flag is set in the FADT and the
default sleep state was not selected from the kernel command
line.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
|
| |\ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
* pm-cpuidle:
cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
cpuidle: fix improper return value on error
intel_idle: Convert to hotplug state machine
intel_idle: Remove superfluous SMP fuction call
MAINTAINERS: Add Jacob Pan as a new intel_idle maintainer
MAINTAINERS: Add bug tracking system location entries for cpuidle
x86/intel_idle: Add Knights Mill CPUID
x86/intel_idle: Add CPU model 0x4a (Atom Z34xx series)
thermal/intel_powerclamp: stop sched tick in forced idle
thermal/intel_powerclamp: Convert to CPU hotplug state
thermal/intel_powerclamp: Convert the kthread to kthread worker API
thermal/intel_powerclamp: Remove duplicated code that starts the kthread
sched/idle: Add support for tasks that inject idle
cpuidle: Allow enforcing deepest idle state selection
cpuidle/powernv: staticise powernv_idle_driver
cpuidle: dt: assign ->enter_freeze to same as ->enter callback function
cpuidle: governors: Remove remaining old module code
|
| | |\ \ \ \ \ |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Idle injection drivers such as Intel powerclamp and ACPI PAD drivers use
realtime tasks to take control of CPU then inject idle. There are two
issues with this approach:
1. Low efficiency: injected idle task is treated as busy so sched ticks
do not stop during injected idle period, the result of these
unwanted wakeups can be ~20% loss in power savings.
2. Idle accounting: injected idle time is presented to user as busy.
This patch addresses the issues by introducing a new PF_IDLE flag which
allows any given task to be treated as idle task while the flag is set.
Therefore, idle injection tasks can run through the normal flow of NOHZ
idle enter/exit to get the correct accounting as well as tick stop when
possible.
The implication is that idle task is then no longer limited to PID == 0.
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| | | | |_|/ /
| | | |/| | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When idle injection is used to cap power, we need to override the
governor's choice of idle states.
For this reason, make it possible the deepest idle state selection to
be enforced by setting a flag on a given CPU to achieve the maximum
potential power draw reduction.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| |\ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
* pm-cpufreq: (51 commits)
Documentation: intel_pstate: Document HWP energy/performance hints
cpufreq: intel_pstate: Support for energy performance hints with HWP
cpufreq: intel_pstate: Add locking around HWP requests
cpufreq: ondemand: Set MIN_FREQUENCY_UP_THRESHOLD to 1
cpufreq: intel_pstate: Add Knights Mill CPUID
MAINTAINERS: Add bug tracking system location entry for cpufreq
cpufreq: dt: Add support for zx296718
cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage
cpufreq: acpi-cpufreq: Convert to hotplug state machine
cpufreq: intel_pstate: fix intel_pstate_exit_perf_limits() prototype
cpufreq: intel_pstate: Set EPP/EPB to 0 in performance mode
cpufreq: schedutil: Rectify comment in sugov_irq_work() function
cpufreq: intel_pstate: increase precision of performance limits
cpufreq: intel_pstate: round up min_perf limits
cpufreq: Make cpufreq_update_policy() void
ACPI / processor: Make acpi_processor_ppc_has_changed() void
cpufreq: Avoid using inactive policies
cpufreq: intel_pstate: Generic governors support
cpufreq: intel_pstate: Request P-states control from SMM if needed
cpufreq: dt: Add support for r8a7743 and r8a7745
...
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
This patch rectifies a comment present in sugov_irq_work() function to
follow proper grammar.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Execute the irq-work specific initialization/exit code only when the
fast path isn't available.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
If slow path frequency changes are conducted in a SCHED_OTHER context
then they may be delayed for some amount of time, including
indefinitely, when real time or deadline activity is taking place.
Move the slow path to a real time kernel thread. In the future the
thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
to 50 for now.
Hackbench results on ARM Exynos, dual core A15 platform for 10
iterations:
$ hackbench -s 100 -l 100 -g 10 -f 20
Before After
---------------------------------
1.808 1.603
1.847 1.251
2.229 1.590
1.952 1.600
1.947 1.257
1.925 1.627
2.694 1.620
1.258 1.621
1.919 1.632
1.250 1.240
Average:
1.8829 1.5041
Based on initial work by Steve Muckle.
Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The fast_switch_enabled flag will be used by both sugov_policy_alloc()
and sugov_policy_free() with a later patch.
Prepare for that by moving the calls to enable and disable it to the
beginning of sugov_init() and end of sugov_exit().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Switch to the more common practice of writing labels.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Pull block layer updates from Jens Axboe:
"This is the main block pull request this series. Contrary to previous
release, I've kept the core and driver changes in the same branch. We
always ended up having dependencies between the two for obvious
reasons, so makes more sense to keep them together. That said, I'll
probably try and keep more topical branches going forward, especially
for cycles that end up being as busy as this one.
The major parts of this pull request is:
- Improved support for O_DIRECT on block devices, with a small
private implementation instead of using the pig that is
fs/direct-io.c. From Christoph.
- Request completion tracking in a scalable fashion. This is utilized
by two components in this pull, the new hybrid polling and the
writeback queue throttling code.
- Improved support for polling with O_DIRECT, adding a hybrid mode
that combines pure polling with an initial sleep. From me.
- Support for automatic throttling of writeback queues on the block
side. This uses feedback from the device completion latencies to
scale the queue on the block side up or down. From me.
- Support from SMR drives in the block layer and for SD. From Hannes
and Shaun.
- Multi-connection support for nbd. From Josef.
- Cleanup of request and bio flags, so we have a clear split between
which are bio (or rq) private, and which ones are shared. From
Christoph.
- A set of patches from Bart, that improve how we handle queue
stopping and starting in blk-mq.
- Support for WRITE_ZEROES from Chaitanya.
- Lightnvm updates from Javier/Matias.
- Supoort for FC for the nvme-over-fabrics code. From James Smart.
- A bunch of fixes from a whole slew of people, too many to name
here"
* 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
blk-stat: fix a few cases of missing batch flushing
blk-flush: run the queue when inserting blk-mq flush
elevator: make the rqhash helpers exported
blk-mq: abstract out blk_mq_dispatch_rq_list() helper
blk-mq: add blk_mq_start_stopped_hw_queue()
block: improve handling of the magic discard payload
blk-wbt: don't throttle discard or write zeroes
nbd: use dev_err_ratelimited in io path
nbd: reset the setup task for NBD_CLEAR_SOCK
nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
nvme-fabrics: Add target support for FC transport
nvme-fabrics: Add host support for FC transport
nvme-fabrics: Add FC transport LLDD api definitions
nvme-fabrics: Add FC transport FC-NVME definitions
nvme-fabrics: Add FC transport error codes to nvme.h
Add type 0x28 NVME type code to scsi fc headers
nvme-fabrics: patch target code in prep for FC transport support
nvme-fabrics: set sqe.command_id in core not transports
parser: add u64 number parser
nvme-rdma: align to generic ib_event logging helper
...
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Remove the WRITE_* and READ_SYNC wrappers, and just use the flags
directly. Where applicable this also drops usage of the
bio_set_op_attrs wrapper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
| | |_|_|/ / / /
| |/| | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields. This in addition allows us to place the operation
first (and make some room for more ops while we're at it) and to
stop having to shift around the operation values.
In addition this allows passing around only one value in the block layer
instead of two (and eventuall also in the file systems, but we can do
that later) and thus clean up a lot of code.
Last but not least this allows decreasing the size of the cmd_flags
field in struct request to 32-bits. Various functions passing this
value could also be updated, but I'd like to avoid the churn for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
"Improvements and fixes to pstore subsystem:
- add additional checks for bad platform data
- remove bounce buffer in console writer
- protect read/unlink race with a mutex
- correctly give up during dump locking failures
- increase ftrace bandwidth by splitting ftrace buffers per CPU"
* tag 'pstore-v4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
ramoops: add pdata NULL check to ramoops_probe
pstore: Convert console write to use ->write_buf
pstore: Protect unlink with read_mutex
pstore: Use global ftrace filters for function trace filtering
ftrace: Provide API to use global filtering for ftrace ops
pstore: Clarify context field przs as dprzs
pstore: improve error report for failed setup
pstore: Merge per-CPU ftrace records into one
pstore: Add ftrace timestamp counter
ramoops: Split ftrace buffer space into per-CPU zones
pstore: Make ramoops_init_przs generic for other prz arrays
pstore: Allow prz to control need for locking
pstore: Warn on PSTORE_TYPE_PMSG using deprecated function
pstore: Make spinlock per zone instead of global
pstore: Actually give up during locking failure
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Currently the global_ops filtering hash is not available to outside users
registering for function tracing. Provide an API for those users to be
able to choose global filtering.
This is in preparation for pstore's ftrace feature to be able to
use the global filters.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Merge updates from Andrew Morton:
- various misc bits
- most of MM (quite a lot of MM material is awaiting the merge of
linux-next dependencies)
- kasan
- printk updates
- procfs updates
- MAINTAINERS
- /lib updates
- checkpatch updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits)
init: reduce rootwait polling interval time to 5ms
binfmt_elf: use vmalloc() for allocation of vma_filesz
checkpatch: don't emit unified-diff error for rename-only patches
checkpatch: don't check c99 types like uint8_t under tools
checkpatch: avoid multiple line dereferences
checkpatch: don't check .pl files, improve absolute path commit log test
scripts/checkpatch.pl: fix spelling
checkpatch: don't try to get maintained status when --no-tree is given
lib/ida: document locking requirements a bit better
lib/rbtree.c: fix typo in comment of ____rb_erase_color
lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
MAINTAINERS: add drm and drm/i915 irc channels
MAINTAINERS: add "C:" for URI for chat where developers hang out
MAINTAINERS: add drm and drm/i915 bug filing info
MAINTAINERS: add "B:" for URI where to file bugs
get_maintainer: look for arbitrary letter prefixes in sections
printk: add Kconfig option to set default console loglevel
printk/sound: handle more message headers
printk/btrfs: handle more message headers
printk/kdb: handle more message headers
...
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") allows to define more message headers for a single
message. The motivation is that continuous lines might get mixed.
Therefore it make sense to define the right log level for every piece of
a cont line.
This patch introduces printk_skip_headers() that will skip all headers
and uses it in the kdb code instead of printk_skip_level().
This approach helps to fix other printk_skip_level() users
independently.
Link: http://lkml.kernel.org/r/1478695291-12169-3-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Joe Perches <joe@perches.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: David Sterba <dsterba@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") added back KERN_CONT message header. As a result
it might appear in the middle of the line when the parts are squashed
via the temporary NMI buffer.
A reasonable solution seems to be to split the text in the NNI temporary
not only by newlines but also by the message headers.
Another solution would be to filter out KERN_CONT when writing to the
temporary buffer. But this would complicate the lockless handling.
Also it would not solve problems with a missing newline that was there
even before the KERN_CONT stuff.
This patch moves the temporary buffer handling into separate function.
I played with it and it seems that using the char pointers make the code
easier to read.
Also it prints the final newline as a continuous line.
Finally, it moves handling of the s->len overflow into the paranoid
check. And allows to recover from the disaster.
Link: http://lkml.kernel.org/r/1478695291-12169-2-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: David Sterba <dsterba@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
vsnprintf() adds the trailing '\0' but it does not count it into the
number of printed characters. The result is that there is one byte less
space for the real characters in the buffer.
The broken check for the free space might cause that we will repeatedly
try to print 1 character into the buffer, never reach the full buffer,
and do not count the messages as missed.
Also vsnprintf() returns the number of characters that would be printed
if the buffer was big enough. As a result, s->len might be bigger than
the size of the buffer[*]. And the printk() function might return
bigger len than it really printed. Both problems are fixed by using
vscnprintf() instead.
Note that I though about increasing the number of missed messages even
when the message was shrunken. But it made the code even more
complicated. I think that it is not worth it. Shrunken messages are
usually easy to recognize. And it should be a corner case.
[*] The overflown s->len value is crazy and unexpected. I "made a
mistake" and reported this situation as an internal error when fixed
handling of PR_CONT headers in some other patch.
Link: http://lkml.kernel.org/r/20161208174912.GA17042@linux.suse
Signed-off-by: Petr Mladek <pmladek@suse.com>
CcL Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Takashi Iwai <tiwai@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Since sysctl_hung_task_warnings == -1 is allowed (infinite warnings),
commit 48a6d64edadb ("hung_task: allow hung_task_panic when
hung_task_warnings is 0") should decrement it only when it is not -1.
This prevents the kernel from ceasing warnings after the first
4294967295 ;)
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: John Siddle <jsiddle@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
vfree() is going to use sleeping lock. Thread stack freed in atomic
context, therefore we must use vfree_atomic() here.
Link: http://lkml.kernel.org/r/1479474236-4139-6-git-send-email-hch@lst.de
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Dias <joaodias@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This limitation came with the reason to remove "another way for
malicious code to obscure a compromised program and masquerade as a
benign process" by allowing "security-concious program can use this
prctl once during its early initialization to ensure the prctl cannot
later be abused for this purpose":
http://marc.info/?l=linux-kernel&m=133160684517468&w=2
This explanation doesn't look sufficient. The only thing "exe" link is
indicating is the file, used to execve, which is basically nothing and
not reliable immediately after process has returned from execve system
call.
Moreover, to use this feture, all the mappings to previous exe file have
to be unmapped and all the new exe file permissions must be satisfied.
Which means, that changing exe link is very similar to calling execve on
the binary.
The need to remove this limitations comes from migration of NFS mount
point, which is not accessible during restore and replaced by other file
system. Because of this exe link has to be changed twice.
[akpm@linux-foundation.org: fix up comment]
Link: http://lkml.kernel.org/r/20160927153755.9337.69650.stgit@localhost.localdomain
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
When commit fbae2d44aa1d ("kthread: add kthread_create_worker*()")
introduced some kthread_create_...() functions which were taking
printf-like parametter, it introduced __printf attributes to some
functions (e.g. kthread_create_worker()). Nevertheless some new
functions were forgotten (they have been detected thanks to
-Wmissing-format-attribute warning flag).
Add the missing __printf attributes to the newly-introduced functions in
order to detect formatting issues at build-time with -Wformat flag.
Link: http://lkml.kernel.org/r/20161126193543.22672-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"The irq department provides:
- a major update to the auto affinity management code, which is used
by multi-queue devices
- move of the microblaze irq chip driver into the common driver code
so it can be shared between microblaze, powerpc and MIPS
- a series of updates to the ARM GICV3 interrupt controller
- the usual pile of fixes and small improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
powerpc/virtex: Use generic xilinx irqchip driver
irqchip/xilinx: Try to fall back if xlnx,kind-of-intr not provided
irqchip/xilinx: Add support for parent intc
irqchip/xilinx: Rename get_irq to xintc_get_irq
irqchip/xilinx: Restructure and use jump label api
irqchip/xilinx: Clean up print messages
microblaze/irqchip: Move intc driver to irqchip
ARM: virt: Select ARM_GIC_V3_ITS
ARM: gic-v3-its: Add 32bit support to GICv3 ITS
irqchip/gic-v3-its: Specialise readq and writeq accesses
irqchip/gic-v3-its: Specialise flush_dcache operation
irqchip/gic-v3-its: Narrow down Entry Size when used as a divider
irqchip/gic-v3-its: Change unsigned types for AArch32 compatibility
irqchip/gic-v3: Use nops macro for Cavium ThunderX erratum 23154
irqchip/gic-v3: Convert arm64 GIC accessors to {read,write}_sysreg_s
genirq/msi: Drop artificial PCI dependency
irqchip/bcm7038-l1: Implement irq_cpu_offline() callback
genirq/affinity: Use default affinity mask for reserved vectors
genirq/affinity: Take reserved vectors into account when spreading irqs
PCI: Remove the irq_affinity mask from struct pci_dev
...
|