| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
version
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
| |
... at a cost of added small ifdef __ia64__ in asm-generic siginfo.h,
that is.
-- EWB Corrected the comment on _flags to reflect the move
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
| |
We have never passed either field to or from userspace so just remove them.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
| |
In preparation for unconditionally copying the whole of siginfo
to userspace clear si_sys_private. So this kernel internal
value is guaranteed not to make it to userspace.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
| |
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
| |
The header uapi/asm-generic/siginfo.h appears to the the repository of
all of these definitions in linux so collect up glibcs additions as
well. Just to prevent someone from accidentally creating a conflict
in the future.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
| |
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
| |
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SEGV_PKUERR is a signal specific si_code which happens to have the
same numeric value as several others: BUS_MCEERR_AR, ILL_ILLTRP,
FPE_FLTOVF, TRAP_HWBKPT, CLD_TRAPPED, POLL_ERR, SEGV_THREAD_ID,
as such it is not safe to just test the si_code the signal number
must also be tested to prevent a false positive in fill_sig_info_pkey.
I found this error by inspection, and BUS_MCEERR_AR appears to
be a real candidate for confusion. So pass in si_signo and fix it.
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Fixes: 019132ff3daf ("x86/mm/pkeys: Fill in pkey field in siginfo")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 is the same a setting si_code to SI_USER which is definitely
not correct. With si_code set to SI_USER si_pid and si_uid will be copied to
userspace instead of si_addr. Which is very wrong.
So fix this by using a sensible si_code (SEGV_MAPERR) for this failure.
Cc: stable@vger.kernel.org
Fixes: b920de1b77b7 ("mn10300: add the MN10300/AM33 architecture to the kernel")
Cc: David Howells <dhowells@redhat.com>
Cc: Masakazu Urade <urade.masakazu@jp.panasonic.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Call clear_siginfo to ensure stack allocated siginfos are fully
initialized before being passed to the signal sending functions.
This ensures that if there is the kind of confusion documented by
TRAP_FIXME, FPE_FIXME, or BUS_FIXME the kernel won't send unitialized
data to userspace when the kernel generates a signal with SI_USER but
the copy to userspace assumes it is a different kind of signal, and
different fields are initialized.
This also prepares the way for turning copy_siginfo_to_user
into a copy_to_user, by removing the need in many cases to perform
a field by field copy simply to skip the uninitialized fields.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately struct siginfo has holes both in the common part of the
structure, in the union members, and in the lack of padding of the
union members. The result of those wholes is that the C standard does
not guarantee those bits will be initialized. As struct siginfo is
for communication between the kernel and userspace that is a problem.
Add the helper function clear_siginfo that is guaranteed to clear all of
the bits in struct siginfo so when the structure is copied there is no danger
of copying old kernel data and causing a leak of information from kernel
space to userspace.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The savings for copying just part of struct siginfo appears to be in the
noise on modern machines. So remove this ``optimization'' and simplify the code.
At the same time mark the second parameter as constant so there is no confusion
as to which direction the copy will go.
This ensures that a fully initialized siginfo that is sent ends up as
a fully initialized siginfo on the signal queue. This full initialization
ensures even confused code won't copy unitialized data to userspace, and
it prepares for turning copy_siginfo_to_user into a simple copy_to_user.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER. Posix and common sense requires
that SI_USER not be a signal specific si_code. As such this use of 0
for the si_code is a pretty horribly broken ABI.
Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
that uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design. Making this a very
flakey implementation.
Utilizing FPE_FIXME, siginfo_layout will now return SIL_FAULT and the
appropriate fields will be reliably copied.
Possible ABI fixes includee:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen
Cc: Russell King <rmk@flint.arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Ref: 451436b7bbb2 ("[ARM] Add support code for ARM hardware vector floating point")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER. Posix and common sense requires
that SI_USER not be a signal specific si_code. As such this use of 0
for the si_code is a pretty horribly broken ABI.
Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
that uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design. Making this a very
flakey implementation.
Utilizing FPE_FIXME, BUS_FIXME, TRAP_FIXME siginfo_layout will now return
SIL_FAULT and the appropriate fields will be reliably copied.
But folks this is a new and unique kind of bad. This is massively
untested code bad. This is inventing new and unique was to get
siginfo wrong bad. This is don't even think about Posix or what
siginfo means bad. This is lots of eyeballs all missing the fact
that the code does the wrong thing bad. This is getting stuck
and keep making the same mistake bad.
I really hope we can find a non userspace breaking fix for this on a
port as new as arm64.
Possible ABI fixes include:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: James Morse <james.morse@arm.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arm-kernel@lists.infradead.org
Ref: 53631b54c870 ("arm64: Floating point and SIMD")
Ref: 32015c235603 ("arm64: exception: handle Synchronous External Abort")
Ref: 1d18c47c735e ("arm64: MMU fault handling and page table management")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER. Posix and common sense requires
that SI_USER not be a signal specific si_code. As such this use of 0
for the si_code is a pretty horribly broken ABI.
Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
that uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design. Making this a very
flakey implementation.
Utilizing FPE_FIXME and TRAP_FIXME, siginfo_layout() will now return
SIL_FAULT and the appropriate fields will be reliably copied.
Possible ABI fixes includee:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kumar Gala <kumar.gala@freescale.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Ref: 9bad068c24d7 ("[PATCH] ppc32: support for e500 and 85xx")
Ref: 0ed70f6105ef ("PPC32: Provide proper siginfo information on various exceptions.")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER. Posix and common sense requires
that SI_USER not be a signal specific si_code. As such this use of 0
for the si_code is a pretty horribly broken ABI.
Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
hat uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design. Making this a very
flakey implementation.
Utilizing FPE_FIXME siginfo_layout will now return SIL_FAULT and the
appropriate fields will reliably be copied.
Possible ABI fixes includee:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Ref: ac919f0883e5 ("metag: Traps")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting si_code to 0 results in a userspace seeing an si_code of 0.
This is the same si_code as SI_USER. Posix and common sense requires
that SI_USER not be a signal specific si_code. As such this use of 0
for the si_code is a pretty horribly broken ABI.
Further use of si_code == 0 guaranteed that copy_siginfo_to_user saw a
value of __SI_KILL and now sees a value of SIL_KILL with the result
that uid and pid fields are copied and which might copying the si_addr
field by accident but certainly not by design. Making this a very
flakey implementation.
Utilizing FPE_FIXME siginfo_layout will now return SIL_FAULT and the
appropriate fields will reliably be copied.
This bug is 13 years old and parsic machines are no longer being built
so I don't know if it possible or worth fixing it. But it is at least
worth documenting this so other architectures don't make the same
mistake.
Possible ABI fixes includee:
- Send the signal without siginfo
- Don't generate a signal
- Possibly assign and use an appropriate si_code
- Don't handle cases which can't happen
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Ref: 313c01d3e3fd ("[PATCH] PA-RISC update for 2.6.0")
Histroy Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While reviewing the signal sending on openrisc the do_unaligned_access
function stood out because it is obviously wrong. A comment about an
si_code set above when actually si_code is never set. Leading to a
random si_code being sent to userspace in the event of an unaligned
access.
Looking further SIGBUS BUS_ADRALN is the proper pair of signal and
si_code to send for an unaligned access. That is what other
architectures do and what is required by posix.
Given that do_unaligned_access is broken in a way that no one can be
relying on it on openrisc fix the code to just do the right thing.
Cc: stable@vger.kernel.org
Fixes: 769a8a96229e ("OpenRISC: Traps")
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: openrisc@lists.librecores.org
Acked-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set si_signo.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 0983b31849bb ("sh: Wire up division and address error exceptions on SH-2A.")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Rename from kdb_send_sig_info to kdb_send_sig
As there is no meaningful siginfo sent
- Use SEND_SIG_PRIV instead of generating a siginfo for a kdb
signal. The generated siginfo had a bogus rationale and was
not correct in the face of pid namespaces. SEND_SIG_PRIV
is simpler and actually correct.
- As the code grabs siglock just send the signal with siglock
held instead of dropping siglock and attempting to grab it again.
- Move the sig_valid test into kdb_kill where it can generate
a good error message.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| |
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A couple of fixlets for x86:
- Fix the ESPFIX double fault handling for 5-level pagetables
- Fix the commandline parsing for 'apic=' on 32bit systems and update
documentation
- Make zombie stack traces reliable
- Fix kexec with stack canary
- Fix the delivery mode for APICs which was missed when the x86
vector management was converted to single target delivery. Caused a
regression due to the broken hardware which ignores affinity
settings in lowest prio delivery mode.
- Unbreak modules when AMD memory encryption is enabled
- Remove an unused parameter of prepare_switch_to"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Switch all APICs to Fixed delivery mode
x86/apic: Update the 'apic=' description of setting APIC driver
x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
x86: Remove unused parameter of prepare_switch_to
x86/stacktrace: Make zombie stack traces reliable
x86/mm: Unbreak modules that use the DMA API
x86/build: Make isoimage work on Debian
x86/espfix/64: Fix espfix double-fault handling on 5-level systems
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some of the APIC incarnations are operating in lowest priority delivery
mode. This worked as long as the vector management code allocated the same
vector on all possible CPUs for each interrupt.
Lowest priority delivery mode does not necessarily respect the affinity
setting and may redirect to some other online CPU. This was documented
somewhere in the old code and the conversion to single target delivery
missed to update the delivery mode of the affected APIC drivers which
results in spurious interrupts on some of the affected CPU/Chipset
combinations.
Switch the APIC drivers over to Fixed delivery mode and remove all
leftovers of lowest priority delivery mode.
Switching to Fixed delivery mode is not a problem on these CPUs because the
kernel already uses Fixed delivery mode for IPIs. The reason for this is
that th SDM explicitely forbids lowest prio mode for IPIs. The reason is
obvious: If the irq routing does not honor destination targets in lowest
prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be
a fatal problem in many cases.
As a consequence of this change, the apic::irq_delivery_mode field is now
pointless, but this needs to be cleaned up in a separate patch.
Fixes: fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity")
Reported-by: vcaputo@pengaru.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: vcaputo@pengaru.com
Cc: Pavel Machek <pavel@ucw.cz>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are two consumers of apic=: the APIC debug level and the low
level generic architecture code, but Linux just documented the first
one.
Append the second description.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: rdunlap@infradead.org
Cc: corbet@lwn.net
Link: https://lkml.kernel.org/r/20171204040313.24824-2-douly.fnst@cn.fujitsu.com
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are two consumers of apic=:
apic_set_verbosity() for setting the APIC debug level;
parse_apic() for registering APIC driver by hand.
X86-32 supports both of them, but sometimes, kernel issues a weird warning.
eg: when kernel was booted up with 'apic=bigsmp' in command line,
early_param would warn like that:
...
[ 0.000000] APIC Verbosity level bigsmp not recognised use apic=verbose or apic=debug
[ 0.000000] Malformed early option 'apic'
...
Wrap the warning code in CONFIG_X86_64 case to avoid this.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: rdunlap@infradead.org
Cc: corbet@lwn.net
Link: https://lkml.kernel.org/r/20171204040313.24824-1-douly.fnst@cn.fujitsu.com
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit e802a51ede91 ("x86/idt: Consolidate IDT invalidation") cleaned up
and unified the IDT invalidation that existed in a couple of places. It
changed no actual real code.
Despite not changing any actual real code, it _did_ change code generation:
by implementing the common idt_invalidate() function in
archx86/kernel/idt.c, it made the use of the function in
arch/x86/kernel/machine_kexec_32.c be a real function call rather than an
(accidental) inlining of the function.
That, in turn, exposed two issues:
- in load_segments(), we had incorrectly reset all the segment
registers, which then made the stack canary load (which gcc does
using offset of %gs) cause a trap. Instead of %gs pointing to the
stack canary, it will be the normal zero-based kernel segment, and
the stack canary load will take a page fault at address 0x14.
- to make this even harder to debug, we had invalidated the GDT just
before calling idt_invalidate(), which meant that the fault happened
with an invalid GDT, which in turn causes a triple fault and
immediate reboot.
Fix this by
(a) not reloading the special segments in load_segments(). We currently
don't do any percpu accesses (which would require %fs on x86-32) in
this area, but there's no reason to think that we might not want to
do them, and like %gs, it's pointless to break it.
(b) doing idt_invalidate() before invalidating the GDT, to keep things
at least _slightly_ more debuggable for a bit longer. Without a
IDT, traps will not work. Without a GDT, traps also will not work,
but neither will any segment loads etc. So in a very real sense,
the GDT is even more core than the IDT.
Fixes: e802a51ede91 ("x86/idt: Consolidate IDT invalidation")
Reported-and-tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.LFD.2.21.1712271143180.8572@i7.lan
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit e37e43a497d5 ("x86/mm/64: Enable vmapped stacks
(CONFIG_HAVE_ARCH_VMAP_STACK=y)") added prepare_switch_to with one extra
parameter which is not used by the function, remove it.
Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20171215131533.hp6kqebw45o7uvsb@smtp.gmail.com
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit:
1959a60182f4 ("x86/dumpstack: Pin the target stack when dumping it")
changed the behavior of stack traces for zombies. Before that commit,
/proc/<pid>/stack reported the last execution path of the zombie before
it died:
[<ffffffff8105b877>] do_exit+0x6f7/0xa80
[<ffffffff8105bc79>] do_group_exit+0x39/0xa0
[<ffffffff8105bcf0>] __wake_up_parent+0x0/0x30
[<ffffffff8152dd09>] system_call_fastpath+0x16/0x1b
[<00007fd128f9c4f9>] 0x7fd128f9c4f9
[<ffffffffffffffff>] 0xffffffffffffffff
After the commit, it just reports an empty stack trace.
The new behavior is actually probably more correct. If the stack
refcount has gone down to zero, then the task has already gone through
do_exit() and isn't going to run anymore. The stack could be freed at
any time and is basically gone, so reporting an empty stack makes sense.
However, save_stack_trace_tsk_reliable() treats such a missing stack
condition as an error. That can cause livepatch transition stalls if
there are any unreaped zombies. Instead, just treat it as a reliable,
empty stack.
Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Fixes: af085d9084b4 ("stacktrace/x86: add function for detecting reliable stack traces")
Link: http://lkml.kernel.org/r/e4b09e630e99d0c1080528f0821fc9d9dbaeea82.1513631620.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit d8aa7eea78a1 ("x86/mm: Add Secure Encrypted Virtualization (SEV)
support") changed sme_active() from an inline function that referenced
sme_me_mask to a non-inlined function in order to make the sev_enabled
variable a static variable. This function was marked EXPORT_SYMBOL_GPL
because at the time the patch was submitted, sme_me_mask was marked
EXPORT_SYMBOL_GPL.
Commit 87df26175e67 ("x86/mm: Unbreak modules that rely on external
PAGE_KERNEL availability") changed sme_me_mask variable from
EXPORT_SYMBOL_GPL to EXPORT_SYMBOL, allowing external modules the ability
to build with CONFIG_AMD_MEM_ENCRYPT=y. Now, however, with sev_active()
no longer an inline function and marked as EXPORT_SYMBOL_GPL, external
modules that use the DMA API are once again broken in 4.15. Since the DMA
API is meant to be used by external modules, this needs to be changed.
Change the sme_active() and sev_active() functions from EXPORT_SYMBOL_GPL
to EXPORT_SYMBOL.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Link: https://lkml.kernel.org/r/20171215162011.14125.7113.stgit@tlendack-t1.amdoffice.net
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Debian does not ship a 'mkisofs' symlink to genisoimage. All modern
distros ship genisoimage, so just use that directly. That requires
renaming the 'genisoimage' function. Also neaten up the 'for' loop
while I'm in here.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Using PGDIR_SHIFT to identify espfix64 addresses on 5-level systems
was wrong, and it resulted in panics due to unhandled double faults.
Use P4D_SHIFT instead, which is correct on 4-level and 5-level
machines.
This fixes a panic when running x86 selftests on 5-level machines.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1d33b219563f ("x86/espfix: Add support for 5-level paging")
Link: http://lkml.kernel.org/r/24c898b4f44fdf8c22d93703850fb384ef87cfdc.1513035461.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation fixes from Thomas Gleixner:
"Four patches addressing the PTI fallout as discussed and debugged
yesterday:
- Remove stale and pointless TLB flush invocations from the hotplug
code
- Remove stale preempt_disable/enable from __native_flush_tlb()
- Plug the memory leak in the write_ldt() error path"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ldt: Make LDT pgtable free conditional
x86/ldt: Plug memory leak in error path
x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
x86/smpboot: Remove stale TLB flush invocations
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Andy prefers to be paranoid about the pagetable free in the error path of
write_ldt(). Make it conditional and warn whenever the installment of a
secondary LDT fails.
Requested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The error path in write_ldt() tries to free 'old_ldt' instead of the newly
allocated 'new_ldt', resulting in a memory leak. It also misses to clean up a
half populated LDT pagetable, which is not a leak as it gets cleaned up
when the process exits.
Free both the potentially half populated LDT pagetable and the newly
allocated LDT struct. This can be done unconditionally because once an LDT
is mapped subsequent maps will succeed, because the PTE page is already
populated and the two LDTs fit into that single page.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The preempt_disable/enable() pair in __native_flush_tlb() was added in
commit:
5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write")
... to protect the UP variant of flush_tlb_mm_range().
That preempt_disable/enable() pair should have been added to the UP variant
of flush_tlb_mm_range() instead.
The UP variant was removed with commit:
ce4a4e565f52 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")
... but the preempt_disable/enable() pair stayed around.
The latest change to __native_flush_tlb() in commit:
6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
... added an access to a per CPU variable outside the preempt disabled
regions, which makes no sense at all. __native_flush_tlb() must always
be called with at least preemption disabled.
Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
bad callers independent of the smp_processor_id() debugging.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
invoke local_flush_tlb() for no obvious reason.
Digging in history revealed that the original code in the 2.1 era added
those because the code manipulated a swapper_pg_dir pagetable entry. The
pagetable manipulation was removed long ago in the 2.3 timeframe, but the
TLB flush invocations stayed around forever.
Remove them along with the pointless pr_debug()s which come from the same 2.1
change.
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A pile of fixes for long standing issues with the timer wheel and the
NOHZ code:
- Prevent timer base confusion accross the nohz switch, which can
cause unlocked access and data corruption
- Reinitialize the stale base clock on cpu hotplug to prevent subtle
side effects including rollovers on 32bit
- Prevent an interrupt storm when the timer softirq is already
pending caused by tick_nohz_stop_sched_tick()
- Move the timer start tracepoint to a place where it actually makes
sense
- Add documentation to timerqueue functions as they caused confusion
several times now"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timerqueue: Document return values of timerqueue_add/del()
timers: Invoke timer_start_debug() where it makes sense
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
timers: Reinitialize per cpu bases on hotplug
timers: Use deferrable base independent of base::nohz_active
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The return values of timerqueue_add/del() are not documented in the kernel doc
comment. Add proper documentation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: rt@linutronix.de
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.
Call the debug function after setting the new base and flags.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:
if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.
Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....
Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.
Reported-by: Sebastian Siewior <bigeasy@linutronix.d>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.
Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.
Set base->must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.
Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.
To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp fixlet from Thomas Gleixner:
"A trivial build warning fix for newer compilers"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Move inline keyword at the beginning of declaration
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix non-fatal warnings such as:
kernel/cpu.c:95:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
static void inline cpuhp_lock_release(bool bringup) { }
^~~~~~
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/20171226140855.16583-1-malat@debian.org
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Three patches addressing the fallout of the CPU_ISOLATION changes
especially with NO_HZ_FULL plus documentation of boot parameter
dependency"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The "isolcpus=" and "nohz_full=" boot parameters depend on CPU Isolation
support. Let's document that.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: kernel test robot <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/1513275507-29200-4-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The "isolcpus=" boot parameter support was always built-in before we
moved the related code under CONFIG_CPU_ISOLATION. Having it disabled by
default is very confusing for people accustomed to use this parameter.
So enable it by dafault to keep the previous behaviour but keep it
optable for those who want to tinify their kernels.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: kernel test robot <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/1513275507-29200-3-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
CONFIG_NO_HZ_FULL doesn't make sense without CONFIG_CPU_ISOLATION. In
fact enabling the first without the second is a regression as nohz_full=
boot parameter gets silently ignored.
Besides this unnatural combination hangs RCU gp kthread when running
rcutorture for reasons that are not yet fully understood:
rcu_preempt kthread starved for 9974 jiffies! g4294967208
+c4294967207 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
rcu_preempt I 7464 8 2 0x80000000
Call Trace:
__schedule+0x493/0x620
schedule+0x24/0x40
schedule_timeout+0x330/0x3b0
? preempt_count_sub+0xea/0x140
? collect_expired_timers+0xb0/0xb0
rcu_gp_kthread+0x6bf/0xef0
This commit therefore makes NO_HZ_FULL select CPU_ISOLATION, which
prevents all these bad behaviours.
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <kernellwp@gmail.com>
Fixes: 5c4991e24c69 ("sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULL")
Link: http://lkml.kernel.org/r/1513275507-29200-2-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
- plug a memory leak in the intel pmu init code
- clang fixes
- tooling fix to avoid including kernel headers
- a fix for jvmti to generate correct debug information for inlined
code
- replace backtick with a regular shell function
- fix the build in hardened environments
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Plug memory leak in intel_pmu_init()
x86/asm: Allow again using asm.h when building for the 'bpf' clang target
tools arch s390: Do not include header files from the kernel sources
perf jvmti: Generate correct debug information for inlined code
perf tools: Fix up build in hardened environments
perf tools: Use shell function for perl cflags retrieval
|