| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stop abusing struct page functionality and the swap end_io handler, and
instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I/O code into swap.c as they are directly tied into
each other.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Ming Lin <mlin@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
block plug callback could sleep, so we introduce a parameter
'from_schedule' and corresponding drivers can use it to destinguish a
schedule plug flush or a plug finish. Unfortunately io_schedule_out
still uses blk_flush_plug(). This causes below output (Note, I added a
might_sleep() in raid1_unplug to make it trigger faster, but the whole
thing doesn't matter if I add might_sleep). In raid1/10, this can cause
deadlock.
This patch makes io_schedule_out always uses blk_schedule_flush_plug.
This should only impact drivers (as far as I know, raid 1/10) which are
sensitive to the 'from_schedule' parameter.
[ 370.817949] ------------[ cut here ]------------
[ 370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90()
[ 370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff81092fcf>] prepare_to_wait+0x2f/0x90
[ 370.817971] Modules linked in: raid1
[ 370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G W 4.0.0+ #361
[ 370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[ 370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1)
[ 370.817985] ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001
[ 370.817988] ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8
[ 370.817990] ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28
[ 370.817993] Call Trace:
[ 370.817999] [<ffffffff819dd7af>] dump_stack+0x4f/0x7b
[ 370.818002] [<ffffffff81051afc>] warn_slowpath_common+0x8c/0xd0
[ 370.818004] [<ffffffff81051b86>] warn_slowpath_fmt+0x46/0x50
[ 370.818006] [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[ 370.818008] [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[ 370.818010] [<ffffffff810776ef>] __might_sleep+0x7f/0x90
[ 370.818014] [<ffffffffa0000c03>] raid1_unplug+0xd3/0x170 [raid1]
[ 370.818024] [<ffffffff81421d9a>] blk_flush_plug_list+0x8a/0x1e0
[ 370.818028] [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[ 370.818031] [<ffffffff819e21b0>] io_schedule_timeout+0x130/0x140
[ 370.818033] [<ffffffff819e3586>] bit_wait_io+0x36/0x50
[ 370.818034] [<ffffffff819e31b5>] __wait_on_bit+0x65/0x90
[ 370.818041] [<ffffffff8125b67c>] ? ext4_read_block_bitmap_nowait+0xbc/0x630
[ 370.818043] [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[ 370.818045] [<ffffffff819e3302>] out_of_line_wait_on_bit+0x72/0x80
[ 370.818047] [<ffffffff810935e0>] ? autoremove_wake_function+0x40/0x40
[ 370.818050] [<ffffffff811de744>] __wait_on_buffer+0x44/0x50
[ 370.818053] [<ffffffff8125ae80>] ext4_wait_block_bitmap+0xe0/0xf0
[ 370.818058] [<ffffffff812975d6>] ext4_mb_init_cache+0x206/0x790
[ 370.818062] [<ffffffff8114bc6c>] ? lru_cache_add+0x1c/0x50
[ 370.818064] [<ffffffff81297c7e>] ext4_mb_init_group+0x11e/0x200
[ 370.818066] [<ffffffff81298231>] ext4_mb_load_buddy+0x341/0x360
[ 370.818068] [<ffffffff8129a1a3>] ext4_mb_find_by_goal+0x93/0x2f0
[ 370.818070] [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[ 370.818072] [<ffffffff8129ab67>] ext4_mb_regular_allocator+0x67/0x460
[ 370.818074] [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[ 370.818076] [<ffffffff8129ca4b>] ext4_mb_new_blocks+0x4cb/0x620
[ 370.818079] [<ffffffff81290956>] ext4_ext_map_blocks+0x4c6/0x14d0
[ 370.818081] [<ffffffff812a4d4e>] ? ext4_es_lookup_extent+0x4e/0x290
[ 370.818085] [<ffffffff8126399d>] ext4_map_blocks+0x14d/0x4f0
[ 370.818088] [<ffffffff81266fbd>] ext4_writepages+0x76d/0xe50
[ 370.818094] [<ffffffff81149691>] do_writepages+0x21/0x50
[ 370.818097] [<ffffffff811d5c00>] __writeback_single_inode+0x60/0x490
[ 370.818099] [<ffffffff811d630a>] writeback_sb_inodes+0x2da/0x590
[ 370.818103] [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[ 370.818105] [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[ 370.818107] [<ffffffff811d665f>] __writeback_inodes_wb+0x9f/0xd0
[ 370.818109] [<ffffffff811d69db>] wb_writeback+0x34b/0x3c0
[ 370.818111] [<ffffffff811d70df>] bdi_writeback_workfn+0x23f/0x550
[ 370.818116] [<ffffffff8106bbd8>] process_one_work+0x1c8/0x570
[ 370.818117] [<ffffffff8106bb5b>] ? process_one_work+0x14b/0x570
[ 370.818119] [<ffffffff8106c09b>] worker_thread+0x11b/0x470
[ 370.818121] [<ffffffff8106bf80>] ? process_one_work+0x570/0x570
[ 370.818124] [<ffffffff81071868>] kthread+0xf8/0x110
[ 370.818126] [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[ 370.818129] [<ffffffff819e9322>] ret_from_fork+0x42/0x70
[ 370.818131] [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[ 370.818132] ---[ end trace 7b4deb71e68b6605 ]---
V2: don't change ->in_iowait
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull networking fixes from David Miller:
1) Receive packet length needs to be adjust by 2 on RX to accomodate
the two padding bytes in altera_tse driver. From Vlastimil Setka.
2) If rx frame is dropped due to out of memory in macb driver, we leave
the receive ring descriptors in an undefined state. From Punnaiah
Choudary Kalluri
3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is
only for dumps. Fix from Nicolas Dichtel.
4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
the ipv4_mtu() helper. From Herbert Xu.
5) Fix null deref in bridge netfilter, and miscalculated lengths in
jump/goto nf_tables verdicts. From Florian Westphal.
6) Unhash ping sockets properly.
7) Software implementation of BPF divide did 64/32 rather than 64/64
bit divide. The JITs got it right. Fix from Alexei Starovoitov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
ipv4: Missing sk_nulls_node_init() in ping_unhash().
net: fec: Fix RGMII-ID mode
net/mlx4_en: Schedule napi when RX buffers allocation fails
netxen_nic: use spin_[un]lock_bh around tx_clean_lock
net/mlx4_core: Fix unaligned accesses
mlx4_en: Use correct loop cursor in error path.
cxgb4: Fix MC1 memory offset calculation
bnx2x: Delay during kdump load
net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
net: dsa: Fix scope of eeprom-length property
net: macb: Fix race condition in driver when Rx frame is dropped
hv_netvsc: Fix a bug in netvsc_start_xmit()
altera_tse: Correct rx packet length
mlx4: Fix tx ring affinity_mask creation
tipc: fix problem with parallel link synchronization mechanism
tipc: remove wrong use of NLM_F_MULTI
bridge/nl: remove wrong use of NLM_F_MULTI
bridge/mdb: remove wrong use of NLM_F_MULTI
net: sched: act_connmark: don't zap skb->nfct
trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
ALU64_DIV instruction should be dividing 64-bit by 64-bit,
whereas do_div() does 64-bit by 32-bit divide.
x64 and arm64 JITs correctly implement 64 by 64 unsigned divide.
llvm BPF backend emits code assuming that ALU64_DIV does 64 by 64.
Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Three regression fixes this time, one for a recent regression in the
cpuidle core affecting multiple systems, one for an inadvertently
added duplicate typedef in ACPICA that breaks compilation with GCC 4.5
and one for an ACPI Smart Battery Subsystem driver regression
introduced during the 3.18 cycle (stable-candidate).
Specifics:
- Fix for a regression in the cpuidle core introduced by one of the
recent commits in the clockevents_notify() removal series that put
a call to a function which had to be executed with disabled
interrupts into a code path running with enabled interrupts (Rafael
J Wysocki)
- Fix for a build problem in ACPICA (with GCC 4.5) introduced by one
of the recent ACPICA tools commits that added a duplicate typedef
to one of the ACPICA's header files by mistake (Olaf Hering)
- Fix for a regression in the ACPI SBS (Smart Battery Subsystem)
driver introduced during the 3.18 development cycle causing the
smart battery manager to be marked as not present when it should be
marked as present (Chris Bainbridge)"
* tag 'pm+acpi-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: Run tick_broadcast_exit() with disabled interrupts
ACPI / SBS: Enable battery manager when present
ACPICA: remove duplicate u8 typedef
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 335f49196fd6 (sched/idle: Use explicit broadcast oneshot
control function) replaced clockevents_notify() invocations in
cpuidle_idle_call() with direct calls to tick_broadcast_enter()
and tick_broadcast_exit(), but it overlooked the fact that
interrupts were already enabled before calling the latter which
led to functional breakage on systems using idle states with the
CPUIDLE_FLAG_TIMER_STOP flag set.
Fix that by moving the invocations of tick_broadcast_enter()
and tick_broadcast_exit() down into cpuidle_enter_state() where
interrupts are still disabled when tick_broadcast_exit() is
called. Also ensure that interrupts will be disabled before
running tick_broadcast_exit() even if they have been enabled by
the idle state's ->enter callback. Trigger a WARN_ON_ONCE() in
that case, as we generally don't want that to happen for states
with CPUIDLE_FLAG_TIMER_STOP set.
Fixes: 335f49196fd6 (sched/idle: Use explicit broadcast oneshot control function)
Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull kvm changes from Paolo Bonzini:
"Remove from guest code the handling of task migration during a pvclock
read; instead use the correct protocol in KVM.
This removes the need for task migration notifiers in core scheduler
code"
[ The scheduler people really hated the migration notifiers, so this was
kind of required - Linus ]
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86: pvclock: Really remove the sched notifier for cross-cpu migrations
kvm: x86: fix kvmclock update protocol
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27
and 80f7fdb1c7f0f9266421f823964fd1962681f6ce.
The task migration notifier was originally introduced in order to support
the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
with synchronized TSC. Hence, on KVM the race condition is only needed
due to a bad implementation on the host side, and even then it's so rare
that it's mostly theoretical.
As far as KVM is concerned it's possible to fix the host, avoiding the
additional complexity in the vDSO and the (re)introduction of the task
migration notifier.
Xen, on the other hand, hasn't yet implemented vsyscall support at
all, so we do not care about its plans for non-synchronized TSC.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Change default key details to be more obviously unspecified.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
"One additional new feature for 4.1, a new PRNG based on SHA-512 for
the zcrypt driver.
Two memory management related changes, the page table reallocation for
KVM is removed, and with file ptes gone the encoding of page table
entries is improved.
And three bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.
s390/mm: change swap pte encoding and pgtable cleanup
s390/mm: correct transfer of dirty & young bits in __pmd_to_pte
s390/bpf: add dependency to z196 features
s390/3215: free memory in error path
s390/kvm: remove delayed reallocation of page tables for KVM
kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
to override the gfp flags of the allocation for the kexec control
page. The loop in kimage_alloc_normal_control_pages allocates pages
with GFP_KERNEL until a page is found that happens to have an
address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
the loop will keep allocating memory until the oom killer steps in.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|\ \
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only
|
| |
| |
| |
| |
| |
| |
| |
| | |
relayfs and tracefs are dealing with inodes of their own;
those two act as filesystem drivers
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
| |
| |
| |
| |
| | |
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull audit fixes from Paul Moore:
"Seven audit patches for v4.1, all bug fixes.
The largest, and perhaps most significant commit helps resolve some
memory pressure issues related to the inode cache and audit, there are
also a few small commits which help resolve some timing issues with
the audit log queue, and the rest fall into the always popular "code
clean-up" category.
In general, nothing really substantial, just a nice set of maintenance
patches"
* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: Remove condition which always evaluates to false
audit: reduce mmap_sem hold for mm->exe_file
audit: consolidate handling of mm->exe_file
audit: code clean up
audit: don't reset working wait time accidentally with auditd
audit: don't lose set wait time on first successful call to audit_log_start()
audit: move the tree pruning to a dedicated thread
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After commit 3e1d0bb6224f019893d1c498cc3327559d183674 ("audit: Convert int limit
uses to u32"), by converting an int to u32, few conditions will always evaluate
to false.
These warnings were emitted during compilation:
kernel/audit.c: In function ‘audit_set_enabled’:
kernel/audit.c:347:2: warning: comparison of unsigned expression < 0 is always
false [-Wtype-limits]
if (state < AUDIT_OFF || state > AUDIT_LOCKED)
^
kernel/audit.c: In function ‘audit_receive_msg’:
kernel/audit.c:880:9: warning: comparison of unsigned expression < 0 is
always false [-Wtype-limits]
if (s.backlog_wait_time < 0 ||
The following patch removes those unnecessary conditions.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The mm->exe_file is currently serialized with mmap_sem (shared)
in order to both safely (1) read the file and (2) audit it via
audit_log_d_path(). Good users will, on the other hand, make use
of the more standard get_mm_exe_file(), requiring only holding
the mmap_sem to read the value, and relying on reference counting
to make sure that the exe file won't dissapear underneath us.
Additionally, upon NULL return of get_mm_exe_file, we also call
audit_log_format(ab, " exe=(null)").
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds a audit_log_d_path_exe() helper function
to share how we handle auditing of the exe_file's path.
Used by both audit and auditsc. No functionality is changed.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixed a coding style issue (unnecessary parentheses , unnecessary braces)
Signed-off-by: Ameen-Ali <Ameenali023@gmail.com>
[PM: tweaked subject line]
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
During a queue overflow condition while we are waiting for auditd to drain the
queue to make room for regular messages, we don't want a successful auditd that
has bypassed the queue check to reset the backlog wait time.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Copy the set wait time to a working value to avoid losing the set
value if the queue overflows.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When file auditing is enabled, during a low memory situation, a memory
allocation with __GFP_FS can lead to pruning the inode cache. Which can,
in turn lead to audit_tree_freeing_mark() being called. This can call
audit_schedule_prune(), that tries to fork a pruning thread, and
waits until the thread is created. But forking needs memory, and the
memory allocations there are done with __GFP_FS.
So we are waiting merrily for some __GFP_FS memory allocations to complete,
while holding some filesystem locks. This can take a while ...
This patch creates a single thread for pruning the tree from
audit_add_tree_rule(), and thus avoids the deadlock that the on-demand
thread creation can cause.
Reported-by: Matt Wilson <msw@amazon.com>
Cc: Matt Wilson <msw@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This adds three fixes for the tracing code.
The first is a bug when ftrace_dump_on_oops is triggered in atomic
context and function graph tracer is the tracer that is being
reported.
The second fix is bad parsing of the trace_events from the kernel
command line, where it would ignore specific events if the system name
is used with defining the event(it enables all events within the
system).
The last one is a fix to the TRACE_DEFINE_ENUM(), where a check was
missing to see if the ptr was incremented to the end of the string,
but the loop increments it again and can miss the nul delimiter to
stop processing"
* tag 'trace-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix possible out of bounds memory access when parsing enums
tracing: Fix incorrect enabling of trace events by boot cmdline
tracing: Handle ftrace_dump() atomic context in graph_trace_open()
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The code that replaces the enum names with the enum values in the
tracepoints' format files could possible miss the end of string nul
character. This was caused by processing things like backslashes, quotes
and other tokens. After processing the tokens, a check for the nul
character needed to be done before continuing the loop, because the loop
incremented the pointer before doing the check, which could bypass the nul
character.
Link: http://lkml.kernel.org/r/552E661D.5060502@oracle.com
Reported-by: Sasha Levin <sasha.levin@oracle.com> # via KASan
Tested-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Fixes: 0c564a538aa9 "tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
There is a problem that trace events are not properly enabled with
boot cmdline. The problem is that if we pass "trace_event=kmem:mm_page_alloc"
to the boot cmdline, it enables all kmem trace events, and not just
the page_alloc event.
This is caused by the parsing mechanism. When we parse the cmdline, the buffer
contents is modified due to tokenization. And, if we use this buffer
again, we will get the wrong result.
Unfortunately, this buffer is be accessed three times to set trace events
properly at boot time. So, we need to handle this situation.
There is already code handling ",", but we need another for ":".
This patch adds it.
Link: http://lkml.kernel.org/r/1429159484-22977-1-git-send-email-iamjoonsoo.kim@lge.com
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
[ added missing return ret; ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
graph_trace_open() can be called in atomic context from ftrace_dump().
Use GFP_ATOMIC for the memory allocations when that's the case, in order
to avoid the following splat.
BUG: sleeping function called from invalid context at mm/slab.c:2849
in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/0
Backtrace:
..
[<8004dc94>] (__might_sleep) from [<801371f4>] (kmem_cache_alloc_trace+0x160/0x238)
r7:87800040 r6:000080d0 r5:810d16e8 r4:000080d0
[<80137094>] (kmem_cache_alloc_trace) from [<800cbd60>] (graph_trace_open+0x30/0xd0)
r10:00000100 r9:809171a8 r8:00008e28 r7:810d16f0 r6:00000001 r5:810d16e8
r4:810d16f0
[<800cbd30>] (graph_trace_open) from [<800c79c4>] (trace_init_global_iter+0x50/0x9c)
r8:00008e28 r7:808c853c r6:00000001 r5:810d16e8 r4:810d16f0 r3:800cbd30
[<800c7974>] (trace_init_global_iter) from [<800c7aa0>] (ftrace_dump+0x90/0x2ec)
r4:810d2580 r3:00000000
[<800c7a10>] (ftrace_dump) from [<80414b2c>] (sysrq_ftrace_dump+0x1c/0x20)
r10:00000100 r9:809171a8 r8:808f6e7c r7:00000001 r6:00000007 r5:0000007a
r4:808d5394
[<80414b10>] (sysrq_ftrace_dump) from [<800169b8>] (return_to_handler+0x0/0x18)
[<80415498>] (__handle_sysrq) from [<800169b8>] (return_to_handler+0x0/0x18)
r8:808c8100 r7:808c8444 r6:00000101 r5:00000010 r4:84eb3210
[<80415668>] (handle_sysrq) from [<800169b8>] (return_to_handler+0x0/0x18)
[<8042a760>] (pl011_int) from [<800169b8>] (return_to_handler+0x0/0x18)
r10:809171bc r9:809171a8 r8:00000001 r7:00000026 r6:808c6000 r5:84f01e60
r4:8454fe00
[<8007782c>] (handle_irq_event_percpu) from [<80077b44>] (handle_irq_event+0x4c/0x6c)
r10:808c7ef0 r9:87283e00 r8:00000001 r7:00000000 r6:8454fe00 r5:84f01e60
r4:84f01e00
[<80077af8>] (handle_irq_event) from [<8007aa28>] (handle_fasteoi_irq+0xf0/0x1ac)
r6:808f52a4 r5:84f01e60 r4:84f01e00 r3:00000000
[<8007a938>] (handle_fasteoi_irq) from [<80076dc0>] (generic_handle_irq+0x3c/0x4c)
r6:00000026 r5:00000000 r4:00000026 r3:8007a938
[<80076d84>] (generic_handle_irq) from [<80077128>] (__handle_domain_irq+0x8c/0xfc)
r4:808c1e38 r3:0000002e
[<8007709c>] (__handle_domain_irq) from [<800087b8>] (gic_handle_irq+0x34/0x6c)
r10:80917748 r9:00000001 r8:88802100 r7:808c7ef0 r6:808c8fb0 r5:00000015
r4:8880210c r3:808c7ef0
[<80008784>] (gic_handle_irq) from [<80014044>] (__irq_svc+0x44/0x7c)
Link: http://lkml.kernel.org/r/1428953721-31349-1-git-send-email-rabin@rab.in
Link: http://lkml.kernel.org/r/1428957012-2319-1-git-send-email-rabin@rab.in
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"Quentin opened a can of worms by adding extable entry checking to
modpost, but most architectures seem fixed now. Thanks to all
involved.
Last minute rebase because I noticed a "[PATCH]" had snuck into a
commit message somehow"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modpost: don't emit section mismatch warnings for compiler optimizations
modpost: expand pattern matching to support substring matches
modpost: do not try to match the SHT_NUL section.
modpost: fix extable entry size calculation.
modpost: fix inverted logic in is_extable_fault_address().
modpost: handle -ffunction-sections
modpost: Whitelist .text.fixup and .exception.text
params: handle quotes properly for values not of form foo="bar".
modpost: document the use of struct section_check.
modpost: handle relocations mismatch in __ex_table.
scripts: add check_extable.sh script.
modpost: mismatch_handler: retrieve tosym information only when needed.
modpost: factorize symbol pretty print in get_pretty_name().
modpost: add handler function pointer to sectioncheck.
modpost: add .sched.text and .kprobes.text to the TEXT_SECTIONS list.
modpost: add strict white-listing when referencing sections.
module: do not print allocation-fail warning on bogus user buffer size
kernel/module.c: fix typos in message about unused symbols
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When starting kernel with arguments like:
init=/bin/sh -c "echo arguments"
the trailing double quote is not removed which results in following command
being executed:
/bin/sh -c 'echo arguments"'
Reported-by: Arthur Gautier <baloo@gandi.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
init_module(2) passes user-specified buffer length directly to
vmalloc(). It makes warn_alloc_failed() to print out a lot of info into
dmesg if user specified insane size, like -1.
Let's silence the warning. It doesn't add much value to -ENOMEM return
code. Without the patch the syscall is prohibitive noisy for testing
with trinity.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix typos in pr_warn message about unused symbols
Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here's the big char/misc driver patchset for 4.1-rc1.
Lots of different driver subsystem updates here, nothing major, full
details are in the shortlog.
All of this has been in linux-next for a while"
* tag 'char-misc-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (133 commits)
mei: trace: remove unused TRACE_SYSTEM_STRING
DTS: ARM: OMAP3-N900: Add lis3lv02d support
Documentation: DT: lis302: update wakeup binding
lis3lv02d: DT: add wakeup unit 2 and wakeup threshold
lis3lv02d: DT: use s32 to support negative values
Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case
Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case
mei: replace check for connection instead of transitioning
mei: use mei_cl_is_connected consistently
mei: fix mei_poll operation
hv_vmbus: Add gradually increased delay for retries in vmbus_post_msg()
Drivers: hv: hv_balloon: survive ballooning request with num_pages=0
Drivers: hv: hv_balloon: eliminate jumps in piecewiese linear floor function
Drivers: hv: hv_balloon: do not online pages in offline blocks
hv: remove the per-channel workqueue
hv: don't schedule new works in vmbus_onoffer()/vmbus_onoffer_rescind()
hv: run non-blocking message handlers in the dispatch tasklet
coresight: moving to new "hwtracing" directory
coresight-tmc: Adding a status interface to sysfs
coresight: remove the unnecessary configuration coresight-default-sink
...
|
| |\ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We want those fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |\ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
We want the mei fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
It looks like clockevents_unbind is being exported by mistake as:
- it is static;
- it is not listed in include/linux/clockchips.h;
- EXPORT_SYMBOL_GPL(clockevents_unbind) follows clockevents_unbind_device()
implementation.
I think clockevents_unbind_device should be exported instead. This is going to
be used to teardown Hyper-V clockevent devices on module unload.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here's the big tty/serial driver update for 4.1-rc1.
It was delayed for a bit due to some questions surrounding some of the
console command line parsing changes that are in here. There's still
one tiny regression for people who were previously putting multiple
console command lines and expecting them all to be ignored for some
odd reason, but Peter is working on fixing that. If not, I'll send a
revert for the offending patch, but I have faith that Peter can
address it.
Other than the console work here, there's the usual serial driver
updates and changes, and a buch of 8250 reworks to try to make that
driver easier to maintain over time, and have it support more devices
in the future.
All of these have been in linux-next for a while"
* tag 'tty-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (119 commits)
n_gsm: Drop unneeded cast on netdev_priv
sc16is7xx: expose RTS inversion in RS-485 mode
serial: 8250_pci: port failed after wakeup from S3
earlycon: 8250: Document kernel command line options
earlycon: 8250: Fix command line regression
earlycon: Fix __earlycon_table stride
tty: clean up the tty time logic a bit
serial: 8250_dw: only get the clock rate in one place
serial: 8250_dw: remove useless ACPI ID check
dmaengine: hsu: move memory allocation to GFP_NOWAIT
dmaengine: hsu: remove redundant pieces of code
serial: 8250_pci: add Intel Tangier support
dmaengine: hsu: add Intel Tangier PCI ID
serial: 8250_pci: replace switch-case by formula for Intel MID
serial: 8250_pci: replace switch-case by formula
tty: cpm_uart: replace CONFIG_8xx by CONFIG_CPM1
serial: jsm: some off by one bugs
serial: xuartps: Fix check in console_setup().
serial: xuartps: Get rid of register access macros.
serial: xuartps: Fix iobase use.
...
|
| |\ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
We want the fixes in here as well, also to help out with merge issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Add match() method to struct console which allows the console to
perform console command line matching instead of (or in addition to)
default console matching (ie., by fixed name and index).
The match() method returns 0 to indicate a successful match; normal
console matching occurs if no match() method is defined or the
match() method returns non-zero. The match() method is expected to set
the console index if required.
Re-implement earlycon-to-console-handoff with direct matching of
"console=uart|uart8250,..." to the 8250 ttyS console.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |\ \ \ \ \ \ \ \ \
| | | |_|_|/ / / / /
| | |/| | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
We want the tty/serial fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |\ \ \ \ \ \ \ \ \
| | |_|_|_|/ / / / /
| |/| | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This resolves a merge issue in drivers/tty/serial/8250/8250_pci.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Before register_console() calls the setup() method of the matched
console, the registering console index is already equal to the index
from the console command line; ie. newcon->index == c->index.
This change is also required to support extensible console matching;
(the command line index may have no relation to the console index
assigned by the console-defined match() function).
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async()
locking") fixed the locking for the asynchronous smp-call case, but in
the process of moving the lock handling around, one of the error cases
ended up not unlocking the call data at all.
This went unnoticed on x86, because this is a "caller is buggy" case,
where the caller is trying to call a non-existent CPU. But apparently
ARM does that (at least under qemu-arm). Bindly doing cross-cpu calls
to random CPU's that aren't even online seems a bit fishy, but the error
handling was clearly not correct.
Simply add the missing "csd_unlock()" to the error path.
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Analyzed-by: Rabin Vincent <rabin@rab.in>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Two fixes: an smp-call fix and a lockdep fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp: Fix smp_call_function_single_async() locking
lockdep: Make print_lock() robust against concurrent release
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
The current smp_function_call code suffers a number of problems, most
notably smp_call_function_single_async() is broken.
The problem is that flush_smp_call_function_queue() does csd_unlock()
_after_ calling csd->func(). This means that a caller cannot properly
synchronize the csd usage as it has to.
Change the code to release the csd before calling ->func() for the
async case, and put a WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK) in
smp_call_function_single_async() to warn us of improper serialization,
because any waiting there can results in deadlocks when called with
IRQs disabled.
Rename the (currently) unused WAIT flag to SYNCHRONOUS and (re)use it
such that we know what to do in flush_smp_call_function_queue().
Rework csd_{,un}lock() to use smp_load_acquire() / smp_store_release()
to avoid some full barriers while more clearly providing lock
semantics.
Finally move the csd maintenance out of generic_exec_single() into its
callers for clearer code.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ Added changelog. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Rafael David Tinoco <inaddy@ubuntu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/CA+55aFz492bzLFhdbKN-Hygjcreup7CjMEYk3nTSfRWjppz-OA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
During sysrq's show-held-locks command it is possible that
hlock_class() returns NULL for a given lock. The result is then (after
the warning):
|BUG: unable to handle kernel NULL pointer dereference at 0000001c
|IP: [<c1088145>] get_usage_chars+0x5/0x100
|Call Trace:
| [<c1088263>] print_lock_name+0x23/0x60
| [<c1576b57>] print_lock+0x5d/0x7e
| [<c1088314>] lockdep_print_held_locks+0x74/0xe0
| [<c1088652>] debug_show_all_locks+0x132/0x1b0
| [<c1315c48>] sysrq_handle_showlocks+0x8/0x10
This *might* happen because the thread on the other CPU drops the lock
after we are looking ->lockdep_depth and ->held_locks points no longer
to a lock that is held.
The fix here is to simply ignore it and continue.
Reported-by: Andreas Messerschmid <andreas@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Pull networking fixes from David Miller:
1) Fix verifier memory corruption and other bugs in BPF layer, from
Alexei Starovoitov.
2) Add a conservative fix for doing BPF properly in the BPF classifier
of the packet scheduler on ingress. Also from Alexei.
3) The SKB scrubber should not clear out the packet MARK and security
label, from Herbert Xu.
4) Fix oops on rmmod in stmmac driver, from Bryan O'Donoghue.
5) Pause handling is not correct in the stmmac driver because it
doesn't take into consideration the RX and TX fifo sizes. From
Vince Bridgers.
6) Failure path missing unlock in FOU driver, from Wang Cong.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
net: dsa: use DEVICE_ATTR_RW to declare temp1_max
netns: remove BUG_ONs from net_generic()
IB/ipoib: Fix ndo_get_iflink
sfc: Fix memcpy() with const destination compiler warning.
altera tse: Fix network-delays and -retransmissions after high throughput.
net: remove unused 'dev' argument from netif_needs_gso()
act_mirred: Fix bogus header when redirecting from VLAN
inet_diag: fix access to tcp cc information
tcp: tcp_get_info() should fetch socket fields once
net: dsa: mv88e6xxx: Add missing initialization in mv88e6xxx_set_port_state()
skbuff: Do not scrub skb mark within the same name space
Revert "net: Reset secmark when scrubbing packet"
bpf: fix two bugs in verification logic when accessing 'ctx' pointer
bpf: fix bpf helpers to use skb->mac_header relative offsets
stmmac: Configure Flow Control to work correctly based on rxfifo size
stmmac: Enable unicast pause frame detect in GMAC Register 6
stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree
stmmac: Add defines and documentation for enabling flow control
stmmac: Add properties for transmit and receive fifo sizes
stmmac: fix oops on rmmod after assigning ip addr
...
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
1.
first bug is a silly mistake. It broke tracing examples and prevented
simple bpf programs from loading.
In the following code:
if (insn->imm == 0 && BPF_SIZE(insn->code) == BPF_W) {
} else if (...) {
// this part should have been executed when
// insn->code == BPF_W and insn->imm != 0
}
Obviously it's not doing that. So simple instructions like:
r2 = *(u64 *)(r1 + 8)
will be rejected. Note the comments in the code around these branches
were and still valid and indicate the true intent.
Replace it with:
if (BPF_SIZE(insn->code) != BPF_W)
continue;
if (insn->imm == 0) {
} else if (...) {
// now this code will be executed when
// insn->code == BPF_W and insn->imm != 0
}
2.
second bug is more subtle.
If malicious code is using the same dest register as source register,
the checks designed to prevent the same instruction to be used with different
pointer types will fail to trigger, since we were assigning src_reg_type
when it was already overwritten by check_mem_access().
The fix is trivial. Just move line:
src_reg_type = regs[insn->src_reg].type;
before check_mem_access().
Add new 'access skb fields bad4' test to check this case.
Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Due to missing bounds check the DAG pass of the BPF verifier can corrupt
the memory which can cause random crashes during program loading:
[8.449451] BUG: unable to handle kernel paging request at ffffffffffffffff
[8.451293] IP: [<ffffffff811de33d>] kmem_cache_alloc_trace+0x8d/0x2f0
[8.452329] Oops: 0000 [#1] SMP
[8.452329] Call Trace:
[8.452329] [<ffffffff8116cc82>] bpf_check+0x852/0x2000
[8.452329] [<ffffffff8116b7e4>] bpf_prog_load+0x1e4/0x310
[8.452329] [<ffffffff811b190f>] ? might_fault+0x5f/0xb0
[8.452329] [<ffffffff8116c206>] SyS_bpf+0x806/0xa30
Fixes: f1bca824dabb ("bpf: add search pruning optimization to verifier")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
sync_buffer() needs the mmap_sem for two distinct operations, both only
occurring upon user context switch handling:
1) Dealing with the exe_file.
2) Adding the dcookie data as we need to lookup the vma that
backs it. This is done via add_sample() and add_data().
This patch isolates 1), for it will no longer need the mmap_sem for
serialization. However, for now, make of the more standard
get_mm_exe_file(), requiring only holding the mmap_sem to read the value,
and relying on reference counting to make sure that the exe file won't
dissappear underneath us while doing the get dcookie.
As a consequence, for 2) we move the mmap_sem locking into where we really
need it, in lookup_dcookie(). The benefits are twofold: reduce mmap_sem
hold times, and cleaner code.
[akpm@linux-foundation.org: export get_mm_exe_file for arch/x86/oprofile/oprofile.ko]
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Robert Richter <rric@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
gcov profiling if enabled with other heavy compile-time instrumentation
like KASan could trigger following softlockups:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
Modules linked in:
irq event stamp: 22823276
hardirqs last enabled at (22823275): [<ffffffff86e8d10d>] mutex_lock_nested+0x7d9/0x930
hardirqs last disabled at (22823276): [<ffffffff86e9521d>] apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (22823172): [<ffffffff811ed969>] __do_softirq+0x4db/0x729
softirqs last disabled at (22823167): [<ffffffff811edfcf>] irq_exit+0x7d/0x15b
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.19.0-05245-gbb33326-dirty #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
task: ffff88006cba8000 ti: ffff88006cbb0000 task.ti: ffff88006cbb0000
RIP: kasan_mem_to_shadow+0x1e/0x1f
Call Trace:
strcmp+0x28/0x70
get_node_by_name+0x66/0x99
gcov_event+0x4f/0x69e
gcov_enable_events+0x54/0x7b
gcov_fs_init+0xf8/0x134
do_one_initcall+0x1b2/0x288
kernel_init_freeable+0x467/0x580
kernel_init+0x15/0x18b
ret_from_fork+0x7c/0xb0
Kernel panic - not syncing: softlockup: hung tasks
Fix this by sticking cond_resched() in gcov_enable_events().
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
When converting unsigned long to int overflows may occur. These currently
are not detected when writing to the sysctl file system.
E.g. on a system where int has 32 bits and long has 64 bits
echo 0x800001234 > /proc/sys/kernel/threads-max
has the same effect as
echo 0x1234 > /proc/sys/kernel/threads-max
The patch adds the missing check in do_proc_dointvec_conv.
With the patch an overflow will result in an error EINVAL when writing to
the the sysctl file system.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|