summaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* swiotlb: Skip cache maintenance on map errorRobin Murphy2018-11-211-1/+2
| | | | | | | | | | | | | | If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may lead to such delights as performing cache maintenance on whatever address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically outside the kernel memory map and goes about as well as expected. Don't do that. Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA") Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2018-11-181-21/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/memblock.c: fix a typo in __next_mem_pfn_range() comments mm, page_alloc: check for max order in hot path scripts/spdxcheck.py: make python3 compliant tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn mm/vmstat.c: fix NUMA statistics updates mm/gup.c: fix follow_page_mask() kerneldoc comment ocfs2: free up write context when direct IO failed scripts/faddr2line: fix location of start_kernel in comment mm: don't reclaim inodes with many attached pages mm, memory_hotplug: check zone_movable in has_unmovable_pages mm/swapfile.c: use kvzalloc for swap_info_struct allocation MAINTAINERS: update OMAP MMC entry hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! kernel/sched/psi.c: simplify cgroup_move_task() z3fold: fix possible reclaim races
| * kernel/sched/psi.c: simplify cgroup_move_task()Olof Johansson2018-11-181-21/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing code triggered an invalid warning about 'rq' possibly being used uninitialized. Instead of doing the silly warning suppression by initializa it to NULL, refactor the code to bail out early instead. Warning was: kernel/sched/psi.c: In function `cgroup_move_task': kernel/sched/psi.c:639:13: warning: `rq' may be used uninitialized in this function [-Wmaybe-uninitialized] Link: http://lkml.kernel.org/r/20181103183339.8669-1-olof@lixom.net Fixes: 2ce7135adc9ad ("psi: cgroup support") Signed-off-by: Olof Johansson <olof@lixom.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2018-11-181-14/+48
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix an exec() related scalability/performance regression, which was caused by incorrectly calculating load and migrating tasks on exec() when they shouldn't be" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix cpu_util_wake() for 'execl' type workloads
| * sched/fair: Fix cpu_util_wake() for 'execl' type workloadsPatrick Bellasi2018-11-121-14/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A ~10% regression has been reported for UnixBench's execl throughput test by Aaron Lu and Ye Xiaolong: https://lkml.org/lkml/2018/10/30/765 That test is pretty simple, it does a "recursive" execve() syscall on the same binary. Starting from the syscall, this sequence is possible: do_execve() do_execveat_common() __do_execve_file() sched_exec() select_task_rq_fair() <==| Task already enqueued find_idlest_cpu() find_idlest_group() capacity_spare_wake() <==| Functions not called from cpu_util_wake() | the wakeup path which means we can end up calling cpu_util_wake() not only from the "wakeup path", as its name would suggest. Indeed, the task doing an execve() syscall is already enqueued on the CPU we want to get the cpu_util_wake() for. The estimated utilization for a CPU computed in cpu_util_wake() was written under the assumption that function can be called only from the wakeup path. If instead the task is already enqueued, we end up with a utilization which does not remove the current task's contribution from the estimated utilization of the CPU. This will wrongly assume a reduced spare capacity on the current CPU and increase the chances to migrate the task on execve. The regression is tracked down to: commit d519329f72a6 ("sched/fair: Update util_est only on util_avg updates") because in that patch we turn on by default the UTIL_EST sched feature. However, the real issue is introduced by: commit f9be3e5961c5 ("sched/fair: Use util_est in LB and WU paths") Let's fix this by ensuring to always discount the task estimated utilization from the CPU's estimated utilization when the task is also the current one. The same benchmark of the bug report, executed on a dual socket 40 CPUs Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz machine, reports these "Execl Throughput" figures (higher the better): mainline : 48136.5 lps mainline+fix : 55376.5 lps which correspond to a 15% speedup. Moreover, since {cpu_util,capacity_spare}_wake() are not really only used from the wakeup path, let's remove this ambiguity by using a better matching name: {cpu_util,capacity_spare}_without(). Since we are at that, let's also improve the existing documentation. Reported-by: Aaron Lu <aaron.lu@intel.com> Reported-by: Ye Xiaolong <xiaolong.ye@intel.com> Tested-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Fixes: f9be3e5961c5 (sched/fair: Use util_est in LB and WU paths) Link: https://lore.kernel.org/lkml/20181025093100.GB13236@e110439-lin/ Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | kdb: kdb_support: mark expected switch fall-throughsGustavo A. R. Silva2018-11-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in this particular case, I replaced the code comments with a proper "fall through" annotation, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | kdb: kdb_keyboard: mark expected switch fall-throughsGustavo A. R. Silva2018-11-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in this particular case, I replaced the code comments with a proper "fall through" annotation, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | kdb: kdb_main: refactor code in kdb_md_lineGustavo A. R. Silva2018-11-131-18/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the whole switch statement with a for loop. This makes the code clearer and easy to read. This also addresses the following Coverity warnings: Addresses-Coverity-ID: 115090 ("Missing break in switch") Addresses-Coverity-ID: 115091 ("Missing break in switch") Addresses-Coverity-ID: 114700 ("Missing break in switch") Suggested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> [daniel.thompson@linaro.org: Tiny grammar change in description] Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | kdb: Use strscpy with destination buffer sizePrarit Bhargava2018-11-133-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc 8.1.0 warns with: kernel/debug/kdb/kdb_support.c: In function ‘kallsyms_symbol_next’: kernel/debug/kdb/kdb_support.c:239:4: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=] strncpy(prefix_name, name, strlen(name)+1); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ kernel/debug/kdb/kdb_support.c:239:31: note: length computed here Use strscpy() with the destination buffer size, and use ellipses when displaying truncated symbols. v2: Use strscpy() Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Jonathan Toppins <jtoppins@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: kgdb-bugreport@lists.sourceforge.net Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | kdb: print real address of pointers instead of hashed addressesChristophe Leroy2018-11-132-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit ad67b74d2469 ("printk: hash addresses printed with %p"), all pointers printed with %p are printed with hashed addresses instead of real addresses in order to avoid leaking addresses in dmesg and syslog. But this applies to kdb too, with is unfortunate: Entering kdb (current=0x(ptrval), pid 329) due to Keyboard Entry kdb> ps 15 sleeping system daemon (state M) processes suppressed, use 'ps A' to see all. Task Addr Pid Parent [*] cpu State Thread Command 0x(ptrval) 329 328 1 0 R 0x(ptrval) *sh 0x(ptrval) 1 0 0 0 S 0x(ptrval) init 0x(ptrval) 3 2 0 0 D 0x(ptrval) rcu_gp 0x(ptrval) 4 2 0 0 D 0x(ptrval) rcu_par_gp 0x(ptrval) 5 2 0 0 D 0x(ptrval) kworker/0:0 0x(ptrval) 6 2 0 0 D 0x(ptrval) kworker/0:0H 0x(ptrval) 7 2 0 0 D 0x(ptrval) kworker/u2:0 0x(ptrval) 8 2 0 0 D 0x(ptrval) mm_percpu_wq 0x(ptrval) 10 2 0 0 D 0x(ptrval) rcu_preempt The whole purpose of kdb is to debug, and for debugging real addresses need to be known. In addition, data displayed by kdb doesn't go into dmesg. This patch replaces all %p by %px in kdb in order to display real addresses. Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: <stable@vger.kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | kdb: use correct pointer when 'btc' calls 'btt'Christophe Leroy2018-11-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a powerpc 8xx, 'btc' fails as follows: Entering kdb (current=0x(ptrval), pid 282) due to Keyboard Entry kdb> btc btc: cpu status: Currently on cpu 0 Available cpus: 0 kdb_getarea: Bad address 0x0 when booting the kernel with 'debug_boot_weak_hash', it fails as well Entering kdb (current=0xba99ad80, pid 284) due to Keyboard Entry kdb> btc btc: cpu status: Currently on cpu 0 Available cpus: 0 kdb_getarea: Bad address 0xba99ad80 On other platforms, Oopses have been observed too, see https://github.com/linuxppc/linux/issues/139 This is due to btc calling 'btt' with %p pointer as an argument. This patch replaces %p by %px to get the real pointer value as expected by 'btt' Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Cc: <stable@vger.kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
* | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2018-11-111-3/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Just the removal of a redundant call into the sched deadline overrun check" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-cpu-timers: Remove useless call to check_dl_overrun()
| * | posix-cpu-timers: Remove useless call to check_dl_overrun()Juri Lelli2018-11-081-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | check_dl_overrun() is used to send a SIGXCPU to users that asked to be informed when a SCHED_DEADLINE runtime overruns occur. The function is called by check_thread_timers() already, so the call in check_process_timers() is redundant/wrong (even though harmless). Remove it. Fixes: 34be39305a77 ("sched/deadline: Implement "runtime overrun signal" support") Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: linux-rt-users@vger.kernel.org Cc: mtk.manpages@gmail.com Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Claudio Scordino <claudio@evidence.eu.com> Link: https://lkml.kernel.org/r/20181107111032.32291-1-juri.lelli@redhat.com
* | | Merge branch 'sched/urgent' of ↵Linus Torvalds2018-11-112-3/+6
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Two small scheduler fixes: - Take hotplug lock in sched_init_smp(). Technically not really required, but lockdep will complain other. - Trivial comment fix in sched/fair" * 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix a comment in task_numa_fault() sched/core: Take the hotplug lock in sched_init_smp()
| * | sched/fair: Fix a comment in task_numa_fault()Yi Wang2018-11-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Duplicated 'case it'. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Reviewed-by: Xi Xu <xu.xi8@zte.com.cn> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/1541379013-11352-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | sched/core: Take the hotplug lock in sched_init_smp()Valentin Schneider2018-11-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running on linux-next (8c60c36d0b8c ("Add linux-next specific files for 20181019")) + CONFIG_PROVE_LOCKING=y on a big.LITTLE system (e.g. Juno or HiKey960), we get the following report: [ 0.748225] Call trace: [ 0.750685] lockdep_assert_cpus_held+0x30/0x40 [ 0.755236] static_key_enable_cpuslocked+0x20/0xc8 [ 0.760137] build_sched_domains+0x1034/0x1108 [ 0.764601] sched_init_domains+0x68/0x90 [ 0.768628] sched_init_smp+0x30/0x80 [ 0.772309] kernel_init_freeable+0x278/0x51c [ 0.776685] kernel_init+0x10/0x108 [ 0.780190] ret_from_fork+0x10/0x18 The static_key in question is 'sched_asym_cpucapacity' introduced by commit: df054e8445a4 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations") In this particular case, we enable it because smp_prepare_cpus() will end up fetching the capacity-dmips-mhz entry from the devicetree, so we already have some asymmetry detected when entering sched_init_smp(). This didn't get detected in tip/sched/core because we were missing: commit cb538267ea1e ("jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations") Calls to build_sched_domains() post sched_init_smp() will hold the hotplug lock, it just so happens that this very first call is a special case. As stated by a comment in sched_init_smp(), "There's no userspace yet to cause hotplug operations" so this is a harmless warning. However, to both respect the semantics of underlying callees and make lockdep happy, take the hotplug lock in sched_init_smp(). This also satisfies the comment atop sched_init_domains() that says "Callers must hold the hotplug lock". Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: morten.rasmussen@arm.com Cc: quentin.perret@arm.com Link: http://lkml.kernel.org/r/1540301851-3048-1-git-send-email-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2018-11-111-5/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: "A couple of fixlets for the core: - Kernel doc function documentation fixes - Missing prototypes for weak watchdog functions" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: resource/docs: Complete kernel-doc style function documentation watchdog/core: Add missing prototypes for weak functions resource/docs: Fix new kernel-doc warnings
| * | | resource/docs: Complete kernel-doc style function documentationBorislav Petkov2018-11-071-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the missing kernel-doc style function parameters documentation. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: linux-tip-commits@vger.kernel.org Cc: rdunlap@infradead.org Fixes: b69c2e20f6e4 ("resource: Clean it up a bit") Link: http://lkml.kernel.org/r/20181105093307.GA12445@zn.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | resource/docs: Fix new kernel-doc warningsRandy Dunlap2018-11-051-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first group of warnings is caused by a "/**" kernel-doc notation marker but the function comments are not in kernel-doc format. Also add another error return value here. ../kernel/resource.c:337: warning: Function parameter or member 'start' not described in 'find_next_iomem_res' ../kernel/resource.c:337: warning: Function parameter or member 'end' not described in 'find_next_iomem_res' ../kernel/resource.c:337: warning: Function parameter or member 'flags' not described in 'find_next_iomem_res' ../kernel/resource.c:337: warning: Function parameter or member 'desc' not described in 'find_next_iomem_res' ../kernel/resource.c:337: warning: Function parameter or member 'first_lvl' not described in 'find_next_iomem_res' ../kernel/resource.c:337: warning: Function parameter or member 'res' not described in 'find_next_iomem_res' Add the missing function parameter documentation for the other warnings: ../kernel/resource.c:409: warning: Function parameter or member 'arg' not described in 'walk_iomem_res_desc' ../kernel/resource.c:409: warning: Function parameter or member 'func' not described in 'walk_iomem_res_desc' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: b69c2e20f6e4 ("resource: Clean it up a bit") Link: http://lkml.kernel.org/r/dda2e4d8-bedd-3167-20fe-8c7d2d35b354@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2018-11-101-4/+8
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace fixes from Eric Biederman: "I believe all of these are simple obviously correct bug fixes. These fall into two groups: - Fixing the implementation of MNT_LOCKED which prevents lesser privileged users from seeing unders mounts created by more privileged users. - Fixing the extended uid and group mapping in user namespaces. As well as ensuring the code looks correct I have spot tested these changes as well and in my testing the fixes are working" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: mount: Prevent MNT_DETACH from disconnecting locked mounts mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts mount: Retest MNT_LOCKED in do_umount userns: also map extents in the reverse map to kernel IDs
| * | | | userns: also map extents in the reverse map to kernel IDsJann Horn2018-11-071-4/+8
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logic first clones the extent array and sorts both copies, then maps the lower IDs of the forward mapping into the lower namespace, but doesn't map the lower IDs of the reverse mapping. This means that code in a nested user namespace with >5 extents will see incorrect IDs. It also breaks some access checks, like inode_owner_or_capable() and privileged_wrt_inode_uidgid(), so a process can incorrectly appear to be capable relative to an inode. To fix it, we have to make sure that the "lower_first" members of extents in both arrays are translated; and we have to make sure that the reverse map is sorted *after* the translation (since otherwise the translation can break the sorting). This is CVE-2018-18955. Fixes: 6397fac4915a ("userns: bump idmap limits to 340") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Tested-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | | | Merge tag 'trace-v4.20-rc1' of ↵Linus Torvalds2018-11-061-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Masami found a slight bug in his code where he transposed the arguments of a call to strpbrk. The reason this wasn't detected in our tests is that the only way this would transpire is when a kprobe event with a symbol offset is attached to a function that belongs to a module that isn't loaded yet. When the kprobe trace event is added, the offset would be truncated after it was parsed, and when the module is loaded, it would use the symbol without the offset (as the nul character added by the parsing would not be replaced with the original character)" * tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Fix strpbrk() argument order
| * | | | tracing/kprobes: Fix strpbrk() argument orderMasami Hiramatsu2018-11-051-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix strpbrk()'s argument order, it must pass acceptable string in 2nd argument. Note that this can cause a kernel panic where it recovers backup character to code->data. Link: http://lkml.kernel.org/r/154108256792.2604.1816052586385217811.stgit@devbox Fixes: a6682814f371 ("tracing/kprobes: Allow kprobe-events to record module symbol") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-11-062-13/+26
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Handle errors mid-stream of an all dump, from Alexey Kodanev. 2) Fix build of openvswitch with certain combinations of netfilter options, from Arnd Bergmann. 3) Fix interactions between GSO and BQL, from Eric Dumazet. 4) Don't put a '/' in RTL8201F's sysfs file name, from Holger Hoffstätte. 5) S390 qeth driver fixes from Julian Wiedmann. 6) Allow ipv6 link local addresses for netconsole when both source and destination are link local, from Matwey V. Kornilov. 7) Fix the BPF program address seen in /proc/kallsyms, from Song Liu. 8) Initialize mutex before use in dsa microchip driver, from Tristram Ha. 9) Out-of-bounds access in hns3, from Yunsheng Lin. 10) Various netfilter fixes from Stefano Brivio, Jozsef Kadlecsik, Jiri Slaby, Florian Westphal, Eric Westbrook, Andrey Ryabinin, and Pablo Neira Ayuso. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (50 commits) net: alx: make alx_drv_name static net: bpfilter: fix iptables failure if bpfilter_umh is disabled sock_diag: fix autoloading of the raw_diag module net: core: netpoll: Enable netconsole IPv6 link local address ipv6: properly check return value in inet6_dump_all() rtnetlink: restore handling of dumpit return value in rtnl_dump_all() net/ipv6: Move anycast init/cleanup functions out of CONFIG_PROC_FS bonding/802.3ad: fix link_failure_count tracking net: phy: realtek: fix RTL8201F sysfs name sctp: define SCTP_SS_DEFAULT for Stream schedulers sctp: fix strchange_flags name for Stream Change Event mlxsw: spectrum: Fix IP2ME CPU policer configuration openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS qed: fix link config error handling net: hns3: Fix for out-of-bounds access when setting pfc back pressure net/mlx4_en: use __netdev_tx_sent_queue() net: do not abort bulk send on BQL status net: bql: add __netdev_tx_sent_queue() s390/qeth: report 25Gbit link speed s390/qeth: sanitize ARP requests ...
| * | | bpf: fix bpf_prog_get_info_by_fd to return 0 func_lens for unprivDaniel Borkmann2018-11-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While dbecd7388476 ("bpf: get kernel symbol addresses via syscall") zeroed info.nr_jited_ksyms in bpf_prog_get_info_by_fd() for queries from unprivileged users, commit 815581c11cc2 ("bpf: get JITed image lengths of functions via syscall") forgot about doing so and therefore returns the #elems of the user set up buffer which is incorrect. It also needs to indicate a info.nr_jited_func_lens of zero. Fixes: 815581c11cc2 ("bpf: get JITed image lengths of functions via syscall") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * | | bpf: show main program address and length in bpf_prog_infoSong Liu2018-11-021-9/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when there is no subprog (prog->aux->func_cnt == 0), bpf_prog_info does not return any jited_ksyms or jited_func_lens. This patch adds main program address (prog->bpf_func) and main program length (prog->jited_len) to bpf_prog_info. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | bpf: show real jited address in bpf_prog_info->jited_ksymsSong Liu2018-11-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, jited_ksyms in bpf_prog_info shows page addresses of jited bpf program. The main reason here is to not expose randomized start address. However, this is not ideal for detailed profiling (find hot instructions from stack traces). This patch replaces the page address with real prog start address. This change is OK because bpf_prog_get_info_by_fd() is only available to root. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | bpf: show real jited prog address in /proc/kallsymsSong Liu2018-11-021-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, /proc/kallsyms shows page address of jited bpf program. The main reason here is to not expose randomized start address. However, This is not ideal for detailed profiling (find hot instructions from stack traces). This patch replaces the page address with real prog start address. This change is OK because these addresses are still protected by sysctl kptr_restrict (see kallsyms_show_value()), and only programs loaded by root are added to kallsyms (see bpf_prog_kallsyms_add()). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | | Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2018-11-032-2/+2
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A memory (under-)allocation fix and a comment fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/topology: Fix off by one bug sched/rt: Update comment in pick_next_task_rt()
| * | | sched/topology: Fix off by one bugPeter Zijlstra2018-11-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the addition of the NUMA identity level, we increased @level by one and will run off the end of the array in the distance sort loop. Fixed: 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | sched/rt: Update comment in pick_next_task_rt()Muchun Song2018-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit: f4ebcbc0d7e0 ("sched/rt: Substract number of tasks of throttled queues from rq->nr_running") added a new rt_rq->rt_queued field, which is used to indicate the status of rq->rt enqueue or dequeue. So, the ->rt_nr_running check was removed and we now check ->rt_queued instead. Fix the comment in pick_next_task_rt() as well, which was still referencing the old logic. Signed-off-by: Muchun Song <smuchun@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20181027030517.23292-1-smuchun@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2018-11-031-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A number of fixes and some late updates: - make in_compat_syscall() behavior on x86-32 similar to other platforms, this touches a number of generic files but is not intended to impact non-x86 platforms. - objtool fixes - PAT preemption fix - paravirt fixes/cleanups - cpufeatures updates for new instructions - earlyprintk quirk - make microcode version in sysfs world-readable (it is already world-readable in procfs) - minor cleanups and fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compat: Cleanup in_compat_syscall() callers x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT objtool: Support GCC 9 cold subfunction naming scheme x86/numa_emulation: Fix uniform-split numa emulation x86/paravirt: Remove unused _paravirt_ident_32 x86/mm/pat: Disable preemption around __flush_tlb_all() x86/paravirt: Remove GPL from pv_ops export x86/traps: Use format string with panic() call x86: Clean up 'sizeof x' => 'sizeof(x)' x86/cpufeatures: Enumerate MOVDIR64B instruction x86/cpufeatures: Enumerate MOVDIRI instruction x86/earlyprintk: Add a force option for pciserial device objtool: Support per-function rodata sections x86/microcode: Make revision and processor flags world-readable
| * \ \ \ Merge branch 'core/urgent' into x86/urgent, to pick up objtool fixIngo Molnar2018-11-0333-1080/+1433
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | compat: Cleanup in_compat_syscall() callersDmitry Safonov2018-11-011-1/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that in_compat_syscall() is consistent on all architectures and does not longer report true on native i686, the workarounds (ifdeffery and helpers) can be removed. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Lutomirsky <luto@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-efi@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/r/20181012134253.23266-3-dima@arista.com
* | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2018-11-031-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates and fixes from Ingo Molnar: "These are almost all tooling updates: 'perf top', 'perf trace' and 'perf script' fixes and updates, an UAPI header sync with the merge window versions, license marker updates, much improved Sparc support from David Miller, and a number of fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits) perf intel-pt/bts: Calculate cpumode for synthesized samples perf intel-pt: Insert callchain context into synthesized callchains perf tools: Don't clone maps from parent when synthesizing forks perf top: Start display thread earlier tools headers uapi: Update linux/if_link.h header copy tools headers uapi: Update linux/netlink.h header copy tools headers: Sync the various kvm.h header copies tools include uapi: Update linux/mmap.h copy perf trace beauty: Use the mmap flags table generated from headers perf beauty: Wire up the mmap flags table generator to the Makefile perf beauty: Add a generator for MAP_ mmap's flag constants tools include uapi: Update asound.h copy tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies tools include uapi: Update linux/fs.h copy perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc} perf cs-etm: Correct CPU mode for samples perf unwind: Take pgoff into account when reporting elf to libdwfl perf top: Do not use overwrite mode by default perf top: Allow disabling the overwrite mode perf trace: Beautify mount's first pathname arg ...
| * | | | perf/core: Clean up inconsisent indentationColin Ian King2018-10-301-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace a bunch of spaces with tab, cleans up indentation Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20181029233211.21475-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2018-11-031-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "An irqchip driver fix and a memory (over-)allocation fix" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function irq/matrix: Fix memory overallocation
| * | | | irq/matrix: Fix memory overallocationMichael Kelley2018-11-011-1/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by the size of a long, yielding a byte count. But it is used to size an array of longs, which is way more memory than is needed. Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the arrays come out the correct size. Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: KY Srinivasan <kys@microsoft.com> Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com
* | | | kernel/sysctl.c: remove duplicated includeMichael Schupikov2018-11-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove one include of <linux/pipe_fs_i.h>. No functional changes. Link: http://lkml.kernel.org/r/20181004134223.17735-1-michael@schupikov.de Signed-off-by: Michael Schupikov <michael@schupikov.de> Reviewed-by: Richard Weinberger <richard@nod.at> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | kernel/kexec_file.c: remove some duplicated includeszhong jiang2018-11-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We include kexec.h and slab.h twice in kexec_file.c. It's unnecessary. hence just remove them. Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'for-linus-20181102' of git://git.kernel.dk/linux-blockLinus Torvalds2018-11-022-41/+11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer fixes from Jens Axboe: "The biggest part of this pull request is the revert of the blkcg cleanup series. It had one fix earlier for a stacked device issue, but another one was reported. Rather than play whack-a-mole with this, revert the entire series and try again for the next kernel release. Apart from that, only small fixes/changes. Summary: - Indentation fixup for mtip32xx (Colin Ian King) - The blkcg cleanup series revert (Dennis Zhou) - Two NVMe fixes. One fixing a regression in the nvme request initialization in this merge window, causing nvme-fc to not work. The other is a suspend/resume p2p resource issue (James, Keith) - Fix sg discard merge, allowing us to merge in cases where we didn't before (Jianchao Wang) - Call rq_qos_exit() after the queue is frozen, preventing a hang (Ming) - Fix brd queue setup, fixing an oops if we fail setting up all devices (Ming)" * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block: nvme-pci: fix conflicting p2p resource adds nvme-fc: fix request private initialization blkcg: revert blkcg cleanups series block: brd: associate with queue until adding disk block: call rq_qos_exit() after queue is frozen mtip32xx: clean an indentation issue, remove extraneous tabs block: fix the DISCARD request merge
| * | | | blkcg: revert blkcg cleanups seriesDennis Zhou2018-11-012-41/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts a series committed earlier due to null pointer exception bug report in [1]. It seems there are edge case interactions that I did not consider and will need some time to understand what causes the adverse interactions. The original series can be found in [2] with a follow up series in [3]. [1] https://www.spinics.net/lists/cgroups/msg20719.html [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/ [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/ This reverts the following commits: d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae, f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef, a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53 Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2018-11-011-0/+1
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull virtio/vhost updates from Michael Tsirkin: "Fixes and tweaks: - virtio balloon page hinting support - vhost scsi control queue - misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: remove reference to bogus vsock file vhost/scsi: Use common handling code in request queue handler vhost/scsi: Extract common handling code from control queue handler vhost/scsi: Respond to control queue operations vhost/scsi: truncate T10 PI iov_iter to prot_bytes virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON mm/page_poison: expose page_poisoning_enabled to kernel modules virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT kvm_config: add CONFIG_VIRTIO_MENU
| * | | | | kvm_config: add CONFIG_VIRTIO_MENULénaïc Huard2018-10-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that make kvmconfig enables all the virtio drivers even if it is preceded by a make allnoconfig. Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | | | | | Merge tag 'stackleak-v4.20-rc1' of ↵Linus Torvalds2018-11-014-1/+153
|\ \ \ \ \ \ | |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull stackleak gcc plugin from Kees Cook: "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin was ported from grsecurity by Alexander Popov. It provides efficient stack content poisoning at syscall exit. This creates a defense against at least two classes of flaws: - Uninitialized stack usage. (We continue to work on improving the compiler to do this in other ways: e.g. unconditional zero init was proposed to GCC and Clang, and more plugin work has started too). - Stack content exposure. By greatly reducing the lifetime of valid stack contents, exposures via either direct read bugs or unknown cache side-channels become much more difficult to exploit. This complements the existing buddy and heap poisoning options, but provides the coverage for stacks. The x86 hooks are included in this series (which have been reviewed by Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already been merged through the arm64 tree (written by Laura Abbott and reviewed by Mark Rutland and Will Deacon). With VLAs having been removed this release, there is no need for alloca() protection, so it has been removed from the plugin" * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: Drop unneeded stackleak_check_alloca() stackleak: Allow runtime disabling of kernel stack erasing doc: self-protection: Add information about STACKLEAK feature fs/proc: Show STACKLEAK metrics in the /proc file system lkdtm: Add a test for STACKLEAK gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
| * | | | | stackleak: Allow runtime disabling of kernel stack erasingAlexander Popov2018-09-042-1/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides 'stack_erasing' sysctl. It can be used in runtime to control kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Alexander Popov <alex.popov@linux.com> Tested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | fs/proc: Show STACKLEAK metrics in the /proc file systemAlexander Popov2018-09-041-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce CONFIG_STACKLEAK_METRICS providing STACKLEAK information about tasks via the /proc file system. In particular, /proc/<pid>/stack_depth shows the maximum kernel stack consumption for the current and previous syscalls. Although this information is not precise, it can be useful for estimating the STACKLEAK performance impact for your workloads. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Alexander Popov <alex.popov@linux.com> Tested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | gcc-plugins: Add STACKLEAK plugin for tracking the kernel stackAlexander Popov2018-09-041-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The STACKLEAK feature erases the kernel stack before returning from syscalls. That reduces the information which kernel stack leak bugs can reveal and blocks some uninitialized stack variable attacks. This commit introduces the STACKLEAK gcc plugin. It is needed for tracking the lowest border of the kernel stack, which is important for the code erasing the used part of the kernel stack at the end of syscalls (comes in a separate commit). The STACKLEAK feature is ported from grsecurity/PaX. More information at: https://grsecurity.net/ https://pax.grsecurity.net/ This code is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on our understanding of the code. Changes or omissions from the original code are ours and don't reflect the original grsecurity/PaX code. Signed-off-by: Alexander Popov <alex.popov@linux.com> Tested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | | | x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscallsAlexander Popov2018-09-043-0/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The STACKLEAK feature (initially developed by PaX Team) has the following benefits: 1. Reduces the information that can be revealed through kernel stack leak bugs. The idea of erasing the thread stack at the end of syscalls is similar to CONFIG_PAGE_POISONING and memzero_explicit() in kernel crypto, which all comply with FDP_RIP.2 (Full Residual Information Protection) of the Common Criteria standard. 2. Blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712, CVE-2010-2963). That kind of bugs should be killed by improving C compilers in future, which might take a long time. This commit introduces the code filling the used part of the kernel stack with a poison value before returning to userspace. Full STACKLEAK feature also contains the gcc plugin which comes in a separate commit. The STACKLEAK feature is ported from grsecurity/PaX. More information at: https://grsecurity.net/ https://pax.grsecurity.net/ This code is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on our understanding of the code. Changes or omissions from the original code are ours and don't reflect the original grsecurity/PaX code. Performance impact: Hardware: Intel Core i7-4770, 16 GB RAM Test #1: building the Linux kernel on a single core 0.91% slowdown Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P 4.2% slowdown So the STACKLEAK description in Kconfig includes: "The tradeoff is the performance impact: on a single CPU system kernel compilation sees a 1% slowdown, other systems and workloads may vary and you are advised to test this feature on your expected workload before deploying it". Signed-off-by: Alexander Popov <alex.popov@linux.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-11-011-9/+12
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) BPF verifier fixes from Daniel Borkmann. 2) HNS driver fixes from Huazhong Tan. 3) FDB only works for ethernet devices, reject attempts to install FDB rules for others. From Ido Schimmel. 4) Fix spectre V1 in vhost, from Jason Wang. 5) Don't pass on-stack object to irq_set_affinity_hint() in mvpp2 driver, from Marc Zyngier. 6) Fix mlx5e checksum handling when RXFCS is enabled, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) openvswitch: Fix push/pop ethernet validation net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules bpf: test make sure to run unpriv test cases in test_verifier bpf: add various test cases to test_verifier bpf: don't set id on after map lookup with ptr_to_map_val return bpf: fix partial copy of map_ptr when dst is scalar libbpf: Fix compile error in libbpf_attach_type_by_name kselftests/bpf: use ping6 as the default ipv6 ping binary if it exists selftests: mlxsw: qos_mc_aware: Add a test for UC awareness selftests: mlxsw: qos_mc_aware: Tweak for min shaper mlxsw: spectrum: Set minimum shaper on MC TCs mlxsw: reg: QEEC: Add minimum shaper fields net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset() net: hns3: bugfix for rtnl_lock's range in the hclge_reset() net: hns3: bugfix for handling mailbox while the command queue reinitialized net: hns3: fix incorrect return value/type of some functions net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read net: hns3: bugfix for is_valid_csq_clean_head() net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring() net: hns3: bugfix for the initialization of command queue's spin lock ...