summaryrefslogtreecommitdiffstats
path: root/arch/s390
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | s390/vmemmap: remove memset call from vmemmap_populate()Heiko Carstens2014-09-251-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the vmemmap array gets filled with large pages we allocate those pages with vmemmap_alloc_block(), which returns cleared pages. Only for single 4k pages we call our own vmem_alloc_pages() which does not return cleared pages. However we can also call vmemmap_alloc_block() to allocate the 4k pages. This way we can also make sure the vmemmap array is cleared after its population. Therefore we can remove the memset at the end of the function which would clear the vmmemmap array a second time on machines which do support EDAT1. On very large configurations this can save us several seconds. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/head.s: use zero as address for stflChristian Borntraeger2014-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The architecture suggests to use address 0 as parameter for stfl, to allow for future extensions. Using __LC_STFL_FAC_LIST (0x200) shows which address is used, but might be not future proof. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/rwlock: use directed yield for write-locked rwlocksMartin Schwidefsky2014-09-255-32/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an owner field to the arch_rwlock_t to be able to pass the timeslice of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock is write-locked. The undirected yield in case the rwlock is acquired writable but the lock is read-locked is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/hmcdrv: HMC drive CD/DVD accessRalf Hoppe2014-09-252-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This device driver allows accessing a HMC drive CD/DVD-ROM. It can be used in a LPAR and z/VM environment. Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ralf Hoppe <rhoppe@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/spinlock: optimize spin_unlock codeHeiko Carstens2014-09-092-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a memory barrier + store sequence instead of a load + compare and swap sequence to unlock a spinlock and an rw lock. For the spinlock case this saves us two memory reads and a not needed cpu serialization after the compare and swap instruction stored the new value. The kernel size (performance_defconfig) gets reduced by ~14k. Average execution time of a tight inlined spin_unlock loop drops from 5.8ns to 0.7ns on a zEC12 machine. An artificial stress test case where several counters are protected with a single spinlock and which are only incremented while holding the spinlock shows ~30% improvement on a 4 cpu machine. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/ftrace: optimize mcount codeHeiko Carstens2014-09-093-34/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce the number of executed instructions within the mcount block if function tracing is enabled. We achieve that by using a non-standard C function call ABI. Since the called function is also written in assembler this is not a problem. This also allows to replace the unconditional store at the beginning of the mcount block with a larl instruction, which doesn't touch memory. In theory we could also patch the first instruction of the mcount block to enable and disable function tracing. However this would break kprobes. This could be fixed with implementing the "kprobes_on_ftrace" feature; however keeping the odd jprobes working seems not to be possible without a lot of code churn. Therefore keep the code easy and simply accept one wasted 1-cycle "larl" instruction per function prologue. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/kprobes: remove unused jprobe_return_end()Heiko Carstens2014-09-091-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Even if it has a __used annotation it is actually unused. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/ftrace: enforce DYNAMIC_FTRACE if FUNCTION_TRACER is selectedHeiko Carstens2014-09-095-13/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have too many combinations for function tracing. Lets simply stick to the most advanced option, so we don't have to care of other combinations. This means we always select DYNAMIC_FTRACE if FUNCTION_TRACER is selected. In the s390 Makefile also remove CONFIG_FTRACE_SYSCALLS since that functionality got moved to architecture independent code in the meantime. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/ftrace: add HAVE_DYNAMIC_FTRACE_WITH_REGS supportHeiko Carstens2014-09-094-12/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code is based on a patch from Vojtech Pavlik. http://marc.info/?l=linux-s390&m=140438885114413&w=2 The actual implementation now differs significantly: Instead of adding a second function "ftrace_regs_caller" which would be nearly identical to the existing ftrace_caller function, the current ftrace_caller function is now an alias to ftrace_regs_caller and always passes the needed pt_regs structure and function_trace_op parameters unconditionally. Besides that also use asm offsets to correctly allocate and access the new struct pt_regs on the stack. While at it we can make use of new instruction to get rid of some indirect loads if compiled for new machines. The passed struct pt_regs can be changed by the called function and it's new contents will replace the current contents. Note: to change the return address the embedded psw member of the pt_regs structure must be changed. The psw member is right now incomplete, since the mask part is missing. For all current use cases this should be sufficent. Providing and restoring a sane mask would mean we need to add an epsw/lpswe pair to the mcount code. Only these two instruction would cost us ~120 cycles which currently seems not necessary. Cc: Vojtech Pavlik <vojtech@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/ftrace: optimize function graph caller codeHeiko Carstens2014-09-093-6/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the function graph tracer is disabled we can skip three additional instructions. So let's just do this. So if function tracing is enabled but function graph tracing is runtime disabled, we get away with a single unconditional branch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390: pass march flag to assembly files as wellHeiko Carstens2014-09-091-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the march flag gets only passed to C files, but not to assembler files. This means that we can't add new instructions like e.g. aghik to asm files, since the assembler doesn't know of the new instructions if the appropriate march flag isn't specified. So also pass the march flag when compiling assembler files as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/vdso: add vdso support for coarse clocksMartin Schwidefsky2014-09-097-10/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add CLOCK_REALTIME_COARSE and CLOCK_MONOTONIC_COARSE optimization to the 64-bit and 31-bit vdso. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390/vdso: replace stck with stckeMartin Schwidefsky2014-09-094-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If gettimeofday / clock_gettime are called multiple times in a row the STCK instruction will stall until a difference in the result is visible. This unnecessarily slows down the vdso calls, use stcke instead of stck to get rid of the stall. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | s390: remove unused MACHINE_FLAG_RRBMHeiko Carstens2014-09-092-5/+0
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | Merge branch 'x86-seccomp-for-linus' of ↵Linus Torvalds2014-10-141-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 seccomp changes from Ingo Molnar: "This tree includes x86 seccomp filter speedups and related preparatory work, which touches core seccomp facilities as well. The main idea is to split seccomp into two phases, to be able to enter a simple fast path for syscalls with ptrace side effects. There's no substantial user-visible (and ABI) effects expected from this, except a change in how we emit a better audit record for SECCOMP_RET_TRACE events" * 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86_64, entry: Use split-phase syscall_trace_enter for 64-bit syscalls x86_64, entry: Treat regs->ax the same in fastpath and slowpath syscalls x86: Split syscall_trace_enter into two phases x86, entry: Only call user_exit if TIF_NOHZ x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit seccomp: Document two-phase seccomp and arch-provided seccomp_data seccomp: Allow arch code to provide seccomp_data seccomp: Refactor the filter callback and the API seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing
| * | | | seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computingAndy Lutomirski2014-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The secure_computing function took a syscall number parameter, but it only paid any attention to that parameter if seccomp mode 1 was enabled. Rather than coming up with a kludge to get the parameter to work in mode 2, just remove the parameter. To avoid churn in arches that don't have seccomp filters (and may not even support syscall_get_nr right now), this leaves the parameter in secure_computing_strict, which is now a real function. For ARM, this is a bit ugly due to the fact that ARM conditionally supports seccomp filters. Fixing that would probably only be a couple of lines of code, but it should be coordinated with the audit maintainers. This will be a slight slowdown on some arches. The right fix is to pass in all of seccomp_data instead of trying to make just the syscall nr part be fast. This is a prerequisite for making two-phase seccomp work cleanly. Cc: Russell King <linux@arm.linux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: x86@kernel.org Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2014-10-131-0/+2
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave Hansen) - Various sched/idle refinements for better idle handling (Nicolas Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot) - sched/numa updates and optimizations (Rik van Riel) - sysbench speedup (Vincent Guittot) - capacity calculation cleanups/refactoring (Vincent Guittot) - Various cleanups to thread group iteration (Oleg Nesterov) - Double-rq-lock removal optimization and various refactorings (Kirill Tkhai) - various sched/deadline fixes ... and lots of other changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/dl: Use dl_bw_of() under rcu_read_lock_sched() sched/fair: Delete resched_cpu() from idle_balance() sched, time: Fix build error with 64 bit cputime_t on 32 bit systems sched: Improve sysbench performance by fixing spurious active migration sched/x86: Fix up typo in topology detection x86, sched: Add new topology for multi-NUMA-node CPUs sched/rt: Use resched_curr() in task_tick_rt() sched: Use rq->rd in sched_setaffinity() under RCU read lock sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' sched: Use dl_bw_of() under RCU read lock sched/fair: Remove duplicate code from can_migrate_task() sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW sched: print_rq(): Don't use tasklist_lock sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock() sched: Fix the task-group check in tg_has_rt_tasks() sched/fair: Leverage the idle state info when choosing the "idlest" cpu sched: Let the scheduler see CPU idle states sched/deadline: Fix inter- exclusive cpusets migrations sched/deadline: Clear dl_entity params when setscheduling to different class sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault() ...
| * | | | | sched, time: Fix build error with 64 bit cputime_t on 32 bit systemsRik van Riel2014-10-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32 bit systems cmpxchg cannot handle 64 bit values, so some additional magic is required to allow a 32 bit system with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y enabled to build. Make sure the correct cmpxchg function is used when doing an atomic swap of a cputime_t. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Cc: oleg@redhat.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linux390@de.ibm.com Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20140930155947.070cdb1f@annuminas.surriel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | | Merge branch 'for-v3.18' of ↵Linus Torvalds2014-10-102-14/+22
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull dma-mapping update from Marek Szyprowski: "Provide the dma write coherent api (available previously on ARM architecture) for all other architectures, which use dma_ops-based dma mapping implementation. This lets one to use the same code in the device drivers regardless of the selected architecture" * 'for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: dma-mapping: Provide write-combine allocations s390: Implement dma_{alloc,free}_attrs()
| * | | | | | s390: Implement dma_{alloc,free}_attrs()Thierry Reding2014-08-262-14/+22
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The S390 architecture advertises support for HAVE_DMA_ATTRS when PCI is enabled. Patches to unify some of the DMA API would like to rely on the dma_alloc_attrs() and dma_free_attrs() functions to be provided when an architecture supports DMA attributes. Rename dma_alloc_coherent() and dma_free_coherent() to dma_alloc_attrs() and dma_free_attrs() since they are functionally equivalent and alias the former to the latter for compatibility. For consistency with other architectures, also reuse the existing symbol HAVE_DMA_ATTRS defined in arch/Kconfig instead of providing a duplicate. Select it when PCI is enabled. While at it, drop a redundant 'default n' from the PCI Kconfig symbol. Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-By: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
* | | | | | nosave: consolidate __nosave_{begin,end} in <asm/sections.h>Geert Uytterhoeven2014-10-091-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The different architectures used their own (and different) declarations: extern __visible const void __nosave_begin, __nosave_end; extern const void __nosave_begin, __nosave_end; extern long __nosave_begin, __nosave_end; Consolidate them using the first variant in <asm/sections.h>. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2014-10-091-0/+1
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Main changes: - Fix the deadlock reported by Dave Jones et al - Clean up and fix nohz_full interaction with arch abilities - nohz init code consolidation/cleanup" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: nohz full depends on irq work self IPI support nohz: Consolidate nohz full init code arm64: Tell irq work about self IPI support arm: Tell irq work about self IPI support x86: Tell irq work about self IPI support irq_work: Force raised irq work to run on irq work interrupt irq_work: Introduce arch_irq_work_has_interrupt() nohz: Move nohz full init call to tick init
| * | | | | | irq_work: Introduce arch_irq_work_has_interrupt()Peter Zijlstra2014-09-131-0/+1
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nohz full code needs irq work to trigger its own interrupt so that the subsystem can work even when the tick is stopped. Lets introduce arch_irq_work_has_interrupt() that archs can override to tell about their support for this ability. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2014-10-081-72/+12
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "Most notable changes in here: 1) By far the biggest accomplishment, thanks to a large range of contributors, is the addition of multi-send for transmit. This is the result of discussions back in Chicago, and the hard work of several individuals. Now, when the ->ndo_start_xmit() method of a driver sees skb->xmit_more as true, it can choose to defer the doorbell telling the driver to start processing the new TX queue entires. skb->xmit_more means that the generic networking is guaranteed to call the driver immediately with another SKB to send. There is logic added to the qdisc layer to dequeue multiple packets at a time, and the handling mis-predicted offloads in software is now done with no locks held. Finally, pktgen is extended to have a "burst" parameter that can be used to test a multi-send implementation. Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4, virtio_net Adding support is almost trivial, so export more drivers to support this optimization soon. I want to thank, in no particular or implied order, Jesper Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann, David Tat, Hannes Frederic Sowa, and Rusty Russell. 2) PTP and timestamping support in bnx2x, from Michal Kalderon. 3) Allow adjusting the rx_copybreak threshold for a driver via ethtool, and add rx_copybreak support to enic driver. From Govindarajulu Varadarajan. 4) Significant enhancements to the generic PHY layer and the bcm7xxx driver in particular (EEE support, auto power down, etc.) from Florian Fainelli. 5) Allow raw buffers to be used for flow dissection, allowing drivers to determine the optimal "linear pull" size for devices that DMA into pools of pages. The objective is to get exactly the necessary amount of headers into the linear SKB area pre-pulled, but no more. The new interface drivers use is eth_get_headlen(). From WANG Cong, with driver conversions (several had their own by-hand duplicated implementations) by Alexander Duyck and Eric Dumazet. 6) Support checksumming more smoothly and efficiently for encapsulations, and add "foo over UDP" facility. From Tom Herbert. 7) Add Broadcom SF2 switch driver to DSA layer, from Florian Fainelli. 8) eBPF now can load programs via a system call and has an extensive testsuite. Alexei Starovoitov and Daniel Borkmann. 9) Major overhaul of the packet scheduler to use RCU in several major areas such as the classifiers and rate estimators. From John Fastabend. 10) Add driver for Intel FM10000 Ethernet Switch, from Alexander Duyck. 11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric Dumazet. 12) Add Datacenter TCP congestion control algorithm support, From Florian Westphal. 13) Reorganize sk_buff so that __copy_skb_header() is significantly faster. From Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits) netlabel: directly return netlbl_unlabel_genl_init() net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers net: description of dma_cookie cause make xmldocs warning cxgb4: clean up a type issue cxgb4: potential shift wrapping bug i40e: skb->xmit_more support net: fs_enet: Add NAPI TX net: fs_enet: Remove non NAPI RX r8169:add support for RTL8168EP net_sched: copy exts->type in tcf_exts_change() wimax: convert printk to pr_foo() af_unix: remove 0 assignment on static ipv6: Do not warn for informational ICMP messages, regardless of type. Update Intel Ethernet Driver maintainers list bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING tipc: fix bug in multicast congestion handling net: better IFF_XMIT_DST_RELEASE support net/mlx4_en: remove NETDEV_TX_BUSY 3c59x: fix bad split of cpu_to_le32(pci_map_single()) net: bcmgenet: fix Tx ring priority programming ...
| * \ \ \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-09-245-0/+5
| |\ \ \ \ \ \
| * \ \ \ \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-09-235-61/+96
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: filter: constify detection of pkt_type_offsetHannes Frederic Sowa2014-09-131-34/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we have 2 pkt_type_offset functions doing the same thing and spread across the architecture files. Remove those and replace them with a PKT_TYPE_OFFSET macro helper which gets the constant value from a zero sized sk_buff member right in front of the bitfield with offsetof. This new offset marker does not change size of struct sk_buff. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: bpf: be friendly to kmemcheckDaniel Borkmann2014-09-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported by Mikulas Patocka, kmemcheck currently barks out a false positive since we don't have special kmemcheck annotation for bitfields used in bpf_prog structure. We currently have jited:1, len:31 and thus when accessing len while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that we're reading uninitialized memory. As we don't need the whole bit universe for pages member, we can just split it to u16 and use a bool flag for jited instead of a bitfield. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | net: bpf: consolidate JIT binary allocatorDaniel Borkmann2014-09-091-36/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduced in commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") and later on replicated in aa2d2c73c21f ("s390/bpf,jit: address randomize and write protect jit code") for s390 architecture, write protection for BPF JIT images got added and a random start address of the JIT code, so that it's not on a page boundary anymore. Since both use a very similar allocator for the BPF binary header, we can consolidate this code into the BPF core as it's mostly JIT independant anyway. This will also allow for future archs that support DEBUG_SET_MODULE_RONX to just reuse instead of reimplementing it. JIT tested on x86_64 and s390x with BPF test suite. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-09-079-18/+50
| |\ \ \ \ \ \ \ \ | | | |_|/ / / / / | | |/| | | | | |
| * | | | | | | | net: bpf: make eBPF interpreter images read-onlyDaniel Borkmann2014-09-051-1/+1
| | |_|_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With eBPF getting more extended and exposure to user space is on it's way, hardening the memory range the interpreter uses to steer its command flow seems appropriate. This patch moves the to be interpreted bytecode to read-only pages. In case we execute a corrupted BPF interpreter image for some reason e.g. caused by an attacker which got past a verifier stage, it would not only provide arbitrary read/write memory access but arbitrary function calls as well. After setting up the BPF interpreter image, its contents do not change until destruction time, thus we can setup the image on immutable made pages in order to mitigate modifications to that code. The idea is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit against spraying attacks"). This is possible because bpf_prog is not part of sk_filter anymore. After setup bpf_prog cannot be altered during its life-time. This prevents any modifications to the entire bpf_prog structure (incl. function/JIT image pointer). Every eBPF program (including classic BPF that are migrated) have to call bpf_prog_select_runtime() to select either interpreter or a JIT image as a last setup step, and they all are being freed via bpf_prog_free(), including non-JIT. Therefore, we can easily integrate this into the eBPF life-time, plus since we directly allocate a bpf_prog, we have no performance penalty. Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual inspection of kernel_page_tables. Brad Spengler proposed the same idea via Twitter during development of this patch. Joint work with Hannes Frederic Sowa. Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2014-10-0814-646/+593
|\ \ \ \ \ \ \ \ | |_|_|_|/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "Fixes and features for 3.18. Apart from the usual cleanups, here is the summary of new features: - s390 moves closer towards host large page support - PowerPC has improved support for debugging (both inside the guest and via gdbstub) and support for e6500 processors - ARM/ARM64 support read-only memory (which is necessary to put firmware in emulated NOR flash) - x86 has the usual emulator fixes and nested virtualization improvements (including improved Windows support on Intel and Jailhouse hypervisor support on AMD), adaptive PLE which helps overcommitting of huge guests. Also included are some patches that make KVM more friendly to memory hot-unplug, and fixes for rare caching bugs. Two patches have trivial mm/ parts that were acked by Rik and Andrew. Note: I will soon switch to a subkey for signing purposes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits) kvm: do not handle APIC access page if in-kernel irqchip is not in use KVM: s390: count vcpu wakeups in stat.halt_wakeup KVM: s390/facilities: allow TOD-CLOCK steering facility bit KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode arm/arm64: KVM: Report correct FSC for unsupported fault types arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc kvm: Fix kvm_get_page_retry_io __gup retval check arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset kvm: x86: Unpin and remove kvm_arch->apic_access_page kvm: vmx: Implement set_apic_access_page_addr kvm: x86: Add request bit to reload APIC access page address kvm: Add arch specific mmu notifier for page invalidation kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static kvm: Fix page ageing bugs kvm/x86/mmu: Pass gfn and level to rmapp callback. x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only kvm: x86: use macros to compute bank MSRs KVM: x86: Remove debug assertion of non-PAE reserved bits kvm: don't take vcpu mutex for obviously invalid vcpu ioctls kvm: Faults which trigger IO release the mmap_sem ...
| * | | | | | | KVM: s390: count vcpu wakeups in stat.halt_wakeupDavid Hildenbrand2014-10-013-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the halt_wakeup counter used by common code and uses it to count vcpu wakeups done in s390 arch specific code. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | | | | | | KVM: s390/facilities: allow TOD-CLOCK steering facility bitChristian Borntraeger2014-10-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is nothing to do for KVM to support TOD-CLOCK steering. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
| * | | | | | | KVM: s390: register flic ops dynamicallyCornelia Huck2014-09-172-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using the new kvm_register_device_ops() interface makes us get rid of an #ifdef in common code. Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | KVM: s390/interrupt: remove double assignmentChristian Borntraeger2014-09-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r is already initialized to 0. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
| * | | | | | | KVM: s390/cmm: Fix prefix handling for diag 10 balloonChristian Borntraeger2014-09-101-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old handling of prefix pages was broken in the diag10 ballooner. We now rely on gmap_discard to check for start > end and do a slow path if the prefix swap pages are affected: 1. discard the pages from start to prefix 2. discard the absolute 0 pages 3. discard the pages after prefix swap to end Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
| * | | | | | | KVM: s390: get rid of constant condition in ipte_unlock_simpleChristian Borntraeger2014-09-101-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the earlier check we know that ipte_lock_count must be 0. No need to add a useless if. Let's make clear that we are going to always wakeup when we execute that code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
| * | | | | | | KVM: s390: unintended fallthrough for external callChristian Borntraeger2014-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must not fallthrough if the conditions for external call are not met. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Cc: stable@vger.kernel.org
| * | | | | | | KVM: s390: Limit guest size to 16TBChristian Borntraeger2014-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we fill up a full 5 level page table to hold the guest mapping. Since commit "support gmap page tables with less than 5 levels" we can do better. Having more than 4 TB might be useful for some testing scenarios, so let's just limit ourselves to 16TB guest size. Having more than that is totally untested as I do not have enough swap space/memory. We continue to allow ucontrol the full size. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | | KVM: s390: add __must_check to interrupt deliver functionsChristian Borntraeger2014-09-102-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now propagate interrupt injection errors back to the ioctl. We should mark functions that might fail with __must_check. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
| * | | | | | | KVM: CPACF: Enable MSA4 instructions for kvm guestTony Krowiak2014-09-102-1/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have to provide a per guest crypto block for the CPUs to enable MSA4 instructions. According to icainfo on z196 or later this enables CCM-AES-128, CMAC-AES-128, CMAC-AES-192 and CMAC-AES-256. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split MSA4/protected key into two patches]
| * | | | | | | KVM: remove garbage arg to *hardware_{en,dis}ableRadim Krčmář2014-08-292-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the beggining was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove unnecessary arguments that stem from this. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | KVM: static inline empty kvm_arch functionsRadim Krčmář2014-08-292-43/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using static inline is going to save few bytes and cycles. For example on powerpc, the difference is 700 B after stripping. (5 kB before) This patch also deals with two overlooked empty functions: kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c 2df72e9bc KVM: split kvm_arch_flush_shadow and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c. e790d9ef6 KVM: add kvm_arch_sched_in Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | KVM: forward declare structs in kvm_types.hPaolo Bonzini2014-08-291-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Opaque KVM structs are useful for prototypes in asm/kvm_host.h, to avoid "'struct foo' declared inside parameter list" warnings (and consequent breakage due to conflicting types). Move them from individual files to a generic place in linux/kvm_types.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | KVM: s390/mm: fix up indentation of set_guest_storage_keyChristian Borntraeger2014-08-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ab3f285f227f ("KVM: s390/mm: try a cow on read only pages for key ops")' misaligned a code block. Let's fixup the indentation. Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | Merge tag 'kvm-s390-next-20140825' of ↵Paolo Bonzini2014-08-2612-585/+497
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and features for 3.18 part 1 1. The usual cleanups: get rid of duplicate code, use defines, factor out the sync_reg handling, additional docs for sync_regs, better error handling on interrupt injection 2. We use KVM_REQ_TLB_FLUSH instead of open coding tlb flushes 3. Additional registers for kvm_run sync regs. This is usually not needed in the fast path due to eventfd/irqfd, but kvm stat claims that we reduced the overhead of console output by ~50% on my system 4. A rework of the gmap infrastructure. This is the 2nd step towards host large page support (after getting rid of the storage key dependency). We introduces two radix trees to store the guest-to-host and host-to-guest translations. This gets us rid of most of the page-table walks in the gmap code. Only one in __gmap_link is left, this one is required to link the shadow page table to the process page table. Finally this contains the plumbing to support gmap page tables with less than 5 levels.
| | * | | | | | | KVM: s390/mm: remove outdated gmap data structuresMartin Schwidefsky2014-08-261-23/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The radix tree rework removed all code that uses the gmap_rmap and gmap_pgtable data structures. Remove these outdated definitions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | | | | KVM: s390/mm: support gmap page tables with less than 5 levelsMartin Schwidefsky2014-08-263-33/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an addressing limit to the gmap address spaces and only allocate the page table levels that are needed for the given limit. The limit is fixed and can not be changed after a gmap has been created. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | | | | KVM: s390/mm: use radix trees for guest to host mappingsMartin Schwidefsky2014-08-267-383/+308
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store the target address for the gmap segments in a radix tree instead of using invalid segment table entries. gmap_translate becomes a simple radix_tree_lookup, gmap_fault is split into the address translation with gmap_translate and the part that does the linking of the gmap shadow page table with the process page table. A second radix tree is used to keep the pointers to the segment table entries for segments that are mapped in the guest address space. On unmap of a segment the pointer is retrieved from the radix tree and is used to carry out the segment invalidation in the gmap shadow page table. As the radix tree can only store one pointer, each host segment may only be mapped to exactly one guest location. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>