summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* lib: bitmap: remove redundant code from __bitmap_shift_rightRasmus Villemoes2015-02-131-2/+0
| | | | | | | | | | | | If the condition k==lim-1 is true, we must have off == 0 (otherwise, k could never become that big). But in that case we have upper == 0 and hence dst[k] == (src[k] & mask) >> rem. Since mask consists of a consecutive range of bits starting from the LSB, anding dst[k] with mask is a no-op. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: bitmap: eliminate branch in __bitmap_shift_rightRasmus Villemoes2015-02-131-3/+3
| | | | | | | | | | We can shift the bits from lower and upper into place before assembling dst[k]; moving the shift of upper into the branch where we already know that rem is non-zero allows us to remove a conditional. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: bitmap: change bitmap_shift_right to take unsigned parametersRasmus Villemoes2015-02-132-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've previously changed the nbits parameter of most bitmap_* functions to unsigned; now it is bitmap_shift_{left,right}'s turn. This alone saves some .text, but while at it I found that there were a few other things one could do. The end result of these seven patches is $ scripts/bloat-o-meter /tmp/bitmap.o.{old,new} add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-328 (-328) function old new delta __bitmap_shift_right 384 226 -158 __bitmap_shift_left 306 136 -170 and less importantly also a smaller stack footprint $ stack-o-meter.pl master bitmap file function old new delta lib/bitmap.o __bitmap_shift_right 24 8 -16 lib/bitmap.o __bitmap_shift_left 24 0 -24 For each pair of 0 <= shift <= nbits <= 256 I've tested the end result with a few randomly filled src buffers (including garbage beyond nbits), in each case verifying that the shift {left,right}-most bits of dst are zero and the remaining nbits-shift bits correspond to src, so I'm fairly confident I didn't screw up. That hasn't stopped me from being wrong before, though. This patch (of 7): gcc can generate slightly better code for stuff like "nbits % BITS_PER_LONG" when it knows nbits is not negative. Since negative size bitmaps or shift amounts don't make sense, change these parameters of bitmap_shift_right to unsigned. The expressions involving "lim - 1" are still ok, since if lim is 0 the loop is never executed. Also use "shift" and "nbits" consistently for the parameter names. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib/bitmap.c: elide bitmap_copy_le on little-endianRasmus Villemoes2015-02-132-0/+6
| | | | | | | | | On little-endian, there's no reason to have an extra, presumably less efficient, way of copying a bitmap. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib/bitmap.c: change prototype of bitmap_copy_leRasmus Villemoes2015-02-132-6/+5
| | | | | | | | | | | | | | | | | Make the prototype of bitmap_copy_le the same as bitmap_copy's. All other bitmap_* functions take unsigned long* parameters; there's no reason this should be special. The only current user is the static inline uwb_mas_bm_copy_le, which already does the void* laundering, so the end users can pass their u8 or __le32 buffers without a cast. Furthermore, this allows us to simply let bitmap_copy_le be an alias for bitmap_copy on little-endian; see next patch. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-next' of ↵Linus Torvalds2015-02-1310-8/+748
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED subsystem update from Bryan Wu: "The big change of LED subsystem is introducing a new LED class for Flash type LEDs which will be used for V4L2 subsystem. Also we got some cleanup and fixes" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: leds: leds-gpio: Pass on error codes unmodified DT: leds: Add led-sources property leds: Add LED Flash class extension to the LED subsystem leds: leds-mc13783: Use of_get_child_by_name() instead of refcount hack leds: Use setup_timer leds: Don't allow brightness values greater than max_brightness DT: leds: Add flash LED devices related properties
| * leds: leds-gpio: Pass on error codes unmodifiedSoren Brinkmann2015-02-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | Instead of overriding error codes, pass them on unmodified. This way a EPROBE_DEFER is correctly passed to the driver core. This results in the LED driver correctly requesting probe deferral in cases the GPIO controller is not yet available. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Reported-and-tested-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * DT: leds: Add led-sources propertyJacek Anaszewski2015-01-291-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a property for defining device outputs the LED represented by the DT child node is connected to. Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> Cc: Bryan Wu <cooloney@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: devicetree@vger.kernel.org Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * leds: Add LED Flash class extension to the LED subsystemJacek Anaszewski2015-01-266-0/+711
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some LED devices support two operation modes - torch and flash. This patch provides support for flash LED devices in the LED subsystem by introducing new sysfs attributes and kernel internal interface. The attributes being introduced are: flash_brightness, flash_strobe, flash_timeout, max_flash_timeout, max_flash_brightness, flash_fault, flash_sync_strobe and available_sync_leds. All the flash related features are placed in a separate module. The modifications aim to be compatible with V4L2 framework requirements related to the flash devices management. The design assumes that V4L2 sub-device can take of the LED class device control and communicate with it through the kernel internal interface. When V4L2 Flash sub-device file is opened, the LED class device sysfs interface is made unavailable. Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * leds: leds-mc13783: Use of_get_child_by_name() instead of refcount hackGeert Uytterhoeven2015-01-141-3/+1
| | | | | | | | | | | | | | | | | | | | | | of_find_node_by_name() calls of_node_put() on its "from" parameter. To counter this, mc13xxx_led_probe_dt() calls of_node_get() first. Use of_get_child_by_name() instead to get rid of the refcount hack. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: linux-leds@vger.kernel.org Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * leds: Use setup_timerJulia Lawall2015-01-141-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert a call to init_timer and accompanying intializations of the timer's data and function fields to a call to setup_timer. A simplified version of the semantic match that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,f,d; @@ -init_timer(&t); +setup_timer(&t,f,d); -t.function = f; -t.data = d; // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * leds: Don't allow brightness values greater than max_brightnessGabriele Mazzotta2015-01-141-1/+2
| | | | | | | | | | | | | | | | | | | | Since commit 4d71a4a12b13 ("leds: Add support for setting brightness in a synchronous way") the value passed to brightness_set() is no longer limited to max_brightness and can be different from the internally saved brightness value. Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Bryan Wu <cooloney@gmail.com>
| * DT: leds: Add flash LED devices related propertiesPavel Machek2015-01-141-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Addition of a LED Flash class extension entails the need for flash LED specific device tree properties. The properties being added are: max-microamp, flash-max-microamp, flash-timeout-microsec. (cooloney@gmail.com: remove white spaces) Signed-off-by: Pavel Machek <pavel@ucw.cz> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bryan Wu <cooloney@gmail.com>
* | Merge tag 'modules-next-for-linus' of ↵Linus Torvalds2015-02-132-39/+28
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Trivial cleanups, mainly" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: Replace over-engineered nested sleep module: Annotate nested sleep in resolve_symbol() module: Remove double spaces in module verification taint message kernel/module.c: Free lock-classes if parse_args failed module: set ksymtab/kcrctab* section addresses to 0x0
| * | module: Replace over-engineered nested sleepPeter Zijlstra2015-02-111-28/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the introduction of the nested sleep warning; we've established that the occasional sleep inside a wait_event() is fine. wait_event() loops are invariant wrt. spurious wakeups, and the occasional sleep has a similar effect on them. As long as its occasional its harmless. Therefore replace the 'correct' but verbose wait_woken() thing with a simple annotation to shut up the warning. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | module: Annotate nested sleep in resolve_symbol()Peter Zijlstra2015-02-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because wait_event() loops are safe vs spurious wakeups we can allow the occasional sleep -- which ends up being very similar. Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | module: Remove double spaces in module verification taint messageMarcel Holtmann2015-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The warning message when loading modules with a wrong signature has two spaces in it: "module verification failed: signature and/or required key missing" Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | kernel/module.c: Free lock-classes if parse_args failedAndrey Tsyvarev2015-02-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | parse_args call module parameters' .set handlers, which may use locks defined in the module. So, these classes should be freed in case parse_args returns error(e.g. due to incorrect parameter passed). Signed-off-by: Andrey Tsyvarev <tsyvarev@ispras.ru> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | module: set ksymtab/kcrctab* section addresses to 0x0Rabin Vincent2015-02-061-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These __ksymtab*/__kcrctab* sections currently have non-zero addresses. Non-zero section addresses in a relocatable ELF confuse GDB and it ends up not relocating all symbols when add-symbol-file is used on modules which have exports. The kernel's module loader does not care about these addresses, so let's just set them to zero. Before: $ readelf -S lib/notifier-error-inject.ko | grep 'Name\| __ksymtab_gpl' [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 8] __ksymtab_gpl PROGBITS 0000000c 0001b4 000010 00 A 0 0 4 (gdb) add-symbol-file lib/notifier-error-inject.ko 0x500000 -s .bss 0x700000 add symbol table from file "lib/notifier-error-inject.ko" at .text_addr = 0x500000 .bss_addr = 0x700000 (gdb) p &notifier_err_inject_dir $3 = (struct dentry **) 0x0 After: $ readelf -S lib/notifier-error-inject.ko | grep 'Name\| __ksymtab_gpl' [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 8] __ksymtab_gpl PROGBITS 00000000 0001b4 000010 00 A 0 0 4 (gdb) add-symbol-file lib/notifier-error-inject.ko 0x500000 -s .bss 0x700000 add symbol table from file "lib/notifier-error-inject.ko" at .text_addr = 0x500000 .bss_addr = 0x700000 (gdb) p &notifier_err_inject_dir $3 = (struct dentry **) 0x700000 Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds2015-02-133-21/+22
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull arch/tile changes from Chris Metcalf: "Not much in this batch, just some minor cleanups" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: change MAINTAINERS website from tilera.com to ezchip.com tile: enable sparse checks for get/put_user tile: fix put_user sparse errors tile: default to little endian on older toolchains
| * | | tile: change MAINTAINERS website from tilera.com to ezchip.comChris Metcalf2015-02-131-1/+1
| | | | | | | | | | | | | | | | Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
| * | | tile: enable sparse checks for get/put_userChris Metcalf2015-01-131-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an extra intermediate variable to __get_user and __put_user to give sparse an opportunity to detect mismatches. Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
| * | | tile: fix put_user sparse errorsChris Metcalf2015-01-131-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use x86's __inttype macro instead of using the typeof(x-x) trick to generate a suitable integer size type. This avoids a sparse warning when examining the x-x type with a bitwise type. Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
| * | | tile: default to little endian on older toolchainsChris Metcalf2015-01-131-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Older toolchains may not specify __LITTLE_ENDIAN__, but older toolchains were all little endian. Don't make things unnecessarily hard for those toolchains. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
* | | | Revert "x86/apic: Only disable CPU x2apic mode when necessary"Linus Torvalds2015-02-131-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 5fcee53ce705d49c766f8a302c7e93bdfc33c124. It causes the suspend to fail on at least the Chromebook Pixel, possibly other platforms too. Joerg Roedel points out that the logic should probably have been if (max_physical_apicid > 255 || !(IS_ENABLED(CONFIG_HYPERVISOR_GUEST) && hypervisor_x2apic_available())) { instead, but since the code is not in any fast-path, so we can just live without that optimization and just revert to the original code. Acked-by: Joerg Roedel <joro@8bytes.org> Acked-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-02-1388-1626/+6026
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...
| * | | | KVM: ia64: drop kvm.h from installed user headersMike Frysinger2015-02-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The header was deleted, so stop trying to install it. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: x86: fix build with !CONFIG_SMPRadim Krčmář2015-02-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <asm/apic.h> isn't included directly and without CONFIG_SMP, an option that automagically pulls it can't be enabled. Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: x86: emulate: correct page fault error code for NoWrite instructionsPaolo Bonzini2015-02-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NoWrite instructions (e.g. cmp or test) never set the "write access" bit in the error code, even if one of the operands is treated as a destination. Fixes: c205fb7d7d4f81e46fc577b707ceb9e356af1456 Cc: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | Merge tag 'kvm-s390-next-20150209' of ↵Paolo Bonzini2015-02-0912-56/+397
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: fixes and features for kvm/next (3.20) 1. Fixes - Fix user triggerable endless loop - reenable LPP facility - disable KVM compat ioctl on s390 (untested and broken) 2. cpu models for s390 - provide facilities and instruction blocking per VM - add s390 specific vm attributes for setting values 3. crypto - toleration patch for z13 support 4. add uuid and long name to /proc/sysinfo (stsi 322) - patch Acked by Heiko Carstens (touches non-kvm s390 code)
| | * | | | KVM: Disable compat ioctl for s390Christian Borntraeger2015-02-092-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We never had a 31bit QEMU/kuli running. We would need to review several ioctls to check if this creates holes, bugs or whatever to make it work. Lets just disable compat support for KVM on s390. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | KVM: s390: add cpu model supportMichael Mueller2015-02-094-1/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables cpu model support in kvm/s390 via the vm attribute interface. During KVM initialization, the host properties cpuid, IBC value and the facility list are stored in the architecture specific cpu model structure. During vcpu setup, these properties are taken to initialize the related SIE state. This mechanism allows to adjust the properties from user space and thus to implement different selectable cpu models. This patch uses the IBC functionality to block instructions that have not been implemented at the requested CPU type and GA level compared to the full host capability. Userspace has to initialize the cpu model before vcpu creation. A cpu model change of running vcpus is not possible. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: use facilities and cpu_id per KVMMichael Mueller2015-02-095-44/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch introduces facilities and cpu_ids per virtual machine. Different virtual machines may want to expose different facilities and cpu ids to the guest, so let's make them per-vm instead of global. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390/CPACF: Choose crypto control block formatTony Krowiak2015-02-092-2/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to specify a different format for the crypto control block depending on whether the APXA facility is installed or not. Let's test for it by executing the PQAP(QCI) function and use either a format-1 or a format-2 crypto control block accordingly. This is a host only change for z13 and does not affect the guest view. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | s390/kernel: Update /proc/sysinfo file with Extended Name and UUIDEkaterina Tumanova2015-02-092-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new architecture extends STSI 3.2.2 with UUID and long names. KVM will provide the first implementation. This patch adds the additional data fields (Extended Name and UUID) from the 4KB block returned by the STSI 3.2.2 command and reflect this information in the /proc/sysinfo file accordingly. Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: reenable LPP facilityChristian Borntraeger2015-02-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7be81a46695d ("KVM: s390/facilities: allow TOD-CLOCK steering facility bit") accidentially disabled the "load program parameter" facility bit during rebase for upstream submission (my fault). Re-add that bit. As this is only for a performance measurement helper instruction (used by KVM itself) cc stable is not necessary see http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a (SA23-2260 The Load-Program-Parameter and CPU-Measurement Facilities) for details about LPP and its usecase. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Fixes: 7be81a46695d ("KVM: s390/facilities: allow TOD-CLOCK steering")
| | * | | | KVM: s390: floating irqs: fix user triggerable endless loopDavid Hildenbrand2015-02-091-0/+2
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a vm with no VCPUs is created, the injection of a floating irq leads to an endless loop in the kernel. Let's skip the search for a destination VCPU for a floating irq if no VCPUs were created. Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * | | | kvm: add halt_poll_ns module parameterPaolo Bonzini2015-02-0613-7/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new module parameter for the KVM module; when it is present, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block. This parameter helps a lot for latency-bound workloads---in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid that the guest sends pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in. With this patch performance reaches 60-65% of bare metal and, more important, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU. Of course there is some price to pay; every time an otherwise idle vCPUs is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU. The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids that Linux schedules the VCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick. The effect is also visible on Marcelo's recently-introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, thus, the effect is both a smaller average latency and a smaller variance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | kvm: remove KVM_MMIO_SIZETiejun Chen2015-02-052-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After f78146b0f923, "KVM: Fix page-crossing MMIO", and 87da7e66a405, "KVM: x86: fix vcpu->mmio_fragments overflow", actually KVM_MMIO_SIZE is gone. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: MIPS: Don't leak FPU/DSP to guestJames Hogan2015-02-042-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The FPU and DSP are enabled via the CP0 Status CU1 and MX bits by kvm_mips_set_c0_status() on a guest exit, presumably in case there is active state that needs saving if pre-emption occurs. However neither of these bits are cleared again when returning to the guest. This effectively gives the guest access to the FPU/DSP hardware after the first guest exit even though it is not aware of its presence, allowing FP instructions in guest user code to intermittently actually execute instead of trapping into the guest OS for emulation. It will then read & manipulate the hardware FP registers which technically belong to the user process (e.g. QEMU), or are stale from another user process. It can also crash the guest OS by causing an FP exception, for which a guest exception handler won't have been registered. First lets save and disable the FPU (and MSA) state with lose_fpu(1) before entering the guest. This simplifies the problem, especially for when guest FPU/MSA support is added in the future, and prevents FR=1 FPU state being live when the FR bit gets cleared for the guest, which according to the architecture causes the contents of the FPU and vector registers to become UNPREDICTABLE. We can then safely remove the enabling of the FPU in kvm_mips_set_c0_status(), since there should never be any active FPU or MSA state to save at pre-emption, which should plug the FPU leak. DSP state is always live rather than being lazily restored, so for that it is simpler to just clear the MX bit again when re-entering the guest. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # v3.10+: 044f0f03eca0: MIPS: KVM: Deliver guest interrupts Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: MIPS: Disable HTW while in guestJames Hogan2015-02-041-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure any hardware page table walker (HTW) is disabled while in KVM guest mode, as KVM doesn't yet set up hardware page table walking for guest mappings so the wrong mappings would get loaded, resulting in the guest hanging or crashing once it reaches userland. The HTW is disabled and re-enabled around the call to __kvm_mips_vcpu_run() which does the initial switch into guest mode and the final switch out of guest context. Additionally it is enabled for the duration of guest exits (i.e. kvm_mips_handle_exit()), getting disabled again before returning back to guest or host. In all cases the HTW is only disabled in normal kernel mode while interrupts are disabled, so that the HTW doesn't get left disabled if the process is preempted. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # v3.17+ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Enable nested posted interrupt processingWincy Van2015-02-033-7/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If vcpu has a interrupt in vmx non-root mode, injecting that interrupt requires a vmexit. With posted interrupt processing, the vmexit is not needed, and interrupts are fully taken care of by hardware. In nested vmx, this feature avoids much more vmexits than non-nested vmx. When L1 asks L0 to deliver L1's posted interrupt vector, and the target VCPU is in non-root mode, we use a physical ipi to deliver POSTED_INTR_NV to the target vCPU. Using POSTED_INTR_NV avoids unexpected interrupts if a concurrent vmexit happens and L1's vector is different with L0's. The IPI triggers posted interrupt processing in the target physical CPU. In case the target vCPU was not in guest mode, complete the posted interrupt delivery on the next entry to L2. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Enable nested virtual interrupt deliveryWincy Van2015-02-031-2/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With virtual interrupt delivery, the hardware lets KVM use a more efficient mechanism for interrupt injection. This is an important feature for nested VMX, because it reduces vmexits substantially and they are much more expensive with nested virtualization. This is especially important for throughput-bound scenarios. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Enable nested apic register virtualizationWincy Van2015-02-031-4/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can reduce apic register virtualization cost with this feature, it is also a requirement for virtual interrupt delivery and posted interrupt processing. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Make nested control MSRs per-cpuWincy Van2015-02-031-86/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To enable nested apicv support, we need per-cpu vmx control MSRs: 1. If in-kernel irqchip is enabled, we can enable nested posted interrupt, we should set posted intr bit in the nested_vmx_pinbased_ctls_high. 2. If in-kernel irqchip is disabled, we can not enable nested posted interrupt, the posted intr bit in the nested_vmx_pinbased_ctls_high will be cleared. Since there would be different settings about in-kernel irqchip between VMs, different nested control MSRs are needed. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Enable nested virtualize x2apic modeWincy Van2015-02-031-2/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When L2 is using x2apic, we can use virtualize x2apic mode to gain higher performance, especially in apicv case. This patch also introduces nested_vmx_check_apicv_controls for the nested apicv patches. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: nVMX: Prepare for using hardware MSR bitmapWincy Van2015-02-031-11/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, if L1 enables MSR_BITMAP, we will emulate this feature, all of L2's msr access is intercepted by L0. Features like "virtualize x2apic mode" require that the MSR bitmap is enabled, or the hardware will exit and for example not virtualize the x2apic MSRs. In order to let L1 use these features, we need to build a merged bitmap that only not cause a VMEXIT if 1) L1 requires that 2) the bit is not required by the processor for APIC virtualization. For now the guests are still run with MSR bitmap disabled, but this patch already introduces nested_vmx_merge_msr_bitmap for future use. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: x86: revert "add method to test PIR bitmap vector"Marcelo Tosatti2015-02-022-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert 7c6a98dfa1ba9dc64a62e73624ecea9995736bbd, given that testing PIR is not necessary anymore. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | KVM: x86: fix lapic_timer_int_injected with APIC-vMarcelo Tosatti2015-02-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With APICv, LAPIC timer interrupt is always delivered via IRR: apic_find_highest_irr syncs PIR to IRR. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | kvm: vmx: fix oops with explicit flexpriority=0 optionPaolo Bonzini2015-01-301-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A function pointer was not NULLed, causing kvm_vcpu_reload_apic_access_page to go down the wrong path and OOPS when doing put_page(NULL). This did not happen on old processors, only when setting the module option explicitly. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>