summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* sparc64: Fix several bugs in quad floating point emulation.David S. Miller2012-05-252-11/+21
| | | | | | | | | | | | | | | | UltraSPARC-T2 and later do not use the fp_exception_other trap and do not set the floating point trap type field in the %fsr at all when you try to execute an unimplemented FPU operation. Instead, it uses the illegal_instruction trap and it leaves the floating point trap type field clear. So we should not validate the %fsr trap type field when do_mathemu() is invoked from the illegal instruction handler. Also, the floating point trap type field is 3 bits, not 4 bits. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2012-05-24124-1968/+5419
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM changes from Avi Kivity: "Changes include additional instruction emulation, page-crossing MMIO, faster dirty logging, preventing the watchdog from killing a stopped guest, module autoload, a new MSI ABI, and some minor optimizations and fixes. Outside x86 we have a small s390 and a very large ppc update. Regarding the new (for kvm) rebaseless workflow, some of the patches that were merged before we switch trees had to be rebased, while others are true pulls. In either case the signoffs should be correct now." Fix up trivial conflicts in Documentation/feature-removal-schedule.txt arch/powerpc/kvm/book3s_segment.S and arch/x86/include/asm/kvm_para.h. I suspect the kvm_para.h resolution ends up doing the "do I have cpuid" check effectively twice (it was done differently in two different commits), but better safe than sorry ;) * 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (125 commits) KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ block KVM: s390: onereg for timer related registers KVM: s390: epoch difference and TOD programmable field KVM: s390: KVM_GET/SET_ONEREG for s390 KVM: s390: add capability indicating COW support KVM: Fix mmu_reload() clash with nested vmx event injection KVM: MMU: Don't use RCU for lockless shadow walking KVM: VMX: Optimize %ds, %es reload KVM: VMX: Fix %ds/%es clobber KVM: x86 emulator: convert bsf/bsr instructions to emulate_2op_SrcV_nobyte() KVM: VMX: unlike vmcs on fail path KVM: PPC: Emulator: clean up SPR reads and writes KVM: PPC: Emulator: clean up instruction parsing kvm/powerpc: Add new ioctl to retreive server MMU infos kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVM KVM: PPC: bookehv: Fix r8/r13 storing in level exception handler KVM: PPC: Book3S: Enable IRQs during exit handling KVM: PPC: Fix PR KVM on POWER7 bare metal KVM: PPC: Fix stbux emulation KVM: PPC: bookehv: Use lwz/stw instead of PPC_LL/PPC_STL for 32-bit fields ...
| * KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ blockPaul Gortmaker2012-05-211-0/+3
| | | | | | | | | | | | | | | | | | | | There are two functions in this asm-generic file. Looking at other arch which do not use the generic version, these two fcns are within an #ifdef __KERNEL__ block, so make the generic one consistent with those. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: onereg for timer related registersJason J. herne2012-05-172-0/+18
| | | | | | | | | | | | | | | | | | | | Enhance the KVM ONE_REG capability within S390 to allow getting/setting the following special cpu registers: clock comparator and the cpu timer. These are needed for migration. Signed-off-by: Jason J. herne <jjherne@us.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: epoch difference and TOD programmable fieldCarsten Otte2012-05-172-0/+19
| | | | | | | | | | | | | | | | | | | | | | This patch makes vcpu epoch difference and the TOD programmable field accessible from userspace. This is needed in order to implement a couple of instructions that deal with the time of day clock on s390, such as SET CLOCK and for migration. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: KVM_GET/SET_ONEREG for s390Carsten Otte2012-05-171-0/+38
| | | | | | | | | | | | | | | | | | This patch enables KVM_CAP_ONE_REG for s390 and implements stubs for KVM_GET/SET_ONE_REG. This is based on the ppc implementation. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: add capability indicating COW supportChristian Borntraeger2012-05-174-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently qemu/kvm on s390 uses a guest mapping that does not allow the guest backing page table to be write-protected to support older systems. On those older systems a host write protection fault will be delivered to the guest. Newer systems allow to write-protect the guest backing memory and let the fault be delivered to the host, thus allowing COW. Use a capability bit to tell qemu if that is possible. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: Fix mmu_reload() clash with nested vmx event injectionAvi Kivity2012-05-161-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the inject_pending_event() call during guest entry happens after kvm_mmu_reload(). This is for historical reasons - we used to inject_pending_event() in atomic context, while kvm_mmu_reload() needs task context. A problem is that nested vmx can cause the mmu context to be reset, if event injection is intercepted and causes a #VMEXIT instead (the #VMEXIT resets CR0/CR3/CR4). If this happens, we end up with invalid root_hpa, and since kvm_mmu_reload() has already run, no one will fix it and we end up entering the guest this way. Fix by reordering event injection to be before kvm_mmu_reload(). Use ->cancel_injection() to undo if kvm_mmu_reload() fails. https://bugzilla.kernel.org/show_bug.cgi?id=42980 Reported-by: Luke-Jr <luke-jr+linuxbugs@utopios.org> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: MMU: Don't use RCU for lockless shadow walkingAvi Kivity2012-05-163-49/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using RCU for lockless shadow walking can increase the amount of memory in use by the system, since RCU grace periods are unpredictable. We also have an unconditional write to a shared variable (reader_counter), which isn't good for scaling. Replace that with a scheme similar to x86's get_user_pages_fast(): disable interrupts during lockless shadow walk to force the freer (kvm_mmu_commit_zap_page()) to wait for the TLB flush IPI to find the processor with interrupts enabled. We also add a new vcpu->mode, READING_SHADOW_PAGE_TABLES, to prevent kvm_flush_remote_tlbs() from avoiding the IPI. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Optimize %ds, %es reloadAvi Kivity2012-05-161-5/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x86_64, we can defer %ds and %es reload to the heavyweight context switch, since nothing in the lightweight paths uses the host %ds or %es (they are ignored by the processor). Furthermore we can avoid the load if the segments are null, by letting the hardware load the null segments for us. This is the expected case. On i386, we could avoid the reload entirely, since the entry.S paths take care of reload, except for the SYSEXIT path which leaves %ds and %es set to __USER_DS. So we set them to the same values as well. Saves about 70 cycles out of 1600 (around 4%; noisy measurements). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: VMX: Fix %ds/%es clobberAvi Kivity2012-05-161-1/+5
| | | | | | | | | | | | | | | | | | | | The vmx exit code unconditionally restores %ds and %es to __USER_DS. This can override the user's values, since %ds and %es are not saved and restored in x86_64 syscalls. In practice, this isn't dangerous since nobody uses segment registers in long mode, least of all programs that use KVM. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86 emulator: convert bsf/bsr instructions to emulate_2op_SrcV_nobyte()Joerg Roedel2012-05-141-24/+2
| | | | | | | | | | | | | | | | | | | | The instruction emulation for bsrw is broken in KVM because the code always uses bsr with 32 or 64 bit operand size for emulation. Fix that by using emulate_2op_SrcV_nobyte() macro to use guest operand size for emulation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: VMX: unlike vmcs on fail pathXiao Guangrong2012-05-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fix: [ 1529.577273] Call Trace: [ 1529.577289] [<ffffffffa060d58f>] kvm_arch_hardware_disable+0x13/0x30 [kvm] [ 1529.577302] [<ffffffffa05fa2d4>] hardware_disable_nolock+0x35/0x39 [kvm] [ 1529.577311] [<ffffffffa05fa29f>] ? cpumask_clear_cpu.constprop.31+0x13/0x13 [kvm] [ 1529.577315] [<ffffffff81096ba8>] on_each_cpu+0x44/0x84 [ 1529.577326] [<ffffffffa05f98b5>] hardware_disable_all_nolock+0x34/0x36 [kvm] [ 1529.577335] [<ffffffffa05f98e2>] hardware_disable_all+0x2b/0x39 [kvm] [ 1529.577349] [<ffffffffa05fafe5>] kvm_put_kvm+0xed/0x10f [kvm] [ 1529.577358] [<ffffffffa05fb3d7>] kvm_vm_release+0x22/0x28 [kvm] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * Merge branch 'for-upstream' of git://github.com/agraf/linux-2.6 into nextAvi Kivity2012-05-0827-462/+782
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PPC updates from Alex. * 'for-upstream' of git://github.com/agraf/linux-2.6: KVM: PPC: Emulator: clean up SPR reads and writes KVM: PPC: Emulator: clean up instruction parsing kvm/powerpc: Add new ioctl to retreive server MMU infos kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVM KVM: PPC: bookehv: Fix r8/r13 storing in level exception handler KVM: PPC: Book3S: Enable IRQs during exit handling KVM: PPC: Fix PR KVM on POWER7 bare metal KVM: PPC: Fix stbux emulation KVM: PPC: bookehv: Use lwz/stw instead of PPC_LL/PPC_STL for 32-bit fields KVM: PPC: Book3S: PR: No isync in slbie path KVM: PPC: Book3S: PR: Optimize entry path KVM: PPC: booke(hv): Fix save/restore of guest accessible SPRGs. KVM: PPC: Restrict PPC_[L|ST]D macro to asm code KVM: PPC: bookehv: Use a Macro for saving/restoring guest registers to/from their 64 bit copies. KVM: PPC: Use clockevent multiplier and shifter for decrementer KVM: Use minimum and maximum address mapped by TLB1 Signed-off-by: Avi Kivity <avi@redhat.com>
| | * KVM: PPC: Emulator: clean up SPR reads and writesAlexander Graf2012-05-068-144/+190
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading and writing SPRs, every SPR emulation piece had to read or write the respective GPR the value was read from or stored in itself. This approach is pretty prone to failure. What if we accidentally implement mfspr emulation where we just do "break" and nothing else? Suddenly we would get a random value in the return register - which is always a bad idea. So let's consolidate the generic code paths and only give the core specific SPR handling code readily made variables to read/write from/to. Functionally, this patch doesn't change anything, but it increases the readability of the code and makes is less prone to bugs. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Emulator: clean up instruction parsingAlexander Graf2012-05-065-137/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instructions on PPC are pretty similarly encoded. So instead of every instruction emulation code decoding the instruction fields itself, we can move that code to more generic places and rely on the compiler to optimize the unused bits away. This has 2 advantages. It makes the code smaller and it makes the code less error prone, as the instruction fields are always available, so accidental misusage is reduced. Functionally, this patch doesn't change anything. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * kvm/powerpc: Add new ioctl to retreive server MMU infosBenjamin Herrenschmidt2012-05-067-1/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is necessary for qemu to be able to pass the right information to the guest, such as the supported page sizes and corresponding encodings in the SLB and hash table, which can vary depending on the processor type, the type of KVM used (PR vs HV) and the version of KVM Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: fix compilation on hv, adjust for newer ioctl numbers] Signed-off-by: Alexander Graf <agraf@suse.de>
| | * kvm/book3s: Make kernel emulated H_PUT_TCE available for "PR" KVMBenjamin Herrenschmidt2012-05-069-112/+191
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is nothing in the code for emulating TCE tables in the kernel that prevents it from working on "PR" KVM... other than ifdef's and location of the code. This and moves the bulk of the code there to a new file called book3s_64_vio.c. This speeds things up a bit on my G5. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: fix for hv kvm, 32bit, whitespace] Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: bookehv: Fix r8/r13 storing in level exception handlerMihai Caraman2012-05-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Guest r8 register is held in the scratch register and stored correctly, so remove the instruction that clobbers it. Guest r13 was missing from vcpu, store it there. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Book3S: Enable IRQs during exit handlingAlexander Graf2012-05-061-0/+3
| | | | | | | | | | | | | | | | | | | | | While handling an exit, we should listen for interrupts and make sure to receive them when they arrive, to keep our latencies low. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Fix PR KVM on POWER7 bare metalAlexander Graf2012-05-061-13/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running on a system that is HV capable, some interrupts use HSRR SPRs instead of the normal SRR SPRs. These are also used in the Linux handlers to jump back to code after an interrupt got processed. Unfortunately, in our "jump back to the real host handler after we've done the context switch" code, we were only setting the SRR SPRs, rendering Linux to jump back to some invalid IP after it's processed the interrupt. This fixes random crashes on p7 opal mode with PR KVM for me. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Fix stbux emulationAlexander Graf2012-05-061-1/+1
| | | | | | | | | | | | | | | | | | | | | Stbux writes the address it's operating on to the register specified in ra, not into the data source register. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: bookehv: Use lwz/stw instead of PPC_LL/PPC_STL for 32-bit fieldsMihai Caraman2012-05-061-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | Interrupt code used PPC_LL/PPC_STL macros to load/store some of u32 fields which led to memory overflow on 64-bit. Use lwz/stw instead. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Book3S: PR: No isync in slbie pathAlexander Graf2012-05-061-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While messing around with the SLBs we're running in real mode. The entry to guest space goes through rfid, which is context synchronizing, so there's no need to manually synchronize anything through isync. With this patch and a simple priviledged SPR access loop guest, I get a speed bump from 2035607 to 2181301 exits per second. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Book3S: PR: Optimize entry pathAlexander Graf2012-05-061-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By shuffling a few instructions around we can execute more memory loads in parallel, giving us a small performance boost. With this patch and a simple priviledged SPR access loop guest, I get a speed bump from 2013052 to 2035607 exits per second. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: booke(hv): Fix save/restore of guest accessible SPRGs.Varun Sethi2012-05-062-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | For Guest accessible SPRGs 4-7, save/restore must be handled differently for 64bit and non-64 bit case. Use the PPC_STD/PPC_LD macros for saving/restoring to/from these registers. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Restrict PPC_[L|ST]D macro to asm codeAlexander Graf2012-05-061-0/+2
| | | | | | | | | | | | | | | | | | | | | We only want asm code macros to be accessible from asm code, so #ifdef it depending on it. Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: bookehv: Use a Macro for saving/restoring guest registers to/from ↵Varun Sethi2012-05-062-20/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | their 64 bit copies. Introduced PPC_STD/PPC_LD macros for saving/restoring guest registers to/from their 64 bit copies. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: PPC: Use clockevent multiplier and shifter for decrementerBharat Bhushan2012-05-063-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Time for which the hrtimer is started for decrementer emulation is calculated using tb_ticks_per_usec. While hrtimer uses the clockevent for DEC reprogramming (if needed) and which calculate timebase ticks using the multiplier and shifter mechanism implemented within clockevent layer. It was observed that this conversion (timebase->time->timebase) are not correct because the mechanism are not consistent. In our setup it adds 2% jitter. With this patch clockevent multiplier and shifter mechanism are used when starting hrtimer for decrementer emulation. Now the jitter is < 0.5%. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * KVM: Use minimum and maximum address mapped by TLB1Bharat Bhushan2012-05-062-2/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | Keep track of minimum and maximum address mapped by tlb1. This helps in TLBMISS handling in KVM to quick check whether the address lies in mapped range. If address does not lies in this range then no need to look in each tlb1 entry of tlb1 array. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| * | KVM: x86 emulator: Avoid pushing back ModRM byte fetched for group decodingTakuya Yoshikawa2012-05-061-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although ModRM byte is fetched for group decoding, it is soon pushed back to make decode_modrm() fetch it later again. Now that ModRM flag can be found in the top level opcode tables, fetch ModRM byte before group decoding to make the code simpler. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: x86 emulator: Move ModRM flags for groups to top level opcode tablesTakuya Yoshikawa2012-05-061-55/+56
| | | | | | | | | | | | | | | | | | | | | Needed for the following patch which simplifies ModRM fetching code. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM guest: make kvm_para_available() check hypervisor bit reading cpuid leafGleb Natapov2012-05-061-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | This cpuid range does not exist on real HW and Intel spec says that "Information returned for highest basic information leaf" will be returned. Not very well defined. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | KVM: fix cpuid eax for KVM leafMichael S. Tsirkin2012-05-062-2/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | cpuid eax should return the max leaf so that guests can find out the valid range. This matches Xen et al. Update documentation to match. Tested with -cpu host. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: s390: implement KVM_CAP_NR/MAX_VCPUSChristian Borntraeger2012-05-021-0/+4
| | | | | | | | | | | | | | | | Let userspace know the number of max and supported cpus for kvm on s390. Return KVM_MAX_VCPUS (currently 64) for both values. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: Handle sckpf instructionCornelia Huck2012-04-303-0/+33
| | | | | | | | | | | | | | | | Handle the mandatory intercept SET CLOCK PROGRAMMABLE FIELD instruction. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: use kvm_vcpu_on_spin for diag 0x44Christian Borntraeger2012-04-301-3/+1
| | | | | | | | | | | | | | | | Lets replace the old open coded version of diag 0x44 (which relied on compat_sched_yield) with kvm_vcpu_on_spin. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: s390: Implement the directed yield (diag 9c) hypervisor call for KVMKonstantin Weitz2012-04-305-16/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the directed yield hypercall found on other System z hypervisors. It delegates execution time to the virtual cpu specified in the instruction's parameter. Useful to avoid long spinlock waits in the guest. Christian Borntraeger: moved common code in virt/kvm/ Signed-off-by: Konstantin Weitz <WEITZKON@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86: Run PIT work in own kthreadJan Kiszka2012-04-273-14/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't run PIT IRQ injection work in the interrupt context of the host timer. This would allow the user to influence the handler complexity by asking for a broadcast to a large number of VCPUs. Therefore, this work was pushed into workqueue context in 9d244caf2e. However, this prevents prioritizing the PIT injection over other task as workqueues share kernel threads. This replaces the workqueue with a kthread worker and gives that thread a name in the format "kvm-pit/<owner-process-pid>". That allows to identify and adjust the kthread priority according to the VM process parameters. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86: Document in-kernel PIT APIJan Kiszka2012-04-271-0/+63
| | | | | | | | | | | | | | Add descriptions for KVM_CREATE_PIT2 and KVM_GET/SET_PIT2. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: Improve readability of KVM API docJan Kiszka2012-04-271-7/+92
| | | | | | | | | | | | | | | | This helps to identify sections and it also fixes the numbering from 4.54 to 4.61. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: x86 emulator: fix asm constraint in flush_pending_x87_faultsAvi Kivity2012-04-241-1/+1
| | | | | | | | | | | | | | 'bool' wants 8-bit registers. Reported-by: Takuya Yoshikawa <takuya.yoshikawa@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Introduce bitmask for apic attention reasonsGleb Natapov2012-04-242-5/+11
| | | | | | | | | | | | | | | | | | | | | | The patch introduces a bitmap that will hold reasons apic should be checked during vmexit. This is in a preparation for vp eoi patch that will add one more check on vmexit. With the bitmap we can do if(apic_attention) to check everything simultaneously which will add zero overhead on the fast path. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: Introduce direct MSI message injection for in-kernel irqchipsJan Kiszka2012-04-247-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, MSI messages can only be injected to in-kernel irqchips by defining a corresponding IRQ route for each message. This is not only unhandy if the MSI messages are generated "on the fly" by user space, IRQ routes are a limited resource that user space has to manage carefully. By providing a direct injection path, we can both avoid using up limited resources and simplify the necessary steps for user land. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * KVM: add kvm_arch_para_features stub to asm-generic/kvm_para.hMarcelo Tosatti2012-04-201-0/+5
| | | | | | | | | | | | | | Needed by kvm_para_has_feature(). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: ia64: fix build due to typoAvi Kivity2012-04-191-1/+1
| | | | | | | | | | | | | | s/kcm/kvm/. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * KVM: Fix page-crossing MMIOAvi Kivity2012-04-194-42/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | MMIO that are split across a page boundary are currently broken - the code does not expect to be aborted by the exit to userspace for the first MMIO fragment. This patch fixes the problem by generalizing the current code for handling 16-byte MMIOs to handle a number of "fragments", and changes the MMIO code to create those fragments. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * Merge branch 'linus' into queueMarcelo Tosatti2012-04-19431-3055/+4025
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: development work has dependency on kvm patches merged upstream. Conflicts: Documentation/feature-removal-schedule.txt Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | KVM: MMU: use page table level macroDavidlohr Bueso2012-04-182-2/+2
| | | | | | | | | | | | | | | | | | | | | Its much cleaner to use PT_PAGE_TABLE_LEVEL than its numeric value. Signed-off-by: Davidlohr Bueso <dave@gnu.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | KVM: dont clear TMR on EOIMichael S. Tsirkin2012-04-163-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel spec says that TMR needs to be set/cleared when IRR is set, but kvm also clears it on EOI. I did some tests on a real (AMD based) system, and I see same TMR values both before and after EOI, so I think it's a minor bug in kvm. This patch fixes TMR to be set/cleared on IRR set only as per spec. And now that we don't clear TMR, we can save an atomic read of TMR on EOI that's not propagated to ioapic, by checking whether ioapic needs a specific vector first and calculating the mode afterwards. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>