path: root/drivers/kvm/mmu.c
Commit history (newest first). Each entry: commit message, author, date, files changed, lines changed.
* KVM: Move arch dependent files to new directory arch/x86/kvm/
  Avi Kivity, 2008-01-30 [1 file, -1806/+0]
  This paves the way for multiple architecture support. Note that while
  ioapic.c could potentially be shared with ia64, it is also moved.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Portability: Move mmu-related fields to kvm_arch
  Zhang Xiantao, 2008-01-30 [1 file, -28/+30]
  This patch moves mmu-related fields to kvm_arch.
  Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
  Acked-by: Carsten Otte <cotte@de.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Portability: Split mmu-related static inline functions to mmu.h
  Zhang Xiantao, 2008-01-30 [1 file, -0/+1]
  Since these functions need to know the details of the kvm or kvm_vcpu
  structure, they can't be put in x86.h. Create mmu.h to hold them.
  Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
  Acked-by: Carsten Otte <cotte@de.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Portability: Introduce kvm_vcpu_arch
  Zhang Xiantao, 2008-01-30 [1 file, -71/+71]
  Move all the architecture-specific fields in kvm_vcpu into a new
  struct kvm_vcpu_arch.
  Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
  Acked-by: Carsten Otte <cotte@de.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Fix SMP shadow instantiation race
  Marcelo Tosatti, 2008-01-30 [1 file, -4/+8]
  There is a race where VCPU0 is shadowing a pagetable entry while VCPU1
  is updating it, which results in a stale shadow copy. Fix that by
  comparing the contents of the cached guest pte with the current guest
  pte after write-protecting the guest pagetable.
  Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Use mmu_set_spte() for real-mode shadows
  Avi Kivity, 2008-01-30 [1 file, -31/+10]
  In addition to removing some duplicated code, this also handles the
  unlikely case of real-mode code updating a guest page table. This can
  happen when one vcpu (in real mode) touches a second vcpu's (in
  protected mode) page tables, or if a vcpu switches to real mode,
  touches page tables, and switches back.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Adjust mmu_set_spte() debug code for gpte removal
  Avi Kivity, 2008-01-30 [1 file, -2/+2]
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Move set_pte() into guest paging mode independent code
  Avi Kivity, 2008-01-30 [1 file, -0/+83]
  As set_pte() no longer references either a gpte or the guest walker, we
  can move it out of paging mode dependent code (which compiles twice and
  is generally nasty).
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Fix inherited permissions for emulated guest pte updates
  Avi Kivity, 2008-01-30 [1 file, -2/+2]
  When we emulate a guest pte write, we fail to apply the correct
  inherited permissions from the parent ptes. Now that we store inherited
  permissions in the shadow page, we can use that to update the pte
  permissions correctly.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Simplify calculation of pte access
  Avi Kivity, 2008-01-30 [1 file, -4/+10]
  The nx bit is awkwardly placed in the 63rd bit position; furthermore it
  has a reversed meaning compared to the other bits, which means we can't
  use a bitwise and to calculate compounded access masks. So, we simplify
  things by creating a new 3-bit exec/write/user access word, and doing
  all calculations in that.
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Replace page fault injection by the generalized exception queue
  Avi Kivity, 2008-01-30 [1 file, -1/+1]
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: mark pages that were inserted to the shadow pages table as accessed
  Izik Eidus, 2008-01-30 [1 file, -0/+2]
  Mark guest pages as accessed when removed from the shadow page tables
  for better lru processing.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Rename 'release_page'
  Avi Kivity, 2008-01-30 [1 file, -4/+4]
  Rename the awkwardly named variable.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Rename variables of type 'struct kvm_mmu_page *'
  Avi Kivity, 2008-01-30 [1 file, -154/+146]
  These are traditionally named 'page', but even more traditionally, that
  name is reserved for variables that point to a 'struct page'. Rename
  them to 'sp' (for "shadow page").
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Remove gpa_to_hpa()
  Avi Kivity, 2008-01-30 [1 file, -18/+3]
  Converting last uses along the way.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Remove gva_to_hpa()
  Avi Kivity, 2008-01-30 [1 file, -9/+0]
  No longer used.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Simplify nonpaging_map()
  Avi Kivity, 2008-01-30 [1 file, -14/+10]
  Instead of passing an hpa, pass a regular struct page.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Adjust page_header_update_slot() to accept a gfn instead of a gpa
  Avi Kivity, 2008-01-30 [1 file, -3/+4]
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Remove extra gaddr parameter from set_pte_common()
  Avi Kivity, 2008-01-30 [1 file, -0/+1]
  Similar information is available in the gfn parameter, so use that.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Move pse36 handling to the guest walker
  Avi Kivity, 2008-01-30 [1 file, -0/+7]
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Don't bother the mmu if cr3 load doesn't change cr3
  Avi Kivity, 2008-01-30 [1 file, -1/+1]
  If the guest requests just a tlb flush, don't take the vm lock and drop
  the mmu context pointlessly.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Avoid unnecessary remote tlb flushes when guest updates a pte
  Avi Kivity, 2008-01-30 [1 file, -1/+26]
  If all we're doing is increasing permissions on a pte (typical for
  demand paging), then there's no need to flush remote tlbs. Worst case
  they'll get a spurious page fault.
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Recalculate mmu pages needed for every memory region change
  Zhang Xiantao, 2008-01-30 [1 file, -0/+19]
  Instead of incrementally changing the mmu cache size for every memory
  slot operation, recalculate it from scratch. This is simpler and safer.
  Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Selectively set PageDirty when releasing guest memory
  Izik Eidus, 2008-01-30 [1 file, -8/+15]
  Improve dirty bit setting for pages that kvm releases: until now, every
  released page was marked dirty; from now on, only pages that could
  actually have been dirtied are marked dirty.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Fix potential memory leak with smp real-mode
  Izik Eidus, 2008-01-30 [1 file, -1/+3]
  When we map a page, we check whether some other vcpu mapped it for us
  and if so, bail out. But we should decrease the refcount on the page as
  we do so.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Add some mmu statistics
  Avi Kivity, 2008-01-30 [1 file, -1/+8]
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Topup the mmu memory preallocation caches before emulating an insn
  Avi Kivity, 2008-01-30 [1 file, -0/+4]
  Emulation may cause a shadow pte to be instantiated, which requires
  memory resources. Make sure the caches are filled to avoid an oops.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Move page fault processing to common code
  Avi Kivity, 2008-01-30 [1 file, -0/+36]
  The code that dispatches the page fault and emulates if we failed to
  map is duplicated across vmx and svm. Merge it to simplify further
  bugfixing.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Portability: Split kvm_vcpu into arch dependent and independent parts (part 1)
  Zhang Xiantao, 2008-01-30 [1 file, -0/+1]
  First step to split kvm_vcpu. Currently, we just use a macro to define
  the common fields in kvm_vcpu for all archs, and each arch needs to
  define its own kvm_vcpu struct.
  Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Partial swapping of guest memory
  Izik Eidus, 2008-01-30 [1 file, -1/+13]
  This allows guest memory to be swapped. Pages which are currently
  mapped via shadow page tables are pinned into memory, but all other
  pages can be freely swapped. The patch makes gfn_to_page() elevate the
  page's reference count, and introduces kvm_release_page() that pairs
  with it.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Make gfn_to_page() always safe
  Izik Eidus, 2008-01-30 [1 file, -11/+5]
  In case the page is not present in the guest memory map, return a dummy
  page the guest can scribble on. This simplifies error checking in its
  users.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Keep a reverse mapping of non-writable translations
  Izik Eidus, 2008-01-30 [1 file, -12/+11]
  The current kvm mmu only reverse maps writable translations. This is
  used to write-protect a page in case it becomes a pagetable. But with
  swapping support, we need a reverse mapping of read-only pages as well:
  when we evict a page, we need to remove any mapping to it, whether
  writable or not.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Add rmap_next(), a helper for walking kvm rmaps
  Izik Eidus, 2008-01-30 [1 file, -10/+35]
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Instantiate real-mode shadows as user writable shadows
  Avi Kivity, 2008-01-30 [1 file, -1/+1]
  This is consistent with real-mode permissions.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Move guest pte dirty bit management to the guest pagetable walker
  Avi Kivity, 2008-01-30 [1 file, -0/+5]
  This is more consistent with the accessed bit management, and makes the
  dirty bit available earlier for other purposes.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: More struct kvm_vcpu -> struct kvm cleanups
  Anthony Liguori, 2008-01-30 [1 file, -13/+13]
  This time, the biggest change is gpa_to_hpa. The translation of GPA to
  HPA does not depend on the VCPU state, unlike GVA to GPA, so there's no
  need to pass in the kvm_vcpu.
  Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Clean up MMU functions to take struct kvm when appropriate
  Anthony Liguori, 2008-01-30 [1 file, -9/+9]
  Some of the MMU functions take a struct kvm_vcpu even though they
  affect all VCPUs. This patch cleans up some of them to instead take a
  struct kvm. This makes things a bit more clear. The main thing that was
  confusing me was whether certain functions need to be called on all
  VCPUs.
  Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: CodingStyle cleanup
  Mike Day, 2008-01-30 [1 file, -4/+6]
  Signed-off-by: Mike D. Day <ncmike@ncultra.org>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Allow dynamic allocation of the mmu shadow cache size
  Izik Eidus, 2008-01-30 [1 file, -2/+38]
  The user is now able to set how many mmu pages will be allocated to the
  guest.
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Remove the usage of page->private field by rmap
  Izik Eidus, 2008-01-30 [1 file, -52/+70]
  When kvm uses user-allocated pages in the future for the guest, we
  won't be able to use page->private for rmap, since page->rmap is
  reserved for the filesystem. So we move the rmap base pointers to the
  memory slot. A side effect of this is that we need to store the gfn of
  each gpte in the shadow pages, since the memory slot is addressed by
  gfn, instead of hfn like struct page.
  Signed-off-by: Izik Eidus <izik@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Make flooding detection work when guest page faults are bypassed
  Avi Kivity, 2008-01-30 [1 file, -1/+20]
  When we allow guest page faults to reach the guests directly, we lose
  the fault tracking which allows us to detect demand paging. So we
  provide an alternate mechanism by clearing the accessed bit when we set
  a pte, and checking it later to see if the guest actually used it.
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Allow not-present guest page faults to bypass kvm
  Avi Kivity, 2008-01-30 [1 file, -21/+68]
  There are two classes of page faults trapped by kvm:
  - host page faults, where the fault is needed to allow kvm to install
    the shadow pte or update the guest accessed and dirty bits
  - guest page faults, where the guest has faulted and kvm simply
    injects the fault back into the guest to handle
  The second class, guest page faults, is pure overhead. We can
  eliminate some of it on vmx using the following evil trick:
  - when we set up a shadow page table entry, if the corresponding guest
    pte is not present, set up the shadow pte as not present
  - if the guest pte _is_ present, mark the shadow pte as present but
    also set one of the reserved bits in the shadow pte
  - tell the vmx hardware not to trap faults which have the present bit
    clear
  With this, normal page-not-present faults go directly to the guest,
  bypassing kvm entirely. Unfortunately, this trick only works on Intel
  hardware, as AMD lacks a way to discriminate among page faults based
  on error code. It is also a little risky since it uses reserved bits
  which might become unreserved in the future, so a module parameter is
  provided to disable it.
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Reset mmu context when entering real mode
  Eddie Dong, 2007-10-22 [1 file, -0/+1]
  Resetting an SMP guest forces the APs to enter real mode (RESET) while
  paging is still enabled from protected mode. The current enter_rmode()
  can only handle the mode switch from nonpaging mode to real mode,
  which leads to SMP reboot failure. Fix by reloading the mmu context on
  entering real mode.
  Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
  Signed-off-by: Qing He <qing.he@intel.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Set shadow pte atomically in mmu_pte_write_zap_pte()
  Izik Eidus, 2007-10-22 [1 file, -1/+1]
  A shadow page table entry should be set atomically, using
  set_shadow_pte().
  Signed-off-by: Izik Eidus <izike@qumranet.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: MMU: Don't do GFP_NOWAIT allocations
  Avi Kivity, 2007-10-13 [1 file, -24/+10]
  Before preempt notifiers, kvm needed to allocate memory with GFP_NOWAIT
  so as not to have to enable preemption and take a heavyweight exit. On
  oom, we'd fall back to a GFP_KERNEL allocation. With preemption
  notifiers, we can do a GFP_KERNEL allocation, and perform the
  heavyweight exit only if the kernel decides to put us to sleep.
  Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Rename kvm_arch_ops to kvm_x86_ops
  Christian Ehrhardt, 2007-10-13 [1 file, -3/+3]
  This patch just renames the current (misnamed) _arch namings to _x86 to
  ensure better readability when a real arch layer takes place.
  Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Convert vm lock to a mutex
  Shaohua Li, 2007-10-13 [1 file, -5/+4]
  This allows the kvm mmu to perform sleepy operations, such as memory
  allocation.
  Signed-off-by: Shaohua Li <shaohua.li@intel.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Use the scheduler preemption notifiers to make kvm preemptible
  Avi Kivity, 2007-10-13 [1 file, -2/+0]
  Current kvm disables preemption while the new virtualization registers
  are in use. This of course is not very good for latency sensitive
  workloads (one use of virtualization is to offload user interface and
  other latency insensitive stuff to a container, so that it is easier
  to analyze the remaining workload). This patch re-enables preemption
  for kvm; preemption is now only disabled when switching the registers
  in and out, and during the switch to guest mode and back.
  Contains fixes from Shaohua Li <shaohua.li@intel.com>.
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Move gfn_to_page out of kmap/unmap pairs
  Shaohua Li, 2007-10-13 [1 file, -1/+1]
  gfn_to_page might sleep with swap support. Move it out of the kmap
  calls.
  Signed-off-by: Shaohua Li <shaohua.li@intel.com>
  Signed-off-by: Avi Kivity <avi@qumranet.com>

* KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.h
  Rusty Russell, 2007-10-13 [1 file, -1/+1]
  The kernel now has asm/cpu-features.h: use those macros instead of
  inventing our own. Also spell out definition of CR0_RESEVED_BITS (no
  code change) and fix typo.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Avi Kivity <avi@qumranet.com>