| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
xen_flush_tlb_all() is used in arch/x86/xen/mmu.c only. Make it static.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
There are some leftovers testing for pvh guest mode in pv-only source
files. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When booted as pv-guest the p2m list presented by the Xen is already
mapped to virtual addresses. In dom0 case the hypervisor might make use
of 2M- or 1G-pages for this mapping. Unfortunately while being properly
aligned in virtual and machine address space, those pages might not be
aligned properly in guest physical address space.
So when trying to obtain the guest physical address of such a page
pud_pfn() and pmd_pfn() must be avoided as those will mask away guest
physical address bits not being zero in this special case.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as Xen pv guest X86_BUG_SYSRET_SS_ATTRS must not be set
on AMD cpus.
This bug/feature bit is kind of special as it will be used very early
when switching threads. Setting the bit and clearing it a little bit
later leaves a critical window where things can go wrong. This time
window has enlarged a little bit by using setup_clear_cpu_cap() instead
of the hypervisor's set_cpu_features callback. It seems this larger
window now makes it rather easy to hit the problem.
The proper solution is to never set the bit in case of Xen.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Routines that are set by xen_init_time_ops() use shared_info's
pvclock_vcpu_time_info area. This area is not properly available until
shared_info is mapped in xen_setup_shared_info().
This became especially problematic due to commit dd759d93f4dd ("x86/timers:
Add simple udelay calibration") where we end up reading tsc_to_system_mul
from xen_dummy_shared_info (i.e. getting zero value) and then trying
to divide by it in pvclock_tsc_khz().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 690b7f10b4f9f ("x86/xen: use capabilities instead of fake cpuid
values for xsave") introduced a regression as it tried to make use of
the fixup feature before it being available.
Fall back to the old variant testing via cpuid().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 84d582d236dc ("xen: Revert commits da72ff5bfcb0 and
72a9b186292d") defined xen_have_vector_callback in enlighten_hvm.c.
Since guest-type-neutral code refers to this variable this causes
build failures when CONFIG_XEN_PVHVM is not defined.
Moving xen_have_vector_callback definition to enlighten.c resolves
this issue.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When rebooting DOM0 with ACPI on ARM64, the kernel is crashing with the stack
trace [1].
This is happening because when EFI runtimes are enabled, the reset code
(see machine_restart) will first try to use EFI restart method.
However, the EFI restart code is expecting the reset_system callback to
be always set. This is not the case for Xen and will lead to crash.
The EFI restart helper is used in multiple places and some of them don't
not have fallback (see machine_power_off). So implement reset_system
callback as a call to xen_reboot when using EFI Xen.
[ 36.999270] reboot: Restarting system
[ 37.002921] Internal error: Attempting to execute userspace memory: 86000004 [#1] PREEMPT SMP
[ 37.011460] Modules linked in:
[ 37.014598] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.11.0-rc1-00003-g1e248b60a39b-dirty #506
[ 37.023903] Hardware name: (null) (DT)
[ 37.027734] task: ffff800902068000 task.stack: ffff800902064000
[ 37.033739] PC is at 0x0
[ 37.036359] LR is at efi_reboot+0x94/0xd0
[ 37.040438] pc : [<0000000000000000>] lr : [<ffff00000880f2c4>] pstate: 404001c5
[ 37.047920] sp : ffff800902067cf0
[ 37.051314] x29: ffff800902067cf0 x28: ffff800902068000
[ 37.056709] x27: ffff000008992000 x26: 000000000000008e
[ 37.062104] x25: 0000000000000123 x24: 0000000000000015
[ 37.067499] x23: 0000000000000000 x22: ffff000008e6e250
[ 37.072894] x21: ffff000008e6e000 x20: 0000000000000000
[ 37.078289] x19: ffff000008e5d4c8 x18: 0000000000000010
[ 37.083684] x17: 0000ffffa7c27470 x16: 00000000deadbeef
[ 37.089079] x15: 0000000000000006 x14: ffff000088f42bef
[ 37.094474] x13: ffff000008f42bfd x12: ffff000008e706c0
[ 37.099870] x11: ffff000008e70000 x10: 0000000005f5e0ff
[ 37.105265] x9 : ffff800902067a50 x8 : 6974726174736552
[ 37.110660] x7 : ffff000008cc6fb8 x6 : ffff000008cc6fb0
[ 37.116055] x5 : ffff000008c97dd8 x4 : 0000000000000000
[ 37.121453] x3 : 0000000000000000 x2 : 0000000000000000
[ 37.126845] x1 : 0000000000000000 x0 : 0000000000000000
[ 37.132239]
[ 37.133808] Process systemd-shutdow (pid: 1, stack limit = 0xffff800902064000)
[ 37.141118] Stack: (0xffff800902067cf0 to 0xffff800902068000)
[ 37.146949] 7ce0: ffff800902067d40 ffff000008085334
[ 37.154869] 7d00: 0000000000000000 ffff000008f3b000 ffff800902067d40 ffff0000080852e0
[ 37.162787] 7d20: ffff000008cc6fb0 ffff000008cc6fb8 ffff000008c7f580 ffff000008c97dd8
[ 37.170706] 7d40: ffff800902067d60 ffff0000080e2c2c 0000000000000000 0000000001234567
[ 37.178624] 7d60: ffff800902067d80 ffff0000080e2ee8 0000000000000000 ffff0000080e2df4
[ 37.186544] 7d80: 0000000000000000 ffff0000080830f0 0000000000000000 00008008ff1c1000
[ 37.194462] 7da0: ffffffffffffffff 0000ffffa7c4b1cc 0000000000000000 0000000000000024
[ 37.202380] 7dc0: ffff800902067dd0 0000000000000005 0000fffff24743c8 0000000000000004
[ 37.210299] 7de0: 0000fffff2475f03 0000000000000010 0000fffff2474418 0000000000000005
[ 37.218218] 7e00: 0000fffff2474578 000000000000000a 0000aaaad6b722c0 0000000000000001
[ 37.226136] 7e20: 0000000000000123 0000000000000038 ffff800902067e50 ffff0000081e7294
[ 37.234055] 7e40: ffff800902067e60 ffff0000081e935c ffff800902067e60 ffff0000081e9388
[ 37.241973] 7e60: ffff800902067eb0 ffff0000081ea388 0000000000000000 00008008ff1c1000
[ 37.249892] 7e80: ffffffffffffffff 0000ffffa7c4a79c 0000000000000000 ffff000000020000
[ 37.257810] 7ea0: 0000010000000004 0000000000000000 0000000000000000 ffff0000080830f0
[ 37.265729] 7ec0: fffffffffee1dead 0000000028121969 0000000001234567 0000000000000000
[ 37.273651] 7ee0: ffffffffffffffff 8080000000800000 0000800000008080 feffa9a9d4ff2d66
[ 37.281567] 7f00: 000000000000008e feffa9a9d5b60e0f 7f7fffffffff7f7f 0101010101010101
[ 37.289485] 7f20: 0000000000000010 0000000000000008 000000000000003a 0000ffffa7ccf588
[ 37.297404] 7f40: 0000aaaad6b87d00 0000ffffa7c4b1b0 0000fffff2474be0 0000aaaad6b88000
[ 37.305326] 7f60: 0000fffff2474fb0 0000000001234567 0000000000000000 0000000000000000
[ 37.313240] 7f80: 0000000000000000 0000000000000001 0000aaaad6b70d4d 0000000000000000
[ 37.321159] 7fa0: 0000000000000001 0000fffff2474ea0 0000aaaad6b5e2e0 0000fffff2474e80
[ 37.329078] 7fc0: 0000ffffa7c4b1cc 0000000000000000 fffffffffee1dead 000000000000008e
[ 37.336997] 7fe0: 0000000000000000 0000000000000000 9ce839cffee77eab fafdbf9f7ed57f2f
[ 37.344911] Call trace:
[ 37.347437] Exception stack(0xffff800902067b20 to 0xffff800902067c50)
[ 37.353970] 7b20: ffff000008e5d4c8 0001000000000000 0000000080f82000 0000000000000000
[ 37.361883] 7b40: ffff800902067b60 ffff000008e17000 ffff000008f44c68 00000001081081b4
[ 37.369802] 7b60: ffff800902067bf0 ffff000008108478 0000000000000000 ffff000008c235b0
[ 37.377721] 7b80: ffff800902067ce0 0000000000000000 0000000000000000 0000000000000015
[ 37.385643] 7ba0: 0000000000000123 000000000000008e ffff000008992000 ffff800902068000
[ 37.393557] 7bc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 37.401477] 7be0: 0000000000000000 ffff000008c97dd8 ffff000008cc6fb0 ffff000008cc6fb8
[ 37.409396] 7c00: 6974726174736552 ffff800902067a50 0000000005f5e0ff ffff000008e70000
[ 37.417318] 7c20: ffff000008e706c0 ffff000008f42bfd ffff000088f42bef 0000000000000006
[ 37.425234] 7c40: 00000000deadbeef 0000ffffa7c27470
[ 37.430190] [< (null)>] (null)
[ 37.434982] [<ffff000008085334>] machine_restart+0x6c/0x70
[ 37.440550] [<ffff0000080e2c2c>] kernel_restart+0x6c/0x78
[ 37.446030] [<ffff0000080e2ee8>] SyS_reboot+0x130/0x228
[ 37.451337] [<ffff0000080830f0>] el0_svc_naked+0x24/0x28
[ 37.456737] Code: bad PC value
[ 37.459891] ---[ end trace 76e2fc17e050aecd ]---
Signed-off-by: Julien Grall <julien.grall@arm.com>
--
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
The x86 code has theoritically a similar issue, altought EFI does not
seem to be the preferred method. I have only built test it on x86.
This should also probably be fixed in stable tree.
Changes in v2:
- Implement xen_efi_reset_system using xen_reboot
- Move xen_efi_reset_system in drivers/xen/efi.c
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
The helper xen_reboot will be called by the EFI code in a later patch.
Note that the ARM version does not yet exist and will be added in a
later patch too.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Recent code rework that split handling ov PV, HVM and PVH guests into
separate files missed calling xen_smp_intr_init_pv() on CPU0.
Add this call.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
established that commit 72a9b186292d ("xen: Remove event channel
notification through Xen PCI platform device") (and thus commit
da72ff5bfcb0 ("partially revert "xen: Remove event channel
notification through Xen PCI platform device"")) are unnecessary and,
in fact, prevent HVM guests from booting on Xen releases prior to 4.0
Therefore we revert both of those commits.
The summary of that discussion is below:
Here is the brief summary of the current situation:
Before the offending commit (72a9b186292):
1) INTx does not work because of the reset_watches path.
2) The reset_watches path is only taken if you have Xen > 4.0
3) The Linux Kernel by default will use vector inject if the hypervisor
support. So even INTx does not work no body running the kernel with
Xen > 4.0 would notice. Unless he explicitly disabled this feature
either in the kernel or in Xen (and this can only be disabled by
modifying the code, not user-supported way to do it).
After the offending commit (+ partial revert):
1) INTx is no longer support for HVM (only for PV guests).
2) Any HVM guest The kernel will not boot on Xen < 4.0 which does
not have vector injection support. Since the only other mode
supported is INTx which.
So based on this summary, I think before commit (72a9b186292) we were
in much better position from a user point of view.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
e820 map is updated with information from the zeropage (i.e. pvh_bootparams)
by default_machine_specific_memory_setup(). With the way things are done
now, we end up with a duplicated e820 map.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no user of x86_hyper->set_cpu_features() any more. Remove it.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no need to set the same capabilities for each cpu
individually. This can be done for all cpus in platform initialization.
Cc: Alok Kataria <akataria@vmware.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Alok Kataria <akataria@vmware.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the xsave feature availability is
indicated by special casing the related cpuid leaf.
Instead of delivering fake cpuid values set or clear the cpu
capability bits for xsave instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the x2apic feature is indicated as not
being present by special casing the related cpuid leaf.
Instead of delivering fake cpuid values clear the cpu capability bit
for x2apic instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the mwait feature is indicated to be
present or not by special casing the related cpuid leaf.
Instead of delivering fake cpuid values use the cpu capability bit
for mwait instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the acpi feature is indicated as not
being present by special casing the related cpuid leaf in case we
are not the initial domain.
Instead of delivering fake cpuid values clear the cpu capability bit
for acpi instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the acc feature (thermal monitoring)
is indicated as not being present by special casing the related
cpuid leaf.
Instead of delivering fake cpuid values clear the cpu capability bit
for acc instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the mtrr feature is indicated as not
being present by special casing the related cpuid leaf.
Instead of delivering fake cpuid values clear the cpu capability bit
for mtrr instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the aperf/mperf feature is indicated
as not being present by special casing the related cpuid leaf.
Instead of delivering fake cpuid values clear the cpu capability bit
for aperf/mperf instead.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Xen doesn't support DCA (direct cache access) for pv domains. Clear
the corresponding capability indicator.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
There is no need to set the same capabilities for each cpu
individually. This can easily be done for all cpus when starting the
kernel.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For kdump to work correctly it needs the physical address of
vmcoreinfo_note. When running as dom0 this means the virtual address
has to be translated to the related machine address.
paddr_vmcoreinfo_note() is meant to do the translation via
__pa_symbol() only, but being attributed "weak" it can be replaced
easily in Xen case.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
| |
xen_call_function_interrupt() isn't used in smp_pv.c. Remove it.
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
After code split between PV and HVM some functions in xen_smp_ops have
xen_pv_ prefix and some only xen_ which makes them look like they're
common for both PV and HVM while they're not. Rename all the rest to
have xen_pv_ prefix.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Now everything is in place and we can move PV-only code under
CONFIG_XEN_PV. CONFIG_XEN_PV_SMP is created to support the change.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__pfn_to_mfn() is only used from PV code (mmu_pv.c, p2m.c) and from
page.h where all functions calling it check for
xen_feature(XENFEAT_auto_translated_physmap) first so we can replace
it with any stub to make build happy.
set_foreign_p2m_mapping()/clear_foreign_p2m_mapping() are used from
grant-table.c but only if !xen_feature(XENFEAT_auto_translated_physmap).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
startup_xen references PV-only code, decorate it with #ifdef CONFIG_XEN_PV
to make PV-free builds possible.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
xen_pmu_init/finish() functions are used in suspend.c and
enlighten.c, add stubs for now.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
| |
Slit the code in suspend.c into suspend_pv.c and suspend_hvm.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Basically, mmu.c is renamed to mmu_pv.c and some code moved out to common
mmu.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
| |
Move PVHVM related code to mmu_hvm.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Basically, smp.c is renamed to smp_pv.c and some code moved out to common
smp.c. struct xen_common_irq delcaration ended up in smp.h.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Move PVHVM related code to smp_hvm.c. Drop 'static' qualifier from
xen_smp_send_reschedule(), xen_smp_send_call_function_ipi(),
xen_smp_send_call_function_single_ipi(), these functions will be moved to
common smp code when smp_pv.c is split.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Split xen_cpu_die() into xen_pv_cpu_die() and xen_hvm_cpu_die() to support
further splitting of smp.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Split xen_smp_prepare_boot_cpu() into xen_pv_smp_prepare_boot_cpu() and
xen_hvm_smp_prepare_boot_cpu() to support further splitting of smp.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
xen_smp_intr_init() and xen_smp_intr_free() have PV-specific code and as
a praparatory change to splitting smp.c we need to split these fucntions.
Create xen_smp_intr_init_pv()/xen_smp_intr_free_pv().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Basically, enlighten.c is renamed to enlighten_pv.c and some code moved
out to common enlighten.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Move PVHVM related code to enlighten_hvm.c. Three functions:
xen_cpuhp_setup(), xen_reboot(), xen_emergency_restart() are shared, drop
static qualifier from them. These functions will go to common code once
it is split from enlighten.c.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
Create enlighten_pvh.c by splitting off PVH related code from enlighten.c,
put it under CONFIG_XEN_PVH.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
| |
All code to support Xen PV will get under this new option. For the
beginning, check for it in the common code.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
have_vcpu_info_placement applies to both PV and HVM and as we're going
to split the code we need to make it global.
Rename to xen_have_vcpu_info_placement.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a preparation to splitting the code we need to untangle it:
x86_hyper_xen -> x86_hyper_xen_hvm and x86_hyper_xen_pv
xen_platform() -> xen_platform_hvm() and xen_platform_pv()
xen_cpu_up_prepare() -> xen_cpu_up_prepare_pv() and xen_cpu_up_prepare_hvm()
xen_cpu_dead() -> xen_cpu_dead_pv() and xen_cpu_dead_pv_hvm()
Add two parameters to xen_cpuhp_setup() to pass proper cpu_up_prepare and
cpu_dead hooks. xen_set_cpu_features() is now PV-only so the redundant
xen_pv_domain() check can be dropped.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
"The main x86 MM changes in this cycle were:
- continued native kernel PCID support preparation patches to the TLB
flushing code (Andy Lutomirski)
- various fixes related to 32-bit compat syscall returning address
over 4Gb in applications, launched from 64-bit binaries - motivated
by C/R frameworks such as Virtuozzo. (Dmitry Safonov)
- continued Intel 5-level paging enablement: in particular the
conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov)
- x86/mpx ABI corner case fixes/enhancements (Joerg Roedel)
- ... plus misc updates, fixes and cleanups"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
x86/mm: Fix flush_tlb_page() on Xen
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm/64: Fix crash in remove_pagetable()
Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
x86/boot/e820: Remove a redundant self assignment
x86/mm: Fix dump pagetables for 4 levels of page tables
x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"
x86/espfix: Add support for 5-level paging
x86/kasan: Extend KASAN to support 5-level paging
x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y
x86/paravirt: Add 5-level support to the paravirt code
x86/mm: Define virtual memory map for 5-level paging
x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert
x86/boot: Detect 5-level paging support
x86/mm/numa: Remove numa_nodemask_from_meminfo()
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up. native_flush_tlb_others() has the
fixup but Xen's version doesn't. Move the fixup to
flush_tlb_others().
AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs in
when flush_tlb_page() was called.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.
The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.
As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I was trying to figure out what how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.
Compile-tested only because I don't know whether software that uses
this mechanism even exists.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
remove_pagetable() does page walk using p*d_page_vaddr() plus cast.
It's not canonical approach -- we usually use p*d_offset() for that.
It works fine as long as all page table levels are present. We broke the
invariant by introducing folded p4d page table level.
As result, remove_pagetable() interprets PMD as PUD and it leads to
crash:
BUG: unable to handle kernel paging request at ffff880300000000
IP: memchr_inv+0x60/0x110
PGD 317d067
P4D 317d067
PUD 3180067
PMD 33f102067
PTE 8000000300000060
Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for
page walk.
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes: f2a6a7050109 ("x86: Convert the rest of the code to support p4d_t")
Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|