summaryrefslogtreecommitdiffstats
path: root/arch/x86/xen/smp.c
Commit message (Collapse)AuthorAgeFilesLines
* xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.Konrad Rzeszutek Wilk2012-02-031-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a user offlines a VCPU and then onlines it, we get: NMI watchdog disabled (cpu2): hardware events not enabled BUG: scheduling while atomic: swapper/2/0/0x00000002 Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod libcrc32c crc32c radeon fbco ttm bitblit softcursor drm_kms_helper xen_blkfront xen_netfront xen_fbfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront xenfs [last unloaded: Pid: 0, comm: swapper/2 Tainted: G O 3.2.0phase15.1-00003-gd6f7f5b-dirty #4 Call Trace: [<ffffffff81070571>] __schedule_bug+0x61/0x70 [<ffffffff8158eb78>] __schedule+0x798/0x850 [<ffffffff8158ed6a>] schedule+0x3a/0x50 [<ffffffff810349be>] cpu_idle+0xbe/0xe0 [<ffffffff81583599>] cpu_bringup_and_idle+0xe/0x10 The reason for this should be obvious from this call-chain: cpu_bringup_and_idle: \- cpu_bringup | \-[preempt_disable] | |- cpu_idle \- play_dead [assuming the user offlined the VCPU] | \ | +- (xen_play_dead) | \- HYPERVISOR_VCPU_off [so VCPU is dead, once user | | onlines it starts from here] | \- cpu_bringup [preempt_disable] | +- preempt_enable_no_reschedule() +- schedule() \- preempt_enable() So we have two preempt_disble() and one preempt_enable(). Calling preempt_enable() after the cpu_bringup() in the xen_play_dead fixes the imbalance. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: disable PV spinlocks on HVMStefano Stabellini2011-09-081-1/+0
| | | | | | | | | | | | | PV spinlocks cannot possibly work with the current code because they are enabled after pvops patching has already been done, and because PV spinlocks use a different data structure than native spinlocks so we cannot switch between them dynamically. A spinlock that has been taken once by the native code (__ticket_spin_lock) cannot be taken by __xen_spin_lock even after it has been released. Reported-and-Tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.Konrad Rzeszutek Wilk2011-09-011-0/+10
| | | | | | | | | | | | | | | We have hit a couple of customer bugs where they would like to use those parameters to run an UP kernel - but both of those options turn of important sources of interrupt information so we end up not being able to boot. The correct way is to pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and the kernel will patch itself to be a UP kernel. Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308 CC: stable@kernel.org Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: Do not enable PV IPIs when vector callback not presentStefano Stabellini2011-08-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Fix regression for HVM case on older (<4.1.1) hypervisors caused by commit 99bbb3a84a99cd04ab16b998b20f01a72cfa9f4f Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Date: Thu Dec 2 17:55:10 2010 +0000 xen: PV on HVM: support PV spinlocks and IPIs This change replaced the SMP operations with event based handlers without taking into account that this only works when the hypervisor supports callback vectors. This causes unexplainable hangs early on boot for HVM guests with more than one CPU. BugLink: http://bugs.launchpad.net/bugs/791850 CC: stable@kernel.org Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-and-Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: support CONFIG_MAXSMPAndrew Jones2011-06-151-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | The MAXSMP config option requires CPUMASK_OFFSTACK, which in turn requires we init the memory for the maps while we bring up the cpus. MAXSMP also increases NR_CPUS to 4096. This increase in size exposed an issue in the argument construction for multicalls from xen_flush_tlb_others. The args should only need space for the actual number of cpus. Also in 2.6.39 it exposes a bootup problem. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8157a1d3>] set_cpu_sibling_map+0x123/0x30d ... Call Trace: [<ffffffff81039a3f>] ? xen_restore_fl_direct_reloc+0x4/0x4 [<ffffffff819dc4db>] xen_smp_prepare_cpus+0x36/0x135 .. CC: stable@kernel.org Signed-off-by: Andrew Jones <drjones@redhat.com> [v2: Updated to compile on 3.0] [v3: Updated to compile when CONFIG_SMP is not defined] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
*-. Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of ↵Linus Torvalds2011-05-191-3/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set_*_buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug
| * | sched: Provide scheduler_ipi() callback in response to smp_send_reschedule()Peter Zijlstra2011-04-141-3/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For future rework of try_to_wake_up() we'd like to push part of that function onto the CPU the task is actually going to run on. In order to do so we need a generic callback from the existing scheduler IPI. This patch introduces such a generic callback: scheduler_ipi() and implements it as a NOP. BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions! Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Reviewed-by: Frank Rowand <frank.rowand@am.sony.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
* / arch/x86/xen/smp: Cleanup code/data sections definitionsDaniel Kiper2011-05-191-4/+4
|/ | | | | | | | Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper <dkiper@net-space.pl> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: PV on HVM: support PV spinlocks and IPIsStefano Stabellini2011-02-251-0/+38
| | | | | | | | | | | Initialize PV spinlocks on boot CPU right after native_smp_prepare_cpus (that switch to APIC mode and initialize APIC routing); on secondary CPUs on CPU_UP_PREPARE. Enable the usage of event channels to send and receive IPIs when running as a PV on HVM guest. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* Merge branch 'stable/xen-pcifront-0.8.2' of ↵Linus Torvalds2010-10-281-0/+26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: register xen pci notifier xen: initialize cpu masks for pv guests in xen_smp_init xen: add a missing #include to arch/x86/pci/xen.c xen: mask the MTRR feature from the cpuid xen: make hvc_xen console work for dom0. xen: add the direct mapping area for ISA bus access xen: Initialize xenbus for dom0. xen: use vcpu_ops to setup cpu masks xen: map a dummy page for local apic and ioapic in xen_set_fixmap xen: remap MSIs into pirqs when running as initial domain xen: remap GSIs as pirqs when running as initial domain xen: introduce XEN_DOM0 as a silent option xen: map MSIs into pirqs xen: support GSI -> pirq remapping in PV on HVM guests xen: add xen hvm acpi_register_gsi variant acpi: use indirect call to register gsi in different modes xen: implement xen_hvm_register_pirq xen: get the maximum number of pirqs from xen xen: support pirq != irq * 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits) X86/PCI: Remove the dependency on isapnp_disable. xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright. x86: xen: Sanitse irq handling (part two) swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it. MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer. xen/pci: Request ACS when Xen-SWIOTLB is activated. xen-pcifront: Xen PCI frontend driver. xenbus: prevent warnings on unhandled enumeration values xenbus: Xen paravirtualised PCI hotplug support. xen/x86/PCI: Add support for the Xen PCI subsystem x86: Introduce x86_msi_ops msi: Introduce default_[teardown|setup]_msi_irqs with fallback. x86/PCI: Export pci_walk_bus function. x86/PCI: make sure _PAGE_IOMAP it set on pci mappings x86/PCI: Clean up pci_cache_line_size xen: fix shared irq device passthrough xen: Provide a variant of xen_poll_irq with timeout. xen: Find an unbound irq number in reverse order (high to low). xen: statically initialize cpu_evtchn_mask_p ... Fix up trivial conflicts in drivers/pci/Makefile
| * xen: initialize cpu masks for pv guests in xen_smp_initStefano Stabellini2010-10-261-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Pv guests don't have ACPI and need the cpu masks to be set correctly as early as possible so we call xen_fill_possible_map from xen_smp_init. On the other hand the initial domain supports ACPI so in this case we skip xen_fill_possible_map and rely on it. However Xen might limit the number of cpus usable by the domain, so we filter those masks during smp initialization using the VCPUOP_is_up hypercall. It is important that the filtering is done before xen_setup_vcpu_info_placement. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| * xen: use vcpu_ops to setup cpu masksStefano Stabellini2010-10-221-1/+7
| | | | | | | | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | x86, kexec: Make sure to stop all CPUs before exiting the kernelAlok Kataria2010-10-211-3/+3
|/ | | | | | | | | | | | | | | | | | | | x86 smp_ops now has a new op, stop_other_cpus which takes a parameter "wait" this allows the caller to specify if it wants to stop until all the cpus have processed the stop IPI. This is required specifically for the kexec case where we should wait for all the cpus to be stopped before starting the new kernel. We now wait for the cpus to stop in all cases except for panic/kdump where we expect things to be broken and we are doing our best to make things work anyway. This patch fixes a legitimate regression, which was introduced during 2.6.30, by commit id 4ef702c10b5df18ab04921fc252c26421d4d6c75. Signed-off-by: Alok N Kataria <akataria@vmware.com> LKML-Reference: <1286833028.1372.20.camel@ank32.eng.vmware.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: <stable@kernel.org> v2.6.30-36 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* xen/panic: use xen_reboot and fix smp_send_stopIan Campbell2010-08-041-0/+2
| | | | | | | Offline vcpu when using stop_self. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* xen: Fix misspelled CONFIG variable in comment.Robert P. J. Day2010-02-051-1/+1
| | | | | Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-141-20/+21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
| * percpu: make percpu symbols in xen uniqueTejun Heo2009-10-291-20/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates percpu related symbols in xen such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/x86/xen/smp.c, arch/x86/xen/time.c, arch/ia64/xen/irq_xen.c: add xen_ prefix to percpu variables * arch/ia64/xen/time.c: add xen_ prefix to percpu variables, drop processed_ prefix and make them static Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org>
* | Merge branch 'bugfix' of ↵Linus Torvalds2009-12-101-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: try harder to balloon up under memory pressure. Xen balloon: fix totalram_pages counting. xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region. xen: improve error handling in do_suspend. xen: don't leak IRQs over suspend/resume. xen: call clock resume notifier on all CPUs xen: use iret for return from 64b kernel to 32b usermode xen: don't call dpm_resume_noirq() with interrupts disabled. xen: register runstate info for boot CPU early xen: register runstate on secondary CPUs xen: register timer interrupt with IRQF_TIMER xen: correctly restore pfn_to_mfn_list_list after resume xen: restore runstate_info even if !have_vcpu_info_placement xen: re-register runstate area earlier on resume. xen: wait up to 5 minutes for device connetion xen: improvement to wait_for_devices() xen: fix is_disconnected_device/exists_disconnected_device xen/xenbus: make DEVICE_ATTR()s static
| * | xen: register runstate on secondary CPUsIan Campbell2009-12-031-0/+1
| |/ | | | | | | | | | | | | | | | | | | The commit "xen: re-register runstate area earlier on resume" caused us to never try and setup the runstate area for secondary CPUs. Ensure that we do this... Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org>
* / cpumask: Use modern cpumask style in XenRusty Russell2009-11-041-1/+1
|/ | | | | | | Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> LKML-Reference: <200911031458.38406.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* xen: make -fstack-protector work under XenJeremy Fitzhardinge2009-09-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | -fstack-protector uses a special per-cpu "stack canary" value. gcc generates special code in each function to test the canary to make sure that the function's stack hasn't been overrun. On x86-64, this is simply an offset of %gs, which is the usual per-cpu base segment register, so setting it up simply requires loading %gs's base as normal. On i386, the stack protector segment is %gs (rather than the usual kernel percpu %fs segment register). This requires setting up the full kernel GDT and then loading %gs accordingly. We also need to make sure %gs is initialized when bringing up secondary cpus too. To keep things consistent, we do the full GDT/segment register setup on both architectures. Because we need to avoid -fstack-protected code before setting up the GDT and because there's no way to disable it on a per-function basis, several files need to have stack-protector inhibited. [ Impact: allow Xen booting with stack-protector enabled ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* NULL noise: arch/x86/xen/smp.cHannes Eder2009-04-081-2/+2
| | | | | | | | | Fix this sparse warnings: arch/x86/xen/smp.c:316:52: warning: Using plain integer as NULL pointer arch/x86/xen/smp.c:421:60: warning: Using plain integer as NULL pointer Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* cpumask: use new cpumask functions throughout x86Rusty Russell2009-03-131-3/+3
| | | | | | | | | | Impact: cleanup 1) &cpu_online_map -> cpu_online_mask 2) first_cpu/next_cpu_nr -> cpumask_first/cpumask_next 3) cpu_*_map manipulation -> init_cpu_* / set_cpu_* Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* xen: deal with virtually mapped percpu dataJeremy Fitzhardinge2009-03-021-1/+5
| | | | | | | | | | | | | | | | | | | | The virtually mapped percpu space causes us two problems: - for hypercalls which take an mfn, we need to do a full pagetable walk to convert the percpu va into an mfn, and - when a hypercall requires a page to be mapped RO via all its aliases, we need to make sure its RO in both the percpu mapping and in the linear mapping This primarily affects the gdt and the vcpu info structure. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Xen-devel <xen-devel@lists.xensource.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* xen: setup percpu data pointersJeremy Fitzhardinge2009-02-041-2/+4
| | | | | | | | | | | | | | We need to access percpu data fairly early, so set up the percpu registers as soon as possible. We only need to load the appropriate segment register. We already have a GDT, but its hard to change it early because we need to manipulate the pagetable to do so, and that hasn't been set up yet. Also, set the kernel stack when bringing up secondary CPUs. If we don't they all end up sharing the same stack... Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* xen: setup percpu data pointersJeremy Fitzhardinge2009-01-311-1/+4
| | | | | | | | | | | | | Impact: fix xen booting We need to access percpu data fairly early, so set up the percpu registers as soon as possible. We only need to load the appropriate segment register. We already have a GDT, but its hard to change it early because we need to manipulate the pagetable to do so, and that hasn't been set up yet. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86: initialize per-cpu GDT segment in per-cpu setupBrian Gerst2009-01-271-1/+0
| | | | | | | | | | Impact: cleanup Rename init_gdt() to setup_percpu_segment(), and move it to setup_percpu.c. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.Brian Gerst2009-01-191-2/+1
| | | | | Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.Brian Gerst2009-01-191-15/+3
| | | | | Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* percpu: add optimized generic percpu accessorsIngo Molnar2009-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is an optimization and a cleanup, and adds the following new generic percpu methods: percpu_read() percpu_write() percpu_add() percpu_sub() percpu_and() percpu_or() percpu_xor() and implements support for them on x86. (other architectures will fall back to a default implementation) The advantage is that for example to read a local percpu variable, instead of this sequence: return __get_cpu_var(var); ffffffff8102ca2b: 48 8b 14 fd 80 09 74 mov -0x7e8bf680(,%rdi,8),%rdx ffffffff8102ca32: 81 ffffffff8102ca33: 48 c7 c0 d8 59 00 00 mov $0x59d8,%rax ffffffff8102ca3a: 48 8b 04 10 mov (%rax,%rdx,1),%rax We can get a single instruction by using the optimized variants: return percpu_read(var); ffffffff8102ca3f: 65 48 8b 05 91 8f fd mov %gs:0x7efd8f91(%rip),%rax I also cleaned up the x86-specific APIs and made the x86 code use these new generic percpu primitives. tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out * added percpu_and() for completeness's sake * made generic percpu ops atomic against preemption Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86: fold pda into percpu area on SMPTejun Heo2009-01-161-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Based on original patch from Christoph Lameter and Mike Travis. ] Currently pdas and percpu areas are allocated separately. %gs points to local pda and percpu area can be reached using pda->data_offset. This patch folds pda into percpu area. Due to strange gcc requirement, pda needs to be at the beginning of the percpu area so that pda->stack_canary is at %gs:40. To achieve this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is added and used to reserve pda sized chunk at the start of the percpu area. After this change, for boot cpu, %gs first points to pda in the data.init area and later during setup_per_cpu_areas() gets updated to point to the actual pda. This means that setup_per_cpu_areas() need to reload %gs for CPU0 while clearing pda area for other cpus as cpu0 already has modified it when control reaches setup_per_cpu_areas(). This patch also removes now unnecessary get_local_pda() and its call sites. A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into per cpu area" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: cosmetic changes apic-related files.Mike Travis2008-12-161-5/+6
| | | | | | | | This patch simply changes cpumask_t to struct cpumask and similar trivial modernizations. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
* xen: convert to cpumask_var_t and new cpumask primitives.Mike Travis2008-12-161-3/+6
| | | | | | | | Simple change, and eventual space saving when NR_CPUS >> nr_cpu_ids. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
* x86 smp: modify send_IPI_mask interface to accept cpumask_t pointersMike Travis2008-12-161-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup, change parameter passing * Change genapic interfaces to accept cpumask_t pointers where possible. * Modify external callers to use cpumask_t pointers in function calls. * Create new send_IPI_mask_allbutself which is the same as the send_IPI_mask functions but removes smp_processor_id() from list. This removes another common need for a temporary cpumask_t variable. * Functions that used a temp cpumask_t variable for: cpumask_t allbutme = cpu_online_map; cpu_clear(smp_processor_id(), allbutme); if (!cpus_empty(allbutme)) ... become: if (!cpus_equal(cpu_online_map, cpumask_of_cpu(cpu))) ... * Other minor code optimizations (like using cpus_clear instead of CPU_MASK_NONE, etc.) Applies to linux-2.6.tip/master. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu>
* xen_play_dead() is __cpuinitAl Viro2008-11-301-1/+1
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xen: make CPU hotplug functions staticAlex Nixon2008-09-081-6/+6
| | | | | | | | There's no need for these functions to be accessed from outside of xen/smp.c Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, xen: fix build when !CONFIG_HOTPLUG_CPUAlex Nixon2008-09-081-0/+18
| | | | | | Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* xen: implement CPU hotpluggingAlex Nixon2008-08-251-12/+48
| | | | | | | | | | | | | | | | | Note the changes from 2.6.18-xen CPU hotplugging: A vcpu_down request from the remote admin via Xenbus both hotunplugs the CPU, and disables it by removing it from the cpu_present map, and removing its entry in /sys. A vcpu_up request from the remote admin only re-enables the CPU, and does not immediately bring the CPU up. A udev event is emitted, which can be caught by the user if he wishes to automatically re-up CPUs when available, or implement a more complex policy. Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: split spinlock implementations out into their own filesJeremy Fitzhardinge2008-07-241-167/+0
| | | | | | | | | | | | | ftrace requires certain low-level code, like spinlocks and timestamps, to be compiled without -pg in order to avoid infinite recursion. This patch splits out the core paravirt spinlocks and the Xen spinlocks into separate files which can be compiled without -pg. Also do xen/time.c while we're about it. As a result, we can now use ftrace within a Xen domain. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'cpus4096-for-linus' of ↵Linus Torvalds2008-07-231-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
| * Merge branch 'linus' into cpus4096Ingo Molnar2008-07-161-89/+54
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/xen/smp.c kernel/sched_rt.c net/iucv/iucv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: use performance variant for_each_cpu_mask_nrMike Travis2008-05-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> commit 2d474871e2fb092eb46a0930aba5442e10eb96cc Author: Mike Travis <travis@sgi.com> Date: Mon May 12 21:21:13 2008 +0200
* | | xen: implement Xen-specific spinlocksJeremy Fitzhardinge2008-07-161-1/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The standard ticket spinlocks are very expensive in a virtual environment, because their performance depends on Xen's scheduler giving vcpus time in the order that they're supposed to take the spinlock. This implements a Xen-specific spinlock, which should be much more efficient. The fast-path is essentially the old Linux-x86 locks, using a single lock byte. The locker decrements the byte; if the result is 0, then they have the lock. If the lock is negative, then locker must spin until the lock is positive again. When there's contention, the locker spin for 2^16[*] iterations waiting to get the lock. If it fails to get the lock in that time, it adds itself to the contention count in the lock and blocks on a per-cpu event channel. When unlocking the spinlock, the locker looks to see if there's anyone blocked waiting for the lock by checking for a non-zero waiter count. If there's a waiter, it traverses the per-cpu "lock_spinners" variable, which contains which lock each CPU is waiting on. It picks one CPU waiting on the lock and sends it an event to wake it up. This allows efficient fast-path spinlock operation, while allowing spinning vcpus to give up their processor time while waiting for a contended lock. [*] 2^16 iterations is threshold at which 98% locks have been taken according to Thomas Friebel's Xen Summit talk "Preventing Guests from Spinning Around". Therefore, we'd expect the lock and unlock slow paths will only be entered 2% of the time. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <clameter@linux-foundation.org> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Virtualization <virtualization@lists.linux-foundation.org> Cc: Xen devel <xen-devel@lists.xensource.com> Cc: Thomas Friebel <thomas.friebel@amd.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | xen: use lock-byte spinlock implementationJeremy Fitzhardinge2008-07-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to using the lock-byte spinlock implementation, to avoid the worst of the performance hit from ticket locks. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <clameter@linux-foundation.org> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Virtualization <virtualization@lists.linux-foundation.org> Cc: Xen devel <xen-devel@lists.xensource.com> Cc: Thomas Friebel <thomas.friebel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | xen64: set up syscall and sysenter entrypoints for 64-bitJeremy Fitzhardinge2008-07-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We set up entrypoints for syscall and sysenter. sysenter is only used for 32-bit compat processes, whereas syscall can be used in by both 32 and 64-bit processes. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | xen: set num_processorsJeremy Fitzhardinge2008-07-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Someone's got to do it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | xen64: smp.c compile hackingJeremy Fitzhardinge2008-07-161-41/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A number of random changes to make xen/smp.c compile in 64-bit mode. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>a Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | xen: move smp setup into smp.cJeremy Fitzhardinge2008-07-161-8/+26
| |/ |/| | | | | | | | | | | | | | | | | | | Move all the smp_ops setup into smp.c, allowing a lot of things to become static. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'generic-ipi' into generic-ipi-for-linusIngo Molnar2008-07-151-88/+47
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>