summaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* x86: Use pci_claim_resourceMatthew Wilcox2009-06-171-10/+7
| | | | | | | | Instead of open-coding pci_find_parent_resource and request_resource, just call pci_claim_resource. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'fixes' of ↵Linus Torvalds2009-06-176-189/+175
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpumask: new cpumask operators for arch/x86/kernel/cpu/cpufreq/powernow-k8.c [CPUFREQ] cpumask: avoid playing with cpus_allowed in powernow-k8.c [CPUFREQ] cpumask: avoid cpumask games in arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c [CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.c [CPUFREQ] powernow-k8: get drv data for correct CPU [CPUFREQ] powernow-k8: read P-state from HW [CPUFREQ] reduce scope of ACPI_PSS_BIOS_BUG_MSG[] [CPUFREQ] Clean up convoluted code in arch/x86/kernel/tsc.c:time_cpufreq_notifier() [CPUFREQ] minor correction to cpu-freq documentation [CPUFREQ] powernow-k8.c: mess cleanup [CPUFREQ] Only set sampling_rate_max deprecated, sampling_rate_min is useful [CPUFREQ] powernow-k8: Set transition latency to 1 if ACPI tables export 0 [CPUFREQ] ondemand: Uncouple minimal sampling rate from HZ in NO_HZ case
| * [CPUFREQ] cpumask: new cpumask operators for ↵Rusty Russell2009-06-152-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | arch/x86/kernel/cpu/cpufreq/powernow-k8.c Remove all old-style cpumask operators, and cpumask_t. Also: get rid of the unused define_siblings function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Tested-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] cpumask: avoid playing with cpus_allowed in powernow-k8.cRusty Russell2009-06-151-58/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cpumask: avoid playing with cpus_allowed in powernow-k8.c It's generally a very bad idea to mug some process's cpumask: it could legitimately and reasonably be changed by root, which could break us (if done before our code) or them (if we restore the wrong value). I did not replace powernowk8_target; it needs fixing, but it grabs a mutex (so no smp_call_function_single here) but Mark points out it can be called multiple times per second, so work_on_cpu is too heavy. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: cpufreq@vger.kernel.org Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Tested-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] cpumask: avoid cpumask games in ↵Rusty Russell2009-06-151-43/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c Impact: don't play with current's cpumask It's generally a very bad idea to mug some process's cpumask: it could legitimately and reasonably be changed by root, which could break us (if done before our code) or them (if we restore the wrong value). Use rdmsr_on_cpu and wrmsr_on_cpu instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: cpufreq@vger.kernel.org Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.cRusty Russell2009-06-152-39/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: don't play with current's cpumask It's generally a very bad idea to mug some process's cpumask: it could legitimately and reasonably be changed by root, which could break us (if done before our code) or them (if we restore the wrong value). We use smp_call_function_single: this had the advantage of being more efficient, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: cpufreq@vger.kernel.org Cc: Dominik Brodowski <linux@brodo.de> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] powernow-k8: get drv data for correct CPUNaga Chumbalkar2009-06-151-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make powernowk8_get() similar to powernowk8_target() and powernowk8_verify() in the way it obtains "powernow_data" for a given CPU. Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Langsdorf, Mark <mark.langsdorf@amd.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] powernow-k8: read P-state from HWNaga Chumbalkar2009-06-151-14/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By definition, "cpuinfo_cur_freq" should report the value from HW. So, don't depend on the cached value. Instead read P-state directly from HW, while taking into account the erratum 311 workaround for Fam 11h processors. Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Langsdorf, Mark <mark.langsdorf@amd.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] reduce scope of ACPI_PSS_BIOS_BUG_MSG[]Andrew Morton2009-06-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This symbol doesn't need file-global scope. Cc: "Zhang, Rui" <rui.zhang@intel.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Cc: Leo Milano <lmilano@gmx.net> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] Clean up convoluted code in ↵Dave Jones2009-06-151-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/x86/kernel/tsc.c:time_cpufreq_notifier() Christoph Hellwig noticed the following potential uninitialised use: > arch/x86/kernel/tsc.c: In function 'time_cpufreq_notifier': > arch/x86/kernel/tsc.c:634: warning: 'dummy' may be used uninitialized in this function > > where we do have CONFIG_SMP set, freq->flags & CPUFREQ_CONST_LOOPS is > true and ref_freq is false. It seems plausable, though the circumstances for hitting it are really low. Nearly all SMP capable cpufreq drivers set CPUFREQ_CONST_LOOPS. powernow-k8 is really the only exception. The older CPUs were typically only ever UP. (powernow-k7 never supported SMP for eg) It's worth fixing regardless, as it cleans up the code. Fix possible uninitialized use of dummy, by just removing it, and making the setting of lpj more obvious. Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] powernow-k8.c: mess cleanupLuis Henriques2009-06-151-7/+9
| | | | | | | | | | | | | | Mess cleanup in powernow_k8_acpi_pst_values() function. Signed-off-by: Luis Henriques <henrix@sapo.pt> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] powernow-k8: Set transition latency to 1 if ACPI tables export 0Thomas Renninger2009-06-151-0/+13
| | | | | | | | | | | | | | | | | | This doesn't fix anything, but it's expected that a transition latency of 0 could cause trouble in the future. Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Langsdorf, Mark <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
* | Merge branch 'akpm'Linus Torvalds2009-06-167-33/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * akpm: (182 commits) fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset fbdev: *bfin*: fix __dev{init,exit} markings fbdev: *bfin*: drop unnecessary calls to memset fbdev: bfin-t350mcqb-fb: drop unused local variables fbdev: blackfin has __raw I/O accessors, so use them in fb.h fbdev: s1d13xxxfb: add accelerated bitblt functions tcx: use standard fields for framebuffer physical address and length fbdev: add support for handoff from firmware to hw framebuffers intelfb: fix a bug when changing video timing fbdev: use framebuffer_release() for freeing fb_info structures radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb? s3c-fb: CPUFREQ frequency scaling support s3c-fb: fix resource releasing on error during probing carminefb: fix possible access beyond end of carmine_modedb[] acornfb: remove fb_mmap function mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF mb862xxfb: restrict compliation of platform driver to PPC Samsung SoC Framebuffer driver: add Alpha Channel support atmel-lcdc: fix pixclock upper bound detection offb: use framebuffer_alloc() to allocate fb_info struct ... Manually fix up conflicts due to kmemcheck in mm/slab.c
| * | kmap_types: make most arches use generic header fileRandy Dunlap2009-06-161-20/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert most arches to use asm-generic/kmap_types.h. Move the KM_FENCE_ macro additions into asm-generic/kmap_types.h, controlled by __WITH_KM_FENCE from each arch's kmap_types.h file. Would be nice to be able to add custom KM_types per arch, but I don't yet see a nice, clean way to do that. Built on x86_64, i386, mips, sparc, alpha(tonyb), powerpc(tonyb), and 68k(tonyb). Note: avr32 should be able to remove KM_PTE2 (since it's not used) and then just use the generic kmap_types.h file. Get avr32 maintainer approval. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: <linux-arch@vger.kernel.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Bryan Wu <cooloney@kernel.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: "Luck Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | use printk_once() in several placesMinchan Kim2009-06-161-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some places to be able to use printk_once instead of hard coding. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | page allocator: do not check NUMA node ID when the caller knows the node is ↵Mel Gorman2009-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | valid Callers of alloc_pages_node() can optionally specify -1 as a node to mean "allocate from the current node". However, a number of the callers in fast paths know for a fact their node is valid. To avoid a comparison and branch, this patch adds alloc_pages_exact_node() that only checks the nid with VM_BUG_ON(). Callers that know their node is valid are then converted. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paul Mundt <lethal@linux-sh.org> [for the SLOB NUMA bits] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm: consolidate init_mm definitionAlexey Dobriyan2009-06-161-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * create mm/init-mm.c, move init_mm there * remove INIT_MM, initialize init_mm with C99 initializer * unexport init_mm on all arches: init_mm is already unexported on x86. One strange place is some OMAP driver (drivers/video/omap/) which won't build modular, but it's already wants get_vm_area() export. Somebody should look there. [akpm@linux-foundation.org: add missing #includes] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Mike Frysinger <vapier.adi@gmail.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | time: move PIT_TICK_RATE to linux/timex.hArnd Bergmann2009-06-163-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PIT_TICK_RATE is currently defined in four architectures, but in three different places. While linux/timex.h is not the perfect place for it, it is still a reasonable replacement for those drivers that traditionally use asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency. Note that for Alpha, the actual value changed from 1193182UL to 1193180UL. This is unlikely to make a difference, and probably can only improve accuracy. There was a discussion on the correct value of CLOCK_TICK_RATE a few years ago, after which every existing instance was getting changed to 1193182. According to the specification, it should be 1193181.818181... Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-linus2' of ↵Linus Torvalds2009-06-1633-18/+1439
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits) signal: fix __send_signal() false positive kmemcheck warning fs: fix do_mount_root() false positive kmemcheck warning fs: introduce __getname_gfp() trace: annotate bitfields in struct ring_buffer_event net: annotate struct sock bitfield c2port: annotate bitfield for kmemcheck net: annotate inet_timewait_sock bitfields ieee1394/csr1212: fix false positive kmemcheck report ieee1394: annotate bitfield net: annotate bitfields in struct inet_sock net: use kmemcheck bitfields API for skbuff kmemcheck: introduce bitfield API kmemcheck: add opcode self-testing at boot x86: unify pte_hidden x86: make _PAGE_HIDDEN conditional kmemcheck: make kconfig accessible for other architectures kmemcheck: enable in the x86 Kconfig kmemcheck: add hooks for the page allocator kmemcheck: add hooks for page- and sg-dma-mappings kmemcheck: don't track page tables ...
| * \ \ Merge commit 'linus/master' into HEADVegard Nossum2009-06-1565-2108/+3775
| |\ \ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | Conflicts: MAINTAINERS Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: add opcode self-testing at bootVegard Nossum2009-06-154-17/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've had some troubles in the past with weird instructions. This patch adds a self-test framework which can be used to verify that a certain set of opcodes are decoded correctly. Of course, the opcodes which are not tested can still give the wrong results. In short, this is just a safeguard to catch unintentional changes in the opcode decoder. It does not mean that errors can't still occur! [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | x86: unify pte_hiddenJeremy Fitzhardinge2009-06-151-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unify and demacro pte_hidden. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | x86: make _PAGE_HIDDEN conditionalJeremy Fitzhardinge2009-06-151-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only _PAGE_HIDDEN when CONFIG_KMEMCHECK is defined, otherwise set it to 0. Allows later cleanups. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: make kconfig accessible for other architecturesPekka Enberg2009-06-152-88/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Kconfig options of kmemcheck are hidden under arch/x86 which makes porting to other architectures harder. To fix that, move the Kconfig bits to lib/Kconfig.kmemcheck and introduce a CONFIG_HAVE_ARCH_KMEMCHECK config option that architectures can define. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: enable in the x86 KconfigVegard Nossum2009-06-151-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | let it rip! Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
| * | | kmemcheck: add hooks for the page allocatorVegard Nossum2009-06-152-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for tracking the initializedness of memory that was allocated with the page allocator. Highmem requests are not tracked. Cc: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> [build fix for !CONFIG_KMEMCHECK] Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: add hooks for page- and sg-dma-mappingsVegard Nossum2009-06-151-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is needed for page allocator support to prevent false positives when accessing pages which are dma-mapped. [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: don't track page tablesVegard Nossum2009-06-153-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As these are allocated using the page allocator, we need to pass __GFP_NOTRACK before we add page allocator support to kmemcheck. Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: add DMA hooksVegard Nossum2009-06-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch hooks into the DMA API to prevent the reporting of the false positives that would otherwise be reported when memory is accessed that is also used directly by devices. [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: add mm functionsVegard Nossum2009-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With kmemcheck enabled, the slab allocator needs to do this: 1. Tell kmemcheck to allocate the shadow memory which stores the status of each byte in the allocation proper, e.g. whether it is initialized or uninitialized. 2. Tell kmemcheck which parts of memory that should be marked uninitialized. There are actually a few more states, such as "not yet allocated" and "recently freed". If a slab cache is set up using the SLAB_NOTRACK flag, it will never return memory that can take page faults because of kmemcheck. If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still request memory with the __GFP_NOTRACK flag. This does not prevent the page faults from occuring, however, but marks the object in question as being initialized so that no warnings will ever be produced for this object. In addition to (and in contrast to) __GFP_NOTRACK, the __GFP_NOTRACK_FALSE_POSITIVE flag indicates that the allocation should not be tracked _because_ it would produce a false positive. Their values are identical, but need not be so in the future (for example, we could now enable/disable false positives with a config option). Parts of this patch were contributed by Pekka Enberg but merged for atomicity. Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | x86: add hooks for kmemcheckVegard Nossum2009-06-158-5/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hooks that we modify are: - Page fault handler (to handle kmemcheck faults) - Debug exception handler (to hide pages after single-stepping the instruction that caused the page fault) Also redefine memset() to use the optimized version if kmemcheck is enabled. (Thanks to Pekka Enberg for minimizing the impact on the page fault handler.) As kmemcheck doesn't handle MMX/SSE instructions (yet), we also disable the optimized xor code, and rely instead on the generic C implementation in order to avoid false-positive warnings. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> [whitespace fixlet] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
| * | | kmemcheck: use kmemcheck_pte_lookup() instead of open-coding itPekka Enberg2009-06-151-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lets use kmemcheck_pte_lookup() in kmemcheck_fault() instead of open-coding it there. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: move 64-bit ifdef out of kmemcheck_opcode_decode()Pekka Enberg2009-06-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the CONFIG_X86_64 ifdef out of kmemcheck_opcode_decode() by introducing a version of the function that always returns false for CONFIG_X86_32. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: remove multiple ifdef'd definitions of the same global variablePekka Enberg2009-06-151-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple ifdef'd definitions of the same global variable is ugly and error-prone. Fix that up. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: make initialization message less confusingPekka Enberg2009-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "Bugs, beware!" printout during is cute but confuses users that something bad happened so change the text to the more boring "Initialized" message. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: remove forward declarations from error.cPekka Enberg2009-06-151-69/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reorders code in error.c so that we can get rid of the forward declarations. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
| * | | kmemcheck: include module.h to prevent warningsRandy Dunlap2009-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kmemcheck/shadow.c needs to include <linux/module.h> to prevent the following warnings: linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : data definition has no type or storage class linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' linux-next-20080724/arch/x86/mm/kmemcheck/shadow.c:64: warning : parameter names (without types) in function declaration Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: vegardno@ifi.uio.no Cc: penberg@cs.helsinki.fi Cc: akpm <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | kmemcheck: add the kmemcheck coreVegard Nossum2009-06-1315-2/+1266
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | General description: kmemcheck is a patch to the linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log. Thanks to Andi Kleen for the set_memory_4k() solution. Andrew Morton suggested documenting the shadow member of struct page. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> [export kmemcheck_mark_initialized] [build fix for setup_max_cpus] Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
| * | | x86: add save_stack_trace_bp() for tracing from a specific stack frameVegard Nossum2009-06-121-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will help kmemcheck (and possibly other debugging tools) since we can now simply pass regs->bp to the stack tracer instead of specifying the number of stack frames to skip, which is unreliable if gcc decides to inline functions, etc. Note that this makes the API incomplete for other architectures, but I expect that those can be updated lazily, e.g. when they need it. Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
* | | | Driver Core: x86: add nodename for cpuid and msr drivers.Kay Sievers2009-06-152-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support to the x86 cpuid and msr drivers to report the proper device name to userspace for their devices. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | | Driver Core: misc: add nodename support for misc devices.Kay Sievers2009-06-151-0/+1
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for misc devices to report their requested nodename to userspace. It also updates a number of misc drivers to provide the needed subdirectory and device name to be used for them. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | Merge branch 'timers-for-linus-migration' of ↵Linus Torvalds2009-06-151-1/+1
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: Logic to move non pinned timers timers: /proc/sys sysctl hook to enable timer migration timers: Identifying the existing pinned timers timers: Framework for identifying pinned timers timers: allow deferrable timers for intervals tv2-tv5 to be deferred Fix up conflicts in kernel/sched.c and kernel/timer.c manually
| * | timers: Identifying the existing pinned timersArun R Bharadwaj2009-05-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]: The following pinned hrtimers have been identified and marked: 1)sched_rt_period_timer 2)tick_sched_timer 3)stack_trace_timer_fn [ tglx: fixup the hrtimer pinned mode ] Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | Merge branch 'x86-mce-for-linus' of ↵Linus Torvalds2009-06-1334-1660/+2908
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (80 commits) x86, mce: Add boot options for corrected errors x86, mce: Fix mce printing x86, mce: fix for mce counters x86, mce: support action-optional machine checks x86, mce: define MCE_VECTOR x86, mce: rename mce_notify_user to mce_notify_irq x86: fix panic with interrupts off (needed for MCE) x86, mce: export MCE severities coverage via debugfs x86, mce: implement new status bits x86, mce: print header/footer only once for multiple MCEs x86, mce: default to panic timeout for machine checks x86, mce: improve mce_get_rip x86, mce: make non Monarch panic message "Fatal machine check" too x86, mce: switch x86 machine check handler to Monarch election. x86, mce: implement panic synchronization x86, mce: implement bootstrapping for machine check wakeups x86, mce: check early in exception handler if panic is needed x86, mce: add table driven machine check grading x86, mce: remove TSC print heuristic x86, mce: log corrected errors when panicing ...
| * \ \ Merge branch 'linus' into x86/mce3Ingo Molnar2009-06-11205-4741/+8485
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/cpu/mcheck/mce_64.c arch/x86/kernel/irq.c Merge reason: Resolve the conflicts above. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86, mce: Add boot options for corrected errorsHidetoshi Seto2009-06-113-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces three boot options (no_cmci, dont_log_ce and ignore_ce) to control handling for corrected errors. The "mce=no_cmci" boot option disables the CMCI feature. Since CMCI is a new feature so having boot controls to disable it will be a help if the hardware is misbehaving. The "mce=dont_log_ce" boot option disables logging for corrected errors. All reported corrected errors will be cleared silently. This option will be useful if you never care about corrected errors. The "mce=ignore_ce" boot option disables features for corrected errors, i.e. polling timer and cmci. All corrected events are not cleared and kept in bank MSRs. Usually this disablement is not recommended, however it will be a help if there are some conflict with the BIOS or hardware monitoring applications etc., that clears corrected events in banks instead of OS. [ And trivial cleanup (space -> tab) for doc is included. ] Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86, mce: Fix mce printingHidetoshi Seto2009-06-111-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch: - Adds print_mce_head() instead of first flag - Makes the header to be printed always - Stops double printing of corrected errors [ This portion originates from Huang Ying's patch ] Originally-From: Huang Ying <ying.huang@intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> LKML-Reference: <4A30AC83.5010708@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86, mce: fix for mce countersHidetoshi Seto2009-06-031-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the MCE counters work on 32bit and add poll count in arch_irq_stat_cpu. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | x86, mce: support action-optional machine checksAndi Kleen2009-06-033-1/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Newer Intel CPUs support a new class of machine checks called recoverable action optional. Action Optional means that the CPU detected some form of corruption in the background and tells the OS about using a machine check exception. The OS can then take appropiate action, like killing the process with the corrupted data or logging the event properly to disk. This is done by the new generic high level memory failure handler added in a earlier patch. The high level handler takes the address with the failed memory and does the appropiate action, like killing the process. In this version of the patch the high level handler is stubbed out with a weak function to not create a direct dependency on the hwpoison branch. The high level handler cannot be directly called from the machine check exception though, because it has to run in a defined process context to be able to sleep when taking VM locks (it is not expected to sleep for a long time, just do so in some exceptional cases like lock contention) Thus the MCE handler has to queue a work item for process context, trigger process context and then call the high level handler from there. This patch adds two path to process context: through a per thread kernel exit notify_user() callback or through a high priority work item. The first runs when the process exits back to user space, the other when it goes to sleep and there is no higher priority process. The machine check handler will schedule both, and whoever runs first will grab the event. This is done because quick reaction to this event is critical to avoid a potential more fatal machine check when the corruption is consumed. There is a simple lock less ring buffer to queue the corrupted addresses between the exception handler and the process context handler. Then in process context it just calls the high level VM code with the corrupted PFNs. The code adds the required code to extract the failed address from the CPU's machine check registers. It doesn't try to handle all possible cases -- the specification has 6 different ways to specify memory address -- but only the linear address. Most of the required checking has been already done earlier in the mce_severity rule checking engine. Following the Intel recommendations Action Optional errors are only enabled for known situations (encoded in MCACODs). The errors are ignored otherwise, because they are action optional. v2: Improve comment, disable preemption while processing ring buffer (reported by Ying Huang) Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | x86, mce: define MCE_VECTORAndi Kleen2009-06-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add MCE_VECTOR for the #MC exception. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>