summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2007-03-062-11/+11
|\ | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Fix floppy build failure.
| * [SPARC64]: Fix floppy build failure.David S. Miller2007-03-052-11/+11
| | | | | | | | | | | | | | | | Just define a local {claim,release}_dma_lock() implementation for the floppy driver to use so we don't need to define and export to modules the silly dma_spin_lock. Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2007-03-065-4/+6
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [DCCP]: Set RTO for newly created child socket [DCCP]: Correctly split CCID half connections [NET]: Fix compat_sock_common_getsockopt typo. [NET]: Revert incorrect accept queue backlog changes. [INET]: twcal_jiffie should be unsigned long, not int [GIANFAR]: Fix compile error in latest git [PPPOE]: Use ifindex instead of device pointer in key lookups. [NETFILTER]: ip6_route_me_harder should take into account mark [NETFILTER]: nfnetlink_log: fix reference counting [NETFILTER]: nfnetlink_log: fix module reference counting [NETFILTER]: nfnetlink_log: fix possible NULL pointer dereference [NETFILTER]: nfnetlink_log: fix NULL pointer dereference [NETFILTER]: nfnetlink_log: fix use after free [NETFILTER]: nfnetlink_log: fix reference leak [NETFILTER]: tcp conntrack: accept SYN|URG as valid [NETFILTER]: nf_conntrack/nf_nat: fix incorrect config ifdefs [NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops
| * | [NET]: Revert incorrect accept queue backlog changes.David S. Miller2007-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts two changes: 8488df894d05d6fa41c2bd298c335f944bb0e401 248f06726e866942b3d8ca8f411f9067713b7ff8 A backlog value of N really does mean allow "N + 1" connections to queue to a listening socket. This allows one to specify "0" as the backlog and still get 1 connection. Noticed by Gerrit Renker and Rick Jones. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [INET]: twcal_jiffie should be unsigned long, not intEric Dumazet2007-03-051-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [PPPOE]: Use ifindex instead of device pointer in key lookups.Florian Zumbiehl2007-03-051-0/+2
| | | | | | | | | | | | | | | | | | | | | Otherwise we can potentially try to dereference a NULL device pointer in some cases. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | [NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loopsPatrick McHardy2007-03-052-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling: - unconfirmed entries can not be killed manually, they are removed on confirmation or final destruction of the conntrack entry, which means we might iterate forever without making forward progress. This can happen in combination with the conntrack event cache, which holds a reference to the conntrack entry, which is only released when the packet makes it all the way through the stack or a different packet is handled. - taking references to an unconfirmed entry and using it outside the locked section doesn't work, the list entries are not refcounted and another CPU might already be waiting to destroy the entry What the code really wants to do is make sure the references of the hash table to the selected conntrack entries are released, so they will be destroyed once all references from skbs and the event cache are dropped. Since unconfirmed entries haven't even entered the hash yet, simply mark them as dying and skip confirmation based on that. Reported and tested by Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds2007-03-062-1/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] cio: Call cancel_halt_clear even when actl == 0. [S390] cio: Use path verification to check for path state. [S390] cio: Fix locking when calling notify function. [S390] Fixed handling of access register mode faults. [S390] dasd: Use default recovery for SNSS requests [S390] check_bugs() should be inline. [S390] tape: Compression overwrites crypto setting [S390] nss: disable kexec. [S390] reipl: move dump_prefix_page out of text section. [S390] smp: disable preemption in smp_call_function/smp_call_function_on [S390] kprobes breaks BUG_ON
| * | [S390] check_bugs() should be inline.Heiko Carstens2007-03-051-1/+1
| | | | | | | | | | | | | | | | | | | | | Don't have functions in header files unless they are inline. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | [S390] reipl: move dump_prefix_page out of text section.Heiko Carstens2007-03-051-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | Reipl doesn't work on older machines were s390_reset_machine() gets called. The reason is that the text section is read-only but the variable dump_prefix_page is there. Since s390_reset_machine() writes to it we get a protection exception. Therefore move dump_prefix_page to the bss section. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | Revert "[PATCH] LOG2: Alter get_order() so that it can make use of ilog2() ↵Linus Torvalds2007-03-061-34/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on a constant" This reverts commit 39d61db0edb34d60b83c5e0d62d0e906578cc707. The commit was buggy in multiple ways: - the conversion to ilog2() was incorrect to begin with - it tested the wrong #defines, so on all architectures but FRV you'd never see the bug except for constant arguments. - the new "get_order()" macro used its arguments multiple times, and didn't even parenthesize them properly - despite the comments, it was not true that you could use it for constant initializers, since not all architectures even use the generic page.h header file. All of the problems are individually fixable, but it all boils down to: better just revert it, and re-do it from scratch. Cc: David Howells <dhowells@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] m68knommu: GPIO line defines for the ColdFire 5282Greg Ungerer2007-03-061-0/+3
| | | | | | | | | | | | | | | | | | For the Freescale M5282 ColdFire, Port UA Pin Assignment Register should set to UART mode. Patch submitted by David Wu <davidwu@arcturusnetworks.com>. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'release' of ↵Linus Torvalds2007-03-064-12/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] kexec: Use EFI_LOADER_DATA for ELF core header [IA64] permon use-after-free fix [IA64] sync compat getdents [IA64] always build arch/ia64/lib/xor.o [IA64] Remove stack hard limit on ia64 [IA64] point saved_max_pfn to the max_pfn of the entire system Revert "[IA64] swiotlb abstraction (e.g. for Xen)"
| * | [IA64] kexec: Use EFI_LOADER_DATA for ELF core headerMagnus Damm2007-03-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The address where the ELF core header is stored is passed to the secondary kernel as a kernel command line option. The memory area for this header is also marked as a separate EFI memory descriptor on ia64. The separate EFI memory descriptor is at the moment of the type EFI_UNUSABLE_MEMORY. With such a type the secondary kernel skips over the entire memory granule (config option, 16M or 64M) when detecting memory. If we are lucky we will just lose some memory, but if we happen to have data in the same granule (such as an initramfs image), then this data will never get mapped and the kernel bombs out when trying to access it. So this is an attempt to fix this by changing the EFI memory descriptor type into EFI_LOADER_DATA. This type is the same type used for the kernel data and for initramfs. In the secondary kernel we then handle the ELF core header data the same way as we handle the initramfs image. This patch contains the kernel changes to make this happen. Pretty straightforward, we reserve the area in reserve_memory(). The address for the area comes from the kernel command line and the size comes from the specialized EFI parsing function vmcore_find_descriptor_size(). The kexec-tools-testing code for this can be found here: http://lists.osdl.org/pipermail/fastboot/2007-February/005983.html Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Cc: Simon Horman <horms@verge.net.au> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | [IA64] Remove stack hard limit on ia64schwab@suse.de2007-03-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Un-Breaks pthreads, since Oct 2003. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | Revert "[IA64] swiotlb abstraction (e.g. for Xen)"Tony Luck2007-03-062-10/+0
| |/ | | | | | | This reverts commit 51099005ab8e09d68a13fea8d55bc739c1040ca6.
* | Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds2007-03-067-31/+56
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] IP27: Build fix [MIPS] Wire up ioprio_set and ioprio_get. [MIPS] Fix __raw_read_trylock() to allow multiple readers [MIPS] Export __copy_user_inatomic. [MIPS] R2 bitops compile fix for gcc < 4.0. [MIPS] TX39: Remove redundant tx39_blast_icache() calls [MIPS] Cobalt: Fix early printk [MIPS] SMTC: De-obscure Malta hooks. [MIPS] SMTC: Add fordward declarations for mm_struct and task_struct. [MIPS] SMTC: <asm/mips_mt.h> must include <linux/cpumask.h> [MIPS] SMTC: <asm/smtc_ipi.h> must include <linux/spinlock.h> [MIPS] Atlas, Malta: Fix build warning.
| * | [MIPS] Wire up ioprio_set and ioprio_get.Ralf Baechle2007-03-071-6/+12
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] Fix __raw_read_trylock() to allow multiple readersDave Johnson2007-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | A deadlock can occur for mixed irq and non-irq rwlock readers if a 2nd reader attempts to take lock by looping around __raw_read_trylock(). Signed-off-by: Dave Johnson <djohnson+linux-mips@sw.starentnetworks.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] Export __copy_user_inatomic.Ralf Baechle2007-03-071-0/+2
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] R2 bitops compile fix for gcc < 4.0.Ralf Baechle2007-03-071-23/+33
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] SMTC: Add fordward declarations for mm_struct and task_struct.Ralf Baechle2007-03-071-0/+3
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] SMTC: <asm/mips_mt.h> must include <linux/cpumask.h>Ralf Baechle2007-03-071-0/+2
| | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | [MIPS] SMTC: <asm/smtc_ipi.h> must include <linux/spinlock.h>Ralf Baechle2007-03-071-0/+2
| |/ | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | Merge branch 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsaLinus Torvalds2007-03-061-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa: [ALSA] version 1.0.14rc3 [ALSA] cmipci - Allow to disable integrated FM port [ALSA] hda-codec - Fix logic error in headphone mute for Conexant codecs [ALSA] hda-codec - Add missing Mic Boost for AD1986A codec [ALSA] ac97 - Add Thinkpad X31 and R40 to AD1981x blacklist [ALSA] Add missing sysfs device assignment for ALSA PCI drivers [ALSA] hda-codec - Define pin configs for MacBooks [ALSA] hda-codec - Add missing Mic Boost controls for ALC262 [ALSA] soc - WM9712 PCM volume [ALSA] soc - Fix WM9712 register cache entry [ALSA] hda-codec - Add method for configuring Mac Pro without PCI SSID [ALSA] hda-codec - Add LFE support on Dell M90
| * | [ALSA] version 1.0.14rc3Jaroslav Kysela2007-03-061-2/+2
| |/ | | | | | | Signed-off-by: Jaroslav Kysela <perex@suse.cz>
* | Merge branch 'for-linus' of ↵Linus Torvalds2007-03-062-0/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: sdhci: release irq during suspend sdhci: make isr tolerant of read errors mmc: require explicit support for high-speed ncpfs: make sure server connection survives a kill
| * | mmc: require explicit support for high-speedPierre Ossman2007-03-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | The new high-speed timings are similar to each other and the old system, but not identical. And although things "just work" most of the time, sometimes it does not. So we need to start marking which hosts are known to fully comply with the new timings. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
| * | ncpfs: make sure server connection survives a killPierre Ossman2007-03-061-0/+2
| |/ | | | | | | | | | | | | | | | | Use internal buffers instead of the ones supplied by the caller so that a caller can be interrupted without having to abort the entire ncp connection. Signed-off-by: Pierre Ossman <ossman@cendio.se> Acked-by: Petr Vandrovec <petr@vandrovec.name>
* | Merge branch 'upstream-linus' of ↵Linus Torvalds2007-03-062-0/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: sis900 warning fixes mv643xx_eth: Place explicit port number in mv643xx_eth_platform_data pcnet32: Fix PCnet32 performance bug on non-coherent architecutres __devinit & __devexit cleanups for de2104x driver 3c59x: Handle pci_enable_device() failure while resuming dmfe: Fix link detection dmfe: fix two bugs dmfe: trivial/spelling fixes revert "drivers/net/tulip/dmfe: support basic carrier detection" ucc_geth: returns NETDEV_TX_BUSY when BD ring is full ucc_geth: Fix BD processing natsemi: netpoll fixes bonding: Improve IGMP join processing bonding: only receive ARPs for us bonding: fix double dev_add_pack
| * | mv643xx_eth: Place explicit port number in mv643xx_eth_platform_dataDale Farnsworth2007-03-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were using the platform_device.id field to identify which ethernet port is used for mv643xx_eth device. This is not generally correct. It will be incorrect, for example, if a hardware platform uses a single port but not the first port. Here, we add an explicit port_number field to struct mv643xx_eth_platform_data. This makes the mv643xx_eth_platform_data structure required, but that isn't an issue since all users currently provide it already. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | bonding: Improve IGMP join processingJay Vosburgh2007-03-061-0/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In active-backup mode, the current bonding code duplicates IGMP traffic to all slaves, so that switches are up to date in case of a failover from an active to a backup interface. If bonding then fails back to the original active interface, it is likely that the "active slave" switch's IGMP forwarding for the port will be out of date until some event occurs to refresh the switch (e.g., a membership query). This patch alters the behavior of bonding to no longer flood IGMP to all ports, and to issue IGMP JOINs to the newly active port at the time of a failover. This insures that switches are kept up to date for all cases. "GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally reported this problem, and included a patch. His original patch was modified by Jay Vosburgh to additionally remove the existing IGMP flood behavior, use RCU, streamline code paths, fix trailing white space, and adjust for style. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* | [PATCH] knfsd: fix recently introduced problem with shutting down a busy NFS ↵NeilBrown2007-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | server When the last thread of nfsd exits, it shuts down all related sockets. It currently uses svc_close_socket to do this, but that only is immediately effective if the socket is not SK_BUSY. If the socket is busy - i.e. if a request has arrived that has not yet been processes - svc_close_socket is not effective and the shutdown process spins. So create a new svc_force_close_socket which removes the SK_BUSY flag is set and then calls svc_close_socket. Also change some open-codes loops in svc_destroy to use list_for_each_entry_safe. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] knfsd: remove CONFIG_IPV6 ifdefs from sunrpc server codeNeilBrown2007-03-061-2/+0
| | | | | | | | | | | | | | | | They don't really save that much, and aren't worth the hassle. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] linux/audit.h needs linux/types.hJeff Dike2007-03-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | Include linux/types.h here because we need a definition of __u32. This file appears not be exported verbatim by libc, so I think this doesn't have any userspace consequences. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] hrtimers: hrtimer_clock_base description typoAndres Salomon2007-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | The description for the hrtimer_clock_base struct describes "hrtimer_base". That should be hrtimer_clock_base. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] hrtimers: fix HRTIMER_CB_IRQSAFE_NO_SOFTIRQ descriptionAndres Salomon2007-03-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | The description for HRTIMER_CB_IRQSAFE_NO_SOFTIRQ is backwards; "NO SOFTIRQ" sounds a whole lot like it means it must not be run in a softirq. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] i386: make x86_64 tsc header require i386 rather than vice-versaAndres Salomon2007-03-062-68/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to commit 95492e4646e5de8b43d9a7908d6177fb737b61f0 ([PATCH] x86: rewrite SMP TSC sync code), the headers in asm-i386 did not really require anything in include/asm-x86_64. This means that distributions such as fedora did not include asm-x86_64 in kernel-devel headers for i386. Ingo's commit changed that, and broke things. This is easy enough to hack around in package builds by just including asm-x86_64 on i386, but that's kind of annoying. If anything, x86_64 should depend upon i386, not the other way around. This patch changes it so that asm-x86_64/tsc.h includes asm-i386/tsc.h, rather than vice-versa. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] fix build with CONFIG_NO_IDLE_HZ=nAndrew Morton2007-03-061-0/+8
|/ | | | | | | | | | | | | arch/i386/kernel/vmi.c: In function 'vmi_safe_halt': arch/i386/kernel/vmi.c:262: warning: implicit declaration of function 'vmi_stop_hz_timer' arch/i386/kernel/vmi.c:266: warning: implicit declaration of function 'vmi_account_time_restart_hz_timer' Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] disable NMI watchdog by defaultIngo Molnar2007-03-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | there's a new NMI watchdog related problem: KVM crashes on certain bzImages because ... we enable the NMI watchdog by default (even if the user does not ask for it) , and no other OS on this planet does that so KVM doesnt have emulation for that yet. So KVM injects a #GP, which crashes the Linux guest: general protection fault: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 EIP: 0060:[<c011a8ae>] Not tainted VLI EFLAGS: 00000246 (2.6.20-rc5-rt0 #3) EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3 and no, i did /not/ request an nmi_watchdog on the boot command line! Solution: turn off that darn thing! It's a debug tool, not a 'make life harder' tool!! with this patch the KVM guest boots up just fine. And with this my laptop (Lenovo T60) also stopped its sporadic hard hanging (sometimes in acpi_init(), sometimes later during bootup, sometimes much later during actual use) as well. It hung with both nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI injection that is causing problems, not the NMI watchdog variant, nor any particular bootup code. [ NMI breaks on some systems, esp in combination with SMM -Arjan ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] timer/hrtimer: take per cpu locks in sane orderHeiko Carstens2007-03-051-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Doing something like this on a two cpu system # echo 0 > /sys/devices/system/cpu/cpu0/online # echo 1 > /sys/devices/system/cpu/cpu0/online # echo 0 > /sys/devices/system/cpu/cpu1/online will give me this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.21-rc2-g562aa1d4-dirty #7 ------------------------------------------------------- bash/1282 is trying to acquire lock: (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240 but task is already holding lock: (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240 which lock already depends on the new lock. This happens because we have the following code in kernel/hrtimer.c: migrate_hrtimers(int cpu) [...] old_base = &per_cpu(hrtimer_bases, cpu); new_base = &get_cpu_var(hrtimer_bases); [...] spin_lock(&new_base->lock); spin_lock(&old_base->lock); Which means the spinlocks are taken in an order which depends on which cpu gets shut down from which other cpu. Therefore lockdep complains that there might be an ABBA deadlock. Since migrate_hrtimers() gets only called on cpu hotplug it's safe to assume that it isn't executed concurrently on a The same problem exists in kernel/timer.c: migrate_timers(). As pointed out by Christian Borntraeger one possible solution to avoid the locking order complaints would be to make sure that the locks are always taken in the same order. E.g. by taking the lock of the cpu with the lower number first. To achieve this we introduce two new spinlock functions double_spin_lock and double_spin_unlock which lock or unlock two locks in a given order. Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Christian Borntraeger <cborntra@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] clocksource init adjustments (fix bug #7426)john stultz2007-03-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves the issue found here: http://bugme.osdl.org/show_bug.cgi?id=7426 The basic summary is: Currently we register most of i386/x86_64 clocksources at module_init time. Then we enable clocksource selection at late_initcall time. This causes some problems for drivers that use gettimeofday for init calibration routines (specifically the es1968 driver in this case), where durring module_init, the only clocksource available is the low-res jiffies clocksource. This may cause slight calibration errors, due to the small sampling time used. It should be noted that drivers that require fine grained time may not function on architectures that do not have better then jiffies resolution timekeeping (there are a few). However, this does not discount the reasonable need for such fine-grained timekeeping at init time. Thus the solution here is to register clocksources earlier (ideally when the hardware is being initialized), and then we enable clocksource selection at fs_initcall (before device_initcall). This patch should probably get some testing time in -mm, since clocksource selection is one of the most important issues for correct timekeeping, and I've only been able to test this on a few of my own boxes. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: apic opsZachary Amsden2007-03-051-0/+1
| | | | | | | | | | | | | | | | | | | Use para_fill instead of directly setting the APIC ops to the result of the vmi_get_function call - this allows one to implement a VMI ROM without implementing APIC functions, just using the native APIC functions. While doing this, I realized that there is a lot more cleanup that should have been done. Basically, we should never assume that the ROM implements a specific set of functions, and always allow fallback to the native implementation. This is critical for future compatibility. Signed-off-by: Anthony Liguori <anthony@codemonkey.ws> Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: pit overrideZachary Amsden2007-03-052-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The time_init_hook in paravirt-ops no longer functions in the correct manner after the integration of the hrtimers code. The problem is that now the call path for time initialization is: time_init : late_time_init = hpet_time_init; late_time_init -> hpet_time_init: setup_pit_timer (BAD) do_time_init --> (via paravirt.h) time_init_hook --> (via arch_hooks.h) time_init_hook (in SUBARCH/setup.c) If this isn't confusing enough, the paravirt case goes through an indirect function pointer in the paravirt-ops table. The problem is, by the time the paravirt hook is called, the pit timer is already enabled. But paravirt guests have their own timer, and don't want to use the PIT. Rather than intensify the struggle for power going on here, just make it all nice and simple and just unconditionally do all timer setup in the late_time_init hook. This also has the advantage of enabling timers in the same place in all code paths, so everyone has the same bugs and we don't have outliers who break other code because they turn on timer too early or too late. So the paravirt-ops time init function is now by default hpet_time_init, which is the time init function used for native hardware. Paravirt guests have the chance to override this when they setup the paravirt-ops table, and should need no change. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: paravirt drop udelay opZachary Amsden2007-03-052-9/+0
| | | | | | | | | | | | | | | | | | Not respecting udelay causes problems with any virtual hardware that is passed through to real hardware. This can be noticed by any device that interacts with the real world in real time - like AP startup, which takes real time. Or keyboard LEDs, which should blink in real-time. Or floppy drives, but only when passed through to a real floppy controller on OSes which can't sufficiently buffer the floppy commands to emulate a zero latency floppy. Or IDE drives, when connecting to a physical CDROM. This was mostly a hack to get the kernel to boot faster, but it introduced a number of misvirtualization bugs, and Alan and Pavel argued pretty strongly against it. We were the only client, and now want to clean up this cruft. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: fix highpteZachary Amsden2007-03-052-4/+23
| | | | | | | | | | | | | | | | | | | | Provide a PT map hook for HIGHPTE kernels to designate where they are mapping page tables. This information is required so the physical address of PTE updates can be determined; otherwise, the mm layer would have to carry the physical address all the way to each PTE modification callsite, which is even more hideous that the macros required to provide the proper hooks. So lets not mess up arch neutral code to achieve this, but keep the horror in an #ifdef HIGHPTE in include/asm-i386/pgtable.h. I had to use macros here because some types are not yet defined in all the include paths for this header. This patch is absolutely required for HIGHPTE kernels to operate properly with VMI. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: cpu cycles fixZachary Amsden2007-03-053-0/+5
| | | | | | | | | | | | In order to share the common code in tsc.c which does CPU Khz calibration, we need to make an accurate value of CPU speed available to the tsc.c code. This value loses a lot of precision in a VM because of the timing differences with real hardware, but we need it to be as precise as possible so the guest can make accurate time calculations with the cycle counters. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] vmi: sched clock paravirt op fixZachary Amsden2007-03-054-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | The custom_sched_clock hook is broken. The result from sched_clock needs to be in nanoseconds, not in CPU cycles. The TSC is insufficient for this purpose, because TSC is poorly defined in a virtual environment, and mostly represents real world time instead of scheduled process time (which can be interrupted without notice when a virtual machine is descheduled). To make the scheduler consistent, we must expose a different nature of time, that is scheduled time. So deprecate this custom_sched_clock hack and turn it into a paravirt-op, as it should have been all along. This allows the tsc.c code which converts cycles to nanoseconds to be shared by all paravirt-ops backends. It is unfortunate to add a new paravirt-op, but this is a very distinct abstraction which is clearly different for all virtual machine implementations, and it gets rid of an ugly indirect function which I ashamedly admit I hacked in to try to get this to work earlier, and then even got in the wrong units. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] Page migration: Fix vma flag checkingChristoph Lameter2007-03-051-0/+8
| | | | | | | | | | | | | | | | | Currently we do not check for vma flags if sys_move_pages is called to move individual pages. If sys_migrate_pages is called to move pages then we check for vm_flags that indicate a non migratable vma but that still includes VM_LOCKED and we can migrate mlocked pages. Extract the vma_migratable check from mm/mempolicy.c, fix it and put it into migrate.h so that is can be used from both locations. Problem was spotted by Lee Schermerhorn Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] sched: remove SMT niceCon Kolivas2007-03-057-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove the SMT-nice feature which idles sibling cpus on SMT cpus to facilitiate nice working properly where cpu power is shared. The idling of cpus in the presence of runnable tasks is considered too fragile, easy to break with outside code, and the complexity of managing this system if an architecture comes along with many logical cores sharing cpu power will be unworkable. Remove the associated per_cpu_gain variable in sched_domains used only by this code. Also: The reason is that with dynticks enabled, this code breaks without yet further tweaks so dynticks brought on the rapid demise of this code. So either we tweak this code or kill it off entirely. It was Ingo's preference to kill it off. Either way this needs to happen for 2.6.21 since dynticks has gone in. Signed-off-by: Con Kolivas <kernel@kolivas.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>