summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gianfar: Fix DMA unmap invocationsAndy Fleming2008-11-141-7/+8
| | | | | | | | We weren't unmapping DMA memory, which will break when gianfar gets used on systems with more than 32-bits of memory. Also, it's just plain wrong. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* net/ucc_geth: Fix oops in uec_get_ethtool_stats()Anton Vorontsov2008-11-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | p_{tx,rx}_fw_statistics_pram are special: they're available only when a device is open. If the device is closed, we should just fill the data with zeroes. Fixes the following oops: root@b1:~# ifconfig eth1 down root@b1:~# ethtool -S eth1 Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc01e1dcc Oops: Kernel access of bad area, sig: 11 [#1] [...] NIP [c01e1dcc] uec_get_ethtool_stats+0x98/0x124 LR [c0287cc8] ethtool_get_stats+0xfc/0x23c Call Trace: [cfaadde0] [c0287ca8] ethtool_get_stats+0xdc/0x23c (unreliable) [cfaade20] [c0288340] dev_ethtool+0x2fc/0x588 [cfaade50] [c0285648] dev_ioctl+0x290/0x33c [cfaadea0] [c0272238] sock_ioctl+0x80/0x2ec [cfaadec0] [c00b5ae4] vfs_ioctl+0x40/0xc0 [cfaadee0] [c00b5fa8] do_vfs_ioctl+0x78/0x20c [cfaadf10] [c00b617c] sys_ioctl+0x40/0x74 [cfaadf40] [c00142d8] ret_from_syscall+0x0/0x38 [...] ---[ end trace b941007b2dfb9759 ]--- Segmentation fault p.s. While at it, also remove u64 casts, they aren't needed. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 9p: restrict RDMA usageRandy Dunlap2008-11-121-4/+6
| | | | | | | | | | | | | | | | | | | | | | Make 9p's RDMA option depend on INET since it uses Infiniband rdma_* functions and that code depends on INET. Otherwise 9p can try to use symbols which don't exist. ERROR: "rdma_destroy_id" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_connect" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_create_id" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_create_qp" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_resolve_route" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_disconnect" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_resolve_addr" [net/9p/9pnet_rdma.ko] undefined! I used an if/endif block so that the menu items would remain presented together. Also correct an article adjective. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: shy netns_ok checkAlexey Dobriyan2008-11-121-1/+9
| | | | | | | | | | | Failure to pass netns_ok check is SILENT, except some MIB counter is incremented somewhere. And adding "netns_ok = 1" (after long head-scratching session) is usually the last step in making some protocol netns-ready... Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: routing header fixesBrian Haley2008-11-122-0/+10
| | | | | | | | | | | | | | | This patch fixes two bugs: 1. setsockopt() of anything but a Type 2 routing header should return EINVAL instead of EPERM. Noticed by Shan Wei (shanwei@cn.fujitsu.com). 2. setsockopt()/sendmsg() of a Type 2 routing header with invalid length or segments should return EINVAL. These values are statically fixed in RFC 3775, unlike the variable Type 0 was. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2: fix poll_controller to pass proper structures and check all rx queuesNeil Horman2008-11-121-3/+6
| | | | | | | | | | | | | Fix bnx2 so that netpoll works properly. Specifically: 1) Fix parameters to bnx2_interrupt to be a struct bnx2_napi rather than a struct net_device 2) Fix poll_controller method to check every queue in the rx case so frames aren't missed Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2008-11-123-4/+10
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| * hostap: pad the skb->cb usage in lieu of a proper fixJohannes Berg2008-11-121-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like mac80211 did, this driver makes 'clever' use of skb->cb to pass information along with an skb as it is requeued from the virtual device to the physical wireless device. Unfortunately, that trick no longer works... Unlike mac80211, code complexity and driver apathy makes this hack the best option we have in the short run. Hopefully someone will eventually be motivated to code a proper fix before all the effected hardware dies. (Above text by me. Johannes officially disavows all knowledge of this hack. -- JWL) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * rtl8187 : support for Sitecom WL-168 0001 v4Bob Jolliffe2008-11-121-0/+1
| | | | | | | | | | | | | | the Sitecom 0001 v4 with product id 0x0df6:0028, uses Realtek's RTL8187B and work fine with new 2.6.27 driver. Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * mac80211: fix notify_mac functionJohannes Berg2008-11-121-3/+3
| | | | | | | | | | | | | | | | | | | | The ieee80211_notify_mac() function uses ieee80211_sta_req_auth() which in turn calls ieee80211_set_disassoc() which calls a few functions that need to be able to sleep, so ieee80211_notify_mac() cannot use RCU locking for the interface list and must use rtnl locking instead. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * rtl8187: Add Abocom USB IDIvan Kuten2008-11-121-0/+2
| | | | | | | | | | | | Signed-off-by: Ivan Kuten <ivan.kuten@promwad.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | niu: Fix readq implementation when architecture does not provide one.David S. Miller2008-11-121-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a TX hang reported by Jesper Dangaard Brouer. When an architecutre cannot provide a fully functional 64-bit atomic readq/writeq, the driver must implement it's own. This is because only the driver can say whether doing something like using two 32-bit reads to implement the full 64-bit read will actually work properly. In particular one of the issues is whether the top 32-bits or the bottom 32-bits of the 64-bit register should be read first. There could be side effects, and in fact that is exactly the problem here. The TX_CS register has counters in the upper 32-bits and state bits in the lower 32-bits. A read clears the state bits. We would read the counter half before the state bit half. That first read would clear the state bits, and then the driver thinks that no interrupts are pending because the interrupt indication state bits are seen clear every time. Fix this by reading the bottom half before the upper half. Tested-by: Jesper Dangaard Brouer <jdb@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: put_cmsg_compat + SO_TIMESTAMP[NS]: use same name for value as callerPatrick Ohly2008-11-121-2/+2
| | | | | | | | | | | | | | | | In __sock_recv_timestamp() the additional SCM_TIMESTAMP[NS] is used. This has the same value as SO_TIMESTAMP[NS], so this is a purely cosmetic change. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp_htcp: last_cong bug fixDoug Leith2008-11-121-4/+10
|/ | | | | | | | | | | | | | | | This patch fixes a minor bug in tcp_htcp.c which has been highlighted by Lachlan Andrew and Lawrence Stewart. Currently, the time since the last congestion event, which is stored in variable last_cong, is reset whenever there is a state change into TCP_CA_Open. This includes transitions of the type TCP_CA_Open->TCP_CA_Disorder->TCP_CA_Open which are not associated with backoff of cwnd. The patch changes last_cong to be updated only on transitions into TCP_CA_Open that occur after experiencing the congestion-related states TCP_CA_Loss, TCP_CA_Recovery, TCP_CA_CWR. Signed-off-by: Doug Leith <doug.leith@nuim.ie> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'davem-fixes' of ↵David S. Miller2008-11-115-10/+16
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| * [netdrvr] smc911x: fix for driver resume (and compilation warning)Dasgupta, Romit2008-11-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am trying out suspend, resume on an OMAP3 based board. What I see during resume is that the SMC911x driver resume routing gets stuck after trying to transmit the packet out of the controller. Some debug messages below: --> smc911x_drv_resume eth0: --> smc911x_reset eth0: smc911x_reset timeout waiting for PM restore eth0: --> smc911x_enable eth0: --> smc911x_phy_configure() eth0: --> smc911x_phy_reset() eth0: phy caps=0x782d eth0: phy advertised caps=0x0de1 eth0: --> smc911x_phy_check_media smc911x_phy_read: phyaddr=0x1, phyreg=0x01, phydata=0x7809 smc911x_phy_read: phyaddr=0x1, phyreg=0x01, phydata=0x7809 eth0: link down Restarting tasks ... eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt nfs: server 172.24.190.217 not responding, still trying nfs: server 172.24.190.217 not responding, still trying The following change makes it work fine: (The change within smc911x_drv_probe function was to get rid of a compilation warning). Signed-off-by: Romit Dasgupta <romit@ti.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * RDMA/cxgb3: deadlock in iw_cxgb3 can cause hang when configuring interface.Steve Wise2008-11-112-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the iw_cxgb3 module's cxgb3_client "add" func gets called by the cxgb3 module, the iwarp driver ends up calling the ethtool ops get_drvinfo function in cxgb3 to get the fw version and other info. Currently the iwarp driver grabs the rtnl lock around this down call to serialize. As of 2.6.27 or so, things changed such that the rtnl lock is held around the call to the netdev driver open function. Also the cxgb3_client "add" function doesn't get called if the device is down. So, if you load cxgb3, then load iw_cxgb3, then ifconfig up the device, the iw_cxgb3 add func gets called with the rtnl_lock held. If you load cxgb3, ifconfig up the device, then load iw_cxgb3, the add func gets called without the rtnl_lock held. The former causes the deadlock, the latter does not. In addition, there are iw_cxgb3 sysfs handlers that also can call down into cxgb3 to gather the fw and hw versions. These can be called concurrently on different processors and at any time. Thus we need to push this serialization down in the cxgb3 driver get_drvinfo func. The fix is to remove rtnl lock usage, and use a per-device lock in cxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * cxgb3 - Limit multiqueue setting to msi-xDivy Le Ray2008-11-111-1/+1
| | | | | | | | | | | | | | Allow multiqueue setting in MSI-X mode only Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * cxgb3 - eeprom read fixesDivy Le Ray2008-11-111-1/+7
| | | | | | | | | | | | | | | | Protect against invalid phy entries in the eeprom. Extend eeprom access timeout. Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * myri10ge: fix stop/go ordering even moreBrice Goglin2008-11-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | The doorbell writes may be seen out of order by the firmware if they are in WC memory since the tx spin(un)lock does not flush WC writes. Hence if the "stop" is written on a different CPU than the "go", it is possible that the stop will arrive after the go unless we add an explicit memory barrier (and mmiowb() is not enough). It fixes transmit hangs in multi tx queue mode. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds2008-11-114-6/+26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context nohz: disable tick_nohz_kick_tick() for now irq: call __irq_enter() before calling the tick_idle_check x86: HPET: enter hpet_interrupt_handler with interrupts disabled x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMP x86: HPET: convert WARN_ON to WARN_ON_ONCE
| * | timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq contextGautham R Shenoy2008-11-111-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix incorrect locking triggered during hotplug-intense stress-tests While migrating the the CB_IRQSAFE_UNLOCKED timers during a cpu-offline, we queue them on the cb_pending list, so that they won't go stale. Thus, when the callbacks of the timers run from the softirq context, they could run into potential deadlocks, since these callbacks assume that they're running with irq's disabled, thereby annoying lockdep! Fix this by emulating hardirq context while running these callbacks from the hrtimer softirq. ================================= [ INFO: inconsistent lock state ] 2.6.27 #2 -------------------------------- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage. ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes: (&rq->lock){++..}, at: [<c011db84>] sched_rt_period_timer+0x9e/0x1fc {in-hardirq-W} state was registered at: [<c014103c>] __lock_acquire+0x549/0x121e [<c0107890>] native_sched_clock+0x88/0x99 [<c013aa12>] clocksource_get_next+0x39/0x3f [<c0139abc>] update_wall_time+0x616/0x7df [<c0141d6b>] lock_acquire+0x5a/0x74 [<c0121724>] scheduler_tick+0x3a/0x18d [<c047ed45>] _spin_lock+0x1c/0x45 [<c0121724>] scheduler_tick+0x3a/0x18d [<c0121724>] scheduler_tick+0x3a/0x18d [<c012c436>] update_process_times+0x3a/0x44 [<c013c044>] tick_periodic+0x63/0x6d [<c013c062>] tick_handle_periodic+0x14/0x5e [<c010568c>] timer_interrupt+0x44/0x4a [<c0150c9f>] handle_IRQ_event+0x13/0x3d [<c0151c14>] handle_level_irq+0x79/0xbd [<c0105634>] do_IRQ+0x69/0x7d [<c01041e4>] common_interrupt+0x28/0x30 [<c047007b>] aac_probe_one+0x1a3/0x3f3 [<c047ec2d>] _spin_unlock_irqrestore+0x36/0x39 [<c01512b4>] setup_irq+0x1be/0x1f9 [<c065d70b>] start_kernel+0x259/0x2c5 [<ffffffff>] 0xffffffff irq event stamp: 50102 hardirqs last enabled at (50102): [<c047ebf4>] _spin_unlock_irq+0x20/0x23 hardirqs last disabled at (50101): [<c047edc2>] _spin_lock_irq+0xa/0x4b softirqs last enabled at (50088): [<c0128ba6>] do_softirq+0x37/0x4d softirqs last disabled at (50099): [<c0128ba6>] do_softirq+0x37/0x4d other info that might help us debug this: no locks held by ksoftirqd/0/4. stack backtrace: Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2 [<c013f6cb>] print_usage_bug+0x13e/0x147 [<c013fef5>] mark_lock+0x493/0x797 [<c01410b1>] __lock_acquire+0x5be/0x121e [<c0141d6b>] lock_acquire+0x5a/0x74 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c047ed45>] _spin_lock+0x1c/0x45 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c01210fd>] finish_task_switch+0x41/0xbd [<c0107890>] native_sched_clock+0x88/0x99 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc [<c0136dda>] run_hrtimer_pending+0x54/0xe5 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc [<c0128afb>] __do_softirq+0x7b/0xef [<c0128ba6>] do_softirq+0x37/0x4d [<c0128c12>] ksoftirqd+0x56/0xc5 [<c0128bbc>] ksoftirqd+0x0/0xc5 [<c0134649>] kthread+0x38/0x5d [<c0134611>] kthread+0x0/0x5d [<c0104477>] kernel_thread_helper+0x7/0x10 ======================= Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | nohz: disable tick_nohz_kick_tick() for nowThomas Gleixner2008-11-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: nohz powersavings and wakeup regression commit fb02fbc14d17837b4b7b02dbb36142c16a7bf208 (NOHZ: restart tick device from irq_enter()) causes a serious wakeup regression. While the patch is correct it does not take into account that spurious wakeups happen on x86. A fix for this issue is available, but we just revert to the .27 behaviour and let long running softirqs screw themself. Disable it for now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | irq: call __irq_enter() before calling the tick_idle_checkThomas Gleixner2008-11-101-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: avoid spurious ksoftirqd wakeups The tick idle check which is called from irq_enter() was run before the call to __irq_enter() which did not set the in_interrupt() bits in preempt_count. That way the raise of a softirq woke up softirqd for nothing as the softirq was handled on return from interrupt. Call __irq_enter() before calling into the tick idle check code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: HPET: enter hpet_interrupt_handler with interrupts disabledMatt Fleming2008-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some functions that may be called from this handler require that interrupts are disabled. Also, combining IRQF_DISABLED and IRQF_SHARED does not reliably disable interrupts in a handler, so remove IRQF_SHARED from the irq flags (this irq is not shared anyway). Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Cc: "Will Newton" <will.newton@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMPMatt Fleming2008-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In hpet_next_event() we check that the value we just wrote to HPET_Tn_CMP(timer) has reached the chip. Currently, we're checking that the value we wrote to HPET_Tn_CMP(timer) is in HPET_T0_CMP, which, if timer is anything other than timer 0, is likely to fail. Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: HPET: convert WARN_ON to WARN_ON_ONCEMatt Fleming2008-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible to flood the console with call traces if the WARN_ON condition is true because of the frequency with which this function is called. Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds2008-11-115-26/+48
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: release buddies on yield fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock sched: clean up debug info
| * | | sched: release buddies on yieldPeter Zijlstra2008-11-111-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clear buddies on yield, so that the buddy rules don't schedule them despite them being placed right-most. This fixed a performance regression with yield-happy binary JVMs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Lin Ming <ming.m.lin@intel.com>
| * | | fix for account_group_exec_runtime(), make sure ->signal can't be freed ↵Oleg Nesterov2008-11-113-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | under rq->lock Impact: fix hang/crash on ia64 under high load This is ugly, but the simplest patch by far. Unlike other similar routines, account_group_exec_runtime() could be called "implicitly" from within scheduler after exit_notify(). This means we can race with the parent doing release_task(), we can't just check ->signal != NULL. Change __exit_signal() to do spin_unlock_wait(&task_rq(tsk)->lock) before __cleanup_signal() to make sure ->signal can't be freed under task_rq(tsk)->lock. Note that task_rq_unlock_wait() doesn't care about the case when tsk changes cpu/rq under us, this should be OK. Thanks to Ingo who nacked my previous buggy patch. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Doug Chapman <doug.chapman@hp.com>
| * | | sched: clean up debug infoPeter Zijlstra2008-11-102-21/+22
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: clean up and fix debug info printout While looking over the sched_debug code I noticed that we printed the rq schedstats for every cfs_rq, ammend this. Also change nr_spead_over into an int, and fix a little buglet in min_vruntime printing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds2008-11-113-101/+91
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ring-buffer: prevent infinite looping on time stamping ftrace: disable tracing on resize ftrace: fix breakage in bin_fmt results ftrace: ftrace.txt version update ftrace: update txt document
| * \ \ Merge branch 'devel' of ↵Ingo Molnar2008-11-11228-1451/+4396
| |\ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
| | * | | ring-buffer: prevent infinite looping on time stampingSteven Rostedt2008-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: removal of unnecessary looping The lockless part of the ring buffer allows for reentry into the code from interrupts. A timestamp is taken, a test is preformed and if it detects that an interrupt occurred that did tracing, it tries again. The problem arises if the timestamp code itself causes a trace. The detection will detect this and loop again. The difference between this and an interrupt doing tracing, is that this will fail every time, and cause an infinite loop. Currently, we test if the loop happens 1000 times, and if so, it will produce a warning and disable the ring buffer. The problem with this approach is that it makes it difficult to perform some types of tracing (tracing the timestamp code itself). Each trace entry has a delta timestamp from the previous entry. If a trace entry is reserved but and interrupt occurs and traces before the previous entry is commited, the delta timestamp for that entry will be zero. This actually makes sense in terms of tracing, because the interrupt entry happened before the preempted entry was commited, so one may consider the two happening at the same time. The order is still preserved in the buffer. With this idea, instead of trying to get a new timestamp if an interrupt made it in between the timestamp and the test, the entry could simply make the delta zero and continue. This will prevent interrupts or tracers in the timer code from causing the above loop. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
| | * | | ftrace: disable tracing on resizeSteven Rostedt2008-11-101-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix for bug on resize This patch addresses the bug found here: http://bugzilla.kernel.org/show_bug.cgi?id=11996 When ftrace converted to the new unified trace buffer, the resizing of the buffer was not protected as much as it was originally. If tracing is performed while the resize occurs, then the buffer can be corrupted. This patch disables all ftrace buffer modifications before a resize takes place. Signed-off-by: Steven Rostedt <srostedt@redhat.com>
| * | | | ftrace: fix breakage in bin_fmt resultsEric Anholt2008-11-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 777e208d40d0953efc6fb4ab58590da3f7d8f02d we changed from outputting field->cpu (a char) to iter->cpu (unsigned int), increasing the resulting structure size by 3 bytes. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | ftrace: ftrace.txt version updateSteven Rostedt2008-11-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Documentation update only Update the version that the ftrace document was written for. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | ftrace: update txt documentSteven Rostedt2008-11-041-97/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Documentation update only A lot of changes have gone into ftrace. This patch updates the ftrace.txt document. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds2008-11-116-7/+58
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://oss.sgi.com/xfs/xfs: [XFS] XFS: Check for valid transaction headers in recovery [XFS] handle memory allocation failures during log initialisation [XFS] Account for allocated blocks when expanding directories [XFS] Wait for all I/O on truncate to zero file size [XFS] Fix use-after-free with log and quotas
| * | | | | [XFS] XFS: Check for valid transaction headers in recoveryDavid Chinner2008-11-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we are about to add a new item to a transaction in recovery, we need to check that it is valid first. Currently we just assert that header magic number matches, but in production systems that is not present and we add a corrupted transaction to the list to be processed. This results in a kernel oops later when processing the corrupted transaction. Instead, if we detect a corrupted transaction, abort recovery and leave the user to clean up the mess that has occurred. SGI-PV: 988145 SGI-Modid: xfs-linux-melb:xfs-kern:32356a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
| * | | | | [XFS] handle memory allocation failures during log initialisationDave Chinner2008-11-101-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there is no memory left in the system, xfs_buf_get_noaddr() can fail. If this happens at mount time during xlog_alloc_log() we fail to catch the error and oops. Catch the error from xfs_buf_get_noaddr(), and allow other memory allocations to fail and catch those errors too. Report the error to the console and fail the mount with ENOMEM. Tested by manually injecting errors into xfs_buf_get_noaddr() and xlog_alloc_log(). Version 2: o remove unnecessary casts of the returned pointer from kmem_zalloc() SGI-PV: 987246 Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
| * | | | | [XFS] Account for allocated blocks when expanding directoriesDavid Chinner2008-11-102-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we create a directory, we reserve a number of blocks for the maximum possible expansion of of the directory due to various btree splits, freespace allocation, etc. Unfortunately, each allocation is not reflected in the total number of blocks still available to the transaction, so the maximal reservation is used over and over again. This leads to problems where an allocation group has only enough blocks for *some* of the allocations required for the directory modification. After the first N allocations, the remaining blocks in the allocation group drops below the total reservation, and subsequent allocations fail because the allocator will not allow the allocation to proceed if the AG does not have the enough blocks available for the entire allocation total. This results in an ENOSPC occurring after an allocation has already occurred. This results in aborting the directory operation (leaving the directory in an inconsistent state) and cancelling a dirty transaction, which results in a filesystem shutdown. Avoid the problem by reflecting the number of blocks allocated in any directory expansion in the total number of blocks available to the modification in progress. This prevents a directory modification from being aborted part way through with an ENOSPC. SGI-PV: 988144 SGI-Modid: xfs-linux-melb:xfs-kern:32340a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
| * | | | | [XFS] Wait for all I/O on truncate to zero file sizeLachlan McIlroy2008-11-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's possible to have outstanding xfs_ioend_t's queued when the file size is zero. This can happen in the direct I/O path when a direct I/O write fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie xfs_end_io_direct() does not know that the I/O failed so can't force the xfs_ioend_t to be flushed synchronously). When we truncate a file on unlink we don't know to wait for these xfs_ioend_ts and we can have a use-after-free situation if the inode is reclaimed before the xfs_ioend_t is finally processed. As was suggested by Dave Chinner lets wait for all I/Os to complete when truncating the file size to zero. SGI-PV: 981668 SGI-Modid: xfs-linux-melb:xfs-kern:32216a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
| * | | | | [XFS] Fix use-after-free with log and quotasLachlan McIlroy2008-11-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Destroying the quota stuff on unmount can access the log - ie XFS_QM_DONE() ends up in xfs_dqunlock() which calls xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the log has already been destroyed. Just move the cleanup of the quota code earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot. SGI-PV: 987086 SGI-Modid: xfs-linux-melb:xfs-kern:32148a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com>
* | | | | | Merge branch 'upstream-linus' of ↵Linus Torvalds2008-11-119-222/+256
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (21 commits) ocfs2: Check search result in ocfs2_xattr_block_get() ocfs2: fix printk related build warnings in xattr.c ocfs2: truncate outstanding block after direct io failure ocfs2/xattr: Proper hash collision handle in bucket division ocfs2: return 0 in page_mkwrite to let VFS retry. ocfs2: Set journal descriptor to NULL after journal shutdown ocfs2: Fix check of return value of ocfs2_start_trans() in xattr.c. ocfs2: Let inode be really deleted when ocfs2_mknod_locked() fails ocfs2: Fix checking of return value of new_inode() ocfs2: Fix check of return value of ocfs2_start_trans() ocfs2: Fix some typos in xattr annotations. ocfs2: Remove unused ocfs2_restore_xattr_block(). ocfs2: Don't repeat ocfs2_xattr_block_find() ocfs2: Specify appropriate journal access for new xattr buckets. ocfs2: Check errors from ocfs2_xattr_update_xattr_search() ocfs2: Don't return -EFAULT from a corrupt xattr entry. ocfs2: Check xattr block signatures properly. ocfs2: add handler_map array bounds checking ocfs2: remove duplicate definition in xattr ocfs2: fix function declaration and definition in xattr ...
| * | | | | | ocfs2: Check search result in ocfs2_xattr_block_get()Tiger Yang2008-11-101-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ocfs2_xattr_block_get() calls ocfs2_xattr_search() to find an external xattr, but doesn't check the search result that is passed back via struct ocfs2_xattr_search. Add a check for search result, and pass back -ENODATA if the xattr search failed. This avoids a later NULL pointer error. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | | | ocfs2: fix printk related build warnings in xattr.cMark Fasheh2008-11-101-14/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | | | ocfs2: truncate outstanding block after direct io failureDmitri Monakhov2008-11-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | | | ocfs2/xattr: Proper hash collision handle in bucket divisionTao Ma2008-11-101-29/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ocfs2/xattr, we must make sure the xattrs which have the same hash value exist in the same bucket so that the search schema can work. But in the old implementation, when we want to extend a bucket, we just move half number of xattrs to the new bucket. This works in most cases, but if we are lucky enough we will move 2 xattrs into 2 different buckets. This means that an xattr from the previous bucket cannot be found anymore. This patch fix this problem by finding the right position during extending the bucket and extend an empty bucket if needed. Signed-off-by: Tao Ma <tao.ma@oracle.com> Cc: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * | | | | | ocfs2: return 0 in page_mkwrite to let VFS retry.Tao Ma2008-11-101-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ocfs2_page_mkwrite, we return -EINVAL when we found the page mapping isn't updated, and it will cause the user space program get SIGBUS and exit. The reason is that during race writeable mmap, we will do unmap_mapping_range in ocfs2_data_downconvert_worker. The good thing is that if we reuturn 0 in page_mkwrite, VFS will retry fault and then call page_mkwrite again, so it is safe to return 0 here. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>