Commit message (Author, Age, Files, Lines)
* Fix CPU spinlock lockups on secondary CPU bringup (Russell King, 2011-06-23, 1 file, -6/+8)

    Secondary CPU bringup typically calls calibrate_delay() during its initialization. However, calibrate_delay() modifies a global variable (loops_per_jiffy) used for udelay() and __delay().

    A side effect of 71c696b1 ("calibrate: extract fall-back calculation into own helper") introduced in the 2.6.39 merge window means that we end up with a substantial period where loops_per_jiffy is zero. This causes the spinlock debugging code to malfunction:

        u64 loops = loops_per_jiffy * HZ;
        for (;;) {
                for (i = 0; i < loops; i++) {
                        if (arch_spin_trylock(&lock->raw_lock))
                                return;
                        __delay(1);
                }
                ...
        }

    by never calling arch_spin_trylock() - resulting in the CPU locking up in an infinite loop inside __spin_lock_debug().

    Work around this by writing to loops_per_jiffy only once we have completed all the calibration decisions.

    Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Cc: <stable@kernel.org> (2.6.39-stable)
    --
    Better solutions (such as omitting the calibration for secondary CPUs, or arranging for calibrate_delay() to return the LPJ value and leave it to the caller to decide where to store it) are a possibility, but would be much more invasive into each architecture. I think this is the best solution for -rc and stable, but it should be revisited for the next merge window.

     init/calibrate.c | 14 ++++++++------
     1 files changed, 8 insertions(+), 6 deletions(-)

    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
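The failure mode in this commit message can be illustrated in user space. The following is a minimal sketch, not the kernel's code: `loops_per_jiffy_sim`, `count_trylock_attempts()`, `calibrate_publish_once()` and the `HZ` value are all hypothetical stand-ins. It shows that a zero loop budget means the lock is never even attempted, and that keeping intermediate calibration state in a local variable avoids exposing zero to readers.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100 /* assumed tick rate, for illustration only */

/* Shared value read by delay loops; stands in for loops_per_jiffy. */
static unsigned long loops_per_jiffy_sim = 4096;

/*
 * Count how many lock attempts one pass of the debug loop would make.
 * Mirrors "loops = loops_per_jiffy * HZ; for (i = 0; i < loops; i++)".
 */
static uint64_t count_trylock_attempts(unsigned long lpj)
{
    uint64_t loops = (uint64_t)lpj * HZ;
    uint64_t attempts = 0;

    for (uint64_t i = 0; i < loops; i++)
        attempts++; /* the real loop would try to take the lock here */
    return attempts;
}

/*
 * Hypothetical calibration step: work on a private copy and write the
 * shared value exactly once, after the decision is final, so that a
 * concurrent reader never observes an intermediate zero.
 */
static void calibrate_publish_once(unsigned long measured)
{
    unsigned long lpj = 0;      /* intermediate state stays private */

    lpj = measured;             /* stand-in for the calibration decisions */
    loops_per_jiffy_sim = lpj;  /* single write at the very end */
}
```

With a budget of zero, `count_trylock_attempts()` returns 0, which corresponds to the outer `for (;;)` spinning forever without calling the trylock.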
* serial: mrst_max3110: initialize waitqueue earlier (Mika Westerberg, 2011-06-23, 1 file, -1/+2)

    The driver went to initialize its waitqueue at the start of the main processing thread. However, it is possible that this thread is not scheduled on a CPU before the write function is called, which leads to the following error:

        BUG: spinlock bad magic on CPU#1, swapper/1
         lock: f5f3ebdc, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
        Pid: 1, comm: swapper Not tainted 3.0.0-rc2+ #67
        Call Trace:
         [<c1289663>] spin_bug+0xa3/0xf0
         [<c12897ad>] do_raw_spin_lock+0x7d/0x150
         [<c14963de>] _raw_spin_lock_irqsave+0x4e/0x60
         [<c102f2bb>] __wake_up+0x1b/0x50
         [<c12d3715>] serial_m3110_con_write+0x55/0x60
         [<c1041575>] __call_console_drivers+0x75/0x90
         [<c10415d9>] _call_console_drivers+0x49/0x80
         [<c1041baa>] console_unlock+0xca/0x1f0
         [<c10420ef>] vprintk+0x18f/0x4f0
         [<c14928a3>] printk+0x18/0x1a
         [<c1042730>] register_console+0x2e0/0x350
         [<c12d098e>] uart_add_one_port+0x33e/0x3d0
         [<c1485ba6>] serial_m3110_probe+0x1c2/0x1df
         [<c1303db7>] spi_drv_probe+0x17/0x20
         ...

    Fix this by initializing the waitqueue before the main thread is created.

    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Signed-off-by: Alan Cox <alan@linux.intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mrst_max3110: Change max missing message priority. (William Douglas, 2011-06-23, 1 file, -1/+1)

    Change print message to notice instead of error to clean up non critical messages showing on startup. The MAX3111 not being present is a normal path for end user systems.

    Signed-off-by: William Douglas <william.douglas@intel.com>
    [rebased on 3.0, switched to dev_dbg()]
    Signed-off-by: Alan Cox <alan@linux.intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 (Linus Torvalds, 2011-06-22, 4 files, -12/+11)
|\
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
      jfs: agstart field must be 64 bits
      JFS: Don't save agno in the inode
      jfs: Update agstart when resizing volume
      jfs: old_agsize should be 64 bits in jfs_extendfs
| * jfs: agstart field must be 64 bits (Dave Kleikamp, 2011-06-20, 1 file, -1/+1)

    The previous patch added the agstart field to jfs_ip, but declared it a long. We need to make sure it's 64 bits on every platform.

    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
| * JFS: Don't save agno in the inode (Dave Kleikamp, 2011-06-20, 3 files, -9/+9)

    Resizing the file system can result in an in-memory inode being remapped to a different aggregate group (AG). A cached AG number can cause problems when trying to free or allocate inodes. Instead, save the IAG's agstart address and calculate the agno when we need it.

    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
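The idea of deriving the AG number on demand instead of caching it can be sketched as follows. This is an illustrative model only, assuming that every AG spans a power-of-two number of blocks; `struct fs_geometry`, `l2agsize` and `agno_from_agstart()` are hypothetical names, not the JFS data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified geometry: every AG spans 2^l2agsize blocks. */
struct fs_geometry {
    unsigned int l2agsize; /* log2 of the AG size in blocks */
};

/*
 * Derive the AG number from a 64-bit starting block address. Because the
 * result is computed from the current geometry, it stays correct after a
 * resize changes the AG size, which a cached number would not.
 */
static inline uint32_t agno_from_agstart(int64_t agstart,
                                         const struct fs_geometry *geo)
{
    assert(agstart >= 0);
    return (uint32_t)(agstart >> geo->l2agsize);
}
```

For example, a block address of `1 << 20` maps to AG 16 when AGs hold 2^16 blocks, but to AG 4 once a resize has grown them to 2^18 blocks. Keeping the address in a 64-bit type also matters: an address such as `1 << 40` would not fit in a 32-bit `long`.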
| * jfs: Update agstart when resizing volume (Dave Kleikamp, 2011-06-20, 1 file, -2/+1)

    A comment indicates that the IAG's agstart does not need to be updated since it will always point to a block in the same aggregate group, but jfs_fsck isn't so forgiving and reports it as an error. I'm fixing this in jfsutils as well, so either a new kernel or new utilities will be sufficient to fix the problem.

    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
| * jfs: old_agsize should be 64 bits in jfs_extendfs (Dave Kleikamp, 2011-06-20, 1 file, -1/+1)

    Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
* | Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 (Linus Torvalds, 2011-06-22, 9 files, -75/+50)
|\ \
    * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
      PCI / PM: Block races between runtime PM and system sleep
      PM / Domains: Update documentation
      PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset
      PM: Fix async resume following suspend failure
      PM: Free memory bitmaps if opening /dev/snapshot fails
      PM: Rename dev_pm_info.in_suspend to is_prepared
      PM: Update documentation regarding sysdevs
      PM / Runtime: Update doc: usage count no longer incremented across system PM
| * | PCI / PM: Block races between runtime PM and system sleep (Rafael J. Wysocki, 2011-06-21, 1 file, -1/+3)

    After commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to succeed during system suspend) it is possible that a device resumed by the pm_runtime_resume(dev) in pci_pm_prepare() will be suspended immediately from a work item, timer function or otherwise, defeating the very purpose of calling pm_runtime_resume(dev) from there. To prevent that from happening it is necessary to increment the runtime PM usage counter of the device by replacing pm_runtime_resume() with pm_runtime_get_sync(). Moreover, the incremented runtime PM usage counter has to be decremented by the corresponding pci_pm_complete(), via pm_runtime_put_sync().

    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Cc: stable@kernel.org
    Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
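The difference between resuming a device and resuming it while holding a usage reference can be shown with a toy model. This is a user-space sketch under stated assumptions, not the runtime PM core: `struct toy_dev` and all `toy_*` functions are hypothetical, and the only rule modelled is that a suspend request is refused while the usage counter is non-zero.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
    int usage_count; /* stands in for the runtime PM usage counter */
    bool suspended;
};

/* Resume only: nothing stops a later suspend request from succeeding. */
static void toy_resume(struct toy_dev *d)
{
    d->suspended = false;
}

/* Resume and take a usage reference, so the device stays active. */
static void toy_get_sync(struct toy_dev *d)
{
    d->usage_count++;
    d->suspended = false;
}

/* Drop the usage reference taken by toy_get_sync(). */
static void toy_put(struct toy_dev *d)
{
    if (d->usage_count > 0)
        d->usage_count--;
}

/* A suspend request from a work item or timer; refused while in use. */
static bool toy_request_suspend(struct toy_dev *d)
{
    if (d->usage_count > 0)
        return false;
    d->suspended = true;
    return true;
}
```

In this model, a suspend request right after `toy_resume()` succeeds, which is the race the commit describes, while the same request after `toy_get_sync()` is refused until the matching `toy_put()`.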
| * | PM / Domains: Update documentation (Rafael J. Wysocki, 2011-06-21, 1 file, -27/+14)

    Commit 4d27e9dcff00a6425d779b065ec8892e4f391661 (PM: Make power domain callbacks take precedence over subsystem ones) forgot to update the device power management documentation to take changes made by it into account. Correct that mistake.

    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
| * | PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset (Rafael J. Wysocki, 2011-06-21, 1 file, -2/+2)

    Commit 85eb8c8d0b0900c073b0e6f89979ac9c439ade1a (PM / Runtime: Generic clock manipulation rountines for runtime PM (v6)) converted the shmobile platform to using generic code for runtime PM clock management, but it changed the behavior for CONFIG_PM_RUNTIME unset incorrectly.

    Specifically, for CONFIG_PM_RUNTIME unset pm_runtime_clk_notify() should enable clocks for action equal to BUS_NOTIFY_BIND_DRIVER and it should disable them for action equal to BUS_NOTIFY_UNBOUND_DRIVER (instead of BUS_NOTIFY_ADD_DEVICE and BUS_NOTIFY_DEL_DEVICE, respectively). Make this function behave as appropriate.

    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Acked-by: Magnus Damm <damm@opensource.se>
| * | PM: Fix async resume following suspend failure (Alan Stern, 2011-06-21, 2 files, -2/+13)

    The PM core doesn't handle suspend failures correctly when it comes to asynchronously suspended devices. These devices are moved onto the dpm_suspended_list as soon as the corresponding async thread is started up, and they remain on the list even if they fail to suspend or the sleep transition is cancelled before they get suspended. As a result, when the PM core unwinds the transition, it tries to resume the devices even though they were never suspended.

    This patch (as1474) fixes the problem by adding a new "is_suspended" flag to dev_pm_info. Devices are resumed only if the flag is set.

    [rjw:
     * Moved the dev->power.is_suspended check into device_resume(), because we need to complete dev->power.completion and clear dev->power.is_prepared too for devices whose dev->power.is_suspended flags are unset.
     * Fixed __device_suspend() to avoid setting dev->power.is_suspended if async_error is different from zero.]

    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Cc: stable@kernel.org
| * | PM: Free memory bitmaps if opening /dev/snapshot fails (Michal Kubecek, 2011-06-21, 1 file, -1/+3)

    When opening the /dev/snapshot device, snapshot_open() creates memory bitmaps which are freed in snapshot_release(). But if any of the callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open() fails, snapshot_release() is never called and the bitmaps are not freed. The next attempt to open /dev/snapshot then triggers the BUG_ON() check in create_basic_memory_bitmaps(). This happens e.g. when the vmwatchdog module is active on s390x.

    Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Cc: stable@kernel.org
| * | PM: Rename dev_pm_info.in_suspend to is_prepared (Alan Stern, 2011-06-21, 4 files, -11/+15)

    This patch (as1473) renames the "in_suspend" field in struct dev_pm_info to "is_prepared", in preparation for an upcoming change. The new name is more descriptive of what the field really means.

    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Cc: stable@kernel.org
| * | PM: Update documentation regarding sysdevs (Rafael J. Wysocki, 2011-06-21, 1 file, -26/+0)

    The part of Documentation/power/devices.txt regarding sysdevs is not valid any more after commit 2e711c04dbbf7a7732a3f7073b1fc285d12b369d (PM: Remove sysdev suspend, resume and shutdown operations), so remove it.

    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
| * | PM / Runtime: Update doc: usage count no longer incremented across system PM (Kevin Hilman, 2011-06-21, 1 file, -5/+0)

    Commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to succeed during system suspend) removed usage count increment across system PM. Update doc to reflect this.

    Signed-off-by: Kevin Hilman <khilman@ti.com>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | | mm, hotplug: protect zonelist building with zonelists_mutex (David Rientjes, 2011-06-22, 1 file, -0/+2)

    Commit 959ecc48fc75 ("mm/memory_hotplug.c: fix building of node hotplug zonelist") does not protect the build_all_zonelists() call with zonelists_mutex as needed. This can lead to races in constructing zonelist ordering if a concurrent build is underway. Protecting this with lock_memory_hotplug() is insufficient since zonelists can be rebuilt through sysfs as well.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | mm, hotplug: fix error handling in mem_online_node() (David Rientjes, 2011-06-22, 1 file, -1/+1)

    The error handling in mem_online_node() is incorrect: hotadd_new_pgdat() returns NULL if the new pgdat could not have been allocated and a pointer to it otherwise.

    mem_online_node() should fail if hotadd_new_pgdat() fails, not the inverse. This fixes an issue when memoryless nodes are not onlined and their sysfs interface is not registered when their first cpu is brought up.

    The bug was introduced by commit cf23422b9d76 ("cpu/mem hotplug: enable CPUs online before local memory online") iow v2.6.35.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
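The class of bug described here, an inverted test on an allocator that returns NULL on failure, can be sketched in user space. The names `toy_new_pgdat()` and `toy_online_node()` and the `simulate_failure` switch are hypothetical and exist only for this illustration; they are not the kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_pgdat {
    int nid;
};

/* Hypothetical allocator: returns NULL on failure, a valid pointer otherwise. */
static struct toy_pgdat *toy_new_pgdat(int nid, int simulate_failure)
{
    struct toy_pgdat *p;

    if (simulate_failure)
        return NULL;
    p = malloc(sizeof(*p));
    if (p)
        p->nid = nid;
    return p;
}

/* Fail exactly when the allocation failed; an inverted test would do the opposite. */
static int toy_online_node(int nid, int simulate_failure)
{
    struct toy_pgdat *pgdat = toy_new_pgdat(nid, simulate_failure);

    if (!pgdat)
        return -ENOMEM;

    free(pgdat); /* toy only: a real caller would keep and register it */
    return 0;
}
```

With the test inverted, every successful allocation would be reported as `-ENOMEM` and every failure would fall through as success, which matches the symptom of nodes never coming online.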
* | | KEYS: Fix error handling in construct_key_and_link() (David Howells, 2011-06-21, 1 file, -1/+2)

    Fix error handling in construct_key_and_link().

    If construct_alloc_key() returns an error, it shouldn't pass out through the normal path as the key_serial() called by the kleave() statement will oops when it gets an error code in the pointer:

        BUG: unable to handle kernel paging request at ffffffffffffff84
        IP: [<ffffffff8120b401>] request_key_and_link+0x4d7/0x52f
        ..
        Call Trace:
         [<ffffffff8120b52c>] request_key+0x41/0x75
         [<ffffffffa00ed6e8>] cifs_get_spnego_key+0x206/0x226 [cifs]
         [<ffffffffa00eb0c9>] CIFS_SessSetup+0x511/0x1234 [cifs]
         [<ffffffffa00d9799>] cifs_setup_session+0x90/0x1ae [cifs]
         [<ffffffffa00d9c02>] cifs_get_smb_ses+0x34b/0x40f [cifs]
         [<ffffffffa00d9e05>] cifs_mount+0x13f/0x504 [cifs]
         [<ffffffffa00caabb>] cifs_do_mount+0xc4/0x672 [cifs]
         [<ffffffff8113ae8c>] mount_fs+0x69/0x155
         [<ffffffff8114ff0e>] vfs_kern_mount+0x63/0xa0
         [<ffffffff81150be2>] do_kern_mount+0x4d/0xdf
         [<ffffffff81152278>] do_mount+0x63c/0x69f
         [<ffffffff8115255c>] sys_mount+0x88/0xc2
         [<ffffffff814fbdc2>] system_call_fastpath+0x16/0x1b

    Signed-off-by: David Howells <dhowells@redhat.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
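The oops above comes from dereferencing a pointer that actually carries a negative error code. The following user-space sketch re-creates the error-pointer convention with hypothetical `toy_*` helpers (they imitate, but are not, the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`), and shows an accessor that checks before dereferencing. It illustrates the pattern only; the actual commit instead avoids the normal exit path when construct_alloc_key() fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095 /* assumed upper bound on encoded error codes */

/* Encode a negative errno value in a pointer. */
static inline void *toy_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error-encoded pointer. */
static inline long toy_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer lies in the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_key {
    int serial;
};

/* Safe accessor: never dereference a NULL or error-encoded pointer. */
static int toy_key_serial(const struct toy_key *key)
{
    if (key == NULL || toy_is_err(key))
        return 0;
    return key->serial;
}
```

An unchecked `key->serial` on an error-encoded pointer reads from an address just below the top of the address space, which is consistent with the faulting address in the trace.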
* | | MN10300: asm/uaccess.h needs to #include linux/kernel.h for might_sleep() (David Howells, 2011-06-21, 1 file, -0/+1)

    MN10300's asm/uaccess.h needs to #include linux/kernel.h to get might_sleep() otherwise it fails to build on MN10300 allyesconfig. This fails in a few places with messages like the following:

        In file included from security/keys/trusted.c:14:
        include/linux/uaccess.h: In function '__copy_from_user_nocache':
        include/linux/uaccess.h:52: error: implicit declaration of function 'might_sleep'

    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 (Linus Torvalds, 2011-06-21, 18 files, -66/+139)
|\ \ \
    * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
      NFS: Fix decode_secinfo_maxsz
      NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
      NFSv4.1: Fix some issues with pnfs_generic_pg_test
      NFSv4.1: file layout must consider pg_bsize for coalescing
      pnfs-obj: No longer needed to take an extra ref at add_device
      SUNRPC: Ensure the RPC client only quits on fatal signals
      NFSv4: Fix a readdir regression
      nfs4.1: mark layout as bad on error path in _pnfs_return_layout
      nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
      NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
      NFS: (d)printks should use %zd for ssize_t arguments
      NFSv4.1: fix break condition in pnfs_find_lseg
      nfs4.1: fix several problems with _pnfs_return_layout
      NFSv4.1: allow zero fh array in filelayout decode layout
      NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
      NFSv4.1: Fix a refcounting issue in the pNFS device id cache
      NFSv4.1: deprecate headerpadsz in CREATE_SESSION
      NFS41: do not update isize if inode needs layoutcommit
      NLM: Don't hang forever on NLM unlock requests
      NFS: fix umount of pnfs filesystems
| * | | NFS: Fix decode_secinfo_maxsz (Bryan Schumaker, 2011-06-21, 1 file, -1/+1)

    I initially did the calculation in bytes, and not words

    Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
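The bytes-versus-words distinction can be made concrete with a small helper. This sketch assumes, as the commit message implies, that the size constants are counted in 4-byte XDR words; `TOY_XDR_QUADLEN` and `toy_words_for()` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Round a byte count up to a count of 4-byte XDR words. */
#define TOY_XDR_QUADLEN(bytes) (((bytes) + 3) >> 2)

/* Number of 32-bit words needed to hold nbytes of XDR-encoded data. */
static size_t toy_words_for(size_t nbytes)
{
    return TOY_XDR_QUADLEN(nbytes);
}
```

A 16-byte field therefore contributes 4 to a word-based size, not 16; using the byte count where words are expected overstates the buffer requirement by a factor of four.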
| * | | NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test (Trond Myklebust, 2011-06-21, 1 file, -5/+16)

    And document what is going on there...

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: Fix some issues with pnfs_generic_pg_test (Trond Myklebust, 2011-06-21, 2 files, -5/+10)

    1. If the intention is to coalesce requests 'prev' and 'req' then we have to ensure at least that we have a layout starting at req_offset(prev).
    2. If we're only requesting a minimal layout of length desc->pg_count, we need to test the length actually returned by the server before we allow the coalescing to occur.
    3. We need to deal correctly with (pgio->lseg == NULL)
    4. Fixup the test guarding the pnfs_update_layout.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: file layout must consider pg_bsize for coalescing (Benny Halevy, 2011-06-20, 3 files, -3/+9)

    Otherwise we end up overflowing the rpc buffer size on the receive end.

    Signed-off-by: Benny Halevy <benny@tonian.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | pnfs-obj: No longer needed to take an extra ref at add_device (Boaz Harrosh, 2011-06-19, 1 file, -1/+0)

    Andy's last device_cache patches already take an extra reference on the newly inserted device_id. So we can remove it from obj-io. Without this patch the device_ids are leaked.

    Andy's patches are not in Linus's tree yet, so I'm not sure if they are scheduled for this kernel or the next. This patch should be added as part of these.

    CC: Andy Adamson <andros@netapp.com>
    Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | SUNRPC: Ensure the RPC client only quits on fatal signals (Trond Myklebust, 2011-06-17, 2 files, -3/+3)

    Fix a couple of instances where we were exiting the RPC client on arbitrary signals. We should only do so on fatal signals.

    Cc: stable@kernel.org
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4: Fix a readdir regression (Trond Myklebust, 2011-06-16, 1 file, -7/+7)

    Commit 7ebb9315 (NFS: use secinfo when crossing mountpoints) introduces a regression when decoding an NFSv4 readdir entry that sets the rdattr_error field. By treating the resulting value as if it is a decoding error, the current code may cause us to skip valid readdir entries.

    Reported-by: Andy Adamson <andros@netapp.com>
    Cc: stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | nfs4.1: mark layout as bad on error path in _pnfs_return_layout (Fred Isaman, 2011-06-15, 1 file, -0/+2)

    Signed-off-by: Fred Isaman <iisaman@netapp.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout (Fred Isaman, 2011-06-15, 1 file, -2/+2)

    mark_matching_lsegs_invalid could put the last ref to the layout, so the get_layout_hdr needs to be called first.

    Signed-off-by: Fred Isaman <iisaman@netapp.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path (Benny Halevy, 2011-06-15, 1 file, -0/+1)

    We always get a reference on the layout header and we rely on nfs4_layoutreturn_release to put it. If we hit an allocation error before starting the rpc proc we bail out early without dereferencing the layout header properly.

    Signed-off-by: Benny Halevy <benny@tonian.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFS: (d)printks should use %zd for ssize_t arguments (David Howells, 2011-06-15, 1 file, -1/+1)

    (d)printks should use %zd for ssize_t arguments not %ld, otherwise they might get a warning. I see the following with MN10300.

        fs/nfs/objlayout/objlayout.c: In function 'objlayout_read_done':
        fs/nfs/objlayout/objlayout.c:294: warning: format '%ld' expects type 'long int', but argument 3 has type 'ssize_t'

    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Trond Myklebust <Trond.Myklebust@netapp.com>
    cc: linux-nfs@vger.kernel.org
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
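The same format-specifier rule applies to user-space `printf`-family functions. A minimal sketch, where `format_status()` is a hypothetical helper written for this illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/*
 * Format a ssize_t with the 'z' length modifier. Using "%ld" instead is
 * only correct on platforms where ssize_t happens to be long.
 */
static int format_status(char *buf, size_t len, ssize_t status)
{
    return snprintf(buf, len, "status=%zd", status);
}
```

On an architecture where `ssize_t` is `int`, the `%ld` form draws exactly the kind of warning quoted in the commit message, while `%zd` matches the argument type everywhere.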
| * | | NFSv4.1: fix break condition in pnfs_find_lseg (Benny Halevy, 2011-06-15, 1 file, -1/+1)

    The break condition to skip out of the loop got broken when cmp_layout was changed.

    Essentially, we want to stop looking once we know no layout on the remainder of the list can match the first byte of the looked-up range.

    Reported-by: Peng Tao <peng_tao@emc.com>
    Signed-off-by: Benny Halevy <benny@tonian.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | nfs4.1: fix several problems with _pnfs_return_layout (Fred Isaman, 2011-06-15, 2 files, -7/+9)

    _pnfs_return_layout had the following problems:

    - it did not call pnfs_free_lseg_list on all paths
    - it unintentionally did a forgetful return when there was no outstanding io
    - it raced with concurrent LAYOUTGETS

    Signed-off-by: Fred Isaman <iisaman@netapp.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: allow zero fh array in filelayout decode layout (Andy Adamson, 2011-06-15, 1 file, -5/+10)

    Signed-off-by: Andy Adamson <andros@netapp.com>
    cc:stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: allow nfs_fhget to succeed with mounted on fileid (Andy Adamson, 2011-06-15, 3 files, -11/+36)

    Commit 28331a46d88459788c8fca72dbb0415cd7f514c9 "Ensure we request the ordinary fileid when doing readdirplus" changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when FATTR4_WORD1_MOUNTED_ON_FILEID was requested.

    Allow nfs_fhget to succeed with only a mounted on fileid when crossing a mountpoint or a referral.

    Ask for the fileid of the absent file system if mounted_on_fileid is not supported.

    Signed-off-by: Andy Adamson <andros@netapp.com>
    cc:stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: Fix a refcounting issue in the pNFS device id cache (Trond Myklebust, 2011-06-15, 1 file, -0/+1)

    When we add something to the global device id cache, we need to bump the reference count, so that the cache itself holds a reference.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv4.1: deprecate headerpadsz in CREATE_SESSION (Benny Halevy, 2011-06-15, 3 files, -9/+6)

    We don't support header padding yet so better off ditching it

    Reported-by: Sid Moore <learnmost@gmail.com>
    Signed-off-by: Benny Halevy <benny@tonian.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFS41: do not update isize if inode needs layoutcommit (Peng Tao, 2011-06-15, 1 file, -1/+2)

    nfs_update_inode will update isize if there are no queued pages. For pNFS, layoutcommit is supposed to change the file size on the server, the same effect as queued pages. nfs_update_inode may be called when dirty pages are written back (nfsi->npages==0) but layoutcommit is not sent, and it will change the client file size according to the server file size. The client then ends up losing what it has just written back in the pNFS path. So we should skip updating the client file size if the file needs layoutcommit.

    Signed-off-by: Peng Tao <peng_tao@emc.com>
    Cc: stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NLM: Don't hang forever on NLM unlock requests (Trond Myklebust, 2011-06-15, 4 files, -2/+13)

    If the NLM daemon is killed on the NFS server, we can currently end up hanging forever on an 'unlock' request, instead of aborting. Basically, if the rpcbind request fails, or the server keeps returning garbage, we really want to quit instead of retrying.

    Tested-by: Vasily Averin <vvs@sw.ru>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    Cc: stable@kernel.org
| * | | NFS: fix umount of pnfs filesystems (Weston Andros Adamson, 2011-06-15, 2 files, -5/+12)

    Unmounting a pnfs filesystem hangs using filelayout and possibly others. This fixes the use of the rcu protected node by making use of a new 'tmpnode' for the temporary purge list. Also, the spinlock shouldn't be held when calling synchronize_rcu().

    Signed-off-by: Weston Andros Adamson <dros@netapp.com>
    Signed-off-by: Andy Adamson <andros@netapp.com>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband (Linus Torvalds, 2011-06-21, 6 files, -22/+66)
|\ \ \ \
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
      IB/qib: Ensure that LOS and DFE are being turned off
      RDMA/cxgb4: Couple of abort fixes
      RDMA/cxgb4: Don't truncate MR lengths
      RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs
| * \ \ \ Merge branches 'cxgb4' and 'qib' into for-next (Roland Dreier, 2011-06-17, 2 files, -8/+23)
| |\ \ \ \
| | * | | | IB/qib: Ensure that LOS and DFE are being turned off (Mitko Haralanov, 2011-06-17, 2 files, -8/+23)

    Due to timing, it is possible for the LOS and DFE to remain on. This is due to the link progressing to LinkUP prior to the driver getting the first Status Changed interrupt. By expanding the conditions under which LOS is turned off and DFE timeout is being set, timing is no longer an issue.

    Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
    Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | | | | RDMA/cxgb4: Couple of abort fixes (Steve Wise, 2011-06-17, 2 files, -13/+38)

    - fix a race where the driver could end up sending a close_con_req after an abort_rpl. In c4iw_ep_disconnect(), send abort or close request with the ep mutex held.

    - fix a hang where driver fails to wake up when a connection is reset during a normal close. Wake up any waiters in the interrupt path, and correctly cleanup after rdma_fini() failures.

    Signed-off-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | | | | RDMA/cxgb4: Don't truncate MR lengths (Steve Wise, 2011-06-17, 1 file, -1/+1)

    Remove left-over code from T3 that limited MR sizes to 32b.

    Signed-off-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | | | | RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs (Steve Wise, 2011-06-17, 1 file, -0/+4)
| |/ / / /
    Memory allocated for user CQs gets rounded up to the next page boundary. And after rounding, we recalculate the resulting IQ depth and we need to make sure we don't exceed the HW limits. This bug can result in a much smaller CQ being allocated than was expected if the HW size field is exceeded, resulting in CQ overflow failures.

    Signed-off-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
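The round-up-then-recompute step described above can be sketched with plain arithmetic. This is an illustrative model under stated assumptions, not the cxgb4 driver: the page size, the `toy_*` function names and the parameters are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Round a byte count up to the next page boundary. */
static size_t toy_round_up_page(size_t bytes)
{
    return (bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE * TOY_PAGE_SIZE;
}

/*
 * Queue depth after the allocation has been rounded up to whole pages,
 * clamped so that it never exceeds the hardware limit.
 */
static size_t toy_cq_depth(size_t requested_entries, size_t entry_size,
                           size_t hw_max_entries)
{
    size_t memsize, depth;

    assert(entry_size > 0);
    memsize = toy_round_up_page(requested_entries * entry_size);
    depth = memsize / entry_size;
    if (depth > hw_max_entries)
        depth = hw_max_entries;
    return depth;
}
```

With 64-byte entries, a request for 65 entries rounds up to two pages and yields a depth of 128; without the final clamp, a value above the hardware limit would be written into a size field too small to hold it.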
* | | | | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (Linus Torvalds, 2011-06-21, 12 files, -265/+223)
|\ \ \ \ \
    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
      jbd2: Fix oops in jbd2_journal_remove_journal_head()
      jbd2: Remove obsolete parameters in the comments for some jbd2 functions
      ext4: fixed tracepoints cleanup
      ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
      ext4: Fix max file size and logical block counting of extent format file
      ext4: correct comments for ext4_free_blocks()
| * | | | | jbd2: Fix oops in jbd2_journal_remove_journal_head() (Jan Kara, 2011-06-13, 5 files, -122/+99)

    jbd2_journal_remove_journal_head() can oops when trying to access journal_head returned by bh2jh(). This is caused for example by the following race:

        TASK1                                   TASK2
        jbd2_journal_commit_transaction()
          ...
          processing t_forget list
            __jbd2_journal_refile_buffer(jh);
            if (!jh->b_transaction) {
              jbd_unlock_bh_state(bh);
                                                jbd2_journal_try_to_free_buffers()
                                                  jbd2_journal_grab_journal_head(bh)
                                                  jbd_lock_bh_state(bh)
                                                  __journal_try_to_free_buffer()
                                                  jbd2_journal_put_journal_head(jh)
              jbd2_journal_remove_journal_head(bh);

    jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not part of any transaction and thus frees journal_head before TASK1 gets to doing so. Note that even buffer_head can be released by try_to_free_buffers() after jbd2_journal_put_journal_head() which adds even larger opportunity for oops (but I didn't see this happen in reality).

    Fix the problem by making transactions hold their own journal_head reference (in b_jcount). That way we don't have to remove journal_head explicitly via jbd2_journal_remove_journal_head() and instead just remove journal_head when b_jcount drops to zero. The result of this is that [__]jbd2_journal_refile_buffer(), [__]jbd2_journal_unfile_buffer(), and __jbd2_journal_remove_checkpoint() can free journal_head which needs modification of a few callers. Also we have to be careful because once journal_head is removed, buffer_head might be freed as well. So we have to get our own buffer_head reference where it matters.

    Signed-off-by: Jan Kara <jack@suse.cz>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>