summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* mm: avoid false sharing of mm_counterKAMEZAWA Hiroyuki2010-03-067-15/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Considering the nature of per mm stats, it's the shared object among threads and can be a cache-miss point in the page fault path. This patch adds per-thread cache for mm_counter. RSS value will be counted into a struct in task_struct and synchronized with mm's one at events. Now, in this patch, the event is the number of calls to handle_mm_fault. Per-thread value is added to mm at each 64 calls. rough estimation with small benchmark on parallel thread (2threads) shows [before] 4.5 cache-miss/faults [after] 4.0 cache-miss/faults Anyway, the most contended object is mmap_sem if the number of threads grows. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: clean up mm_counterKAMEZAWA Hiroyuki2010-03-0612-101/+174
| | | | | | | | | | | | | | | | | | | Presently, per-mm statistics counter is defined by macro in sched.h This patch modifies it to - defined in mm.h as inlinf functions - use array instead of macro's name creation. This patch is for reducing patch size in future patch to modify implementation of per-mm counter. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* infiniband: use for_each_set_bit()Akinobu Mita2010-03-061-16/+5
| | | | | | | | | | | Replace open-coded loop with for_each_set_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bitops: rename for_each_bit() to for_each_set_bit()Akinobu Mita2010-03-0618-24/+26
| | | | | | | | | | | | | | | | | | | | Rename for_each_bit to for_each_set_bit in the kernel source tree. To permit for_each_clear_bit(), should that ever be added. The patch includes a macro to map the old for_each_bit() onto the new for_each_set_bit(). This is a (very) temporary thing to ease the migration. [akpm@linux-foundation.org: add temporary for_each_bit()] Suggested-by: Alexey Dobriyan <adobriyan@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* timbgpio: fix buildDavid Miller2010-03-061-0/+1
| | | | | | | | | Use of get_irq_chip_data() et al. requires including linux/irq.h Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Richard Röjfors <richard.rojfors@pelagicore.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix a dumb typo - use of & instead of &&Al Viro2010-03-061-1/+1
| | | | | | | | | We managed to lose O_DIRECTORY testing due to a stupid typo in commit 1f36f774b2 ("Switch !O_CREAT case to use of do_last()") Reported-by: Walter Sheets <w41ter@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'slab-for-linus' of ↵Linus Torvalds2010-03-057-260/+146
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'slab-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: SLUB: Fix per-cpu merge conflict failslab: add ability to filter slab caches slab: fix regression in touched logic dma kmalloc handling fixes slub: remove impossible condition slab: initialize unused alien cache entry as NULL at alloc_alien_cache(). SLUB: Make slub statistics use this_cpu_inc SLUB: this_cpu: Remove slub kmem_cache fields SLUB: Get rid of dynamic DMA kmalloc cache allocation SLUB: Use this_cpu operations in slub
| * SLUB: Fix per-cpu merge conflictStephen Rothwell2010-03-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The slab tree adds a percpu variable usage case (commit 9dfc6e68bfe6ee452efb1a4e9ca26a9007f2b864 "SLUB: Use this_cpu operations in slub"), but the percpu tree removes the prefixing of percpu variables (commit dd17c8f72993f9461e9c19250e3f155d6d99df22 "percpu: remove per_cpu__ prefix"), thus causing the following compilation error: CC mm/slub.o mm/slub.c: In function ‘alloc_kmem_cache_cpus’: mm/slub.c:2078: error: implicit declaration of function ‘per_cpu_var’ mm/slub.c:2078: warning: assignment makes pointer from integer without a cast make[1]: *** [mm/slub.o] Error 1 Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| *-----. Merge branches 'slab/cleanups', 'slab/failslab', 'slab/fixes' and ↵Pekka Enberg2010-03-047-260/+146
| |\ \ \ \ | | | | | | | | | | | | | | | | | | 'slub/percpu' into slab-for-linus
| | | | | * dma kmalloc handling fixesChristoph Lameter2010-01-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. We need kmalloc_percpu for all of the now extended kmalloc caches array not just for each shift value. 2. init_kmem_cache_nodes() must assume node 0 locality for statically allocated dma kmem_cache structures even after boot is complete. Reported-and-tested-by: Alex Chiang <achiang@hp.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | | * slub: remove impossible conditionDavid Rientjes2010-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `s' cannot be NULL if kmalloc_caches is not NULL. This conditional would trigger a NULL pointer on `s', anyway, since it is immediately derefernced if true. Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | | * SLUB: Make slub statistics use this_cpu_incChristoph Lameter2009-12-201-23/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this_cpu_inc() translates into a single instruction on x86 and does not need any register. So use it in stat(). We also want to avoid the calculation of the per cpu kmem_cache_cpu structure pointer. So pass a kmem_cache pointer instead of a kmem_cache_cpu pointer. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | | * SLUB: this_cpu: Remove slub kmem_cache fieldsChristoph Lameter2009-12-202-61/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the fields in struct kmem_cache_cpu that were used to cache data from struct kmem_cache when they were in different cachelines. The cacheline that holds the per cpu array pointer now also holds these values. We can cut down the struct kmem_cache_cpu size to almost half. The get_freepointer() and set_freepointer() functions that used to be only intended for the slow path now are also useful for the hot path since access to the size field does not require accessing an additional cacheline anymore. This results in consistent use of functions for setting the freepointer of objects throughout SLUB. Also we initialize all possible kmem_cache_cpu structures when a slab is created. No need to initialize them when a processor or node comes online. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | | * SLUB: Get rid of dynamic DMA kmalloc cache allocationChristoph Lameter2009-12-202-22/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dynamic DMA kmalloc cache allocation is troublesome since the new percpu allocator does not support allocations in atomic contexts. Reserve some statically allocated kmalloc_cpu structures instead. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | | * SLUB: Use this_cpu operations in slubChristoph Lameter2009-12-202-159/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using per cpu allocations removes the needs for the per cpu arrays in the kmem_cache struct. These could get quite big if we have to support systems with thousands of cpus. The use of this_cpu_xx operations results in: 1. The size of kmem_cache for SMP configuration shrinks since we will only need 1 pointer instead of NR_CPUS. The same pointer can be used by all processors. Reduces cache footprint of the allocator. 2. We can dynamically size kmem_cache according to the actual nodes in the system meaning less memory overhead for configurations that may potentially support up to 1k NUMA nodes / 4k cpus. 3. We can remove the diddle widdle with allocating and releasing of kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu alloc logic will do it all for us. Removes some portions of the cpu hotplug functionality. 4. Fastpath performance increases since per cpu pointer lookups and address calculations are avoided. V7-V8 - Convert missed get_cpu_slab() under CONFIG_SLUB_STATS Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | | * | slab: fix regression in touched logicNick Piggin2010-01-301-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When factoring common code into transfer_objects in commit 3ded175 ("slab: add transfer_objects() function"), the 'touched' logic got a bit broken. When refilling from the shared array (taking objects from the shared array), we are making use of the shared array so it should be marked as touched. Subsequently pulling an element from the cpu array and allocating it should also touch the cpu array, but that is taken care of after the alloc_done label. (So yes, the cpu array was getting touched = 1 twice). So revert this logic to how it worked in earlier kernels. This also affects the behaviour in __drain_alien_cache, which would previously 'touch' the shared array and now does not. I think it is more logical not to touch there, because we are pushing objects into the shared array rather than pulling them off. So there is no good reason to postpone reaping them -- if the shared array is getting utilized, then it will get 'touched' in the alloc path (where this patch now restores the touch). Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | | * | | failslab: add ability to filter slab cachesDmitry Monakhov2010-02-266-8/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allow to inject faults only for specific slabs. In order to preserve default behavior cache filter is off by default (all caches are faulty). One may define specific set of slabs like this: # mark skbuff_head_cache as faulty echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab # Turn on cache filter (off by default) echo 1 > /sys/kernel/debug/failslab/cache-filter # Turn on fault injection echo 1 > /sys/kernel/debug/failslab/times echo 1 > /sys/kernel/debug/failslab/probability Acked-by: David Rientjes <rientjes@google.com> Acked-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| | * | | | slab: initialize unused alien cache entry as NULL at alloc_alien_cache().Haicheng Li2010-01-111-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Comparing with existing code, it's a simpler way to use kzalloc_node() to ensure that each unused alien cache entry is NULL. CC: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | | | | Merge branch 'nfs-for-2.6.34' of ↵Linus Torvalds2010-03-0524-443/+642
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.34' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (44 commits) NFS: Remove requirement for inode->i_mutex from nfs_invalidate_mapping NFS: Clean up nfs_sync_mapping NFS: Simplify nfs_wb_page() NFS: Replace __nfs_write_mapping with sync_inode() NFS: Simplify nfs_wb_page_cancel() NFS: Ensure inode is always marked I_DIRTY_DATASYNC, if it has unstable pages NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is set NFS: Reduce the number of unnecessary COMMIT calls NFS: Add a count of the number of unstable writes carried by an inode NFS: Cleanup - move nfs_write_inode() into fs/nfs/write.c nfs41 fix NFS4ERR_CLID_INUSE for exchange id NFS: Fix an allocation-under-spinlock bug SUNRPC: Handle EINVAL error returns from the TCP connect operation NFSv4.1: Various fixes to the sequence flag error handling nfs4: renewd renew operations should take/put a client reference nfs41: renewd sequence operations should take/put client reference nfs: prevent backlogging of renewd requests nfs: kill renewd before clearing client minor version NFS: Make close(2) asynchronous when closing NFS O_DIRECT files NFS: Improve NFS iostat byte count accuracy for writes ...
| * \ \ \ \ \ Merge branch 'writeback-for-2.6.34' into nfs-for-2.6.34Trond Myklebust2010-03-055508-138110/+336593
| |\ \ \ \ \ \
| | * | | | | | NFS: Remove requirement for inode->i_mutex from nfs_invalidate_mappingTrond Myklebust2010-03-054-43/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Clean up nfs_sync_mappingTrond Myklebust2010-03-051-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the redundant call to filemap_write_and_wait(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Simplify nfs_wb_page()Trond Myklebust2010-03-052-98/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Replace __nfs_write_mapping with sync_inode()Trond Myklebust2010-03-053-49/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have correct COMMIT semantics in writeback_single_inode, we can reduce and simplify nfs_wb_all(). Also replace nfs_wb_nocommit() with a call to filemap_write_and_wait(), which doesn't need to hold the inode->i_mutex. With that done, we can eliminate nfs_write_mapping() altogether. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Simplify nfs_wb_page_cancel()Trond Myklebust2010-03-052-40/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In all cases we should be able to just remove the request and call cancel_dirty_page(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Ensure inode is always marked I_DIRTY_DATASYNC, if it has unstable pagesTrond Myklebust2010-03-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since nfs_scan_list() doesn't wait for locked pages, we have a race in which it is possible to end up with an inode that needs to send a COMMIT, but which does not have the I_DIRTY_DATASYNC flag set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Run COMMIT as an asynchronous RPC call when wbc->for_background is setTrond Myklebust2010-03-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Wu Fengguang <fengguang.wu@intel.com>
| | * | | | | | NFS: Reduce the number of unnecessary COMMIT callsTrond Myklebust2010-03-051-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the caller is doing a non-blocking flush, and there are still writebacks pending on the wire, we can usually defer the COMMIT call until those writes are done. Also ensure that we honour the wbc->nonblocking flag. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Add a count of the number of unstable writes carried by an inodeTrond Myklebust2010-03-053-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to know when we should do opportunistic commits of the unstable writes, when the VM is doing a background flush, we add a field to count the number of unstable writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| | * | | | | | NFS: Cleanup - move nfs_write_inode() into fs/nfs/write.cTrond Myklebust2010-03-053-20/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sole purpose of nfs_write_inode is to commit unstable writes, so move it into fs/nfs/write.c, and make nfs_commit_inode static. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41 fix NFS4ERR_CLID_INUSE for exchange idAndy Adamson2010-03-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Fix an allocation-under-spinlock bugTrond Myklebust2010-03-021-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sunrpc_cache_update() will always call detail->update() from inside the detail->hash_lock, so it cannot allocate memory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
| * | | | | | | SUNRPC: Handle EINVAL error returns from the TCP connect operationTrond Myklebust2010-03-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This can, for instance, happen if the user specifies a link local IPv6 address. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
| * | | | | | | NFSv4.1: Various fixes to the sequence flag error handlingTrond Myklebust2010-03-021-12/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that we change the EXCHANGE_ID verifier (i.e. clp->cl_boot_time) when we want to reset all state. This is mainly needed when the server tells us that it is revoking our open or lock stateids. Handle revoking of recallable state by expiring the delegations. Handle callback path issues by expiring the delegations and then resetting the session. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs4: renewd renew operations should take/put a client referenceAlexandros Batsakis2010-03-021-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | renewd sends RENEW requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: renewd sequence operations should take/put client referenceAlexandros Batsakis2010-03-021-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | renewd sends SEQUENCE requests to the NFS server in order to renew state. As the request is asynchronous, renewd should take a reference to the nfs_client to prevent concurrent umounts from freeing the session/client Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs: prevent backlogging of renewd requestsAlexandros Batsakis2010-03-022-21/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the renewd send queue gets backlogged (e.g., if the server goes down), we will keep filling the queue with periodic RENEW/SEQUENCE requests. This patch schedules a new renewd request if and only if the previous one returns (either success or failure) Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: moved nfs4_schedule_state_renewal() into separate nfs4_renew_release() and nfs41_sequence_release() callbacks to ensure correct behaviour on call setup failure] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs: kill renewd before clearing client minor versionAlexandros Batsakis2010-03-021-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | renewd should be synchronously killed before we destroy the session in nfs4_clear_minor_version Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [Trond.Myklebust@netapp.com: clean up to remove 'unused function warning when !CONFIG_NFS_V4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Make close(2) asynchronous when closing NFS O_DIRECT filesChuck Lever2010-02-101-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For NFSv2 and v3: O_DIRECT writes are always synchronous, and aren't cached, so nothing should be flushed when closing an NFS O_DIRECT file descriptor. Thus there are no write errors to report on close(2). In addition, there's no cached data to verify on the next open(2), so we don't need clean GETATTR results at close time to compare with. Thus, there's no need for the nfs_revalidate_inode() call when closing an NFS O_DIRECT file. This reduces the number of synchronous on-the-wire requests for a simple open-write-close of an NFS O_DIRECT file by roughly 20%. For NFSv4: Call nfs4_do_close() with wait set to zero when closing an NFS O_DIRECT file. The CLOSE will go on the wire, but the application won't wait for it to complete. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Improve NFS iostat byte count accuracy for writesChuck Lever2010-02-101-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bytes counted by the performance counters for NFS writes should reflect write and sync errors. If the write(2) system call reports an error, the bytes should not be counted. And, if the write is short, the actual number of bytes that was written should be counted, not the number of bytes that was requested. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Account for NFS bytes read via the splice APIChuck Lever2010-02-101-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bytes read via the splice API should be accounted for in the NFS performance statistics. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Fix byte accounting for generic NFS readsChuck Lever2010-02-101-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the NFS I/O counters count the number of bytes requested by applications, rather than the number of bytes actually read by the system calls. The number of bytes requested for reads is actually not that useful, because the value is usually a buffer size for reads. That is, that requested number is usually a maximum, and frequently doesn't reflect the actual number of bytes read. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | NFS: Proper accounting for NFS VFS callsChuck Lever2010-02-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nit: The VFSOPEN and VFSFLUSH counters are function call counters. Count every call to these routines. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41 do not allocate unused back channel pagesAndy Adamson2010-02-102-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamson <andros@netapp.com> [Trond.Myklebust@netapp.com: moved definition of svc_is_backchannel() into include/linux/sunrpc/bc_xprt.h.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: cleanup callback code to use __be32 typeAndy Adamson2010-02-102-22/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: clear NFS4CLNT_RECALL_SLOT bit on session resetAndy Adamson2010-02-101-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: fix nfs4_callback_recallslotAndy Adamson2010-02-101-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return NFS4_OK if target high slotid equals enforced high slotid. Fix nfs_client reference leak. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: resize slot table in resetAndy Adamson2010-02-101-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When session is reset, client can renegotiate slot table size. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: implement cb_recall_slotAndy Adamson2010-02-106-1/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drain the fore channel and reset the max_slots to the new value. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | | | | | nfs41: back channel drc minimal implementationAndy Adamson2010-02-102-12/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For now the back channel ca_maxresponsesize_cached is 0 and there is no backchannel DRC. Return NFS4ERR_REP_TOO_BIG_TO_CACHE when a cb_sequence cachethis is true. When it is false, return NFS4ERR_RETRY_UNCACHED_REP as the next operation error. Remember the replay error accross compound operation processing. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>