path: root/fs
Commit message    Author    Age    Files    Lines
* Merge branch 'locks' of git://linux-nfs.org/~bfields/linux    Linus Torvalds    2007-10-15    9    -131/+109
|\
| |     * 'locks' of git://linux-nfs.org/~bfields/linux:
| |       nfsd: remove IS_ISMNDLCK macro
| |       Rework /proc/locks via seq_files and seq_list helpers
| |       fs/locks.c: use list_for_each_entry() instead of list_for_each()
| |       NFS: clean up explicit check for mandatory locks
| |       AFS: clean up explicit check for mandatory locks
| |       9PFS: clean up explicit check for mandatory locks
| |       GFS2: clean up explicit check for mandatory locks
| |       Cleanup macros for distinguishing mandatory locks
| |       Documentation: move locks.txt in filesystems/
| |       locks: add warning about mandatory locking races
| |       Documentation: move mandatory locking documentation to filesystems/
| |       locks: Fix potential OOPS in generic_setlease()
| |       Use list_first_entry in locks_wake_up_blocks
| |       locks: fix flock_lock_file() comment
| |       Memory shortage can result in inconsistent flocks state
| |       locks: kill redundant local variable
| |       locks: reverse order of posix_locks_conflict() arguments
| * nfsd: remove IS_ISMNDLCK macro    J. Bruce Fields    2007-10-09    1    -7/+6
| |
| |     This macro is only used in one place; in this place it seems simpler to open-code it and move the comment to where it's used.
| |
| |     Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * Rework /proc/locks via seq_files and seq_list helpers    Pavel Emelyanov    2007-10-09    2    -80/+61
| |
| |     Currently /proc/locks is shown with a proc_read function, but its behavior is rather complex as it has to manually handle the current offset and buffer length. On the other hand, files that show objects from lists can be easily reimplemented using the sequential files and the seq_list_XXX() helpers.
| |
| |     This saves (as usual) 16 lines of code and more than 200 bytes from the .text section.
| |
| |     [akpm@linux-foundation.org: no externs in C]
| |     [akpm@linux-foundation.org: warning fixes]
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: "J. Bruce Fields" <bfields@fieldses.org>
| |     Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * fs/locks.c: use list_for_each_entry() instead of list_for_each()    Matthias Kaehlcke    2007-10-09    1    -10/+7
| |
| |     fs/locks.c: use list_for_each_entry() instead of list_for_each() in posix_locks_deadlock() and get_locks_status()
| |
| |     Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
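The difference between the two iteration styles can be sketched in userspace C. This is a minimal stand-in, not the kernel's `<linux/list.h>`: `list_init()`, `struct demo_lock` and the two `sum_owners_*()` functions are hypothetical names made up for illustration, and `__typeof__` is assumed to be available (it is a GCC/Clang extension).

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the structure that embeds a given list node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Old style: the cursor is a raw node, converted by hand in the loop body. */
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* New style: the cursor is already a pointer to the containing structure. */
#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);   \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical payload; the real struct file_lock has many more fields. */
struct demo_lock {
    int owner;
    struct list_head link;
};

static int sum_owners_old(struct list_head *head)
{
    struct list_head *tmp;
    int sum = 0;

    list_for_each(tmp, head) {
        struct demo_lock *fl = container_of(tmp, struct demo_lock, link);
        sum += fl->owner;
    }
    return sum;
}

static int sum_owners_new(struct list_head *head)
{
    struct demo_lock *fl;
    int sum = 0;

    list_for_each_entry(fl, head, link)
        sum += fl->owner;
    return sum;
}
```

Both functions visit the same entries; the second simply drops the temporary node pointer and the explicit conversion, which is the kind of saving the diffstat above suggests.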
| * NFS: clean up explicit check for mandatory locks    Pavel Emelyanov    2007-10-09    1    -2/+1
| |
| |     The __mandatory_lock(inode) macro makes the same check, but makes the code more readable.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
| |     Cc: "J. Bruce Fields" <bfields@fieldses.org>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * AFS: clean up explicit check for mandatory locks    Pavel Emelyanov    2007-10-09    1    -2/+1
| |
| |     The __mandatory_lock(inode) macro makes the same check, but makes the code more readable.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: David Howells <dhowells@redhat.com>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * 9PFS: clean up explicit check for mandatory locks    Pavel Emelyanov    2007-10-09    1    -1/+1
| |
| |     The __mandatory_lock(inode) macro makes the same check, but makes the code more readable.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: Eric Van Hensbergen <ericvh@gmail.com>
| |     Cc: Ron Minnich <rminnich@sandia.gov>
| |     Cc: Latchesar Ionkov <lucho@ionkov.net>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * GFS2: clean up explicit check for mandatory locks    Pavel Emelyanov    2007-10-09    1    -2/+2
| |
| |     The __mandatory_lock(inode) function makes the same check, but makes the code more readable.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: Steven Whitehouse <swhiteho@redhat.com>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * Cleanup macros for distinguishing mandatory locks    Pavel Emelyanov    2007-10-09    4    -13/+7
| |
| |     The combination of the S_ISGID bit set and the S_IXGRP bit unset is used to mark the inode as "mandatory lockable", and there's a macro for this check called MANDATORY_LOCK(inode). However, fs/locks.c and some filesystems still perform the explicit i_mode checking.
| |
| |     Besides, Andrew pointed out that this macro is itself buggy, as it dereferences the inode arg twice.
| |
| |     Convert this macro into a static inline function and switch its users to it, making the code shorter and more readable.
| |
| |     The __mandatory_lock() helper is to be used in places where the IS_MANDLOCK() for superblock is already known to be true.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
| |     Cc: "J. Bruce Fields" <bfields@fieldses.org>
| |     Cc: David Howells <dhowells@redhat.com>
| |     Cc: Eric Van Hensbergen <ericvh@gmail.com>
| |     Cc: Ron Minnich <rminnich@sandia.gov>
| |     Cc: Latchesar Ionkov <lucho@ionkov.net>
| |     Cc: Steven Whitehouse <swhiteho@redhat.com>
| |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
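The mode-bit test and the macro hazard described above can be illustrated in userspace C. This is only a sketch: it operates on a bare `mode_t` rather than a kernel inode, `MANDATORY_LOCK_MODE`, `mandatory_lock_mode()` and `fetch_mode()` are hypothetical names, and the macro is written so that its argument is visibly evaluated twice, which is the same class of problem the commit message mentions.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Macro style: the argument expression appears twice in the expansion. */
#define MANDATORY_LOCK_MODE(mode) \
    (((mode) & S_ISGID) && !((mode) & S_IXGRP))

/* Inline-function style: the argument is evaluated exactly once. */
static inline int mandatory_lock_mode(mode_t mode)
{
    /* setgid bit set, group-execute bit clear */
    return (mode & (S_ISGID | S_IXGRP)) == S_ISGID;
}

/* Hypothetical accessor with a side effect, used to count evaluations. */
static int fetch_count;

static mode_t fetch_mode(mode_t mode)
{
    fetch_count++;
    return mode;
}
```

Passing `fetch_mode(...)` to the macro bumps `fetch_count` twice when the setgid bit is set, while the inline function bumps it once. With an argument that has side effects or is expensive to compute, that difference matters.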
| * locks: Fix potential OOPS in generic_setlease()    Pavel Emelyanov    2007-10-09    1    -9/+12
| |
| |     This code is run under lock_kernel(), which is dropped during sleeping operations, so the following race is possible:
| |
| |     CPU1:                                   CPU2:
| |       vfs_setlease();                         vfs_setlease();
| |         lock_kernel();
| |                                                 lock_kernel(); /* spin */
| |         generic_setlease():
| |           ...
| |           for (before = ...)
| |           /* here we found some lease after
| |            * which we will insert the new one
| |            */
| |           fl = locks_alloc_lock();
| |           /* go to sleep in this allocation and
| |            * drop the BKL
| |            */
| |                                               generic_setlease():
| |                                                 ...
| |                                                 for (before = ...)
| |                                                 /* here we find the "before" pointing
| |                                                  * at the one we found on CPU1
| |                                                  */
| |                                                 ->fl_change(my_before, arg);
| |                                                   lease_modify();
| |                                                     locks_free_lock();
| |                                                     /* and we freed it */
| |                                                 ...
| |                                               unlock_kernel();
| |           locks_insert_lock(before, fl);
| |           /* OOPS! We have just tried to add the lease
| |            * at the tail of already removed one
| |            */
| |
| |     Similar races are already handled in other code: all the allocations are performed before any checks/updates.
| |
| |     Thanks to Kamalesh Babulal for testing and for a bug report on an earlier version.
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| |     Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
| * Use list_first_entry in locks_wake_up_blocks    Pavel Emelyanov    2007-10-09    1    -1/+3
| |
| |     This routine deletes all the elements from the list with the "while (!list_empty())" loop, and we already have a list_first_entry() macro to help it look nicer :)
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| * locks: fix flock_lock_file() comment    J. Bruce Fields    2007-10-09    1    -2/+1
| |
| |     This comment wasn't updated when lease support was added, and it makes essentially the same mistake that the code made before a recent bugfix.
| |
| |     Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * Memory shortage can result in inconsistent flocks state    Pavel Emelyanov    2007-10-09    1    -4/+9
| |
| |     When flock_lock_file() is called to change the flock from F_RDLCK to F_WRLCK or vice versa, the existing flock can be removed without appropriate warning. Look:
| |
| |         for_each_lock(inode, before) {
| |                 struct file_lock *fl = *before;
| |                 if (IS_POSIX(fl))
| |                         break;
| |                 if (IS_LEASE(fl))
| |                         continue;
| |                 if (filp != fl->fl_file)
| |                         continue;
| |                 if (request->fl_type == fl->fl_type)
| |                         goto out;
| |                 found = 1;
| |                 locks_delete_lock(before); <<<<<< !
| |                 break;
| |         }
| |
| |     If the subsequent locks_alloc_lock() fails after this point, the return code will be -ENOMEM, but the existing lock has already been removed.
| |
| |     It is a known feature that such "re-locking" is not atomic, but in the racy case the file should stay locked (although by some other process), whereas in this case the file ends up unlocked.
| |
| |     The proposal is to prepare the lock in advance, leaving no chance of failure in the later code.
| |
| |     Found while making the flocks pid-namespaces aware.
| |
| |     (Note: Thanks to Reuben Farrelly for finding a bug in an earlier version of this patch.)
| |
| |     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
| |     Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| |     Cc: Reuben Farrelly <reuben-linuxkernel@reub.net>
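The "prepare the lock in advance" idea can be sketched in userspace C. This is an illustration under simplifying assumptions, not the kernel code: a single pointer slot stands in for the inode's lock list, and `struct flock_entry`, `entry_alloc()`, `relock()` and the `simulate_enomem` switch are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>

enum flock_type { FLOCK_READ, FLOCK_WRITE };

/* Hypothetical, heavily simplified stand-in for a flock entry. */
struct flock_entry {
    enum flock_type type;
};

/* Hypothetical switch that lets a caller simulate memory shortage. */
static int simulate_enomem;

static struct flock_entry *entry_alloc(enum flock_type type)
{
    struct flock_entry *fl;

    if (simulate_enomem)
        return NULL;
    fl = malloc(sizeof(*fl));
    if (fl)
        fl->type = type;
    return fl;
}

/*
 * Replace the entry held in *slot with one of the requested type.
 * Returns 0 on success or -ENOMEM on allocation failure; on failure
 * the existing entry is left in place.
 */
static int relock(struct flock_entry **slot, enum flock_type type)
{
    struct flock_entry *new_fl;

    if (*slot && (*slot)->type == type)
        return 0;               /* already held with this type */

    /* Everything that can fail happens before any state is changed. */
    new_fl = entry_alloc(type);
    if (!new_fl)
        return -ENOMEM;

    /* From here on nothing can fail, so the swap cannot be left half done. */
    free(*slot);
    *slot = new_fl;
    return 0;
}
```

With the allocation moved ahead of the removal, a failed upgrade from read to write leaves the original read entry in the slot instead of leaving the file unlocked.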
| * locks: kill redundant local variable    J. Bruce Fields    2007-10-09    1    -1/+1
| |
| |     There's no need for another variable local to this loop; we can use the variable (of the same name!) already declared at the top of the function, and not used till later (at which point it's initialized, so this is safe).
| |
| |     Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
| * locks: reverse order of posix_locks_conflict() arguments    J. Bruce Fields    2007-10-09    1    -1/+1
| |
| |     The first argument to posix_locks_conflict() is meant to be a lock request, and the second a lock from an inode's lock request. It doesn't really make a difference which order you call them in, since the only asymmetric test in posix_locks_conflict() is the check whether the second argument is a posix lock--and every caller already does that check for some reason.
| |
| |     But we may as well fix posix_test_lock() to call posix_locks_conflict() with the arguments in the same order as everywhere else.
| |
| |     Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
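Why the argument order is almost, but not quite, irrelevant can be shown with a small userspace model. This is a sketch under assumptions rather than the kernel function: `struct range_lock`, its fields and `demo_locks_conflict()` are hypothetical, and the conflict rule used (different owners, overlapping ranges, at least one write lock) is the usual reader/writer rule, not a transcription of fs/locks.c.

```c
#include <stdbool.h>

enum range_type { RANGE_READ, RANGE_WRITE };

/* Hypothetical, simplified stand-in for a byte-range lock. */
struct range_lock {
    enum range_type type;
    long start;     /* first byte covered */
    long end;       /* last byte covered, inclusive */
    int owner;      /* hypothetical owner id */
    bool is_posix;  /* false for other lock flavours */
};

static bool ranges_overlap(const struct range_lock *a,
                           const struct range_lock *b)
{
    return a->start <= b->end && b->start <= a->end;
}

/* First argument: the lock being requested. Second: a lock already held. */
static bool demo_locks_conflict(const struct range_lock *request,
                                const struct range_lock *held)
{
    /* The only asymmetric test: just the held lock is checked. */
    if (!held->is_posix)
        return false;

    /* Everything below treats both arguments alike. */
    if (request->owner == held->owner)
        return false;
    if (!ranges_overlap(request, held))
        return false;
    return request->type == RANGE_WRITE || held->type == RANGE_WRITE;
}
```

If every caller has already filtered out non-POSIX locks, swapping the arguments cannot change the result, which matches the commit's remark that the fix is about consistency rather than behavior.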
* | Merge git://git.linux-nfs.org/pub/linux/nfs-2.6    Linus Torvalds    2007-10-15    26    -846/+787
|\ \
| | |     * git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits)
| | |       NFSv4: Fix a typo in nfs_inode_reclaim_delegation
| | |       NFS: Add a boot parameter to disable 64 bit inode numbers
| | |       NFS: nfs_refresh_inode should clear cache_validity flags on success
| | |       NFS: Fix a connectathon regression in NFSv3 and NFSv4
| | |       NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
| | |       SUNRPC: Don't call xprt_release in call refresh
| | |       SUNRPC: Don't call xprt_release() if call_allocate fails
| | |       SUNRPC: Fix buggy UDP transmission
| | |       [23/37] Clean up duplicate includes in
| | |       [2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static
| | |       SUNRPC: Use correct type in buffer length calculations
| | |       SUNRPC: Fix default hostname created in rpc_create()
| | |       nfs: add server port to rpc_pipe info file
| | |       NFS: Get rid of some obsolete macros
| | |       NFS: Simplify filehandle revalidation
| | |       NFS: Ensure that nfs_link() returns a hashed dentry
| | |       NFS: Be strict about dentry revalidation when doing exclusive create
| | |       NFS: Don't zap the readdir caches upon error
| | |       NFS: Remove the redundant nfs_reval_fsid()
| | |       NFSv3: Always use directory post-op attributes in nfs3_proc_lookup
| | |       ...
| | |
| | |     Fix up trivial conflict due to sock_owned_by_user() cleanup manually in net/sunrpc/xprtsock.c
| * | NFSv4: Fix a typo in nfs_inode_reclaim_delegation    Trond Myklebust    2007-10-11    1    -1/+3
| | |
| | |     We were intending to put the previous instance of delegation->cred before setting a new one.
| | |
| | |     Thanks to David Howells for spotting this.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Add a boot parameter to disable 64 bit inode numbers    Trond Myklebust    2007-10-09    2    -2/+28
| | |
| | |     This boot parameter will allow legacy 32-bit applications which call stat() to continue to function even if the NFSv3/v4 server uses 64-bit inode numbers.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
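The commit message does not say how a 64-bit inode number is reduced for legacy callers. One common approach, shown here purely as an assumption, is to fold the upper half into the lower half with XOR. `user_ino()` and its `allow_64bit` flag are hypothetical names standing in for whatever the boot parameter controls.

```c
#include <stdint.h>

/*
 * Return the inode number to report to user space.
 * allow_64bit is a hypothetical flag modelling the boot parameter:
 * non-zero passes the server's file ID through unchanged, zero folds
 * it into 32 bits so that legacy stat() callers can represent it.
 */
static uint64_t user_ino(uint64_t fileid, int allow_64bit)
{
    if (allow_64bit)
        return fileid;

    /* XOR the high 32 bits into the low 32 bits. */
    return (uint32_t)fileid ^ (uint32_t)(fileid >> 32);
}
```

Folding is lossy: two distinct 64-bit file IDs can map to the same 32-bit value, so the folded number is only suitable for reporting, not for identifying the file on the wire.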
| * | NFS: nfs_refresh_inode should clear cache_validity flags on success    Trond Myklebust    2007-10-09    1    -18/+17
| | |
| | |     If the cached attributes match the ones supplied in the fattr, then assume we've revalidated the inode.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fix a connectathon regression in NFSv3 and NFSv4    Trond Myklebust    2007-10-09    2    -7/+13
| | |
| | |     We're failing basic test6 against Linux servers because they lack a correct change attribute. The fix is to assume that we always want to invalidate the readdir caches when we call update_changeattr and/or nfs_post_op_update_inode on a directory.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode    Trond Myklebust    2007-10-09    2    -5/+3
| | |
| | |     nfs_post_op_update_inode() is really only meant to be used if we expect the inode and its attributes to have changed in some way.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Get rid of some obsolete macros    Trond Myklebust    2007-10-09    2    -3/+3
| | |
| | |     - NFS_READTIME, NFS_CHANGE_ATTR are completely unused.
| | |     - Inline the few remaining uses of NFS_ATTRTIMEO, and remove.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Simplify filehandle revalidation    Trond Myklebust    2007-10-09    1    -0/+1
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Ensure that nfs_link() returns a hashed dentry    Trond Myklebust    2007-10-09    1    -1/+2
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Be strict about dentry revalidation when doing exclusive create    Trond Myklebust    2007-10-09    1    -15/+14
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Don't zap the readdir caches upon error    Trond Myklebust    2007-10-09    1    -2/+0
| | |
| | |     If necessary, the caches will get zapped under normal revalidation.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Remove the redundant nfs_reval_fsid()    Trond Myklebust    2007-10-09    1    -15/+0
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv3: Always use directory post-op attributes in nfs3_proc_lookup    Trond Myklebust    2007-10-09    1    -2/+1
| | |
| | |     LOOKUP returns the directory post-op attributes whether or not the operation was successful.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv4: Fix nfs_atomic_open() to set the verifier on negative dentries too    Trond Myklebust    2007-10-09    2    -9/+6
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv4: Use NFSv2/v3 rules for negative dentries in nfs_open_revalidate    Trond Myklebust    2007-10-09    1    -1/+5
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv4: Don't revalidate the directory in nfs_atomic_lookup()    Trond Myklebust    2007-10-09    1    -8/+0
| | |
| | |     Why bother, since the call to nfs4_atomic_open() will do it for us.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Optimise nfs_lookup_revalidate()    Trond Myklebust    2007-10-09    1    -7/+8
| | |
| | |     We don't need to call nfs_revalidate_inode() on the directory if we already know that the verifiers don't match.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Reset nfsi->last_updated only if the attribute changed    Trond Myklebust    2007-10-09    1    -5/+12
| | |
| | |     Otherwise set it to nfsi->read_cache_jiffies in order to prevent jiffy wraparound issues.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
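The "jiffy wraparound" concern comes from comparing timestamps taken from a free-running tick counter that eventually overflows. A minimal userspace sketch of the usual wrap-safe comparison follows; `tick_after()` and `naive_after()` are hypothetical names, and the cast assumes the common two's-complement conversion from `unsigned long` to `long`.

```c
#include <stdbool.h>

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
static bool naive_after(unsigned long a, unsigned long b)
{
    return a > b;
}

/*
 * Wrap-safe comparison: true if tick count a is later than b, as long
 * as the two values are less than half the counter range apart.
 */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

The caveat in the comment is the relevant part: a stored timestamp that is never refreshed can drift more than half the counter range behind the current tick count, at which point even the wrap-safe test flips. Keeping stored timestamps reasonably recent avoids that.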
| * | NFS: Remove nfs_begin_data_update/nfs_end_data_update    Trond Myklebust    2007-10-09    6    -64/+1
| | |
| | |     The lower level routines in fs/nfs/proc.c, fs/nfs/nfs3proc.c and fs/nfs/nfs4proc.c should already be dealing with the revalidation issues.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Remove NFS_I(inode)->data_updates    Trond Myklebust    2007-10-09    1    -20/+1
| | |
| | |     We have no more users...
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: NFS_CACHEINV() should not test for nfs_caches_unstable()    Trond Myklebust    2007-10-09    1    -1/+1
| | |
| | |     The fact that we're in the process of modifying the inode does not mean that we should not invalidate the attribute and data caches. The defensive thing is to always invalidate when we're confronted with inode mtime/ctime or change_attribute updates that we do not immediately recognise.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Remove bogus nfs_mark_for_revalidate() in nfs_lookup    Trond Myklebust    2007-10-09    1    -6/+0
| | |
| | |     The parent of the newly materialised dentry has just been revalidated...
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: don't cache the verifer across ->lookup() calls    Trond Myklebust    2007-10-09    1    -6/+2
| | |
| | |     If the ->lookup() call causes the directory verifier to change, then there is still no need to use the old verifier, since our dentry has been verified.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: nfs_post_op_update_inode don't update cache_change_attribute    Trond Myklebust    2007-10-09    1    -11/+7
| | |
| | |     If nfs_post_op_update_inode fails because the server didn't return any attributes, then we let the subsequent inode revalidation update cache_change_attribute.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Don't revalidate dentries on directory size or ctime changes    Trond Myklebust    2007-10-09    1    -4/+1
| | |
| | |     We only need to look at the mtime changes...
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Don't set cache_change_attribute in nfs_revalidate_mapping    Trond Myklebust    2007-10-09    1    -4/+1
| | |
| | |     The attribute revalidation code will already have taken care of resetting nfsi->cache_change_attribute.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fix a bug in nfs_open_revalidate()    Trond Myklebust    2007-10-09    1    -1/+1
| | |
| | |     We want to set the verifier when the call to nfs4_open_revalidate() _succeeds_.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Don't hash the negative dentry when optimising for an O_EXCL open    Trond Myklebust    2007-10-09    2    -3/+5
| | |
| | |     We don't want to leave an unverified hashed negative dentry if the exclusive create fails to complete.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: nfs_instantiate() should set the dentry verifier    Trond Myklebust    2007-10-09    1    -3/+1
| | |
| | |     That will also allow us to remove the calls in mknod and mkdir. In addition it will ensure that symlinks set it correctly.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Ensure nfs_instantiate() invalidates the parent dir on error    Trond Myklebust    2007-10-09    1    -8/+15
| | |
| | |     Also ensure that it drops the dentry in this case.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fix nfs_verify_change_attribute()    Trond Myklebust    2007-10-09    1    -1/+1
| | |
| | |     We don't care about whether or not some other process on our client is changing the directory while we're in nfs_lookup_revalidate(), because the dcache will take care of ensuring local atomicity.
| | |
| | |     We can therefore remove the test for nfs_caches_unstable().
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fake up 'wcc' attributes to prevent cache invalidation after write    Trond Myklebust    2007-10-09    4    -3/+37
| | |
| | |     NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls. In NFSv3, returning wcc data is optional. In all cases, we want to prevent the client from invalidating our cached data whenever ->write_done() attempts to update the inode attributes.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Remove bogus check of cache_change_attribute in nfs_update_inode    Trond Myklebust    2007-10-09    1    -12/+3
| | |
| | |     Remove the bogus 'data_stable' check in nfs_update_inode. The cache_change_attribute tells you if the directory changed on the server, and should have nothing to do with the file length.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode()    Trond Myklebust    2007-10-09    1    -10/+4
| | |
| | |     For one thing, the test NFS_ATTRTIMEO() == 0 makes no sense: we're testing whether or not the cache timeout length is zero, which is totally unrelated to the issue of whether or not we trust the file staleness.
| | |
| | |     Secondly, we do not want to retry the GETATTR once a file has been declared stale by the server: we rather want to discard that inode as soon as possible, since there are broken servers still in use out there that reuse filehandles on new files.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Fix atime revalidation in read()    Trond Myklebust    2007-10-09    4    -6/+6
| | |
| | |     NFSv3 will correctly update atime on a read() call, so there is no need to set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode() fails.
| | |
| | |     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>