summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* superblock: introduce per-sb cache shrinker infrastructureDave Chinner2011-07-203-218/+71
| | | | | | | | | | | | | | | | | | | | | | | | With context based shrinkers, we can implement a per-superblock shrinker that shrinks the caches attached to the superblock. We currently have global shrinkers for the inode and dentry caches that split up into per-superblock operations via a coarse proportioning method that does not batch very well. The global shrinkers also have a dependency - dentries pin inodes - so we have to be very careful about how we register the global shrinkers so that the implicit call order is always correct. With a per-sb shrinker callout, we can encode this dependency directly into the per-sb shrinker, hence avoiding the need for strictly ordering shrinker registrations. We also have no need for any proportioning code for the shrinker subsystem already provides this functionality across all shrinkers. Allowing the shrinker to operate on a single superblock at a time means that we do less superblock list traversals and locking and reclaim should batch more effectively. This should result in less CPU overhead for reclaim and potentially faster reclaim of items from each filesystem. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* superblock: move pin_sb_for_writeback() to fs/super.cDave Chinner2011-07-203-27/+35
| | | | | | | | | | | | | | | The per-sb shrinker has the same requirement as the writeback threads of ensuring that the superblock is usable and pinned for the time it takes to run the work. Both need to take a passive reference to the sb, take a read lock on the s_umount lock and then only continue if an unmount is not in progress. pin_sb_for_writeback() does this exactly, so move it to fs/super.c and rename it to grab_super_passive() and exporting it via fs/internal.h for all the VFS code to be able to use. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* inode: move to per-sb LRU locksDave Chinner2011-07-202-14/+14
| | | | | | | | | | With the inode LRUs moving to per-sb structures, there is no longer a need for a global inode_lru_lock. The locking can be made more fine-grained by moving to a per-sb LRU lock, isolating the LRU operations of different filesytsems completely from each other. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* inode: Make unused inode LRU per superblockDave Chinner2011-07-202-11/+81
| | | | | | | | | | | | | | | | | | The inode unused list is currently a global LRU. This does not match the other global filesystem cache - the dentry cache - which uses per-superblock LRU lists. Hence we have related filesystem object types using different LRU reclaimation schemes. To enable a per-superblock filesystem cache shrinker, both of these caches need to have per-sb unused object LRU lists. Hence this patch converts the global inode LRU to per-sb LRUs. The patch only does rudimentary per-sb propotioning in the shrinker infrastructure, as this gets removed when the per-sb shrinker callouts are introduced later on. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* inode: convert inode_stat.nr_unused to per-cpu countersDave Chinner2011-07-201-5/+11
| | | | | | | | | Before we split up the inode_lru_lock, the unused inode counter needs to be made independent of the global inode_lru_lock. Convert it to per-cpu counters to do this. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)Al Viro2011-07-2015-94/+39
| | | | | | ... and simplify the living hell out of callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* deuglify squashfs_lookup()Al Viro2011-07-201-4/+1
| | | | | | | | d_splice_alias(NULL, dentry) is equivalent to d_add(dentry, NULL), NULL so no need for that if (inode) ... in there (or ERR_PTR(0), for that matter) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfsd4_list_rec_dir(): don't bother with reopening rec_fileAl Viro2011-07-201-31/+21
| | | | | | | just rewind it to the beginning before vfs_readdir() and be done with that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill useless checks for sb->s_op == NULLAl Viro2011-07-202-2/+1
| | | | | | never is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* btrfs: kill magical embedded struct superblockAl Viro2011-07-204-22/+29
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of pointless checks for dentry->sb == NULLAl Viro2011-07-201-1/+0
| | | | | | it never is... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Make ->d_sb assign-once and always non-NULLAl Viro2011-07-203-39/+47
| | | | | | | | | | | | | | New helper (non-exported, fs/internal.h-only): __d_alloc(sb, name). Allocates dentry, sets its ->d_sb to given superblock and sets ->d_op accordingly. Old d_alloc(NULL, name) callers are converted to that (all of them know what superblock they want). d_alloc() itself is left only for parent != NULl case; uses __d_alloc(), inserts result into the list of parent's children. Note that now ->d_sb is assign-once and never NULL *and* ->d_parent is never NULL either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* unexport kern_path_parent()Al Viro2011-07-201-1/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch vfs_path_lookup() to struct pathAl Viro2011-07-203-21/+20
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill lookup_create()Al Viro2011-07-201-36/+18
| | | | | | folded into the only caller (kern_path_create()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helpers: kern_path_create/user_path_createAl Viro2011-07-202-116/+87
| | | | | | | combination of kern_path_parent() and lookup_create(). Does *not* expose struct nameidata to caller. Syscalls converted to that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill LOOKUP_CONTINUEAl Viro2011-07-201-8/+3
| | | | | | LOOKUP_PARENT is equivalent to it now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfs: LOOKUP_{OPEN,CREATE,EXCL} is set only on the last stepAl Viro2011-07-201-4/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* cifs_lookup(): LOOKUP_OPEN is set only on the last componentAl Viro2011-07-201-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ceph: LOOKUP_OPEN is set only when it's the last componentAl Viro2011-07-201-1/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* jfs_ci_revalidate() is safe from RCU modeAl Viro2011-07-201-2/+0
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* LOOKUP_CREATE and LOOKUP_RENAME_TARGET can be set only on the last stepAl Viro2011-07-203-12/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* no need to check for LOOKUP_OPEN in ->create() instancesAl Viro2011-07-205-10/+10
| | | | | | | ... it will be set in nd->flag for all cases with non-NULL nd (i.e. when called from do_last()). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't pass nameidata to vfs_create() from ecryptfs_create()Al Viro2011-07-201-28/+5
| | | | | | | | | Instead of playing with removal of LOOKUP_OPEN, mangling (and restoring) nd->path, just pass NULL to vfs_create(). The whole point of what's being done there is to suppress any attempts to open file by underlying fs, which is what nd == NULL indicates. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* don't transliterate lower bits of ->intent.open.flags to FMODE_...Al Viro2011-07-206-30/+23
| | | | | | ->create() instances are much happier that way... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Don't pass nameidata when calling vfs_create() from mknod()Al Viro2011-07-201-1/+1
| | | | | | | All instances can cope with that now (and ceph one actually starts working properly). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fix mknod() on nfs4 (hopefully)Al Viro2011-07-201-12/+12
| | | | | | | | | a) check the right flags in ->create() (LOOKUP_OPEN, not LOOKUP_CREATE) b) default (!LOOKUP_OPEN) open_flags is O_CREAT|O_EXCL|FMODE_READ, not 0 c) lookup_instantiate_filp() should be done only with LOOKUP_OPEN; otherwise we need to issue CLOSE, lest we leak stateid on server. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nameidata_to_nfs_open_context() doesn't need nameidata, actually...Al Viro2011-07-201-6/+7
| | | | | | | just open flags; switched to passing just those and renamed to create_nfs_open_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfs_open_context doesn't need struct path eitherAl Viro2011-07-207-42/+40
| | | | | | just dentry, please... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfs4_opendata doesn't need struct path eitherAl Viro2011-07-201-23/+22
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nfs4_closedata doesn't need to mess with struct pathAl Viro2011-07-203-22/+21
| | | | | | | instead of path_get()/path_put(), we can just use nfs_sb_{,de}active() to pin the superblock down. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* cifs: fix the type of cifs_demultiplex_thread()Al Viro2011-07-201-2/+3
| | | | | | | | | ... and get rid of a bogus typecast, while we are at it; it's not just that we want a function returning int and not void, but cast to pointer to function taking void * and returning void would be (void (*)(void *)) and not (void *)(void *), TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ecryptfs_inode_permission() doesn't need to bail out on RCUAl Viro2011-07-201-2/+0
| | | | | | | ... now that inode_permission() can take MAY_NOT_BLOCK and handle it properly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* merge do_revalidate() into its only callerAl Viro2011-07-201-24/+18
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* no reason to keep exec_permission() separate nowAl Viro2011-07-201-41/+4
| | | | | | cache footprint alone makes it a bad idea... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* massage generic_permission() to treat directories on a separate pathAl Viro2011-07-201-4/+13
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: don't pass flags to exec_permission()Al Viro2011-07-201-10/+7
| | | | | | | pass mask instead; kill security_inode_exec_permission() since we can use security_inode_permission() instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: don't pass flags to ->permission()Al Viro2011-07-2027-50/+50
| | | | | | not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: don't pass flags to generic_permission()Al Viro2011-07-2015-18/+17
| | | | | | | redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: don't pass flags to ->check_acl()Al Viro2011-07-2023-27/+27
| | | | | | not used in the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()Al Viro2011-07-2013-15/+14
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ->permission() sanitizing: MAY_NOT_BLOCKAl Viro2011-07-202-3/+6
| | | | | | Duplicate the flags argument into mask bitmap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill check_acl callback of generic_permission()Al Viro2011-07-2018-35/+44
| | | | | | | its value depends only on inode and does not change; we might as well store it in ->i_op->check_acl and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* lockless get_write_access/deny_write_accessAl Viro2011-07-201-46/+0
| | | | | | new helpers: atomic_inc_unless_negative()/atomic_dec_unless_positive() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* move exec_permission() up to the rest of permission-related functionsAl Viro2011-07-201-34/+38
| | | | | | ... and convert the comment before it into linuxdoc form. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill file_permission() completelyAl Viro2011-07-201-18/+0
| | | | | | convert the last remaining caller to inode_permission() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handlingAl Viro2011-07-204-9/+14
| | | | | | | | new helper: would_dump(bprm, file). Checks if we are allowed to read the file and if we are not - sets ENFORCE_NODUMP. Exported, used in places that previously open-coded the same logics. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch path_init() to exec_permission()Al Viro2011-07-201-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch udf_ioctl() to inode_permission()Al Viro2011-07-201-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make exec_permission(dir) really equivalent to inode_permission(dir, MAY_EXEC)Al Viro2011-07-201-9/+9
| | | | | | | | | | | | | capability overrides apply only to the default case; if fs has ->permission() that does _not_ call generic_permission(), we have no business doing them. Moreover, if it has ->permission() that does call generic_permission(), we have no need to recheck capabilities. Besides, the capability overrides should apply only if we got EACCES from acl_permission_check(); any other value (-EIO, etc.) should be returned to caller, capabilities or not capabilities. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>