linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	kill boilerplate around posix_acl_chmod_masq()	Al Viro	2011-07-25	14	-168/+139
\| \| \| \| \| \| \| \| \|	new helper: posix_acl_chmod(&acl, gfp, mode). Replaces acl with modified clone or with NULL if that has failed; returns 0 or -ve on error. All callers of posix_acl_chmod_masq() switched to that - they'd been doing exactly the same thing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	reiserfs: cache negative ACLs for v1 stat format	Christoph Hellwig	2011-07-25	2	-9/+5
\| \| \| \| \| \| \| \| \| \|	Always set up a negative ACL cache entry if the inode can't have ACLs. That behaves much better than doing this check inside ->check_acl. Also remove the left over MAY_NOT_BLOCK check. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	xfs: cache negative ACLs if there is no attribute fork	Christoph Hellwig	2011-07-25	2	-12/+5
\| \| \| \| \| \| \| \| \|	Always set up a negative ACL cache entry if the inode doesn't have an attribute fork. That behaves much better than doing this check inside ->check_acl. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	9p: do no return 0 from ->check_acl without actually checking	Christoph Hellwig	2011-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	If we do not want to use ACLs we at least need to perform normal Unix permission checks. From the comment I'm not quite sure that's what is intended, but if 0p wants to do permission checks entirely on the server it needs to do so in ->permission, not in ->check_acl. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	vfs: move ACL cache lookup into generic code	Linus Torvalds	2011-07-25	12	-69/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This moves logic for checking the cached ACL values from low-level filesystems into generic code. The end result is a streamlined ACL check that doesn't need to load the inode->i_op->check_acl pointer at all for the common cached case. The filesystems also don't need to check for a non-blocking RCU walk case in their acl_check() functions, because that is all handled at a VFS layer. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	CIFS: Fix oops while mounting with prefixpath	Pavel Shilovsky	2011-07-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit fec11dd9a0109fe52fd631e5c510778d6cbff6cc caused a regression when we have already mounted //server/share/a and want to mount //server/share/a/b. The problem is that lookup_one_len calls __lookup_hash with nd pointer as NULL. Then __lookup_hash calls do_revalidate in the case when dentry exists and we end up with NULL pointer deference in cifs_d_revalidate: if (nd->flags & LOOKUP_RCU) return -ECHILD; Fix this by checking nd for NULL. Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	xfs: Fix wrong return value of xfs_file_aio_write	Markus Trippelsdorf	2011-07-25	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fsync prototype change commit 02c24a82187d accidentally overwrote the ssize_t return value of xfs_file_aio_write with 0 for SYNC type writes. Fix this by checking if an error occured when calling xfs_file_fsync and only change the return value in this case. In addition xfs_file_fsync actually returns a normal negative error, so fix this, too. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fix devtmpfs race	Al Viro	2011-07-25	1	-1/+2
\| \| \| \| \| \| \| \|	After we's done complete(&req->done), there's nothing to prevent the scope containing req from being gone and req overwritten by any kind of junk. So we must read req->next before that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	caam: don't pass bogus S_IFCHR to debugfs_create_...()	Al Viro	2011-07-24	1	-13/+13
\| \| \| \| \| \|	it will be replaced with S_IFREG anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	get rid of create_proc_entry() abuses - proc_mkdir() is there for purpose	Al Viro	2011-07-24	7	-13/+9
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	asus-wmi: ->is_visible() can't return negative	Al Viro	2011-07-24	1	-1/+1
\| \| \| \| \| \|	It's mode_t; return 0 (no access) on error. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fix jffs2 ACLs on big-endian with 16bit mode_t	Al Viro	2011-07-24	4	-5/+5
\| \| \| \| \| \| \| \| \|	casting int * to mode_t * is not a good thing - on a lot of big-endian architectures mode_t happens to be smaller than int and there it breaks quite spectaculary... Fucked-up-by: commit cfc8dc6f6f69ede939e09c2af06a01adee577285 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	9p: close ACL leaks	Al Viro	2011-07-24	3	-15/+22
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ocfs2_init_acl(): fix a leak	Al Viro	2011-07-24	1	-0/+1
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	VFS : mount lock scalability for internal mounts	Tim Chen	2011-07-24	7	-4/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For a number of file systems that don't have a mount point (e.g. sockfs and pipefs), they are not marked as long term. Therefore in mntput_no_expire, all locks in vfs_mount lock are taken instead of just local cpu's lock to aggregate reference counts when we release reference to file objects. In fact, only local lock need to have been taken to update ref counts as these file systems are in no danger of going away until we are ready to unregister them. The attached patch marks file systems using kern_mount without mount point as long term. The contentions of vfs_mount lock is now eliminated. Before un-registering such file system, kern_unmount should be called to remove the long term flag and make the mount point ready to be freed. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2011-07-22	235	-1948/+2517
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits) vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp isofs: Remove global fs lock jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory fix IN_DELETE_SELF on overwriting rename() on ramfs et.al. mm/truncate.c: fix build for CONFIG_BLOCK not enabled fs:update the NOTE of the file_operations structure Remove dead code in dget_parent() AFS: Fix silly characters in a comment switch d_add_ci() to d_splice_alias() in "found negative" case as well simplify gfs2_lookup() jfs_lookup(): don't bother with . or .. get rid of useless dget_parent() in btrfs rename() and link() get rid of useless dget_parent() in fs/btrfs/ioctl.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers drivers: fix up various ->llseek() implementations fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek Ext4: handle SEEK_HOLE/SEEK_DATA generically Btrfs: implement our own ->llseek fs: add SEEK_HOLE and SEEK_DATA flags reiserfs: make reiserfs default to barrier=flush ... Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new shrinker callout for the inode cache, that clashed with the xfs code to start the periodic workers later.
\| *	vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp	Konstantin Khlebnikov	2011-07-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace unclear (struct dentry ) to (struct file ) typecast with ERR_CAST() macro. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	isofs: Remove global fs lock	Jan Kara	2011-07-22	5	-12/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sbi->s_mutex isn't needed for isofs at all so we can just remove it. Generally, since isofs is always mounted read-only, filesystem structure cannot change under us. So buffer_head contents stays constant after it's filled in. That leaves us with possible changes of global data structures. Superblock changes only during filesystem mount (even remount does not change it), inodes are only filled in during reading from disk. So there are no changes of these structures to bother about. Arguments why sbi->s_mutex can be removed at each place: isofs_readdir: Accesses sb, inode, filp, local variables => s_mutex not needed isofs_lookup: Protected by directory's i_mutex. Accesses sb, inode, dentry, local variables => s_mutex not needed rock_ridge_symlink_readpage: Protected by page lock. Accesses sb, inode, local variables => s_mutex not needed. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory	Al Viro	2011-07-22	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't generate IN_DELETE_SELF on victim of overwriting rename() if it happens to be a directory. Trivially fixed by doing to ->i_nlink what we do ->pino_nlink a couple of lines later in jffs2_rename(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.	Al Viro	2011-07-22	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On ramfs and other simple_rename() users IN_DELETE_SELF is not generated for victim of overwriting rename() if it's is a directory. Works on most of the local filesystems and really trivial to fix... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	mm/truncate.c: fix build for CONFIG_BLOCK not enabled	Randy Dunlap	2011-07-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix build error when CONFIG_BLOCK is not enabled by providing a stub inode_dio_wait() function. mm/truncate.c:612: error: implicit declaration of function 'inode_dio_wait' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs:update the NOTE of the file_operations structure	Wanlong Gao	2011-07-20	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Big kernel lock had been removed and setlease now use the lock_flocks() to hold a special spin lock file_lock_lock by Matthew. So just remove the out-of-date NOTE. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	Remove dead code in dget_parent()	Al Viro	2011-07-20	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \|	->d_parent is never NULL... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	AFS: Fix silly characters in a comment	David Howells	2011-07-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix silly characters in a comment in AFS code (some weird characters replaced the word 'flag' some point way back). Reported-by: viro@ZenIV.linux.org.uk Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	switch d_add_ci() to d_splice_alias() in "found negative" case as well	Al Viro	2011-07-20	1	-19/+5
\| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	simplify gfs2_lookup()	Al Viro	2011-07-20	1	-11/+3
\| \| \| \| \| \| \| \| \| \| \| \|	d_splice_alias() will DTRT when given NULL or ERR_PTR Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	jfs_lookup(): don't bother with . or ..	Al Viro	2011-07-20	1	-24/+15
\| \| \| \| \| \| \| \| \| \| \| \|	they'll never be passed to ->lookup() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	get rid of useless dget_parent() in btrfs rename() and link()	Al Viro	2011-07-20	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \|	->d_parent is locked and stable there... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	get rid of useless dget_parent() in fs/btrfs/ioctl.c	Al Viro	2011-07-20	1	-12/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	both callers there have dentry->d_parent stabilized by the fact that their caller had obtained dentry from lookup_one_len() and had not dropped ->i_mutex on parent since then. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers	Josef Bacik	2011-07-20	71	-164/+462
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Btrfs needs to be able to control how filemap_write_and_wait_range() is called in fsync to make it less of a painful operation, so push down taking i_mutex and the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some file systems can drop taking the i_mutex altogether it seems, like ext3 and ocfs2. For correctness sake I just pushed everything down in all cases to make sure that we keep the current behavior the same for everybody, and then each individual fs maintainer can make up their mind about what to do from there. Thanks, Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	drivers: fix up various ->llseek() implementations	Josef Bacik	2011-07-20	4	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix up a few ->llseek() implementations that won't deal with SEEK_HOLE/SEEK_DATA properly. Make them future proof so that if we ever add new options they will return -EINVAL. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek	Josef Bacik	2011-07-20	7	-12/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This converts everybody to handle SEEK_HOLE/SEEK_DATA properly. In some cases we just return -EINVAL, in others we do the normal generic thing, and in others we're simply making sure that the properly due-dilligence is done. For example in NFS/CIFS we need to make sure the file size is update properly for the SEEK_HOLE and SEEK_DATA case, but since it calls the generic llseek stuff itself that is all we have to do. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	Ext4: handle SEEK_HOLE/SEEK_DATA generically	Josef Bacik	2011-07-20	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since Ext4 has its own lseek we need to make sure it handles SEEK_HOLE/SEEK_DATA. For now just do the same thing that is done in the generic case, somebody else can come along and make it do fancy things later. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	Btrfs: implement our own ->llseek	Josef Bacik	2011-07-20	2	-1/+150
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to handle SEEK_HOLE/SEEK_DATA we need to implement our own llseek. Basically for the normal SEEK_*'s we will just defer to the generic helper, and for SEEK_HOLE/SEEK_DATA we will use our fiemap helper to figure out the nearest hole or data. Currently this helper doesn't check for delalloc bytes for prealloc space, so for now treat prealloc as data until that is fixed. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: add SEEK_HOLE and SEEK_DATA flags	Josef Bacik	2011-07-20	3	-4/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags. Turns out using fiemap in things like cp cause more problems than it solves, so lets try and give userspace an interface that doesn't suck. We need to match solaris here, and the definitions are o If /whence/ is SEEK_HOLE, the offset of the start of the next hole greater than or equal to the supplied offset is returned. The definition of a hole is provided near the end of the DESCRIPTION. o If /whence/ is SEEK_DATA, the file pointer is set to the start of the next non-hole file region greater than or equal to the supplied offset. So in the generic case the entire file is data and there is a virtual hole at the end. That means we will just return i_size for SEEK_HOLE and will return the same offset for SEEK_DATA. This is how Solaris does it so we have to do it the same way. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	reiserfs: make reiserfs default to barrier=flush	Christoph Hellwig	2011-07-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the default reiserfs mount option to barrier=flush. Based on a patch from Jeff Mahoney in the SuSE tree. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	ext3: make ext3 mount default to barrier=1	Christoph Hellwig	2011-07-20	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch turns on barriers by default for ext3. mount -o barrier=0 will turn them off. Based on a patch from Chris Mason in the SuSE tree. Signed-off-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Ted Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	don't open-code parent_ino() in assorted ->readdir()	Al Viro	2011-07-20	3	-3/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	minix_getattr(): don't bother with ->d_parent	Al Viro	2011-07-20	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	we can find superblock easier, TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	coda_venus_readdir(): use offsetof()	Al Viro	2011-07-20	1	-2/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	arm: don't create useless copies to pass into debugfs_create_dir()	Al Viro	2011-07-20	2	-12/+5
\| \| \| \| \| \| \| \| \| \| \| \|	its first argument is const char * and it's really not modified... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	switch assorted clock drivers to debugfs_remove_recursive()	Al Viro	2011-07-20	6	-41/+13
\| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: seq_file - add event counter to simplify poll() support	Kay Sievers	2011-07-20	6	-43/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Moving the event counter into the dynamically allocated 'struc seq_file' allows poll() support without the need to allocate its own tracking structure. All current users are switched over to use the new counter. Requested-by: Andrew Morton akpm@linux-foundation.org Acked-by: NeilBrown <neilb@suse.de> Tested-by: Lucas De Marchi lucas.demarchi@profusion.mobi Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: move inode_dio_done to the end_io handler	Christoph Hellwig	2011-07-20	4	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For filesystems that delay their end_io processing we should keep our i_dio_count until the the processing is done. Enable this by moving the inode_dio_done call to the end_io handler if one exist. Note that the actual move to the workqueue for ext4 and XFS is not done in this patch yet, but left to the filesystem maintainers. At least for XFS it's not needed yet either as XFS has an internal equivalent to i_dio_count. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: simplify the blockdev_direct_IO prototype	Christoph Hellwig	2011-07-20	10	-29/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simple filesystems always pass inode->i_sb_bdev as the block device argument, and never need a end_io handler. Let's simply things for them and for my grepping activity by dropping these arguments. The only thing not falling into that scheme is ext4, which passes and end_io handler without needing special flags (yet), but given how messy the direct I/O code there is use of __blockdev_direct_IO in one instead of two out of three cases isn't going to make a large difference anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: always maintain i_dio_count	Christoph Hellwig	2011-07-20	3	-24/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Maintain i_dio_count for all filesystems, not just those using DIO_LOCKING. This these filesystems to also protect truncate against direct I/O requests by using common code. Right now the only non-DIO_LOCKING filesystem that appears to do so is XFS, which uses an opencoded variant of the i_dio_count scheme. Behaviour doesn't change for filesystems never calling inode_dio_wait. For ext4 behaviour changes when using the dioread_nonlock option, which previously was missing any protection between truncate and direct I/O reads. For ocfs2 that handcrafted i_dio_count manipulations are replaced with the common code now enable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: move inode_dio_wait calls into ->setattr	Christoph Hellwig	2011-07-20	12	-3/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio referenes from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	rw_semaphore: remove up/down_read_non_owner	Christoph Hellwig	2011-07-20	2	-26/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that the last users is gone these can be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: kill i_alloc_sem	Christoph Hellwig	2011-07-20	13	-53/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	i_alloc_sem is a rather special rw_semaphore. It's the last one that may be released by a non-owner, and it's write side is always mirrored by real exclusion. It's intended use it to wait for all pending direct I/O requests to finish before starting a truncate. Replace it with a hand-grown construct: - exclusion for truncates is already guaranteed by i_mutex, so it can simply fall way - the reader side is replaced by an i_dio_count member in struct inode that counts the number of pending direct I/O requests. Truncate can't proceed as long as it's non-zero - when i_dio_count reaches non-zero we wake up a pending truncate using wake_up_bit on a new bit in i_flags - new references to i_dio_count can't appear while we are waiting for it to read zero because the direct I/O count always needs i_mutex (or an equivalent like XFS's i_iolock) for starting a new operation. This scheme is much simpler, and saves the space of a spinlock_t and a struct list_head in struct inode (typically 160 bits on a non-debug 64-bit system). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: simplify handling of zero sized reads in __blockdev_direct_IO	Christoph Hellwig	2011-07-20	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject zero sized reads as soon as we know our I/O length, and don't borther with locks or allocations that might have to be cleaned up otherwise. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>