linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	vfs: Fix absolute RCU path walk failures due to uninitialized seq number	Tim Chen	2011-04-15	1	-0/+1
	During RCU walk in path_lookupat and path_openat, the rcu lookup frequently failed if looking up an absolute path, because when root directory was looked up, seq number was not properly set in nameidata. We dropped out of RCU walk in nameidata_drop_rcu due to mismatch in directory entry's seq number. We reverted to slow path walk that needs to take references.

	With the following patch, I saw a 50% increase in an exim mail server benchmark throughput on a 4-socket Nehalem-EX system.

	Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
	Reviewed-by: Andi Kleen <ak@linux.intel.com>
	Cc: stable@kernel.org (v2.6.38)
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Fix common misspellings	Lucas De Marchi	2011-03-31	1	-1/+1
	Fixes generated by 'codespell' and manually reviewed.

	Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
*	vfs - check non-mountpoint dentry might block in __follow_mount_rcu()	Ian Kent	2011-03-24	1	-5/+18
	When following a mount in rcu-walk mode we must check if the incoming dentry is telling us it may need to block, even if it isn't actually a mountpoint.

	Signed-off-by: Ian Kent <raven@themaw.net>
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6	Linus Torvalds	2011-03-23	1	-2/+5
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
	  deal with races in /proc//{syscall,stack,personality}
	  proc: enable writing to /proc/pid/mem
	  proc: make check_mem_permission() return an mm_struct on success
	  proc: hold cred_guard_mutex in check_mem_permission()
	  proc: disable mem_write after exec
	  mm: implement access_remote_vm
	  mm: factor out main logic of access_process_vm
	  mm: use mm_struct to resolve gate vma's in __get_user_pages
	  mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
	  mm: arch: make in_gate_area take an mm_struct instead of a task_struct
	  mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
	  x86: mark associated mm when running a task in 32 bit compatibility mode
	  x86: add context tag to mark mm when running a task in 32-bit compatibility mode
	  auxv: require the target to be tracable (or yourself)
	  close race in /proc//environ
	  report errors in /proc//map* sanely
	  pagemap: close races with suid execve
	  make sessionid permissions in /proc//task/ match those in /proc/*
	  fix leaks in path_lookupat()

	Fix up trivial conflicts in fs/proc/base.c
| *	fix leaks in path_lookupat()	Al Viro	2011-03-23	1	-2/+5
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* |	userns: rename is_owner_or_cap to inode_owner_or_capable	Serge E. Hallyn	2011-03-23	1	-1/+1
	And give it a kernel-doc comment.

	[akpm@linux-foundation.org: btrfs changed in linux-next]
	Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
	Cc: "Eric W. Biederman" <ebiederm@xmission.com>
	Cc: Daniel Lezcano <daniel.lezcano@free.fr>
	Acked-by: David Howells <dhowells@redhat.com>
	Cc: James Morris <jmorris@namei.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* |	userns: userns: check user namespace for task->file uid equivalence checks	Serge E. Hallyn	2011-03-23	1	-5/+16
	Cheat for now and say all files belong to init_user_ns. Next step will be to let superblocks belong to a user_ns, and derive inode_userns(inode) from inode->i_sb->s_user_ns. Finally we'll introduce more flexible arrangements.

	Changelog:
	  Feb 15: make is_owner_or_cap take const struct inode
	  Feb 23: make is_owner_or_cap bool

	[akpm@linux-foundation.org: coding-style fixes]
	Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
	Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
	Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
	Acked-by: David Howells <dhowells@redhat.com>
	Cc: James Morris <jmorris@namei.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	lose 'mounting_here' argument in ->d_manage()	Al Viro	2011-03-18	1	-4/+3
	it's always false...

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	don't pass 'mounting_here' flag to follow_down()	Al Viro	2011-03-18	1	-2/+2
	it's always false now

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fix follow_link() breakage	Al Viro	2011-03-16	1	-4/+3
	commit 574197e0de46a8a4db5c54ef7b65e43ffa8873a7 had a missing piece, breaking the loop detection ;-/

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	tidy the trailing symlinks traversal up	Al Viro	2011-03-15	1	-45/+26
	* pull the handling of current->total_link_count into __do_follow_link()
	* put the common "do ->put_link() if needed and path_put() the link" stuff into a helper (put_link(nd, link, cookie))
	* rename __do_follow_link() to follow_link(), while we are at it

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Turn resolution of trailing symlinks iterative everywhere	Al Viro	2011-03-15	1	-54/+50
	The last remaining place (resolution of nested symlink) converted to the loop of the same kind we have in path_lookupat() and path_openat().

	Note that we still do have a recursion in pathname resolution; can't avoid it, really. However, it's strictly for nested symlinks now - i.e. ones in the middle of a pathname.

	link_path_walk() has lost the tail now - it always walks everything except the last component.

	do_follow_link() renamed to nested_symlink() and moved down.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
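
The iterative handling of a trailing symlink described above can be illustrated with a small userspace toy. This is only a sketch and not kernel code: the `toy_link` table, the `toy_resolve_trailing()` name and the hop limit of 40 are assumptions made for the example. It shows the shape of the idea: a loop that keeps replacing the last component with its link target, bounded by a hop counter, so no recursion is needed for the trailing symlink.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "filesystem": each entry maps a name to a link target. */
struct toy_link {
	const char *name;
	const char *target;
};

/* Assumed hop limit for this toy; chosen only for illustration. */
#define TOY_MAX_LINKS 40

/*
 * Resolve a trailing symlink iteratively.
 * Returns 0 and stores the final (non-link) name in *out,
 * or -ELOOP if more than TOY_MAX_LINKS links were followed.
 */
int toy_resolve_trailing(const struct toy_link *tab, size_t n,
			 const char *name, const char **out)
{
	int hops = 0;

	for (;;) {
		const char *target = NULL;

		/* Look the current name up in the link table. */
		for (size_t i = 0; i < n; i++) {
			if (strcmp(tab[i].name, name) == 0) {
				target = tab[i].target;
				break;
			}
		}
		if (!target) {
			/* Not a symlink: resolution is finished. */
			*out = name;
			return 0;
		}
		if (++hops > TOY_MAX_LINKS)
			return -ELOOP;
		/* Follow the link and go around the loop again. */
		name = target;
	}
}
```

With a table containing a -> b and b -> c, resolving "a" ends at "c" after two passes through the loop; a link that points to itself ends in -ELOOP once the hop limit is exceeded.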
*	simplify link_path_walk() tail	Al Viro	2011-03-15	1	-7/+1
	Now that link_path_walk() is called without LOOKUP_PARENT only from do_follow_link(), we can simplify the checks in last component handling. First of all, checking if we'd arrived to a directory is not needed - the caller will check it anyway. And LOOKUP_FOLLOW is guaranteed to be there, since we only get to that place with nd->depth > 0.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Make trailing symlink resolution in path_lookupat() iterative	Al Viro	2011-03-15	1	-10/+53
	Now the only caller of link_path_walk() that does not pass LOOKUP_PARENT is do_follow_link()

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	update nd->inode in __do_follow_link() instead of after do_follow_link()	Al Viro	2011-03-15	1	-3/+2
	... and note that we only need to do it for LAST_BIND symlinks

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	pull handling of one pathname component into a helper	Al Viro	2011-03-15	1	-68/+55
	new helper: walk_component(). Handles everything except symlinks; returns negative on error, 0 on success and 1 on symlinks we decided to follow. Drops out of RCU mode on such symlinks.

	link_path_walk() and do_last() switched to using that.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH	Aneesh Kumar K.V	2011-03-15	1	-4/+16
	We don't want to allow creation of private hardlinks by different applications using the fd passed to them via SCM_RIGHTS. So limit the null relative name usage in linkat syscall to CAP_DAC_READ_SEARCH.

	Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
*	Allow O_PATH for symlinks	Al Viro	2011-03-15	1	-6/+19
	At that point we can do almost nothing with them. They can be opened with O_PATH, we can manipulate such descriptors with dup(), etc. and we can see them in /proc//{fd,fdinfo}/.

	We can't (and won't be able to) follow /proc//fd/ symlinks for those; there's simply not enough information for pathname resolution to go on from such point - to resolve a symlink we need to know which directory it lives in.

	We will be able to do useful things with them after the next commit, though - readlinkat() and fchownat() will be possible to use with dfd being an O_PATH-opened symlink and empty relative pathname. Combined with open_by_handle() it'll give us a way to do readlink-by-handle and lchown-by-handle without messing with more redundant syscalls.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	New kind of open files - "location only".	Al Viro	2011-03-15	1	-1/+1
	New flag for open(2) - O_PATH. Semantics:
	* pathname is resolved, but the file itself is _NOT_ opened as far as filesystem is concerned.
	* almost all operations on the resulting descriptors shall fail with -EBADF. Exceptions are:
	  1) operations on descriptors themselves (i.e. close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD), fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD), fcntl(fd, F_SETFD, ...))
	  2) fcntl(fd, F_GETFL), for a common non-destructive way to check if descriptor is open
	  3) "dfd" arguments of ...at(2) syscalls, i.e. the starting points of pathname resolution
	* closing such descriptor does NOT affect dnotify or posix locks.
	* permissions are checked as usual along the way to file; no permission checks are applied to the file itself. Of course, giving such thing to syscall will result in permission checks (at the moment it means checking that starting point of ...at() is a directory and caller has exec permissions on it).

	fget() and fget_light() return NULL on such descriptors; use of fget_raw() and fget_raw_light() is needed to get them. That protects existing code from dealing with those things.

	There are two things still missing (they come in the next commits): one is handling of symlinks (right now we refuse to open them that way; see the next commit for semantics related to those) and another is descriptor passing via SCM_RIGHTS datagrams.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
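
From userspace, the semantics listed above can be probed with a short sketch like the following. It is an illustration only, assuming a Linux kernel and C library that support O_PATH; the helper names `o_path_read_errno()` and `o_path_getfl_works()` are made up for this example, and the fallback value for O_PATH is the generic Linux one (some architectures differ).

```c
#define _GNU_SOURCE /* O_PATH is a Linux extension */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fallback in case the headers were included without _GNU_SOURCE;
 * this is the asm-generic value and is an assumption on other arches. */
#ifndef O_PATH
#define O_PATH 010000000
#endif

/*
 * Hypothetical helper: open `path` with O_PATH, try to read one byte,
 * and return the errno of that read (0 if the read unexpectedly
 * succeeded, -1 if the open itself failed).
 */
int o_path_read_errno(const char *path)
{
	char byte;
	int err = 0;
	int fd = open(path, O_PATH);

	if (fd < 0)
		return -1;
	if (read(fd, &byte, 1) < 0)
		err = errno;
	close(fd);
	return err;
}

/*
 * Hypothetical helper: return 1 if fcntl(fd, F_GETFL) still works on
 * an O_PATH descriptor and reports O_PATH among the flags, else 0.
 */
int o_path_getfl_works(const char *path)
{
	int flags;
	int fd = open(path, O_PATH);

	if (fd < 0)
		return 0;
	flags = fcntl(fd, F_GETFL);
	close(fd);
	return flags >= 0 && (flags & O_PATH) != 0;
}
```

On a kernel with this change, reading through the descriptor is expected to fail with EBADF, while F_GETFL remains usable as the non-destructive "is it open" check mentioned in the list.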
*	fs: Don't allow to create hardlink for deleted file	Aneesh Kumar K.V	2011-03-15	1	-1/+5
	Add inode->i_nlink == 0 check in VFS. Some of the file systems do this internally. A followup patch will remove those instances. This is needed to ensure that with link by handle we don't allow creating a hardlink of an unlinked file. The check also prevents a race between unlink and link.

	Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	New AT_... flag: AT_EMPTY_PATH	Al Viro	2011-03-14	1	-10/+19
	For name_to_handle_at(2) we'll want both ...at()-style syscall that would be usable for non-directory descriptors (with empty relative pathname). Introduce new flag (AT_EMPTY_PATH) to deal with that and corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to deal with the latter.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
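
As a userspace illustration of the flag, the sketch below calls fstatat() with an empty relative pathname and AT_EMPTY_PATH, so the call operates on the descriptor itself. It assumes a Linux system whose kernel and C library support AT_EMPTY_PATH; the helper names `stat_by_fd()` and `empty_path_matches_stat()` are hypothetical.

```c
#define _GNU_SOURCE /* AT_EMPTY_PATH is a Linux extension */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Fallback in case the headers were included without _GNU_SOURCE. */
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif

/*
 * Hypothetical helper: stat whatever `fd` refers to by passing an
 * empty relative pathname. Returns 0 on success, -1 on failure.
 */
int stat_by_fd(int fd, struct stat *st)
{
	return fstatat(fd, "", st, AT_EMPTY_PATH);
}

/*
 * Hypothetical helper: return 1 if the empty-path lookup on an opened
 * descriptor names the same object as stat(path), 0 otherwise.
 */
int empty_path_matches_stat(const char *path)
{
	struct stat by_fd, by_name;
	int fd = open(path, O_RDONLY);
	int ok;

	if (fd < 0)
		return 0;
	ok = stat_by_fd(fd, &by_fd) == 0 &&
	     stat(path, &by_name) == 0 &&
	     by_fd.st_dev == by_name.st_dev &&
	     by_fd.st_ino == by_name.st_ino;
	close(fd);
	return ok;
}
```

Without AT_EMPTY_PATH, an empty pathname is rejected with ENOENT; with the flag, `fd` does not have to be a directory, which is the case the commit message is concerned with.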
*	open-style analog of vfs_path_lookup()	Al Viro	2011-03-14	1	-28/+52
	new function: file_open_root(dentry, mnt, name, flags) opens the file vfs_path_lookup would arrive to.

	Note that name can be empty; in that case the usual requirement that dentry should be a directory is lifted.

	open-coded equivalents switched to it, may_open() got down exactly one caller and became static.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	reduce vfs_path_lookup() to do_path_lookup()	Al Viro	2011-03-14	1	-52/+43
	New lookup flag: LOOKUP_ROOT. nd->root is set (and held) by caller, path_init() starts walking from that place and all pathname resolution machinery never drops nd->root if that flag is set. That turns vfs_path_lookup() into a special case of do_path_lookup() and gets us down to 3 callers of link_path_walk(), making it finally feasible to rip the handling of trailing symlink out of link_path_walk(). That will not only simplify the living hell out of it, but make life much simpler for unionfs merge. Trailing symlink handling will become iterative, which is a good thing for stack footprint in a lot of situations as well.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	untangle do_lookup()	Al Viro	2011-03-14	1	-85/+56
	That thing has devolved into rats nest of gotos; sane use of unlikely() gets rid of that horror and gives much more readable structure:
	* make a fast attempt to find a dentry; false negatives are OK. In RCU mode if everything went fine, we are done, otherwise just drop out of RCU. If we'd done (RCU) ->d_revalidate() and it had not refused outright (i.e. didn't give us -ECHILD), remember its result.
	* now we are not in RCU mode and hopefully have a dentry. If we do not, lock parent, do full d_lookup() and if that has not found anything, allocate and call ->lookup(). If we'd done that ->lookup(), remember that dentry is good and we don't need to revalidate it.
	* now we have a dentry. If it has ->d_revalidate() and we can't skip it, call it.
	* hopefully dentry is good; if not, either fail (in case of error) or try to invalidate it. If d_invalidate() has succeeded, drop it and retry everything as if original attempt had not found a dentry.
	* now we can finish it up - deal with mountpoint crossing and automount.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	path_openat: clean ELOOP handling a bit	Al Viro	2011-03-14	1	-8/+6
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	do_last: kill a rudiment of old ->d_revalidate() workaround	Al Viro	2011-03-14	1	-5/+0
	There used to be time when ->d_revalidate() couldn't return an error. So intents code had lookup_instantiate_filp() stash ERR_PTR(error) in nd->intent.open.filp and had it checked after lookup_hash(), to catch the otherwise silent failures. That had been introduced by commit 4af4c52f34606bdaab6930a845550c6fb02078a4. These days ->d_revalidate() can and does propagate errors back to callers explicitly, so this check isn't needed anymore.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fold __open_namei_create() and open_will_truncate() into do_last()	Al Viro	2011-03-14	1	-48/+26
	... and clean up a bit more

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	do_last: unify may_open() call and everyting after it	Al Viro	2011-03-14	1	-37/+22
	We have a bunch of diverging codepaths in do_last(); some of them converge, but the case of having to create a new file duplicates large part of common tail of the rest and exits separately. Massage them so that they could be merged.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	move may_open() from __open_name_create() to do_last()	Al Viro	2011-03-14	1	-5/+7
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	expand finish_open() in its only caller	Al Viro	2011-03-14	1	-52/+38
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	sanitize pathname component hash calculation	Al Viro	2011-03-14	1	-23/+19
	Lift it to lookup_one_len() and link_path_walk() resp. into the same place where we calculated default hash function of the same name.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	kill __lookup_one_len()	Al Viro	2011-03-14	1	-26/+15
	only one caller left

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	switch non-create side of open() to use of do_last()	Al Viro	2011-03-14	1	-33/+67
	Instead of path_lookupat() doing trailing symlink resolution, use the same scheme as on the O_CREAT side. Walk with LOOKUP_PARENT, then (in do_last()) look the final component up, then either open it or return error or, if it's a symlink, give the symlink back to path_openat() to be resolved there.

	The really messy complication here is RCU. We don't want to drop out of RCU mode before the final lookup, since we don't want to bounce parent directory ->d_count without a good reason.

	Result is _not_ pretty; later in the series we'll clean it up. For now we are roughly back where we'd been before the revert done by Nick's series - top-level logics of path_openat() is cleaned up, do_last() does actual opening, symlink resolution is done uniformly.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	get rid of nd->file	Al Viro	2011-03-14	1	-8/+7
	Don't stash the struct file * used as starting point of walk in nameidata; pass file ** to path_init() instead.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	get rid of the last LOOKUP_RCU dependencies in link_path_walk()	Al Viro	2011-03-14	1	-8/+13
	New helper: terminate_walk(). An error has happened during pathname resolution and we either drop nd->path or terminate RCU, depending on the mode we had been in. After that, nd is essentially empty. Switch link_path_walk() to using that for cleanup.

	Now the top-level logics in link_path_walk() is back to sanity. RCU dependencies are in the lower-level functions.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	make nameidata_dentry_drop_rcu_maybe() always leave RCU mode	Al Viro	2011-03-14	1	-5/+11
	Now we have do_follow_link() guaranteed to leave without dangling RCU and the next step will get LOOKUP_RCU logics completely out of link_path_walk().

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	make handle_dots() leave RCU mode on error	Al Viro	2011-03-14	1	-11/+12
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	clear RCU on all failure exits from link_path_walk()	Al Viro	2011-03-14	1	-14/+16
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	pull handling of . and .. into inlined helper	Al Viro	2011-03-14	1	-14/+16
	getting LOOKUP_RCU checks out of link_path_walk()...

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	kill out_dput: in link_path_walk()	Al Viro	2011-03-14	1	-11/+4
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	separate -ESTALE/-ECHILD retries in do_filp_open() from real work	Al Viro	2011-03-14	1	-29/+20
	new helper: path_openat(). Does what do_filp_open() does, except that it tries only the walk mode (RCU/normal/force revalidation) it had been told to.

	Both create and non-create branches are using path_lookupat() now.

	Fixed the double audit_inode() in non-create branch.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
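
The split between "retry policy" and "real work" described above can be sketched as a userspace toy. This is not the kernel code: the `toy_mode` enum, the `toy_walk_fn` callback type and both sample walkers are hypothetical, and the mapping of -ECHILD to a retry in normal mode and -ESTALE to a retry with forced revalidation is an assumption of this sketch. The driver only decides which mode to try next; the callback attempts exactly the one mode it was told to.

```c
#include <errno.h>

/* Hypothetical walk modes for this toy model. */
enum toy_mode { TOY_RCU, TOY_NORMAL, TOY_REVAL };

/* A walker attempts exactly one mode and returns 0 or a negative errno. */
typedef int (*toy_walk_fn)(enum toy_mode mode, void *ctx);

/* Retry driver: the only place that knows about -ECHILD and -ESTALE. */
int toy_open_with_retries(toy_walk_fn walk, void *ctx)
{
	int err = walk(TOY_RCU, ctx);

	if (err == -ECHILD)	/* RCU walk had to bail out */
		err = walk(TOY_NORMAL, ctx);
	if (err == -ESTALE)	/* stale data: force revalidation */
		err = walk(TOY_REVAL, ctx);
	return err;
}

/* Sample walker: fails in RCU mode only; counts attempts in *ctx. */
int toy_walk_rcu_bails(enum toy_mode mode, void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return mode == TOY_RCU ? -ECHILD : 0;
}

/* Sample walker: reports -ESTALE until revalidation is forced. */
int toy_walk_stale(enum toy_mode mode, void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return mode == TOY_REVAL ? 0 : -ESTALE;
}
```

Keeping the walker free of retry logic means each attempt can fail early and cheaply, while the policy for escalating to a slower mode lives in one short function.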
*	switch do_filp_open() to struct open_flags	Al Viro	2011-03-14	1	-79/+9
	take calculation of open_flags by open(2) arguments into new helper in fs/open.c, move filp_open() over there, have it and do_sys_open() use that helper, switch exec.c callers of do_filp_open() to explicit (and constant) struct open_flags.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Collect "operation mode" arguments of do_last() into a structure	Al Viro	2011-03-14	1	-22/+35
	No point messing with passing shitloads of "operation mode" arguments to do_open() one by one, especially since they are not going to change during do_filp_open(). Collect them into a struct, fill it and pass to do_last() by reference.

	Make sure that lookup intent flags are correctly set and removed - we want them for do_last(), but they make no sense for __do_follow_link().

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	clean up the failure exits after __do_follow_link() in do_filp_open()	Al Viro	2011-03-14	1	-8/+5
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	pull security_inode_follow_link() into __do_follow_link()	Al Viro	2011-03-14	1	-6/+7
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	pull dropping RCU on success of link_path_walk() into path_lookupat()	Al Viro	2011-03-14	1	-18/+12
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	untangle the "need_reval_dot" mess	Al Viro	2011-03-14	1	-63/+44
	instead of ad-hackery around need_reval_dot(), do the following: set a flag (LOOKUP_JUMPED) in the beginning of path, on absolute symlink traversal, on ".." and on procfs-style symlinks. Clear on normal components, leave unchanged on ".". Non-nested callers of link_path_walk() call handle_reval_path(), which checks that flag is set and that fs does want the final revalidate thing, then does ->d_revalidate(). In link_path_walk() all the return_reval stuff is gone.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
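
The flag rules above amount to a small state machine over path components. The following userspace toy is a sketch under simplifying assumptions: it ignores symlinks entirely (both the absolute and the procfs-style kind), and the function name `toy_jumped_after()` is made up for the example. It tracks only the three rules that depend on the component text: set at the start of the path and on "..", clear on a normal component, leave unchanged on ".".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Toy model of the "jumped" flag: returns the flag's value after
 * walking every component of `path`. Symlinks are not modelled.
 */
bool toy_jumped_after(const char *path)
{
	bool jumped = true;	/* set at the beginning of the path */

	while (*path) {
		size_t len;

		/* Skip separators between components. */
		while (*path == '/')
			path++;
		if (!*path)
			break;

		len = strcspn(path, "/");
		if (len == 2 && path[0] == '.' && path[1] == '.')
			jumped = true;		/* ".." sets the flag */
		else if (len == 1 && path[0] == '.')
			;			/* "." leaves it unchanged */
		else
			jumped = false;		/* normal component clears it */
		path += len;
	}
	return jumped;
}
```

In this model a path such as "a/b" ends with the flag clear, so no final revalidation would be requested, while "a/.." or a bare "." ends with it set.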
*	merge component type recognition	Al Viro	2011-03-14	1	-26/+22
	no need to do it in three places...

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	merge path_init and path_init_rcu	Al Viro	2011-03-14	1	-83/+35
	Actual dependency on whether we want RCU or not is in 3 small areas (as it ought to be) and everything around those is the same in both versions. Since each function has only one caller and those callers are on two sides of if (flags & LOOKUP_RCU), it's easier and cleaner to merge them and pull the checks inside.

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	sanitize path_walk() mess	Al Viro	2011-03-14	1	-92/+56
	New helper: path_lookupat(). Basically, what do_path_lookup() boils down to modulo -ECHILD/-ESTALE handler. path_walk* family is gone; vfs_path_lookup() is using link_path_walk() directly, do_path_lookup() and do_filp_open() are using path_lookupat().

	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>