summaryrefslogtreecommitdiffstats
path: root/fs/proc/base.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2014-10-131-8/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The big thing in this pile is Eric's unmount-on-rmdir series; we finally have everything we need for that. The final piece of prereqs is delayed mntput() - now filesystem shutdown always happens on shallow stack. Other than that, we have several new primitives for iov_iter (Matt Wilcox, culled from his XIP-related series) pushing the conversion to ->read_iter()/ ->write_iter() a bit more, a bunch of fs/dcache.c cleanups and fixes (including the external name refcounting, which gives consistent behaviour of d_move() wrt procfs symlinks for long and short names alike) and assorted cleanups and fixes all over the place. This is just the first pile; there's a lot of stuff from various people that ought to go in this window. Starting with unionmount/overlayfs mess... ;-/" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (60 commits) fs/file_table.c: Update alloc_file() comment vfs: Deduplicate code shared by xattr system calls operating on paths reiserfs: remove pointless forward declaration of struct nameidata don't need that forward declaration of struct nameidata in dcache.h anymore take dname_external() into fs/dcache.c let path_init() failures treated the same way as subsequent link_path_walk() fix misuses of f_count() in ppp and netlink ncpfs: use list_for_each_entry() for d_subdirs walk vfs: move getname() from callers to do_mount() gfs2_atomic_open(): skip lookups on hashed dentry [infiniband] remove pointless assignments gadgetfs: saner API for gadgetfs_create_file() f_fs: saner API for ffs_sb_create_file() jfs: don't hash direct inode [s390] remove pointless assignment of ->f_op in vmlogrdr ->open() ecryptfs: ->f_op is never NULL android: ->f_op is never NULL nouveau: __iomem misannotations missing annotation in fs/file.c fs: namespace: suppress 'may be used uninitialized' warnings ...
| * proc: Update proc_flush_task_mnt to use d_invalidateEric W. Biederman2014-10-091-4/+2
| | | | | | | | | | | | | | | | | | | | | | Now that d_invalidate always succeeds and flushes mount points use it in stead of a combination of shrink_dcache_parent and d_drop in proc_flush_task_mnt. This removes the danger of a mount point under /proc/<pid>/... becoming unreachable after the d_drop. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * vfs: Remove d_drop calls from d_revalidate implementationsEric W. Biederman2014-10-091-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that d_invalidate always succeeds it is not longer necessary or desirable to hard code d_drop calls into filesystem specific d_revalidate implementations. Remove the unnecessary d_drop calls and rely on d_invalidate to drop the dentries. Using d_invalidate ensures that paths to mount points will not be dropped. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'for-3.18' of ↵Linus Torvalds2014-10-101-35/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Nothing too interesting. Just a handful of cleanup patches" * 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: Revert "cgroup: remove redundant variable in cgroup_mount()" cgroup: remove redundant variable in cgroup_mount() cgroup: fix missing unlock in cgroup_release_agent() cgroup: remove CGRP_RELEASABLE flag perf/cgroup: Remove perf_put_cgroup() cgroup: remove redundant check in cgroup_ino() cpuset: simplify proc_cpuset_show() cgroup: simplify proc_cgroup_show() cgroup: use a per-cgroup work for release agent cgroup: remove bogus comments cgroup: remove redundant code in cgroup_rmdir() cgroup: remove some useless forward declarations cgroup: fix a typo in comment.
| * | cpuset: simplify proc_cpuset_show()Zefan Li2014-09-181-18/+2
| | | | | | | | | | | | | | | | | | | | | Use the ONE macro instead of REG, and we can simplify proc_cpuset_show(). Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | cgroup: simplify proc_cgroup_show()Zefan Li2014-09-181-17/+2
| |/ | | | | | | | | | | | | Use the ONE macro instead of REG, and we can simplify proc_cgroup_show(). Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* / proc: introduce proc_mem_open()Oleg Nesterov2014-10-091-15/+21
|/ | | | | | | | | | | | Extract the mm_access() code from __mem_open() into the new helper, proc_mem_open(), the next patch will add another caller. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2014-08-091-5/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "This is a bunch of small changes built against 3.16-rc6. The most significant change for users is the first patch which makes setns drmatically faster by removing unneded rcu handling. The next chunk of changes are so that "mount -o remount,.." will not allow the user namespace root to drop flags on a mount set by the system wide root. Aks this forces read-only mounts to stay read-only, no-dev mounts to stay no-dev, no-suid mounts to stay no-suid, no-exec mounts to stay no exec and it prevents unprivileged users from messing with a mounts atime settings. I have included my test case as the last patch in this series so people performing backports can verify this change works correctly. The next change fixes a bug in NFS that was discovered while auditing nsproxy users for the first optimization. Today you can oops the kernel by reading /proc/fs/nfsfs/{servers,volumes} if you are clever with pid namespaces. I rebased and fixed the build of the !CONFIG_NFS_FS case yesterday when a build bot caught my typo. Given that no one to my knowledge bases anything on my tree fixing the typo in place seems more responsible that requiring a typo-fix to be backported as well. The last change is a small semantic cleanup introducing /proc/thread-self and pointing /proc/mounts and /proc/net at it. This prevents several kinds of problemantic corner cases. It is a user-visible change so it has a minute chance of causing regressions so the change to /proc/mounts and /proc/net are individual one line commits that can be trivially reverted. Unfortunately I lost and could not find the email of the original reporter so he is not credited. From at least one perspective this change to /proc/net is a refgression fix to allow pthread /proc/net uses that were broken by the introduction of the network namespace" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Point /proc/mounts at /proc/thread-self/mounts instead of /proc/self/mounts proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net proc: Implement /proc/thread-self to point at the directory of the current thread proc: Have net show up under /proc/<tgid>/task/<tid> NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes mnt: Add tests for unprivileged remount cases that have found to be faulty mnt: Change the default remount atime from relatime to the existing value mnt: Correct permission checks in do_remount mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags into do_remount mnt: Only change user settable mount flags in remount namespaces: Use task_lock and not rcu to protect nsproxy
| * proc: Implement /proc/thread-self to point at the directory of the current ↵Eric W. Biederman2014-08-041-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | thread /proc/thread-self is derived from /proc/self. /proc/thread-self points to the directory in proc containing information about the current thread. This funtionality has been missing for a long time, and is tricky to implement in userspace as gettid() is not exported by glibc. More importantly this allows fixing defects in /proc/mounts and /proc/net where in a threaded application today they wind up being empty files when only the initial pthread has exited, causing problems for other threads. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
| * proc: Have net show up under /proc/<tgid>/task/<tid>Eric W. Biederman2014-08-041-0/+3
| | | | | | | | | | | | | | Network namespaces are per task so it make sense for them to show up in the task directory. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | proc: remove INF macroAlexey Dobriyan2014-08-081-41/+0
| | | | | | | | | | | | | | | | | | If you're applying this patch, all /proc/$PID/* files were converted to seq_file interface and this code became unused. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/hardwall to seq_file interfaceAlexey Dobriyan2014-08-081-2/+2
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/io to seq_file interfaceAlexey Dobriyan2014-08-081-8/+10
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/oom_score to seq_file interfaceAlexey Dobriyan2014-08-081-4/+5
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/schedstat to seq_file interfaceAlexey Dobriyan2014-08-081-4/+5
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/wchan to seq_file interfaceAlexey Dobriyan2014-08-081-5/+6
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/cmdline to seq_file interfaceAlexey Dobriyan2014-08-081-4/+11
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/syscall to seq_file interfaceAlexey Dobriyan2014-08-081-6/+7
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/limits to seq_file interfaceAlexey Dobriyan2014-08-081-15/+12
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: convert /proc/$PID/auxv to seq_file interfaceAlexey Dobriyan2014-08-081-10/+8
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: more "const char *" pointersAlexey Dobriyan2014-08-081-4/+4
| | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: faster /proc/$PID lookupAlexey Dobriyan2014-08-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently lookup for /proc/$PID first goes through spinlock and whole list of misc /proc entries only to confirm that, yes, /proc/42 can not possibly match random proc entry. List is is several dozens entries long (52 entries on my setup). None of this is necessary. Try to convert dentry name to integer first. If it works, it must be /proc/$PID. If it doesn't, it must be random proc entry. Based on patch from Al Viro. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: add and remove /proc entry create checksAlexey Dobriyan2014-08-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * remove proc_create(NULL, ...) check, let it oops * warn about proc_create("", ...) and proc_create("very very long name", ...) proc code keeps length as u8, no 256+ name length possible * warn about proc_create("123", ...) /proc/$PID and /proc/misc namespaces are separate things, but dumb module might create funky a-la $PID entry. * remove post mortem strchr('/') check Triggering it implies either strchr() is buggy or memory corruption. It should be VFS check anyway. In reality, none of these checks will ever trigger, it is preparation for the next patch. Based on patch from Al Viro. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: constify seq_operationsFabian Frederick2014-08-081-1/+1
|/ | | | | | | | | | | | | | | | proc_uid_seq_operations, proc_gid_seq_operations and proc_projid_seq_operations are only called in proc_id_map_open with seq_open as const struct seq_operations so we can constify the 3 structures and update proc_id_map_open prototype. text data bss dec hex filename 6817 404 1984 9205 23f5 kernel/user_namespace.o-before 6913 308 1984 9205 23f5 kernel/user_namespace.o-after Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.infradead.org/users/eparis/auditLinus Torvalds2014-04-121-34/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull audit updates from Eric Paris. * git://git.infradead.org/users/eparis/audit: (28 commits) AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range audit: do not cast audit_rule_data pointers pointlesly AUDIT: Allow login in non-init namespaces audit: define audit_is_compat in kernel internal header kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c sched: declare pid_alive as inline audit: use uapi/linux/audit.h for AUDIT_ARCH declarations syscall_get_arch: remove useless function arguments audit: remove stray newline from audit_log_execve_info() audit_panic() call audit: remove stray newlines from audit_log_lost messages audit: include subject in login records audit: remove superfluous new- prefix in AUDIT_LOGIN messages audit: allow user processes to log from another PID namespace audit: anchor all pid references in the initial pid namespace audit: convert PPIDs to the inital PID namespace. pid: get pid_t ppid of task in init_pid_ns audit: rename the misleading audit_get_context() to audit_take_context() audit: Add generic compat syscall support audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL ...
| * proc: Update get proc_pid_cmdline() to use mm.h helpersWilliam Roberts2014-03-201-34/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Re-factor proc_pid_cmdline() to use get_cmdline() helper from mm.h. Acked-by: David Rientjes <rientjes@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: William Roberts <wroberts@tresys.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | fault-injection: set bounds on what /proc/self/make-it-fail accepts.Dave Jones2014-04-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | /proc/self/make-it-fail is a boolean, but accepts any number, including negative ones. Change variable to unsigned, and cap upper bound at 1. [akpm@linux-foundation.org: don't make make_it_fail unsigned] Signed-off-by: Dave Jones <davej@fedoraproject.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | procfs: make /proc/*/pagemap 0400Djalal Harouni2014-04-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The /proc/*/pagemap contain sensitive information and currently its mode is 0444. Change this to 0400, so the VFS will prevent unprivileged processes from getting file descriptors on arbitrary privileged /proc/*/pagemap files. This reduces the scope of address space leaking and bypasses by protecting already running processes. Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | procfs: make /proc/*/{stack,syscall,personality} 0400Djalal Harouni2014-04-071-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These procfs files contain sensitive information and currently their mode is 0444. Change this to 0400, so the VFS will be able to block unprivileged processes from getting file descriptors on arbitrary privileged /proc/*/{stack,syscall,personality} files. This reduces the scope of ASLR leaking and bypasses by protecting already running processes. Signed-off-by: Djalal Harouni <tixxdz@opendz.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | fs/proc/base.c: fix GPF in /proc/$PID/map_filesArtem Fetishev2014-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The expected logic of proc_map_files_get_link() is either to return 0 and initialize 'path' or return an error and leave 'path' uninitialized. By the time dname_to_vma_addr() returns 0 the corresponding vma may have already be gone. In this case the path is not initialized but the return value is still 0. This results in 'general protection fault' inside d_path(). Steps to reproduce: CONFIG_CHECKPOINT_RESTORE=y fd = open(...); while (1) { mmap(fd, ...); munmap(fd, ...); } ls -la /proc/$PID/map_files Addresses https://bugzilla.kernel.org/show_bug.cgi?id=68991 Signed-off-by: Artem Fetishev <artem_fetishev@epam.com> Signed-off-by: Aleksandr Terekhov <aleksandr_terekhov@epam.com> Reported-by: <wiebittewas@gmail.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: fix ->f_pos overflows in first_tid()Oleg Nesterov2014-01-231-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. proc_task_readdir()->first_tid() path truncates f_pos to int, this is wrong even on 64bit. We could check that f_pos < PID_MAX or even INT_MAX in proc_task_readdir(), but this patch simply checks the potential overflow in first_tid(), this check is nop on 64bit. We do not care if it was negative and the new unsigned value is huge, all we need to ensure is that we never wrongly return !NULL. 2. Remove the 2nd "nr != 0" check before get_nr_threads(), nr_threads == 0 is not distinguishable from !pid_task() above. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: don't (ab)use ->group_leader in proc_task_readdir() pathsOleg Nesterov2014-01-231-28/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | proc_task_readdir() does not really need "leader", first_tid() has to revalidate it anyway. Just pass proc_pid(inode) to first_tid() instead, it can do pid_task(PIDTYPE_PID) itself and read ->group_leader only if necessary. The patch also extracts the "inode is dead" code from pid_delete_dentry(dentry) into the new trivial helper, proc_inode_is_dead(inode), proc_task_readdir() uses it to return -ENOENT if this dir was removed. This is a bit racy, but the race is very inlikely and the getdents() after openndir() can see the empty "." + ".." dir only once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: change first_tid() to use while_each_thread() rather than next_thread()Oleg Nesterov2014-01-231-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rerwrite the main loop to use while_each_thread() instead of next_thread(). We are going to fix or replace while_each_thread(), next_thread() should be avoided whenever possible. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | proc: fix the potential use-after-free in first_tid()Oleg Nesterov2014-01-231-0/+3
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | proc_task_readdir() verifies that the result of get_proc_task() is pid_alive() and thus its ->group_leader is fine too. However this is not necessarily true after rcu_read_unlock(), we need to recheck this again after first_tid() does rcu_read_lock(). Otherwise leader->thread_group.next (used by next_thread()) can be invalid if the rcu grace period expires in between. The race is subtle and unlikely, but still it is possible afaics. To simplify lets ignore the "likely" case when tid != 0, f_version can be cleared by proc_task_operations->llseek(). Suppose we have a main thread M and its subthread T. Suppose that f_pos == 3, iow first_tid() should return T. Now suppose that the following happens between rcu_read_unlock() and rcu_read_lock(): 1. T execs and becomes the new leader. This removes M from ->thread_group but next_thread(M) is still T. 2. T creates another thread X which does exec as well, T goes away. 3. X creates another subthread, this increments nr_threads. 4. first_tid() does next_thread(M) and returns the already dead T. Note also that we need 2. and 3. only because of get_nr_threads() check, and this check was supposed to be optimization only. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Sameer Nanda <snanda@chromium.org> Cc: Sergey Dyasly <dserrg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* audit: allow unsetting the loginuid (with priv)Eric Paris2013-11-051-4/+10
| | | | | | | | | | | | | If a task has CAP_AUDIT_CONTROL allow that task to unset their loginuid. This would allow a child of that task to set their loginuid without CAP_AUDIT_CONTROL. Thus when launching a new login daemon, a priviledged helper would be able to unset the loginuid and then the daemon, which may be malicious user facing, do not need priv to function correctly. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* proc_fill_cache(): clean up, get rid of pointless find_inode_number() useAl Viro2013-06-291-23/+13
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_fill_cache(): just make instantiate_t return intAl Viro2013-06-291-31/+28
| | | | | | all instances always return ERR_PTR(-E...) or NULL, anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_pid_readdir(): stop wanking with proc_fill_cache() for /proc/selfAl Viro2013-06-291-3/+3
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* proc_fill_cache(): kill pointless checkAl Viro2013-06-291-4/+2
| | | | | | we'd just checked that child->d_inode is non-NULL, for fuck sake! Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* [readdir] convert procfsAl Viro2013-06-291-230/+133
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* posix-timers: Show clock ID in proc filePavel Tikhomirov2013-05-281-0/+1
| | | | | | | | | | | | | Expand information about posix-timers in /proc/<pid>/timers by adding info about clock, with which the timer was created. I.e. in the forth line of timer info after "notify:" line go "ClockID: <clock_id>". Signed-off-by: Pavel Tikhomirov <snorcht@gmail.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Link: http://lkml.kernel.org/r/1368742323-46949-2-git-send-email-snorcht@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-05-011-4/+52
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
| * take cgroup_open() and cpuset_open() to fs/proc/base.cAl Viro2013-05-011-0/+31
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * proc: Uninline pid_delete_dentry()David Howells2013-05-011-0/+9
| | | | | | | | | | | | | | Uninline pid_delete_dentry() as it's only used by three function pointers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * procfs: switch /proc/self away from proc_dir_entryAl Viro2013-04-091-4/+12
| | | | | | | | | | | | | | Just have it pinned in dcache all along and let procfs ->kill_sb() drop it before kill_anon_super(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | fs, proc: truncate /proc/pid/comm writes to first TASK_COMM_LEN bytesDavid Rientjes2013-04-301-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, a write to a procfs file will return the number of bytes successfully written. If the actual string is longer than this, the remainder of the string will not be be written and userspace will complete the operation by issuing additional write()s. Hence $ echo -n "abcdefghijklmnopqrs" > /proc/self/comm results in $ cat /proc/$$/comm pqrs since the final four bytes were written with a second write() since TASK_COMM_LEN == 16. This is obviously an undesired result and not equivalent to prctl(PR_SET_NAME). The implementation should not need to know the definition of TASK_COMM_LEN. This patch truncates the string to the first TASK_COMM_LEN bytes and returns the bytes written as the length of the string written so the second write() is suppressed. $ cat /proc/$$/comm abcdefghijklmno Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | posix-timers: Show sigevent info in proc filePavel Emelyanov2013-04-171-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous patch added proc file to list posix timers created by task. Expand the information provided in this file by adding info about notification method, with which timers were created. I.e. after the "ID:" line there go 1. "signal:" line, that shows signal number and sigval bits; 2. "notify:" line, that shows the timer notification method. Thus the timer entry would looke like this: ID: 123 signal: 14/0000000000b005d0 notify: signal/pid.732 This information is enough to understand how timer_create() was called for each particular timer. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA024.80404@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | posix-timers: Introduce /proc/PID/timers filePavel Emelyanov2013-04-171-0/+83
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kernel doesn't provide any API for getting info about what posix timers are configured by processes. It's implied, that a process which configured some timers, knows what it did. However, for external tools it's impossible to get this information. In particular, this is critical for checkpoint-restore project to have this info. Introduce a per-pid proc file with information about posix timers. Since these timers are shared between threads, this file is present on tgid level only, no such thing in tid subdirs. The file format is expected to be the "/proc/<pid>/smaps"-like, i.e. each timer will occupy seveal lines to allow for future extending. Each new timer entry starts with the ID: <number> line which is added by this patch. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA00D.6070009@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* fs/proc: clean up printksAndrew Morton2013-02-271-1/+2
| | | | | | | | | | | | - use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: change return values from -EACCES to -EPERMZhao Hongjiang2013-02-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | According to SUSv3: [EACCES] Permission denied. An attempt was made to access a file in a way forbidden by its file access permissions. [EPERM] Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource. So -EPERM should be returned if capability checks fails. Strictly speaking this is an API change since the error code user sees is altered. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>