path: root/init/do_mounts.c
Commit message | Author | Age | Files | Lines
* init: don't panic if mount_nodev_root failed | Leon Romanovsky | 2021-09-19 | 1 | -3/+0
|
| An attempt to mount a 9p file system as root gives the following kernel panic:
|
|  9pnet_virtio: no channels available for device root
|  Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2
|  CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc1+ #127
|  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
|  Call Trace:
|   dump_stack_lvl+0x45/0x59
|   panic+0x1e2/0x44b
|   ? __warn_printk+0xf3/0xf3
|   ? free_unref_page+0x2d4/0x4a0
|   ? trace_hardirqs_on+0x32/0x120
|   ? free_unref_page+0x2d4/0x4a0
|   mount_root+0x189/0x1e0
|   prepare_namespace+0x136/0x165
|   kernel_init_freeable+0x3b8/0x3cb
|   ? rest_init+0x2e0/0x2e0
|   kernel_init+0x19/0x130
|   ret_from_fork+0x1f/0x30
|  Kernel Offset: disabled
|  ---[ end Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2 ]---
|
| QEMU command line:
|  "qemu-system-x86_64 -append root=/dev/root rw rootfstype=9p rootflags=trans=virtio ..."
|
| This error occurs because root_device_name is truncated in
| prepare_namespace() from "/dev/root" to "root" prior to the call to
| mount_nodev_root().
|
| As a solution, don't treat errors in mount_nodev_root() as errors that
| require panics, and allow fallback to the mount flow that existed before
| the patch cited in the Fixes tag.
|
| Fixes: f9259be6a9e7 ("init: allow mounting arbitrary non-blockdevice filesystems as root")
| Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
| Reviewed-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
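Annotation: the truncation the message describes can be pictured with a small userspace sketch. The helper name `strip_dev_prefix` is hypothetical and exists only for illustration; the only facts taken from the commit message are that "/dev/root" becomes "root" before mount_nodev_root() is called.

```c
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): drop a leading "/dev/"
 * from a root device name, the way the message says "/dev/root" ends up
 * as "root" before the nodev mount attempt.
 */
static const char *strip_dev_prefix(const char *name)
{
	if (strncmp(name, "/dev/", 5) == 0)
		return name + 5;
	return name;
}
```

With this shortening in place, a 9p mount is asked for the tag "root" rather than "/dev/root", which matches the `Unable to mount root "root" (9p)` line in the panic output. A name without the prefix, such as "myfs", passes through unchanged.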
* init/do_mounts.c: Harden split_fs_names() against buffer overflow | Vivek Goyal | 2021-09-19 | 1 | -11/+16
|
| split_fs_names() currently takes a comma-separated list of filesystems
| and converts it into individual filesystem strings. It places these
| strings in the input buffer passed by the caller and returns the number
| of strings.
|
| If the caller manages to pass an input string bigger than the buffer,
| then we can write beyond the buffer. Or if the string just fits the
| buffer, we will still write beyond the buffer as we append a '\0' byte
| at the end.
|
| Pass the size of the input buffer to split_fs_names() and put enough
| checks in place so such buffer overrun possibilities do not occur.
|
| This patch does a few things:
|
| - Add a parameter "size" to split_fs_names(). This specifies the size
|   of the input buffer.
|
| - Use strlcpy() (instead of strcpy()) so that we can't go beyond the
|   buffer size. If the input string "names" is larger than the passed-in
|   buffer, the input string will be truncated to fit in the buffer.
|
| - Stop appending an extra '\0' character at the end and avoid one
|   possibility of going beyond the input buffer size.
|
| - Do not use an extra loop to count the number of strings.
|
| - Previously, if one passed "rootfstype=foo,,bar", split_fs_names()
|   would return only 1 string, "foo" (and "bar" would be truncated due
|   to the extra ','). After this patch, split_fs_names() returns 3
|   strings ("foo", a zero-sized string, and "bar").
|
|   Callers of split_fs_names() have been modified to check for a
|   zero-sized string and skip to the next one.
|
| Reported-by: xu xin <xu.xin16@zte.com.cn>
| Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
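Annotation: a minimal userspace sketch of the behaviour listed above, not the kernel function itself. The names `split_fs_names_sketch` and `count_nonempty` are made up for illustration, and `snprintf()` stands in for the kernel's `strlcpy()` as the truncating copy.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy `names` into `page` (at most `size` bytes, always NUL-terminated),
 * turn every ',' into '\0', and return the number of resulting strings.
 * Empty entries are kept, so "foo,,bar" yields 3 strings.
 */
static int split_fs_names_sketch(char *page, size_t size, const char *names)
{
	int count = 1;
	char *p = page;

	if (size == 0)
		return 0;

	/* Truncating copy: never writes more than `size` bytes. */
	snprintf(page, size, "%s", names);

	/* Single pass: split and count at the same time. */
	for (; *p; p++) {
		if (*p == ',') {
			*p = '\0';
			count++;
		}
	}
	return count;
}

/* Walk `num` consecutive strings and count the non-empty ones. */
static int count_nonempty(const char *page, int num)
{
	const char *p = page;
	int i, n = 0;

	for (i = 0; i < num; i++, p += strlen(p) + 1) {
		if (*p)		/* skip zero-sized entries, as callers now do */
			n++;
	}
	return n;
}
```

Because the return value tells the caller how many strings there are, no extra terminating '\0' is needed after the last one, which is the write that could previously land one byte past the buffer.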
* Merge branch 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 2021-09-09 | 1 | -25/+65
|\
| | Pull root filesystem type handling updates from Al Viro:
| |  "Teach init/do_mounts.c to handle non-block filesystems, hopefully
| |   preventing even more special-cased kludges (such as root=/dev/nfs,
| |   etc)"
| |
| | * 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
| |   fs: simplify get_filesystem_list / get_all_fs_names
| |   init: allow mounting arbitrary non-blockdevice filesystems as root
| |   init: split get_fs_names
| * fs: simplify get_filesystem_list / get_all_fs_names | Christoph Hellwig | 2021-08-23 | 1 | -28/+21
| |
| | Just output the '\0'-separated list of supported file systems for block
| | devices directly rather than going through a pointless round of string
| | manipulation.
| |
| | Based on an earlier patch from Al Viro <viro@zeniv.linux.org.uk>.
| |
| | Vivek: Modified list_bdev_fs_names() and split_fs_names() to return the
| | number of null-terminated strings to the caller. Callers now use that
| | information to loop through all the strings instead of relying on one
| | extra null char being present at the end.
| |
| | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * init: allow mounting arbitrary non-blockdevice filesystems as root | Christoph Hellwig | 2021-08-23 | 1 | -0/+43
| |
| | Currently the only non-blockdevice filesystems that can be used as the
| | initial root filesystem are NFS and CIFS, which use the magic
| | "root=/dev/nfs" and "root=/dev/cifs" syntax that requires the root
| | device file system details to come from filesystem-specific kernel
| | command line options.
| |
| | Add a little bit of new code that allows passing arbitrary string mount
| | options to any non-blockdevice filesystem so that it can be mounted as
| | the root file system.
| |
| | For example, a virtiofs root file system can be mounted using the
| | following syntax:
| |
| |  "root=myfs rootfstype=virtiofs rw"
| |
| | Based on an earlier patch from Vivek Goyal <vgoyal@redhat.com>.
| |
| | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * init: split get_fs_names | Christoph Hellwig | 2021-08-23 | 1 | -22/+26
| |
| | Split get_fs_names into one function that splits up the command line
| | argument, and one that gets the list of all registered file systems.
| |
| | Signed-off-by: Christoph Hellwig <hch@lst.de>
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT | Christoph Hellwig | 2021-08-24 | 1 | -4/+0
|/
| This might have been a neat debug aid when the extended dev_t was added,
| but that time is long gone.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Link: https://lore.kernel.org/r/20210824075216.1179406-3-hch@lst.de
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: factor out a part_devt helper | Christoph Hellwig | 2021-06-01 | 1 | -8/+2
|
| Add a helper to find the dev_t for a disk + partno tuple.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Ming Lei <ming.lei@redhat.com>
| Link: https://lore.kernel.org/r/20210525061301.2242282-8-hch@lst.de
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: merge struct block_device and struct hd_struct | Christoph Hellwig | 2020-12-01 | 1 | -11/+10
|
| Instead of having two structures that represent each block device with
| different life time rules, merge them into a single one. This also
| greatly simplifies the reference counting rules, as we can use the inode
| reference count as the main reference count for the new struct
| block_device, with the device model reference front ending it for device
| model interaction.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: remove the partno field from struct hd_struct | Christoph Hellwig | 2020-12-01 | 1 | -1/+1
|
| Just use the bd_partno field in struct block_device everywhere.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: move the partition_meta_info to struct block_device | Christoph Hellwig | 2020-12-01 | 1 | -3/+4
|
| Move the partition_meta_info to struct block_device in preparation for
| killing struct hd_struct.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* init: cleanup match_dev_by_uuid and match_dev_by_label | Christoph Hellwig | 2020-12-01 | 1 | -12/+6
|
| Avoid a totally pointless goto label, and use the same style of
| comparison for both helpers.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
| Acked-by: Tejun Heo <tj@kernel.org>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* init: refactor devt_from_partuuid | Christoph Hellwig | 2020-12-01 | 1 | -37/+31
|
| The code in devt_from_partuuid is very convoluted. Refactor a bit by
| sanitizing the goto and variable name usage.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Acked-by: Tejun Heo <tj@kernel.org>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* init: refactor name_to_dev_t | Christoph Hellwig | 2020-12-01 | 1 | -93/+90
|
| Split each case into a self-contained helper, and move the block
| dependent code entirely under the pre-existing #ifdef CONFIG_BLOCK.
| This allows removing the blk_lookup_devt stub in genhd.h.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| Reviewed-by: Jan Kara <jack@suse.cz>
| Reviewed-by: Hannes Reinecke <hare@suse.de>
| Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
| Acked-by: Tejun Heo <tj@kernel.org>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* init: add an init_chroot helper | Christoph Hellwig | 2020-07-31 | 1 | -1/+1
|
| Add a simple helper to chroot with a kernel space file name and switch
| the early init code over to it. Remove the now unused ksys_chroot.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
* init: add an init_chdir helper | Christoph Hellwig | 2020-07-31 | 1 | -1/+1
|
| Add a simple helper to chdir with a kernel space file name and switch
| the early init code over to it. Remove the now unused ksys_chdir.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
* init: add an init_mount helper | Christoph Hellwig | 2020-07-31 | 1 | -4/+4
|
| Like do_mount, but takes a kernel pointer for the destination path.
| Switch over the mounts in the init code and devtmpfs to it, which just
| happen to work due to the implicit set_fs(KERNEL_DS) during early init
| right now.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
* initrd: remove support for multiple floppies | Christoph Hellwig | 2020-07-30 | 1 | -62/+7
|
| Remove the special handling for multiple floppies in the initrd code.
| No one should be using floppies for booting these days. (famous last
| words..)
|
| Includes a spelling fix from Colin Ian King <colin.king@canonical.com>.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
* md: move the early init autodetect code to drivers/md/ | Christoph Hellwig | 2020-07-16 | 1 | -0/+1
|
| Just like the NFS and CIFS root code this better lives with the driver
| it is tightly integrated with.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Acked-by: Song Liu <song@kernel.org>
| Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
* block: remove __bdevname | Christoph Hellwig | 2020-03-24 | 1 | -10/+2
|
| There is no good reason for __bdevname to exist. Just open code
| printing the string in the callers. For three of them the format
| string can be trivially merged into existing printk statements,
| and in init/do_mounts.c we can at least do the scnprintf once at
| the start of the function, and regardless of CONFIG_BLOCK, to
| make the output for tiny configs a little more helpful.
|
| Acked-by: Theodore Ts'o <tytso@mit.edu> # for ext4
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Fix root mounting with no mount options | Linus Torvalds | 2019-12-16 | 1 | -10/+13
|
| The "trivial conversion" in commit cccaa5e33525 ("init: use do_mount()
| instead of ksys_mount()") was totally broken, since it didn't handle the
| case of a NULL mount data pointer. And while I had "tested" it (and
| presumably Dominik had too) that bug was hidden by me having options.
|
| Cc: Dominik Brodowski <linux@dominikbrodowski.net>
| Cc: Arnd Bergmann <arnd@arndb.de>
| Reported-by: Ondřej Jirman <megi@xff.cz>
| Reported-by: Guenter Roeck <linux@roeck-us.net>
| Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
| Reported-and-tested-by: Borislav Petkov <bp@suse.de>
| Tested-by: Chris Clayton <chris2553@googlemail.com>
| Tested-by: Eric Biggers <ebiggers@kernel.org>
| Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
| Tested-by: Guido Günther <agx@sigxcpu.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
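Annotation: the message does not show how the broken conversion failed, only that a NULL mount data pointer was not handled. The userspace sketch below illustrates the kind of guard that case needs when options are copied into a page of their own. `dup_mount_data` and `SKETCH_PAGE_SIZE` are hypothetical names, and `calloc()` stands in for a kernel page allocation.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/*
 * Return a page-sized, zero-padded copy of `data`, or NULL when there are
 * no options at all. *err is set to -1 only on allocation failure, so the
 * caller can tell "no options" apart from "out of memory".
 */
static char *dup_mount_data(const char *data, int *err)
{
	char *page;

	*err = 0;
	if (!data)
		return NULL;	/* no options: pass NULL through, copy nothing */

	page = calloc(1, SKETCH_PAGE_SIZE);
	if (!page) {
		*err = -1;
		return NULL;
	}

	/* Leave the final byte zero so the copy is always terminated. */
	strncpy(page, data, SKETCH_PAGE_SIZE - 1);
	return page;
}
```

A test run that always supplies options, as the message says happened here, never reaches the `!data` branch, which is why the bug stayed hidden.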
* init: use do_mount() instead of ksys_mount() | Dominik Brodowski | 2019-12-12 | 1 | -6/+22
|
| In prepare_namespace(), do_mount() can be used instead of ksys_mount()
| as the first and third argument are const strings in the kernel, the
| second and fourth argument are passed through anyway, and the fifth
| argument is NULL.
|
| In do_mount_root(), ksys_mount() is called with the first and third
| argument being already kernelspace strings, which do not need to be
| copied over from userspace to kernelspace (again). The second and
| fourth arguments are passed through to do_mount() anyway. The fifth
| argument, while already residing in kernelspace, needs to be put into
| a page of its own. Then, do_mount() can be used instead of
| ksys_mount().
|
| Once this is done, there are no in-kernel users of ksys_mount() left,
| which can therefore be removed.
|
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* devtmpfs: use do_mount() instead of ksys_mount() | Dominik Brodowski | 2019-12-12 | 1 | -1/+1
|
| In devtmpfs, do_mount() can be called directly instead of complex
| wrapping by ksys_mount():
|
| - the first and third arguments are const strings in the kernel,
|   and do not need to be copied over from userspace;
|
| - the fifth argument is NULL, and therefore no page needs to be
|   copied over from userspace;
|
| - the second and fourth argument are passed through anyway.
|
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* init: Support mounting root file systems over SMB | Paulo Alcantara (SUSE) | 2019-10-02 | 1 | -0/+49
|
| Add a new virtual device named /dev/cifs (0xfe) to tell the kernel to
| mount the root file system over the network by using the SMB protocol.
|
| cifs_root_data() will be responsible for retrieving the parsed
| information of the new command-line option (cifsroot=) and then calling
| do_mount_root() with the appropriate mount options for cifs.ko.
|
| Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API | David Howells | 2019-09-12 | 1 | -5/+4
|
| Convert the ramfs, shmem, tmpfs, devtmpfs and rootfs filesystems to the
| new internal mount API as the old one will be obsoleted and removed.
| This allows greater flexibility in communication of mount parameters
| between userspace, the VFS and the filesystem.
|
| See Documentation/filesystems/mount_api.txt for more information.
|
| Note that tmpfs is slightly tricky as it can contain embedded commas,
| so it can't be trivially split up using strsep() to break on commas in
| generic_parse_monolithic(). Instead, tmpfs has to supply its own
| generic parser.
|
| However, if tmpfs changes, then devtmpfs and rootfs, which are wrappers
| around tmpfs or ramfs, must change too - and thus so must ramfs, so
| these had to be converted also.
|
| [AV: rewritten]
|
| Signed-off-by: David Howells <dhowells@redhat.com>
| cc: Hugh Dickins <hughd@google.com>
| cc: linux-mm@kvack.org
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make shmem_fill_super() static | Al Viro | 2019-09-05 | 1 | -1/+1
|
| ... have callers use shmem_mount()
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make ramfs_fill_super() static | Al Viro | 2019-09-05 | 1 | -4/+2
|
| all users should just call ramfs_mount()
|
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 2019-07-19 | 1 | -21/+3
|\
| | Pull vfs mount updates from Al Viro:
| |  "The first part of mount updates.
| |
| |   Convert filesystems to use the new mount API"
| |
| | * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
| |   mnt_init(): call shmem_init() unconditionally
| |   constify ksys_mount() string arguments
| |   don't bother with registering rootfs
| |   init_rootfs(): don't bother with init_ramfs_fs()
| |   vfs: Convert smackfs to use the new mount API
| |   vfs: Convert selinuxfs to use the new mount API
| |   vfs: Convert securityfs to use the new mount API
| |   vfs: Convert apparmorfs to use the new mount API
| |   vfs: Convert openpromfs to use the new mount API
| |   vfs: Convert xenfs to use the new mount API
| |   vfs: Convert gadgetfs to use the new mount API
| |   vfs: Convert oprofilefs to use the new mount API
| |   vfs: Convert ibmasmfs to use the new mount API
| |   vfs: Convert qib_fs/ipathfs to use the new mount API
| |   vfs: Convert efivarfs to use the new mount API
| |   vfs: Convert configfs to use the new mount API
| |   vfs: Convert binfmt_misc to use the new mount API
| |   convenience helper: get_tree_single()
| |   convenience helper get_tree_nodev()
| |   vfs: Kill sget_userns()
| |   ...
| * mnt_init(): call shmem_init() unconditionally | Al Viro | 2019-07-04 | 1 | -7/+2
| |
| | No point having two call sites (earlier in init_rootfs() from
| | mnt_init() in case we are going to use shmem-style rootfs, later from
| | do_basic_setup() unconditionally), along with the logic in
| | shmem_init() itself to make the second call a no-op...
| |
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * don't bother with registering rootfs | Al Viro | 2019-07-04 | 1 | -13/+2
| |
| | init_mount_tree() can get to rootfs_fs_type directly and that
| | simplifies a lot of things. We don't need to register it, we don't
| | need to look it up *and* we don't need to bother with preventing
| | subsequent userland mounts. That's the way we should've done that
| | from the very beginning.
| |
| | There is a user-visible change, namely the disappearance of "rootfs"
| | from /proc/filesystems. Note that it's been unmountable all along
| | and it didn't show up in /proc/mounts; however, it *is* a
| | user-visible change and theoretically some script might've been
| | using its presence in /proc/filesystems to tell 2.4.11+ from earlier
| | kernels.
| |
| | *IF* any complaints about behaviour change do show up, we could fake
| | it in /proc/filesystems. I very much doubt we'll have to, though.
| |
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * init_rootfs(): don't bother with init_ramfs_fs() | Al Viro | 2019-07-04 | 1 | -2/+0
| |
| | the only thing done by the latter is making ramfs visible to
| | mount(2); we don't need it there - rootfs is separate and, in fact,
| | made visible to mount(2) in the same init_rootfs().
| |
| | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | treewide: Add SPDX license identifier for missed files | Thomas Gleixner | 2019-05-21 | 1 | -0/+1
|/
| Add SPDX license identifiers to all files which:
|
|  - Have no license information of any form
|
|  - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
|    initial scan/conversion to ignore the file
|
| These files fall under the project license, GPL v2 only. The resulting
| SPDX license identifier is:
|
|   GPL-2.0-only
|
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled | David Howells | 2018-12-20 | 1 | -0/+1
|
| Only the mount namespace code that implements mount(2) should be using
| the MS_* flags. Suppress them inside the kernel unless
| uapi/linux/mount.h is included.
|
| Signed-off-by: David Howells <dhowells@redhat.com>
| Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| Reviewed-by: David Howells <dhowells@redhat.com>
* init/do_mounts.c: add root=PARTLABEL=<name> support | Nikolaus Voss | 2018-10-31 | 1 | -0/+31
|
| Support referencing the root partition label from GPT as argument to
| the root= option on the kernel command line in analogy to referencing
| the partition uuid as root=PARTUUID=<uuid>.
|
| Specifying the partition label instead of the uuid is often much
| easier, e.g. in embedded environments when there is an A/B rootfs
| partition scheme for interruptible firmware updates (i.e. rootfsA/
| rootfsB).
|
| The partition label can be queried with the blkid command.
|
| Link: http://lkml.kernel.org/r/20180822060904.828E510665E@pc-niv.weinmann.com
| Signed-off-by: Nikolaus Voss <nikolaus.voss@loewensteinmedical.de>
| Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
| Cc: Dominik Brodowski <linux@dominikbrodowski.net>
| Cc: Sasha Levin <Alexander.Levin@microsoft.com>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Jens Axboe <axboe@kernel.dk>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
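Annotation: the two root= spellings named in the message differ only in their prefix, which a parser can dispatch on. The sketch below is a userspace illustration under that assumption; `enum root_kind` and `classify_root` are hypothetical names, and the actual lookup of a partition by label or uuid is out of scope here.

```c
#include <string.h>

/* Hypothetical classification of the value given to root=. */
enum root_kind { ROOT_PARTUUID, ROOT_PARTLABEL, ROOT_OTHER };

/*
 * Decide which form `arg` uses and point *value at the part after the
 * prefix (or at the whole argument when no known prefix matches).
 */
static enum root_kind classify_root(const char *arg, const char **value)
{
	if (strncmp(arg, "PARTUUID=", 9) == 0) {
		*value = arg + 9;
		return ROOT_PARTUUID;
	}
	if (strncmp(arg, "PARTLABEL=", 10) == 0) {
		*value = arg + 10;
		return ROOT_PARTLABEL;
	}
	*value = arg;
	return ROOT_OTHER;
}
```

For the A/B scheme mentioned in the message, a boot loader would then only need to switch between `root=PARTLABEL=rootfsA` and `root=PARTLABEL=rootfsB`.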
* init/: remove ineffective sparse disabling | Luc Van Oostenryck | 2018-08-22 | 1 | -10/+0
|
| Sparse checking used to be disabled on init/do_mounts.c and a few
| related files because "Many of the syscalls used in this file expect
| some of the arguments to be __user pointers not __kernel pointers".
|
| However since 28128c61e ("kconfig.h: Include compiler types to avoid
| missed struct attributes") the checks are, in fact, not disabled
| anymore because of the earlier include of "linux/compiler_types.h".
|
| So remove the now ineffective #undefery that was done to disable these
| warnings, as well as the associated comment.
|
| Link: http://lkml.kernel.org/r/20180617115355.53799-1-luc.vanoostenryck@gmail.com
| Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
| Cc: Dominik Brodowski <linux@dominikbrodowski.net>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Kees Cook <keescook@chromium.org>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs: add ksys_read() helper; remove in-kernel calls to sys_read() | Dominik Brodowski | 2018-04-02 | 1 | -1/+1
|
| Using this helper allows us to avoid the in-kernel calls to the
| sys_read() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_read().
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Alexander Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl() | Dominik Brodowski | 2018-04-02 | 1 | -4/+4
|
| Using this helper allows us to avoid the in-kernel calls to the
| sys_ioctl() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_ioctl().
|
| After careful review, at least some of these calls could be converted
| to do_vfs_ioctl() in future.
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Alexander Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_open() wrapper; remove in-kernel calls to sys_open() | Dominik Brodowski | 2018-04-02 | 1 | -2/+2
|
| Using this wrapper allows us to avoid the in-kernel calls to the
| sys_open() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_open().
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_close() wrapper; remove in-kernel calls to sys_close() | Dominik Brodowski | 2018-04-02 | 1 | -2/+2
|
| Using the ksys_close() wrapper allows us to get rid of in-kernel calls
| to the sys_close() syscall. The ksys_ prefix denotes that this function
| is meant as a drop-in replacement for the syscall. In particular, it
| uses the same calling convention as sys_close(), with one subtle
| difference:
|
| The few places which checked the return value did not care about the
| return value re-writing in sys_close(), so simply use a wrapper around
| __close_fd().
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_chdir() helper; remove in-kernel calls to sys_chdir() | Dominik Brodowski | 2018-04-02 | 1 | -1/+1
|
| Using this helper allows us to avoid the in-kernel calls to the
| sys_chdir() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_chdir().
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_chroot() helper; remove in-kernel calls to sys_chroot() | Dominik Brodowski | 2018-04-02 | 1 | -1/+1
|
| Using this helper allows us to avoid the in-kernel calls to the
| sys_chroot() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_chroot().
|
| In the near future, the fs-external callers of ksys_chroot() should be
| converted to use kern_path()/set_fs_root() directly. Then ksys_chroot()
| can be moved within sys_chroot() again.
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Alexander Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_mount() helper; remove in-kernel calls to sys_mount() | Dominik Brodowski | 2018-04-02 | 1 | -2/+2
|
| Using this helper allows us to avoid the in-kernel calls to the
| sys_mount() syscall. The ksys_ prefix denotes that this function is
| meant as a drop-in replacement for the syscall. In particular, it uses
| the same calling convention as sys_mount().
|
| In the near future, all callers of ksys_mount() should be converted to
| call do_mount() directly.
|
| This patch is part of a series which removes in-kernel calls to
| syscalls. On this basis, the syscall entry path can be streamlined.
| For details, see
| http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
|
| Cc: Alexander Viro <viro@zeniv.linux.org.uk>
| Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK | Levin, Alexander (Sasha Levin) | 2017-11-15 | 1 | -2/+1
|
| Convert all allocations that used a NOTRACK flag to stop using it.
|
| Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com
| Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
| Cc: Alexander Potapenko <glider@google.com>
| Cc: Eric W. Biederman <ebiederm@xmission.com>
| Cc: Michal Hocko <mhocko@kernel.org>
| Cc: Pekka Enberg <penberg@kernel.org>
| Cc: Steven Rostedt <rostedt@goodmis.org>
| Cc: Tim Hansen <devtimhansen@gmail.com>
| Cc: Vegard Nossum <vegardno@ifi.uio.no>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: Differentiate mount flags (MS_*) from internal superblock flags | David Howells | 2017-07-17 | 1 | -2/+2
|
| Differentiate the MS_* flags passed to mount(2) from the internal flags
| set in the super_block's s_flags. s_flags are now called SB_*, with the
| names and the values for the moment mirroring the MS_* flags that
| they're equivalent to.
|
| In this patch, just the headers are altered and some kernel code where
| blind automated conversion isn't necessarily correct.
|
| Note that this shows up some interesting issues:
|
|  (1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV ->
|      MNT_NODEV) without passing this on to the filesystem, but some
|      filesystems set such flags anyway.
|
|  (2) The ->remount_fs() methods of some filesystems adjust the *flags
|      argument by setting MS_* flags in it, such as MS_NOATIME - but
|      these flags are then scrubbed by do_remount_sb() (only the
|      occupants of MS_RMT_MASK are permitted: MS_RDONLY, MS_SYNCHRONOUS,
|      MS_MANDLOCK, MS_I_VERSION and MS_LAZYTIME)
|
| I'm not sure what's the best way to solve all these cases.
|
| Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
| Signed-off-by: David Howells <dhowells@redhat.com>
* VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) | David Howells | 2017-07-17 | 1 | -1/+1
|
| Firstly by applying the following with coccinelle's spatch:
|
|     @@ expression SB; @@
|     -SB->s_flags & MS_RDONLY
|     +sb_rdonly(SB)
|
| to effect the conversion to sb_rdonly(sb), then by applying:
|
|     @@ expression A, SB; @@
|     (
|     -(!sb_rdonly(SB)) && A
|     +!sb_rdonly(SB) && A
|     |
|     -A != (sb_rdonly(SB))
|     +A != sb_rdonly(SB)
|     |
|     -A == (sb_rdonly(SB))
|     +A == sb_rdonly(SB)
|     |
|     -!(sb_rdonly(SB))
|     +!sb_rdonly(SB)
|     |
|     -A && (sb_rdonly(SB))
|     +A && sb_rdonly(SB)
|     |
|     -A || (sb_rdonly(SB))
|     +A || sb_rdonly(SB)
|     |
|     -(sb_rdonly(SB)) != A
|     +sb_rdonly(SB) != A
|     |
|     -(sb_rdonly(SB)) == A
|     +sb_rdonly(SB) == A
|     |
|     -(sb_rdonly(SB)) && A
|     +sb_rdonly(SB) && A
|     |
|     -(sb_rdonly(SB)) || A
|     +sb_rdonly(SB) || A
|     )
|
|     @@ expression A, B, SB; @@
|     (
|     -(sb_rdonly(SB)) ? 1 : 0
|     +sb_rdonly(SB)
|     |
|     -(sb_rdonly(SB)) ? A : B
|     +sb_rdonly(SB) ? A : B
|     )
|
| to remove left over excess bracketage and finally by applying:
|
|     @@ expression A, SB; @@
|     (
|     -(A & MS_RDONLY) != sb_rdonly(SB)
|     +(bool)(A & MS_RDONLY) != sb_rdonly(SB)
|     |
|     -(A & MS_RDONLY) == sb_rdonly(SB)
|     +(bool)(A & MS_RDONLY) == sb_rdonly(SB)
|     )
|
| to make comparisons against the result of sb_rdonly() (which is a bool)
| work correctly.
|
| Signed-off-by: David Howells <dhowells@redhat.com>
* init: reduce rootwait polling interval time to 5ms | Jungseung Lee | 2016-12-12 | 1 | -1/+1
|
| For several devices, the rootwait time is sensitive because it directly
| affects booting time. The polling interval of rootwait is currently
| 100ms. To save unnecessary waiting time, reduce the polling interval to
| 5 ms.
|
| [akpm@linux-foundation.org: remove used-once #define]
| Link: http://lkml.kernel.org/r/20161207060743.1728-1-js07.lee@samsung.com
| Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: Christoph Hellwig <hch@lst.de>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init/do_mounts.c: add create_dev() failure log | Vishnu Pratap Singh | 2015-06-25 | 1 | -2/+7
|
| If the create_dev() function fails to create the root mount device
| (/dev/root), the kernel panics because the root device is not found,
| but there is no printk in this case. So I have added a log message for
| the case where it fails to create the root device. It will help in
| debugging.
|
| [akpm@linux-foundation.org: simplify printk(), use pr_emerg(), display errno]
| Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com>
| Acked-by: Pavel Machek <pavel@ucw.cz>
| Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
| Cc: Mike Snitzer <snitzer@redhat.com>
| Cc: Dan Ehrenberg <dehrenberg@chromium.org>
| Cc: Miklos Szeredi <mszeredi@suse.cz>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init: fix regression by supporting devices with major:minor:offset format | Chen Yu | 2015-05-05 | 1 | -2/+3
|
| Commit 283e7ad02 ("init: stricter checking of major:minor root=
| values") was so strict that it exposed the fact that a previously
| unknown device format was being used.
|
| Distributions like Ubuntu use klibc (rather than uswsusp) to resume
| the system from hibernation. klibc expresses the swap partition/file
| in the form of major:minor:offset. For example, 8:3:0 represents a
| swap partition in klibc, and klibc's resume process in the initrd will
| finally echo 8:3:0 to /sys/power/resume for manually resuming.
| However, due to commit 283e7ad02's stricter checking, 8:3:0 will be
| treated as an invalid device format, and manual resuming from
| hibernation will fail.
|
| Fix this by adding support for devices with major:minor:offset format
| when resuming from hibernation.
|
| Reported-by: Prigent, Christophe <christophe.prigent@intel.com>
| Signed-off-by: Chen Yu <yu.c.chen@intel.com>
| Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
| Signed-off-by: Mike Snitzer <snitzer@redhat.com>
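Annotation: one way to express the checking described here, shown as a userspace sketch. It accepts exactly "major:minor" or "major:minor:offset" and rejects trailing characters by using a spare `%c` conversion as a tripwire. The function name `parse_major_minor` is hypothetical, and the message itself does not say which parsing calls the kernel uses.

```c
#include <stdio.h>

/*
 * Return 0 and fill *maj / *min when `name` is "major:minor" or
 * "major:minor:offset"; return -1 otherwise. On failure the outputs may
 * hold partial values and should be ignored.
 */
static int parse_major_minor(const char *name, unsigned int *maj,
			     unsigned int *min)
{
	unsigned int offset;
	char dummy;

	/*
	 * The extra %c only converts when something follows the expected
	 * fields, so a return value one higher than wanted means junk.
	 */
	if (sscanf(name, "%u:%u%c", maj, min, &dummy) == 2 ||
	    sscanf(name, "%u:%u:%u:%c", maj, min, &offset, &dummy) == 3)
		return 0;
	return -1;
}
```

With this shape, "1:2" and "8:3:0" are accepted, while the "1:2jakshflaksjdhfa" example from the earlier strictness commit is rejected by both patterns.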
* init: stricter checking of major:minor root= values | Dan Ehrenberg | 2015-04-15 | 1 | -1/+2
|
| In the kernel command-line, previously, root=1:2jakshflaksjdhfa would
| be accepted and interpreted just like root=1:2. This patch adds
| stricter checking so that additional characters after major:minor are
| rejected by root=.
|
| The goal of this change is to help in unifying DM's interpretation of
| its block device argument by using existing kernel code
| (name_to_dev_t). Since DM rejects malformed major:minor pairs, it seems
| reasonable for root= to reject them as well.
|
| Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
| Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* init: export name_to_dev_t and mark name argument as const | Dan Ehrenberg | 2015-04-15 | 1 | -1/+2
|
| DM will switch its device lookup code to using name_to_dev_t() so it
| must be exported. Also, the @name argument should be marked const.
|
| Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
| Signed-off-by: Mike Snitzer <snitzer@redhat.com>