summaryrefslogtreecommitdiffstats
path: root/init/do_mounts.c
Commit message (Collapse)AuthorAgeFilesLines
* fs: add ksys_read() helper; remove in-kernel calls to sys_read()Dominik Brodowski2018-04-021-1/+1
| | | | | | | | | | | | | | Using this helper allows us to avoid the in-kernel calls to the sys_read() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_read(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl()Dominik Brodowski2018-04-021-4/+4
| | | | | | | | | | | | | | | | | Using this helper allows us to avoid the in-kernel calls to the sys_ioctl() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_ioctl(). After careful review, at least some of these calls could be converted to do_vfs_ioctl() in future. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_open() wrapper; remove in-kernel calls to sys_open()Dominik Brodowski2018-04-021-2/+2
| | | | | | | | | | | | | | | Using this wrapper allows us to avoid the in-kernel calls to the sys_open() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_open(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_close() wrapper; remove in-kernel calls to sys_close()Dominik Brodowski2018-04-021-2/+2
| | | | | | | | | | | | | | | | | | | | Using the ksys_close() wrapper allows us to get rid of in-kernel calls to the sys_close() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_close(), with one subtle difference: The few places which checked the return value did not care about the return value re-writing in sys_close(), so simply use a wrapper around __close_fd(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_chdir() helper; remove in-kernel calls to sys_chdir()Dominik Brodowski2018-04-021-1/+1
| | | | | | | | | | | | | | | Using this helper allows us to avoid the in-kernel calls to the sys_chdir() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_chdir(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_chroot() helper; remove-in kernel calls to sys_chroot()Dominik Brodowski2018-04-021-1/+1
| | | | | | | | | | | | | | | | | | Using this helper allows us to avoid the in-kernel calls to the sys_chroot() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_chroot(). In the near future, the fs-external callers of ksys_chroot() should be converted to use kern_path()/set_fs_root() directly. Then ksys_chroot() can be moved within sys_chroot() again. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* fs: add ksys_mount() helper; remove in-kernel calls to sys_mount()Dominik Brodowski2018-04-021-2/+2
| | | | | | | | | | | | | | | | | Using this helper allows us to avoid the in-kernel calls to the sys_mount() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_mount(). In the near future, all callers of ksys_mount() should be converted to call do_mount() directly. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACKLevin, Alexander (Sasha Levin)2017-11-151-2/+1
| | | | | | | | | | | | | | | | Convert all allocations that used a NOTRACK flag to stop using it. Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Hansen <devtimhansen@gmail.com> Cc: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* VFS: Differentiate mount flags (MS_*) from internal superblock flagsDavid Howells2017-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Differentiate the MS_* flags passed to mount(2) from the internal flags set in the super_block's s_flags. s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. In this patch, just the headers are altered and some kernel code where blind automated conversion isn't necessarily correct. Note that this shows up some interesting issues: (1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV -> MNT_NODEV) without passing this on to the filesystem, but some filesystems set such flags anyway. (2) The ->remount_fs() methods of some filesystems adjust the *flags argument by setting MS_* flags in it, such as MS_NOATIME - but these flags are then scrubbed by do_remount_sb() (only the occupants of MS_RMT_MASK are permitted: MS_RDONLY, MS_SYNCHRONOUS, MS_MANDLOCK, MS_I_VERSION and MS_LAZYTIME) I'm not sure what's the best way to solve all these cases. Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com>
* VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)David Howells2017-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Firstly by applying the following with coccinelle's spatch: @@ expression SB; @@ -SB->s_flags & MS_RDONLY +sb_rdonly(SB) to effect the conversion to sb_rdonly(sb), then by applying: @@ expression A, SB; @@ ( -(!sb_rdonly(SB)) && A +!sb_rdonly(SB) && A | -A != (sb_rdonly(SB)) +A != sb_rdonly(SB) | -A == (sb_rdonly(SB)) +A == sb_rdonly(SB) | -!(sb_rdonly(SB)) +!sb_rdonly(SB) | -A && (sb_rdonly(SB)) +A && sb_rdonly(SB) | -A || (sb_rdonly(SB)) +A || sb_rdonly(SB) | -(sb_rdonly(SB)) != A +sb_rdonly(SB) != A | -(sb_rdonly(SB)) == A +sb_rdonly(SB) == A | -(sb_rdonly(SB)) && A +sb_rdonly(SB) && A | -(sb_rdonly(SB)) || A +sb_rdonly(SB) || A ) @@ expression A, B, SB; @@ ( -(sb_rdonly(SB)) ? 1 : 0 +sb_rdonly(SB) | -(sb_rdonly(SB)) ? A : B +sb_rdonly(SB) ? A : B ) to remove left over excess bracketage and finally by applying: @@ expression A, SB; @@ ( -(A & MS_RDONLY) != sb_rdonly(SB) +(bool)(A & MS_RDONLY) != sb_rdonly(SB) | -(A & MS_RDONLY) == sb_rdonly(SB) +(bool)(A & MS_RDONLY) == sb_rdonly(SB) ) to make comparisons against the result of sb_rdonly() (which is a bool) work correctly. Signed-off-by: David Howells <dhowells@redhat.com>
* init: reduce rootwait polling interval time to 5msJungseung Lee2016-12-121-1/+1
| | | | | | | | | | | | | | | For several devices, the rootwait time is sensitive because it directly affects booting time. The polling interval of rootwait is currently 100ms. To save unnessesary waiting time, reduce the polling interval to 5 ms. [akpm@linux-foundation.org: remove used-once #define] Link: http://lkml.kernel.org/r/20161207060743.1728-1-js07.lee@samsung.com Signed-off-by: Jungseung Lee <js07.lee@samsung.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init/do_mounts.c: add create_dev() failure logVishnu Pratap Singh2015-06-251-2/+7
| | | | | | | | | | | | | | | | | If create_dev() function fails to create the root mount device (/dev/root), then it goes to panic as root device not found but there is no printk in this case. So I have added the log in case it fails to create the root device. It will help in debugging. [akpm@linux-foundation.org: simplify printk(), use pr_emerg(), display errno] Signed-off-by: Vishnu Pratap Singh <vishnu.ps@samsung.com> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Dan Ehrenberg <dehrenberg@chromium.org> Cc: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init: fix regression by supporting devices with major:minor:offset formatChen Yu2015-05-051-2/+3
| | | | | | | | | | | | | | | | | | | | | | Commit 283e7ad02 ("init: stricter checking of major:minor root= values") was so strict that it exposed the fact that a previously unknown device format was being used. Distributions like Ubuntu uses klibc (rather than uswsusp) to resume system from hibernation. klibc expressed the swap partition/file in the form of major:minor:offset. For example, 8:3:0 represents a swap partition in klibc, and klibc's resume process in initrd will finally echo 8:3:0 to /sys/power/resume for manually resuming. However, due to commit 283e7ad02's stricter checking, 8:3:0 will be treated as an invalid device format, and manual resuming from hibernation will fail. Fix this by adding support for devices with major:minor:offset format when resuming from hibernation. Reported-by: Prigent, Christophe <christophe.prigent@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* init: stricter checking of major:minor root= valuesDan Ehrenberg2015-04-151-1/+2
| | | | | | | | | | | | | | | In the kernel command-line, previously, root=1:2jakshflaksjdhfa would be accepted and interpreted just like root=1:2. This patch adds stricter checking so that additional characters after major:minor are rejected by root=. The goal of this change is to help in unifying DM's interpretation of its block device argument by using existing kernel code (name_to_dev_t). But DM rejects malformed major:minor pairs, it seems reasonable for root= to reject them as well. Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* init: export name_to_dev_t and mark name argument as constDan Ehrenberg2015-04-151-1/+2
| | | | | | | | DM will switch its device lookup code to using name_to_dev_t() so it must be exported. Also, the @name argument should be marked const. Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* init: fix read-write root mountMiklos Szeredi2014-12-171-2/+4
| | | | | | | | | | | | | | | | | | | If mount flags don't have MS_RDONLY, iso9660 returns EACCES without actually checking if it's an iso image. This tricks mount_block_root() into retrying with MS_RDONLY. This results in a read-only root despite the "rw" boot parameter if the actual filesystem was checked after iso9660. I believe the behavior of iso9660 is okay, while that of mount_block_root() is not. It should rather try all types without MS_RDONLY and only then retry with MS_RDONLY. This change also makes the code more robust against the case when EACCES is returned despite MS_RDONLY, which would've resulted in a lockup. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* init/do_mounts: better syntax descriptionPavel Machek2014-08-281-1/+2
| | | | | | | Specify hex device number unambiquously. Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* init/do_mounts.c: fix comment errorchishanmingshen2014-04-031-2/+2
| | | | | | Signed-off-by: chishanmingshen <chishanmingshen@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init/do_mounts.c: add maj:min syntax commentSebastian Capella2013-11-131-0/+2
| | | | | | | | | | The name_to_dev_t function has a comment block which lists the supported syntaxes for the device name. Add a bullet for the <major>:<minor> syntax, which is already supported in the code Signed-off-by: Sebastian Capella <sebastian.capella@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* initmpfs: use initramfs if rootfstype= or root= specifiedRob Landley2013-09-111-4/+11
| | | | | | | | | | | | | | | | | | | | | Command line option rootfstype=ramfs to obtain old initramfs behavior, and use ramfs instead of tmpfs for stub when root= defined (for cosmetic reasons). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Rob Landley <rob@landley.net> Cc: Jeff Layton <jlayton@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen Warren <swarren@nvidia.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* initmpfs: make rootfs use tmpfs when CONFIG_TMPFS enabledRob Landley2013-09-111-2/+8
| | | | | | | | | | | | | | | | | | | | | | | Conditionally call the appropriate fs_init function and fill_super functions. Add a use once guard to shmem_init() to simply succeed on a second call. (Note that IS_ENABLED() is a compile time constant so dead code elimination removes unused function calls when CONFIG_TMPFS is disabled.) Signed-off-by: Rob Landley <rob@landley.net> Cc: Jeff Layton <jlayton@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen Warren <swarren@nvidia.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* initmpfs: move rootfs code from fs/ramfs/ to init/Rob Landley2013-09-111-0/+32
| | | | | | | | | | | | | | | | | | | | | | | When the rootfs code was a wrapper around ramfs, having them in the same file made sense. Now that it can wrap another filesystem type, move it in with the init code instead. This also allows a subsequent patch to access rootfstype= command line arg. Signed-off-by: Rob Landley <rob@landley.net> Cc: Jeff Layton <jlayton@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen Warren <swarren@nvidia.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* insert missing space in printk line of root_delayToralf Förster2013-07-031-1/+1
| | | | | | | | Trivial, but it really looks better. Signed-off-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* driver-core: constify data for class_find_device()Michał Mirosław2013-02-061-2/+2
| | | | | | | | | | | | | | | | | | All in-kernel users of class_find_device() don't really need mutable data for match callback. In two places (kernel/power/suspend_test.c, drivers/scsi/osd/osd_uld.c) this patch changes match callbacks to use const search data. The const is propagated to rtc_class_open() and power_supply_get_by_name() parameters. Note that there's a dev reference leak in suspend_test.c that's not touched in this patch. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* block: partition: msdos: provide UUIDs for partitionsStephen Warren2012-11-231-0/+4
| | | | | | | | | | | | | | | | | | The MSDOS/MBR partition table includes a 32-bit unique ID, often referred to as the NT disk signature. When combined with a partition number within the table, this can form a unique ID similar in concept to EFI/GPT's partition UUID. Constructing and recording this value in struct partition_meta_info allows MSDOS partitions to be referred to on the kernel command-line using the following syntax: root=PARTUUID=0002dd75-01 Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* init: reduce PARTUUID min length to 1 from 36Stephen Warren2012-11-231-13/+22
| | | | | | | | | | | | | | | | | | | | | Reduce the minimum length for a root=PARTUUID= parameter to be considered valid from 36 to 1. EFI/GPT partition UUIDs are always exactly 36 characters long, hence the previous limit. However, the next patch will support DOS/MBR UUIDs too, which have a different, shorter, format. Instead of validating any particular length, just ensure that at least some non-empty value was given by the user. Also, consider a missing UUID value to be a parsing error, in the same vein as if /PARTNROFF exists and can't be parsed. As such, make both error cases print a message and disable rootwait. Convert to pr_err while we're at it. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: store partition_meta_info.uuid as a stringStephen Warren2012-11-231-11/+17
| | | | | | | | | | | | | | | | | | | This will allow other types of UUID to be stored here, aside from true UUIDs. This also simplifies code that uses this field, since it's usually constructed from a, used as a, or compared to other, strings. Note: A simplistic approach here would be to set uuid_str[36]=0 whenever a /PARTNROFF option was found to be present. However, this modifies the input string, and causes subsequent calls to devt_from_partuuid() not to see the /PARTNROFF option, which causes different results. In order to avoid misleading future maintainers, this parameter is marked const. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* vfs: allocate page instead of names_cache buffer in mount_block_rootJeff Layton2012-10-121-3/+4
| | | | | | | | | | | | | | | | | | | | | | First, it's incorrect to call putname() after __getname_gfp() since the bare __getname_gfp() call skips the auditing code, while putname() doesn't. mount_block_root allocates a PATH_MAX buffer via __getname_gfp, and then calls get_fs_names to fill the buffer. That function can call get_filesystem_list which assumes that that buffer is a full page in size. On arches where PAGE_SIZE != 4k, then this could potentially overrun. In practice, it's hard to imagine the list of filesystem names even approaching 4k, but it's best to be safe. Just allocate a page for this purpose instead. With this, we can also remove the __getname_gfp() definition since there are no more callers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* init: disable sparse checking of the mount.o source filesH Hartley Sweeten2012-05-311-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | The init/mount.o source files produce a number of sparse warnings of the type: warning: incorrect type in argument 1 (different address spaces) expected char [noderef] <asn:1>*dev_name got char *name This is due to the syscalls expecting some of the arguments to be user pointers but they are being passed as kernel pointers. This is harmless but adds a lot of noise to a sparse build. To limit the noise just disable the sparse checking in the relevant source files, but still display a warning so that the user knows this has been done. Since the sparse checking has been disabled we can also remove the __user __force casts that are scattered thru the source. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init: don't try mounting device as nfs root unless type fully matchesSasha Levin2012-05-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we'll try mounting any device who's major device number is UNNAMED_MAJOR as NFS root. This would happen for non-NFS devices as well (such as 9p devices) but it wouldn't cause any issues since mounting the device as NFS would fail quickly and the code proceeded to doing the proper mount: [ 101.522716] VFS: Unable to mount root fs via NFS, trying floppy. [ 101.534499] VFS: Mounted root (9p filesystem) on device 0:18. Commit 6829a048102a ("NFS: Retry mounting NFSROOT") introduced retries when mounting NFS root, which means that now we don't immediately fail and instead it takes an additional 90+ seconds until we stop retrying, which has revealed the issue this patch fixes. This meant that it would take an additional 90 seconds to boot when we're not using a device type which gets detected in order before NFS. This patch modifies the NFS type check to require device type to be 'Root_NFS' instead of requiring the device to have an UNNAMED_MAJOR major. This makes boot process cleaner since we now won't go through the NFS mounting code at all when the device isn't an NFS root ("/dev/nfs"). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init/do_mounts.c: print error code on mount failureBernhard Walle2012-03-231-2/+2
| | | | | | | | | | | Printing the error code makes it easier to debug the cause of a mount failure. For example I had the problem that the root file system could not be mounted read-writeable because my SD card was write-protected. Without an error code it looks like the SD card was not detected at all. Signed-off-by: Bernhard Walle <bernhard@bwalle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2012-01-101-4/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Change the default setting of the nfs4_disable_idmapping parameter NFSv4: Save the owner/group name string when doing open NFS: Remove pNFS bloat from the generic write path pnfs-obj: Must return layout on IO error pnfs-obj: pNFS errors are communicated on iodata->pnfs_error NFS: Cache state owners after files are closed NFS: Clean up nfs4_find_state_owners_locked() NFSv4: include bitmap in nfsv4 get acl data nfs: fix a minor do_div portability issue NFSv4.1: cleanup comment and debug printk NFSv4.1: change nfs4_free_slot parameters for dynamic slots NFSv4.1: cleanup init and reset of session slot tables NFSv4.1: fix backchannel slotid off-by-one bug nfs: fix regression in handling of context= option in NFSv4 NFS - fix recent breakage to NFS error handling. NFS: Retry mounting NFSROOT SUNRPC: Clean up the RPCSEC_GSS service ticket requests
| * NFS: Retry mounting NFSROOTChuck Lever2012-01-051-4/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lukas Razik <linux@razik.name> reports that on his SPARC system, booting with an NFS root file system stopped working after commit 56463e50 "NFS: Use super.c for NFSROOT mount option parsing." We found that the network switch to which Lukas' client was attached was delaying access to the LAN after the client's NIC driver reported that its link was up. The delay was longer than the timeouts used in the NFS client during mounting. NFSROOT worked for Lukas before commit 56463e50 because in those kernels, the client's first operation was an rpcbind request to determine which port the NFS server was listening on. When that request failed after a long timeout, the client simply selected the default NFS port (2049). By that time the switch was allowing access to the LAN, and the mount succeeded. Neither of these client behaviors is desirable, so reverting 56463e50 is really not a choice. Instead, introduce a mechanism that retries the NFSROOT mount request several times. This is the same tactic that normal user space NFS mounts employ to overcome server and network delays. Signed-off-by: Lukas Razik <linux@razik.name> [ cel: match kernel coding style, add proper patch description ] [ cel: add exponential back-off ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Lukas Razik <linux@razik.name> Cc: stable@kernel.org # > 2.6.38 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | vfs: prefer ->dentry->d_sb to ->mnt->mnt_sbAl Viro2012-01-061-4/+6
|/ | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* init: add root=PARTUUID=UUID/PARTNROFF=%d supportWill Drewry2011-11-021-5/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expand root=PARTUUID=UUID syntax to support selecting a root partition by integer offset from a known, unique partition. This approach provides similar properties to specifying a device and partition number, but using the UUID as the unique path prior to evaluating the offset. For example, root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1 selects the partition with UUID 99DE.. then select the next partition. This change is motivated by a particular usecase in Chromium OS where the bootloader can easily determine what partition it is on (by UUID) but doesn't perform general partition table walking. That said, support for this model provides a direct mechanism for the user to modify the root partition to boot without specifically needing to extract each UUID or update the bootloader explicitly when the root partition UUID is changed (if it is recreated to be larger, for instance). Pinning to a /boot-style partition UUID allows the arbitrary root partition reconfiguration/modifications with slightly less ambiguity than just [dev][partition] and less stringency than the specific root partition UUID. [sfr@canb.auug.org.au: fix init sections warning] Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* fs: use appropriate printk priority levelsMandeep Singh Baines2011-03-221-1/+2
| | | | | | | | | | | | | printk()s without a priority level default to KERN_WARNING. To reduce noise at KERN_WARNING, this patch set the priority level appriopriately for unleveled printks()s. This should be useful to folks that look at dmesg warnings closely. Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* name_to_dev_t() must not call __init codeJan Beulich2011-01-031-1/+1
| | | | | | | | | The function can't be __init itself (being called from some sysfs handler), and hence none of the functions it calls can be either. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* init: mark __user address space on string literalsNamhyung Kim2010-10-261-2/+2
| | | | | | | | | | | When calling syscall service routines in kernel, some of arguments should be user pointers but were missing __user markup on string literals. Add it. Removes some sparse warnings. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'nfs-for-2.6.37' of ↵Linus Torvalds2010-10-251-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits) SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred nfs: fix unchecked value Ask for time_delta during fsinfo probe Revalidate caches on lock SUNRPC: After calling xprt_release(), we must restart from call_reserve NFSv4: Fix up the 'dircount' hint in encode_readdir NFSv4: Clean up nfs4_decode_dirent NFSv4: nfs4_decode_dirent must clear entry->fattr->valid NFSv4: Fix a regression in decode_getfattr NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer NFS: Ensure we check all allocation return values in new readdir code NFS: Readdir plus in v4 NFS: introduce generic decode_getattr function NFS: check xdr_decode for errors NFS: nfs_readdir_filler catch all errors NFS: readdir with vmapped pages NFS: remove page size checking code NFS: decode_dirent should use an xdr_stream SUNRPC: Add a helper function xdr_inline_peek NFS: remove readdir plus limit ...
| * NFS: Use super.c for NFSROOT mount option parsingChuck Lever2010-09-171-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace duplicate code in NFSROOT for mounting an NFS server on '/' with logic that uses the existing mainline text-based logic in the NFS client. Add documenting comments where appropriate. Note that this means NFSROOT mounts now use the same default settings as v2/v3 mounts done via mount(2) from user space. vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default> As before, however, no version/protocol negotiation with the server is done. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | do_mounts: only enable PARTUUID for CONFIG_BLOCKJens Axboe2010-09-171-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_BLOCK is not enabled: init/do_mounts.c:71: error: implicit declaration of function 'dev_to_part' init/do_mounts.c:71: warning: initialization makes pointer from integer without a cast init/do_mounts.c:73: error: dereferencing pointer to incomplete type init/do_mounts.c:76: error: dereferencing pointer to incomplete type init/do_mounts.c:76: error: dereferencing pointer to incomplete type init/do_mounts.c:102: error: implicit declaration of function 'part_pack_uuid' init/do_mounts.c:104: error: 'block_class' undeclared (first use in this function) Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | core: match_dev_by_uuid() should not be marked __initJens Axboe2010-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It is also called outside the scope of init functions. Stephen reports: WARNING: init/mounts.o(.text+0x21a): Section mismatch in reference from the function name_to_dev_t() to the function .init.text:match_dev_by_uuid() The function name_to_dev_t() references the function __init match_dev_by_uuid(). This is often because name_to_dev_t lacks a __init annotation or the annotation of match_dev_by_uuid is wrong. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | init: add support for root devices specified by partition UUIDWill Drewry2010-09-151-0/+66
|/ | | | | | | | | | | | | | | | | | | This is the third patch in a series which adds support for storing partition metadata, optionally, off of the hd_struct. One major use for that data is being able to resolve partition by other identities than just the index on a block device. Device enumeration varies by platform and there's a benefit to being able to use something like EFI GPT's GUIDs to determine the correct block device and partition to mount as the root. This change adds that support to root= by adding support for the following syntax: root=PARTUUID=hex-uuid Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Driver Core: devtmpfs - kernel-maintained tmpfs-based /devKay Sievers2009-09-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devtmpfs lets the kernel create a tmpfs instance called devtmpfs very early at kernel initialization, before any driver-core device is registered. Every device with a major/minor will provide a device node in devtmpfs. Devtmpfs can be changed and altered by userspace at any time, and in any way needed - just like today's udev-mounted tmpfs. Unmodified udev versions will run just fine on top of it, and will recognize an already existing kernel-created device node and use it. The default node permissions are root:root 0600. Proper permissions and user/group ownership, meaningful symlinks, all other policy still needs to be applied by userspace. If a node is created by devtmps, devtmpfs will remove the device node when the device goes away. If the device node was created by userspace, or the devtmpfs created node was replaced by userspace, it will no longer be removed by devtmpfs. If it is requested to auto-mount it, it makes init=/bin/sh work without any further userspace support. /dev will be fully populated and dynamic, and always reflect the current device state of the kernel. With the commonly used dynamic device numbers, it solves the problem where static devices nodes may point to the wrong devices. It is intended to make the initial bootup logic simpler and more robust, by de-coupling the creation of the inital environment, to reliably run userspace processes, from a complex userspace bootstrap logic to provide a working /dev. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jan Blunck <jblunck@suse.de> Tested-By: Harald Hoyer <harald@redhat.com> Tested-By: Scott James Remnant <scott@ubuntu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* fs: fix do_mount_root() false positive kmemcheck warningVegard Nossum2009-06-151-1/+2
| | | | | | | | | | | | This false positive is due to the fact that do_mount_root() fakes a mount option (which is normally read from userspace), and the kernel unconditionally reads a whole page for the mount option. Hide the false positive by using the new __getname_gfp() with the __GFP_NOTRACK_FALSE_POSITIVE flag. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
* Get rid of indirect include of fs_struct.hAl Viro2009-03-311-0/+1
| | | | | | | | Don't pull it in sched.h; very few files actually need it and those can include directly. sched.h itself only needs forward declaration of struct fs_struct; Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Consolidate driver_probe_done() loops into one placeArjan van de Ven2009-02-211-4/+9
| | | | | | | | | | | | there's a few places that currently loop over driver_probe_done(), and I'm about to add another one. This patch abstracts it into a helper to reduce duplication. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Acked-by: Greg KH <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* async: Asynchronous function calls to speed up kernel bootArjan van de Ven2009-01-071-0/+2
| | | | | | | | | | | | | | | | | | | Right now, most of the kernel boot is strictly synchronous, such that various hardware delays are done sequentially. In order to make the kernel boot faster, this patch introduces infrastructure to allow doing some of the initialization steps asynchronously, which will hide significant portions of the hardware delays in practice. In order to not change device order and other similar observables, this patch does NOT do full parallel initialization. Rather, it operates more in the way an out of order CPU does; the work may be done out of order and asynchronous, but the observable effects (instruction retiring for the CPU) are still done in the original sequence. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>