summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'writeback-lock-fix' of ↵Linus Torvalds2012-06-121-0/+1
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull writeback locking fix from Wu Fengguang: "fix unbalanced wb->list_lock in 3.5-rc1" * tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Fix lock imbalance in writeback_sb_inodes()
| * writeback: Fix lock imbalance in writeback_sb_inodes()Jan Kara2012-06-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix bug introduced by 169ebd90. We have to have wb_list_lock locked when restarting writeback loop after having waited for inode writeback. Bug description by Ted Tso: I can reproduce this fairly easily by using ext4 w/o a journal, running under KVM with 1024megs memory, with fsstress (xfstests #13): [ 45.153294] ===================================== [ 45.154784] [ BUG: bad unlock balance detected! ] [ 45.155591] 3.5.0-rc1-00002-gb22b1f1 #124 Not tainted [ 45.155591] ------------------------------------- [ 45.155591] flush-254:16/2499 is trying to release lock (&(&wb->list_lock)->rlock) at: [ 45.155591] [<c022c3da>] writeback_sb_inodes+0x160/0x327 [ 45.155591] but there are no more locks to release! Reported-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
* | exofs: fix sparse non-ANSI function warningRandy Dunlap2012-06-121-1/+1
| | | | | | | | | | | | | | | | | | Fix sparse non-ANSI function warning: fs/exofs/sys.c:112:28: warning: non-ANSI function declaration of function 'exofs_sysfs_dbg_print' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2012-06-082-5/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Theodore Ts'o: "This update contains two bug fixes, both destined for the stable tree. Perhaps the most important is one which fixes ext4 when used with file systems originally formatted for use with ext3, but then later converted to take advantage of ext4." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't set i_flags in EXT4_IOC_SETFLAGS ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg
| * | ext4: don't set i_flags in EXT4_IOC_SETFLAGSTao Ma2012-06-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7990696 uses the ext4_{set,clear}_inode_flags() functions to change the i_flags automatically but fails to remove the error setting of i_flags. So we still have the problem of trashing state flags. Fix this by removing the assignment. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
| * | ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bgTheodore Ts'o2012-06-071-4/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ext3 filesystems that are converted to use as many ext4 file system features as possible will enable uninit_bg to speed up e2fsck times. These file systems will have a native ext3 layout of inode tables and block allocation bitmaps (as opposed to ext4's flex_bg layout). Unfortunately, in these cases, when first allocating a block in an uninitialized block group, ext4 would incorrectly calculate the number of free blocks in that block group, and then errorneously report that the file system was corrupt: EXT4-fs error (device vdd): ext4_mb_generate_buddy:741: group 30, 32254 clusters in bitmap, 32258 in gd This problem can be reproduced via: mke2fs -q -t ext4 -O ^flex_bg /dev/vdd 5g mount -t ext4 /dev/vdd /mnt fallocate -l 4600m /mnt/test The problem was caused by a bone headed mistake in the check to see if a particular metadata block was part of the block group. Many thanks to Kees Cook for finding and bisecting the buggy commit which introduced this bug (commit fd034a84e1, present since v3.2). Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Kees Cook <keescook@chromium.org> Cc: stable@kernel.org
* | Merge tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifsLinus Torvalds2012-06-081-2/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull UBI/UBIFS fixes from Artem Bityutskiy: "Fix UBI and UBIFS - they refuse to work without debugfs. This was broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging Kconfig switches. Also, correct locking in 'ubi_wl_flush()' - it was extended to support flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal." * tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs: UBI: correct ubi_wl_flush locking UBIFS: fix debugfs-less systems support UBI: fix debugfs-less systems support
| * | UBIFS: fix debugfs-less systems supportArtem Bityutskiy2012-06-071-2/+10
| |/ | | | | | | | | | | | | | | | | | | | | Commit "f70b7e5 UBIFS: remove Kconfig debugging option" broke UBIFS and it refuses to initialize if debugfs (CONFIG_DEBUG_FS) is disabled. I incorrectly assumed that debugfs files creation function will return success if debugfs is disabled, but they actually return -ENODEV. This patch fixes the issue. Reported-by: Paul Parsons <lost.distance@yahoo.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Paul Parsons <lost.distance@yahoo.com>
* | Revert "vfs: stop d_splice_alias creating directory aliases"Linus Torvalds2012-06-081-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 7732a557b1342c6e6966efb5f07effcf99f56167 (and commit 3f50fff4dace23d3cfeb195d5cd4ee813cee68b7, which was a follow-up cleanup). We're chasing an elusive bug that Dave Jones can apparently reproduce using his system call fuzzer tool, and that looks like some kind of locking ordering problem on the directory i_mutex chain. Our i_mutex locking is rather complex, and depends on the topological ordering of the directories, which is why we have been very wary of splicing directory entries around. Of course, we really don't want to ever see aliased unconnected directories anyway, so none of this should ever happen, but this revert aims to basically get us back to a known older state. Bruce points to some of the previous discussion at http://marc.info/?i=<20110310105821.GE22723@ZenIV.linux.org.uk> and in particular a long post from Neil: http://marc.info/?i=<20110311150749.2fa2be66@notabene.brown> It should be noted that it's possible that Dave's problems come from other changes altohgether, including possibly just the fact that Dave constantly is teachning his fuzzer new tricks. So what appears to be a new bug could in fact be an old one that just gets newly triggered, but reverting these patches as "still under heavy discussion" is the right thing regardless. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Revert "mm: correctly synchronize rss-counters at exit/exec"Linus Torvalds2012-06-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 40af1bbdca47e5c8a2044039bb78ca8fd8b20f94. It's horribly and utterly broken for at least the following reasons: - calling sync_mm_rss() from mmput() is fundamentally wrong, because there's absolutely no reason to believe that the task that does the mmput() always does it on its own VM. Example: fork, ptrace, /proc - you name it. - calling it *after* having done mmdrop() on it is doubly insane, since the mm struct may well be gone now. - testing mm against NULL before you call it is insane too, since a NULL mm there would have caused oopses long before. .. and those are just the three bugs I found before I decided to give up looking for me and revert it asap. I should have caught it before I even took it, but I trusted Andrew too much. Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | mm: correctly synchronize rss-counters at exit/execKonstantin Khlebnikov2012-06-071-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mm->rss_stat counters have per-task delta: task->rss_stat. Before changing task->mm pointer the kernel must flush this delta with sync_mm_rss(). do_exit() already calls sync_mm_rss() to flush the rss-counters before committing the rss statistics into task->signal->maxrss, taskstats, audit and other stuff. Unfortunately the kernel does this before calling mm_release(), which can call put_user() for processing task->clear_child_tid. So at this point we can trigger page-faults and task->rss_stat becomes non-zero again. As a result mm->rss_stat becomes inconsistent and check_mm() will print something like this: | BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1 | BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1 This patch moves sync_mm_rss() into mm_release(), and moves mm_release() out of do_exit() and calls it earlier. After mm_release() there should be no pagefaults. [akpm@linux-foundation.org: tweak comment] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> [3.4.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-06-055-10/+74
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix blksize calculation fuse: fix stat call on 32 bit platforms fuse: optimize fallocate on permanent failure fuse: add FALLOCATE operation fuse: Convert to kstrtoul_from_user
| * | fuse: fix blksize calculationMiklos Szeredi2012-05-141-1/+9
| | | | | | | | | | | | | | | | | | | | | Don't use inode->i_blkbits which might be stale, instead calculate the blksize information from the freshly obtained attributes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: fix stat call on 32 bit platformsPavel Shilovsky2012-05-143-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we store attr->ino at inode->i_ino, return attr->ino at the first time and then return inode->i_ino if the attribute timeout isn't expired. That's wrong on 32 bit platforms because attr->ino is 64 bit and inode->i_ino is 32 bit in this case. Fix this by saving 64 bit ino in fuse_inode structure and returning it every time we call getattr. Also squash attr->ino into inode->i_ino explicitly. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: optimize fallocate on permanent failureMiklos Szeredi2012-04-262-0/+10
| | | | | | | | | | | | | | | | | | | | | If userspace filesystem doesn't support fallocate, remember this and don't send request next time. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: add FALLOCATE operationAnatol Pomozov2012-04-251-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | fallocate filesystem operation preallocates media space for the given file. If fallocate returns success then any subsequent write to the given range never fails with 'not enough space' error. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: Convert to kstrtoul_from_userPeter Huewe2012-04-251-8/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the code for getting an number from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* | | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2012-06-048-143/+167
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French. * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Move get_next_mid to ops struct CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe CIFS: Improve identation in cifs_unlock_range CIFS: Fix possible wrong memory allocation
| * | | CIFS: Move get_next_mid to ops structPavel Shilovsky2012-06-017-95/+103
| | | | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safePavel Shilovsky2012-06-011-2/+4
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | CIFS: Improve identation in cifs_unlock_rangePavel Shilovsky2012-06-011-40/+35
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
| * | | CIFS: Fix possible wrong memory allocationPavel Shilovsky2012-06-011-6/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when cifs_reconnect sets maxBuf to 0 and we try to calculate a size of memory we need to store locks. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
* | | | vfs: Fix /proc/<tid>/fdinfo/<fd> file handlingLinus Torvalds2012-06-041-7/+10
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cyrill Gorcunov reports that I broke the fdinfo files with commit 30a08bf2d31d ("proc: move fd symlink i_mode calculations into tid_fd_revalidate()"), and he's quite right. The tid_fd_revalidate() function is not just used for the <tid>/fd symlinks, it's also used for the <tid>/fdinfo/<fd> files, and the permission model for those are different. So do the dynamic symlink permission handling just for symlinks, making the fdinfo files once more appear as the proper regular files they are. Of course, Al Viro argued (probably correctly) that we shouldn't do the symlink permission games at all, and make the symlinks always just be the normal 'lrwxrwxrwx'. That would have avoided this issue too, but since somebody noticed that the permissions had changed (which was the reason for that original commit 30a08bf2d31d in the first place), people do apparently use this feature. [ Basically, you can use the symlink permission data as a cheap "fdinfo" replacement, since you see whether the file is open for reading and/or writing by just looking at st_mode of the symlink. So the feature does make sense, even if the pain it has caused means we probably shouldn't have done it to begin with. ] Reported-and-tested-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'akpm' (Fixups for Andrew's patchbomb)Linus Torvalds2012-06-0113-45/+33
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge fixups for the mac NLS tables from Andrew. * emailed from Andrew Morton, and one cleanup by me: nls: fix (and rename) mac NLS table files and config options fs/nls/Makefile: remove bogus CONFIG_ assignments
| * | | nls: fix (and rename) mac NLS table files and config optionsLinus Torvalds2012-06-0113-33/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The config options in the Kconfig file (with _CODEPAGE_ in the name) didn't match the config option name in the Makefile (no _CODEPAGE_). And both of them were of the hard-to-read MACXYZZY variety, which made them hard to parse for normal humans: MACROMAN easily reads as "macro man", not as "Mac Roman". So rename the options to be consistent, and be NLS_MAC_xyzzy. Rename the files to be mac-xyzzy.c too, and drop the "nls" part entirely (it's already in the directory name). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | fs/nls/Makefile: remove bogus CONFIG_ assignmentsAndrew Morton2012-06-011-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These were debug things which snuck through. Reported-by: Yinghai Lu <yinghai@kernel.org> Cc: Vladimir Serbinenko <phcoder@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtdLinus Torvalds2012-06-016-13/+97
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull mtd update from David Woodhouse: - More robust parsing especially of xattr data in JFFS2 - Updates to mxc_nand and gpmi drivers to support new boards and device tree - Improve consistency of information about ECC strength in NAND devices - Clean up partition handling of plat_nand - Support NAND drivers without dedicated access to OOB area - BCH hardware ECC support for OMAP - Other fixes and cleanups, and a few new device IDs Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to added include files next to each other. * tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits) mtd: mxc_nand: move ecc strengh setup before nand_scan_tail mtd: block2mtd: fix recursive call of mtd_writev mtd: gpmi-nand: define ecc.strength mtd: of_parts: fix breakage in Kconfig mtd: nand: fix scan_read_raw_oob mtd: docg3 fix in-middle of blocks reads mtd: cfi_cmdset_0002: Slight cleanup of fixup messages mtd: add fixup for S29NS512P NOR flash. jffs2: allow to complete xattr integrity check on first GC scan jffs2: allow to discriminate between recoverable and non-recoverable errors mtd: nand: omap: add support for hardware BCH ecc ARM: OMAP3: gpmc: add BCH ecc api and modes mtd: nand: check the return code of 'read_oob/read_oob_raw' mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw' mtd: m25p80: Add support for Winbond W25Q80BW jffs2: get rid of jffs2_sync_super jffs2: remove unnecessary GC pass on sync jffs2: remove unnecessary GC pass on umount jffs2: remove lock_super mtd: gpmi: add gpmi support for mx6q ...
| * | | | jffs2: allow to complete xattr integrity check on first GC scanJean-Christophe DUBOIS2012-05-133-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike file data integrity the xattr data integrity was not checked before some explicit access to the attribute was made. This could leave in the system a number of corrupted extended attributes which will be detected only at access time and possibly at a very late time compared to the time the corruption actually happened. This patch adds the ability to check for extended attribute integrity on first GC scan pass (similar to file data integrity check). This allows for all present attributes to be completly verified before any use of them. In order to work correctly this patch also needs the patch allowing JFFS2 to discriminate between recoverable and non recoverable errors on extended attributes. Signed-off-by: Jean-Christophe DUBOIS <jcd@tribudubois.net> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: allow to discriminate between recoverable and non-recoverable errorsJean-Christophe DUBOIS2012-05-131-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is basically a revert of commit f326966b3df47f4fa7e90425f60efdd30c31fe19. It allows JFFS2 to make the distinction between a potential transient error (reading or writing the media) and a non recoverable error like a bad CRC on the extended attribute data or some insconsitent parameters. In order to make clear that the error is indeed intended to report a corrupted attribute, a new local error code (JFFS2_XATTR_IS_CORRUPTED) is introduced rather than returning a confusing positive EIO, which is what led to the inappropriate "fix" last time. This error code is never reported to user space and only checked locally in this file. Signed-off-by: Jean-Christophe DUBOIS <jcd@tribudubois.net> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: get rid of jffs2_sync_superArtem Bityutskiy2012-05-134-21/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently JFFS2 file-system maps the VFS "superblock" abstraction to the write-buffer. Namely, it uses VFS services to synchronize the write-buffer periodically. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds no matter what. So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super' VFS service, and then remove it together with the kernel thread. This patch switches the JFFS2 write-buffer management from '->write_super()'/'->s_dirt' to a delayed work. Instead of setting the 's_dirt' flag we just schedule a delayed work for synchronizing the write-buffer. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: remove unnecessary GC pass on syncArtem Bityutskiy2012-05-131-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not need to call 'jffs2_write_super()' on sync. This function causes a GC pass to make sure the current contents is pushed out with the data which we already have on the media. But this is not needed on unmount and only slows sync down unnecessarily. It is enough to just sync the write-buffer. This call was added by one of the generic VFS rework patch-sets, see d579ed00aa96a7f7486978540a0d7cecaff742ae. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: remove unnecessary GC pass on umountArtem Bityutskiy2012-05-131-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not need to call 'jffs2_write_super()' on unmount. This function causes a GC pass to make sure the current contents is pushed out with the data which we already have on the media. But this is not needed on unmount and only slows unmount down unnecessarily. It is enough to just sync the write-buffer. This call was added by one of the generic VFS rework patch-sets, see 8c85e125124a473d6f3e9bb187b0b84207f81d91. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: remove lock_superArtem Bityutskiy2012-05-131-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We do not need 'lock_super()'/'unlock_super()' in JFFS2 - kill them. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: refactor csize usage in jffs2_do_read_inode_internal()Xi Wang2012-05-131-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the verbose `je32_to_cpu(latest_node->csize)' with a shorter `csize'. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Artem Bityutskiy <dedekind1@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | jffs2: validate symlink size in jffs2_do_read_inode_internal()Xi Wang2012-05-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `csize' is read from disk and thus needs validation. Otherwise a bogus value 0xffffffff would turn the subsequent kmalloc(csize + 1, ...) into kmalloc(0, ...), leading to out-of-bounds write. This patch limits `csize' to JFFS2_MAX_NAME_LEN, which is also used in jffs2_symlink(). Artem: we actually validate csize by checking CRC, so this 0xFFs cannot come from empty flash region. But I guess an attacker could feed JFFS2 an image with random csize value, including 0xFFs. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | JFFS2: Add parameter to reserve disk space for rootDaniel Drake2012-05-133-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new rp_size= parameter which creates a "reserved pool" of disk space which can only be used by root. Other users are not permitted to write to disk when the available space is less than the pool size. Based on original code by Artem Bityutskiy in https://dev.laptop.org/ticket/5317 [dwmw2: use capable(CAP_SYS_RESOURCE) not uid/gid check, fix debug prints] Signed-off-by: Daniel Drake <dsd@laptop.org> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-06-013-12/+0
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of signal handling patches from Al Viro: "This time it's mostly helpers and conversions to them; there's a lot of stuff remaining in the tree, but that'll either go in -rc2 (isolated bug fixes, ideally via arch maintainers' trees) or will sit there until the next cycle." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: x86: get rid of calling do_notify_resume() when returning to kernel mode blackfin: check __get_user() return value whack-a-mole with TIF_FREEZE FRV: Optimise the system call exit path in entry.S [ver #2] FRV: Shrink TIF_WORK_MASK [ver #2] FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions new helper: signal_delivered() powerpc: get rid of restore_sigmask() most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set set_restore_sigmask() is never called without SIGPENDING (and never should be) TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set don't call try_to_freeze() from do_signal() pull clearing RESTORE_SIGMASK into block_sigmask() sh64: failure to build sigframe != signal without handler openrisc: tracehook_signal_handler() is supposed to be called on success new helper: sigmask_to_save() new helper: restore_saved_sigmask() new helpers: {clear,test,test_and_clear}_restore_sigmask() HAVE_RESTORE_SIGMASK is defined on all architectures now
| * | | | | HAVE_RESTORE_SIGMASK is defined on all architectures nowAl Viro2012-06-013-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Everyone either defines it in arch thread_info.h or has TIF_RESTORE_SIGMASK and picks default set_restore_sigmask() in linux/thread_info.h. Kill the ifdefs, slap #error in linux/thread_info.h to catch breakage when new ones get merged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-06-0189-1081/+1175
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs changes from Al Viro. "A lot of misc stuff. The obvious groups: * Miklos' atomic_open series; kills the damn abuse of ->d_revalidate() by NFS, which was the major stumbling block for all work in that area. * ripping security_file_mmap() and dealing with deadlocks in the area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in general. * ->encode_fh() switched to saner API; insane fake dentry in mm/cleancache.c gone. * assorted annotations in fs (endianness, __user) * parts of Artem's ->s_dirty work (jff2 and reiserfs parts) * ->update_time() work from Josef. * other bits and pieces all over the place. Normally it would've been in two or three pull requests, but signal.git stuff had eaten a lot of time during this cycle ;-/" Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the 'truncate_range' inode method was removed by the VM changes, the VFS update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due to sparse fix added twice, with other changes nearby). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits) nfs: don't open in ->d_revalidate vfs: retry last component if opening stale dentry vfs: nameidata_to_filp(): don't throw away file on error vfs: nameidata_to_filp(): inline __dentry_open() vfs: do_dentry_open(): don't put filp vfs: split __dentry_open() vfs: do_last() common post lookup vfs: do_last(): add audit_inode before open vfs: do_last(): only return EISDIR for O_CREAT vfs: do_last(): check LOOKUP_DIRECTORY vfs: do_last(): make ENOENT exit RCU safe vfs: make follow_link check RCU safe vfs: do_last(): use inode variable vfs: do_last(): inline walk_component() vfs: do_last(): make exit RCU safe vfs: split do_lookup() Btrfs: move over to use ->update_time fs: introduce inode operation ->update_time reiserfs: get rid of resierfs_sync_super reiserfs: mark the superblock as dirty a bit later ...
| * | | | | | nfs: don't open in ->d_revalidateMiklos Szeredi2012-06-012-55/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a mount needs to be followed or not. It does check d_mountpoint() on the dentry, which can result in a weird error if the VFS found that the mount does not in fact need to be followed, e.g.: # mount --bind /mnt/nfs /mnt/nfs-clone # echo something > /mnt/nfs/tmp/bar # echo x > /tmp/file # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar # cat /mnt/nfs/tmp/bar cat: /mnt/nfs/tmp/bar: Not a directory Which should, by any sane filesystem, result in "something" being printed. So instead do the open in f_op->open() and in the unlikely case that the cached dentry turned out to be invalid, drop the dentry and return EOPENSTALE to let the VFS retry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: retry last component if opening stale dentryMiklos Szeredi2012-06-011-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFS optimizes away d_revalidates for last component of open. This means that open itself can find the dentry stale. This patch allows the filesystem to return EOPENSTALE and the VFS will retry the lookup on just the last component if possible. If the lookup was done using RCU mode, including the last component, then this is not possible since the parent dentry is lost. In this case fall back to non-RCU lookup. Currently this is not used since NFS will always leave RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: nameidata_to_filp(): don't throw away file on errorMiklos Szeredi2012-06-011-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If open fails, don't put the file. This allows it to be reused if open needs to be retried. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: nameidata_to_filp(): inline __dentry_open()Miklos Szeredi2012-06-011-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Copy __dentry_open() into nameidata_to_filp(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_dentry_open(): don't put filpMiklos Szeredi2012-06-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move put_filp() out to __dentry_open(), the only caller now. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: split __dentry_open()Miklos Szeredi2012-06-012-14/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split __dentry_open() into two functions: do_dentry_open() - does most of the actual work, doesn't put file on failure open_check_o_direct() - after a successful open, checks direct_IO method This will allow i_op->atomic_open to do just the file initialization and leave the direct_IO checking to the VFS. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_last() common post lookupMiklos Szeredi2012-06-011-31/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the post lookup code can be shared between O_CREAT and plain opens since they are essentially the same. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_last(): add audit_inode before openMiklos Szeredi2012-06-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_last(): only return EISDIR for O_CREATMiklos Szeredi2012-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_last(): check LOOKUP_DIRECTORYMiklos Szeredi2012-06-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check for ENOTDIR before finishing open. This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | | | vfs: do_last(): make ENOENT exit RCU safeMiklos Szeredi2012-06-011-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>