summaryrefslogtreecommitdiffstats
path: root/fs/ubifs
Commit message (Collapse)AuthorAgeFilesLines
* fs: icache RCU free inodesNick Piggin2011-01-071-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
* convert ubifsAl Viro2010-10-291-7/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* new helper: ihold()Al Viro2010-10-251-1/+1
| | | | | | Clones an existing reference to inode; caller must already hold one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6Linus Torvalds2010-10-2221-112/+362
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: do not allocate unneeded scan buffer UBIFS: do not forget to cancel timers UBIFS: remove a bit of unneeded code UBIFS: add a commentary about log recovery UBIFS: avoid kernel error if ubifs superblock read fails UBIFS: introduce new flags for RO mounts UBIFS: introduce new flag for RO due to errors UBIFS: check return code of pnode_lookup UBIFS: check return code of ubifs_lpt_lookup UBIFS: improve error reporting when reading bad node UBIFS: introduce list sorting debugging checks UBIFS: fix assertion warnings in comparison function UBIFS: mark unused key objects as invalid UBIFS: do not write rubbish into truncation scanning node UBIFS: improve assertion in node comparison functions UBIFS: do not use key type in list_sort UBIFS: do not look up truncation nodes UBIFS: fix assertion warning UBIFS: do not treat ENOSPC specially UBIFS: switch to RO mode after synchronizing
| * UBIFS: do not allocate unneeded scan bufferArtem Bityutskiy2010-10-211-7/+1
| | | | | | | | | | | | | | | | | | In 'ubifs_replay_journal()' we allocate 'sbuf' for scanning the log. However, we already have 'c->sbuf' for these purposes, so do not allocate yet another one. This reduces UBIFS memory consumption while recovering. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: do not forget to cancel timersArtem Bityutskiy2010-10-211-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a bug-fix: when we unmount, and we are currently in R/O mode because of an error - we do not sync write-buffers, which means we also do not cancel write-buffer timers we may possibly have armed. This patch fixes the issue. The issue can easily be reproduced by enabling UBIFS failure debug mode (echo 4 > /sys/module/ubifs/parameters/debug_tsts) and unmounting as soon as a failure happen. At some point the system oopses because we have an armed hrtimer but UBIFS is unmounted already. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: remove a bit of unneeded codeArtem Bityutskiy2010-10-211-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a clean-up patch which: 1. Removes explicite 'hrtimer_cancel()' after 'ubifs_wbuf_sync()' in 'ubifs_remount_ro()', because the timers will be canceled by 'ubifs_wbuf_sync()', no need to cancel them for the second time. 2. Remove "if (c->jheads)" check from 'ubifs_put_super()', because at journal heads must always be allocated there, since we checked earlier that we were mounted R/W, and the olny situation when journal heads are not allocated is when mounter or re-mounted R/O. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: add a commentary about log recoveryArtem Bityutskiy2010-10-172-5/+7
| | | | | | | | | | | | | | Add a commentary which elaborates that 'ubifs_recover_log_leb()' recovers only the last log LEB, not any. Also remove some unneeded newlines. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: avoid kernel error if ubifs superblock read failsSteffen Sledz2010-09-281-2/+2
| | | | | | | | | | | | | | | | | | .get_sb is called on mounts with automatic fs detection too, so this function should print an error if it cannot read the superblock in debug mode only (new behaviour conforms the other fs types) Signed-off-by: Steffen Sledz <sledz@dresearch.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: introduce new flags for RO mountsArtem Bityutskiy2010-09-1914-60/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2fde99cb55fb9d9b88180512a5e8a5d939d27fec "UBIFS: mark VFS SB RO too" introduced regression. This commit made UBIFS set the 'MS_RDONLY' flag in the VFS superblock when it switches to R/O mode due to an error. This was done to make VFS show the R/O UBIFS flag in /proc/mounts. However, several places in UBIFS relied on the 'MS_RDONLY' flag and assume this flag can only change when we re-mount. For example, 'ubifs_put_super()'. This patch introduces new UBIFS flag - 'c->ro_mount' which changes only when we re-mount, and preserves the way UBIFS was originally mounted (R/W or R/O). This allows us to de-initialize UBIFS cleanly in 'ubifs_put_super()'. This patch also changes all 'ubifs_assert(!c->ro_media)' assertions to 'ubifs_assert(!c->ro_media && !c->ro_mount)', because we never should write anything if the FS was mounter R/O. All the places where we test for 'MS_RDONLY' flag in the VFS SB were changed and now we test the 'c->ro_mount' flag instead, because it preserves the original UBIFS mount type, unlike the 'MS_RDONLY' flag. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: introduce new flag for RO due to errorsArtem Bityutskiy2010-09-1711-24/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The R/O state may have various reasons: 1. The UBI volume is R/O 2. The FS is mounted R/O 3. The FS switched to R/O mode because of an error However, in UBIFS we have only one variable which represents cases 1 and 3 - 'c->ro_media'. Indeed, we set this to 1 if we switch to R/O mode due to an error, and then we test it in many places to make sure that we stop writing as soon as the error happens. But this is very unclean. One consequence of this, for example, is that in 'ubifs_remount_fs()' we use 'c->ro_media' to check whether we are in R/O mode because on an error, and we print a message in this case. However, if we are in R/O mode because the media is R/O, our message is bogus. This patch introduces new flag - 'c->ro_error' which is set when we switch to R/O mode because of an error. It also changes all "if (c->ro_media)" checks to "if (c->ro_error)" checks, because this is what the checks actually mean. We do not need to check for 'c->ro_media' because if the UBI volume is in R/O mode, we do not allow R/W mounting, and now writes can happen. This is guaranteed by VFS. But it is good to double-check this, so this patch also adds many "ubifs_assert(!c->ro_media)" checks. In the 'ubifs_remount_fs()' function this patch makes a bit more changes - it fixes the error messages as well. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: check return code of pnode_lookupVasiliy Kulikov2010-09-071-0/+3
| | | | | | | | | | | | | | Function pnode_lookup may return ERR_PTR(...). Check for it. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: check return code of ubifs_lpt_lookupVasiliy Kulikov2010-09-071-1/+6
| | | | | | | | | | | | | | | | | | Function ubifs_lpt_lookup may return ERR_PTR(...). Check for it. [Tweaked by Artem Bityutskiy] Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: improve error reporting when reading bad nodeArtem Bityutskiy2010-08-301-1/+2
| | | | | | | | | | | | | | | | When an error happens during validation of read node, the typical situation is that the LEB we read is unmapped (due to some bug). It is handy to include the mapping status into the error message. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: introduce list sorting debugging checksArtem Bityutskiy2010-08-303-2/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The UBIFS bug in the GC list sorting comparison functions inspired me to write internal debugging check functions which verify that the list of nodes is sorted properly. So, this patch implements 2 new debugging functions: o 'dbg_check_data_nodes_order()' - check order of data nodes list o 'dbg_check_nondata_nodes_order()' - check order of non-data nodes list The debugging functions are executed only if general UBIFS debugging checks are enabled. And they are compiled out if UBIFS debugging is disabled. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: fix assertion warnings in comparison functionArtem Bityutskiy2010-08-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running the integrity test ('integck' from mtd-utils) on current UBIFS on 2.6.35, I see that assertions in UBIFS 'list_sort()' comparison functions trigger sometimes, e.g.: UBIFS assert failed in data_nodes_cmp at 132 (pid 28311) My investigation showed that this happens when 'list_sort()' calls the 'cmp()' function with equivalent arguments. In this case, the 'struct list_head' parameter, passed to 'cmp()' is bogus, and it does not belong to any element in the original list. And this issue seems to be introduced by commit: commit 835cc0c8477fdbc59e0217891d6f11061b1ac4e2 Author: Don Mullis <don.mullis@gmail.com> Date: Fri Mar 5 13:43:15 2010 -0800 It is easy to work around the issue by doing: if (a == b) return 0; in UBIFS. It works, but 'lib_sort()' should nevertheless be fixed. Although it is harmless to have this piece of code in UBIFS. This patch adds that code to both UBIFS 'cmp()' functions: 'data_nodes_cmp()' and 'nondata_nodes_cmp()'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: mark unused key objects as invalidArtem Bityutskiy2010-08-304-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When scanning the flash, UBIFS builds a list of flash nodes of type 'struct ubifs_scan_node'. Each scanned node has a 'snod->key' field. This field is valid for most of the nodes, but invalid for some node type, e.g., truncation nodes. It is safer to explicitly initialize such keys to something invalid, rather than leaving them initialized to all zeros, which has key type of UBIFS_INO_KEY. This patch introduces new "fake" key type UBIFS_INVALID_KEY and initializes unused 'snod->key' objects to this type. It also adds debugging assertions in the TNC code to make sure no one ever tries to look these nodes up in the TNC. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: do not write rubbish into truncation scanning nodeArtem Bityutskiy2010-08-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | In the scanning code, in 'ubifs_add_snod()', we write rubbish into 'snod->key', because we assume that on-flash truncation nodes have a key, but they do not. If the other parts of UBIFS then mistakenly try to look-up the truncation node key (they should not do this, but may do because of a bug), we can succeed and corrupt TNC. It looks like we did have such a situation in 'sort_nodes()' in gc.c. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: improve assertion in node comparison functionsArtem Bityutskiy2010-08-301-0/+11
| | | | | | | | | | | | | | Improve assertions in gc.c in the comparison functions for 'list_sort()': check key types _and_ node types. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: do not use key type in list_sortArtem Bityutskiy2010-08-301-9/+9
| | | | | | | | | | | | | | | | | | In comparison function for 'list_sort()' we use key type to distinguish between node types. However, we have a bit simper way to detect node type - 'snod->type'. This more logical to use, comparing to decoding key types. Also allows to get rid of 2 local variables. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: do not look up truncation nodesArtem Bityutskiy2010-08-301-3/+20
| | | | | | | | | | | | | | | | | | | | When moving nodes in GC, do not try to look up truncation nodes in TNC, because they do not exist there. This would be harmless, because the TNC look-up would fail, if we did not have bug 'ubifs_add_snod()' which reads garbage into 'snod->key'. But in any case, it is less error prone to explicitly ignore everything but inode, data, dentry and xentry nodes. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: fix assertion warningArtem Bityutskiy2010-08-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | This patch fixes the following false assertion warning: UBIFS assert failed in data_nodes_cmp at 130 (pid 15107) The assertion was wrong because it did not take into account that the node can be an xentry. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: do not treat ENOSPC speciallyArtem Bityutskiy2010-08-301-7/+5
| | | | | | | | | | | | | | | | 'ubifs_garbage_collect_leb()' should never return '-ENOSPC', and if it does, this is an error. Thus, do not treat this error code specially. '-EAGAIN' is a special error code, but not '-ENOSPC'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: switch to RO mode after synchronizingArtem Bityutskiy2010-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | In 'ubifs_garbage_collect()' on error path, we first switch to R/O mode, and then synchronize write-buffers (to make sure no data are lost). But the GC write-buffer synchronization will fail, because we are already in R/O mode. This patch re-orders this and makes sure we first synchronize the write-buffer, and then switch to R/O mode. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* | llseek: automatically add .llseek fopArnd Bergmann2010-10-151-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-08-103-20/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c
| * switch ubifs to ->evict_inode()Al Viro2010-08-091-4/+8
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * check ATTR_SIZE contraints in inode_change_okChristoph Hellwig2010-08-092-16/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure we check the truncate constraints early on in ->setattr by adding those checks to inode_change_ok. Also clean up and document inode_change_ok to make this obvious. As a fallout we don't have to call inode_newsize_ok from simple_setsize and simplify it down to a truncate_setsize which doesn't return an error. This simplifies a lot of setattr implementations and means we use truncate_setsize almost everywhere. Get rid of fat_setsize now that it's trivial and mark ext2_setsize static to make the calling convention obvious. Keep the inode_newsize_ok in vmtruncate for now as all callers need an audit for its removal anyway. Note: setattr code in ecryptfs doesn't call inode_change_ok at all and needs a deeper audit, but that is left for later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6Linus Torvalds2010-08-034-14/+29
|\ \ | |/ |/| | | | | | | | | | | * 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: fix a memory leak on error path. UBIFS: fix GC LEB recovery UBIFS: use ERR_CAST UBIFS: check return code
| * UBIFS: fix a memory leak on error path.Matthieu CASTET2010-08-031-1/+1
| | | | | | | | | | | | | | | | In 'mount_ubifs()', in case of 'ubifs_leb_unmap()' falure, free allocated resources. Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: fix GC LEB recoveryArtem Bityutskiy2010-07-131-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UBIFS tries to alway have an LEB reserved for GC, and stores it in c->gc_lnum. Besides, there is GC head which points to the current GC head LEB. In case of an unclean power cut, what may happen is that the GC head was switched to the reserved GC LEB (c->gc_lnum), but a new reserved GC LEB was not created yet. So, after an unclean reboot we may have no reserved GC LEB, and we need to find a new LEB for this. To do this, we find a dirty LEB which can fit the current GC head, move the data, unmap this dirty LEB, and it becomes our reserved GC LEB. However, if we cannot find a dirty enough LEB, we return failure, which is wrong, because we still can have free LEBs to use for the reserved GC LEB. This patch fixes the issue. This patch also fixes few typos in comments, which were spotted by aspell. Note, this patch fixes a real issue [ 14.328117] UBIFS: recovery needed [ 53.941378] UBIFS error (pid 462): ubifs_rcvry_gc_commit: could not find a dirty LEB [ 89.606399] UBIFS: recovery completed [ 89.609329] UBIFS assert failed in mount_ubifs at 1358 (pid 462) [ 89.616165] [<c0026144>] (unwind_backtrace+0x0/0xe4) from [<c0125ce4>] (ubifs_fill_super+0x11d0/0x1c4c) [ 89.625930] [<c0125ce4>] (ubifs_fill_super+0x11d0/0x1c4c) from [<c0126910>] (ubifs_get_sb+0x1b0/0x354) [ 89.635696] [<c0126910>] (ubifs_get_sb+0x1b0/0x354) from [<c008a50c>] (vfs_kern_mount+0x50/0xe0) [ 89.644485] [<c008a50c>] (vfs_kern_mount+0x50/0xe0) from [<c008a5e0>] (do_kern_mount+0x34/0xdc) [ 89.653274] [<c008a5e0>] (do_kern_mount+0x34/0xdc) from [<c00a29d8>] (do_mount+0x148/0x7cc) [ 89.662063] [<c00a29d8>] (do_mount+0x148/0x7cc) from [<c00a30f4>] (sys_mount+0x98/0xc8) [ 89.670852] [<c00a30f4>] (sys_mount+0x98/0xc8) from [<c0021f40>] (ret_fast_syscall+0x0/0x28) which was reported here: http://article.gmane.org/gmane.linux.drivers.mtd/29923 by Alexander Pazdnikov <pazdnikov@list.ru> Reported-by: Alexander Pazdnikov <pazdnikov@list.ru> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Reviewed-by: Adrian Hunter <adrian.hunter@nokia.com>
| * UBIFS: use ERR_CASTJulia Lawall2010-06-122-8/+8
| | | | | | | | | | | | | | | | | | Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more clear what is the purpose of the operation, which otherwise looks like a no-op. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: check return codeArtem Bityutskiy2010-06-121-0/+2
| | | | | | | | | | | | | | | | The error code from 'ubifs_rcvry_gc_commit()' was ignored, so UBIFS failed to recover and continued. Instead, we should refuse mounting the file-system. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* | mm: add context argument to shrinker callbackDave Chinner2010-07-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* | writeback: enforce s_umount locking in writeback_inodes_sbChristoph Hellwig2010-06-111-0/+2
|/ | | | | | | | | | Make sure that not only sync_filesystem but all callers of writeback_inodes_sb have the superblock protected against remount. As-is this disables all functionality for these callers, but the next patch relies on this locking to fix writeback_inodes_sb for sync_filesystem. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* kill spurious reference to vmtruncatenpiggin@suse.de2010-05-272-6/+9
| | | | | | | | | | Lots of filesystems calls vmtruncate despite not implementing the old ->truncate method. Switch them to use simple_setsize and add some comments about the truncate code where it seems fitting. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* drop unused dentry argument to ->fsyncChristoph Hellwig2010-05-272-3/+3
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ubifs: replace inode uid,gid,mode initialization with helper functionDmitry Monakhov2010-05-211-8/+1
| | | | | | Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* UBIFS: mark VFS SB RO tooZhangJieJing2010-04-291-0/+1
| | | | | | | | If some read/write error happens (eg.CRC error), UBIFS swotches to read-only mode, but the VFS infomation still not update. This patch add this also make /proc/mounts update. Signed-off-by: Zhang Jiejing <kzjeef@gmail.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-3012-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Revert "lib: build list_sort() only if needed"Linus Torvalds2010-03-071-1/+0
| | | | | | | | | | | | | | | | | | | | This reverts commit a069c266ae5fdfbf5b4aecf2c672413aa33b2504. It turns ou that not only was it missing a case (XFS) that needed it, but perhaps more importantly, people sometimes want to enable new modules that they hadn't had enabled before, and if such a module uses list_sort(), it can't easily be inserted any more. So rather than add a "select LIST_SORT" to the XFS case, just leave it compiled in. It's not all _that_ big, after all, and the inconvenience isn't worth it. Requested-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Don Mullis <don.mullis@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: build list_sort() only if neededDon Mullis2010-03-061-0/+1
| | | | | | | | | | | | | Build list_sort() only for configs that need it -- those that don't save ~581 bytes (i386). Signed-off-by: Don Mullis <don.mullis@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pass writeback_control to ->write_inodeChristoph Hellwig2010-03-053-6/+6
| | | | | | | | | | This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* lib: Introduce generic list_sort functionDave Chinner2010-01-121-95/+1
| | | | | | | | | | | | There are two copies of list_sort() in the tree already, one in the DRM code, another in ubifs. Now XFS needs this as well. Create a generic list_sort() function from the ubifs version and convert existing users to it so we don't end up with yet another copy in the tree. Signed-off-by: Dave Chinner <david@fromorbit.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kill I_LOCKChristoph Hellwig2009-12-171-1/+1
| | | | | | | | After I_SYNC was split from I_LOCK the leftover is always used together with I_NEW and thus superflous. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs/ubifs: use %pUB to print UUIDsJoe Perches2009-12-152-13/+3
| | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Adrian Hunter <adrian.hunter@nokia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.infradead.org/ubifs-2.6Linus Torvalds2009-12-103-18/+17
|\ | | | | | | | | | | | | | | | | | | * git://git.infradead.org/ubifs-2.6: UBIFS: fix return code in check_leaf UBI: flush wl before clearing update marker MAINTAINERS: change e-mail of Artem Bityutskiy UBIFS: remove manual O_SYNC handling UBIFS: support mounting of UBI volume character devices UBI: Add ubi_open_volume_path
| * UBIFS: fix return code in check_leafRoel Kluin2009-12-081-1/+1
| | | | | | | | | | | | | | Return the PTR_ERR of the correct pointer. This fixes the debugging code. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: remove manual O_SYNC handlingChristoph Hellwig2009-11-241-12/+1
| | | | | | | | | | | | | | | | | | generic_file_aio_write already calls into ->fsync to handle O_SYNC/O_DSYNC. Remove the duplicate call to ubifs_sync_wbufs_by_inode which is already covered by ubifs_fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
| * UBIFS: support mounting of UBI volume character devicesCorentin Chary2009-11-241-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes it possible to mount UBI character device nodes, and use something like: $ mount -t ubifs /dev/ubi_volume_name /mnt/ubifs instead of the old restrictive 'nodev' semantics: $ mount -t ubifs ubi0_0 /mnt/ubifs [Comments and the patch were amended a bit by Artem] Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>