summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* sched/rt: Move rt specific bits into new header fileClark Williams2013-02-071-0/+1
| | | | | | | | | | | Move rt scheduler definitions out of include/linux/sched.h into new file include/linux/sched/rt.h Signed-off-by: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge tag 'full-dynticks-cputime-for-mingo' of ↵Ingo Molnar2013-02-053-6/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core Pull full-dynticks (user-space execution is undisturbed and receives no timer IRQs) preparation changes that convert the cputime accounting code to be full-dynticks ready, from Frederic Weisbecker: "This implements the cputime accounting on full dynticks CPUs. Typical cputime stats infrastructure relies on the timer tick and its periodic polling on the CPU to account the amount of time spent by the CPUs and the tasks per high level domains such as userspace, kernelspace, guest, ... Now we are preparing to implement full dynticks capability on Linux for Real Time and HPC users who want full CPU isolation. This feature requires a cputime accounting that doesn't depend on the timer tick. To implement it, this new cputime infrastructure plugs into kernel/user/guest boundaries to take snapshots of cputime and flush these to the stats when needed. This performs pretty much like CONFIG_VIRT_CPU_ACCOUNTING except that context location and cputime snaphots are synchronized between write and read side such that the latter can safely retrieve the pending tickless cputime of a task and add it to its latest cputime snapshot to return the correct result to the user." Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * cputime: Use accessors to read task cputime statsFrederic Weisbecker2013-01-273-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is in preparation for the full dynticks feature. While remotely reading the cputime of a task running in a full dynticks CPU, we'll need to do some extra-computation. This way we can account the time it spent tickless in userspace since its last cputime snapshot. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'for-linus' of ↵Linus Torvalds2013-01-225-34/+38
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "This contain a bugfix for CUSE and miscellaneous small fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: remove unused variable in fuse_try_move_page() fuse: make fuse_file_fallocate() static fuse: Move CUSE Kconfig entry from fs/Kconfig into fs/fuse/Kconfig cuse: fix uninitialized variable warnings cuse: do not register multiple devices with identical names cuse: use mutex as registration lock instead of spinlocks
| * | fuse: remove unused variable in fuse_try_move_page()Wei Yongjun2013-01-171-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variables mapping,index are initialized but never used otherwise, so remove the unused variables. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: make fuse_file_fallocate() staticMiklos Szeredi2013-01-171-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | Fix the following sparse warning: fs/fuse/file.c:2249:6: warning: symbol 'fuse_file_fallocate' was not declared. Should it be static? Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | fuse: Move CUSE Kconfig entry from fs/Kconfig into fs/fuse/KconfigRobert P. J. Day2013-01-172-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Given that CUSE depends on FUSE, it only makes sense to move its Kconfig entry into the FUSE Kconfig file. Also, add a few grammatical and semantic touchups. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | cuse: fix uninitialized variable warningsMiklos Szeredi2013-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following compiler warnings: fs/fuse/cuse.c: In function 'cuse_process_init_reply': fs/fuse/cuse.c:288:24: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized] fs/fuse/cuse.c:272:14: note: 'val' was declared here fs/fuse/cuse.c:284:10: warning: 'key' may be used uninitialized in this function [-Wmaybe-uninitialized] fs/fuse/cuse.c:272:8: note: 'key' was declared here Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | cuse: do not register multiple devices with identical namesDavid Herrmann2013-01-171-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sysfs doesn't allow two devices with the same name, but we register a sysfs entry for each cuse device without checking for name collisions. This extends the registration to first check whether the name was already registered. To avoid race-conditions between the name-check and linking the device, we need to protect the whole registration with a mutex. Signed-off-by: David Herrmann <dh.herrmann@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | cuse: use mutex as registration lock instead of spinlocksDavid Herrmann2013-01-171-8/+7
| |/ | | | | | | | | | | | | | | | | | | | | | | We need to check for name-collisions during cuse-device registration. To avoid race-conditions, this needs to be protected during the whole device registration. Therefore, replace the spinlocks by mutexes first so we can safely extend the locked regions to include more expensive or sleeping code paths. Signed-off-by: David Herrmann <dh.herrmann@googlemail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* | Merge tag 'f2fs-for-3.8-rc5' of ↵Linus Torvalds2013-01-2214-131/+189
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim: o Support swap file and link generic_file_remap_pages o Enhance the bio streaming flow and free section control o Major bug fix on recovery routine o Minor bug/warning fixes and code cleanups * tag 'f2fs-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (22 commits) f2fs: use _safe() version of list_for_each f2fs: add comments of start_bidx_of_node f2fs: avoid issuing small bios due to several dirty node pages f2fs: support swapfile f2fs: add remap_pages as generic_file_remap_pages f2fs: add __init to functions in init_f2fs_fs f2fs: fix the debugfs entry creation path f2fs: add global mutex_lock to protect f2fs_stat_list f2fs: remove the blk_plug usage in f2fs_write_data_pages f2fs: avoid redundant time update for parent directory in f2fs_delete_entry f2fs: remove redundant call to set_blocksize in f2fs_fill_super f2fs: move f2fs_balance_fs to punch_hole f2fs: add f2fs_balance_fs in several interfaces f2fs: revisit the f2fs_gc flow f2fs: check return value during recovery f2fs: avoid null dereference in f2fs_acl_from_disk f2fs: initialize newly allocated dnode structure f2fs: update f2fs partition info about SIT/NAT layout f2fs: update f2fs document to reflect SIT/NAT layout correctly f2fs: remove unneeded INIT_LIST_HEAD at few places ...
| * f2fs: use _safe() version of list_for_eachDan Carpenter2013-01-221-4/+3
| | | | | | | | | | | | | | | | | | | | | | This is calling list_del() inside a loop which is a problem when we try move to the next item on the list. I've converted it to use the _safe version. And also, as a cleanup, I've converted it to use list_for_each_entry instead of list_for_each. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add comments of start_bidx_of_nodeJaegeuk Kim2013-01-221-1/+5
| | | | | | | | | | | | | | | | | | The caller of start_bidx_of_node() should give proper node offsets which point only direct node blocks. Otherwise, it is a caller's bug. This patch adds comments to make it clear. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid issuing small bios due to several dirty node pagesJaegeuk Kim2013-01-221-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If some small bios of dirty node pages are supposed to be issued during the sequential data writes, there-in well-produced consecutive data bios are able to be split by the small node bios, resulting in performance degradation. So, let's collect a number of dirty node pages until reaching a threshold. And, by default, I set the threshold as 2MB, a segment size. This improves sequential write performance on i5, 512GB SSD (830 w/ SATA2) as follows. Before: 231 MB/s -> After: 255 MB/s Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
| * f2fs: support swapfileJaegeuk Kim2013-01-221-0/+6
| | | | | | | | | | | | | | This patch adds f2fs_bmap operation to the data address space. This enables f2fs to support swapfile. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add remap_pages as generic_file_remap_pagesJaegeuk Kim2013-01-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was added for all the file systems before. See the following commit. commit id: 0b173bc4daa8f8ec03a85abf5e47b23502ff80af [PATCH] mm: kill vma flag VM_CAN_NONLINEAR This patch moves actual ptes filling for non-linear file mappings into special vma operation: ->remap_pages(). File system must implement this method to get non-linear mappings support, if it uses filemap_fault() then generic_file_remap_pages() can be used. Now device drivers can implement this method and obtain nonlinear vma support." Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add __init to functions in init_f2fs_fsNamjae Jeon2013-01-225-9/+9
| | | | | | | | | | | | | | | | Add __init to functions in init_f2fs_fs for code consistency. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: fix the debugfs entry creation pathNamjae Jeon2013-01-153-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | As the "status" debugfs entry will be maintained for entire F2FS filesystem irrespective of the number of partitions. So, we can move the initialization to the init part of the f2fs and destroy will be done from exit part. After making changes, for individual partition mount - entry creation code will not be executed. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add global mutex_lock to protect f2fs_stat_listmajianpeng2013-01-151-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an race condition between umounting f2fs and reading f2fs/status, which results in oops. Fox example: Thread A Thread B umount f2fs cat f2fs/status f2fs_destroy_stats() { stat_show() { list_for_each_entry_safe(&f2fs_stat_list) list_del(&si->stat_list); mutex_lock(&si->stat_lock); si->sbi = NULL; mutex_unlock(&si->stat_lock); kfree(sbi->stat_info); } mutex_lock(&si->stat_lock) <- si is gone. ... } Solution with a global lock: f2fs_stat_mutex: Thread A Thread B umount f2fs cat f2fs/status f2fs_destroy_stats() { stat_show() { mutex_lock(&f2fs_stat_mutex); list_del(&si->stat_list); mutex_unlock(&f2fs_stat_mutex); kfree(sbi->stat_info); mutex_lock(&f2fs_stat_mutex); } list_for_each_entry_safe(&f2fs_stat_list) ... mutex_unlock(&f2fs_stat_mutex); } Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> [jaegeuk.kim@samsung.com: fix typos, description, and remove the existing lock] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove the blk_plug usage in f2fs_write_data_pagesNamjae Jeon2013-01-151-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's consider the usage of blk_plug in f2fs_write_data_pages(). We can come up with the two issues: lock contention and task awareness. 1. Merging bios prior to grabing "queue lock" The f2fs merges consecutive IOs in the file system level before submitting any bios, which is similar with the back merge by the plugging mechanism in attempt_plug_merge(). Both of them need to acquire no queue lock. 2. Merging policy with respect to tasks The f2fs merges IOs as much as possible regardless of tasks, while blk-plugging is conducted on a basis of tasks. As we can understand there are trade-offs, f2fs tries to maximize the write performance with well-merged bios. As a result, if f2fs produces many consecutive but separated bios in writepages(), it would be good to use blk-plugging since f2fs would be able to avoid queue lock contention in the block layer by merging them. But, f2fs merges IOs and submit one bio, which means that there are not much chances to merge bios by attempt_plug_merge(). However, f2fs has already been used blk_plug by triggering generic_writepages() in f2fs_write_data_pages(). So to make the overall code consistency, I'd like to remove blk_plug there. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid redundant time update for parent directory in f2fs_delete_entryNamjae Jeon2013-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | In call to f2fs_delete_entry, 'dir' time modification code is put at two places. So, remove the redundant code for timing update. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove redundant call to set_blocksize in f2fs_fill_superNamjae Jeon2013-01-141-5/+1
| | | | | | | | | | | | | | | | | | | | Since, f2fs supports only 4KB blocksize, which is set at the beginning in f2fs_fill_super. So, we do not need to again check this blocksize setting in such case. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: move f2fs_balance_fs to punch_holeJaegeuk Kim2013-01-111-3/+2
| | | | | | | | | | | | | | | | | | | | | | The f2fs_fallocate() has two operations: punch_hole and expand_size. Only in the case of punch_hole, dirty node pages can be produced, so let's trigger f2fs_balance_fs() in this case only. Furthermore, let's trigger it at every data truncation routine. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add f2fs_balance_fs in several interfacesJaegeuk Kim2013-01-114-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The f2fs_balance_fs() is to check the number of free sections and decide whether it needs to conduct cleaning or not. If there are not enough free sections, the cleaning job should be started. In order to control an amount of free sections even under high utilization, f2fs should call f2fs_balance_fs at all the VFS interfaces that are able to produce dirty pages. This patch adds the function calls in the missing interfaces as follows. 1. f2fs_setxattr() The f2fs_setxattr() produces dirty node pages so that we should call f2fs_balance_fs() either likewise doing in other VFS interfaces such as f2fs_lookup(), f2fs_mkdir(), and so on. 2. f2fs_sync_file() We should guarantee serving free sections for syncing metadata during fsync. Previously, there is no space check before triggering checkpoint and sync_node_pages. Therefore, if a bunch of fsync calls are triggered under 100% of FS utilization, f2fs is able to be faced with no free sections, resulting in BUG_ON(). 3. f2fs_sync_fs() Before calling write_checkpoint(), we should guarantee that there are minimum free sections. 4. f2fs_write_inode() f2fs_write_inode() is also able to produce dirty node pages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: revisit the f2fs_gc flowJaegeuk Kim2013-01-103-41/+23
| | | | | | | | | | | | | | | | | | | | | | | | I'd like to revisit the f2fs_gc flow and rewrite as follows. 1. In practical, the nGC parameter of f2fs_gc is meaningless. So, let's remove it. 2. Background GC marks victim blocks as dirty one at a time. 3. Foreground GC should do cleaning job until acquiring enough free sections. Afterwards, it needs to do checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: check return value during recoveryJaegeuk Kim2013-01-041-1/+1
| | | | | | | | | | | | | | | | This patch resolves Coverity #753102: >>> No check of the return value of "f2fs_add_link(&dent, inode)". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid null dereference in f2fs_acl_from_diskJaegeuk Kim2013-01-041-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves Coverity #751303: >>> CID 753103: Explicit null dereferenced (FORWARD_NULL) Passing null >>> pointer "value" to function "f2fs_acl_from_disk(char const *, size_t)", which dereferences it. [Error path] - value = NULL; - retval = 0 by f2fs_getxattr(); - f2fs_acl_from_disk(value:NULL, ...); Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: initialize newly allocated dnode structureJaegeuk Kim2013-01-041-1/+1
| | | | | | | | | | | | | | | | | | This patch resolves Coverity #753112. In practical, the existing code flow does not fall into the reported errorneous path. But, anyway, let's avoid this for future. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: update f2fs partition info about SIT/NAT layoutHuajun Li2013-01-041-2/+2
| | | | | | | | | | | | | | Update partition info output under debug FS to reflect segment layout correctly. Signed-off-by: Huajun Li <huajun.li.lee@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove unneeded INIT_LIST_HEAD at few placesNamjae Jeon2013-01-042-2/+0
| | | | | | | | | | | | | | | | | | | | While creating a new entry for addition to the list(orphan inode list and fsync inode entry list), there is no need to call HEAD initialization for these entries. So, remove that init part. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: fix time update in case of f2fs fallocateNamjae Jeon2013-01-041-0/+5
| | | | | | | | | | | | | | | | | | | | After doing a punch hole or expanding inode doing fallocation. The change and modification time are not update for the file. So, update time after no issue is observed in fallocate. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: introduce f2fs_msg to ease adding information printsNamjae Jeon2013-01-042-17/+65
| | | | | | | | | | | | | | | | | | | | Introduced f2fs_msg function to differentiate f2fs specific messages in the log. And, added few informative prints in the mount path, to convey proper error in case of mount failure. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* | Merge tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfsLinus Torvalds2013-01-167-42/+64
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs bugfixes from Ben Myers: - fix(es) for compound buffers - fix for dquot soft timer asserts due to overflow of d_blk_softlimit - fix for regression in dir v2 code introduced in commit 20f7e9f3726a ("xfs: factor dir2 block read operations") * tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs: xfs: recalculate leaf entry pointer after compacting a dir2 block xfs: remove int casts from debug dquot soft limit timer asserts xfs: fix the multi-segment log buffer format xfs: fix segment in xfs_buf_item_format_segment xfs: rename bli_format to avoid confusion with bli_formats xfs: use b_maps[] for discontiguous buffers
| * | xfs: recalculate leaf entry pointer after compacting a dir2 blockEric Sandeen2013-01-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dave Jones hit this assert when doing a compile on recent git, with CONFIG_XFS_DEBUG enabled: XFS: Assertion failed: (char *)dup - (char *)hdr == be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup)), file: fs/xfs/xfs_dir2_data.c, line: 828 Upon further digging, the tag found by xfs_dir2_data_unused_tag_p(dup) contained "2" and not the proper offset, and I found that this value was changed after the memmoves under "Use a stale leaf for our new entry." in xfs_dir2_block_addname(), i.e. memmove(&blp[mid + 1], &blp[mid], (highstale - mid) * sizeof(*blp)); overwrote it. What has happened is that the previous call to xfs_dir2_block_compact() has rearranged things; it changes btp->count as well as the blp array. So after we make that call, we must recalculate the proper pointer to the leaf entries by making another call to xfs_dir2_block_leaf_p(). Dave provided a metadump image which led to a simple reproducer (create a particular filename in the affected directory) and this resolves the testcase as well as the bug on his live system. Thanks also to dchinner for looking at this one with me. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: remove int casts from debug dquot soft limit timer assertsBrian Foster2013-01-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The int casts here make it easy to trigger an assert with a large soft limit. For example, set a >4TB soft limit on an empty volume to reproduce a (0 > -x) comparison due to an overflow of d_blk_softlimit. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: fix the multi-segment log buffer formatMark Tinguely2013-01-162-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per Dave Chinner suggestion, this patch: 1) Corrects the detection of whether a multi-segment buffer is still tracking data. 2) Clears all the buffer log formats for a multi-segment buffer. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: fix segment in xfs_buf_item_format_segmentMark Tinguely2013-01-161-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not every segment in a multi-segment buffer is dirty in a transaction and they will not be outputted. The assert in xfs_buf_item_format_segment() that checks for the at least one chunk of data in the segment to be used is not necessary true for multi-segmented buffers. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: rename bli_format to avoid confusion with bli_formatsMark Tinguely2013-01-163-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the bli_format structure to __bli_format to avoid accidently confusing them with the bli_formats pointer. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: use b_maps[] for discontiguous buffersMark Tinguely2013-01-162-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits starting at 77c1a08 introduced a multiple segment support to xfs_buf. xfs_trans_buf_item_match() could not find a multi-segment buffer in the transaction because it was looking at the single segment block number rather than the multi-segment b_maps[0].bm.bn. This results on a recursive buffer lock that can never be satisfied. This patch: 1) Changed the remaining b_map accesses to be b_maps[0] accesses. 2) Renames the single segment b_map structure to __b_map to avoid future confusion. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* | | Merge branch 'for_linus' of ↵Linus Torvalds2013-01-162-2/+4
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 and udf fixes from Jan Kara: "One ext3 performance regression fix and one udf regression fix (oops on interrupted mount)." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: UDF: Fix a null pointer dereference in udf_sb_free_partitions jbd: don't wake kjournald unnecessarily
| * | | UDF: Fix a null pointer dereference in udf_sb_free_partitionsNamjae Jeon2013-01-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a regression caused by commit bff943af6fe "udf: Fix memory leak when mounting" due to which it was triggering a kernel null point dereference in case of interrupted mount OR when allocating memory to sbi->s_partmaps failed in function udf_sb_alloc_partition_maps. Reported-and-tested-by: James Hogan <james@albanarts.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | jbd: don't wake kjournald unnecessarilyEric Sandeen2013-01-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't send an extra wakeup to kjournald in the case where we already have the proper target in j_commit_request, i.e. that commit has already been requested for commit. commit d9b0193 "jbd: fix fsync() tid wraparound bug" changed the logic leading to a wakeup, but it caused some extra wakeups which were found to lead to a measurable performance regression. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | vfs: add missing virtual cache flush after editing partial pagesLinus Torvalds2013-01-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew Morton pointed this out a month ago, and then I completely forgot about it. If we read a partial last page of a block device, we will zero out the end of the page, but since that page can then be mapped into user space, we should also make sure to flush the cache on architectures that have virtual caches. We have the flush_dcache_page() function for this, so use it. Now, in practice this really never matters, because nobody sane uses virtual caches to begin with, and they largely exist on old broken RISC arhitectures. And even if you did run on one of those obsolete CPU's, the whole "mmap and access the last partial page of a block device" behavior probably doesn't actually exist. The normal IO functions (read/write) will never see the zeroed-out part of the page that migth not be coherent in the cache, because they honor the size of the device. So I'm marking this for stable (3.7 only), but I'm not sure anybody will ever care. Pointed-out-by: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org # 3.7 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'driver-core-3.8-rc3' of ↵Linus Torvalds2013-01-141-1/+1
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are two patches for 3.8-rc3. One removes the __dev* defines from init.h now that all usages of it are gone from your tree. The other fix is for debugfs's paramater that was using the wrong base for the option. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: debugfs: convert gid= argument from decimal, not octal Remove __dev* markings from init.h
| * | | debugfs: convert gid= argument from decimal, not octalDave Reisner2013-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch technically breaks userspace, but I suspect that anyone who actually used this flag would have encountered this brokenness, declared it lunacy, and already sent a patch. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Reviewed-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | fs/exec.c: work around icc miscompilationXi Wang2013-01-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tricky problem is this check: if (i++ >= max) icc (mis)optimizes this check as: if (++i > max) The check now becomes a no-op since max is MAX_ARG_STRINGS (0x7FFFFFFF). This is "allowed" by the C standard, assuming i++ never overflows, because signed integer overflow is undefined behavior. This optimization effectively reverts the previous commit 362e6663ef23 ("exec.c, compat.c: fix count(), compat_count() bounds checking") that tries to fix the check. This patch simply moves ++ after the check. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | seq_file: fix new kernel-doc warningsRandy Dunlap2013-01-101-1/+1
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix kernel-doc warnings in fs/seq_file.c: Warning(fs/seq_file.c:304): No description found for parameter 'whence' Warning(fs/seq_file.c:304): Excess function parameter 'origin' description in 'seq_lseek' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-01-081-1/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) New sysctl ndisc_notify needs some documentation, from Hanns Frederic Sowa. 2) Netfilter REJECT target doesn't set transport header of SKB correctly, from Mukund Jampala. 3) Forcedeth driver needs to check for DMA mapping failures, from Larry Finger. 4) brcmsmac driver can't use usleep_range while holding locks, use udelay instead. From Niels Ole Salscheider. 5) Fix unregister of netlink bridge multicast database handlers, from Vlad Yasevich and Rami Rosen. 6) Fix checksum calculations in netfilter's ipv6 network prefix translation module. 7) Fix high order page allocation failures in netfilter xt_recent, from Eric Dumazet. 8) mac802154 needs to use netif_rx_ni() instead of netif_rx() because mac802154_process_data() can execute in process rather than interrupt context. From Alexander Aring. 9) Fix splice handling of MSG_SENDPAGE_NOTLAST, otherwise we elide one tcp_push() too many. From Eric Dumazet and Willy Tarreau. 10) Fix skb->truesize tracking in XEN netfront driver, from Ian Campbell. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) xen/netfront: improve truesize tracking ipv4: fix NULL checking in devinet_ioctl() tcp: fix MSG_SENDPAGE_NOTLAST logic net/ipv4/ipconfig: really display the BOOTP/DHCP server's address. ip-sysctl: fix spelling errors mac802154: fix NOHZ local_softirq_pending 08 warning ipv6: document ndisc_notify in networking/ip-sysctl.txt ath9k: Fix Kconfig for ATH9K_HTC netfilter: xt_recent: avoid high order page allocations netfilter: fix missing dependencies for the NOTRACK target netfilter: ip6t_NPT: fix IPv6 NTP checksum calculation bridge: add empty br_mdb_init() and br_mdb_uninit() definitions. vxlan: allow live mac address change bridge: Correctly unregister MDB rtnetlink handlers brcmfmac: fix parsing rsn ie for ap mode. brcmsmac: add copyright information for Canonical rtlwifi: rtl8723ae: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192se: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192de: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192ce: Fix warning for unchecked pci_map_single() call ...
| * | | tcp: fix MSG_SENDPAGE_NOTLAST logicEric Dumazet2013-01-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 35f9c09fe9c72e (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag : MSG_SENDPAGE_NOTLAST meant to be set on all frags but the last one for a splice() call. The condition used to set the flag in pipe_to_sendpage() relied on splice() user passing the exact number of bytes present in the pipe, or a smaller one. But some programs pass an arbitrary high value, and the test fails. The effect of this bug is a lack of tcp_push() at the end of a splice(pipe -> socket) call, and possibly very slow or erratic TCP sessions. We should both test sd->total_len and fact that another fragment is in the pipe (pipe->nrbufs > 1) Many thanks to Willy for providing very clear bug report, bisection and test programs. Reported-by: Willy Tarreau <w@1wt.eu> Bisected-by: Willy Tarreau <w@1wt.eu> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2013-01-076-70/+90
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull CIFS fixes from Steve French: "Misc small cifs fixes" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Don't let read only caching for mandatory byte-range locked files CIFS: Fix write after setting a read lock for read oplock files Revert "CIFS: Fix write after setting a read lock for read oplock files" cifs: adjust sequence number downward after signing NT_CANCEL request cifs: move check for NULL socket into smb_send_rqst