path: root/fs
Commit message (Author, Age, Files, Lines)
* Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (Linus Torvalds, 2012-12-17, 4 files, -10/+14)
|\
| |   Pull ext3, udf, quota fixes from Jan Kara:
| |    "Some ext3 & quota cleanups and couple of udf fixes"
| |
| |   * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
| |     quota: Use the pre-processor to compile out quotactl_cmd_write when !CONFIG_BLOCK
| |     ext3: drop if around WARN_ON
| |     ext3: get rid of the duplicate code on ext3_fill_super
| |     udf: remove un-needed variable from inode_getblk
| |     udf: don't increment lenExtents while writing to a hole
| |     udf: fix memory leak while allocating blocks during write
| * quota: Use the pre-processor to compile out quotactl_cmd_write when !CONFIG_BLOCK (Lee Jones, 2012-12-13, 1 file, -0/+4)
| |   quotactl_cmd_write() is only ever invoked when BLOCK is configured. When !CONFIG_BLOCK, the build warning below is displayed. Let's fix that.
| |
| |     fs/quota/quota.c:311:12: warning: ‘quotactl_cmd_write’ defined but not used [-Wunused-function]
| |
| |   Cc: Jan Kara <jack@suse.cz>
| |   Signed-off-by: Lee Jones <lee.jones@linaro.org>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
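The compile-out pattern described in this commit can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `DEMO_CONFIG_BLOCK`, `demo_cmd_write()` and `demo_needs_write()` are hypothetical stand-ins, not the names or logic used in fs/quota/quota.c.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's CONFIG_BLOCK option.
 * Remove this define to build the "no block layer" variant. */
#define DEMO_CONFIG_BLOCK 1

#ifdef DEMO_CONFIG_BLOCK
/* Guarded by the same condition as its only caller, so the compiler
 * never sees a static function that is defined but not used. */
static int demo_cmd_write(int cmd)
{
	return cmd == 1 || cmd == 2;	/* assumed "write" commands */
}
#endif

int demo_needs_write(int cmd)
{
#ifdef DEMO_CONFIG_BLOCK
	return demo_cmd_write(cmd);
#else
	(void)cmd;			/* nothing to check without the block path */
	return 0;
#endif
}
```

Guarding the helper with the same condition as its caller keeps both configurations free of -Wunused-function warnings.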
| * ext3: drop if around WARN_ON (Julia Lawall, 2012-12-13, 1 file, -2/+1)
| |   Just use WARN_ON rather than an if containing only WARN_ON(1).
| |
| |   A simplified version of the semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/)
| |
| |     // <smpl>
| |     @@
| |     expression e;
| |     @@
| |     - if (e) WARN_ON(1);
| |     + WARN_ON(e);
| |     // </smpl>
| |
| |   Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
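To make the transformation concrete, here is a user-space sketch. `DEMO_WARN_ON` is a hypothetical mock that only logs and counts; it is not the kernel's WARN_ON implementation, and `demo_check_old()` / `demo_check_new()` are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

static int demo_warn_count;

/* Mock helper: logs and counts when cond is true, returns cond. */
static int demo_warn_on(int cond, const char *expr)
{
	if (cond) {
		demo_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
#define DEMO_WARN_ON(e) demo_warn_on(!!(e), #e)

/* Before: an if whose body is only WARN_ON(1). */
void demo_check_old(int nblocks)
{
	if (nblocks < 0)
		DEMO_WARN_ON(1);
}

/* After: the condition is passed directly to the macro. */
void demo_check_new(int nblocks)
{
	DEMO_WARN_ON(nblocks < 0);
}
```

Both forms warn in the same cases; with this mock, the second form also puts the failing expression in the message instead of a bare `1`.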
| * ext3: get rid of the duplicate code on ext3_fill_super (Zhao Hongjiang, 2012-12-13, 1 file, -3/+0)
| |   Setting s_mount_opt to 0 is unnecessary because we use kzalloc() for sb allocation. s_resuid and s_resgid are set again a few lines below based on values in the on-disk superblock.
| |   Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
| * udf: remove un-needed variable from inode_getblk (Namjae Jeon, 2012-12-13, 1 file, -3/+0)
| |   The variable last_block is not needed.
| |   Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
| |   Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
| * udf: don't increment lenExtents while writing to a hole (Namjae Jeon, 2012-12-13, 1 file, -2/+5)
| |   Incrementing lenExtents even while writing to a hole is bad for performance, because calls to udf_discard_prealloc and udf_truncate_tail_extent would not return right at the start if isize != lenExtents.
| |   Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
| |   Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
| * udf: fix memory leak while allocating blocks during write (Namjae Jeon, 2012-12-13, 1 file, -0/+4)
| |   Need to brelse the buffer_head stored in cur_epos and next_epos.
| |   Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
| |   Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
| |   Signed-off-by: Jan Kara <jack@suse.cz>
* | Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (Linus Torvalds, 2012-12-16, 29 files, -1080/+4047)
|\ \
| | |   Pull ext4 update from Ted Ts'o:
| | |    "There are two major features for this merge window. The first is inline data, which allows small files or directories to be stored in the in-inode extended attribute area. (This requires that the file system use inodes which are at least 256 bytes or larger; 128 byte inodes do not have any room for in-inode xattrs.)
| | |
| | |     The second new feature is SEEK_HOLE/SEEK_DATA support. This is enabled by the extent status tree patches, and this infrastructure will be used to further optimize ext4 in the future.
| | |
| | |     Beyond that, we have the usual collection of code cleanups and bug fixes."
| | |
| | |   * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (63 commits)
| | |     ext4: zero out inline data using memset() instead of empty_zero_page
| | |     ext4: ensure Inode flags consistency are checked at build time
| | |     ext4: Remove CONFIG_EXT4_FS_XATTR
| | |     ext4: remove unused variable from ext4_ext_in_cache()
| | |     ext4: remove redundant initialization in ext4_fill_super()
| | |     ext4: remove redundant code in ext4_alloc_inode()
| | |     ext4: use sync_inode_metadata() when syncing inode metadata
| | |     ext4: enable ext4 inline support
| | |     ext4: let fallocate handle inline data correctly
| | |     ext4: let ext4_truncate handle inline data correctly
| | |     ext4: evict inline data out if we need to strore xattr in inode
| | |     ext4: let fiemap work with inline data
| | |     ext4: let ext4_rename handle inline dir
| | |     ext4: let empty_dir handle inline dir
| | |     ext4: let ext4_delete_entry() handle inline data
| | |     ext4: make ext4_delete_entry generic
| | |     ext4: let ext4_find_entry handle inline data
| | |     ext4: create a new function search_dir
| | |     ext4: let ext4_readdir handle inline data
| | |     ext4: let add_dir_entry handle inline data properly
| | |     ...
| * | ext4: zero out inline data using memset() instead of empty_zero_page (Theodore Ts'o, 2012-12-11, 3 files, -7/+18)
| | |   Not all architectures (in particular, sparc64) have empty_zero_page. So instead of copying from empty_zero_page, use memset to clear the inline data by signalling to ext4_xattr_set_entry() via a magic pointer value, EXT4_ZERO_ATTR_VALUE, which is defined by casting -1 to a pointer.
| | |   This fixes a build failure on sparc64, and the memset() should be more efficient than using memcpy() anyway.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
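The magic-pointer signalling described above can be sketched as below. This is a user-space illustration only: `DEMO_ZERO_VALUE` mirrors the idea of EXT4_ZERO_ATTR_VALUE (-1 cast to a pointer), and `demo_set_value()` is a hypothetical helper, not ext4_xattr_set_entry().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: -1 cast to a pointer. It is only ever compared
 * against and never dereferenced. */
#define DEMO_ZERO_VALUE ((const void *)(intptr_t)-1)

/* Store len bytes at dst; the sentinel means "fill with zeroes". */
void demo_set_value(void *dst, const void *value, size_t len)
{
	if (value == DEMO_ZERO_VALUE)
		memset(dst, 0, len);		/* no zero-filled source buffer needed */
	else
		memcpy(dst, value, len);
}
```

The caller asks for zeroing without having to supply a source buffer full of zeroes, which is what removes the dependency on a per-architecture zero page.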
| * | ext4: ensure Inode flags consistency are checked at build time (Carlos Maiolino, 2012-12-10, 2 files, -16/+14)
| | |   Flags used by atomic operations on inode flags (e.g. ext4_test_inode_flag()) should be consistent with those actually stored in inodes, i.e. EXT4_XXX_FL.
| | |   This patch ensures that this consistency is checked at build time, not at run time.
| | |   Currently, flag consistency is checked at run time, but there is no real reason not to do a build-time check instead. The code compares macro-defined values with enum values; both are constants, so there is no problem in comparing them at build time.
| | |   Enumeration constants are treated as constants by the C compiler, according to the C99 specs (see www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf sec. 6.2.5, item 16), so there is no real problem in comparing an enumeration type at build time.
| | |   Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
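A build-time consistency check of this kind can be sketched with C11 `_Static_assert`. The kernel has its own build-time assertion helpers, so treat this as an analogue rather than the patch itself; all `DEMO_*` names and values are hypothetical.

```c
#include <assert.h>

/* Hypothetical on-disk flag masks, defined as macros. */
#define DEMO_SYNC_FL	0x00000008u
#define DEMO_INDEX_FL	0x00001000u

/* Hypothetical bit numbers, as used by atomic bit helpers. */
enum demo_inode_flag_bit {
	DEMO_INODE_SYNC  = 3,
	DEMO_INODE_INDEX = 12,
};

/* Both sides are constant expressions, so a mismatch breaks the build
 * instead of being detected at run time. */
#define DEMO_CHECK_FLAG(name) \
	_Static_assert((1u << DEMO_INODE_##name) == DEMO_##name##_FL, \
		       "bit number and mask disagree for " #name)

DEMO_CHECK_FLAG(SYNC);
DEMO_CHECK_FLAG(INDEX);

/* Test one flag by bit number. */
int demo_flag_is_set(unsigned int flags, enum demo_inode_flag_bit bit)
{
	return (flags >> bit) & 1u;
}
```

Changing `DEMO_INODE_SYNC` to 4, for example, would make compilation fail with the message for SYNC.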
| * | ext4: Remove CONFIG_EXT4_FS_XATTR (Tao Ma, 2012-12-10, 9 files, -275/+4)
| | |   Ted has sent out an RFC about removing this feature. Eric and Jan confirmed that both RedHat and SUSE enable this feature in all their products. David also said that "As far as I know, it's enabled in all Android kernels that use ext4." So it seems OK for us.
| | |   What's more, inline data depends on xattr for its implementation, and to be frank, I haven't run any tests against inline data enabled while xattr is disabled. So I think we should add inline data and remove this config option in the same release.
| | |   [ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which isn't much in the grand scheme of things. Since no one seems to be testing this configuration except for some automated compile farms, on balance we are better off removing this config option, so that it is effectively always enabled. -- tytso ]
| | |   Cc: David Brown <davidb@codeaurora.org>
| | |   Cc: Eric Sandeen <sandeen@redhat.com>
| | |   Reviewed-by: Jan Kara <jack@suse.cz>
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: remove unused variable from ext4_ext_in_cache() (Zhi Yong Wu, 2012-12-10, 1 file, -2/+0)
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
| | |   Reviewed-by: Zheng Liu <gnehzuil.liu@gmail.com>
| * | ext4: remove redundant initialization in ext4_fill_super() (Guo Chao, 2012-12-10, 1 file, -1/+0)
| | |   We use kzalloc() to allocate sbi, so there is no need to zero its field.
| | |   Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: remove redundant code in ext4_alloc_inode() (Guo Chao, 2012-12-10, 1 file, -1/+0)
| | |   inode_init_always() will initialize inode->i_data.writeback_index anyway, so there is no need to do this in ext4_alloc_inode().
| | |   Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Reviewed-by: Lukas Czerner <lczerner@redhat.com>
| * | ext4: use sync_inode_metadata() when syncing inode metadata (Guo Chao, 2012-12-10, 1 file, -5/+1)
| | |   We have a dedicated interface to sync inode metadata. Use it to simplify ext4's code.
| | |   Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Reviewed-by: Lukas Czerner <lczerner@redhat.com>
| * | ext4: enable ext4 inline support (Tao Ma, 2012-12-10, 2 files, -1/+6)
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let fallocate handle inline data correctly (Tao Ma, 2012-12-10, 3 files, -0/+48)
| | |   If we are punching a hole in a file, we will return ENOTSUPP. As for the fallocation of some extents, we will convert the inline data to a normal extent-based file first.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let ext4_truncate handle inline data correctly (Tao Ma, 2012-12-10, 3 files, -0/+107)
| | |   Signed-off-by: Robin Dong <sanbai@taobao.com>
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: evict inline data out if we need to strore xattr in inode (Tao Ma, 2012-12-10, 3 files, -12/+99)
| | |   Now that we store data in the inode, we may need to store some xattrs when the inode doesn't have enough space. Andreas suggested that in this case we should keep the xattr (metadata) in the inode and push the data out. This patch does that work.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let fiemap work with inline data (Tao Ma, 2012-12-10, 3 files, -0/+54)
| | |   fiemap is used to find the disk layout of a file. For inline data, we simply report the file as having a single extent.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let ext4_rename handle inline dir (Tao Ma, 2012-12-10, 3 files, -35/+100)
| | |   When we rename a directory, ext4_rename has to read the dir block and change its dotdot information. The old ext4_rename did the dir_block read itself. This patch adds a new function, ext4_get_first_dir_block(), which gets the dir buffer information so that ext4_rename can handle it properly. As the rename also changes the parent inode number, we return the parent_de so that ext4_rename() can handle it more easily.
| | |   ext4_find_entry is also changed so that the caller (rename) can tell whether the found entry is an inlined one or not and journal the corresponding buffer head.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let empty_dir handle inline dir (Tao Ma, 2012-12-10, 3 files, -0/+104)
| | |   empty_dir is used when deleting a dir, so it should handle an inline dir properly.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let ext4_delete_entry() handle inline data (Tao Ma, 2012-12-10, 3 files, -0/+76)
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: make ext4_delete_entry generic (Tao Ma, 2012-12-10, 2 files, -26/+53)
| | |   Currently ext4_delete_entry() is used only for removing a dir entry from a dir block. So create a new function, ext4_generic_delete_entry, which takes an entry_buf and a buf_size so that it can also be used for inline data.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let ext4_find_entry handle inline data (Tao Ma, 2012-12-10, 3 files, -1/+70)
| | |   Create a new function ext4_find_inline_entry() to handle the case of inline data.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: create a new function search_dir (Tao Ma, 2012-12-10, 2 files, -7/+26)
| | |   search_dirblock is used to search a dir block, but the code is almost the same for searching an inline dir. So create a new function search_dir and let search_dirblock call it.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
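The shape of this refactoring, a generic search over any buffer with the block-specific function reduced to a wrapper, can be sketched as follows. The fixed-size `struct demo_dirent` and both function signatures are hypothetical simplifications; real ext4 directory entries are variable length and the kernel functions take different arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size entry; inode 0 marks an unused slot. */
struct demo_dirent {
	unsigned int inode;
	char name[12];
};

/* Generic search over any run of entries: a dir block or an inline area. */
const struct demo_dirent *demo_search_dir(const struct demo_dirent *buf,
					  size_t count, const char *name)
{
	for (size_t i = 0; i < count; i++)
		if (buf[i].inode != 0 && strcmp(buf[i].name, name) == 0)
			return &buf[i];
	return NULL;
}

#define DEMO_BLOCK_ENTRIES 4

/* Block-specific wrapper: only supplies the block's size. */
const struct demo_dirent *demo_search_dirblock(const struct demo_dirent *block,
					       const char *name)
{
	return demo_search_dir(block, DEMO_BLOCK_ENTRIES, name);
}
```

An inline directory can then call `demo_search_dir()` directly with its own, smaller entry count.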
| * | ext4: let ext4_readdir handle inline data (Tao Ma, 2012-12-10, 4 files, -13/+169)
| | |   For "." and "..", we just call filldir by ourselves instead of iterating the real dir entry.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: let add_dir_entry handle inline data properly (Tao Ma, 2012-12-10, 4 files, -10/+430)
| | |   This patch lets add_dir_entry handle the inline data case. The dir is initialized as an inline dir first, and then we can try to add some files to it. When the inline space can't hold all the entries, a dir block will be created and the dir entries will be moved to it.
| | |   Also, for an inlined dir, "." and ".." are removed and we only use 4 bytes to store the parent inode number. These 2 entries will be added when we convert an inline dir to a block-based one.
| | |   [ Folded in patch from Dan Carpenter to remove an unused variable. ]
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: create __ext4_insert_dentry for dir entry insertion (Tao Ma, 2012-12-10, 2 files, -40/+80)
| | |   The old add_dirent_to_buf handles all the work of adding a dir entry to a dir block. Now that we have inline data, create 2 new functions, __ext4_find_dest_de and __ext4_insert_dentry, that do the real work and let add_dirent_to_buf call them.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: refactor __ext4_check_dir_entry() to accept start and size (Tao Ma, 2012-12-10, 3 files, -15/+21)
| | |   The __ext4_check_dir_entry() function is used to check whether the de is over the block boundary. Now with inline data, it could be within the block boundary while exceeding the inode size. So change this function to check the overflow more precisely.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: make ext4_init_dot_dotdot for inline dir usage (Tao Ma, 2012-12-10, 2 files, -44/+75)
| | |   Currently, the initialization of dot and dotdot is encapsulated in ext4_mkdir and also bound to dir_block. So create a new function named ext4_init_new_dir and move the initialization to ext4_init_dot_dotdot. It will now be called either in the normal non-inline case (rec_len of ".." will cover the whole block) or when we are converting an inline dir to a block (rec_len of ".." will be the real length). The start of the next entry is also returned for inline dir usage.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: add delalloc support for inline data (Tao Ma, 2012-12-10, 4 files, -9/+262)
| | |   For delayed allocation mode, we write to inline data if the file is small enough. If we write to some offset larger than the inline size, the 1st page is dirtied, so that ext4_da_writepages can handle the conversion. When the 1st page is initialized with blocks, the inline part is removed.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: add journalled write support for inline data (Tao Ma, 2012-12-10, 3 files, -20/+85)
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: add normal write support for inline data (Tao Ma, 2012-12-10, 5 files, -42/+340)
| | |   For a normal write case (not journalled write, not delayed allocation), we write to the inline area if the file is small and convert it to an extent-based file when the write is larger than the max inline size.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: add read support for inline data (Tao Ma, 2012-12-10, 3 files, -1/+98)
| | |   Let readpage and readpages handle the case when we want to read an inlined file.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: add the basic function for inline data support (Tao Ma, 2012-12-10, 5 files, -3/+534)
| | |   Implement inline data with xattr. We use the "system.data" xattr to store the data; the xattr will be extended if the i_size is increased, while we don't release the space during truncate.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: export inline xattr functions (Tao Ma, 2012-12-05, 2 files, -33/+64)
| | |   The inline data feature will need some inline xattr functions, so export them from fs/ext4/xattr.c so that inline.c can use them.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: move extra inode read to a new function (Tao Ma, 2012-12-02, 1 file, -5/+11)
| | |   Currently, in ext4_iget we do a simple check to see whether there is any information starting from the end of i_extra_size. With inline data added, this procedure is more complicated. So move it to a new function named ext4_iget_extra_inode.
| | |   Signed-off-by: Tao Ma <boyu.mt@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: fix possible use after free with metadata csum (Theodore Ts'o, 2012-11-29, 1 file, -1/+1)
| | |   Commit fa77dcfafeaa introduces block bitmap checksum calculation into ext4_new_inode() in the case that block group was uninitialized. However we brelse() the bitmap buffer before we attempt to checksum it so we have no guarantee that the buffer is still there.
| | |   Fix this by releasing the buffer after the possible checksum computation.
| | |   Signed-off-by: Lukas Czerner <lczerner@redhat.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
| | |   Cc: stable@vger.kernel.org
| * | ext4: restructure ext4_ext_direct_IO() (Theodore Ts'o, 2012-11-29, 1 file, -108/+103)
| | |   Remove a level of indentation by moving the DIO read and extending write case to the beginning of the file. This results in no actual programmatic changes to the file, but makes it easier to read/understand.
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: rationalize ext4_extents.h inclusion (Theodore Ts'o, 2012-11-28, 8 files, -30/+37)
| | |   Previously, ext4_extents.h was being included at the end of ext4.h, which was bad for a number of reasons: (a) it was not being included in the expected place, and (b) it caused the header to be included multiple times. There were #ifdef's to prevent this from causing any problems, but it still was unnecessary.
| | |   By moving the function declarations that were in ext4_extents.h to ext4.h, which is standard practice for where the function declarations for the rest of ext4.h can be found, we can remove ext4_extents.h from being included in ext4.h at all, and then we can only include ext4_extents.h where it is needed in ext4's source files.
| | |   It should be possible to move a few more things into ext4.h, and further reduce the number of source files that need to #include ext4_extents.h, but that's a cleanup for another day.
| | |   Reported-by: Sachin Kamat <sachin.kamat@linaro.org>
| | |   Reported-by: Wei Yongjun <weiyj.lk@gmail.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: fixed potential NULL dereference in ext4_calculate_overhead() (Vahram Martirosyan, 2012-11-28, 1 file, -1/+0)
| | |   The memset operation before the check can cause a BUG if the memory allocation failed. Since we are using get_zeroed_page, there is no need to use memset anyway.
| | |   Found by the Spruce system in cooperation with the KEDR Framework.
| | |   Signed-off-by: Vahram Martirosyan <vmartirosyan@linuxtesting.org>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
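The bug class fixed here, touching a buffer before checking whether its allocation succeeded, can be sketched in user space as follows. `calloc()` stands in for a zeroing allocator, and `demo_alloc_bitmap()` is a hypothetical helper, not the kernel function.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a zero-filled scratch buffer. Returns NULL on failure. */
unsigned char *demo_alloc_bitmap(size_t nbytes)
{
	unsigned char *buf = calloc(1, nbytes);

	/* The check comes before any access. A memset(buf, 0, nbytes)
	 * placed above this line would dereference NULL on failure, and
	 * it is redundant because the allocator already zeroes. */
	if (!buf)
		return NULL;
	return buf;
}
```

The caller owns the returned buffer and releases it with `free()`.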
| * | ext4: simple cleanup in fiemap codepath (Lukas Czerner, 2012-11-28, 1 file, -16/+16)
| | |   This commit is a simple cleanup of the fiemap codepath, which was not included in the previous commit in order to make the changes clearer.
| | |   In this commit we rename the cbex variable to newex in ext4_fill_fiemap_extents() because the callback is no longer present.
| | |   Signed-off-by: Lukas Czerner <lczerner@redhat.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: prevent race while walking extent tree for fiemap (Lukas Czerner, 2012-11-28, 2 files, -74/+76)
| | |   Currently ext4_ext_walk_space() only takes i_data_sem for read when searching for the extent at given block with ext4_ext_find_extent(). Then it drops the lock and the extent tree can be changed at will. However later on we're searching for the 'next' extent, but the extent tree might already have changed, so the information might not be accurate.
| | |   In fact we can hit BUG_ON(end <= start) if the extent got inserted into the tree after the one we found and before the block we were searching for. This has been reproduced by running xfstests 225 in loop on s390x architecture, but theoretically we could hit this on any other architecture as well, but probably not as often.
| | |   Moreover the extent currently in delayed allocation might be allocated after we search the extent tree and before we search extent status tree delayed buffers resulting in those delayed buffers being completely missed, even though completely written and allocated.
| | |   We fix all those problems in several steps:
| | |     1. remove unnecessary callback indirection
| | |     2. rename functions
| | |          ext4_ext_walk_space -> ext4_fill_fiemap_extents
| | |          ext4_ext_fiemap_cb -> ext4_find_delayed_extent
| | |     3. move fiemap_fill_next_extent() into ext4_fill_fiemap_extents()
| | |     4. hold the i_data_sem for:
| | |          ext4_ext_find_extent()
| | |          ext4_ext_next_allocated_block()
| | |          ext4_find_delayed_extent()
| | |     5. call fiemap_fill_next_extent after releasing the i_data_sem
| | |     6. move path reinitialization into the critical section.
| | |   Signed-off-by: Lukas Czerner <lczerner@redhat.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: remove calls to ext4_jbd2_file_inode() from delalloc write path (Theodore Ts'o, 2012-11-15, 1 file, -19/+1)
| | |   The calls to ext4_jbd2_file_inode() are needed to guarantee that we do not expose stale data in the data=ordered mode. However, they are not necessary because in all of the cases where we have newly allocated blocks in the delayed allocation write path, we immediately submit the dirty pages for I/O. Hence, we can avoid the overhead of adding the inode to the list of inodes whose data pages will need to be flushed out to disk completely during the next commit operation.
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: init pagevec in ext4_da_block_invalidatepages (Eric Sandeen, 2012-11-14, 1 file, -0/+1)
| | |   ext4_da_block_invalidatepages is missing a pagevec_init(), which means that pvec->cold contains random garbage.
| | |   This affects whether the page goes to the front or back of the LRU when ->cold makes it to free_hot_cold_page()
| | |   Reviewed-by: Lukas Czerner <lczerner@redhat.com>
| | |   Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
| | |   Signed-off-by: Eric Sandeen <sandeen@redhat.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Cc: stable@vger.kernel.org
| * | ext4: don't verify checksums of dx non-leaf nodes during fallback scan (Darrick J. Wong, 2012-11-12, 1 file, -0/+17)
| | |   During a directory entry lookup of a hashed directory, if the hash-based lookup functions fail and we fall back to a linear scan, don't try to verify the dirent checksum on the internal nodes of the hash tree because they don't store a checksum in a hidden dirent like the leaf nodes do.
| | |   Reported-by: George Spelvin <linux@horizon.com>
| | |   Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | ext4: do not use ext4_error() when there is no space in dir leaf for csum (Theodore Ts'o, 2012-11-10, 1 file, -8/+10)
| | |   If there is no space for a checksum in a directory leaf node, previously we would use EXT4_ERROR_INODE() which would mark the file system as inconsistent. While it would be nice to use e2fsck -D, it certainly isn't required, so just print a warning using ext4_warning().
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| | |   Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
| * | ext4: introduce lseek SEEK_DATA/SEEK_HOLE support (Zheng Liu, 2012-11-08, 1 file, -2/+332)
| | |   This patch makes ext4 really support the SEEK_DATA/SEEK_HOLE flags. Block-mapped and extent-mapped files are implemented together because ext4_map_blocks hides the difference between them.
| | |   After applying this patch, xfstest #285 will fail when the file is block-mapped, because block-mapped files don't support fallocate(2).
| | |   I had tried to use ext4_ext_walk_space() to retrieve the offset for an extent-mapped file. But finally I decided to keep using ext4_map_blocks() to support SEEK_DATA/SEEK_HOLE because ext4_map_blocks() can hide the difference between block-mapped and extent-mapped files. Moreover, in the next step, the extent status tree will track all extent status, and we can get all mappings from this tree. So I think that using ext4_map_blocks() is a better choice.
| | |   CC: Hugh Dickins <hughd@google.com>
| | |   Signed-off-by: Jie Liu <jeff.liu@oracle.com>
| | |   Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
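From user space, the feature is reached through lseek(2) with the SEEK_DATA and SEEK_HOLE whence values. The sketch below walks a file's data segments; it assumes Linux, and `demo_count_data_segments()` is a hypothetical helper written for illustration, not part of this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for headers that hide these without _GNU_SOURCE; the values
 * are the Linux ones. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Print and count the data segments of fd by alternating SEEK_DATA and
 * SEEK_HOLE. Returns the count, or -1 on an unexpected result. */
int demo_count_data_segments(int fd)
{
	off_t pos = 0;
	int n = 0;

	for (;;) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0)
			break;		/* typically ENXIO: no data at or after pos */

		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole <= data)
			return -1;	/* error, or no forward progress */

		printf("data: [%lld, %lld)\n", (long long)data, (long long)hole);
		n++;
		pos = hole;
	}
	return n;
}
```

The end of file counts as an implicit hole, so a fully written file is reported as a single data segment. On filesystems without real hole tracking, the kernel's generic fallback treats the whole file as data.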
| * | ext4: reimplement fiemap using extent status tree (Zheng Liu, 2012-11-08, 1 file, -163/+21)
| | |   Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
| | |   Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
| | |   Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
| | |   Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>