linux-stable.git - Linux kernel stable tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	vfs: fix the stupidity with i_dentry in inode destructors	Al Viro	2012-01-03	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	Seeing that just about every destructor got that INIT_LIST_HEAD() copied into it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once(); the cost of taking it into inode_init_always() will be negligible for pipes and sockets and negative for everything else. Not to mention the removal of boilerplate code from ->destroy_inode() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ext4: display the correct mount option in /proc/mounts for [no]init_itable	Theodore Ts'o	2011-12-12	1	-9/+8
\| \| \| \| \| \| \| \| \|	/proc/mounts was showing the mount option [no]init_inode_table when the correct mount option that will be accepted by parse_options() is [no]init_itable. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
*	ext4: Remove kernel_lock annotations	Richard Weinberger	2011-11-07	1	-2/+0
\| \| \| \| \| \| \|	The BKL is gone, these annotations are useless. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: ignore journalled data options on remount if fs has no journal	Theodore Ts'o	2011-11-07	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	This avoids a confusing failure in the init scripts when the /etc/fstab has data=writeback or data=journal but the file system does not have a journal. So check for this case explicitly, and warn the user that we are ignoring the (pointless, since they have no journal) data=* mount option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: optimize ext4_ext_convert_to_initialized()	Eric Gouriou	2011-10-27	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces a fast path in ext4_ext_convert_to_initialized() for the case when the conversion can be performed by transferring the newly initialized blocks from the uninitialized extent into an adjacent initialized extent. Doing so removes the expensive invocations of memmove() which occur during extent insertion and the subsequent merge. In practice this should be the common case for clients performing append writes into files pre-allocated via fallocate(FALLOC_FL_KEEP_SIZE). In such a workload performed via direct IO and when using a suboptimal implementation of memmove() (x86_64 prior to the 2.6.39 rewrite), this patch reduces kernel CPU consumption by 32%. Two new trace points are added to ext4_ext_convert_to_initialized() to offer visibility into its operations. No exit trace point has been added due to the multiplicity of return points. This can be revisited once the upstream cleanup is backported. Signed-off-by: Eric Gouriou <egouriou@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: fix ext4 so it works without CONFIG_PROC_FS	Fabrice Jouhaud	2011-10-08	1	-10/+6
\| \| \| \| \| \| \| \| \|	This fixes a bug which was introduced in dd68314ccf3fb. The problem came from the test of the return value of proc_mkdir which is always false without procfs, and this would initialization of ext4. Signed-off-by: Fabrice Jouhaud <yargil@free.fr> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: remove deprecated oldalloc	Lukas Czerner	2011-10-08	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	For a long time now orlov is the default block allocator in the ext4. It performs better than the old one and no one seems to claim otherwise so we can safely drop it and make oldalloc and orlov mount option deprecated. This is a part of the effort to reduce number of ext4 options hence the test matrix. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Free resources in some error path in ext4_fill_super	Tao Ma	2011-10-06	1	-8/+11
\| \| \| \| \| \| \| \| \|	Some of the error path in ext4_fill_super don't release the resouces properly. So this patch just try to release them in the right way. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: rename ext4_count_free_blocks() to ext4_count_free_clusters()	Theodore Ts'o	2011-09-09	1	-3/+4
\| \| \| \| \| \| \|	This function really counts the free clusters reported in the block group descriptors, so rename it to reduce confusion. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Rename ext4_free_blks_{count,set}() to refer to clusters	Theodore Ts'o	2011-09-09	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	The field bg_free_blocks_count_{lo,high} in the block group descriptor has been repurposed to hold the number of free clusters for bigalloc functions. So rename the functions so it makes it easier to read and audit the block allocation and block freeing code. Note: at this point in bigalloc development we doesn't support online resize, so this also makes it really obvious all of the places we need to fix up to add support for online resize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix bigalloc quota accounting and i_blocks value	Aditya Kali	2011-09-09	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	With bigalloc changes, the i_blocks value was not correctly set (it was still set to number of blocks being used, but in case of bigalloc, we want i_blocks to represent the number of clusters being used). Since the quota subsystem sets the i_blocks value, this patch fixes the quota accounting and makes sure that the i_blocks value is set correctly. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: teach ext4_statfs() to deal with clusters if bigalloc is enabled	Theodore Ts'o	2011-09-09	1	-13/+24
\| \| \| \|	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: convert the free_blocks field in s_flex_groups to be free_clusters	Theodore Ts'o	2011-09-09	1	-1/+1
\| \| \| \| \| \| \| \| \|	Convert the free_blocks to be free_clusters to make the final revised bigalloc changes easier to read/understand. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: convert s_{dirty,free}blocks_counter to s_{dirty,free}clusters_counter	Theodore Ts'o	2011-09-09	1	-14/+15
\| \| \| \| \| \| \| \|	Convert the percpu counters s_dirtyblocks_counter and s_freeblocks_counter in struct ext4_super_info to be s_dirtyclusters_counter and s_freeclusters_counter. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: enforce bigalloc restrictions (e.g., no online resizing, etc.)	Theodore Ts'o	2011-09-09	1	-0/+7
\| \| \| \| \| \| \| \|	At least initially if the bigalloc feature is enabled, we will not support non-extent mapped inodes, online resizing, online defrag, or the FITRIM ioctl. This simplifies the initial implementation. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: read-only support for bigalloc file systems	Theodore Ts'o	2011-09-09	1	-8/+50
\| \| \| \| \| \| \| \| \|	This adds supports for bigalloc file systems. It teaches the mount code just enough about bigalloc superblock fields that it will mount the file system without freaking out that the number of blocks per group is too big. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: add ext4-specific kludge to avoid an oops after the disk disappears	Theodore Ts'o	2011-09-09	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	The del_gendisk() function uninitializes the disk-specific data structures, including the bdi structure, without telling anyone else. Once this happens, any attempt to call mark_buffer_dirty() (for example, by ext4_commit_super), will cause a kernel OOPS. Fix this for now until we can fix things in an architecturally correct way. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: improve handling of conflicting mount options	Theodore Ts'o	2011-09-03	1	-21/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the user explicitly specifies conflicting mount options for delalloc or dioread_nolock and data=journal, fail the mount, instead of printing a warning and continuing (since many user's won't look at dmesg and notice the warning). Also, print a single warning that data=journal implies that delayed allocation is not on by default (since it's not supported), and furthermore that O_DIRECT is not supported. Improve the text in Documentation/filesystems/ext4.txt so this is clear there as well. Similarly, if the dioread_nolock mount option is specified when the file system block size != PAGE_SIZE, fail the mount instead of printing a warning message and ignoring the mount option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: call ext4_ioend_wait and ext4_flush_completed_IO in ext4_evict_inode	Jiaying Zhang	2011-08-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flush inode's i_completed_io_list before calling ext4_io_wait to prevent the following deadlock scenario: A page fault happens while some process is writing inode A. During page fault, shrink_icache_memory is called that in turn evicts another inode B. Inode B has some pending io_end work so it calls ext4_ioend_wait() that waits for inode B's i_ioend_count to become zero. However, inode B's ioend work was queued behind some of inode A's ioend work on the same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten thread on that cpu is processing inode A's ioend work, it tries to grab inode A's i_mutex lock. Since the i_mutex lock of inode A is still hold before the page fault happened, we enter a deadlock. Also moves ext4_flush_completed_IO and ext4_ioend_wait from ext4_destroy_inode() to ext4_evict_inode(). During inode deleteion, ext4_evict_inode() is called before ext4_destroy_inode() and in ext4_evict_inode(), we may call ext4_truncate() without holding i_mutex lock. As a result, there is a race between flush_completed_IO that is called from ext4_ext_truncate() and ext4_end_io_work, which may cause corruption on an io_end structure. This change moves ext4_flush_completed_IO and ext4_ioend_wait from ext4_destroy_inode() to ext4_evict_inode() to resolve the race between ext4_truncate() and ext4_end_io_work during inode deletion. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
*	ext4: use kzalloc in ext4_kzalloc()	Mathias Krause	2011-08-03	1	-1/+1
\| \| \| \| \| \| \| \| \|	Commit 9933fc0i (ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()) intruduced wrappers around k*alloc/vmalloc but introduced a typo for ext4_kzalloc() by not using kzalloc() but kmalloc(). Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info	Theodore Ts'o	2011-08-01	1	-4/+5
\| \| \| \|	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()	Theodore Ts'o	2011-08-01	1	-18/+36
\| \| \| \| \| \| \| \|	Introduce new helper functions which try kmalloc, and then fall back to vmalloc if necessary, and use them for allocating and deallocating s_flex_groups. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: prevent parallel resizers by atomic bit ops	Yongqiang Yang	2011-07-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this patch, parallel resizers are allowed and protected by a mutex lock, actually, there is no need to support parallel resizer, so this patch prevents parallel resizers by atmoic bit ops, like lock_page() and unlock_page() do. To do this, the patch removed the mutex lock s_resize_lock from struct ext4_sb_info and added a unsigned long field named s_resize_flags which inidicates if there is a resizer. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: ignore a stripe width of 1	Dan Ehrenberg	2011-07-17	1	-7/+15
\| \| \| \| \| \| \| \| \|	If the stripe width was set to 1, then this patch will ignore that stripe width and ext4 will act as if the stripe width were 0 with respect to optimizing allocations. Signed-off-by: Dan Ehrenberg <dehrenberg@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: add tracepoint for ext4_journal_start	Theodore Ts'o	2011-07-10	1	-0/+1
\| \| \| \| \| \|	This will help debug who is responsible for starting a jbd2 transaction. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	ext4: Fix max file size and logical block counting of extent format file	Lukas Czerner	2011-06-06	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kazuya Mio reported that he was able to hit BUG_ON(next == lblock) in ext4_ext_put_gap_in_cache() while creating a sparse file in extent format and fill the tail of file up to its end. We will hit the BUG_ON when we write the last block (2^32-1) into the sparse file. The root cause of the problem lies in the fact that we specifically set s_maxbytes so that block at s_maxbytes fit into on-disk extent format, which is 32 bit long. However, we are not storing start and end block number, but rather start block number and length in blocks. It means that in order to cover extent from 0 to EXT_MAX_BLOCK we need EXT_MAX_BLOCK+1 to fit into len (because we counting block 0 as well) - and it does not. The only way to fix it without changing the meaning of the struct ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes by one fs block so we can cover the whole extent we can get by the on-disk extent format. Also in many places EXT_MAX_BLOCK is used as length instead of maximum logical block number as the name suggests, it is all a bit messy. So this commit renames it to EXT_MAX_BLOCKS and change its usage in some places to actually be maximum number of blocks in the extent. The bug which this commit fixes can be reproduced as follows: dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((232-2)) sync dd if=/dev/zero of=/mnt/mp1/file bs=<blocksize> count=1 seek=$((232-1)) Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2011-05-26	1	-0/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem: xen: cleancache shim to Xen Transcendent Memory ocfs2: add cleancache support ext4: add cleancache support btrfs: add cleancache support ext3: add cleancache support mm/fs: add hooks to support cleancache mm: cleancache core ops functions and config fs: add field to superblock to support cleancache mm/fs: cleancache documentation Fix up trivial conflict in fs/btrfs/extent_io.c due to includes
\| *	ext4: add cleancache support	Dan Magenheimer	2011-05-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This seventh patch of eight in this cleancache series "opts-in" cleancache for ext4. Filesystems must explicitly enable cleancache by calling cleancache_init_fs anytime an instance of the filesystem is mounted. For ext4, all other cleancache hooks are in the VFS layer including the matching cleancache_flush_fs hook which must be called on unmount. Details and a FAQ can be found in Documentation/vm/cleancache.txt [v6-v8: no changes] [v5: jeremy@goop.org: simplify init hook and any future fs init changes] Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik Van Riel <riel@redhat.com> Cc: Jan Beulich <JBeulich@novell.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Nitin Gupta <ngupta@vflare.org>
* \|	ext4: add support for multiple mount protection	Johann Lombardi	2011-05-24	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prevent an ext4 filesystem from being mounted multiple times. A sequence number is stored on disk and is periodically updated (every 5 seconds by default) by a mounted filesystem. At mount time, we now wait for s_mmp_update_interval seconds to make sure that the MMP sequence does not change. In case of failure, the nodename, bdevname and the time at which the MMP block was last updated is displayed. Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Johann Lombardi <johann@whamcloud.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: ensure f_bfree returned by ext4_statfs() is non-negative	Kazuya Mio	2011-05-24	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I found the issue that the number of free blocks went negative. # stat -f /mnt/mp1/ File: "/mnt/mp1/" ID: e175ccb83a872efe Namelen: 255 Type: ext2/ext3 Block size: 4096 Fundamental block size: 4096 Blocks: Total: 258022 Free: -15 Available: -13122 Inodes: Total: 65536 Free: 63029 f_bfree in struct statfs will go negative when the filesystem has few free blocks. Because the number of dirty blocks is bigger than the number of free blocks in the following two cases. CASE 1: ext4_da_writepages mpage_da_map_and_submit ext4_map_blocks ext4_ext_map_blocks ext4_mb_new_blocks ext4_mb_diskspace_used percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len); <--- interrupt statfs systemcall ---> ext4_da_update_reserve_space percpu_counter_sub(&sbi->s_dirtyblocks_counter, used + ei->i_allocated_meta_blocks); CASE 2: ext4_write_begin __block_write_begin ext4_map_blocks ext4_ext_map_blocks ext4_mb_new_blocks ext4_mb_diskspace_used percpu_counter_sub(&sbi->s_freeblocks_counter, ac->ac_b_ex.fe_len); <--- interrupt statfs systemcall ---> percpu_counter_sub(&sbi->s_dirtyblocks_counter, reserv_blks); To avoid the issue, this patch ensures that f_bfree is non-negative. Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
* \|	ext4: count hits/misses of extent cache and expose in sysfs	Vivek Haldar	2011-05-22	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The number of hits and misses for each filesystem is exposed in /sys/fs/ext4/<dev>/extent_cache_{hits, misses}. Tested: fsstress, manual checks. Signed-off-by: Vivek Haldar <haldar@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: don't show mount options in /proc/mounts if there is no journal	Theodore Ts'o	2011-05-22	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After creating an ext4 file system without a journal: # mke2fs -t ext4 -O ^has_journal /dev/sda # mount -t ext4 /dev/sda /test the /proc/mounts will show: "/dev/sda /test ext4 rw,relatime,user_xattr,acl,barrier=1,data=writeback 0 0" which can fool users into thinking that the fs is using writeback mode. So don't set the writeback option when the journal has not been enabled; we don't depend on the writeback option being set, since ext4_should_writeback_data() in ext4_jbd2.h tests to see if the journal is not present before returning true. Reported-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: fix possible use-after-free in ext4_remove_li_request()	Lukas Czerner	2011-05-20	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to take reference to the s_li_request after we take a mutex, because it might be freed since then, hence result in accessing old already freed memory. Also we should protect the whole ext4_remove_li_request() because ext4_li_info might be in the process of being freed in ext4_lazyinit_thread(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
* \|	ext4: fix the mount option "init_itable=n" to work as expected for n=0	Lukas Czerner	2011-05-20	1	-7/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some reason, when we set the mount option "init_itable=0" it behaves as we would set init_itable=20 which is not right at all. Basically when we set it to zero we are saying to lazyinit thread not to wait between zeroing the inode table (except of cond_resched()) so this commit fixes that and removes the unnecessary condition. The 'n' should be also properly used on remount. When the n is not set at all, it means that the default miltiplier EXT4_DEF_LI_WAIT_MULT is set instead. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Eric Sandeen <sandeen@redhat.com>
* \|	ext4: Remove unnecessary wait_event ext4_run_lazyinit_thread()	Lukas Czerner	2011-05-20	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some reason we have been waiting for lazyinit thread to start in the ext4_run_lazyinit_thread() but it is not needed since it was jus unnecessary complexity, so get rid of it. We can also remove li_task and li_wait_task since it is not used anymore. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
* \|	ext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread	Lukas Czerner	2011-05-20	1	-25/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to make lazyinit eat approx. 10% of io bandwidth at max, we are sleeping between zeroing each single inode table. For that purpose we are using timer which wakes up thread when it expires. It is set via add_timer() and this may cause troubles in the case that thread has been woken up earlier and in next iteration we call add_timer() on still running timer hence hitting BUG_ON in add_timer(). We could fix that by using mod_timer() instead however we can use schedule_timeout_interruptible() for waiting and hence simplifying things a lot. This commit exchange the old "waiting mechanism" with simple schedule_timeout_interruptible(), setting the time to sleep. Hence we do not longer need li_wait_daemon waiting queue and others, so get rid of it. Addresses-Red-Hat-Bugzilla: #699708 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
* \|	ext4: don't warn about mnt_count if it has been disabled	Tao Ma	2011-05-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to zero, and mount it for the first time, we will get the warning: maximal mount count reached, running e2fsck is recommended It is really misleading. So change the check so that it won't warn in that case. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: fix oops in ext4_quota_off()	Amir Goldstein	2011-05-16	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If quota is not enabled when ext4_quota_off() is called, we must not dereference quota file inode since it is NULL. Check properly for this. This fixes a bug in commit 21f976975cbe (ext4: remove unnecessary [cm]time update of quota file), which was merged for 2.6.39-rc3. Reported-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: remove redundant #ifdef in super.c	Amerigo Wang	2011-05-09	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is already an #ifdef CONFIG_QUOTA some lines above, so this one is totally useless. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: remove redundant check for first_not_zeroed in ext4_register_li_request	Tao Ma	2011-05-09	1	-8/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have checked first_not_zeroed == ngroups already above, so remove this redundant check. sbi->s_li_request = NULL above is also removed since it is NULL already. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	ext4: check for ext[23] file system features when mounting as ext[23]	Theodore Ts'o	2011-04-18	1	-9/+65
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide better emulation for ext[23] mode by enforcing that the file system does not have any unsupported file system features as defined by ext[23] when emulating the ext[23] file system driver when CONFIG_EXT4_USE_FOR_EXT23 is defined. This causes the file system type information in /proc/mounts to be correct for the automatically mounted root file system. This also means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda contains an ext3 or ext4 file system, just as one would expect if the original ext2 file system driver were in use. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
*	Merge branch 'for_linus' of ↵	Linus Torvalds	2011-04-11	1	-16/+58
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix data corruption regression by reverting commit 6de9843dab3f ext4: Allow indirect-block file to grow the file size to max file size ext4: allow an active handle to be started when freezing ext4: sync the directory inode in ext4_sync_parent() ext4: init timer earlier to avoid a kernel panic in __save_error_info jbd2: fix potential memory leak on transaction commit ext4: fix a double free in ext4_register_li_request ext4: fix credits computing for indirect mapped files ext4: remove unnecessary [cm]time update of quota file jbd2: move bdget out of critical section
\| *	ext4: allow an active handle to be started when freezing	Yongqiang Yang	2011-04-10	1	-11/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ext4_journal_start_sb() should not prevent an active handle from being started due to s_frozen. Otherwise, deadlock is easy to happen, below is a situation. ================================================ freeze \| truncate ================================================ \| ext4_ext_truncate() freeze_super() \| starts a handle sets s_frozen \| \| ext4_ext_truncate() \| holds i_data_sem ext4_freeze() \| waits for updates \| \| ext4_free_blocks() \| calls dquot_free_block() \| \| dquot_free_blocks() \| calls ext4_dirty_inode() \| \| ext4_dirty_inode() \| trys to start an active \| handle \| \| block due to s_frozen ================================================ Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Amir Goldstein <amir73il@users.sf.net> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
\| *	ext4: init timer earlier to avoid a kernel panic in __save_error_info	Tao Ma	2011-04-05	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During mount, when we fail to open journal inode or root inode, the __save_error_info will mod_timer. But actually s_err_report isn't initialized yet and the kernel oops. The detailed information can be found https://bugzilla.kernel.org/show_bug.cgi?id=32082. The best way is to check whether the timer s_err_report is initialized or not. But it seems that in include/linux/timer.h, we can't find a good function to check the status of this timer, so this patch just move the initializtion of s_err_report earlier so that we can avoid the kernel panic. The corresponding del_timer is also added in the error path. Reported-by: Sami Liedes <sliedes@cc.hut.fi> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| *	ext4: fix a double free in ext4_register_li_request	Tao Ma	2011-04-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ext4_register_li_request, we malloc a ext4_li_request and inserts it into ext4_li_info->li_request_list. In case of any error later, we free it in the end. But if we have some error in ext4_run_lazyinit_thread, the whole li_request_list will be dropped and freed in it. So we will double free this ext4_li_request. This patch just sets elr to NULL after it is inserted to the list so that the latter kfree won't double free it. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
\| *	ext4: remove unnecessary [cm]time update of quota file	Jan Kara	2011-04-04	1	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is not necessary to update [cm]time of quota file on each quota file write and it wastes journal space and IO throughput with inode writes. So just remove the updating from ext4_quota_write() and only update times when quotas are being turned off. Userspace cannot get anything reliable from quota files while they are used by the kernel anyway. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* \|	Fix common misspellings	Lucas De Marchi	2011-03-31	1	-2/+2
\|/ \| \| \| \| \|	Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
*	Merge branch 'for_linus' of ↵	Linus Torvalds	2011-03-25	1	-21/+27
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits) ext4: fix a BUG in mb_mark_used during trim. ext4: unused variables cleanup in fs/ext4/extents.c ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep() ext4: add more tracepoints and use dev_t in the trace buffer ext4: don't kfree uninitialized s_group_info members ext4: add missing space in printk's in __ext4_grp_locked_error() ext4: add FITRIM to compat_ioctl. ext4: handle errors in ext4_clear_blocks() ext4: unify the ext4_handle_release_buffer() api ext4: handle errors in ext4_rename jbd2: add COW fields to struct jbd2_journal_handle jbd2: add the b_cow_tid field to journal_head struct ext4: Initialize fsync transaction ids in ext4_new_inode() ext4: Use single thread to perform DIO unwritten convertion ext4: optimize ext4_bio_write_page() when no extent conversion is needed ext4: skip orphan cleanup if fs has unknown ROCOMPAT features ext4: use the nblocks arg to ext4_truncate_restart_trans() ext4: fix missing iput of root inode for some mount error paths ext4: make FIEMAP and delayed allocation play well together ext4: suppress verbose debugging information if malloc-debug is off ... Fi up conflicts in fs/ext4/super.c due to workqueue changes
\| *	ext4: add missing space in printk's in __ext4_grp_locked_error()	Robin Dong	2011-03-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we do performence-testing on ext4 filesystem, we observed a warning like this: EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd instead, it should be "group 2598, 25901 blocks in bitmap, 26057 in gd" Reviewed-by: Coly Li <bosong.ly@taobao.com> Cc: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| *	ext4: Use single thread to perform DIO unwritten convertion	Mingming Cao	2011-03-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While running ext4 testing on multiple core, we found there are per cpu ext4-dio-unwritten threads processing conversion from unwritten extents to written for IOs completed from async direct IO patch. Per filesystem is enough, we don't need per cpu threads to work on conversion. Signed-off-by: Mingming Cao <cmm@us.ibm.com>