summaryrefslogtreecommitdiffstats
path: root/fs/ext4/super.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds2022-02-061-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Various bug fixes for ext4 fast commit and inline data handling. Also fix regression introduced as part of moving to the new mount API" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fs/ext4: fix comments mentioning i_mutex ext4: fix incorrect type issue during replay_del_range jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}() ext4: fix potential NULL pointer dereference in ext4_fill_super() jbd2: refactor wait logic for transaction updates into a common function jbd2: cleanup unused functions declarations from jbd2.h ext4: fix error handling in ext4_fc_record_modified_inode() ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin() ext4: fix error handling in ext4_restore_inline_data() ext4: fast commit may miss file actions ext4: fast commit may not fallback for ineligible commit ext4: modify the logic of ext4_mb_new_blocks_simple ext4: prevent used blocks from being allocated during fast commit replay
| * ext4: fix potential NULL pointer dereference in ext4_fill_super()Lukas Czerner2022-02-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By mistake we fail to return an error from ext4_fill_super() in case that ext4_alloc_sbi() fails to allocate a new sbi. Instead we just set the ret variable and allow the function to continue which will later lead to a NULL pointer dereference. Fix it by returning -ENOMEM in the case ext4_alloc_sbi() fails. Fixes: cebe85d570cf ("ext4: switch to the new mount api") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20220119130209.40112-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
| * ext4: fast commit may miss file actionsXin Yin2022-02-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in the follow scenario: 1. jbd start transaction n 2. task A get new handle for transaction n+1 3. task A do some actions and add inode to FC_Q_MAIN fc_q 4. jbd complete transaction n and clear FC_Q_MAIN fc_q 5. task A call fsync Fast commit will lost the file actions during a full commit. we should also add updates to staging queue during a full commit. and in ext4_fc_cleanup(), when reset a inode's fc track range, check it's i_sync_tid, if it bigger than current transaction tid, do not rest it, or we will lost the track range. And EXT4_MF_FC_COMMITTING is not needed anymore, so drop it. Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Link: https://lore.kernel.org/r/20220117093655.35160-3-yinxin.x@bytedance.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
| * ext4: fast commit may not fallback for ineligible commitXin Yin2022-02-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the follow scenario: 1. jbd start commit transaction n 2. task A get new handle for transaction n+1 3. task A do some ineligible actions and mark FC_INELIGIBLE 4. jbd complete transaction n and clean FC_INELIGIBLE 5. task A call fsync In this case fast commit will not fallback to full commit and transaction n+1 also not handled by jbd. Make ext4_fc_mark_ineligible() also record transaction tid for latest ineligible case, when call ext4_fc_cleanup() check current transaction tid, if small than latest ineligible tid do not clear the EXT4_MF_FC_INELIGIBLE. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: Ritesh Harjani <riteshh@linux.ibm.com> Suggested-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Link: https://lore.kernel.org/r/20220117093655.35160-2-yinxin.x@bytedance.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
* | Merge tag 'unicode-for-next-5.17-rc3' of ↵Linus Torvalds2022-02-011-5/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode Pull unicode cleanup from Gabriel Krisman Bertazi: "A fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the previous CONFIG_UNICODE. It is -rc material since we don't want to expose the former symbol on 5.17. This has been living on linux-next for the past week" * tag 'unicode-for-next-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode: unicode: clean up the Kconfig symbol confusion
| * | unicode: clean up the Kconfig symbol confusionChristoph Hellwig2022-01-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn the CONFIG_UNICODE symbol into a tristate that generates some always built in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol. Note that a lot of the IS_ENABLED() checks could be turned from cpp statements into normal ifs, but this change is intended to be fairly mechanic, so that should be cleaned up later. Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
* | | mm: remove cleancacheChristoph Hellwig2022-01-221-3/+0
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "remove Xen tmem leftovers". Since the removal of the Xen tmem driver in 2019, the cleancache hooks are entirely unused, as are large parts of frontswap. This series against linux-next (with the folio changes included) removes cleancaches, and cuts down frontswap to the bits actually used by zswap. This patch (of 13): The cleancache subsystem is unused since the removal of Xen tmem driver in commit 814bbf49dcd0 ("xen: remove tmem driver"). [akpm@linux-foundation.org: remove now-unreachable code] Link: https://lkml.kernel.org/r/20211224062246.1258487-1-hch@lst.de Link: https://lkml.kernel.org/r/20211224062246.1258487-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'unicode-for-next-5.17' of ↵Linus Torvalds2022-01-171-20/+19
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode Pull unicode updates from Gabriel Krisman Bertazi: "This includes patches from Christoph Hellwig to split the large data tables of the unicode subsystem into a loadable module, which allow users to not have them around if case-insensitive filesystems are not to be used. It also includes minor code fixes to unicode and its users, from the same author. All the patches here have been on linux-next releases for the past months" * tag 'unicode-for-next-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode: unicode: only export internal symbols for the selftests unicode: Add utf8-data module unicode: cache the normalization tables in struct unicode_map unicode: move utf8cursor to utf8-selftest.c unicode: simplify utf8len unicode: remove the unused utf8{,n}age{min,max} functions unicode: pass a UNICODE_AGE() tripple to utf8_load unicode: mark the version field in struct unicode_map unsigned unicode: remove the charset field from struct unicode_map f2fs: simplify f2fs_sb_read_encoding ext4: simplify ext4_sb_read_encoding
| * | unicode: pass a UNICODE_AGE() tripple to utf8_loadChristoph Hellwig2021-10-111-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Don't bother with pointless string parsing when the caller can just pass the version in the format that the core expects. Also remove the fallback to the latest version that none of the callers actually uses. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
| * | ext4: simplify ext4_sb_read_encodingChristoph Hellwig2021-10-111-14/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return the encoding table as the return value instead of as an argument, and don't bother with the encoding flags as the caller can handle that trivially. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
* | | Merge tag 'libnvdimm-for-5.17' of ↵Linus Torvalds2022-01-121-4/+7
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax and libnvdimm updates from Dan Williams: "The bulk of this is a rework of the dax_operations API after discovering the obstacles it posed to the work-in-progress DAX+reflink support for XFS and other copy-on-write filesystem mechanics. Primarily the need to plumb a block_device through the API to handle partition offsets was a sticking point and Christoph untangled that dependency in addition to other cleanups to make landing the DAX+reflink support easier. The DAX_PMEM_COMPAT option has been around for 4 years and not only are distributions shipping userspace that understand the current configuration API, but some are not even bothering to turn this option on anymore, so it seems a good time to remove it per the deprecation schedule. Recall that this was added after the device-dax subsystem moved from /sys/class/dax to /sys/bus/dax for its sysfs organization. All recent functionality depends on /sys/bus/dax. Some other miscellaneous cleanups and reflink prep patches are included as well. Summary: - Simplify the dax_operations API: - Eliminate bdev_dax_pgoff() in favor of the filesystem maintaining and applying a partition offset to all its DAX iomap operations. - Remove wrappers and device-mapper stacked callbacks for ->copy_from_iter() and ->copy_to_iter() in favor of moving block_device relative offset responsibility to the dax_direct_access() caller. - Remove the need for an @bdev in filesystem-DAX infrastructure - Remove unused uio helpers copy_from_iter_flushcache() and copy_mc_to_iter() as only the non-check_copy_size() versions are used for DAX. - Prepare XFS for the pending (next merge window) DAX+reflink support - Remove deprecated DEV_DAX_PMEM_COMPAT support - Cleanup a straggling misuse of the GUID api" * tag 'libnvdimm-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (38 commits) iomap: Fix error handling in iomap_zero_iter() ACPI: NFIT: Import GUID before use dax: remove the copy_from_iter and copy_to_iter methods dax: remove the DAXDEV_F_SYNC flag dax: simplify dax_synchronous and set_dax_synchronous uio: remove copy_from_iter_flushcache() and copy_mc_to_iter() iomap: turn the byte variable in iomap_zero_iter into a ssize_t memremap: remove support for external pgmap refcounts fsdax: don't require CONFIG_BLOCK iomap: build the block based code conditionally dax: fix up some of the block device related ifdefs fsdax: shift partition offset handling into the file systems dax: return the partition offset from fs_dax_get_by_bdev iomap: add a IOMAP_DAX flag xfs: pass the mapping flags to xfs_bmbt_to_iomap xfs: use xfs_direct_write_iomap_ops for DAX zeroing xfs: move dax device handling into xfs_{alloc,free}_buftarg ext4: cleanup the dax handling in ext4_fill_super ext2: cleanup the dax handling in ext2_fill_super fsdax: decouple zeroing from the iomap buffered I/O code ...
| * | dax: return the partition offset from fs_dax_get_by_bdevChristoph Hellwig2021-12-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the removal of the block_device from the DAX I/O path by returning the partition offset from fs_dax_get_by_bdev so that the file systems have it at hand for use during I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-26-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | ext4: cleanup the dax handling in ext4_fill_superChristoph Hellwig2021-12-041-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only call fs_dax_get_by_bdev once the sbi has been allocated and remove the need for the dax_dev local variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-21-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * | dax: remove dax_capableChristoph Hellwig2021-12-041-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just open code the block size and dax_dev == NULL checks in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> [erofs] Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20211129102203.2243509-9-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | ext4: fix a copy and paste typoDan Carpenter2022-01-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was obviously supposed to be an ext4 struct, not xfs. GCC doesn't care either way so it doesn't affect the build or runtime. Fixes: cebe85d570cf ("ext4: switch to the new mount api") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20211215114309.GB14552@kili Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: implement support for get/set fs labelLukas Czerner2022-01-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for FS_IOC_GETFSLABEL and FS_IOC_SETFSLABEL ioctls for online reading and setting of file system label. ext4_ioctl_getlabel() is simple, just get the label from the primary superblock. This might not be the first sb on the file system if 'sb=' mount option is used. In ext4_ioctl_setlabel() we update what ext4 currently views as a primary superblock and then proceed to update backup superblocks. There are two caveats: - the primary superblock might not be the first superblock and so it might not be the one used by userspace tools if read directly off the disk. - because the primary superblock might not be the first superblock we potentialy have to update it as part of backup superblock update. However the first sb location is a bit more complicated than the rest so we have to account for that. The superblock modification is created generic enough so the infrastructure can be used for other potential superblock modification operations, such as chaning UUID. Tested with generic/492 with various configurations. I also checked the behavior with 'sb=' mount options, including very large file systems with and without sparse_super/sparse_super2. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20211213135618.43303-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: only set EXT4_MOUNT_QUOTA when journalled quota file is specifiedLukas Czerner2022-01-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only set EXT4_MOUNT_QUOTA when journalled quota file is specified, otherwise simply disabling specific quota type (usrjquota=) will also set the EXT4_MOUNT_QUOTA super block option. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Fixes: e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()") Link: https://lore.kernel.org/r/20220104143518.134465-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: don't use kfree() on rcu protected pointer sbi->s_qf_namesLukas Czerner2022-01-101-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During ext4 mount api rework the commit e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()") introduced a bug where we would kfree(sbi->s_qf_names[i]) before assigning the new quota name in ext4_apply_quota_options(). This is wrong because we're using kfree() on rcu prointer that could be simultaneously accessed from ext4_show_quota_options() during remount. Fix it by using rcu_replace_pointer() to replace the old qname with the new one and then kfree_rcu() the old quota name. Also use get_qf_name() instead of sbi->s_qf_names in strcmp() to silence the sparse warning. Fixes: e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20220104143518.134465-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: make sure to reset inode lockdep class when quota enabling failsJan Kara2022-01-101-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we succeed in enabling some quota type but fail to enable another one with quota feature, we correctly disable all enabled quota types. However we forget to reset i_data_sem lockdep class. When the inode gets freed and reused, it will inherit this lockdep class (i_data_sem is initialized only when a slab is created) and thus eventually lockdep barfs about possible deadlocks. Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: make sure quota gets properly shutdown on errorJan Kara2022-01-101-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we hit an error when enabling quotas and setting inode flags, we do not properly shutdown quota subsystem despite returning error from Q_QUOTAON quotactl. This can lead to some odd situations like kernel using quota file while it is still writeable for userspace. Make sure we properly cleanup the quota subsystem in case of error. Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Fix BUG_ON in ext4_bread when write quota dataYe Bin2022-01-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We got issue as follows when run syzkaller: [ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user [ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only [ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0' [ 167.983601] ------------[ cut here ]------------ [ 167.984245] kernel BUG at fs/ext4/inode.c:847! [ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 [ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 [ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 [ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 [ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 [ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 [ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 [ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 [ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 [ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 [ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 [ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 168.001899] Call Trace: [ 168.002235] <TASK> [ 168.007167] ext4_bread+0xd/0x53 [ 168.007612] ext4_quota_write+0x20c/0x5c0 [ 168.010457] write_blk+0x100/0x220 [ 168.010944] remove_free_dqentry+0x1c6/0x440 [ 168.011525] free_dqentry.isra.0+0x565/0x830 [ 168.012133] remove_tree+0x318/0x6d0 [ 168.014744] remove_tree+0x1eb/0x6d0 [ 168.017346] remove_tree+0x1eb/0x6d0 [ 168.019969] remove_tree+0x1eb/0x6d0 [ 168.022128] qtree_release_dquot+0x291/0x340 [ 168.023297] v2_release_dquot+0xce/0x120 [ 168.023847] dquot_release+0x197/0x3e0 [ 168.024358] ext4_release_dquot+0x22a/0x2d0 [ 168.024932] dqput.part.0+0x1c9/0x900 [ 168.025430] __dquot_drop+0x120/0x190 [ 168.025942] ext4_clear_inode+0x86/0x220 [ 168.026472] ext4_evict_inode+0x9e8/0xa22 [ 168.028200] evict+0x29e/0x4f0 [ 168.028625] dispose_list+0x102/0x1f0 [ 168.029148] evict_inodes+0x2c1/0x3e0 [ 168.030188] generic_shutdown_super+0xa4/0x3b0 [ 168.030817] kill_block_super+0x95/0xd0 [ 168.031360] deactivate_locked_super+0x85/0xd0 [ 168.031977] cleanup_mnt+0x2bc/0x480 [ 168.033062] task_work_run+0xd1/0x170 [ 168.033565] do_exit+0xa4f/0x2b50 [ 168.037155] do_group_exit+0xef/0x2d0 [ 168.037666] __x64_sys_exit_group+0x3a/0x50 [ 168.038237] do_syscall_64+0x3b/0x90 [ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae In order to reproduce this problem, the following conditions need to be met: 1. Ext4 filesystem with no journal; 2. Filesystem image with incorrect quota data; 3. Abort filesystem forced by user; 4. umount filesystem; As in ext4_quota_write: ... if (EXT4_SB(sb)->s_journal && !handle) { ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" " cancelled because transaction is not started", (unsigned long long)off, (unsigned long long)len); return -EIO; } ... We only check handle if NULL when filesystem has journal. There is need check handle if NULL even when filesystem has no journal. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
* | | ext4: destroy ext4_fc_dentry_cachep kmemcache on module removalSebastian Andrzej Siewior2022-01-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kmemcache for ext4_fc_dentry_cachep remains registered after module removal. Destroy ext4_fc_dentry_cachep kmemcache on module removal. Fixes: aa75f4d3daaeb ("ext4: main fast-commit commit path") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211110134640.lyku5vklvdndw6uk@linutronix.de Link: https://lore.kernel.org/r/YbiK3JetFFl08bd7@linutronix.de Link: https://lore.kernel.org/r/20211223164436.2628390-1-bigeasy@linutronix.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
* | | ext4: drop ineligible txn start stop APIsHarshad Shirwadkar2021-12-231-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch drops ext4_fc_start_ineligible() and ext4_fc_stop_ineligible() APIs. Fast commit ineligible transactions should simply call ext4_fc_mark_ineligible() after starting the trasaction. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211223202140.2061101-3-harshads@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: fix i_version handling on remountLukas Czerner2021-12-231-25/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i_version mount option is getting lost on remount. This is because the 'i_version' mount option differs from the util-linux mount option 'iversion', but it has exactly the same functionality. We have to specifically notify the vfs that this is what we want by setting appropriate flag in fc->sb_flags. Fix it and as a result we can remove *flags argument from __ext4_remount(); do the same for __ext4_fill_super(). In addition set out to deprecate ext4 specific 'i_version' mount option in favor or 'iversion' by kernel version 5.20. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Fixes: cebe85d570cf ("ext4: switch to the new mount api") Link: https://lore.kernel.org/r/20211222104517.11187-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: remove lazytime/nolazytime mount options handled by MS_LAZYTIMELukas Czerner2021-12-231-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lazytime and nolazytime mount options were added temporarily back in 2015 with commit a26f49926da9 ("ext4: add optimization for the lazytime mount option"). It think it has been enough time for the util-linux with lazytime support to get widely used. Remove the mount options. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20211222104517.11187-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: don't fail remount if journalling mode didn't changeLukas Czerner2021-12-231-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switching to the new mount api introduced inconsistency in how the journalling mode mount option (data=) is handled during a remount. Ext4 always prevented changing the journalling mode during the remount, however the new code always fails the remount when the journalling mode is specified, even if it remains unchanged. Fix it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: cebe85d570cf ("ext4: switch to the new mount api") Link: https://lore.kernel.org/r/20211220152657.101599-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Remove unused match_table_t tokensLukas Czerner2021-12-091-243/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unused match_table_t, slim down mount_opts structure by removing unnecessary definitions, remove redundant MOPT_ flags and clean up ext4_parse_param() by converting the most of the if/else branching to switch except for the MOPT_SET/MOPT_CEAR handling. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-14-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: switch to the new mount apiLukas Czerner2021-12-091-109/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the necessary functions for the fs_context_operations. Convert and rename ext4_remount() and ext4_fill_super() to ext4_get_tree() and ext4_reconfigure() respectively and switch the ext4 to use the new api. One user facing change is the fact that we no longer have access to the entire string of mount options provided by mount(2) since the mount api does not store it anywhere. As a result we can't print the options to the log as we did in the past after the successful mount. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-13-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: change token2str() to use ext4_param_specsLukas Czerner2021-12-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change token2str() to use ext4_param_specs instead of tokens so that we can get rid of tokens entirely. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-12-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: clean up return values in handle_mount_opt()Lukas Czerner2021-12-091-11/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up return values in handle_mount_opt() and rename the function to ext4_parse_param() Now we can use it in fs_context_operations as .parse_param. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-11-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Completely separate options parsing and sb setupLukas Czerner2021-12-091-135/+264
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new mount api separates option parsing and super block setup into two distinct steps and so we need to separate the options parsing out of the ext4_fill_super() and ext4_remount(). In order to achieve this we have to create new ext4_fill_super() and ext4_remount() functions which will serve its purpose only until we actually do convert to the new api (as such they are only temporary for this patch series) and move the option parsing out of the old function which will now be renamed to __ext4_fill_super() and __ext4_remount(). There is a small complication in the fact that while the mount option parsing is going to happen before we get to __ext4_fill_super(), the mount options stored in the super block itself needs to be applied first, before the user specified mount options. So with this patch we're going through the following sequence: - parse user provided options (including sb block) - initialize sbi and store s_sb_block if provided - in __ext4_fill_super() - read the super block - parse and apply options specified in s_mount_opts - check and apply user provided options stored in ctx - continue with the regular ext4_fill_super operation It's not exactly the most elegant solution, but if we still want to support s_mount_opts we have to do it in this order. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-10-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: get rid of super block and sbi from handle_mount_ops()Lukas Czerner2021-12-091-173/+368
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the parsing phase of mount in the new mount api sb will not be available. We've already removed some uses of sb and sbi, but now we need to get rid of the rest of it. Use ext4_fs_context to store all of the configuration specification so that it can be later applied to the super block and sbi. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-9-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: check ext2/3 compatibility outside handle_mount_opt()Lukas Czerner2021-12-091-16/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the parsing phase of mount in the new mount api sb will not be available so move ext2/3 compatibility check outside handle_mount_opt(). Unfortunately we will lose the ability to show exactly which option is not compatible. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-8-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: move quota configuration out of handle_mount_opt()Lukas Czerner2021-12-091-93/+165
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the parsing phase of mount in the new mount api sb will not be available so move quota confiquration out of handle_mount_opt() by noting the quota file names in the ext4_fs_context structure to be able to apply it later. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-7-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Allow sb to be NULL in ext4_msg()Lukas Czerner2021-12-091-66/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the parsing phase of mount in the new mount api sb will not be available so allow sb to be NULL in ext4_msg and use that in handle_mount_opt(). Also change return value to appropriate -EINVAL where needed. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-6-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Change handle_mount_opt() to use fs_parameterLukas Czerner2021-12-091-107/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new mount option specifications to parse the options in handle_mount_opt(). However we're still using the old API to get the options string. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-5-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: move option validation to a separate functionLukas Czerner2021-12-091-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move option validation out of parse_options() into a separate function ext4_validate_options(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-4-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | ext4: Add fs parameter specifications for mount optionsLukas Czerner2021-12-091-0/+151
|/ / | | | | | | | | | | | | | | | | | | | | Create an array of fs_parameter_spec called ext4_param_specs to hold the mount option specifications we're going to be using with the new mount api. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-3-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2021-11-101-7/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Only bug fixes and cleanups for ext4 this merge window. Of note are fixes for the combination of the inline_data and fast_commit fixes, and more accurately calculating when to schedule additional lazy inode table init, especially when CONFIG_HZ is 100HZ" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix error code saved on super block during file system abort ext4: inline data inode fast commit replay fixes ext4: commit inline data during fast commit ext4: scope ret locally in ext4_try_to_trim_range() ext4: remove an unused variable warning with CONFIG_QUOTA=n ext4: fix boolreturn.cocci warnings in fs/ext4/name.c ext4: prevent getting empty inode buffer ext4: move ext4_fill_raw_inode() related functions ext4: factor out ext4_fill_raw_inode() ext4: prevent partial update of the extent blocks ext4: check for inconsistent extents between index and leaf block ext4: check for out-of-order index extents in ext4_valid_extent_entries() ext4: convert from atomic_t to refcount_t on ext4_io_end->count ext4: refresh the ext4_ext_path struct after dropping i_data_sem. ext4: ensure enough credits in ext4_ext_shift_path_extents ext4: correct the left/middle/right debug message for binsearch ext4: fix lazy initialization next schedule time computation in more granular unit Revert "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
| * | ext4: fix error code saved on super block during file system abortGabriel Krisman Bertazi2021-11-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ext4_abort will eventually call ext4_errno_to_code, which translates the errno to an EXT4_ERR specific error. This means that ext4_abort expects an errno. By using EXT4_ERR_ here, it gets misinterpreted (as an errno), and ends up saving EXT4_ERR_EBUSY on the superblock during an abort, which makes no sense. ESHUTDOWN will get properly translated to EXT4_ERR_SHUTDOWN, so use that instead. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Link: https://lore.kernel.org/r/20211026173302.84000-1-krisman@collabora.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * | ext4: remove an unused variable warning with CONFIG_QUOTA=nAustin Kim2021-11-041-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'enable_quota' variable is only used in an CONFIG_QUOTA. With CONFIG_QUOTA=n, compiler causes a harmless warning: fs/ext4/super.c: In function ‘ext4_remount’: fs/ext4/super.c:5840:6: warning: variable ‘enable_quota’ set but not used [-Wunused-but-set-variable] int enable_quota = 0; ^~~~~ Move 'enable_quota' into the same #ifdef CONFIG_QUOTA block to remove an unused variable warning. Signed-off-by: Austin Kim <austindh.kim@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210824034929.GA13415@raspberrypi Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * | ext4: fix lazy initialization next schedule time computation in more ↵Shaoying Xu2021-11-041-5/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | granular unit Ext4 file system has default lazy inode table initialization setup once it is mounted. However, it has issue on computing the next schedule time that makes the timeout same amount in jiffies but different real time in secs if with various HZ values. Therefore, fix by measuring the current time in a more granular unit nanoseconds and make the next schedule time independent of the HZ value. Fixes: bfff68738f1c ("ext4: add support for lazy inode table initialization") Signed-off-by: Shaoying Xu <shaoyi@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210902164412.9994-2-shaoyi@amazon.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | Merge tag 'fsnotify_for_v5.16-rc1' of ↵Linus Torvalds2021-11-061-0/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "Support for reporting filesystem errors through fanotify so that system health monitoring daemons can watch for these and act instead of scraping system logs" * tag 'fsnotify_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (34 commits) samples: remove duplicate include in fs-monitor.c samples: Fix warning in fsnotify sample docs: Fix formatting of literal sections in fanotify docs samples: Make fs-monitor depend on libc and headers docs: Document the FAN_FS_ERROR event samples: Add fs error monitoring example ext4: Send notifications on error fanotify: Allow users to request FAN_FS_ERROR events fanotify: Emit generic error info for error event fanotify: Report fid info for file related file system errors fanotify: WARN_ON against too large file handles fanotify: Add helpers to decide whether to report FID/DFID fanotify: Wrap object_fh inline space in a creator macro fanotify: Support merging of error events fanotify: Support enqueueing of error events fanotify: Pre-allocate pool of error events fanotify: Reserve UAPI bits for FAN_FS_ERROR fsnotify: Support FS_ERROR event type fanotify: Require fid_mode for any non-fd event fanotify: Encode empty file handle when no inode is provided ...
| * | ext4: Send notifications on errorGabriel Krisman Bertazi2021-10-271-0/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | | | Send a FS_ERROR message via fsnotify to a userspace monitoring tool whenever a ext4 error condition is triggered. This follows the existing error conditions in ext4, so it is hooked to the ext4_error* functions. Link: https://lore.kernel.org/r/20211025192746.66445-30-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz>
* | Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds2021-11-011-1/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull fscrypt updates from Eric Biggers: "Some cleanups for fs/crypto/: - Allow 256-bit master keys with AES-256-XTS - Improve documentation and comments - Remove unneeded field fscrypt_operations::max_namelen" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: improve a few comments fscrypt: allow 256-bit master keys with AES-256-XTS fscrypt: improve documentation for inline encryption fscrypt: clean up comments in bio.c fscrypt: remove fscrypt_operations::max_namelen
| * | fscrypt: remove fscrypt_operations::max_namelenEric Biggers2021-09-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | The max_namelen field is unnecessary, as it is set to 255 (NAME_MAX) on all filesystems that support fscrypt (or plan to support fscrypt). For simplicity, just use NAME_MAX directly instead. Link: https://lore.kernel.org/r/20210909184513.139281-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
* | | ext4: use sb_bdev_nr_blocksChristoph Hellwig2021-10-181-1/+1
| |/ |/| | | | | | | | | | | | | | | | | | | | | Use the sb_bdev_nr_blocks helper instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20211018101130.1838532-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds2021-10-031-6/+15
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix a number of ext4 bugs in fast_commit, inline data, and delayed allocation. Also fix error handling code paths in ext4_dx_readdir() and ext4_fill_super(). Finally, avoid a grabbing a journal head in the delayed allocation write in the common cases where we are overwriting a pre-existing block or appending to an inode" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: recheck buffer uptodate bit under buffer lock ext4: fix potential infinite loop in ext4_dx_readdir() ext4: flush s_error_work before journal destroy in ext4_fill_super ext4: fix loff_t overflow in ext4_max_bitmap_size() ext4: fix reserved space counter leakage ext4: limit the number of blocks in one ADD_RANGE TLV ext4: enforce buffer head state assertion in ext4_da_map_blocks ext4: remove extent cache entries when truncating inline data ext4: drop unnecessary journal handle in delalloc write ext4: factor out write end code of inline file ext4: correct the error path of ext4_write_inline_data_end() ext4: check and update i_disksize properly ext4: add error checking to ext4_ext_replay_set_iblocks()
| * ext4: flush s_error_work before journal destroy in ext4_fill_superyangerkun2021-10-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The error path in ext4_fill_super forget to flush s_error_work before journal destroy, and it may trigger the follow bug since flush_stashed_error_work can run concurrently with journal destroy without any protection for sbi->s_journal. [32031.740193] EXT4-fs (loop66): get root inode failed [32031.740484] EXT4-fs (loop66): mount failed [32031.759805] ------------[ cut here ]------------ [32031.759807] kernel BUG at fs/jbd2/transaction.c:373! [32031.760075] invalid opcode: 0000 [#1] SMP PTI [32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded 4.18.0 [32031.765112] Call Trace: [32031.765375] ? __switch_to_asm+0x35/0x70 [32031.765635] ? __switch_to_asm+0x41/0x70 [32031.765893] ? __switch_to_asm+0x35/0x70 [32031.766148] ? __switch_to_asm+0x41/0x70 [32031.766405] ? _cond_resched+0x15/0x40 [32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2] [32031.766934] jbd2_journal_start+0x19/0x20 [jbd2] [32031.767218] flush_stashed_error_work+0x30/0x90 [ext4] [32031.767487] process_one_work+0x195/0x390 [32031.767747] worker_thread+0x30/0x390 [32031.768007] ? process_one_work+0x390/0x390 [32031.768265] kthread+0x10d/0x130 [32031.768521] ? kthread_flush_work_fn+0x10/0x10 [32031.768778] ret_from_fork+0x35/0x40 static int start_this_handle(...) BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this Besides, after we enable fast commit, ext4_fc_replay can add work to s_error_work but return success, so the latter journal destroy in ext4_load_journal can trigger this problem too. Fix this problem with two steps: 1. Call ext4_commit_super directly in ext4_handle_error for the case that called from ext4_fc_replay 2. Since it's hard to pair the init and flush for s_error_work, we'd better add a extras flush_work before journal destroy in ext4_fill_super Besides, this patch will call ext4_commit_super in ext4_handle_error for any nojournal case too. But it seems safe since the reason we call schedule_work was that we should save error info to sb through journal if available. Conversely, for the nojournal case, it seems useless delay commit superblock to s_error_work. Fixes: c92dc856848f ("ext4: defer saving error info from atomic context") Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") Cc: stable@kernel.org Signed-off-by: yangerkun <yangerkun@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.com
| * ext4: fix loff_t overflow in ext4_max_bitmap_size()Ritesh Harjani2021-10-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should use unsigned long long rather than loff_t to avoid overflow in ext4_max_bitmap_size() for comparison before returning. w/o this patch sbi->s_bitmap_maxbytes was becoming a negative value due to overflow of upper_limit (with has_huge_files as true) Below is a quick test to trigger it on a 64KB pagesize system. sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2 sudo mount /dev/loop2 /mnt sudo echo "hello" > /mnt/hello -> This will error out with "echo: write error: File too large" Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.1622867594.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>