| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 095680f24f2673d860fd1d3d2f54f40f330b4c63 ]
This patch fixes losing lazytime when remounting f2fs.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 61f7725aa148ee870436a29d3a24d5c00ab7e9af ]
This fixes overriding error number in f2fs_gc.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4a1728cad6340bfbe17bd17fd158b2165cd99508 ]
Mark inode dirty explicitly in the end of recover_inode() to make sure
that all recoverable fields can be persisted later.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit f4474aa6e5e901ee4af21f39f1b9115aaaaec503 ]
Testcase to reproduce this bug:
1. mkfs.f2fs -O extra_attr -O project_quota /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. touch /mnt/f2fs/file
4. sync
5. chattr -p 1 /mnt/f2fs/file
6. xfs_io -f /mnt/f2fs/file -c "fsync"
7. godown /mnt/f2fs
8. umount /mnt/f2fs
9. mount -t f2fs /dev/sdd /mnt/f2fs
10. lsattr -p /mnt/f2fs/file
0 -----------------N- /mnt/f2fs/file
But actually, we expect the correct result is:
1 -----------------N- /mnt/f2fs/file
The reason is we didn't recover inode.i_projid field during mount,
fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0a4daae5ffea39f5015334e4d18a6a80b447cae4 ]
This is related to
ee70daaba82d ("xfs: update i_size after unwritten conversion in dio completion")
If we update i_size during dio_write, dio_read can read out stale data, which
breaks xfstests/465.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit c6b1867b1da3b1203b4c49988afeebdcbdf65499 ]
Currently we show mount option "io_bits=%u" as "io_size=%uKB",
it will cause option parsing problem(unrecognized mount option)
in remount.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit dc4cd1257c86451cec3e8e352cc376348e4f4af4 ]
Step to reproduce this bug:
1. logon as root
2. mount -t f2fs /dev/sdd /mnt;
3. touch /mnt/file;
4. chown system /mnt/file; chgrp system /mnt/file;
5. xfs_io -f /mnt/file -c "fsync";
6. godown /mnt;
7. umount /mnt;
8. mount -t f2fs /dev/sdd /mnt;
After step 8) we will expect file's uid/gid are all system, but during
recovery, these two fields were not been recovered, fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit f84262b0862d43b71b3e80a036cdd9d82e620367 ]
If we have an error in f2fs_build_free_nids, we're able to fall into a loop
to find free nids.
Suggested-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit c8e927579e00a182eda07e4c45df9c8c699c8ded ]
Currently, project quota could be changed by fssetxattr
ioctl, and existed permission check inode_owner_or_capable()
is obviously not enough, just think that common users could
change project id of file, that could make users to
break project quota easily.
This patch try to follow same regular of xfs project
quota:
"Project Quota ID state is only allowed to change from
within the init namespace. Enforce that restriction only
if we are trying to change the quota ID state.
Everything else is allowed in user namespaces."
Besides that, check and set project id'state should
be an atomic operation, protect whole operation with
inode lock.
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4a70e255449c9a13eed7a6eeecc85a1ea63cef76 ]
In fill_super -> init_percpu_info, we should destroy percpu counter
in error path, otherwise memory allcoated for percpu counter will
leak.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0b2103e886e6de9802e1170e57c573443286a483 ]
It needs to release memory allocated for sbi->write_io in error path,
otherwise, it will cause memory leak.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 5ce805869cbed93267ed26552ff76e30f05c91f7 ]
Sometimes, some merged IOs could get a chance to be submitted, resulting in
system hang in shutdown test. This issues IOs all the time after shutdown.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0ded69f632bb717be9aeea3ae74e29050fcb060c ]
1. Create a file in an encrypted directory
2. Do GC & drop caches
3. Read stale data before its bio for metapage was not issued yet
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0c093b590efb5c1ccdc835868dc2ae94bd2e14dc ]
Testcase to reproduce this bug:
1. mkfs.f2fs /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. touch /mnt/f2fs/file
4. sync
5. chattr +a /mnt/f2fs/file
6. xfs_io -a /mnt/f2fs/file -c "fsync"
7. godown /mnt/f2fs
8. umount /mnt/f2fs
9. mount -t f2fs /dev/sdd /mnt/f2fs
10. xfs_io /mnt/f2fs/file
There is no error when opening this file w/o O_APPEND, but actually,
we expect the correct result should be:
/mnt/f2fs/file: Operation not permitted
The root cause is, in recover_inode(), we recover inode->i_flags more
than F2FS_I(inode)->i_flags, so fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 7de36cf3e4087207f42a88992f8cb615a1bd902e ]
inode.i_gc_failures is used to indicate that skip count of migrating
on blocks of inode, we should guarantee it can be recovered in sudden
power-off case.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 0e0667b625cf64243df83171bff61f9d350b9ca5 ]
After quota_off, we'll get some dirty blocks. If put_super don't have a chance
to flush them by checkpoint, it causes NULL pointer exception in end_io after
iput(node_inode). (e.g., by checkpoint=disable)
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 38fb6d0ea34299d97b031ed64fe994158b6f8eb3 ]
The kernel mount_block_root() function expects -EACESS or -EINVAL for a
unmountable filesystem when trying to mount the root with different
filesystem types.
However, in 5.3-rc1 the behavior when F2FS code cannot find valid block
changed to return -EFSCORRUPTED(-EUCLEAN), and this error code makes
mount_block_root() fail when trying to probe F2FS.
When the magic number of the superblock mismatches, it has a high
probability that it's just not a F2FS. In this case return -EINVAL seems
to be a better result, and this return value can make mount_block_root()
probing work again.
Return -EINVAL when the superblock has magic mismatch, -EFSCORRUPTED in
other cases (the magic matches but the superblock cannot be recognized).
Fixes: 10f966bbf521 ("f2fs: use generic EFSBADCRC/EFSCORRUPTED")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 10f966bbf521bb9b2e497bbca496a5141f4071d0 ]
f2fs uses EFAULT as error number to indicate filesystem is corrupted
all the time, but generic filesystems use EUCLEAN for such condition,
we need to change to follow others.
This patch adds two new macros as below to wrap more generic error
code macros, and spread them in code.
EFSBADCRC EBADMSG /* Bad CRC detected */
EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit c854f4d681365498f53ba07843a16423625aa7e9 ]
As Jungyeon Reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203233
- Reproduces
gcc poc_13.c
./run.sh f2fs
- Kernel messages
F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
kernel BUG at fs/f2fs/segment.c:2133!
RIP: 0010:update_sit_entry+0x35d/0x3e0
Call Trace:
f2fs_allocate_data_block+0x16c/0x5a0
do_write_page+0x57/0x100
f2fs_do_write_node_page+0x33/0xa0
__write_node_page+0x270/0x4e0
f2fs_sync_node_pages+0x5df/0x670
f2fs_write_checkpoint+0x364/0x13a0
f2fs_sync_fs+0xa3/0x130
f2fs_do_sync_file+0x1a6/0x810
do_fsync+0x33/0x60
__x64_sys_fsync+0xb/0x10
do_syscall_64+0x43/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The testcase fails because that, in fuzzed image, current segment was
allocated with LFS type, its .next_blkoff should point to an unused
block address, but actually, its bitmap shows it's not. So during
allocation, f2fs crash when setting bitmap.
Introducing sanity_check_curseg() to check such inconsistence of
current in-used segment.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit a37d0862d17411edb67677a580a6f505ec2225f6 ]
As Pavel Machek reported:
"We normally use -EUCLEAN to signal filesystem corruption. Plus, it is
good idea to report it to the syslog and mark filesystem as "needing
fsck" if filesystem can do that."
Still we need improve the original patch with:
- use unlikely keyword
- add message print
- return EUCLEAN
However, after rethink this patch, I don't think we should add such
condition check here as below reasons:
- We have already checked the field in f2fs_sanity_check_ckpt(),
- If there is fs corrupt or security vulnerability, there is nothing
to guarantee the field is integrated after the check, unless we do
the check before each of its use, however no filesystem does that.
- We only have similar check for bitmap, which was added due to there
is bitmap corruption happened on f2fs' runtime in product.
- There are so many key fields in SB/CP/NAT did have such check
after f2fs_sanity_check_{sb,cp,..}.
So I propose to revert this unneeded check.
This reverts commit 56f3ce675103e3fb9e631cfb4131fc768bc23e9a.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 1166c1f2f69117ad254189ca781287afa6e550b6 ]
As a part of the sanity checking while mounting, distinct segment number
assignment to data and node segments is verified. Fixing a small bug in
this verification between node and data segments. We need to check all
the data segments with all the node segments.
Fixes: 042be0f849e5f ("f2fs: fix to do sanity check with current segment number")
Signed-off-by: Surbhi Palande <csurbhi@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 56f3ce675103e3fb9e631cfb4131fc768bc23e9a ]
blkoff_off might over 512 due to fs corrupt or security
vulnerability. That should be checked before being using.
Use ENTRIES_IN_SUM to protect invalid value in cur_data_blkoff.
Signed-off-by: Ocean Chen <oceanchen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 7c77bf7de1574ac7a31a2b76f4927404307d13e7 ]
This fixes wrong access of address spaces of node and meta inodes after iput.
Fixes: 60aa4d5536ab ("f2fs: fix use-after-free issue when accessing sbi->stat_info")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 2777e654371dd4207a3a7f4fb5fa39550053a080 ]
When we traverse xattr entries via __find_xattr(),
if the raw filesystem content is faked or any hardware failure occurs,
out-of-bound error can be detected by KASAN.
Fix the issue by introducing boundary check.
[ 38.402878] c7 1827 BUG: KASAN: slab-out-of-bounds in f2fs_getxattr+0x518/0x68c
[ 38.402891] c7 1827 Read of size 4 at addr ffffffc0b6fb35dc by task
[ 38.402935] c7 1827 Call trace:
[ 38.402952] c7 1827 [<ffffff900809003c>] dump_backtrace+0x0/0x6bc
[ 38.402966] c7 1827 [<ffffff9008090030>] show_stack+0x20/0x2c
[ 38.402981] c7 1827 [<ffffff900871ab10>] dump_stack+0xfc/0x140
[ 38.402995] c7 1827 [<ffffff9008325c40>] print_address_description+0x80/0x2d8
[ 38.403009] c7 1827 [<ffffff900832629c>] kasan_report_error+0x198/0x1fc
[ 38.403022] c7 1827 [<ffffff9008326104>] kasan_report_error+0x0/0x1fc
[ 38.403037] c7 1827 [<ffffff9008325000>] __asan_load4+0x1b0/0x1b8
[ 38.403051] c7 1827 [<ffffff90085fcc44>] f2fs_getxattr+0x518/0x68c
[ 38.403066] c7 1827 [<ffffff90085fc508>] f2fs_xattr_generic_get+0xb0/0xd0
[ 38.403080] c7 1827 [<ffffff9008395708>] __vfs_getxattr+0x1f4/0x1fc
[ 38.403096] c7 1827 [<ffffff9008621bd0>] inode_doinit_with_dentry+0x360/0x938
[ 38.403109] c7 1827 [<ffffff900862d6cc>] selinux_d_instantiate+0x2c/0x38
[ 38.403123] c7 1827 [<ffffff900861b018>] security_d_instantiate+0x68/0x98
[ 38.403136] c7 1827 [<ffffff9008377db8>] d_splice_alias+0x58/0x348
[ 38.403149] c7 1827 [<ffffff900858d16c>] f2fs_lookup+0x608/0x774
[ 38.403163] c7 1827 [<ffffff900835eacc>] lookup_slow+0x1e0/0x2cc
[ 38.403177] c7 1827 [<ffffff9008367fe0>] walk_component+0x160/0x520
[ 38.403190] c7 1827 [<ffffff9008369ef4>] path_lookupat+0x110/0x2b4
[ 38.403203] c7 1827 [<ffffff900835dd38>] filename_lookup+0x1d8/0x3a8
[ 38.403216] c7 1827 [<ffffff900835eeb0>] user_path_at_empty+0x54/0x68
[ 38.403229] c7 1827 [<ffffff9008395f44>] SyS_getxattr+0xb4/0x18c
[ 38.403241] c7 1827 [<ffffff9008084200>] el0_svc_naked+0x34/0x38
Signed-off-by: Randall Huang <huangrandall@google.com>
[Jaegeuk Kim: Fix wrong ending boundary]
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit b42b179bda9ff11075a6fc2bac4d9e400513679a ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203221
- Overview
When mounting the attached crafted image and running program, this error is reported.
The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
- Reproduces
cc poc_07.c
mkdir test
mount -t f2fs tmp.img test
cp a.out test
cd test
sudo ./a.out
- Messages
kernel BUG at fs/f2fs/node.c:1279!
RIP: 0010:read_node_page+0xcf/0xf0
Call Trace:
__get_node_page+0x6b/0x2f0
f2fs_iget+0x8f/0xdf0
f2fs_lookup+0x136/0x320
__lookup_slow+0x92/0x140
lookup_slow+0x30/0x50
walk_component+0x1c1/0x350
path_lookupat+0x62/0x200
filename_lookup+0xb3/0x1a0
do_fchmodat+0x3e/0xa0
__x64_sys_chmod+0x12/0x20
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
On below paths, we can have opportunity to readahead inode page
- gc_node_segment -> f2fs_ra_node_page
- gc_data_segment -> f2fs_ra_node_page
- f2fs_fill_dentries -> f2fs_ra_node_page
Unlike synchronized read, on readahead path, we can set page uptodate
before verifying page's checksum, then read_node_page() will trigger
kernel panic once it encounters a uptodated page w/ incorrect checksum.
So considering readahead scenario, we have to do checksum each time
when loading inode page even if it is uptodated.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit e95bcdb2fefa129f37bd9035af1d234ca92ee4ef ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203233
- Overview
When mounting the attached crafted image and running program, following errors are reported.
Additionally, it hangs on sync after running program.
The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y
- Reproduces
cc poc_13.c
mkdir test
mount -t f2fs tmp.img test
cp a.out test
cd test
sudo ./a.out
sync
- Kernel messages
F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
kernel BUG at fs/f2fs/segment.c:2102!
RIP: 0010:update_sit_entry+0x394/0x410
Call Trace:
f2fs_allocate_data_block+0x16f/0x660
do_write_page+0x62/0x170
f2fs_do_write_node_page+0x33/0xa0
__write_node_page+0x270/0x4e0
f2fs_sync_node_pages+0x5df/0x670
f2fs_write_checkpoint+0x372/0x1400
f2fs_sync_fs+0xa3/0x130
f2fs_do_sync_file+0x1a6/0x810
do_fsync+0x33/0x60
__x64_sys_fsync+0xb/0x10
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
sit.vblocks and sum valid block count in sit.valid_map may be
inconsistent, segment w/ zero vblocks will be treated as free
segment, while allocating in free segment, we may allocate a
free block, if its bitmap is valid previously, it can cause
kernel crash due to bitmap verification failure.
Anyway, to avoid further serious metadata inconsistence and
corruption, it is necessary and worth to detect SIT
inconsistence. So let's enable check_block_count() to verify
vblocks and valid_map all the time rather than do it only
CONFIG_F2FS_CHECK_FS is enabled.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 622927f3b8809206f6da54a6a7ed4df1a7770fce ]
With below mkfs and mount option:
MKFS_OPTIONS -- -O extra_attr -O project_quota -O inode_checksum -O flexible_inline_xattr -O inode_crtime -f
MOUNT_OPTIONS -- -o noinline_xattr
We may miss xattr data with below testcase:
- mkdir dir
- setfattr -n "user.name" -v 0 dir
- for ((i = 0; i < 190; i++)) do touch dir/$i; done
- umount
- mount
- getfattr -n "user.name" dir
user.name: No such attribute
The root cause is that we persist xattr data into reserved inline xattr
space, even if inline_xattr is not enable in inline directory inode, after
inline dentry conversion, reserved space no longer exists, so that xattr
data missed.
Let's use inline xattr space only if inline_xattr flag is set on inode
to fix this iusse.
Fixes: 6afc662e68b5 ("f2fs: support flexible inline xattr size")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 5e159cd349bf3a31fb7e35c23a93308eb30f4f71 ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203209
- Overview
When mounting the attached crafted image and running program, I got this error.
Additionally, it hangs on sync after the this script.
The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
- Reproduces
cc poc_01.c
./run.sh f2fs
sync
kernel BUG at fs/f2fs/f2fs.h:1788!
RIP: 0010:f2fs_truncate_data_blocks_range+0x342/0x350
Call Trace:
f2fs_truncate_blocks+0x36d/0x3c0
f2fs_truncate+0x88/0x110
f2fs_setattr+0x3e1/0x460
notify_change+0x2da/0x400
do_truncate+0x6d/0xb0
do_sys_ftruncate+0xf1/0x160
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The reason is dec_valid_block_count() will trigger kernel panic due to
inconsistent count in between inode.i_blocks and actual block.
To avoid panic, let's just print debug message and set SBI_NEED_FSCK to
give a hint to fsck for latter repairing.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix build warning and add unlikely]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 546d22f070d64a7b96f57c93333772085d3a5e6d ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203217
- Overview
When mounting the attached crafted image and running program, I got this error.
Additionally, it hangs on sync after running the program.
The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
- Reproduces
cc poc_test_05.c
mkdir test
mount -t f2fs tmp.img test
sudo ./a.out
sync
- Messages
kernel BUG at fs/f2fs/inode.c:707!
RIP: 0010:f2fs_evict_inode+0x33f/0x3a0
Call Trace:
evict+0xba/0x180
f2fs_iget+0x598/0xdf0
f2fs_lookup+0x136/0x320
__lookup_slow+0x92/0x140
lookup_slow+0x30/0x50
walk_component+0x1c1/0x350
path_lookupat+0x62/0x200
filename_lookup+0xb3/0x1a0
do_readlinkat+0x56/0x110
__x64_sys_readlink+0x16/0x20
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
During inode loading, __recover_inline_status() can recovery inode status
and set inode dirty, once we failed in following process, it will fail
the check in f2fs_evict_inode, result in trigger BUG_ON().
Let's clear dirty inode in error path of f2fs_iget() to avoid panic.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 626bcf2b7ce87211dba565f2bfa7842ba5be5c1b ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203225
- Overview
When mounting the attached crafted image and unmounting it, following errors are reported.
Additionally, it hangs on sync after unmounting.
The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y
- Reproduces
mkdir test
mount -t f2fs tmp.img test
touch test/t
umount test
sync
- Messages
kernel BUG at fs/f2fs/node.c:3073!
RIP: 0010:f2fs_destroy_node_manager+0x2f0/0x300
Call Trace:
f2fs_put_super+0xf4/0x270
generic_shutdown_super+0x62/0x110
kill_block_super+0x1c/0x50
kill_f2fs_super+0xad/0xd0
deactivate_locked_super+0x35/0x60
cleanup_mnt+0x36/0x70
task_work_run+0x75/0x90
exit_to_usermode_loop+0x93/0xa0
do_syscall_64+0xba/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0010:f2fs_destroy_node_manager+0x2f0/0x300
NAT table is corrupted, so reserved meta/node inode ids were added into
free list incorrectly, during file creation, since reserved id has cached
in inode hash, so it fails the creation and preallocated nid can not be
released later, result in kernel panic.
To fix this issue, let's do nid boundary check during free nid loading.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 8b6810f8acfe429fde7c7dad4714692cc5f75651 ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203219
- Overview
When mounting the attached crafted image and running program, I got this error.
Additionally, it hangs on sync after running the program.
The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
- Reproduces
cc poc_06.c
mkdir test
mount -t f2fs tmp.img test
cp a.out test
cd test
sudo ./a.out
sync
- Messages
kernel BUG at fs/f2fs/node.c:1183!
RIP: 0010:f2fs_remove_inode_page+0x294/0x2d0
Call Trace:
f2fs_evict_inode+0x2a3/0x3a0
evict+0xba/0x180
__dentry_kill+0xbe/0x160
dentry_kill+0x46/0x180
dput+0xbb/0x100
do_renameat2+0x3c9/0x550
__x64_sys_rename+0x17/0x20
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The reason is f2fs_remove_inode_page() will trigger kernel panic due to
inconsistent i_blocks value of inode.
To avoid panic, let's just print debug message and set SBI_NEED_FSCK to
give a hint to fsck for latter repairing of potential image corruption.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix build warning and add unlikely]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 05573d6ccf702df549a7bdeabef31e4753df1a90 ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203239
- Overview
When mounting the attached crafted image and running program, following errors are reported.
Additionally, it hangs on sync after running program.
The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y
- Reproduces
cc poc_15.c
./run.sh f2fs
sync
- Kernel messages
------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.c:3162!
RIP: 0010:f2fs_inplace_write_data+0x12d/0x160
Call Trace:
f2fs_do_write_data_page+0x3c1/0x820
__write_data_page+0x156/0x720
f2fs_write_cache_pages+0x20d/0x460
f2fs_write_data_pages+0x1b4/0x300
do_writepages+0x15/0x60
__filemap_fdatawrite_range+0x7c/0xb0
file_write_and_wait_range+0x2c/0x80
f2fs_do_sync_file+0x102/0x810
do_fsync+0x33/0x60
__x64_sys_fsync+0xb/0x10
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The reason is f2fs_inplace_write_data() will trigger kernel panic due
to data block locates in node type segment.
To avoid panic, let's just return error code and set SBI_NEED_FSCK to
give a hint to fsck for latter repairing.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 22d61e286e2d9097dae36f75ed48801056b77cac ]
As Jungyeon reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=203227
- Overview
When mounting the attached crafted image, following errors are reported.
Additionally, it hangs on sync after trying to mount it.
The image is intentionally fuzzed from a normal f2fs image for testing.
Compile options for F2FS are as follows.
CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_CHECK_FS=y
- Reproduces
mkdir test
mount -t f2fs tmp.img test
sync
- Messages
kernel BUG at fs/f2fs/recovery.c:549!
RIP: 0010:recover_data+0x167a/0x1780
Call Trace:
f2fs_recover_fsync_data+0x613/0x710
f2fs_fill_super+0x1043/0x1aa0
mount_bdev+0x16d/0x1a0
mount_fs+0x4a/0x170
vfs_kern_mount+0x5d/0x100
do_mount+0x200/0xcf0
ksys_mount+0x79/0xc0
__x64_sys_mount+0x1c/0x20
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
During recovery, if ofs_of_node is inconsistent in between recovered
node page and original checkpointed node page, let's just fail recovery
instead of making kernel panic.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 0916878da355650d7e77104a7ac0fa1784eca852 upstream.
For a single device mount using a zoned block device, the zone
information for the device is stored in the sbi->devs single entry
array and sbi->s_ndevs is set to 1. This differs from a single device
mount using a regular block device which does not allocate sbi->devs
and sets sbi->s_ndevs to 0.
However, sbi->s_devs == 0 condition is used throughout the code to
differentiate a single device mount from a multi-device mount where
sbi->s_ndevs is always larger than 1. This results in problems with
single zoned block device volumes as these are treated as multi-device
mounts but do not have the start_blk and end_blk information set. One
of the problem observed is skipping of zone discard issuing resulting in
write commands being issued to full zones or unaligned to a zone write
pointer.
Fix this problem by simply treating the cases sbi->s_ndevs == 0 (single
regular block device mount) and sbi->s_ndevs == 1 (single zoned block
device mount) in the same manner. This is done by introducing the
helper function f2fs_is_multi_device() and using this helper in place
of direct tests of sbi->s_ndevs value, improving code readability.
Fixes: 7bb3a371d199 ("f2fs: Fix zoned block device support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit ca597bddedd94906cd761d8be6a3ad21292725de ]
As Seulbae Kim reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=202637
We didn't recover permission field correctly after sudden power-cut,
the reason is in setattr we didn't add inode into global dirty list
once i_mode is changed, so latter checkpoint triggered by fsync will
not flush last i_mode into disk, result in this problem, fix it.
Reported-by: Seulbae Kim <seulbae@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 26b5a079197c8cb6725565968b7fd3299bd1877b ]
During recover, we will try to create new dentries for inodes with
dentry_mark. But if the parent is missing (e.g. killed by fsck),
recover will break. But those recovered dirty pages are not cleanup.
This will hit f2fs_bug_on:
[ 53.519566] F2FS-fs (loop0): Found nat_bits in checkpoint
[ 53.539354] F2FS-fs (loop0): recover_inode: ino = 5, name = file, inline = 3
[ 53.539402] F2FS-fs (loop0): recover_dentry: ino = 5, name = file, dir = 0, err = -2
[ 53.545760] F2FS-fs (loop0): Cannot recover all fsync data errno=-2
[ 53.546105] F2FS-fs (loop0): access invalid blkaddr:4294967295
[ 53.546171] WARNING: CPU: 1 PID: 1798 at fs/f2fs/checkpoint.c:163 f2fs_is_valid_blkaddr+0x26c/0x320
[ 53.546174] Modules linked in:
[ 53.546183] CPU: 1 PID: 1798 Comm: mount Not tainted 4.19.0-rc2+ #1
[ 53.546186] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 53.546191] RIP: 0010:f2fs_is_valid_blkaddr+0x26c/0x320
[ 53.546195] Code: 85 bb 00 00 00 48 89 df 88 44 24 07 e8 ad a8 db ff 48 8b 3b 44 89 e1 48 c7 c2 40 03 72 a9 48 c7 c6 e0 01 72 a9 e8 84 3c ff ff <0f> 0b 0f b6 44 24 07 e9 8a 00 00 00 48 8d bf 38 01 00 00 e8 7c a8
[ 53.546201] RSP: 0018:ffff88006c067768 EFLAGS: 00010282
[ 53.546208] RAX: 0000000000000000 RBX: ffff880068844200 RCX: ffffffffa83e1a33
[ 53.546211] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88006d51e590
[ 53.546215] RBP: 0000000000000005 R08: ffffed000daa3cb3 R09: ffffed000daa3cb3
[ 53.546218] R10: 0000000000000001 R11: ffffed000daa3cb2 R12: 00000000ffffffff
[ 53.546221] R13: ffff88006a1f8000 R14: 0000000000000200 R15: 0000000000000009
[ 53.546226] FS: 00007fb2f3646840(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000
[ 53.546229] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 53.546234] CR2: 00007f0fd77f0008 CR3: 00000000687e6002 CR4: 00000000000206e0
[ 53.546237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 53.546240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 53.546242] Call Trace:
[ 53.546248] f2fs_submit_page_bio+0x95/0x740
[ 53.546253] read_node_page+0x161/0x1e0
[ 53.546271] ? truncate_node+0x650/0x650
[ 53.546283] ? add_to_page_cache_lru+0x12c/0x170
[ 53.546288] ? pagecache_get_page+0x262/0x2d0
[ 53.546292] __get_node_page+0x200/0x660
[ 53.546302] f2fs_update_inode_page+0x4a/0x160
[ 53.546306] f2fs_write_inode+0x86/0xb0
[ 53.546317] __writeback_single_inode+0x49c/0x620
[ 53.546322] writeback_single_inode+0xe4/0x1e0
[ 53.546326] sync_inode_metadata+0x93/0xd0
[ 53.546330] ? sync_inode+0x10/0x10
[ 53.546342] ? do_raw_spin_unlock+0xed/0x100
[ 53.546347] f2fs_sync_inode_meta+0xe0/0x130
[ 53.546351] f2fs_fill_super+0x287d/0x2d10
[ 53.546367] ? vsnprintf+0x742/0x7a0
[ 53.546372] ? f2fs_commit_super+0x180/0x180
[ 53.546379] ? up_write+0x20/0x40
[ 53.546385] ? set_blocksize+0x5f/0x140
[ 53.546391] ? f2fs_commit_super+0x180/0x180
[ 53.546402] mount_bdev+0x181/0x200
[ 53.546406] mount_fs+0x94/0x180
[ 53.546411] vfs_kern_mount+0x6c/0x1e0
[ 53.546415] do_mount+0xe5e/0x1510
[ 53.546420] ? fs_reclaim_release+0x9/0x30
[ 53.546424] ? copy_mount_string+0x20/0x20
[ 53.546428] ? fs_reclaim_acquire+0xd/0x30
[ 53.546435] ? __might_sleep+0x2c/0xc0
[ 53.546440] ? ___might_sleep+0x53/0x170
[ 53.546453] ? __might_fault+0x4c/0x60
[ 53.546468] ? _copy_from_user+0x95/0xa0
[ 53.546474] ? memdup_user+0x39/0x60
[ 53.546478] ksys_mount+0x88/0xb0
[ 53.546482] __x64_sys_mount+0x5d/0x70
[ 53.546495] do_syscall_64+0x65/0x130
[ 53.546503] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 53.547639] ---[ end trace b804d1ea2fec893e ]---
So if recover fails, we need to drop all recovered data.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 042be0f849e5fc24116d0afecfaf926eed5cac63 ]
https://bugzilla.kernel.org/show_bug.cgi?id=200219
Reproduction way:
- mount image
- run poc code
- umount image
F2FS-fs (loop1): Bitmap was wrongly set, blk:15364
------------[ cut here ]------------
kernel BUG at /home/yuchao/git/devf2fs/segment.c:2061!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 17686 Comm: umount Tainted: G W O 4.18.0-rc2+ #39
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
EIP: update_sit_entry+0x459/0x4e0 [f2fs]
Code: e8 1c b5 fd ff 0f 0b 0f 0b 8b 45 e4 c7 44 24 08 9c 7a 6c f8 c7 44 24 04 bc 4a 6c f8 89 44 24 0c 8b 06 89 04 24 e8 f7 b4 fd ff <0f> 0b 8b 45 e4 0f b6 d2 89 54 24 10 c7 44 24 08 60 7a 6c f8 c7 44
EAX: 00000032 EBX: 000000f8 ECX: 00000002 EDX: 00000001
ESI: d7177000 EDI: f520fe68 EBP: d6477c6c ESP: d6477c34
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010282
CR0: 80050033 CR2: b7fbe000 CR3: 2a99b3c0 CR4: 000406f0
Call Trace:
f2fs_allocate_data_block+0x124/0x580 [f2fs]
do_write_page+0x78/0x150 [f2fs]
f2fs_do_write_node_page+0x25/0xa0 [f2fs]
__write_node_page+0x2bf/0x550 [f2fs]
f2fs_sync_node_pages+0x60e/0x6d0 [f2fs]
? sync_inode_metadata+0x2f/0x40
? f2fs_write_checkpoint+0x28f/0x7d0 [f2fs]
? up_write+0x1e/0x80
f2fs_write_checkpoint+0x2a9/0x7d0 [f2fs]
? mark_held_locks+0x5d/0x80
? _raw_spin_unlock_irq+0x27/0x50
kill_f2fs_super+0x68/0x90 [f2fs]
deactivate_locked_super+0x3d/0x70
deactivate_super+0x40/0x60
cleanup_mnt+0x39/0x70
__cleanup_mnt+0x10/0x20
task_work_run+0x81/0xa0
exit_to_usermode_loop+0x59/0xa7
do_fast_syscall_32+0x1f5/0x22c
entry_SYSENTER_32+0x53/0x86
EIP: 0xb7f95c51
Code: c1 1e f7 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
EAX: 00000000 EBX: 0871ab90 ECX: bfb2cd00 EDX: 00000000
ESI: 00000000 EDI: 0871ab90 EBP: 0871ab90 ESP: bfb2cd7c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
Modules linked in: f2fs(O) crc32_generic bnep rfcomm bluetooth ecdh_generic snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq pcbc joydev aesni_intel snd_seq_device aes_i586 snd_timer crypto_simd snd cryptd soundcore mac_hid serio_raw video i2c_piix4 parport_pc ppdev lp parport hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
---[ end trace d423f83982cfcdc5 ]---
The reason is, different log headers using the same segment, once
one log's next block address is used by another log, it will cause
panic as above.
Main area: 24 segs, 24 secs 24 zones
- COLD data: 0, 0, 0
- WARM data: 1, 1, 1
- HOT data: 20, 20, 20
- Dir dnode: 22, 22, 22
- File dnode: 22, 22, 22
- Indir nodes: 21, 21, 21
So this patch adds sanity check to detect such condition to avoid
this issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 7d20c8abb2edcf962ca857d51f4d0f9cd4b19053 ]
https://bugzilla.kernel.org/show_bug.cgi?id=200951
These is a NULL pointer dereference issue reported in bugzilla:
Hi,
in the setup there is a SATA SSD connected to a SATA-to-USB bridge.
The disc is "Samsung SSD 850 PRO 256G" which supports TRIM.
There are four partitions:
sda1: FAT /boot
sda2: F2FS /
sda3: F2FS /home
sda4: F2FS
The bridge is ASMT1153e which uses the "uas" driver.
There is no TRIM pass-through, so, when mounting it reports:
mounting with "discard" option, but the device does not support discard
The USB host is USB3.0 and UASP capable. It is the one on RK3399.
Given this everything works fine, except there is no TRIM support.
In order to enable TRIM a new UDEV rule is added [1]:
/etc/udev/rules.d/10-sata-bridge-trim.rules:
ACTION=="add|change", ATTRS{idVendor}=="174c", ATTRS{idProduct}=="55aa", SUBSYSTEM=="scsi_disk", ATTR{provisioning_mode}="unmap"
After reboot any F2FS write hangs forever and dmesg reports:
Unable to handle kernel NULL pointer dereference
Also tested on a x86_64 system: works fine even with TRIM enabled.
same disc
same bridge
different usb host controller
different cpu architecture
not root filesystem
Regards,
Vicenç.
[1] Post #5 in https://bbs.archlinux.org/viewtopic.php?id=236280
Unable to handle kernel NULL pointer dereference at virtual address 000000000000003e
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000626e3122
[000000000000003e] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Modules linked in: overlay snd_soc_hdmi_codec rc_cec dw_hdmi_i2s_audio dw_hdmi_cec snd_soc_simple_card snd_soc_simple_card_utils snd_soc_rockchip_i2s rockchip_rga snd_soc_rockchip_pcm rockchipdrm videobuf2_dma_sg v4l2_mem2mem rtc_rk808 videobuf2_memops analogix_dp videobuf2_v4l2 videobuf2_common dw_hdmi dw_wdt cec rc_core videodev drm_kms_helper media drm rockchip_thermal rockchip_saradc realtek drm_panel_orientation_quirks syscopyarea sysfillrect sysimgblt fb_sys_fops dwmac_rk stmmac_platform stmmac pwm_bl squashfs loop crypto_user gpio_keys hid_kensington
CPU: 5 PID: 957 Comm: nvim Not tainted 4.19.0-rc1-1-ARCH #1
Hardware name: Sapphire-RK3399 Board (DT)
pstate: 00000005 (nzcv daif -PAN -UAO)
pc : update_sit_entry+0x304/0x4b0
lr : update_sit_entry+0x108/0x4b0
sp : ffff00000ca13bd0
x29: ffff00000ca13bd0 x28: 000000000000003e
x27: 0000000000000020 x26: 0000000000080000
x25: 0000000000000048 x24: ffff8000ebb85cf8
x23: 0000000000000253 x22: 00000000ffffffff
x21: 00000000000535f2 x20: 00000000ffffffdf
x19: ffff8000eb9e6800 x18: ffff8000eb9e6be8
x17: 0000000007ce6926 x16: 000000001c83ffa8
x15: 0000000000000000 x14: ffff8000f602df90
x13: 0000000000000006 x12: 0000000000000040
x11: 0000000000000228 x10: 0000000000000000
x9 : 0000000000000000 x8 : 0000000000000000
x7 : 00000000000535f2 x6 : ffff8000ebff3440
x5 : ffff8000ebff3440 x4 : ffff8000ebe3a6c8
x3 : 00000000ffffffff x2 : 0000000000000020
x1 : 0000000000000000 x0 : ffff8000eb9e5800
Process nvim (pid: 957, stack limit = 0x0000000063a78320)
Call trace:
update_sit_entry+0x304/0x4b0
f2fs_invalidate_blocks+0x98/0x140
truncate_node+0x90/0x400
f2fs_remove_inode_page+0xe8/0x340
f2fs_evict_inode+0x2b0/0x408
evict+0xe0/0x1e0
iput+0x160/0x260
do_unlinkat+0x214/0x298
__arm64_sys_unlinkat+0x3c/0x68
el0_svc_handler+0x94/0x118
el0_svc+0x8/0xc
Code: f9400800 b9488400 36080140 f9400f01 (387c4820)
---[ end trace a0f21a307118c477 ]---
The reason is it is possible to enable discard flag on block queue via
UDEV, but during mount, f2fs will initialize se->discard_map only if
this flag is set, once the flag is set after mount, f2fs may dereference
NULL pointer on se->discard_map.
So this patch does below changes to fix this issue:
- initialize and update se->discard_map all the time.
- don't clear DISCARD option if device has no QUEUE_FLAG_DISCARD flag
during mount.
- don't issue small discard on zoned block device.
- introduce some functions to enhance the readability.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Vicente Bergas <vicencb@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit ac92985864e187a1735502f6a02f54eaa655b2aa ]
When setting /sys/fs/f2fs/<DEV>/iostat_enable with non-bool value, UBSAN
reports the following warning.
[ 7562.295484] ================================================================================
[ 7562.296531] UBSAN: Undefined behaviour in fs/f2fs/f2fs.h:2776:10
[ 7562.297651] load of value 64 is not a valid value for type '_Bool'
[ 7562.298642] CPU: 1 PID: 7487 Comm: dd Not tainted 4.20.0-rc4+ #79
[ 7562.298653] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 7562.298662] Call Trace:
[ 7562.298760] dump_stack+0x46/0x5b
[ 7562.298811] ubsan_epilogue+0x9/0x40
[ 7562.298830] __ubsan_handle_load_invalid_value+0x72/0x90
[ 7562.298863] f2fs_file_write_iter+0x29f/0x3f0
[ 7562.298905] __vfs_write+0x115/0x160
[ 7562.298922] vfs_write+0xa7/0x190
[ 7562.298934] ksys_write+0x50/0xc0
[ 7562.298973] do_syscall_64+0x4a/0xe0
[ 7562.298992] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 7562.299001] RIP: 0033:0x7fa45ec19c00
[ 7562.299004] Code: 73 01 c3 48 8b 0d 88 92 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d dd eb 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ce 8f 01 00 48 89 04 24
[ 7562.299044] RSP: 002b:00007ffca52b49e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 7562.299052] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa45ec19c00
[ 7562.299059] RDX: 0000000000000400 RSI: 000000000093f000 RDI: 0000000000000001
[ 7562.299065] RBP: 000000000093f000 R08: 0000000000000004 R09: 0000000000000000
[ 7562.299071] R10: 00007ffca52b47b0 R11: 0000000000000246 R12: 0000000000000400
[ 7562.299077] R13: 000000000093f000 R14: 000000000093f400 R15: 0000000000000000
[ 7562.299091] ================================================================================
So, if iostat_enable is enabled, set its value as true.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 500e0b28ecd3c5aade98f3c3a339d18dcb166bb6 ]
We use below condition to check inline_xattr_size boundary:
if (!F2FS_OPTION(sbi).inline_xattr_size ||
F2FS_OPTION(sbi).inline_xattr_size >=
DEF_ADDRS_PER_INODE -
F2FS_TOTAL_EXTRA_ATTR_SIZE -
DEF_INLINE_RESERVED_SIZE -
DEF_MIN_INLINE_SIZE)
There is there problems in that check:
- we should allow inline_xattr_size equaling to min size of inline
{data,dentry} area.
- F2FS_TOTAL_EXTRA_ATTR_SIZE and inline_xattr_size are based on
different size unit, previous one is 4 bytes, latter one is 1 bytes.
- DEF_MIN_INLINE_SIZE only indicate min size of inline data area,
however, we need to consider min size of inline dentry area as well,
minimal inline dentry should at least contain two entries: '.' and
'..', so that min inline_dentry size is 40 bytes.
.bitmap 1 * 1 = 1
.reserved 1 * 1 = 1
.dentry 11 * 2 = 22
.filename 8 * 2 = 16
total 40
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 9083977dabf3833298ddcd40dee28687f1e6b483 ]
Fix below warning coming because of using mutex lock in atomic context.
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:98
in_atomic(): 1, irqs_disabled(): 0, pid: 585, name: sh
Preemption disabled at: __radix_tree_preload+0x28/0x130
Call trace:
dump_backtrace+0x0/0x2b4
show_stack+0x20/0x28
dump_stack+0xa8/0xe0
___might_sleep+0x144/0x194
__might_sleep+0x58/0x8c
mutex_lock+0x2c/0x48
f2fs_trace_pid+0x88/0x14c
f2fs_set_node_page_dirty+0xd0/0x184
Do not use f2fs_radix_tree_insert() to avoid doing cond_resched() with
spin_lock() acquired.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit aadcef64b22f668c1a107b86d3521d9cac915c24 ]
As Jiqun Li reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=202883
sometimes, dead lock when make system call SYS_getdents64 with fsync() is
called by another process.
monkey running on android9.0
1. task 9785 held sbi->cp_rwsem and waiting lock_page()
2. task 10349 held mm_sem and waiting sbi->cp_rwsem
3. task 9709 held lock_page() and waiting mm_sem
so this is a dead lock scenario.
task stack is show by crash tools as following
crash_arm64> bt ffffffc03c354080
PID: 9785 TASK: ffffffc03c354080 CPU: 1 COMMAND: "RxIoScheduler-3"
>> #7 [ffffffc01b50fac0] __lock_page at ffffff80081b11e8
crash-arm64> bt 10349
PID: 10349 TASK: ffffffc018b83080 CPU: 1 COMMAND: "BUGLY_ASYNC_UPL"
>> #3 [ffffffc01f8cfa40] rwsem_down_read_failed at ffffff8008a93afc
PC: 00000033 LR: 00000000 SP: 00000000 PSTATE: ffffffffffffffff
crash-arm64> bt 9709
PID: 9709 TASK: ffffffc03e7f3080 CPU: 1 COMMAND: "IntentService[A"
>> #3 [ffffffc001e67850] rwsem_down_read_failed at ffffff8008a93afc
>> #8 [ffffffc001e67b80] el1_ia at ffffff8008084fc4
PC: ffffff8008274114 [compat_filldir64+120]
LR: ffffff80083584d4 [f2fs_fill_dentries+448]
SP: ffffffc001e67b80 PSTATE: 80400145
X29: ffffffc001e67b80 X28: 0000000000000000 X27: 000000000000001a
X26: 00000000000093d7 X25: ffffffc070d52480 X24: 0000000000000008
X23: 0000000000000028 X22: 00000000d43dfd60 X21: ffffffc001e67e90
X20: 0000000000000011 X19: ffffff80093a4000 X18: 0000000000000000
X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000
X14: ffffffffffffffff X13: 0000000000000008 X12: 0101010101010101
X11: 7f7f7f7f7f7f7f7f X10: 6a6a6a6a6a6a6a6a X9: 7f7f7f7f7f7f7f7f
X8: 0000000080808000 X7: ffffff800827409c X6: 0000000080808000
X5: 0000000000000008 X4: 00000000000093d7 X3: 000000000000001a
X2: 0000000000000011 X1: ffffffc070d52480 X0: 0000000000800238
>> #9 [ffffffc001e67be0] f2fs_fill_dentries at ffffff80083584d0
PC: 0000003c LR: 00000000 SP: 00000000 PSTATE: 000000d9
X12: f48a02ff X11: d4678960 X10: d43dfc00 X9: d4678ae4
X8: 00000058 X7: d4678994 X6: d43de800 X5: 000000d9
X4: d43dfc0c X3: d43dfc10 X2: d46799c8 X1: 00000000
X0: 00001068
Below potential deadlock will happen between three threads:
Thread A Thread B Thread C
- f2fs_do_sync_file
- f2fs_write_checkpoint
- down_write(&sbi->node_change) -- 1)
- do_page_fault
- down_write(&mm->mmap_sem) -- 2)
- do_wp_page
- f2fs_vm_page_mkwrite
- getdents64
- f2fs_read_inline_dir
- lock_page -- 3)
- f2fs_sync_node_pages
- lock_page -- 3)
- __do_map_lock
- down_read(&sbi->node_change) -- 1)
- f2fs_fill_dentries
- dir_emit
- compat_filldir64
- do_page_fault
- down_read(&mm->mmap_sem) -- 2)
Since f2fs_readdir is protected by inode.i_rwsem, there should not be
any updates in inode page, we're safe to lookup dents in inode page
without its lock held, so taking off the lock to improve concurrency
of readdir and avoid potential deadlock.
Reported-by: Jiqun Li <jiqun.li@unisoc.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 2c28aba8b2e2a51749fa66e01b68e1cd5b53e022 ]
With below testcase, we will fail to find existed xattr entry:
1. mkfs.f2fs -O extra_attr -O flexible_inline_xattr /dev/zram0
2. mount -t f2fs -o inline_xattr_size=1 /dev/zram0 /mnt/f2fs/
3. touch /mnt/f2fs/file
4. setfattr -n "user.name" -v 0 /mnt/f2fs/file
5. getfattr -n "user.name" /mnt/f2fs/file
/mnt/f2fs/file: user.name: No such attribute
The reason is for inode which has very small inline xattr size,
__find_inline_xattr() will fail to traverse any entry due to first
entry may not be loaded from xattr node yet, later, we may skip to
check entire xattr datas in __find_xattr(), result in such wrong
condition.
This patch adds condition to check such case to avoid this issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.
Thread A Thread B
- __fput
- f2fs_release_file
- drop_inmem_pages
- mutex_lock(&fi->inmem_lock)
- __revoke_inmem_pages
- lock_page(page)
- open
- f2fs_setattr
- truncate_setsize
- truncate_inode_pages_range
- lock_page(page)
- truncate_cleanup_page
- f2fs_invalidate_page
- drop_inmem_page
- mutex_lock(&fi->inmem_lock);
We may encounter above ABBA deadlock as reported by Kyungtae Kim:
I'm reporting a bug in linux-4.17.19: "INFO: task hung in
drop_inmem_page" (no reproducer)
I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ
=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D27024 10822 6346 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
__mutex_lock_common kernel/locking/mutex.c:833 [inline]
__mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
do_invalidatepage mm/truncate.c:165 [inline]
truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
truncate_inode_pages mm/truncate.c:478 [inline]
truncate_pagecache+0x6d/0x90 mm/truncate.c:801
truncate_setsize+0x81/0xa0 mm/truncate.c:826
f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
notify_change+0xa62/0xe80 fs/attr.c:313
do_truncate+0x12e/0x1e0 fs/open.c:63
do_last fs/namei.c:2955 [inline]
path_openat+0x2042/0x29f0 fs/namei.c:3505
do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
do_sys_open+0x35e/0x4e0 fs/open.c:1101
__do_sys_open fs/open.c:1119 [inline]
__se_sys_open fs/open.c:1114 [inline]
__x64_sys_open+0x89/0xc0 fs/open.c:1114
do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7 D28880 10858 6346 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
__rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
__down_write arch/x86/include/asm/rwsem.h:142 [inline]
down_write+0x58/0xa0 kernel/locking/rwsem.c:72
inode_lock include/linux/fs.h:713 [inline]
do_truncate+0x120/0x1e0 fs/open.c:61
do_last fs/namei.c:2955 [inline]
path_openat+0x2042/0x29f0 fs/namei.c:3505
do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
do_sys_open+0x35e/0x4e0 fs/open.c:1101
__do_sys_open fs/open.c:1119 [inline]
__se_sys_open fs/open.c:1114 [inline]
__x64_sys_open+0x89/0xc0 fs/open.c:1114
do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5 D28760 10829 6308 0x80000002
Call Trace:
context_switch kernel/sched/core.c:2867 [inline]
__schedule+0x721/0x1e60 kernel/sched/core.c:3515
schedule+0x88/0x1c0 kernel/sched/core.c:3559
io_schedule+0x21/0x80 kernel/sched/core.c:5179
wait_on_page_bit_common mm/filemap.c:1100 [inline]
__lock_page+0x2b5/0x390 mm/filemap.c:1273
lock_page include/linux/pagemap.h:483 [inline]
__revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
__fput+0x2c7/0x780 fs/file_table.c:209
____fput+0x1a/0x20 fs/file_table.c:243
task_work_run+0x151/0x1d0 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x8ba/0x30a0 kernel/exit.c:865
do_group_exit+0x13b/0x3a0 kernel/exit.c:968
get_signal+0x6bb/0x1650 kernel/signal.c:2482
do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700
This patch tries to use trylock_page to mitigate such deadlock condition
for fix.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 31867b23d7d1ee3535136c6a410a6cf56f666bfc upstream.
Otherwise, we can get wrong counts incurring checkpoint hang.
IO_W (CP: -24, Data: 24, Flush: ( 0 0 1), Discard: ( 0 0))
Thread A Thread B
- f2fs_write_data_pages
- __write_data_page
- f2fs_submit_page_write
- inc_page_count(F2FS_WB_DATA)
type is F2FS_WB_DATA due to file is non-atomic one
- f2fs_ioc_start_atomic_write
- set_inode_flag(FI_ATOMIC_FILE)
- f2fs_write_end_io
- dec_page_count(F2FS_WB_CP_DATA)
type is F2FS_WB_DATA due to file becomes
atomic one
Cc: <stable@vger.kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit e4589fa545e0020dbbc3c9bde35f35f949901392 ]
When there is a failure in f2fs_fill_super() after/during
the recovery of fsync'd nodes, it frees the current sbi and
retries again. This time the mount is successful, but the files
that got recovered before retry, still holds the extent tree,
whose extent nodes list is corrupted since sbi and sbi->extent_list
is freed up. The list_del corruption issue is observed when the
file system is getting unmounted and when those recoverd files extent
node is being freed up in the below context.
list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
<...>
kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
lr : __list_del_entry_valid+0x94/0xb4
pc : __list_del_entry_valid+0x94/0xb4
<...>
Call trace:
__list_del_entry_valid+0x94/0xb4
__release_extent_node+0xb0/0x114
__free_extent_tree+0x58/0x7c
f2fs_shrink_extent_tree+0xdc/0x3b0
f2fs_leave_shrinker+0x28/0x7c
f2fs_put_super+0xfc/0x1e0
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
__cleanup_mnt+0x1c/0x28
task_work_run+0x48/0xd0
do_notify_resume+0x678/0xe98
work_pending+0x8/0x14
Fix this by not creating extents for those recovered files if shrinker is
not registered yet. Once mount is successful and shrinker is registered,
those files can have extents again.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 60aa4d5536ab7fe32433ca1173bd9d6633851f27 ]
iput() on sbi->node_inode can update sbi->stat_info
in the below context, if the f2fs_write_checkpoint()
has failed with error.
f2fs_balance_fs_bg+0x1ac/0x1ec
f2fs_write_node_pages+0x4c/0x260
do_writepages+0x80/0xbc
__writeback_single_inode+0xdc/0x4ac
writeback_single_inode+0x9c/0x144
write_inode_now+0xc4/0xec
iput+0x194/0x22c
f2fs_put_super+0x11c/0x1e8
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
Fix this by moving f2fs_destroy_stats() further below iput() in
both f2fs_put_super() and f2fs_fill_super() paths.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit f6176473a0c7472380eef72ebeb330cf9485bf0a ]
When call f2fs_acl_create_masq() failed, the caller f2fs_acl_create()
should return -EIO instead of -ENOMEM, this patch makes it consistent
with posix_acl_create() which has been fixed in commit beaf226b863a
("posix_acl: don't ignore return value of posix_acl_create_masq()").
Fixes: 83dfe53c185e ("f2fs: fix reference leaks in f2fs_acl_create")
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 2866fb16d67992195b0526d19e65acb6640fb87f ]
The following race could lead to inconsistent SIT bitmap:
Task A Task B
====== ======
f2fs_write_checkpoint
block_operations
f2fs_lock_all
down_write(node_change)
down_write(node_write)
... sync ...
up_write(node_change)
f2fs_file_write_iter
set_inode_flag(FI_NO_PREALLOC)
......
f2fs_write_begin(index=0, has inline data)
prepare_write_begin
__do_map_lock(AIO) => down_read(node_change)
f2fs_convert_inline_page => update SIT
__do_map_lock(AIO) => up_read(node_change)
f2fs_flush_sit_entries <= inconsistent SIT
finish write checkpoint
sudden-power-off
If SPO occurs after checkpoint is finished, SIT bitmap will be set
incorrectly.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit b61ac5b720146c619c7cdf17eff2551b934399e5 ]
This patch move dir data flush to write checkpoint process, by
doing this, it may reduce some time for dir fsync.
pre:
-f2fs_do_sync_file enter
-file_write_and_wait_range <- flush & wait
-write_checkpoint
-do_checkpoint <- wait all
-f2fs_do_sync_file exit
now:
-f2fs_do_sync_file enter
-write_checkpoint
-block_operations <- flush dir & no wait
-do_checkpoint <- wait all
-f2fs_do_sync_file exit
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|