| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the hidedotfiles mount option is behaving in the reverse
way of what would be expected: enabling it disables setting the
hidden attribute on files or directories with names starting with a
dot and disabling it enables the setting.
Reverse the behaviour of the hidedotfiles mount option so it matches
what is expected.
Signed-off-by: Daniel Pinto <danielpinto52@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
| |
When enabled, the windows_names mount option prevents the creation
of files or directories with names not allowed by Windows. Use
the same option name as NTFS-3G for compatibility.
Signed-off-by: Daniel Pinto <danielpinto52@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
| |
'a == b ? 0 : 1' is logically equivalent to 'a != b'.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The data_size and valid_size fields of non resident attributes should be
less than the its alloc_size field, but this is not checked in
ntfs_read_mft function.
Syzbot reports a allocation order warning due to a large unchecked value
of data_size getting assigned to inode->i_size which is then passed to
kcalloc.
Add sanity check for ensuring that the data_size and valid_size fields
are not larger than alloc_size field.
Link: https://syzkaller.appspot.com/bug?extid=fa4648a5446460b7b963
Reported-and-tested-by: syzbot+fa4648a5446460b7b963@syzkaller.appspotmail.com
Fixes: (82cae269cfa95) fs/ntfs3: Add initialization of super block
Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
syzbot is reporting too large allocation at ntfs_fill_super() [1], for a
crafted filesystem can contain bogus inode->i_size. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvmalloc().
Link: https://syzkaller.appspot.com/bug?extid=33f3faaa0c08744f7d40 [1]
Reported-by: syzot <syzbot+33f3faaa0c08744f7d40@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
syzbot is reporting too large allocation at wnd_init() [1], for a crafted
filesystem can become wnd->nwnd close to UINT_MAX. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvcalloc().
Link: https://syzkaller.appspot.com/bug?extid=fa4648a5446460b7b963 [1]
Reported-by: syzot <syzbot+fa4648a5446460b7b963@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Syzbot reports an out of bound access in ntfs_trim_fs.
The cause of this is using a loop termination condition that compares
window index (iw) with wnd->nbits instead of wnd->nwnd, due to which the
index used for wnd->free_bits exceeds the size of the array allocated.
Fix the loop condition.
Fixes: 3f3b442b5ad2 ("fs/ntfs3: Add bitmap")
Link: https://syzkaller.appspot.com/bug?extid=b892240eac461e488d51
Reported-by: syzbot+b892240eac461e488d51@syzkaller.appspotmail.com
Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enhances the sanity check for $SDH and $SII while initializing NTFS
security, guarantees these index root are legit.
[ 162.459513] BUG: KASAN: use-after-free in hdr_find_e.isra.0+0x10c/0x320
[ 162.460176] Read of size 2 at addr ffff8880037bca99 by task mount/243
[ 162.460851]
[ 162.461252] CPU: 0 PID: 243 Comm: mount Not tainted 6.0.0-rc7 #42
[ 162.461744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 162.462609] Call Trace:
[ 162.462954] <TASK>
[ 162.463276] dump_stack_lvl+0x49/0x63
[ 162.463822] print_report.cold+0xf5/0x689
[ 162.464608] ? unwind_get_return_address+0x3a/0x60
[ 162.465766] ? hdr_find_e.isra.0+0x10c/0x320
[ 162.466975] kasan_report+0xa7/0x130
[ 162.467506] ? _raw_spin_lock_irq+0xc0/0xf0
[ 162.467998] ? hdr_find_e.isra.0+0x10c/0x320
[ 162.468536] __asan_load2+0x68/0x90
[ 162.468923] hdr_find_e.isra.0+0x10c/0x320
[ 162.469282] ? cmp_uints+0xe0/0xe0
[ 162.469557] ? cmp_sdh+0x90/0x90
[ 162.469864] ? ni_find_attr+0x214/0x300
[ 162.470217] ? ni_load_mi+0x80/0x80
[ 162.470479] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 162.470931] ? ntfs_bread_run+0x190/0x190
[ 162.471307] ? indx_get_root+0xe4/0x190
[ 162.471556] ? indx_get_root+0x140/0x190
[ 162.471833] ? indx_init+0x1e0/0x1e0
[ 162.472069] ? fnd_clear+0x115/0x140
[ 162.472363] ? _raw_spin_lock_irqsave+0x100/0x100
[ 162.472731] indx_find+0x184/0x470
[ 162.473461] ? sysvec_apic_timer_interrupt+0x57/0xc0
[ 162.474429] ? indx_find_buffer+0x2d0/0x2d0
[ 162.474704] ? do_syscall_64+0x3b/0x90
[ 162.474962] dir_search_u+0x196/0x2f0
[ 162.475381] ? ntfs_nls_to_utf16+0x450/0x450
[ 162.475661] ? ntfs_security_init+0x3d6/0x440
[ 162.475906] ? is_sd_valid+0x180/0x180
[ 162.476191] ntfs_extend_init+0x13f/0x2c0
[ 162.476496] ? ntfs_fix_post_read+0x130/0x130
[ 162.476861] ? iput.part.0+0x286/0x320
[ 162.477325] ntfs_fill_super+0x11e0/0x1b50
[ 162.477709] ? put_ntfs+0x1d0/0x1d0
[ 162.477970] ? vsprintf+0x20/0x20
[ 162.478258] ? set_blocksize+0x95/0x150
[ 162.478538] get_tree_bdev+0x232/0x370
[ 162.478789] ? put_ntfs+0x1d0/0x1d0
[ 162.479038] ntfs_fs_get_tree+0x15/0x20
[ 162.479374] vfs_get_tree+0x4c/0x130
[ 162.479729] path_mount+0x654/0xfe0
[ 162.480124] ? putname+0x80/0xa0
[ 162.480484] ? finish_automount+0x2e0/0x2e0
[ 162.480894] ? putname+0x80/0xa0
[ 162.481467] ? kmem_cache_free+0x1c4/0x440
[ 162.482280] ? putname+0x80/0xa0
[ 162.482714] do_mount+0xd6/0xf0
[ 162.483264] ? path_mount+0xfe0/0xfe0
[ 162.484782] ? __kasan_check_write+0x14/0x20
[ 162.485593] __x64_sys_mount+0xca/0x110
[ 162.486024] do_syscall_64+0x3b/0x90
[ 162.486543] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 162.487141] RIP: 0033:0x7f9d374e948a
[ 162.488324] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 162.489728] RSP: 002b:00007ffe30e73d18 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 162.490971] RAX: ffffffffffffffda RBX: 0000561cdb43a060 RCX: 00007f9d374e948a
[ 162.491669] RDX: 0000561cdb43a260 RSI: 0000561cdb43a2e0 RDI: 0000561cdb442af0
[ 162.492050] RBP: 0000000000000000 R08: 0000561cdb43a280 R09: 0000000000000020
[ 162.492459] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000561cdb442af0
[ 162.493183] R13: 0000561cdb43a260 R14: 0000000000000000 R15: 00000000ffffffff
[ 162.493644] </TASK>
[ 162.493908]
[ 162.494214] The buggy address belongs to the physical page:
[ 162.494761] page:000000003e38a3d5 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x37bc
[ 162.496064] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[ 162.497278] raw: 000fffffc0000000 ffffea00000df1c8 ffffea00000df008 0000000000000000
[ 162.498928] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000
[ 162.500542] page dumped because: kasan: bad access detected
[ 162.501057]
[ 162.501242] Memory state around the buggy address:
[ 162.502230] ffff8880037bc980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 162.502977] ffff8880037bca00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 162.503522] >ffff8880037bca80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 162.503963] ^
[ 162.504370] ffff8880037bcb00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 162.504766] ffff8880037bcb80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang warns:
fs/ntfs3/namei.c:445:7: error: variable 'uni1' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (toupper(c1) != toupper(c2)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/ctype.h:64:20: note: expanded from macro 'toupper'
#define toupper(c) __toupper(c)
^
fs/ntfs3/namei.c:487:12: note: uninitialized use occurs here
__putname(uni1);
^~~~
./include/linux/fs.h:2789:65: note: expanded from macro '__putname'
#define __putname(name) kmem_cache_free(names_cachep, (void *)(name))
^~~~
fs/ntfs3/namei.c:445:3: note: remove the 'if' if its condition is always false
if (toupper(c1) != toupper(c2)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ntfs3/namei.c:434:7: error: variable 'uni1' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (!lm--) {
^~~~~
fs/ntfs3/namei.c:487:12: note: uninitialized use occurs here
__putname(uni1);
^~~~
./include/linux/fs.h:2789:65: note: expanded from macro '__putname'
#define __putname(name) kmem_cache_free(names_cachep, (void *)(name))
^~~~
fs/ntfs3/namei.c:434:3: note: remove the 'if' if its condition is always false
if (!lm--) {
^~~~~~~~~~~~
fs/ntfs3/namei.c:430:22: note: initialize the variable 'uni1' to silence this warning
struct cpu_str *uni1, *uni2;
^
= NULL
2 errors generated.
There is no point in calling __putname() in these particular error
paths, as there has been no corresponding __getname() call yet. Just
return directly in these blocks to clear up the warning.
Fixes: a3a956c78efa ("fs/ntfs3: Add option "nocase"")
Link: https://github.com/ClangBuiltLinux/linux/issues/1729
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
| |
The way of determin attribute type is just matching
name with the predefined string, do this with strcmp
to simplify the code.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Syzkaller reports slab-out-of-bounds bug as follows:
==================================================================
BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611
[...]
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
run_unpack_ex+0xb0/0x7c0 fs/ntfs3/run.c:1057
ntfs_read_mft fs/ntfs3/inode.c:368 [inline]
ntfs_iget5+0xc20/0x3280 fs/ntfs3/inode.c:501
ntfs_loadlog_and_replay+0x124/0x5d0 fs/ntfs3/fsntfs.c:272
ntfs_fill_super+0x1eff/0x37f0 fs/ntfs3/super.c:1018
get_tree_bdev+0x440/0x760 fs/super.c:1323
vfs_get_tree+0x89/0x2f0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x1326/0x1e20 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
[...]
</TASK>
The buggy address belongs to the physical page:
page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8
head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Kernel will tries to read record and parse MFT from disk in
ntfs_read_mft().
Yet the problem is that during enumerating attributes in record,
kernel doesn't check whether run_off field loading from the disk
is a valid value.
To be more specific, if attr->nres.run_off is larger than attr->size,
kernel will passes an invalid argument run_buf_size in
run_unpack_ex(), which having an integer overflow. Then this invalid
argument will triggers the slab-out-of-bounds Read bug as above.
This patch solves it by adding the sanity check between
the offset to packed runs and attribute size.
link: https://lore.kernel.org/all/0000000000009145fc05e94bd5c3@google.com/#t
Reported-and-tested-by: syzbot+8d6fbb27a6aded64b25b@syzkaller.appspotmail.com
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Though we already have some sanity checks while enumerating attributes,
resident attribute names aren't included. This patch checks the resident
attribute names are in the valid ranges.
[ 259.209031] BUG: KASAN: slab-out-of-bounds in ni_create_attr_list+0x1e1/0x850
[ 259.210770] Write of size 426 at addr ffff88800632f2b2 by task exp/255
[ 259.211551]
[ 259.212035] CPU: 0 PID: 255 Comm: exp Not tainted 6.0.0-rc6 #37
[ 259.212955] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 259.214387] Call Trace:
[ 259.214640] <TASK>
[ 259.214895] dump_stack_lvl+0x49/0x63
[ 259.215284] print_report.cold+0xf5/0x689
[ 259.215565] ? kasan_poison+0x3c/0x50
[ 259.215778] ? kasan_unpoison+0x28/0x60
[ 259.215991] ? ni_create_attr_list+0x1e1/0x850
[ 259.216270] kasan_report+0xa7/0x130
[ 259.216481] ? ni_create_attr_list+0x1e1/0x850
[ 259.216719] kasan_check_range+0x15a/0x1d0
[ 259.216939] memcpy+0x3c/0x70
[ 259.217136] ni_create_attr_list+0x1e1/0x850
[ 259.217945] ? __rcu_read_unlock+0x5b/0x280
[ 259.218384] ? ni_remove_attr+0x2e0/0x2e0
[ 259.218712] ? kernel_text_address+0xcf/0xe0
[ 259.219064] ? __kernel_text_address+0x12/0x40
[ 259.219434] ? arch_stack_walk+0x9e/0xf0
[ 259.219668] ? __this_cpu_preempt_check+0x13/0x20
[ 259.219904] ? sysvec_apic_timer_interrupt+0x57/0xc0
[ 259.220140] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 259.220561] ni_ins_attr_ext+0x52c/0x5c0
[ 259.220984] ? ni_create_attr_list+0x850/0x850
[ 259.221532] ? run_deallocate+0x120/0x120
[ 259.221972] ? vfs_setxattr+0x128/0x300
[ 259.222688] ? setxattr+0x126/0x140
[ 259.222921] ? path_setxattr+0x164/0x180
[ 259.223431] ? __x64_sys_setxattr+0x6d/0x80
[ 259.223828] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.224417] ? mi_find_attr+0x3c/0xf0
[ 259.224772] ni_insert_attr+0x1ba/0x420
[ 259.225216] ? ni_ins_attr_ext+0x5c0/0x5c0
[ 259.225504] ? ntfs_read_ea+0x119/0x450
[ 259.225775] ni_insert_resident+0xc0/0x1c0
[ 259.226316] ? ni_insert_nonresident+0x400/0x400
[ 259.227001] ? __kasan_kmalloc+0x88/0xb0
[ 259.227468] ? __kmalloc+0x192/0x320
[ 259.227773] ntfs_set_ea+0x6bf/0xb30
[ 259.228216] ? ftrace_graph_ret_addr+0x2a/0xb0
[ 259.228494] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.228838] ? ntfs_read_ea+0x450/0x450
[ 259.229098] ? is_bpf_text_address+0x24/0x40
[ 259.229418] ? kernel_text_address+0xcf/0xe0
[ 259.229681] ? __kernel_text_address+0x12/0x40
[ 259.229948] ? unwind_get_return_address+0x3a/0x60
[ 259.230271] ? write_profile+0x270/0x270
[ 259.230537] ? arch_stack_walk+0x9e/0xf0
[ 259.230836] ntfs_setxattr+0x114/0x5c0
[ 259.231099] ? ntfs_set_acl_ex+0x2e0/0x2e0
[ 259.231529] ? evm_protected_xattr_common+0x6d/0x100
[ 259.231817] ? posix_xattr_acl+0x13/0x80
[ 259.232073] ? evm_protect_xattr+0x1f7/0x440
[ 259.232351] __vfs_setxattr+0xda/0x120
[ 259.232635] ? xattr_resolve_name+0x180/0x180
[ 259.232912] __vfs_setxattr_noperm+0x93/0x300
[ 259.233219] __vfs_setxattr_locked+0x141/0x160
[ 259.233492] ? kasan_poison+0x3c/0x50
[ 259.233744] vfs_setxattr+0x128/0x300
[ 259.234002] ? __vfs_setxattr_locked+0x160/0x160
[ 259.234837] do_setxattr+0xb8/0x170
[ 259.235567] ? vmemdup_user+0x53/0x90
[ 259.236212] setxattr+0x126/0x140
[ 259.236491] ? do_setxattr+0x170/0x170
[ 259.236791] ? debug_smp_processor_id+0x17/0x20
[ 259.237232] ? kasan_quarantine_put+0x57/0x180
[ 259.237605] ? putname+0x80/0xa0
[ 259.237870] ? __kasan_slab_free+0x11c/0x1b0
[ 259.238234] ? putname+0x80/0xa0
[ 259.238500] ? preempt_count_sub+0x18/0xc0
[ 259.238775] ? __mnt_want_write+0xaa/0x100
[ 259.238990] ? mnt_want_write+0x8b/0x150
[ 259.239290] path_setxattr+0x164/0x180
[ 259.239605] ? setxattr+0x140/0x140
[ 259.239849] ? debug_smp_processor_id+0x17/0x20
[ 259.240174] ? fpregs_assert_state_consistent+0x67/0x80
[ 259.240411] __x64_sys_setxattr+0x6d/0x80
[ 259.240715] do_syscall_64+0x3b/0x90
[ 259.240934] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.241697] RIP: 0033:0x7fc6b26e4469
[ 259.242647] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[ 259.244512] RSP: 002b:00007ffc3c7841f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000bc
[ 259.245086] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6b26e4469
[ 259.246025] RDX: 00007ffc3c784380 RSI: 00007ffc3c7842e0 RDI: 00007ffc3c784238
[ 259.246961] RBP: 00007ffc3c788410 R08: 0000000000000001 R09: 00007ffc3c7884f8
[ 259.247775] R10: 000000000000007f R11: 0000000000000217 R12: 00000000004004e0
[ 259.248534] R13: 00007ffc3c7884f0 R14: 0000000000000000 R15: 0000000000000000
[ 259.249368] </TASK>
[ 259.249644]
[ 259.249888] Allocated by task 255:
[ 259.250283] kasan_save_stack+0x26/0x50
[ 259.250957] __kasan_kmalloc+0x88/0xb0
[ 259.251826] __kmalloc+0x192/0x320
[ 259.252745] ni_create_attr_list+0x11e/0x850
[ 259.253298] ni_ins_attr_ext+0x52c/0x5c0
[ 259.253685] ni_insert_attr+0x1ba/0x420
[ 259.253974] ni_insert_resident+0xc0/0x1c0
[ 259.254311] ntfs_set_ea+0x6bf/0xb30
[ 259.254629] ntfs_setxattr+0x114/0x5c0
[ 259.254859] __vfs_setxattr+0xda/0x120
[ 259.255155] __vfs_setxattr_noperm+0x93/0x300
[ 259.255445] __vfs_setxattr_locked+0x141/0x160
[ 259.255862] vfs_setxattr+0x128/0x300
[ 259.256251] do_setxattr+0xb8/0x170
[ 259.256522] setxattr+0x126/0x140
[ 259.256911] path_setxattr+0x164/0x180
[ 259.257308] __x64_sys_setxattr+0x6d/0x80
[ 259.257637] do_syscall_64+0x3b/0x90
[ 259.257970] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.258550]
[ 259.258772] The buggy address belongs to the object at ffff88800632f000
[ 259.258772] which belongs to the cache kmalloc-1k of size 1024
[ 259.260190] The buggy address is located 690 bytes inside of
[ 259.260190] 1024-byte region [ffff88800632f000, ffff88800632f400)
[ 259.261412]
[ 259.261743] The buggy address belongs to the physical page:
[ 259.262354] page:0000000081e8cac9 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x632c
[ 259.263722] head:0000000081e8cac9 order:2 compound_mapcount:0 compound_pincount:0
[ 259.264284] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 259.265312] raw: 000fffffc0010200 ffffea0000060d00 dead000000000004 ffff888001041dc0
[ 259.265772] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[ 259.266305] page dumped because: kasan: bad access detected
[ 259.266588]
[ 259.266728] Memory state around the buggy address:
[ 259.267225] ffff88800632f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 259.267841] ffff88800632f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 259.269111] >ffff88800632f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 259.269626] ^
[ 259.270162] ffff88800632f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 259.270810] ffff88800632f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
indx_read is called when we have some NTFS directory operations that
need more information from the index buffers. This adds a sanity check
to make sure the returned index buffer length is legit, or we may have
some out-of-bound memory accesses.
[ 560.897595] BUG: KASAN: slab-out-of-bounds in hdr_find_e.isra.0+0x10c/0x320
[ 560.898321] Read of size 2 at addr ffff888009497238 by task exp/245
[ 560.898760]
[ 560.899129] CPU: 0 PID: 245 Comm: exp Not tainted 6.0.0-rc6 #37
[ 560.899505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 560.900170] Call Trace:
[ 560.900407] <TASK>
[ 560.900732] dump_stack_lvl+0x49/0x63
[ 560.901108] print_report.cold+0xf5/0x689
[ 560.901395] ? hdr_find_e.isra.0+0x10c/0x320
[ 560.901716] kasan_report+0xa7/0x130
[ 560.901950] ? hdr_find_e.isra.0+0x10c/0x320
[ 560.902208] __asan_load2+0x68/0x90
[ 560.902427] hdr_find_e.isra.0+0x10c/0x320
[ 560.902846] ? cmp_uints+0xe0/0xe0
[ 560.903363] ? cmp_sdh+0x90/0x90
[ 560.903883] ? ntfs_bread_run+0x190/0x190
[ 560.904196] ? rwsem_down_read_slowpath+0x750/0x750
[ 560.904969] ? ntfs_fix_post_read+0xe0/0x130
[ 560.905259] ? __kasan_check_write+0x14/0x20
[ 560.905599] ? up_read+0x1a/0x90
[ 560.905853] ? indx_read+0x22c/0x380
[ 560.906096] indx_find+0x2ef/0x470
[ 560.906352] ? indx_find_buffer+0x2d0/0x2d0
[ 560.906692] ? __kasan_kmalloc+0x88/0xb0
[ 560.906977] dir_search_u+0x196/0x2f0
[ 560.907220] ? ntfs_nls_to_utf16+0x450/0x450
[ 560.907464] ? __kasan_check_write+0x14/0x20
[ 560.907747] ? mutex_lock+0x8f/0xe0
[ 560.907970] ? __mutex_lock_slowpath+0x20/0x20
[ 560.908214] ? kmem_cache_alloc+0x143/0x4b0
[ 560.908459] ntfs_lookup+0xe0/0x100
[ 560.908788] __lookup_slow+0x116/0x220
[ 560.909050] ? lookup_fast+0x1b0/0x1b0
[ 560.909309] ? lookup_fast+0x13f/0x1b0
[ 560.909601] walk_component+0x187/0x230
[ 560.909944] link_path_walk.part.0+0x3f0/0x660
[ 560.910285] ? handle_lookup_down+0x90/0x90
[ 560.910618] ? path_init+0x642/0x6e0
[ 560.911084] ? percpu_counter_add_batch+0x6e/0xf0
[ 560.912559] ? __alloc_file+0x114/0x170
[ 560.913008] path_openat+0x19c/0x1d10
[ 560.913419] ? getname_flags+0x73/0x2b0
[ 560.913815] ? kasan_save_stack+0x3a/0x50
[ 560.914125] ? kasan_save_stack+0x26/0x50
[ 560.914542] ? __kasan_slab_alloc+0x6d/0x90
[ 560.914924] ? kmem_cache_alloc+0x143/0x4b0
[ 560.915339] ? getname_flags+0x73/0x2b0
[ 560.915647] ? getname+0x12/0x20
[ 560.916114] ? __x64_sys_open+0x4c/0x60
[ 560.916460] ? path_lookupat.isra.0+0x230/0x230
[ 560.916867] ? __isolate_free_page+0x2e0/0x2e0
[ 560.917194] do_filp_open+0x15c/0x1f0
[ 560.917448] ? may_open_dev+0x60/0x60
[ 560.917696] ? expand_files+0xa4/0x3a0
[ 560.917923] ? __kasan_check_write+0x14/0x20
[ 560.918185] ? _raw_spin_lock+0x88/0xdb
[ 560.918409] ? _raw_spin_lock_irqsave+0x100/0x100
[ 560.918783] ? _find_next_bit+0x4a/0x130
[ 560.919026] ? _raw_spin_unlock+0x19/0x40
[ 560.919276] ? alloc_fd+0x14b/0x2d0
[ 560.919635] do_sys_openat2+0x32a/0x4b0
[ 560.920035] ? file_open_root+0x230/0x230
[ 560.920336] ? __rcu_read_unlock+0x5b/0x280
[ 560.920813] do_sys_open+0x99/0xf0
[ 560.921208] ? filp_open+0x60/0x60
[ 560.921482] ? exit_to_user_mode_prepare+0x49/0x180
[ 560.921867] __x64_sys_open+0x4c/0x60
[ 560.922128] do_syscall_64+0x3b/0x90
[ 560.922369] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 560.923030] RIP: 0033:0x7f7dff2e4469
[ 560.923681] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[ 560.924451] RSP: 002b:00007ffd41a210b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000002
[ 560.925168] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dff2e4469
[ 560.925655] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd41a211f0
[ 560.926085] RBP: 00007ffd41a252a0 R08: 00007f7dff60fba0 R09: 00007ffd41a25388
[ 560.926405] R10: 0000000000400b80 R11: 0000000000000206 R12: 00000000004004e0
[ 560.926867] R13: 00007ffd41a25380 R14: 0000000000000000 R15: 0000000000000000
[ 560.927241] </TASK>
[ 560.927491]
[ 560.927755] Allocated by task 245:
[ 560.928409] kasan_save_stack+0x26/0x50
[ 560.929271] __kasan_kmalloc+0x88/0xb0
[ 560.929778] __kmalloc+0x192/0x320
[ 560.930023] indx_read+0x249/0x380
[ 560.930224] indx_find+0x2a2/0x470
[ 560.930695] dir_search_u+0x196/0x2f0
[ 560.930892] ntfs_lookup+0xe0/0x100
[ 560.931115] __lookup_slow+0x116/0x220
[ 560.931323] walk_component+0x187/0x230
[ 560.931570] link_path_walk.part.0+0x3f0/0x660
[ 560.931791] path_openat+0x19c/0x1d10
[ 560.932008] do_filp_open+0x15c/0x1f0
[ 560.932226] do_sys_openat2+0x32a/0x4b0
[ 560.932413] do_sys_open+0x99/0xf0
[ 560.932709] __x64_sys_open+0x4c/0x60
[ 560.933417] do_syscall_64+0x3b/0x90
[ 560.933776] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 560.934235]
[ 560.934486] The buggy address belongs to the object at ffff888009497000
[ 560.934486] which belongs to the cache kmalloc-512 of size 512
[ 560.935239] The buggy address is located 56 bytes to the right of
[ 560.935239] 512-byte region [ffff888009497000, ffff888009497200)
[ 560.936153]
[ 560.937326] The buggy address belongs to the physical page:
[ 560.938228] page:0000000062a3dfae refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9496
[ 560.939616] head:0000000062a3dfae order:1 compound_mapcount:0 compound_pincount:0
[ 560.940219] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 560.942702] raw: 000fffffc0010200 ffffea0000164f80 dead000000000005 ffff888001041c80
[ 560.943932] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[ 560.944568] page dumped because: kasan: bad access detected
[ 560.945735]
[ 560.946112] Memory state around the buggy address:
[ 560.946870] ffff888009497100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 560.947242] ffff888009497180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 560.947611] >ffff888009497200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 560.947915] ^
[ 560.948249] ffff888009497280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 560.948687] ffff888009497300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
| |
Smatch complains that the "add_bytes" is not to be trusted. Use
size_add() to prevent an integer overflow.
Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although the attribute name length is checked before comparing it to
some common names (e.g., $I30), the offset isn't. This adds a sanity
check for the attribute name offset, guarantee the validity and prevent
possible out-of-bound memory accesses.
[ 191.720056] BUG: unable to handle page fault for address: ffffebde00000008
[ 191.721060] #PF: supervisor read access in kernel mode
[ 191.721586] #PF: error_code(0x0000) - not-present page
[ 191.722079] PGD 0 P4D 0
[ 191.722571] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 191.723179] CPU: 0 PID: 244 Comm: mount Not tainted 6.0.0-rc4 #28
[ 191.723749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 191.724832] RIP: 0010:kfree+0x56/0x3b0
[ 191.725870] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069
[ 191.727375] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286
[ 191.727897] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9
[ 191.728531] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040
[ 191.729183] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000
[ 191.729628] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040
[ 191.730158] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0
[ 191.730645] FS: 00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000
[ 191.731328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 191.731667] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0
[ 191.732568] Call Trace:
[ 191.733231] <TASK>
[ 191.733860] kvfree+0x2c/0x40
[ 191.734632] ni_clear+0x180/0x290
[ 191.735085] ntfs_evict_inode+0x45/0x70
[ 191.735495] evict+0x199/0x280
[ 191.735996] iput.part.0+0x286/0x320
[ 191.736438] iput+0x32/0x50
[ 191.736811] iget_failed+0x23/0x30
[ 191.737270] ntfs_iget5+0x337/0x1890
[ 191.737629] ? ntfs_clear_mft_tail+0x20/0x260
[ 191.738201] ? ntfs_get_block_bmap+0x70/0x70
[ 191.738482] ? ntfs_objid_init+0xf6/0x140
[ 191.738779] ? ntfs_reparse_init+0x140/0x140
[ 191.739266] ntfs_fill_super+0x121b/0x1b50
[ 191.739623] ? put_ntfs+0x1d0/0x1d0
[ 191.739984] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 191.740466] ? put_ntfs+0x1d0/0x1d0
[ 191.740787] ? sb_set_blocksize+0x6a/0x80
[ 191.741272] get_tree_bdev+0x232/0x370
[ 191.741829] ? put_ntfs+0x1d0/0x1d0
[ 191.742669] ntfs_fs_get_tree+0x15/0x20
[ 191.743132] vfs_get_tree+0x4c/0x130
[ 191.743457] path_mount+0x654/0xfe0
[ 191.743938] ? putname+0x80/0xa0
[ 191.744271] ? finish_automount+0x2e0/0x2e0
[ 191.744582] ? putname+0x80/0xa0
[ 191.745053] ? kmem_cache_free+0x1c4/0x440
[ 191.745403] ? putname+0x80/0xa0
[ 191.745616] do_mount+0xd6/0xf0
[ 191.745887] ? path_mount+0xfe0/0xfe0
[ 191.746287] ? __kasan_check_write+0x14/0x20
[ 191.746582] __x64_sys_mount+0xca/0x110
[ 191.746850] do_syscall_64+0x3b/0x90
[ 191.747122] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 191.747517] RIP: 0033:0x7f351fee948a
[ 191.748332] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 191.749341] RSP: 002b:00007ffd51cf3af8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 191.749960] RAX: ffffffffffffffda RBX: 000055b903733060 RCX: 00007f351fee948a
[ 191.750589] RDX: 000055b903733260 RSI: 000055b9037332e0 RDI: 000055b90373bce0
[ 191.751115] RBP: 0000000000000000 R08: 000055b903733280 R09: 0000000000000020
[ 191.751537] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055b90373bce0
[ 191.751946] R13: 000055b903733260 R14: 0000000000000000 R15: 00000000ffffffff
[ 191.752519] </TASK>
[ 191.752782] Modules linked in:
[ 191.753785] CR2: ffffebde00000008
[ 191.754937] ---[ end trace 0000000000000000 ]---
[ 191.755429] RIP: 0010:kfree+0x56/0x3b0
[ 191.755725] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069
[ 191.756744] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286
[ 191.757218] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9
[ 191.757580] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040
[ 191.758016] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000
[ 191.758570] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040
[ 191.758957] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0
[ 191.759317] FS: 00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000
[ 191.759711] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 191.760118] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a sanity check for the i_op pointer of the inode which is
returned after reading Root directory MFT record. We should check the
i_op is valid before trying to create the root dentry, otherwise we may
encounter a NPD while mounting a image with a funny Root directory MFT
record.
[ 114.484325] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 114.484811] #PF: supervisor read access in kernel mode
[ 114.485084] #PF: error_code(0x0000) - not-present page
[ 114.485606] PGD 0 P4D 0
[ 114.485975] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 114.486570] CPU: 0 PID: 237 Comm: mount Tainted: G B 6.0.0-rc4 #28
[ 114.486977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 114.488169] RIP: 0010:d_flags_for_inode+0xe0/0x110
[ 114.488816] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241
[ 114.490326] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296
[ 114.490695] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea
[ 114.490986] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020
[ 114.491364] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05
[ 114.491675] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000
[ 114.491954] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750
[ 114.492397] FS: 00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[ 114.492797] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 114.493150] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0
[ 114.493671] Call Trace:
[ 114.493890] <TASK>
[ 114.494075] __d_instantiate+0x24/0x1c0
[ 114.494505] d_instantiate.part.0+0x35/0x50
[ 114.494754] d_make_root+0x53/0x80
[ 114.494998] ntfs_fill_super+0x1232/0x1b50
[ 114.495260] ? put_ntfs+0x1d0/0x1d0
[ 114.495499] ? vsprintf+0x20/0x20
[ 114.495723] ? set_blocksize+0x95/0x150
[ 114.495964] get_tree_bdev+0x232/0x370
[ 114.496272] ? put_ntfs+0x1d0/0x1d0
[ 114.496502] ntfs_fs_get_tree+0x15/0x20
[ 114.496859] vfs_get_tree+0x4c/0x130
[ 114.497099] path_mount+0x654/0xfe0
[ 114.497507] ? putname+0x80/0xa0
[ 114.497933] ? finish_automount+0x2e0/0x2e0
[ 114.498362] ? putname+0x80/0xa0
[ 114.498571] ? kmem_cache_free+0x1c4/0x440
[ 114.498819] ? putname+0x80/0xa0
[ 114.499069] do_mount+0xd6/0xf0
[ 114.499343] ? path_mount+0xfe0/0xfe0
[ 114.499683] ? __kasan_check_write+0x14/0x20
[ 114.500133] __x64_sys_mount+0xca/0x110
[ 114.500592] do_syscall_64+0x3b/0x90
[ 114.500930] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 114.501294] RIP: 0033:0x7fdc898e948a
[ 114.501542] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 114.502716] RSP: 002b:00007ffd793e58f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 114.503175] RAX: ffffffffffffffda RBX: 0000564b2228f060 RCX: 00007fdc898e948a
[ 114.503588] RDX: 0000564b2228f260 RSI: 0000564b2228f2e0 RDI: 0000564b22297ce0
[ 114.504925] RBP: 0000000000000000 R08: 0000564b2228f280 R09: 0000000000000020
[ 114.505484] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564b22297ce0
[ 114.505823] R13: 0000564b2228f260 R14: 0000000000000000 R15: 00000000ffffffff
[ 114.506562] </TASK>
[ 114.506887] Modules linked in:
[ 114.507648] CR2: 0000000000000008
[ 114.508884] ---[ end trace 0000000000000000 ]---
[ 114.509675] RIP: 0010:d_flags_for_inode+0xe0/0x110
[ 114.510140] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241
[ 114.511762] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296
[ 114.512401] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea
[ 114.513103] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020
[ 114.513512] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05
[ 114.513831] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000
[ 114.514757] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750
[ 114.515411] FS: 00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[ 114.515794] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 114.516208] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ntfs3 file system driver does not convert the target path of
junction points to a proper Linux path. As junction points targets
are always absolute paths (they start with a drive letter), all
junctions will result in broken links.
Translate the targets of junction points to relative paths so they
point to directories inside the mounted volume. Note that Windows
allows junction points to reference directories in another drive.
However, as there is no way to know which drive the junctions refer
to, we assume they always target the same file system they are in.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214833
Signed-off-by: Daniel Pinto <danielpinto52@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
syzbot reported UBSAN error as below:
[ 76.901829][ T6677] ================================================================================
[ 76.903908][ T6677] UBSAN: shift-out-of-bounds in fs/ntfs3/super.c:675:13
[ 76.905363][ T6677] shift exponent -247 is negative
This patch avoid this error.
Link: https://syzkaller.appspot.com/bug?id=b0299c09a14aababf0f1c862dd4ebc8ab9eb0179
Fixes: a3b774342fa7 (fs/ntfs3: validate BOOT sectors_per_clusters)
Cc: Author: Randy Dunlap <rdunlap@infradead.org>
Reported-by: syzbot+35b87c668935bb55e666@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
syzbot reported kmemleak as below:
BUG: memory leak
unreferenced object 0xffff8880122f1540 (size 32):
comm "a.out", pid 6664, jiffies 4294939771 (age 25.500s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 ed ff ed ff 00 00 00 00 ................
backtrace:
[<ffffffff81b16052>] ntfs_init_fs_context+0x22/0x1c0
[<ffffffff8164aaa7>] alloc_fs_context+0x217/0x430
[<ffffffff81626dd4>] path_mount+0x704/0x1080
[<ffffffff81627e7c>] __x64_sys_mount+0x18c/0x1d0
[<ffffffff84593e14>] do_syscall_64+0x34/0xb0
[<ffffffff84600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
This patch fixes this issue by freeing mount options on error path of
ntfs_fill_super().
Reported-by: syzbot+9d67170b20e8f94351c8@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
| |
Prefer using kmalloc_array(a, b) over kmalloc(a * b) as this
improves semantics since kmalloc is intended for allocating an
array of memory.
Signed-off-by: Kenneth Lee <klee33@uw.edu>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug occours due to a misuse of `attr` variable instead of `attr_b`.
`attr` is being initialized as NULL, then being derenfernced
as `attr->res.data_size`.
This bug causes a crash of the ntfs3 driver itself,
If compiled directly to the kernel, it crashes the whole system.
Signed-off-by: Alon Zahavi <zahavi.alon@gmail.com>
Co-developed-by: Tal Lossos <tallossos@gmail.com>
Signed-off-by: Tal Lossos <tallossos@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
| |
ntfs3's dmask and fmask mount options are 16-bit quantities but are displayed
as 1-extended 32-bit values in /proc/mounts. Fix this by circumventing
integer promotion.
Signed-off-by: Marc Aurèle La France <tsi@tuyoix.net>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some metadata files are handled before MFT. This adds a null pointer
check for some corner cases that could lead to NPD while reading these
metadata files for a malformed NTFS image.
[ 240.190827] BUG: kernel NULL pointer dereference, address: 0000000000000158
[ 240.191583] #PF: supervisor read access in kernel mode
[ 240.191956] #PF: error_code(0x0000) - not-present page
[ 240.192391] PGD 0 P4D 0
[ 240.192897] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 240.193805] CPU: 0 PID: 242 Comm: mount Tainted: G B 5.19.0+ #17
[ 240.194477] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 240.195152] RIP: 0010:ni_find_attr+0xae/0x300
[ 240.195679] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f
[ 240.196642] RSP: 0018:ffff88800812f690 EFLAGS: 00000286
[ 240.197019] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a
[ 240.197523] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60
[ 240.197877] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed
[ 240.198292] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000
[ 240.198647] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000
[ 240.199410] FS: 00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[ 240.199895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 240.200314] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0
[ 240.200839] Call Trace:
[ 240.201104] <TASK>
[ 240.201502] ? ni_load_mi+0x80/0x80
[ 240.202297] ? ___slab_alloc+0x465/0x830
[ 240.202614] attr_load_runs_vcn+0x8c/0x1a0
[ 240.202886] ? __kasan_slab_alloc+0x32/0x90
[ 240.203157] ? attr_data_write_resident+0x250/0x250
[ 240.203543] mi_read+0x133/0x2c0
[ 240.203785] mi_get+0x70/0x140
[ 240.204012] ni_load_mi_ex+0xfa/0x190
[ 240.204346] ? ni_std5+0x90/0x90
[ 240.204588] ? __kasan_kmalloc+0x88/0xb0
[ 240.204859] ni_enum_attr_ex+0xf1/0x1c0
[ 240.205107] ? ni_fname_type.part.0+0xd0/0xd0
[ 240.205600] ? ntfs_load_attr_list+0xbe/0x300
[ 240.205864] ? ntfs_cmp_names_cpu+0x125/0x180
[ 240.206157] ntfs_iget5+0x56c/0x1870
[ 240.206510] ? ntfs_get_block_bmap+0x70/0x70
[ 240.206776] ? __kasan_kmalloc+0x88/0xb0
[ 240.207030] ? set_blocksize+0x95/0x150
[ 240.207545] ntfs_fill_super+0xb8f/0x1e20
[ 240.207839] ? put_ntfs+0x1d0/0x1d0
[ 240.208069] ? vsprintf+0x20/0x20
[ 240.208467] ? mutex_unlock+0x81/0xd0
[ 240.208846] ? set_blocksize+0x95/0x150
[ 240.209221] get_tree_bdev+0x232/0x370
[ 240.209804] ? put_ntfs+0x1d0/0x1d0
[ 240.210519] ntfs_fs_get_tree+0x15/0x20
[ 240.210991] vfs_get_tree+0x4c/0x130
[ 240.211455] path_mount+0x645/0xfd0
[ 240.211806] ? putname+0x80/0xa0
[ 240.212112] ? finish_automount+0x2e0/0x2e0
[ 240.212559] ? kmem_cache_free+0x110/0x390
[ 240.212906] ? putname+0x80/0xa0
[ 240.213329] do_mount+0xd6/0xf0
[ 240.213829] ? path_mount+0xfd0/0xfd0
[ 240.214246] ? __kasan_check_write+0x14/0x20
[ 240.214774] __x64_sys_mount+0xca/0x110
[ 240.215080] do_syscall_64+0x3b/0x90
[ 240.215442] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 240.215811] RIP: 0033:0x7f233b4e948a
[ 240.216104] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 240.217615] RSP: 002b:00007fff02211ec8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 240.218718] RAX: ffffffffffffffda RBX: 0000561cdc35b060 RCX: 00007f233b4e948a
[ 240.219556] RDX: 0000561cdc35b260 RSI: 0000561cdc35b2e0 RDI: 0000561cdc363af0
[ 240.219975] RBP: 0000000000000000 R08: 0000561cdc35b280 R09: 0000000000000020
[ 240.220403] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000561cdc363af0
[ 240.220803] R13: 0000561cdc35b260 R14: 0000000000000000 R15: 00000000ffffffff
[ 240.221256] </TASK>
[ 240.221567] Modules linked in:
[ 240.222028] CR2: 0000000000000158
[ 240.223291] ---[ end trace 0000000000000000 ]---
[ 240.223669] RIP: 0010:ni_find_attr+0xae/0x300
[ 240.224058] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f
[ 240.225033] RSP: 0018:ffff88800812f690 EFLAGS: 00000286
[ 240.225968] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a
[ 240.226624] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60
[ 240.227307] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed
[ 240.227816] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000
[ 240.228330] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000
[ 240.228729] FS: 00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[ 240.229281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 240.230298] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds sanity checks for data run offset. We should make sure data
run offset is legit before trying to unpack them, otherwise we may
encounter use-after-free or some unexpected memory access behaviors.
[ 82.940342] BUG: KASAN: use-after-free in run_unpack+0x2e3/0x570
[ 82.941180] Read of size 1 at addr ffff888008a8487f by task mount/240
[ 82.941670]
[ 82.942069] CPU: 0 PID: 240 Comm: mount Not tainted 5.19.0+ #15
[ 82.942482] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 82.943720] Call Trace:
[ 82.944204] <TASK>
[ 82.944471] dump_stack_lvl+0x49/0x63
[ 82.944908] print_report.cold+0xf5/0x67b
[ 82.945141] ? __wait_on_bit+0x106/0x120
[ 82.945750] ? run_unpack+0x2e3/0x570
[ 82.946626] kasan_report+0xa7/0x120
[ 82.947046] ? run_unpack+0x2e3/0x570
[ 82.947280] __asan_load1+0x51/0x60
[ 82.947483] run_unpack+0x2e3/0x570
[ 82.947709] ? memcpy+0x4e/0x70
[ 82.947927] ? run_pack+0x7a0/0x7a0
[ 82.948158] run_unpack_ex+0xad/0x3f0
[ 82.948399] ? mi_enum_attr+0x14a/0x200
[ 82.948717] ? run_unpack+0x570/0x570
[ 82.949072] ? ni_enum_attr_ex+0x1b2/0x1c0
[ 82.949332] ? ni_fname_type.part.0+0xd0/0xd0
[ 82.949611] ? mi_read+0x262/0x2c0
[ 82.949970] ? ntfs_cmp_names_cpu+0x125/0x180
[ 82.950249] ntfs_iget5+0x632/0x1870
[ 82.950621] ? ntfs_get_block_bmap+0x70/0x70
[ 82.951192] ? evict+0x223/0x280
[ 82.951525] ? iput.part.0+0x286/0x320
[ 82.951969] ntfs_fill_super+0x1321/0x1e20
[ 82.952436] ? put_ntfs+0x1d0/0x1d0
[ 82.952822] ? vsprintf+0x20/0x20
[ 82.953188] ? mutex_unlock+0x81/0xd0
[ 82.953379] ? set_blocksize+0x95/0x150
[ 82.954001] get_tree_bdev+0x232/0x370
[ 82.954438] ? put_ntfs+0x1d0/0x1d0
[ 82.954700] ntfs_fs_get_tree+0x15/0x20
[ 82.955049] vfs_get_tree+0x4c/0x130
[ 82.955292] path_mount+0x645/0xfd0
[ 82.955615] ? putname+0x80/0xa0
[ 82.955955] ? finish_automount+0x2e0/0x2e0
[ 82.956310] ? kmem_cache_free+0x110/0x390
[ 82.956723] ? putname+0x80/0xa0
[ 82.957023] do_mount+0xd6/0xf0
[ 82.957411] ? path_mount+0xfd0/0xfd0
[ 82.957638] ? __kasan_check_write+0x14/0x20
[ 82.957948] __x64_sys_mount+0xca/0x110
[ 82.958310] do_syscall_64+0x3b/0x90
[ 82.958719] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 82.959341] RIP: 0033:0x7fd0d1ce948a
[ 82.960193] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 82.961532] RSP: 002b:00007ffe59ff69a8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 82.962527] RAX: ffffffffffffffda RBX: 0000564dcc107060 RCX: 00007fd0d1ce948a
[ 82.963266] RDX: 0000564dcc107260 RSI: 0000564dcc1072e0 RDI: 0000564dcc10fce0
[ 82.963686] RBP: 0000000000000000 R08: 0000564dcc107280 R09: 0000000000000020
[ 82.964272] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564dcc10fce0
[ 82.964785] R13: 0000564dcc107260 R14: 0000000000000000 R15: 00000000ffffffff
Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The offset addition could overflow and pass the used size check given an
attribute with very large size (e.g., 0xffffff7f) while parsing MFT
attributes. This could lead to out-of-bound memory R/W if we try to
access the next attribute derived by Add2Ptr(attr, asize)
[ 32.963847] BUG: unable to handle page fault for address: ffff956a83c76067
[ 32.964301] #PF: supervisor read access in kernel mode
[ 32.964526] #PF: error_code(0x0000) - not-present page
[ 32.964893] PGD 4dc01067 P4D 4dc01067 PUD 0
[ 32.965316] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 32.965727] CPU: 0 PID: 243 Comm: mount Not tainted 5.19.0+ #6
[ 32.966050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 32.966628] RIP: 0010:mi_enum_attr+0x44/0x110
[ 32.967239] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a
[ 32.968101] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283
[ 32.968364] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f
[ 32.968651] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8
[ 32.968963] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f
[ 32.969249] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000
[ 32.969870] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170
[ 32.970655] FS: 00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000
[ 32.971098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 32.971378] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0
[ 32.972098] Call Trace:
[ 32.972842] <TASK>
[ 32.973341] ni_enum_attr_ex+0xda/0xf0
[ 32.974087] ntfs_iget5+0x1db/0xde0
[ 32.974386] ? slab_post_alloc_hook+0x53/0x270
[ 32.974778] ? ntfs_fill_super+0x4c7/0x12a0
[ 32.975115] ntfs_fill_super+0x5d6/0x12a0
[ 32.975336] get_tree_bdev+0x175/0x270
[ 32.975709] ? put_ntfs+0x150/0x150
[ 32.975956] ntfs_fs_get_tree+0x15/0x20
[ 32.976191] vfs_get_tree+0x2a/0xc0
[ 32.976374] ? capable+0x19/0x20
[ 32.976572] path_mount+0x484/0xaa0
[ 32.977025] ? putname+0x57/0x70
[ 32.977380] do_mount+0x80/0xa0
[ 32.977555] __x64_sys_mount+0x8b/0xe0
[ 32.978105] do_syscall_64+0x3b/0x90
[ 32.978830] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 32.979311] RIP: 0033:0x7fdab72e948a
[ 32.980015] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 32.981251] RSP: 002b:00007ffd15b87588 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 32.981832] RAX: ffffffffffffffda RBX: 0000557de0aaf060 RCX: 00007fdab72e948a
[ 32.982234] RDX: 0000557de0aaf260 RSI: 0000557de0aaf2e0 RDI: 0000557de0ab7ce0
[ 32.982714] RBP: 0000000000000000 R08: 0000557de0aaf280 R09: 0000000000000020
[ 32.983046] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000557de0ab7ce0
[ 32.983494] R13: 0000557de0aaf260 R14: 0000000000000000 R15: 00000000ffffffff
[ 32.984094] </TASK>
[ 32.984352] Modules linked in:
[ 32.984753] CR2: ffff956a83c76067
[ 32.985911] ---[ end trace 0000000000000000 ]---
[ 32.986555] RIP: 0010:mi_enum_attr+0x44/0x110
[ 32.987217] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a
[ 32.988232] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283
[ 32.988532] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f
[ 32.988916] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8
[ 32.989356] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f
[ 32.989994] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000
[ 32.990415] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170
[ 32.991011] FS: 00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000
[ 32.991524] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 32.991936] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0
This patch adds an overflow check
Signed-off-by: edward lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the NTFS BOOT record_size field < 0, it represents a
shift value. However, there is no sanity check on the shift result
and the sbi->record_bits calculation through blksize_bits() assumes
the size always > 256, which could lead to NPD while mounting a
malformed NTFS image.
[ 318.675159] BUG: kernel NULL pointer dereference, address: 0000000000000158
[ 318.675682] #PF: supervisor read access in kernel mode
[ 318.675869] #PF: error_code(0x0000) - not-present page
[ 318.676246] PGD 0 P4D 0
[ 318.676502] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 318.676934] CPU: 0 PID: 259 Comm: mount Not tainted 5.19.0 #5
[ 318.677289] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 318.678136] RIP: 0010:ni_find_attr+0x2d/0x1c0
[ 318.678656] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180
[ 318.679848] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246
[ 318.680104] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080
[ 318.680790] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 318.681679] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 318.682577] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080
[ 318.683015] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000
[ 318.683618] FS: 00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000
[ 318.684280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 318.684651] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0
[ 318.685623] Call Trace:
[ 318.686607] <TASK>
[ 318.686872] ? ntfs_alloc_inode+0x1a/0x60
[ 318.687235] attr_load_runs_vcn+0x2b/0xa0
[ 318.687468] mi_read+0xbb/0x250
[ 318.687576] ntfs_iget5+0x114/0xd90
[ 318.687750] ntfs_fill_super+0x588/0x11b0
[ 318.687953] ? put_ntfs+0x130/0x130
[ 318.688065] ? snprintf+0x49/0x70
[ 318.688164] ? put_ntfs+0x130/0x130
[ 318.688256] get_tree_bdev+0x16a/0x260
[ 318.688407] vfs_get_tree+0x20/0xb0
[ 318.688519] path_mount+0x2dc/0x9b0
[ 318.688877] do_mount+0x74/0x90
[ 318.689142] __x64_sys_mount+0x89/0xd0
[ 318.689636] do_syscall_64+0x3b/0x90
[ 318.689998] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 318.690318] RIP: 0033:0x7fd9e133c48a
[ 318.690687] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[ 318.691357] RSP: 002b:00007ffd374406c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 318.691632] RAX: ffffffffffffffda RBX: 0000564d0b051080 RCX: 00007fd9e133c48a
[ 318.691920] RDX: 0000564d0b051280 RSI: 0000564d0b051300 RDI: 0000564d0b0596a0
[ 318.692123] RBP: 0000000000000000 R08: 0000564d0b0512a0 R09: 0000000000000020
[ 318.692349] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564d0b0596a0
[ 318.692673] R13: 0000564d0b051280 R14: 0000000000000000 R15: 00000000ffffffff
[ 318.693007] </TASK>
[ 318.693271] Modules linked in:
[ 318.693614] CR2: 0000000000000158
[ 318.694446] ---[ end trace 0000000000000000 ]---
[ 318.694779] RIP: 0010:ni_find_attr+0x2d/0x1c0
[ 318.694952] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180
[ 318.696042] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246
[ 318.696531] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080
[ 318.698114] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 318.699286] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 318.699795] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080
[ 318.700236] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000
[ 318.700973] FS: 00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000
[ 318.701688] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 318.702190] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0
[ 318.726510] mount (259) used greatest stack depth: 13320 bytes left
This patch adds a sanity check.
Signed-off-by: edward lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
| |
After renaming we don't need to split code in two lines.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
| |
This commit adds mount option and additional functions.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
| |
Many filesystems already use free_inode callback,
so we will use it too from now on.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
|
| |
With this option all files with filename[0] == '.'
will have FILE_ATTRIBUTE_HIDDEN attribute.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
|
|
|
|
| |
This commit adds additional info about CONFIG_NTFS3_64BIT_CLUSTER
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Regression and bug fixes:
- Performance regression fix from 5.18 on a Rasberry Pi
- Fix extent parsing bug which triggers a BUG_ON when a (corrupted)
extent tree has has a non-root node when zero entries.
- Fix a livelock where in the right (wrong) circumstances a large
number of nfsd threads can try to write to a nearly full file
system, and retry for hours(!)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: limit the number of retries after discarding preallocations blocks
ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0
ext4: use buckets for cr 1 block scan instead of rbtree
ext4: use locality group preallocation for small closed files
ext4: make directory inode spreading reflect flexbg size
ext4: avoid unnecessary spreading of allocations among groups
ext4: make mballoc try target group first even with mb_optimize_scan
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch avoids threads live-locking for hours when a large number
threads are competing over the last few free extents as they blocks
getting added and removed from preallocation pools. From our bug
reporter:
A reliable way for triggering this has multiple writers
continuously write() to files when the filesystem is full, while
small amounts of space are freed (e.g. by truncating a large file
-1MiB at a time). In the local filesystem, this can be done by
simply not checking the return code of write (0) and/or the error
(ENOSPACE) that is set. Over NFS with an async mount, even clients
with proper error checking will behave this way since the linux NFS
client implementation will not propagate the server errors [the
write syscalls immediately return success] until the file handle is
closed. This leads to a situation where NFS clients send a
continuous stream of WRITE rpcs which result in ERRNOSPACE -- but
since the client isn't seeing this, the stream of writes continues
at maximum network speed.
When some space does appear, multiple writers will all attempt to
claim it for their current write. For NFS, we may see dozens to
hundreds of threads that do this.
The real-world scenario of this is database backup tooling (in
particular, github.com/mdkent/percona-xtrabackup) which may write
large files (>1TiB) to NFS for safe keeping. Some temporary files
are written, rewound, and read back -- all before closing the file
handle (the temp file is actually unlinked, to trigger automatic
deletion on close/crash.) An application like this operating on an
async NFS mount will not see an error code until TiB have been
written/read.
The lockup was observed when running this database backup on large
filesystems (64 TiB in this case) with a high number of block
groups and no free space. Fragmentation is generally not a factor
in this filesystem (~thousands of large files, mostly contiguous
except for the parts written while the filesystem is at capacity.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When walking through an inode extents, the ext4_ext_binsearch_idx() function
assumes that the extent header has been previously validated. However, there
are no checks that verify that the number of entries (eh->eh_entries) is
non-zero when depth is > 0. And this will lead to problems because the
EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:
[ 135.245946] ------------[ cut here ]------------
[ 135.247579] kernel BUG at fs/ext4/extents.c:2258!
[ 135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
[ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
[ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
[ 135.256475] Code:
[ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
[ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
[ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
[ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
[ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
[ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
[ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
[ 135.277952] Call Trace:
[ 135.278635] <TASK>
[ 135.279247] ? preempt_count_add+0x6d/0xa0
[ 135.280358] ? percpu_counter_add_batch+0x55/0xb0
[ 135.281612] ? _raw_read_unlock+0x18/0x30
[ 135.282704] ext4_map_blocks+0x294/0x5a0
[ 135.283745] ? xa_load+0x6f/0xa0
[ 135.284562] ext4_mpage_readpages+0x3d6/0x770
[ 135.285646] read_pages+0x67/0x1d0
[ 135.286492] ? folio_add_lru+0x51/0x80
[ 135.287441] page_cache_ra_unbounded+0x124/0x170
[ 135.288510] filemap_get_pages+0x23d/0x5a0
[ 135.289457] ? path_openat+0xa72/0xdd0
[ 135.290332] filemap_read+0xbf/0x300
[ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40
[ 135.292192] new_sync_read+0x103/0x170
[ 135.293014] vfs_read+0x15d/0x180
[ 135.293745] ksys_read+0xa1/0xe0
[ 135.294461] do_syscall_64+0x3c/0x80
[ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0
This patch simply adds an extra check in __ext4_ext_check(), verifying that
eh_entries is not 0 when eh_depth is > 0.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
Cc: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Using rbtree for sorting groups by average fragment size is relatively
expensive (needs rbtree update on every block freeing or allocation) and
leads to wide spreading of allocations because selection of block group
is very sentitive both to changes in free space and amount of blocks
allocated. Furthermore selecting group with the best matching average
fragment size is not necessary anyway, even more so because the
variability of fragment sizes within a group is likely large so average
is not telling much. We just need a group with large enough average
fragment size so that we have high probability of finding large enough
free extent and we don't want average fragment size to be too big so
that we are likely to find free extent only somewhat larger than what we
need.
So instead of maintaing rbtree of groups sorted by fragment size keep
bins (lists) or groups where average fragment size is in the interval
[2^i, 2^(i+1)). This structure requires less updates on block allocation
/ freeing, generally avoids chaotic spreading of allocations into block
groups, and still is able to quickly (even faster that the rbtree)
provide a block group which is likely to have a suitably sized free
space extent.
This patch reduces number of block groups used when untarring archive
with medium sized files (size somewhat above 64k which is default
mballoc limit for avoiding locality group preallocation) to about half
and thus improves write speeds for eMMC flash significantly.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-5-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Curently we don't use any preallocation when a file is already closed
when allocating blocks (from writeback code when converting delayed
allocation). However for small files, using locality group preallocation
is actually desirable as that is not specific to a particular file.
Rather it is a method to pack small files together to reduce
fragmentation and for that the fact the file is closed is actually even
stronger hint the file would benefit from packing. So change the logic
to allow locality group preallocation in this case.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently the Orlov inode allocator searches for free inodes for a
directory only in flex block groups with at most inodes_per_group/16
more directory inodes than average per flex block group. However with
growing size of flex block group this becomes unnecessarily strict.
Scale allowed difference from average directory count per flex block
group with flex block group size as we do with other metrics.
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
mb_set_largest_free_order() updates lists containing groups with largest
chunk of free space of given order. The way it updates it leads to
always moving the group to the tail of the list. Thus allocations
looking for free space of given order effectively end up cycling through
all groups (and due to initialization in last to first order). This
spreads allocations among block groups which reduces performance for
rotating disks or low-end flash media. Change
mb_set_largest_free_order() to only update lists if the order of the
largest free chunk in the group changed.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
One of the side-effects of mb_optimize_scan was that the optimized
functions to select next group to try were called even before we tried
the goal group. As a result we no longer allocate files close to
corresponding inodes as well as we don't try to expand currently
allocated extent in the same group. This results in reaim regression
with workfile.disk workload of upto 8% with many clients on my test
machine:
baseline mb_optimize_scan
Hmean disk-1 2114.16 ( 0.00%) 2099.37 ( -0.70%)
Hmean disk-41 87794.43 ( 0.00%) 83787.47 * -4.56%*
Hmean disk-81 148170.73 ( 0.00%) 135527.05 * -8.53%*
Hmean disk-121 177506.11 ( 0.00%) 166284.93 * -6.32%*
Hmean disk-161 220951.51 ( 0.00%) 207563.39 * -6.06%*
Hmean disk-201 208722.74 ( 0.00%) 203235.59 ( -2.63%)
Hmean disk-241 222051.60 ( 0.00%) 217705.51 ( -1.96%)
Hmean disk-281 252244.17 ( 0.00%) 241132.72 * -4.41%*
Hmean disk-321 255844.84 ( 0.00%) 245412.84 * -4.08%*
Also this is causing huge regression (time increased by a factor of 5 or
so) when untarring archive with lots of small files on some eMMC storage
cards.
Fix the problem by making sure we try goal group first.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/20220727105123.ckwrhbilzrxqpt24@quack3/
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull NVDIMM and DAX fixes from Dan Williams:
"A recently discovered one-line fix for devdax that further addresses a
v5.5 regression, and (a bit embarrassing) a small batch of fixes that
have been sitting in my fixes tree for weeks.
The older fixes have soaked in linux-next during that time and address
an fsdax infinite loop and some other minor fixups.
- Fix a infinite loop bug in fsdax
- Fix memory-type detection for devdax (EINJ regression)
- Small cleanups"
* tag 'dax-and-nvdimm-fixes-v6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
devdax: Fix soft-reservation memory description
fsdax: Fix infinite loop in dax_iomap_rw()
nvdimm/namespace: drop nested variable in create_namespace_pmem()
ndtest: Cleanup all of blk namespace specific code
pmem: fix a name collision
|
| |\ \
| | | |
| | | |
| | | |
| | | | |
Pick up another "Soft Reservation" fix for v6.0-final on top of some
straggling nvdimm fixes that missed v5.19.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
I got an infinite loop and a WARNING report when executing a tail command
in virtiofs.
WARNING: CPU: 10 PID: 964 at fs/iomap/iter.c:34 iomap_iter+0x3a2/0x3d0
Modules linked in:
CPU: 10 PID: 964 Comm: tail Not tainted 5.19.0-rc7
Call Trace:
<TASK>
dax_iomap_rw+0xea/0x620
? __this_cpu_preempt_check+0x13/0x20
fuse_dax_read_iter+0x47/0x80
fuse_file_read_iter+0xae/0xd0
new_sync_read+0xfe/0x180
? 0xffffffff81000000
vfs_read+0x14d/0x1a0
ksys_read+0x6d/0xf0
__x64_sys_read+0x1a/0x20
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The tail command will call read() with a count of 0. In this case,
iomap_iter() will report this WARNING, and always return 1 which casuing
the infinite loop in dax_iomap_rw().
Fixing by checking count whether is 0 in dax_iomap_rw().
Fixes: ca289e0b95af ("fsdax: switch dax_iomap_rw to use iomap_iter")
Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220725032050.3873372-1-lijinlin3@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fix from Namjae Jeon:
- fix integer overflow on large partitions
* tag 'exfat-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix overflow for large capacity partition
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Using int type for sector index, there will be overflow in a large
capacity partition.
For example, if storage with sector size of 512 bytes and partition
capacity is larger than 2TB, there will be overflow.
Fixes: 1b6138385499 ("exfat: reduce block requests when zeroing a cluster")
Cc: stable@vger.kernel.org # v5.19+
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Andy Wu <Andy.Wu@sony.com>
Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- two fixes for hangs in the umount sequence where threads depend on
each other and the work must be finished in the right order
- in zoned mode, wait for flushing all block group metadata IO before
finishing the zone
* tag 'for-6.0-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: wait for extent buffer IOs before finishing a zone
btrfs: fix hang during unmount when stopping a space reclaim worker
btrfs: fix hang during unmount when stopping block group reclaim worker
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Before sending REQ_OP_ZONE_FINISH to a zone, we need to ensure that
ongoing IOs already finished. Or, we will see a "Zone Is Full" error for
the IOs, as the ZONE_FINISH command makes the zone full.
We ensure that with btrfs_wait_block_group_reservations() and
btrfs_wait_ordered_roots() for a data block group. And, for a metadata
block group, the comparison of alloc_offset vs meta_write_pointer mostly
ensures IOs for the allocated region already sent. However, there still
can be a little time frame where the IOs are sent but not yet completed.
Introduce wait_eb_writebacks() to ensure such IOs are completed for a
metadata block group. It walks the buffer_radix to find extent buffers in
the block group and calls wait_on_extent_buffer_writeback() on them.
Fixes: afba2bc036b0 ("btrfs: zoned: implement active zone tracking")
CC: stable@vger.kernel.org # 5.19+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Often when running generic/562 from fstests we can hang during unmount,
resulting in a trace like this:
Sep 07 11:52:00 debian9 unknown: run fstests generic/562 at 2022-09-07 11:52:00
Sep 07 11:55:32 debian9 kernel: INFO: task umount:49438 blocked for more than 120 seconds.
Sep 07 11:55:32 debian9 kernel: Not tainted 6.0.0-rc2-btrfs-next-122 #1
Sep 07 11:55:32 debian9 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 07 11:55:32 debian9 kernel: task:umount state:D stack: 0 pid:49438 ppid: 25683 flags:0x00004000
Sep 07 11:55:32 debian9 kernel: Call Trace:
Sep 07 11:55:32 debian9 kernel: <TASK>
Sep 07 11:55:32 debian9 kernel: __schedule+0x3c8/0xec0
Sep 07 11:55:32 debian9 kernel: ? rcu_read_lock_sched_held+0x12/0x70
Sep 07 11:55:32 debian9 kernel: schedule+0x5d/0xf0
Sep 07 11:55:32 debian9 kernel: schedule_timeout+0xf1/0x130
Sep 07 11:55:32 debian9 kernel: ? lock_release+0x224/0x4a0
Sep 07 11:55:32 debian9 kernel: ? lock_acquired+0x1a0/0x420
Sep 07 11:55:32 debian9 kernel: ? trace_hardirqs_on+0x2c/0xd0
Sep 07 11:55:32 debian9 kernel: __wait_for_common+0xac/0x200
Sep 07 11:55:32 debian9 kernel: ? usleep_range_state+0xb0/0xb0
Sep 07 11:55:32 debian9 kernel: __flush_work+0x26d/0x530
Sep 07 11:55:32 debian9 kernel: ? flush_workqueue_prep_pwqs+0x140/0x140
Sep 07 11:55:32 debian9 kernel: ? trace_clock_local+0xc/0x30
Sep 07 11:55:32 debian9 kernel: __cancel_work_timer+0x11f/0x1b0
Sep 07 11:55:32 debian9 kernel: ? close_ctree+0x12b/0x5b3 [btrfs]
Sep 07 11:55:32 debian9 kernel: ? __trace_bputs+0x10b/0x170
Sep 07 11:55:32 debian9 kernel: close_ctree+0x152/0x5b3 [btrfs]
Sep 07 11:55:32 debian9 kernel: ? evict_inodes+0x166/0x1c0
Sep 07 11:55:32 debian9 kernel: generic_shutdown_super+0x71/0x120
Sep 07 11:55:32 debian9 kernel: kill_anon_super+0x14/0x30
Sep 07 11:55:32 debian9 kernel: btrfs_kill_super+0x12/0x20 [btrfs]
Sep 07 11:55:32 debian9 kernel: deactivate_locked_super+0x2e/0xa0
Sep 07 11:55:32 debian9 kernel: cleanup_mnt+0x100/0x160
Sep 07 11:55:32 debian9 kernel: task_work_run+0x59/0xa0
Sep 07 11:55:32 debian9 kernel: exit_to_user_mode_prepare+0x1a6/0x1b0
Sep 07 11:55:32 debian9 kernel: syscall_exit_to_user_mode+0x16/0x40
Sep 07 11:55:32 debian9 kernel: do_syscall_64+0x48/0x90
Sep 07 11:55:32 debian9 kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Sep 07 11:55:32 debian9 kernel: RIP: 0033:0x7fcde59a57a7
Sep 07 11:55:32 debian9 kernel: RSP: 002b:00007ffe914217c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Sep 07 11:55:32 debian9 kernel: RAX: 0000000000000000 RBX: 00007fcde5ae8264 RCX: 00007fcde59a57a7
Sep 07 11:55:32 debian9 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055b57556cdd0
Sep 07 11:55:32 debian9 kernel: RBP: 000055b57556cba0 R08: 0000000000000000 R09: 00007ffe91420570
Sep 07 11:55:32 debian9 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Sep 07 11:55:32 debian9 kernel: R13: 000055b57556cdd0 R14: 000055b57556ccb8 R15: 0000000000000000
Sep 07 11:55:32 debian9 kernel: </TASK>
What happens is the following:
1) The cleaner kthread tries to start a transaction to delete an unused
block group, but the metadata reservation can not be satisfied right
away, so a reservation ticket is created and it starts the async
metadata reclaim task (fs_info->async_reclaim_work);
2) Writeback for all the filler inodes with an i_size of 2K starts
(generic/562 creates a lot of 2K files with the goal of filling
metadata space). We try to create an inline extent for them, but we
fail when trying to insert the inline extent with -ENOSPC (at
cow_file_range_inline()) - since this is not critical, we fallback
to non-inline mode (back to cow_file_range()), reserve extents, create
extent maps and create the ordered extents;
3) An unmount starts, enters close_ctree();
4) The async reclaim task is flushing stuff, entering the flush states one
by one, until it reaches RUN_DELAYED_IPUTS. There it runs all current
delayed iputs.
After running the delayed iputs and before calling
btrfs_wait_on_delayed_iputs(), one or more ordered extents complete,
and btrfs_add_delayed_iput() is called for each one through
btrfs_finish_ordered_io() -> btrfs_put_ordered_extent(). This results
in bumping fs_info->nr_delayed_iputs from 0 to some positive value.
So the async reclaim task blocks at btrfs_wait_on_delayed_iputs() waiting
for fs_info->nr_delayed_iputs to become 0;
5) The current transaction is committed by the transaction kthread, we then
start unpinning extents and end up calling btrfs_try_granting_tickets()
through unpin_extent_range(), since we released some space.
This results in satisfying the ticket created by the cleaner kthread at
step 1, waking up the cleaner kthread;
6) At close_ctree() we ask the cleaner kthread to park;
7) The cleaner kthread starts the transaction, deletes the unused block
group, and then calls kthread_should_park(), which returns true, so it
parks. And at this point we have the delayed iputs added by the
completion of the ordered extents still pending;
8) Then later at close_ctree(), when we call:
cancel_work_sync(&fs_info->async_reclaim_work);
We hang forever, since the cleaner was parked and no one else can run
delayed iputs after that, while the reclaim task is waiting for the
remaining delayed iputs to be completed.
Fix this by waiting for all ordered extents to complete and running the
delayed iputs before attempting to stop the async reclaim tasks. Note that
we can not wait for ordered extents with btrfs_wait_ordered_roots() (or
other similar functions) because that waits for the BTRFS_ORDERED_COMPLETE
flag to be set on an ordered extent, but the delayed iput is added after
that, when doing the final btrfs_put_ordered_extent(). So instead wait for
the work queues used for executing ordered extent completion to be empty,
which works because we do the final put on an ordered extent at
btrfs_finish_ordered_io() (while we are in the unmount context).
Fixes: d6fd0ae25c6495 ("Btrfs: fix missing delayed iputs on unmount")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
During early unmount, at close_ctree(), we try to stop the block group
reclaim task with cancel_work_sync(), but that may hang if the block group
reclaim task is currently at btrfs_relocate_block_group() waiting for the
flag BTRFS_FS_UNFINISHED_DROPS to be cleared from fs_info->flags. During
unmount we only clear that flag later, after trying to stop the block
group reclaim task.
Fix that by clearing BTRFS_FS_UNFINISHED_DROPS before trying to stop the
block group reclaim task and after setting BTRFS_FS_CLOSING_START, so that
if the reclaim task is waiting on that bit, it will stop immediately after
being woken, because it sees the filesystem is closing (with a call to
btrfs_fs_closing()), and then returns immediately with -EINTR.
Fixes: 31e70e527806c5 ("btrfs: fix hang during unmount when block group reclaim task is running")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs fix from Christian Brauner:
"Beginning of the merge window we introduced the vfs{g,u}id_t types in
b27c82e12965 ("attr: port attribute changes to new types") and changed
various codepaths over including chown_common().
When userspace passes -1 for an ownership change the ownership fields
in struct iattr stay uninitialized. Usually this is fine because any
code making use of any fields in struct iattr must check the
->ia_valid field whether the value of interest has been initialized.
That's true for all struct iattr passing code.
However, over the course of the last year with more heavy use of KMSAN
we found quite a few places that got this wrong. A recent one I fixed
was 3cb6ee991496 ("9p: only copy valid iattrs in 9P2000.L setattr
implementation").
But we also have LSM hooks. Actually we have two. The first one is
security_inode_setattr() in notify_change() which does the right thing
and passes the full struct iattr down to LSMs and thus LSMs can check
whether it is initialized.
But then we also have security_path_chown() which passes down a path
argument and the target ownership as the filesystem would see it. For
the latter we now generate the target values based on struct iattr and
pass it down. However, when userspace passes -1 then struct iattr
isn't initialized.
This patch simply initializes ->ia_vfs{g,u}id with INVALID_VFS{G,U}ID
so the hook continue to see invalid ownership when -1 is passed from
userspace. The only LSM that cares about the actual values is Tomoyo.
The vfs codepaths don't look at these fields without ->ia_valid being
set so there's no harm in initializing ->ia_vfs{g,u}id. Arguably this
is also safer since we can't end up copying valid ownership values
when invalid ownership values should be passed.
This only affects mainline. No kernel has been released with this and
thus no backport is needed. The commit is thus marked with a Fixes:
tag but annotated with "# mainline only" (I didn't quite remember what
Greg said about how to tell stable autoselect to not bother with fixes
for mainline only)"
* tag 'fs.fixes.v6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
open: always initialize ownership fields
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Beginning of the merge window we introduced the vfs{g,u}id_t types in
b27c82e12965 ("attr: port attribute changes to new types") and changed
various codepaths over including chown_common().
During that change we forgot to account for the case were the passed
ownership value is -1. In this case the ownership fields in struct iattr
aren't initialized but we rely on them being initialized by the time we
generate the ownership to pass down to the LSMs. All the major LSMs
don't care about the ownership values at all. Only Tomoyo uses them and
so it took a while for syzbot to unearth this issue.
Fix this by initializing the ownership fields and do it within the
retry_deleg block. While notify_change() doesn't alter the ownership
fields currently we shouldn't rely on it.
Since no kernel has been released with these changes this does not
needed to be backported to any stable kernels.
[Christian Brauner (Microsoft) <brauner@kernel.org>]
* rewrote commit message
* use INVALID_VFS{G,U}ID macros
Fixes: b27c82e12965 ("attr: port attribute changes to new types") # mainline only
Reported-and-tested-by: syzbot+541e21dcc32c4046cba9@syzkaller.appspotmail.com
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|