linux-stable.git - Linux kernel stable tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	Linux 5.15.62v5.15.62	Greg Kroah-Hartman	2022-08-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Link: https://lore.kernel.org/r/20220819153711.658766010@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Ron Economos <re@w6rz.net> Link: https://lore.kernel.org/r/20220820182309.607584465@linuxfoundation.org Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()	Qu Wenruo	2022-08-21	1	-13/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit f6065f8edeb25f4a9dfe0b446030ad995a84a088 upstream. [BUG] There is a small workload which will always fail with recent kernel: (A simplified version from btrfs/125 test case) mkfs.btrfs -f -m raid5 -d raid5 -b 1G $dev1 $dev2 $dev3 mount $dev1 $mnt xfs_io -f -c "pwrite -S 0xee 0 1M" $mnt/file1 sync umount $mnt btrfs dev scan -u $dev3 mount -o degraded $dev1 $mnt xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2 umount $mnt btrfs dev scan mount $dev1 $mnt btrfs balance start --full-balance $mnt umount $mnt The failure is always failed to read some tree blocks: BTRFS info (device dm-4): relocating block group 217710592 flags data\|raid5 BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7 BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7 ... [CAUSE] With the recently added debug output, we can see all RAID56 operations related to full stripe 38928384: 56.1183: raid56_read_partial: full_stripe=38928384 devid=2 type=DATA1 offset=0 opf=0x0 physical=9502720 len=65536 56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=16384 opf=0x0 physical=9519104 len=16384 56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x0 physical=9551872 len=16384 56.1187: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=0 opf=0x1 physical=9502720 len=16384 56.1188: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=32768 opf=0x1 physical=9535488 len=16384 56.1188: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=0 opf=0x1 physical=30474240 len=16384 56.1189: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=32768 opf=0x1 physical=30507008 len=16384 56.1218: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x1 physical=9551872 len=16384 56.1219: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=49152 opf=0x1 physical=30523392 len=16384 56.2721: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 56.2723: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 56.2724: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 Before we enter raid56_parity_recover(), we have triggered some metadata write for the full stripe 38928384, this leads to us to read all the sectors from disk. Furthermore, btrfs raid56 write will cache its calculated P/Q sectors to avoid unnecessary read. This means, for that full stripe, after any partial write, we will have stale data, along with P/Q calculated using that stale data. Thankfully due to patch "btrfs: only write the sectors in the vertical stripe which has data stripes" we haven't submitted all the corrupted P/Q to disk. When we really need to recover certain range, aka in raid56_parity_recover(), we will use the cached rbio, along with its cached sectors (the full stripe is all cached). This explains why we have no event raid56_scrub_read_recover() triggered. Since we have the cached P/Q which is calculated using the stale data, the recovered one will just be stale. In our particular test case, it will always return the same incorrect metadata, thus causing the same error message "parent transid verify failed on 39010304 wanted 9 found 7" again and again. [BTRFS DESTRUCTIVE RMW PROBLEM] Test case btrfs/125 (and above workload) always has its trouble with the destructive read-modify-write (RMW) cycle: 0 32K 64K Data1: \| Good \| Good \| Data2: \| Bad \| Bad \| Parity: \| Good \| Good \| In above case, if we trigger any write into Data1, we will use the bad data in Data2 to re-generate parity, killing the only chance to recovery Data2, thus Data2 is lost forever. This destructive RMW cycle is not specific to btrfs RAID56, but there are some btrfs specific behaviors making the case even worse: - Btrfs will cache sectors for unrelated vertical stripes. In above example, if we're only writing into 0~32K range, btrfs will still read data range (32K ~ 64K) of Data1, and (64K~128K) of Data2. This behavior is to cache sectors for later update. Incidentally commit d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible") has a bug which makes RAID56 to never trust the cached sectors, thus slightly improve the situation for recovery. Unfortunately, follow up fix "btrfs: update stripe_sectors::uptodate in steal_rbio" will revert the behavior back to the old one. - Btrfs raid56 partial write will update all P/Q sectors and cache them This means, even if data at (64K ~ 96K) of Data2 is free space, and only (96K ~ 128K) of Data2 is really stale data. And we write into that (96K ~ 128K), we will update all the parity sectors for the full stripe. This unnecessary behavior will completely kill the chance of recovery. Thankfully, an unrelated optimization "btrfs: only write the sectors in the vertical stripe which has data stripes" will prevent submitting the write bio for untouched vertical sectors. That optimization will keep the on-disk P/Q untouched for a chance for later recovery. [FIX] Although we have no good way to completely fix the destructive RMW (unless we go full scrub for each partial write), we can still limit the damage. With patch "btrfs: only write the sectors in the vertical stripe which has data stripes" now we won't really submit the P/Q of unrelated vertical stripes, so the on-disk P/Q should still be fine. Now we really need to do is just drop all the cached sectors when doing recovery. By this, we have a chance to read the original P/Q from disk, and have a chance to recover the stale data, while still keep the cache to speed up regular write path. In fact, just dropping all the cache for recovery path is good enough to allow the test case btrfs/125 along with the small script to pass reliably. The lack of metadata write after the degraded mount, and forced metadata COW is saving us this time. So this patch will fix the behavior by not trust any cache in __raid56_parity_recover(), to solve the problem while still keep the cache useful. But please note that this test pass DOES NOT mean we have solved the destructive RMW problem, we just do better damage control a little better. Related patches: - btrfs: only write the sectors in the vertical stripe - d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible") - btrfs: update stripe_sectors::uptodate in steal_rbio Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	btrfs: only write the sectors in the vertical stripe which has data stripes	Qu Wenruo	2022-08-21	1	-4/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit bd8f7e627703ca5707833d623efcd43f104c7b3f upstream. If we have only 8K partial write at the beginning of a full RAID56 stripe, we will write the following contents: 0 8K 32K 64K Disk 1 (data): \|XX\| \| \| Disk 2 (data): \| \| \| Disk 3 (parity): \|XXXXXXXXXXXXXXX\|XXXXXXXXXXXXXXX\| \|X\| means the sector will be written back to disk. Note that, although we won't write any sectors from disk 2, but we will write the full 64KiB of parity to disk. This behavior is fine for now, but not for the future (especially for RAID56J, as we waste quite some space to journal the unused parity stripes). So here we will also utilize the btrfs_raid_bio::dbitmap, anytime we queue a higher level bio into an rbio, we will update rbio::dbitmap to indicate which vertical stripes we need to writeback. And at finish_rmw(), we also check dbitmap to see if we need to write any sector in the vertical stripe. So after the patch, above example will only lead to the following writeback pattern: 0 8K 32K 64K Disk 1 (data): \|XX\| \| \| Disk 2 (data): \| \| \| Disk 3 (parity): \|XX\| \| \| Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	x86/ftrace: Use alternative RET encoding	Peter Zijlstra	2022-08-21	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c upstream. Use the return thunk in ftrace trampolines, if needed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> [cascardo: use memcpy(text_gen_insn) as there is no __text_gen_insn] Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	x86/ibt,ftrace: Make function-graph play nice	Peter Zijlstra	2022-08-21	2	-11/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit e52fc2cf3f662828cc0d51c4b73bed73ad275fce upstream. Return trampoline must not use indirect branch to return; while this preserves the RSB, it is fundamentally incompatible with IBT. Instead use a retpoline like ROP gadget that defeats IBT while not unbalancing the RSB. And since ftrace_stub is no longer a plain RET, don't use it to copy from. Since RET is a trivial instruction, poke it directly. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org [cascardo: remove ENDBR] Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Revert "x86/ftrace: Use alternative RET encoding"	Thadeu Lima de Souza Cascardo	2022-08-21	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit e54fcb0812faebd147de72bd37ad87cc4951c68c. This temporarily reverts the backport of upstream commit 1f001e9da6bbf482311e45e48f53c2bd2179e59c. It was not correct to copy the ftrace stub as it would contain a relative jump to the return thunk which would not apply to the context where it was being copied to, leading to ftrace support to be broken. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ksmbd: fix heap-based overflow in set_ntacl_dacl()	Namjae Jeon	2022-08-21	4	-57/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8f0541186e9ad1b62accc9519cc2b7a7240272a7 upstream. The testcase use SMB2_SET_INFO_HE command to set a malformed file attribute under the label `security.NTACL`. SMB2_QUERY_INFO_HE command in testcase trigger the following overflow. [ 4712.003781] ================================================================== [ 4712.003790] BUG: KASAN: slab-out-of-bounds in build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.003807] Write of size 1060 at addr ffff88801e34c068 by task kworker/0:0/4190 [ 4712.003813] CPU: 0 PID: 4190 Comm: kworker/0:0 Not tainted 5.19.0-rc5 #1 [ 4712.003850] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 4712.003867] Call Trace: [ 4712.003870] <TASK> [ 4712.003873] dump_stack_lvl+0x49/0x5f [ 4712.003935] print_report.cold+0x5e/0x5cf [ 4712.003972] ? ksmbd_vfs_get_sd_xattr+0x16d/0x500 [ksmbd] [ 4712.003984] ? cmp_map_id+0x200/0x200 [ 4712.003988] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004000] kasan_report+0xaa/0x120 [ 4712.004045] ? build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004056] kasan_check_range+0x100/0x1e0 [ 4712.004060] memcpy+0x3c/0x60 [ 4712.004064] build_sec_desc+0x842/0x1dd0 [ksmbd] [ 4712.004076] ? parse_sec_desc+0x580/0x580 [ksmbd] [ 4712.004088] ? ksmbd_acls_fattr+0x281/0x410 [ksmbd] [ 4712.004099] smb2_query_info+0xa8f/0x6110 [ksmbd] [ 4712.004111] ? psi_group_change+0x856/0xd70 [ 4712.004148] ? update_load_avg+0x1c3/0x1af0 [ 4712.004152] ? asym_cpu_capacity_scan+0x5d0/0x5d0 [ 4712.004157] ? xas_load+0x23/0x300 [ 4712.004162] ? smb2_query_dir+0x1530/0x1530 [ksmbd] [ 4712.004173] ? _raw_spin_lock_bh+0xe0/0xe0 [ 4712.004179] handle_ksmbd_work+0x30e/0x1020 [ksmbd] [ 4712.004192] process_one_work+0x778/0x11c0 [ 4712.004227] ? _raw_spin_lock_irq+0x8e/0xe0 [ 4712.004231] worker_thread+0x544/0x1180 [ 4712.004234] ? __cpuidle_text_end+0x4/0x4 [ 4712.004239] kthread+0x282/0x320 [ 4712.004243] ? process_one_work+0x11c0/0x11c0 [ 4712.004246] ? kthread_complete_and_exit+0x30/0x30 [ 4712.004282] ret_from_fork+0x1f/0x30 This patch add the buffer validation for security descriptor that is stored by malformed SMB2_SET_INFO_HE command. and allocate large response buffer about SMB2_O_INFO_SECURITY file info class. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17771 Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ksmbd: prevent out of bound read for SMB2_WRITE	Hyunchul Lee	2022-08-21	2	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit ac60778b87e45576d7bfdbd6f53df902654e6f09 upstream. OOB read memory can be written to a file, if DataOffset is 0 and Length is too large in SMB2_WRITE request of compound request. To prevent this, when checking the length of the data area of SMB2_WRITE in smb2_get_data_area_len(), let the minimum of DataOffset be the size of SMB2 header + the size of SMB2_WRITE header. This bug can lead an oops looking something like: [ 798.008715] BUG: KASAN: slab-out-of-bounds in copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008724] Read of size 252 at addr ffff88800f863e90 by task kworker/0:2/2859 ... [ 798.008754] Call Trace: [ 798.008756] <TASK> [ 798.008759] dump_stack_lvl+0x49/0x5f [ 798.008764] print_report.cold+0x5e/0x5cf [ 798.008768] ? __filemap_get_folio+0x285/0x6d0 [ 798.008774] ? copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008777] kasan_report+0xaa/0x120 [ 798.008781] ? copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008784] kasan_check_range+0x100/0x1e0 [ 798.008788] memcpy+0x24/0x60 [ 798.008792] copy_page_from_iter_atomic+0xd3d/0x14b0 [ 798.008795] ? pagecache_get_page+0x53/0x160 [ 798.008799] ? iov_iter_get_pages_alloc+0x1590/0x1590 [ 798.008803] ? ext4_write_begin+0xfc0/0xfc0 [ 798.008807] ? current_time+0x72/0x210 [ 798.008811] generic_perform_write+0x2c8/0x530 [ 798.008816] ? filemap_fdatawrite_wbc+0x180/0x180 [ 798.008820] ? down_write+0xb4/0x120 [ 798.008824] ? down_write_killable+0x130/0x130 [ 798.008829] ext4_buffered_write_iter+0x137/0x2c0 [ 798.008833] ext4_file_write_iter+0x40b/0x1490 [ 798.008837] ? __fsnotify_parent+0x275/0xb20 [ 798.008842] ? __fsnotify_update_child_dentry_flags+0x2c0/0x2c0 [ 798.008846] ? ext4_buffered_write_iter+0x2c0/0x2c0 [ 798.008851] __kernel_write+0x3a1/0xa70 [ 798.008855] ? __x64_sys_preadv2+0x160/0x160 [ 798.008860] ? security_file_permission+0x4a/0xa0 [ 798.008865] kernel_write+0xbb/0x360 [ 798.008869] ksmbd_vfs_write+0x27e/0xb90 [ksmbd] [ 798.008881] ? ksmbd_vfs_read+0x830/0x830 [ksmbd] [ 798.008892] ? _raw_read_unlock+0x2a/0x50 [ 798.008896] smb2_write+0xb45/0x14e0 [ksmbd] [ 798.008909] ? __kasan_check_write+0x14/0x20 [ 798.008912] ? _raw_spin_lock_bh+0xd0/0xe0 [ 798.008916] ? smb2_read+0x15e0/0x15e0 [ksmbd] [ 798.008927] ? memcpy+0x4e/0x60 [ 798.008931] ? _raw_spin_unlock+0x19/0x30 [ 798.008934] ? ksmbd_smb2_check_message+0x16af/0x2350 [ksmbd] [ 798.008946] ? _raw_spin_lock_bh+0xe0/0xe0 [ 798.008950] handle_ksmbd_work+0x30e/0x1020 [ksmbd] [ 798.008962] process_one_work+0x778/0x11c0 [ 798.008966] ? _raw_spin_lock_irq+0x8e/0xe0 [ 798.008970] worker_thread+0x544/0x1180 [ 798.008973] ? __cpuidle_text_end+0x4/0x4 [ 798.008977] kthread+0x282/0x320 [ 798.008982] ? process_one_work+0x11c0/0x11c0 [ 798.008985] ? kthread_complete_and_exit+0x30/0x30 [ 798.008989] ret_from_fork+0x1f/0x30 [ 798.008995] </TASK> Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17817 Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	net_sched: cls_route: disallow handle of 0	Jamal Hadi Salim	2022-08-21	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 02799571714dc5dd6948824b9d080b44a295f695 upstream. Follows up on: https://lore.kernel.org/all/20220809170518.164662-1-cascardo@canonical.com/ handle of 0 implies from/to of universe realm which is not very sensible. Lets see what this patch will do: $sudo tc qdisc add dev $DEV root handle 1:0 prio //lets manufacture a way to insert handle of 0 $sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \ route to 0 from 0 classid 1:10 action ok //gets rejected... Error: handle of 0 is not valid. We have an error talking to the kernel, -1 //lets create a legit entry.. sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \ classid 1:10 action ok //what did the kernel insert? $sudo tc filter ls dev $DEV parent 1:0 filter protocol ip pref 100 route chain 0 filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 //Lets try to replace that legit entry with a handle of 0 $ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \ handle 0x000a8000 route to 0 from 0 classid 1:10 action drop Error: Replacing with handle of 0 is invalid. We have an error talking to the kernel, -1 And last, lets run Cascardo's POC: $ ./poc 0 0 -22 -22 -22 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tee: add overflow check in register_shm_helper()	Jens Wiklander	2022-08-21	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 573ae4f13f630d6660008f1974c0a8a29c30e18a upstream. With special lengths supplied by user space, register_shm_helper() has an integer overflow when calculating the number of pages covered by a supplied user space memory region. This causes internal_get_user_pages_fast() a helper function of pin_user_pages_fast() to do a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 Modules linked in: CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pc : internal_get_user_pages_fast+0x474/0xa80 Call trace: internal_get_user_pages_fast+0x474/0xa80 pin_user_pages_fast+0x24/0x4c register_shm_helper+0x194/0x330 tee_shm_register_user_buf+0x78/0x120 tee_ioctl+0xd0/0x11a0 __arm64_sys_ioctl+0xa8/0xec invoke_syscall+0x48/0x114 Fix this by adding an an explicit call to access_ok() in tee_shm_register_user_buf() to catch an invalid user space address early. Fixes: 033ddf12bcf5 ("tee: add register user memory") Cc: stable@vger.kernel.org Reported-by: Nimish Mishra <neelam.nimish@gmail.com> Reported-by: Anirban Chakraborty <ch.anirban00727@gmail.com> Reported-by: Debdeep Mukhopadhyay <debdeep.mukhopadhyay@gmail.com> Suggested-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	io_uring: use original request task for inflight tracking	Jens Axboe	2022-08-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 386e4fb6962b9f248a80f8870aea0870ca603e89 upstream. In prior kernels, we did file assignment always at prep time. This meant that req->task == current. But after deferring that assignment and then pushing the inflight tracking back in, we've got the inflight tracking using current when it should in fact now be using req->task. Fixup that error introduced by adding the inflight tracking back after file assignments got modifed. Fixes: 9cae36a094e7 ("io_uring: reinstate the inflight tracking") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Linux 5.15.61v5.15.61	Greg Kroah-Hartman	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Link: https://lore.kernel.org/r/20220815180337.130757997@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20220816124544.577833376@linuxfoundation.org Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	scsi: lpfc: Resolve some cleanup issues following SLI path refactoring	James Smart	2022-08-17	2	-14/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit e27f05147bff21408c1b8410ad8e90cd286e7952 upstream. Following refactoring and consolidation in SLI processing, fix up some minor issues related to SLI path: - Correct the setting of LPFC_EXCHANGE_BUSY flag in response IOCB. - Fix some typographical errors. - Fix duplicate log messages. Link: https://lore.kernel.org/r/20220603174329.63777-4-jsmart2021@gmail.com Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4") Cc: <stable@vger.kernel.org> # v5.18 Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4()	James Smart	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 84c6f99e39074d45f75986e42ca28e27c140fd0d upstream. The prior commit that moved from iocb elements to explicit wqe elements missed a name change. Correct __lpfc_sli_release_iocbq_s4() to reference wqe rather than iocb. Link: https://lore.kernel.org/r/20220506035519.50908-2-jsmart2021@gmail.com Fixes: a680a9298e7b ("scsi: lpfc: SLI path split: Refactor lpfc_iocbq") Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()	James Smart	2022-08-17	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c26bd6602e1d348bfa754dc55e5608c922dd2801 upstream. The rules changed for lpfc_sli_iocbq_lookup() vs locking. Prior, the routine properly took out the lock. In newly refactored code, the locks must be held when calling the routine. Fix lpfc_sli_process_sol_iocb() to take the locks before calling the routine. Fix lpfc_sli_handle_fast_ring_event() to not release the locks to call the routine. Link: https://lore.kernel.org/r/20220323205545.81814-3-jsmart2021@gmail.com Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4") Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	drm/bridge: Move devm_drm_of_get_bridge to bridge/panel.c	Maxime Ripard	2022-08-17	2	-34/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit d4ae66f10c8b9959dce1766d9a87070e567236eb upstream. By depending on devm_drm_panel_bridge_add(), devm_drm_of_get_bridge() introduces a circular dependency between the modules drm (where devm_drm_of_get_bridge() ends up) and drm_kms_helper (where devm_drm_panel_bridge_add() is). Fix this by moving devm_drm_of_get_bridge() to bridge/panel.c and thus drm_kms_helper. Fixes: 87ea95808d53 ("drm/bridge: Add a function to abstract away panels") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210917180925.2602266-1-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression	Luiz Augusto von Dentz	2022-08-17	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 332f1795ca202489c665a75e62e18ff6284de077 upstream. The patch d0be8347c623: "Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put" from Jul 21, 2022, leads to the following Smatch static checker warning: net/bluetooth/l2cap_core.c:1977 l2cap_global_chan_by_psm() error: we previously assumed 'c' could be null (see line 1996) Fixes: d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP"	Jose Alonso	2022-08-17	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 6fd2c17fb6e02a8c0ab51df1cfec82ce96b8e83d upstream. This reverts commit 36a15e1cb134c0395261ba1940762703f778438c. The usage of FLAG_SEND_ZLP causes problems to other firmware/hardware versions that have no issues. The FLAG_SEND_ZLP is not safe to use in this context. See: https://patchwork.ozlabs.org/project/netdev/patch/1270599787.8900.8.camel@Linuxdev4-laptop/#118378 The original problem needs another way to solve. Fixes: 36a15e1cb134 ("net: usb: ax88179_178a needs FLAG_SEND_ZLP") Cc: stable@vger.kernel.org Reported-by: Ronald Wahl <ronald.wahl@raritan.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216327 Link: https://bugs.archlinux.org/task/75491 Signed-off-by: Jose Alonso <joalonsof@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	io_uring: mem-account pbuf buckets	Pavel Begunkov	2022-08-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit cc18cc5e82033d406f54144ad6f8092206004684 upstream. Potentially, someone may create as many pbuf bucket as there are indexes in an xarray without any other restrictions bounding our memory usage, put memory needed for the buckets under memory accounting. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d34c452e45793e978d26e2606211ec9070d329ea.1659622312.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	drm/meson: Fix refcount leak in meson_encoder_hdmi_init	Miaoqian Lin	2022-08-17	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 7381076809586528e2a812a709e2758916318a99 upstream. of_find_device_by_node() takes reference, we should use put_device() to release it when not need anymore. Add missing put_device() in error path to avoid refcount leak. Fixes: 0af5e0b41110 ("drm/meson: encoder_hdmi: switch to bridge DRM_BRIDGE_ATTACH_NO_CONNECTOR") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220511054052.51981-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	drm/msm: Fix dirtyfb refcounting	Rob Clark	2022-08-17	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 9225b337072a10bf9b09df8bf281437488dd8a26 upstream. refcount_t complains about 0->1 transitions, which isn't quite what we wanted. So use dirtyfb==1 to mean that the fb is not connected to any output that requires dirtyfb flushing, so that we can keep the underflow and overflow checking. Fixes: 9e4dde28e9cd ("drm/msm: Avoid dirtyfb stalls on video mode displays (v2)") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220304202146.845566-1-robdclark@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tracing/perf: Avoid -Warray-bounds warning for __rel_loc macro	Kees Cook	2022-08-17	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c6d777acdf8f62d4ebaef0e5c6cd8fedbd6e8546 upstream. As done for trace_events.h, also fix the __rel_loc macro in perf.h, which silences the -Warray-bounds warning: In file included from ./include/linux/string.h:253, from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./include/linux/mm_types_task.h:14, from ./include/linux/mm_types.h:5, from ./include/linux/buildid.h:5, from ./include/linux/module.h:14, from samples/trace_events/trace-events-sample.c:2: In function '__fortify_strcpy', inlined from 'perf_trace_foo_rel_loc' at samples/trace_events/./trace-events-sample.h:519:1: ./include/linux/fortify-string.h:47:33: warning: '__builtin_strcpy' offset 12 is out of the bounds [ 0, 4] [-Warray-bounds] 47 \| #define __underlying_strcpy __builtin_strcpy \| ^ ./include/linux/fortify-string.h:445:24: note: in expansion of macro '__underlying_strcpy' 445 \| return __underlying_strcpy(p, q); \| ^~~~~~~~~~~~~~~~~~~ Also make __data struct member a proper flexible array to avoid future problems. Link: https://lkml.kernel.org/r/20220125220037.2738923-1-keescook@chromium.org Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 55de2c0b5610c ("tracing: Add '__rel_loc' using trace event macros") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	drm/vc4: change vc4_dma_range_matches from a global to static	Tom Rix	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 63569d90863ff26c8b10c8971d1271c17a45224b upstream. sparse reports drivers/gpu/drm/vc4/vc4_drv.c:270:27: warning: symbol 'vc4_dma_range_matches' was not declared. Should it be static? vc4_dma_range_matches is only used in vc4_drv.c, so it's storage class specifier should be static. Fixes: da8e393e23ef ("drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220629200101.498138-1-trix@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode	Lukas Wunner	2022-08-17	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 2642cc6c3bbe0900ba15bab078fd15ad8baccbc5 upstream. Simon reports that if two LAN9514 USB adapters are directly connected without an intermediate switch, the link fails to come up and link LEDs remain dark. The issue was introduced by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling"). The PHY suffers from a known erratum wherein link detection becomes unreliable if Energy Detect Power-Down is used. In poll mode, the driver works around the erratum by briefly disabling EDPD for 640 msec to detect a neighbor, then re-enabling it to save power. In interrupt mode, no interrupt is signaled if EDPD is used by both link partners, so it must not be enabled at all. We'll recoup the power savings by enabling SUSPEND1 mode on affected LAN95xx chips in a forthcoming commit. Fixes: 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling") Reported-by: Simon Han <z.han@kunbus.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/439a3f3168c2f9d44b5fd9bb8d2b551711316be6.1655714438.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	drm/bridge: tc358767: Fix (e)DP bridge endpoint parsing in dedicated function	Marek Vasut	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 9030a9e571b3ba250d3d450a98310e3c74ecaff4 upstream. Per toshiba,tc358767.yaml DT binding document, port@2 the output (e)DP port is optional. In case this port is not described in DT, the bridge driver operates in DPI-to-DP mode. The drm_of_find_panel_or_bridge() call in tc_probe_edp_bridge_endpoint() returns -ENODEV in case port@2 is not present in DT and this specific return value is incorrectly propagated outside of tc_probe_edp_bridge_endpoint() function. All other error values must be propagated and are propagated correctly. Return 0 in case the port@2 is missing instead, that reinstates the original behavior before the commit this patch fixes. Fixes: 8478095a8c4b ("drm/bridge: tc358767: Move (e)DP bridge endpoint parsing into dedicated function") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Jonas Karlman <jonas@kwiboo.se> Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Marek Vasut <marex@denx.de> Cc: Maxime Ripard <maxime@cerno.tech> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Robert Foss <robert.foss@linaro.org> Cc: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220428213132.447890-1-marex@denx.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Revert "s390/smp: enforce lowcore protection on CPU restart"	Alexander Gordeev	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	commit 953503751a426413ea8aee2299ae3ee971b70d9b upstream. This reverts commit 6f5c672d17f583b081e283927f5040f726c54598. This breaks normal crash dump when CPU0 is offline. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv"	Greg Kroah-Hartman	2022-08-17	3	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 5f8954e099b8ae96e7de1bb95950e00c85bedd40 upstream. This reverts commit a52ed4866d2b90dd5e4ae9dabd453f3ed8fa3cbc as it causes build problems in linux-next. It needs to be reintroduced in a way that can allow the api to evolve and not require a "flag day" to catch all users. Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au Cc: Duoming Zhou <duoming@zju.edu.cn> Cc: Brian Norris <briannorris@chromium.org> Cc: Johannes Berg <johannes@sipsolutions.net> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	crypto: lib/blake2s - reduce stack frame usage in self test	Jason A. Donenfeld	2022-08-17	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit d6c14da474bf260d73953fbf7992c98d9112aec7 upstream. Using 3 blocks here doesn't give us much more than using 2, and it causes a stack frame size warning on certain compiler/config/arch combinations: lib/crypto/blake2s-selftest.c: In function 'blake2s_selftest': >> lib/crypto/blake2s-selftest.c:632:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=] 632 \| } \| ^ So this patch just reduces the block from 3 to 2, which makes the warning go away. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/linux-crypto/202206200851.gE3MHCgd-lkp@intel.com Fixes: 2d16803c562e ("crypto: blake2s - remove shash module") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tcp: fix over estimation in sk_forced_mem_schedule()	Eric Dumazet	2022-08-17	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c4ee118561a0f74442439b7b5b486db1ac1ddfeb upstream. sk_forced_mem_schedule() has a bug similar to ones fixed in commit 7c80b038d23e ("net: fix sk_wmem_schedule() and sk_rmem_schedule() errors") While this bug has little chance to trigger in old kernels, we need to fix it before the following patch. Fixes: d83769a580f1 ("tcp: fix possible deadlock in tcp_send_fin()") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	mac80211: fix a memory leak where sta_info is not freed	Ahmed Zaki	2022-08-17	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8f9dcc29566626f683843ccac6113a12208315ca upstream. The following is from a system that went OOM due to a memory leak: wlan0: Allocated STA 74:83:c2:64:0b:87 wlan0: Allocated STA 74:83:c2:64:0b:87 wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_add_sta) wlan0: Adding new IBSS station 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 3 wlan0: Inserted STA 74:83:c2:64:0b:87 wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_work) wlan0: Adding new IBSS station 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 3 . . wlan0: expiring inactive not authorized STA 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 1 wlan0: Removed STA 74:83:c2:64:0b:87 wlan0: Destroyed STA 74:83:c2:64:0b:87 The ieee80211_ibss_finish_sta() is called twice on the same STA from 2 different locations. On the second attempt, the allocated STA is not destroyed creating a kernel memory leak. This is happening because sta_info_insert_finish() does not call sta_info_free() the second time when the STA already exists (returns -EEXIST). Note that the caller sta_info_insert_rcu() assumes STA is destroyed upon errors. Same fix is applied to -ENOMEM. Signed-off-by: Ahmed Zaki <anzaki@gmail.com> Link: https://lore.kernel.org/r/20211002145329.3125293-1-anzaki@gmail.com [change the error path label to use the existing code] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Viacheslav Sablin <sablin@ispras.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	net_sched: cls_route: remove from list when handle is 0	Thadeu Lima de Souza Cascardo	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 9ad36309e2719a884f946678e0296be10f0bb4c1 upstream. When a route filter is replaced and the old filter has a 0 handle, the old one won't be removed from the hashtable, while it will still be freed. The test was there since before commit 1109c00547fc ("net: sched: RCU cls_route"), when a new filter was not allocated when there was an old one. The old filter was reused and the reinserting would only be necessary if an old filter was replaced. That was still wrong for the same case where the old handle was 0. Remove the old filter from the list independently from its handle value. This fixes CVE-2022-2588, also reported as ZDI-CAN-17440. Reported-by: Zhenpeng Lin <zplin@u.northwestern.edu> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Kamal Mostafa <kamal@canonical.com> Cc: <stable@vger.kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tracing: Use a struct alignof to determine trace event field alignment	Steven Rostedt (Google)	2022-08-17	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 4c3d2f9388d36eb28640a220a6f908328442d873 upstream. alignof() gives an alignment of types as they would be as standalone variables. But alignment in structures might be different, and when building the fields of events, the alignment must be the actual alignment otherwise the field offsets may not match what they actually are. This caused trace-cmd to crash, as libtraceevent did not check if the field offset was bigger than the event. The write_msr and read_msr events on 32 bit had their fields incorrect, because it had a u64 field between two ints. alignof(u64) would give 8, but the u64 field was at a 4 byte alignment. Define a macro as: ALIGN_STRUCTFIELD(type) ((int)(offsetof(struct {char a; type b;}, b))) which gives the actual alignment of types in a structure. Link: https://lkml.kernel.org/r/20220731015928.7ab3a154@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	powerpc: Fix eh field when calling lwarx on PPC32	Christophe Leroy	2022-08-17	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 18db466a9a306406dab3b134014d9f6ed642471c upstream. Commit 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") properly handled the eh field of lwarx in asm/bitops.h but failed to clear it for PPC32 in asm/simple_spinlock.h So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64 but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which returns 1 when CONFIG_PPC64 is set and 0 otherwise. Fixes: 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") Cc: stable@vger.kernel.org # v5.15+ Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Use symbolic names, use 'n' constraint per Segher] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a1176e19e627dd6a1b8d24c6c457a8ab874b7d12.1659430931.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	xen-blkfront: Apply 'feature_persistent' parameter when connect	SeongJae Park	2022-08-17	2	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 402c43ea6b34a1b371ffeed9adf907402569eaf5 upstream. In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grans or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check to not be repeated once it shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. Similar behavioral change has made on 'blkfront' by commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants"). This commit changes the behavior of the parameter to make effect for every connect, so that the previous behavior of 'blkfront' can be restored. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	xen-blkback: Apply 'feature_persistent' parameter when connect	Maximilian Heyne	2022-08-17	2	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit e94c6101e151b019b8babc518ac2a6ada644a5a1 upstream. In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grans or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check to not be repeated once it shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. This commit changes the behavior of the parameter to make effect for every connect, so that the previous workflow can work again as expected. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Reported-by: Andrii Chepurnyi <andrii.chepurnyi82@gmail.com> Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-3-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	xen-blkback: fix persistent grants negotiation	SeongJae Park	2022-08-17	1	-8/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit fc9be616bb8f3ed9cf560308f86904f5c06be205 upstream. Persistent grants feature can be used only when both backend and the frontend supports the feature. The feature was always supported by 'blkback', but commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has introduced a parameter for disabling it runtime. To avoid the parameter be updated while being used by 'blkback', the commit caches the parameter into 'vbd->feature_gnt_persistent' in 'xen_vbd_create()', and then check if the guest also supports the feature and finally updates the field in 'connect_ring()'. However, 'connect_ring()' could be called before 'xen_vbd_create()', so later execution of 'xen_vbd_create()' can wrongly overwrite 'true' to 'vbd->feature_gnt_persistent'. As a result, 'blkback' could try to use 'persistent grants' feature even if the guest doesn't support the feature. This commit fixes the issue by moving the parameter value caching to 'xen_blkif_alloc()', which allocates the 'blkif'. Because the struct embeds 'vbd' object, which will be used by 'connect_ring()' later, this should be called before 'connect_ring()' and therefore this should be the right and safe place to do the caching. Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-2-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	tpm: eventlog: Fix section mismatch for DEBUG_SECTION_MISMATCH	Huacai Chen	2022-08-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit bed4593645366ad7362a3aa7bc0d100d8d8236a8 ] If DEBUG_SECTION_MISMATCH enabled, __calc_tpm2_event_size() will not be inlined, this cause section mismatch like this: WARNING: modpost: vmlinux.o(.text.unlikely+0xe30c): Section mismatch in reference from the variable L0 to the function .init.text:early_ioremap() The function L0() references the function __init early_memremap(). This is often because L0 lacks a __init annotation or the annotation of early_ioremap is wrong. Fix it by using __always_inline instead of inline for the called-once function __calc_tpm2_event_size(). Fixes: 44038bc514a2 ("tpm: Abstract crypto agile event size calculations") Cc: stable@vger.kernel.org # v5.3 Reported-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	KEYS: asymmetric: enforce SM2 signature use pkey algo	Tianjia Zhang	2022-08-17	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 0815291a8fd66cdcf7db1445d4d99b0d16065829 ] The signature verification of SM2 needs to add the Za value and recalculate sig->digest, which requires the detection of the pkey_algo in public_key_verify_signature(). As Eric Biggers said, the pkey_algo field in sig is attacker-controlled and should be use pkey->pkey_algo instead of sig->pkey_algo, and secondly, if sig->pkey_algo is NULL, it will also cause signature verification failure. The software_key_determine_akcipher() already forces the algorithms are matched, so the SM3 algorithm is enforced in the SM2 signature, although this has been checked, we still avoid using any algorithm information in the signature as input. Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification") Reported-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: fix race when reusing xattr blocks	Jan Kara	2022-08-17	1	-22/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 65f8b80053a1b2fd602daa6814e62d6fa90e5e9b ] When ext4_xattr_block_set() decides to remove xattr block the following race can happen: CPU1 CPU2 ext4_xattr_block_set() ext4_xattr_release_block() new_bh = ext4_xattr_block_cache_find() lock_buffer(bh); ref = le32_to_cpu(BHDR(bh)->h_refcount); if (ref == 1) { ... mb_cache_entry_delete(); unlock_buffer(bh); ext4_free_blocks(); ... ext4_forget(..., bh, ...); jbd2_journal_revoke(..., bh); ext4_journal_get_write_access(..., new_bh, ...) do_get_write_access() jbd2_journal_cancel_revoke(..., new_bh); Later the code in ext4_xattr_block_set() finds out the block got freed and cancels reusal of the block but the revoke stays canceled and so in case of block reuse and journal replay the filesystem can get corrupted. If the race works out slightly differently, we can also hit assertions in the jbd2 code. Fix the problem by making sure that once matching mbcache entry is found, code dropping the last xattr block reference (or trying to modify xattr block in place) waits until the mbcache entry reference is dropped. This way code trying to reuse xattr block is protected from someone trying to drop the last reference to xattr block. Reported-and-tested-by: Ritesh Harjani <ritesh.list@gmail.com> CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-5-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: unindent codeblock in ext4_xattr_block_set()	Jan Kara	2022-08-17	1	-39/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit fd48e9acdf26d0cbd80051de07d4a735d05d29b2 ] Remove unnecessary else (and thus indentation level) from a code block in ext4_xattr_block_set(). It will also make following code changes easier. No functional changes. CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-4-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: use kmemdup() to replace kmalloc + memcpy	Shuqi Zhang	2022-08-17	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 4efd9f0d120c55b08852ee5605dbb02a77089a5d ] Replace kmalloc + memcpy with kmemdup() Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com> Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220525030120.803330-1-zhangshuqi3@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: remove EA inode entry from mbcache on inode eviction	Jan Kara	2022-08-17	3	-16/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 6bc0d63dad7f9f54d381925ee855b402f652fa39 ] Currently we remove EA inode from mbcache as soon as its xattr refcount drops to zero. However there can be pending attempts to reuse the inode and thus refcount handling code has to handle the situation when refcount increases from zero anyway. So save some work and just keep EA inode in mbcache until it is getting evicted. At that moment we are sure following iget() of EA inode will fail anyway (or wait for eviction to finish and load things from the disk again) and so removing mbcache entry at that moment is fine and simplifies the code a bit. CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: make sure ext4_append() always allocates new block	Lukas Czerner	2022-08-17	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b ] ext4_append() must always allocate a new block, otherwise we run the risk of overwriting existing directory block corrupting the directory tree in the process resulting in all manner of problems later on. Add a sanity check to see if the logical block is already allocated and error out if it is. Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220704142721.157985-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: check if directory block is within i_size	Lukas Czerner	2022-08-17	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 65f8ea4cd57dbd46ea13b41dc8bac03176b04233 ] Currently ext4 directory handling code implicitly assumes that the directory blocks are always within the i_size. In fact ext4_append() will attempt to allocate next directory block based solely on i_size and the i_size is then appropriately increased after a successful allocation. However, for this to work it requires i_size to be correct. If, for any reason, the directory inode i_size is corrupted in a way that the directory tree refers to a valid directory block past i_size, we could end up corrupting parts of the directory tree structure by overwriting already used directory blocks when modifying the directory. Fix it by catching the corruption early in __ext4_read_dirblock(). Addresses Red-Hat-Bugzilla: #2070205 CVE: CVE-2022-1184 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220704142721.157985-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: fix warning in ext4_iomap_begin as race between bmap and write	Ye Bin	2022-08-17	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 51ae846cff568c8c29921b1b28eb2dfbcd4ac12d ] We got issue as follows: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 9310 at fs/ext4/inode.c:3441 ext4_iomap_begin+0x182/0x5d0 RIP: 0010:ext4_iomap_begin+0x182/0x5d0 RSP: 0018:ffff88812460fa08 EFLAGS: 00010293 RAX: ffff88811f168000 RBX: 0000000000000000 RCX: ffffffff97793c12 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: ffff88812c669160 R08: ffff88811f168000 R09: ffffed10258cd20f R10: ffff88812c669077 R11: ffffed10258cd20e R12: 0000000000000001 R13: 00000000000000a4 R14: 000000000000000c R15: ffff88812c6691ee FS: 00007fd0d6ff3740(0000) GS:ffff8883af180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd0d6dda290 CR3: 0000000104a62000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iomap_apply+0x119/0x570 iomap_bmap+0x124/0x150 ext4_bmap+0x14f/0x250 bmap+0x55/0x80 do_vfs_ioctl+0x952/0xbd0 __x64_sys_ioctl+0xc6/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Above issue may happen as follows: bmap write bmap ext4_bmap iomap_bmap ext4_iomap_begin ext4_file_write_iter ext4_buffered_write_iter generic_perform_write ext4_da_write_begin ext4_da_write_inline_data_begin ext4_prepare_inline_data ext4_create_inline_data ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); if (WARN_ON_ONCE(ext4_has_inline_data(inode))) ->trigger bug_on To solved above issue hold inode lock in ext4_bamp. Signed-off-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20220617013935.397596-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: correct the misjudgment in ext4_iget_extra_inode	Baokun Li	2022-08-17	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit fd7e672ea98b95b9d4c9dae316639f03c16a749d ] Use the EXT4_INODE_HAS_XATTR_SPACE macro to more accurately determine whether the inode have xattr space. Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-5-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: correct max_inline_xattr_value_size computing	Baokun Li	2022-08-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit c9fd167d57133c5b748d16913c4eabc55e531c73 ] If the ext4 inode does not have xattr space, 0 is returned in the get_max_inline_xattr_value_size function. Otherwise, the function returns a negative value when the inode does not contain EXT4_STATE_XATTR. Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-4-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: fix use-after-free in ext4_xattr_set_entry	Baokun Li	2022-08-17	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3 ] Hulk Robot reported a issue: ================================================================== BUG: KASAN: use-after-free in ext4_xattr_set_entry+0x18ab/0x3500 Write of size 4105 at addr ffff8881675ef5f4 by task syz-executor.0/7092 CPU: 1 PID: 7092 Comm: syz-executor.0 Not tainted 4.19.90-dirty #17 Call Trace: [...] memcpy+0x34/0x50 mm/kasan/kasan.c:303 ext4_xattr_set_entry+0x18ab/0x3500 fs/ext4/xattr.c:1747 ext4_xattr_ibody_inline_set+0x86/0x2a0 fs/ext4/xattr.c:2205 ext4_xattr_set_handle+0x940/0x1300 fs/ext4/xattr.c:2386 ext4_xattr_set+0x1da/0x300 fs/ext4/xattr.c:2498 __vfs_setxattr+0x112/0x170 fs/xattr.c:149 __vfs_setxattr_noperm+0x11b/0x2a0 fs/xattr.c:180 __vfs_setxattr_locked+0x17b/0x250 fs/xattr.c:238 vfs_setxattr+0xed/0x270 fs/xattr.c:255 setxattr+0x235/0x330 fs/xattr.c:520 path_setxattr+0x176/0x190 fs/xattr.c:539 __do_sys_lsetxattr fs/xattr.c:561 [inline] __se_sys_lsetxattr fs/xattr.c:557 [inline] __x64_sys_lsetxattr+0xc2/0x160 fs/xattr.c:557 do_syscall_64+0xdf/0x530 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x459fe9 RSP: 002b:00007fa5e54b4c08 EFLAGS: 00000246 ORIG_RAX: 00000000000000bd RAX: ffffffffffffffda RBX: 000000000051bf60 RCX: 0000000000459fe9 RDX: 00000000200003c0 RSI: 0000000020000180 RDI: 0000000020000140 RBP: 000000000051bf60 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000001009 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc73c93fc0 R14: 000000000051bf60 R15: 00007fa5e54b4d80 [...] ================================================================== Above issue may happen as follows: ------------------------------------- ext4_xattr_set ext4_xattr_set_handle ext4_xattr_ibody_find >> s->end < s->base >> no EXT4_STATE_XATTR >> xattr_check_inode is not executed ext4_xattr_ibody_set ext4_xattr_set_entry >> size_t min_offs = s->end - s->base >> UAF in memcpy we can easily reproduce this problem with the following commands: mkfs.ext4 -F /dev/sda mount -o debug_want_extra_isize=128 /dev/sda /mnt touch /mnt/file setfattr -n user.cat -v `seq -s z 4096\|tr -d '[:digit:]'` /mnt/file In ext4_xattr_ibody_find, we have the following assignment logic: header = IHDR(inode, raw_inode) = raw_inode + EXT4_GOOD_OLD_INODE_SIZE + i_extra_isize is->s.base = IFIRST(header) = header + sizeof(struct ext4_xattr_ibody_header) is->s.end = raw_inode + s_inode_size In ext4_xattr_set_entry min_offs = s->end - s->base = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) last = s->first free = min_offs - ((void *)last - s->base) - sizeof(__u32) = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) - sizeof(__u32) In the calculation formula, all values except s_inode_size and i_extra_size are fixed values. When i_extra_size is the maximum value s_inode_size - EXT4_GOOD_OLD_INODE_SIZE, min_offs is -4 and free is -8. The value overflows. As a result, the preceding issue is triggered when memcpy is executed. Therefore, when finding xattr or setting xattr, check whether there is space for storing xattr in the inode to resolve this issue. Cc: stable@kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-3-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: add EXT4_INODE_HAS_XATTR_SPACE macro in xattr.h	Baokun Li	2022-08-17	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 179b14152dcb6a24c3415200603aebca70ff13af ] When adding an xattr to an inode, we must ensure that the inode_size is not less than EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad. Otherwise, the end position may be greater than the start position, resulting in UAF. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220616021358.2504451-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
*	ext4: fix extent status tree race in writeback error recovery path	Eric Whitney	2022-08-17	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 7f0d8e1d607c1a4fa9a27362a108921d82230874 ] A race can occur in the unlikely event ext4 is unable to allocate a physical cluster for a delayed allocation in a bigalloc file system during writeback. Failure to allocate a cluster forces error recovery that includes a call to mpage_release_unused_pages(). That function removes any corresponding delayed allocated blocks from the extent status tree. If a new delayed write is in progress on the same cluster simultaneously, resulting in the addition of an new extent containing one or more blocks in that cluster to the extent status tree, delayed block accounting can be thrown off if that delayed write then encounters a similar cluster allocation failure during future writeback. Write lock the i_data_sem in mpage_release_unused_pages() to fix this problem. Ext4's block/cluster accounting code for bigalloc relies on i_data_sem for mutual exclusion, as is found in the delayed write path, and the locking in mpage_release_unused_pages() is missing. Cc: stable@kernel.org Reported-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>