summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* lockd: fix races in client GRANTED_MSG wait logicJeff Layton2023-04-262-31/+39
| | | | | | | | | | | | | | | | | | | | | | | | After the wait for a grant is done (for whatever reason), nlmclnt_block updates the status of the nlm_rqst with the status of the block. At the point it does this, however, the block is still queued its status could change at any time. This is particularly a problem when the waiting task is signaled during the wait. We can end up giving up on the lock just before the GRANTED_MSG callback comes in, and accept it even though the lock request gets back an error, leaving a dangling lock on the server. Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on the stack and add functions to allow us to enqueue and dequeue the block. Enqueue it just before the lock/wait loop, and dequeue it just after we exit the loop instead of waiting until the end of the function. Also, scrape the status at the time that we dequeue it to ensure that it's final. Reported-by: Yongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* lockd: move struct nlm_wait to lockd.hJeff Layton2023-04-261-12/+0
| | | | | | | | | The next patch needs struct nlm_wait in fs/lockd/clntproc.c, so move the definition to a shared header file. As an added clean-up, drop the unused b_reclaim field. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* lockd: purge resources held on behalf of nlm clients when shutting downJeff Layton2023-04-261-0/+1
| | | | | | | | | | | | | | | | | | | | | It's easily possible for the server to have an outstanding lock when we go to shut down. When that happens, we often get a warning like this in the kernel log: lockd: couldn't shutdown host module for net f0000000! This is because the shutdown procedures skip removing any hosts that still have outstanding resources (locks). Eventually, things seem to get cleaned up anyway, but the log message is unsettling, and server shutdown doesn't seem to be working the way it was intended. Ensure that we tear down any resources held on behalf of a client when tearing one down for server shutdown. Reported-by: Yongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Convert filecache to rhltableChuck Lever2023-04-262-187/+133
| | | | | | | | | | | | | | | | | | | | | | | | | While we were converting the nfs4_file hashtable to use the kernel's resizable hashtable data structure, Neil Brown observed that the list variant (rhltable) would be better for managing nfsd_file items as well. The nfsd_file hash table will contain multiple entries for the same inode -- these should be kept together on a list. And, it could be possible for exotic or malicious client behavior to cause the hash table to resize itself on every insertion. A nice simplification is that rhltable_lookup() can return a list that contains only nfsd_file items that match a given inode, which enables us to eliminate specialized hash table helper functions and use the default functions provided by the rhashtable implementation). Since we are now storing nfsd_file items for the same inode on a single list, that effectively reduces the number of hash entries that have to be tracked in the hash table. The mininum bucket count is therefore lowered. Light testing with fstests generic/531 show no regressions. Suggested-by: Neil Brown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: allow reaping files still under writebackJeff Layton2023-04-262-4/+17
| | | | | | | | | | | | | | | | | | | On most filesystems, there is no reason to delay reaping an nfsd_file just because its underlying inode is still under writeback. nfsd just relies on client activity or the local flusher threads to do writeback. The main exception is NFS, which flushes all of its dirty data on last close. Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to signal that they do this, and only skip closing files under writeback on such filesystems. Also, remove a redundant NULL file pointer check in nfsd_file_check_writeback, and clean up nfs's export op flag definitions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: update comment over __nfsd_file_cache_purgeJeff Layton2023-04-261-1/+2
| | | | | Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: don't take/put an extra reference when putting a fileJeff Layton2023-04-261-3/+1
| | | | | | | | The last thing that filp_close does is an fput, so don't bother taking and putting the extra reference. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: add some comments to nfsd_file_do_acquireJeff Layton2023-04-261-0/+5
| | | | | | | | | David Howells mentioned that he found this bit of code confusing, so sprinkle in some comments to clarify. Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: don't kill nfsd_files because of lease break errorJeff Layton2023-04-261-14/+15
| | | | | | | | | An error from break_lease is non-fatal, so we needn't destroy the nfsd_file in that case. Just put the reference like we normally would and return the error. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparatorJeff Layton2023-04-261-1/+1
| | | | | | | | test_bit returns bool, so we can just compare the result of that to the key->gc value without the "!!". Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entriesJeff Layton2023-04-261-0/+4
| | | | | | | | | | | | Since v4 files are expected to be long-lived, there's little value in closing them out of the cache when there is conflicting access. Change the comparator to also match the gc value in the key. Change both of the current users of that key to set the gc value in the key to "true". Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: don't open-code clear_and_wake_up_bitJeff Layton2023-04-261-3/+1
| | | | | Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni2023-04-261-5/+4
|\ | | | | | | | | | | No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * rxrpc: Fix potential race in error handling in afs_make_call()David Howells2023-04-221-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the rxrpc call set up by afs_make_call() receives an error whilst it is transmitting the request, there's the possibility that it may get to the point the rxrpc call is ended (after the error_kill_call label) just as the call is queued for async processing. This could manifest itself as call->rxcall being seen as NULL in afs_deliver_to_call() when it tries to lock the call. Fix this by splitting rxrpc_kernel_end_call() into a function to shut down an rxrpc call and a function to release the caller's reference and calling the latter only when we get to afs_put_call(). Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing+fedora36_64checkkafs-build-306@auristor.com cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-04-205-28/+79
|\| | | | | | | | | | | | | | | | | | | | | Adjacent changes: net/mptcp/protocol.h 63740448a32e ("mptcp: fix accept vs worker race") 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close") ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge tag 'mm-hotfixes-stable-2023-04-19-16-36' of ↵Linus Torvalds2023-04-193-9/+34
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 19 are cc:stable and the remainder address issues which were introduced during this merge cycle, or aren't considered suitable for -stable backporting. 19 are for MM and the remainder are for other subsystems" * tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) nilfs2: initialize unused bytes in segment summary blocks mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages mm/mmap: regression fix for unmapped_area{_topdown} maple_tree: fix mas_empty_area() search maple_tree: make maple state reusable after mas_empty_area_rev() mm: kmsan: handle alloc failures in kmsan_ioremap_page_range() mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush() tools/Makefile: do missed s/vm/mm/ mm: fix memory leak on mm_init error handling mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() Revert "userfaultfd: don't fail on unrecognized features" writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used mailmap: update jtoppins' entry to reference correct email mm/mempolicy: fix use-after-free of VMA iterator mm/huge_memory.c: warn with pr_warn_ratelimited instead of VM_WARN_ON_ONCE_FOLIO mm/mprotect: fix do_mprotect_pkey() return on error mm/khugepaged: check again on anon uffd-wp during isolation ...
| | * nilfs2: initialize unused bytes in segment summary blocksRyusuke Konishi2023-04-181-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot still reports uninit-value in nilfs_add_checksums_on_logs() for KMSAN enabled kernels after applying commit 7397031622e0 ("nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad field"). This is because the unused bytes at the end of each block in segment summaries are not initialized. So this fixes the issue by padding the unused bytes with null bytes. Link: https://lkml.kernel.org/r/20230417173513.12598-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+048585f3f4227bb2b49b@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=048585f3f4227bb2b49b Cc: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * Revert "userfaultfd: don't fail on unrecognized features"Peter Xu2023-04-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a proposal to revert commit 914eedcb9ba0ff53c33808. I found this when writing a simple UFFDIO_API test to be the first unit test in this set. Two things breaks with the commit: - UFFDIO_API check was lost and missing. According to man page, the kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This check is needed if the api version will be extended in the future, or user app won't be able to identify which is a new kernel. - Feature flags checks were removed, which means UFFDIO_API with a feature that does not exist will also succeed. According to the man page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if unknown features passed in. Link: https://lore.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20230412163922.327282-2-peterx@redhat.com Fixes: 914eedcb9ba0 ("userfaultfd: don't fail on unrecognized features") Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbsBaokun Li2023-04-161-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN report null-ptr-deref: ================================================================== BUG: KASAN: null-ptr-deref in bdi_split_work_to_wbs+0x5c5/0x7b0 Write of size 8 at addr 0000000000000000 by task sync/943 CPU: 5 PID: 943 Comm: sync Tainted: 6.3.0-rc5-next-20230406-dirty #461 Call Trace: <TASK> dump_stack_lvl+0x7f/0xc0 print_report+0x2ba/0x340 kasan_report+0xc4/0x120 kasan_check_range+0x1b7/0x2e0 __kasan_check_write+0x24/0x40 bdi_split_work_to_wbs+0x5c5/0x7b0 sync_inodes_sb+0x195/0x630 sync_inodes_one_sb+0x3a/0x50 iterate_supers+0x106/0x1b0 ksys_sync+0x98/0x160 [...] ================================================================== The race that causes the above issue is as follows: cpu1 cpu2 -------------------------|------------------------- inode_switch_wbs INIT_WORK(&isw->work, inode_switch_wbs_work_fn) queue_rcu_work(isw_wq, &isw->work) // queue_work async inode_switch_wbs_work_fn wb_put_many(old_wb, nr_switched) percpu_ref_put_many ref->data->release(ref) cgwb_release queue_work(cgwb_release_wq, &wb->release_work) // queue_work async &wb->release_work cgwb_release_workfn ksys_sync iterate_supers sync_inodes_one_sb sync_inodes_sb bdi_split_work_to_wbs kmalloc(sizeof(*work), GFP_ATOMIC) // alloc memory failed percpu_ref_exit ref->data = NULL kfree(data) wb_get(wb) percpu_ref_get(&wb->refcnt) percpu_ref_get_many(ref, 1) atomic_long_add(nr, &ref->data->count) atomic64_add(i, v) // trigger null-ptr-deref bdi_split_work_to_wbs() traverses &bdi->wb_list to split work into all wbs. If the allocation of new work fails, the on-stack fallback will be used and the reference count of the current wb is increased afterwards. If cgroup writeback membership switches occur before getting the reference count and the current wb is released as old_wd, then calling wb_get() or wb_put() will trigger the null pointer dereference above. This issue was introduced in v4.3-rc7 (see fix tag1). Both sync_inodes_sb() and __writeback_inodes_sb_nr() calls to bdi_split_work_to_wbs() can trigger this issue. For scenarios called via sync_inodes_sb(), originally commit 7fc5854f8c6e ("writeback: synchronize sync(2) against cgroup writeback membership switches") reduced the possibility of the issue by adding wb_switch_rwsem, but in v5.14-rc1 (see fix tag2) removed the "inode_io_list_del_locked(inode, old_wb)" from inode_switch_wbs_work_fn() so that wb->state contains WB_has_dirty_io, thus old_wb is not skipped when traversing wbs in bdi_split_work_to_wbs(), and the issue becomes easily reproducible again. To solve this problem, percpu_ref_exit() is called under RCU protection to avoid race between cgwb_release_workfn() and bdi_split_work_to_wbs(). Moreover, replace wb_get() with wb_tryget() in bdi_split_work_to_wbs(), and skip the current wb if wb_tryget() fails because the wb has already been shutdown. Link: https://lkml.kernel.org/r/20230410130826.1492525-1-libaokun1@huawei.com Fixes: b817525a4a80 ("writeback: bdi_writeback iteration must not skip dying ones") Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Christian Brauner <brauner@kernel.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Hou Tao <houtao1@huawei.com> Cc: yangerkun <yangerkun@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | Merge tag '6.3-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbdLinus Torvalds2023-04-161-9/+14
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ksmbd server fix from Steve French: "smb311 server preauth integrity negotiate context parsing fix (check for out of bounds access)" * tag '6.3-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbd: ksmbd: avoid out of bounds access in decode_preauth_ctxt()
| | * | ksmbd: avoid out of bounds access in decode_preauth_ctxt()David Disseldorp2023-04-131-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Confirm that the accessed pneg_ctxt->HashAlgorithms address sits within the SMB request boundary; deassemble_neg_contexts() only checks that the eight byte smb2_neg_context header + (client controlled) DataLength are within the packet boundary, which is insufficient. Checking for sizeof(struct smb2_preauth_neg_context) is overkill given that the type currently assumes SMB311_SALT_SIZE bytes of trailing Salt. Signed-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | Merge tag '6.3-rc6-smb311-client-negcontext-fix' of ↵Linus Torvalds2023-04-151-10/+31
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.samba.org/sfrench/cifs-2.6 Pull cifs fix from Steve French: "Small client fix for better checking for smb311 negotiate context overflows, also marked for stable" * tag '6.3-rc6-smb311-client-negcontext-fix' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix negotiate context parsing
| | * | | cifs: fix negotiate context parsingDavid Disseldorp2023-04-151-10/+31
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smb311_decode_neg_context() doesn't properly check against SMB packet boundaries prior to accessing individual negotiate context entries. This is due to the length check omitting the eight byte smb2_neg_context header, as well as incorrect decrementing of len_of_ctxts. Fixes: 5100d8a3fe03 ("SMB311: Improve checking of negotiate security contexts") Reported-by: Volker Lendecke <vl@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-04-1322-105/+233
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: tools/testing/selftests/net/config 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 3a0385be133e ("selftests: add the missing CONFIG_IP_SCTP in net config") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | netfs: Fix netfs_extract_iter_to_sg() for ITER_UBUF/IOVECDavid Howells2023-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix netfs_extract_iter_to_sg() for ITER_UBUF and ITER_IOVEC to set the size of the page to the part of the page extracted, not the remaining amount of data in the extracted page array at that point. This doesn't yet affect anything as cifs, the only current user, only passes in non-user-backed iterators. Fixes: 018584697533 ("netfs: Add a function to extract an iterator into a scatterlist") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: Steve French <sfrench@samba.org> Cc: Shyam Prasad N <nspmangalore@gmail.com> Cc: Rohith Surabattula <rohiths.msft@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | Merge tag 'for-6.3-rc6-tag' of ↵Linus Torvalds2023-04-112-2/+16
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix fast checksum detection, this affects filesystems with non-crc32c checksum, calculation would not be offloaded to worker threads - restore thread_pool mount option behaviour for endio workers, the new value for maximum active threads would not be set to the actual work queues * tag 'for-6.3-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix fast csum implementation detection btrfs: restore the thread_pool= behavior in remount for the end I/O workqueues
| | * | | btrfs: fix fast csum implementation detectionChristoph Hellwig2023-04-062-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BTRFS_FS_CSUM_IMPL_FAST flag is currently set whenever a non-generic crc32c is detected, which is the incorrect check if the file system uses a different checksumming algorithm. Refactor the code to only check this if crc32c is actually used. Note that in an ideal world the information if an algorithm is hardware accelerated or not should be provided by the crypto API instead, but that's left for another day. CC: stable@vger.kernel.org # 5.4.x: c8a5f8ca9a9c: btrfs: print checksum type and implementation at mount time CC: stable@vger.kernel.org # 5.4.x Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | | btrfs: restore the thread_pool= behavior in remount for the end I/O workqueuesChristoph Hellwig2023-04-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d7b9416fe5c5 ("btrfs: remove btrfs_end_io_wq") converted the read and I/O handling from btrfs_workqueues to Linux workqueues, and as part of that lost the code to apply the thread_pool= based max_active limit on remount. Restore it. Fixes: d7b9416fe5c5 ("btrfs: remove btrfs_end_io_wq") CC: stable@vger.kernel.org # 6.0+ Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | Merge tag '9p-6.3-fixes-rc7' of ↵Linus Torvalds2023-04-101-3/+5
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p fixes from Eric Van Hensbergen: "These are some collected fixes for the 6.3-rc series that have been passed our 9p regression tests and been in for-next for at least a week. They include a fix for a KASAN reported problem in the extended attribute handling code and a use after free in the xen transport. This also includes some updates for the MAINTAINERS file including the transition of our development mailing list from sourceforge.net to lists.linux.dev" * tag '9p-6.3-fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: Update email address and mailing list for v9fs 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition 9P FS: Fix wild-memory-access write in v9fs_get_acl
| | * | | 9P FS: Fix wild-memory-access write in v9fs_get_aclIvan Orlov2023-03-271-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KASAN reported the following issue: [ 36.825817][ T5923] BUG: KASAN: wild-memory-access in v9fs_get_acl+0x1a4/0x390 [ 36.827479][ T5923] Write of size 4 at addr 9fffeb37f97f1c00 by task syz-executor798/5923 [ 36.829303][ T5923] [ 36.829846][ T5923] CPU: 0 PID: 5923 Comm: syz-executor798 Not tainted 6.2.0-syzkaller-18302-g596b6b709632 #0 [ 36.832110][ T5923] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 [ 36.834464][ T5923] Call trace: [ 36.835196][ T5923] dump_backtrace+0x1c8/0x1f4 [ 36.836229][ T5923] show_stack+0x2c/0x3c [ 36.837100][ T5923] dump_stack_lvl+0xd0/0x124 [ 36.838103][ T5923] print_report+0xe4/0x4c0 [ 36.839068][ T5923] kasan_report+0xd4/0x130 [ 36.840052][ T5923] kasan_check_range+0x264/0x2a4 [ 36.841199][ T5923] __kasan_check_write+0x2c/0x3c [ 36.842216][ T5923] v9fs_get_acl+0x1a4/0x390 [ 36.843232][ T5923] v9fs_mount+0x77c/0xa5c [ 36.844163][ T5923] legacy_get_tree+0xd4/0x16c [ 36.845173][ T5923] vfs_get_tree+0x90/0x274 [ 36.846137][ T5923] do_new_mount+0x25c/0x8c8 [ 36.847066][ T5923] path_mount+0x590/0xe58 [ 36.848147][ T5923] __arm64_sys_mount+0x45c/0x594 [ 36.849273][ T5923] invoke_syscall+0x98/0x2c0 [ 36.850421][ T5923] el0_svc_common+0x138/0x258 [ 36.851397][ T5923] do_el0_svc+0x64/0x198 [ 36.852398][ T5923] el0_svc+0x58/0x168 [ 36.853224][ T5923] el0t_64_sync_handler+0x84/0xf0 [ 36.854293][ T5923] el0t_64_sync+0x190/0x194 Calling '__v9fs_get_acl' method in 'v9fs_get_acl' creates the following chain of function calls: __v9fs_get_acl v9fs_fid_get_acl v9fs_fid_xattr_get p9_client_xattrwalk Function p9_client_xattrwalk accepts a pointer to u64-typed variable attr_size and puts some u64 value into it. However, after the executing the p9_client_xattrwalk, in some circumstances we assign the value of u64-typed variable 'attr_size' to the variable 'retval', which we will return. However, the type of 'retval' is ssize_t, and if the value of attr_size is larger than SSIZE_MAX, we will face the signed type overflow. If the overflow occurs, the result of v9fs_fid_xattr_get may be negative, but not classified as an error. When we try to allocate an acl with 'broken' size we receive an error, but don't process it. When we try to free this acl, we face the 'wild-memory-access' error (because it wasn't allocated). This patch will add new condition to the 'v9fs_fid_xattr_get' function, so it will return an EOVERFLOW error if the 'attr_size' is larger than SSIZE_MAX. In this version of the patch I simplified the condition. In previous (v2) version of the patch I removed explicit type conversion and added separate condition to check the possible overflow and return an error (in v1 version I've just modified the existing condition). Tested via syzkaller. Suggested-by: Christian Schoenebeck <linux_oss@crudebyte.com> Reported-by: syzbot+cb1d16facb3cc90de5fb@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=fbbef66d9e4d096242f3617de5d14d12705b4659 Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
| * | | | Merge tag '6.3-rc5-smb3-cifs-client-fixes' of ↵Linus Torvalds2023-04-084-8/+12
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.samba.org/sfrench/cifs-2.6 Pull cifs client fixes from Steve French: "Two cifs/smb3 client fixes, one for stable: - double lock fix for a cifs/smb1 reconnect path - DFS prefixpath fix for reconnect when server moved" * tag '6.3-rc5-smb3-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: double lock in cifs_reconnect_tcon() cifs: sanitize paths in cifs_update_super_prepath.
| | * | | | cifs: double lock in cifs_reconnect_tcon()Dan Carpenter2023-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This lock was supposed to be an unlock. Fixes: 6cc041e90c17 ("cifs: avoid races in parallel reconnects in smb1") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| | * | | | cifs: sanitize paths in cifs_update_super_prepath.Thiago Rafael Becker2023-04-053-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After a server reboot, clients are failing to move files with ENOENT. This is caused by DFS referrals containing multiple separators, which the server move call doesn't recognize. v1: Initial patch. v2: Move prototype to header. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182472 Fixes: a31080899d5f ("cifs: sanitize multiple delimiters in prepath") Actually-Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Thiago Rafael Becker <tbecker@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | | Merge tag 'mm-hotfixes-stable-2023-04-07-16-23' of ↵Linus Torvalds2023-04-086-12/+59
| |\ \ \ \ \ | | | |_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "28 hotfixes. 23 are cc:stable and the other five address issues which were introduced during this merge cycle. 20 are for MM and the remainder are for other subsystems" * tag 'mm-hotfixes-stable-2023-04-07-16-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits) maple_tree: fix a potential concurrency bug in RCU mode maple_tree: fix get wrong data_end in mtree_lookup_walk() mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() nilfs2: fix sysfs interface lifetime mm: take a page reference when removing device exclusive entries mm: vmalloc: avoid warn_alloc noise caused by fatal signal nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad field nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread() zsmalloc: document freeable stats zsmalloc: document new fullness grouping fsdax: force clear dirty mark if CoW mm/hugetlb: fix uffd wr-protection for CoW optimization path mm: enable maple tree RCU mode by default maple_tree: add RCU lock checking to rcu callback functions maple_tree: add smp_rmb() to dead node detection maple_tree: fix write memory barrier of nodes once dead for RCU mode maple_tree: remove extra smp_wmb() from mas_dead_leaves() maple_tree: fix freeing of nodes in rcu mode maple_tree: detect dead nodes in mas_start() maple_tree: be more cautious about dead nodes ...
| | * | | | nilfs2: fix sysfs interface lifetimeRyusuke Konishi2023-04-052-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current nilfs2 sysfs support has issues with the timing of creation and deletion of sysfs entries, potentially leading to null pointer dereferences, use-after-free, and lockdep warnings. Some of the sysfs attributes for nilfs2 per-filesystem instance refer to metadata file "cpfile", "sufile", or "dat", but nilfs_sysfs_create_device_group that creates those attributes is executed before the inodes for these metadata files are loaded, and nilfs_sysfs_delete_device_group which deletes these sysfs entries is called after releasing their metadata file inodes. Therefore, access to some of these sysfs attributes may occur outside of the lifetime of these metadata files, resulting in inode NULL pointer dereferences or use-after-free. In addition, the call to nilfs_sysfs_create_device_group() is made during the locking period of the semaphore "ns_sem" of nilfs object, so the shrinker call caused by the memory allocation for the sysfs entries, may derive lock dependencies "ns_sem" -> (shrinker) -> "locks acquired in nilfs_evict_inode()". Since nilfs2 may acquire "ns_sem" deep in the call stack holding other locks via its error handler __nilfs_error(), this causes lockdep to report circular locking. This is a false positive and no circular locking actually occurs as no inodes exist yet when nilfs_sysfs_create_device_group() is called. Fortunately, the lockdep warnings can be resolved by simply moving the call to nilfs_sysfs_create_device_group() out of "ns_sem". This fixes these sysfs issues by revising where the device's sysfs interface is created/deleted and keeping its lifetime within the lifetime of the metadata files above. Link: https://lkml.kernel.org/r/20230330205515.6167-1-konishi.ryusuke@gmail.com Fixes: dd70edbde262 ("nilfs2: integrate sysfs support into driver") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+979fa7f9c0d086fdc282@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/0000000000003414b505f7885f7e@google.com Reported-by: syzbot+5b7d542076d9bddc3c6a@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/0000000000006ac86605f5f44eb9@google.com Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * | | | nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad fieldTetsuo Handa2023-04-052-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nilfs_btree_assign_p() and nilfs_direct_assign_p() are not initializing "struct nilfs_binfo_dat"->bi_pad field, causing uninit-value reports when being passed to CRC function. Link: https://lkml.kernel.org/r/20230326152146.15872-1-konishi.ryusuke@gmail.com Reported-by: syzbot <syzbot+048585f3f4227bb2b49b@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=048585f3f4227bb2b49b Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Link: https://lkml.kernel.org/r/CANX2M5bVbzRi6zH3PTcNE_31TzerstOXUa9Bay4E6y6dX23_pg@mail.gmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * | | | nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()Ryusuke Konishi2023-04-051-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The finalization of nilfs_segctor_thread() can race with nilfs_segctor_kill_thread() which terminates that thread, potentially causing a use-after-free BUG as KASAN detected. At the end of nilfs_segctor_thread(), it assigns NULL to "sc_task" member of "struct nilfs_sc_info" to indicate the thread has finished, and then notifies nilfs_segctor_kill_thread() of this using waitqueue "sc_wait_task" on the struct nilfs_sc_info. However, here, immediately after the NULL assignment to "sc_task", it is possible that nilfs_segctor_kill_thread() will detect it and return to continue the deallocation, freeing the nilfs_sc_info structure before the thread does the notification. This fixes the issue by protecting the NULL assignment to "sc_task" and its notification, with spinlock "sc_state_lock" of the struct nilfs_sc_info. Since nilfs_segctor_kill_thread() does a final check to see if "sc_task" is NULL with "sc_state_lock" locked, this can eliminate the race. Link: https://lkml.kernel.org/r/20230327175318.8060-1-konishi.ryusuke@gmail.com Reported-by: syzbot+b08ebcc22f8f3e6be43a@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/00000000000000660d05f7dfa877@google.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * | | | fsdax: force clear dirty mark if CoWShiyang Ruan2023-04-051-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFS allows CoW on non-shared extents to combat fragmentation[1]. The old non-shared extent could be mwrited before, its dax entry is marked dirty. This results in a WARNing: [ 28.512349] ------------[ cut here ]------------ [ 28.512622] WARNING: CPU: 2 PID: 5255 at fs/dax.c:390 dax_insert_entry+0x342/0x390 [ 28.513050] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables [ 28.515462] CPU: 2 PID: 5255 Comm: fsstress Kdump: loaded Not tainted 6.3.0-rc1-00001-g85e1481e19c1-dirty #117 [ 28.515902] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014 [ 28.516307] RIP: 0010:dax_insert_entry+0x342/0x390 [ 28.516536] Code: 30 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 45 20 48 83 c0 01 e9 e2 fe ff ff 48 8b 45 20 48 83 c0 01 e9 cd fe ff ff <0f> 0b e9 53 ff ff ff 48 8b 7c 24 08 31 f6 e8 1b 61 a1 00 eb 8c 48 [ 28.517417] RSP: 0000:ffffc9000845fb18 EFLAGS: 00010086 [ 28.517721] RAX: 0000000000000053 RBX: 0000000000000155 RCX: 000000000018824b [ 28.518113] RDX: 0000000000000000 RSI: ffffffff827525a6 RDI: 00000000ffffffff [ 28.518515] RBP: ffffea00062092c0 R08: 0000000000000000 R09: ffffc9000845f9c8 [ 28.518905] R10: 0000000000000003 R11: ffffffff82ddb7e8 R12: 0000000000000155 [ 28.519301] R13: 0000000000000000 R14: 000000000018824b R15: ffff88810cfa76b8 [ 28.519703] FS: 00007f14a0c94740(0000) GS:ffff88817bd00000(0000) knlGS:0000000000000000 [ 28.520148] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 28.520472] CR2: 00007f14a0c8d000 CR3: 000000010321c004 CR4: 0000000000770ee0 [ 28.520863] PKRU: 55555554 [ 28.521043] Call Trace: [ 28.521219] <TASK> [ 28.521368] dax_fault_iter+0x196/0x390 [ 28.521595] dax_iomap_pte_fault+0x19b/0x3d0 [ 28.521852] __xfs_filemap_fault+0x234/0x2b0 [ 28.522116] __do_fault+0x30/0x130 [ 28.522334] do_fault+0x193/0x340 [ 28.522586] __handle_mm_fault+0x2d3/0x690 [ 28.522975] handle_mm_fault+0xe6/0x2c0 [ 28.523259] do_user_addr_fault+0x1bc/0x6f0 [ 28.523521] exc_page_fault+0x60/0x140 [ 28.523763] asm_exc_page_fault+0x22/0x30 [ 28.524001] RIP: 0033:0x7f14a0b589ca [ 28.524225] Code: c5 fe 7f 07 c5 fe 7f 47 20 c5 fe 7f 47 40 c5 fe 7f 47 60 c5 f8 77 c3 66 0f 1f 84 00 00 00 00 00 40 0f b6 c6 48 89 d1 48 89 fa <f3> aa 48 89 d0 c5 f8 77 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 [ 28.525198] RSP: 002b:00007fff1dea1c98 EFLAGS: 00010202 [ 28.525505] RAX: 000000000000001e RBX: 000000000014a000 RCX: 0000000000006046 [ 28.525895] RDX: 00007f14a0c82000 RSI: 000000000000001e RDI: 00007f14a0c8d000 [ 28.526290] RBP: 000000000000006f R08: 0000000000000004 R09: 000000000014a000 [ 28.526681] R10: 0000000000000008 R11: 0000000000000246 R12: 028f5c28f5c28f5c [ 28.527067] R13: 8f5c28f5c28f5c29 R14: 0000000000011046 R15: 00007f14a0c946c0 [ 28.527449] </TASK> [ 28.527600] ---[ end trace 0000000000000000 ]--- To be able to delete this entry, clear its dirty mark before invalidate_inode_pages2_range(). [1] https://lore.kernel.org/linux-xfs/20230321151339.GA11376@frogsfrogsfrogs/ Link: https://lkml.kernel.org/r/1679653680-2-1-git-send-email-ruansy.fnst@fujitsu.com Fixes: f80e1668888f3 ("fsdax: invalidate pages when CoW") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * | | | fsdax: dedupe should compare the min of two iters' lengthShiyang Ruan2023-03-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an dedupe comparison iter loop, the length of iomap_iter decreases because it implies the remaining length after each iteration. The dedupe command will fail with -EIO if the range is larger than one page size and not aligned to the page size. Also report warning in dmesg: [ 4338.498374] ------------[ cut here ]------------ [ 4338.498689] WARNING: CPU: 3 PID: 1415645 at fs/iomap/iter.c:16 ... The compare function should use the min length of the current iters, not the total length. Link: https://lkml.kernel.org/r/1679469958-2-1-git-send-email-ruansy.fnst@fujitsu.com Fixes: 0e79e3736d54 ("fsdax: dedupe: iter two files at the same time") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | * | | | fsdax: unshare: zero destination if srcmap is HOLE or UNWRITTENShiyang Ruan2023-03-281-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unshare copies data from source to destination. But if the source is HOLE or UNWRITTEN extents, we should zero the destination, otherwise the HOLE or UNWRITTEN part will be user-visible old data of the new allocated extent. Found by running generic/649 while mounting with -o dax=always on pmem. Link: https://lkml.kernel.org/r/1679483469-2-1-git-send-email-ruansy.fnst@fujitsu.com Fixes: d984648e428b ("fsdax,xfs: port unshare to fsdax") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | | | Merge tag '6.3-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds2023-04-078-79/+140
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ksmbd server fixes from Steve French: "Four fixes, three for stable: - slab out of bounds fix - lock cancellation fix - minor cleanup to address clang warning - fix for xfstest 551 (wrong parms passed to kvmalloc)" * tag '6.3-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr ksmbd: delete asynchronous work from list ksmbd: remove unused is_char_allowed function ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN
| | * | | | | ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdrNamjae Jeon2023-04-024-37/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When smb1 mount fails, KASAN detect slab-out-of-bounds in init_smb2_rsp_hdr like the following one. For smb1 negotiate(56bytes) , init_smb2_rsp_hdr() for smb2 is called. The issue occurs while handling smb1 negotiate as smb2 server operations. Add smb server operations for smb1 (get_cmd_val, init_rsp_hdr, allocate_rsp_buf, check_user_session) to handle smb1 negotiate so that smb2 server operation does not handle it. [ 411.400423] CIFS: VFS: Use of the less secure dialect vers=1.0 is not recommended unless required for access to very old servers [ 411.400452] CIFS: Attempting to mount \\192.168.45.139\homes [ 411.479312] ksmbd: init_smb2_rsp_hdr : 492 [ 411.479323] ================================================================== [ 411.479327] BUG: KASAN: slab-out-of-bounds in init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479369] Read of size 16 at addr ffff888488ed0734 by task kworker/14:1/199 [ 411.479379] CPU: 14 PID: 199 Comm: kworker/14:1 Tainted: G OE 6.1.21 #3 [ 411.479386] Hardware name: ASUSTeK COMPUTER INC. Z10PA-D8 Series/Z10PA-D8 Series, BIOS 3801 08/23/2019 [ 411.479390] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 411.479425] Call Trace: [ 411.479428] <TASK> [ 411.479432] dump_stack_lvl+0x49/0x63 [ 411.479444] print_report+0x171/0x4a8 [ 411.479452] ? kasan_complete_mode_report_info+0x3c/0x200 [ 411.479463] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479497] kasan_report+0xb4/0x130 [ 411.479503] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479537] kasan_check_range+0x149/0x1e0 [ 411.479543] memcpy+0x24/0x70 [ 411.479550] init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479585] handle_ksmbd_work+0x109/0x760 [ksmbd] [ 411.479616] ? _raw_spin_unlock_irqrestore+0x50/0x50 [ 411.479624] ? smb3_encrypt_resp+0x340/0x340 [ksmbd] [ 411.479656] process_one_work+0x49c/0x790 [ 411.479667] worker_thread+0x2b1/0x6e0 [ 411.479674] ? process_one_work+0x790/0x790 [ 411.479680] kthread+0x177/0x1b0 [ 411.479686] ? kthread_complete_and_exit+0x30/0x30 [ 411.479692] ret_from_fork+0x22/0x30 [ 411.479702] </TASK> Fixes: 39b291b86b59 ("ksmbd: return unsupported error on smb1 mount") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
| | * | | | | ksmbd: delete asynchronous work from listNamjae Jeon2023-04-024-20/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When smb2_lock request is canceled by smb2_cancel or smb2_close(), ksmbd is missing deleting async_request_entry async_requests list. Because calling init_smb2_rsp_hdr() in smb2_lock() mark ->synchronous as true and then it will not be deleted in ksmbd_conn_try_dequeue_request(). This patch add release_async_work() to release the ones allocated for async work. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
| | * | | | | ksmbd: remove unused is_char_allowed functionTom Rix2023-03-251-18/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | clang with W=1 reports fs/ksmbd/unicode.c:122:19: error: unused function 'is_char_allowed' [-Werror,-Wunused-function] static inline int is_char_allowed(char *ch) ^ This function is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
| | * | | | | ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARNMarios Makassikis2023-03-251-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 83dcedd5540d ("ksmbd: fix infinite loop in ksmbd_conn_handler_loop()"), changes GFP modifiers passed to kvmalloc(). This cause xfstests generic/551 test to fail. We limit pdu length size according to connection status and maximum number of connections. In the rest, memory allocation of request is limited by credit management. so these flags are no longer needed. Fixes: 83dcedd5540d ("ksmbd: fix infinite loop in ksmbd_conn_handler_loop()") Cc: stable@vger.kernel.org Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2023-04-0614-50/+147
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/google/gve/gve.h 3ce934558097 ("gve: Secure enough bytes in the first TX desc for all TCP pkts") 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") https://lore.kernel.org/all/20230406104927.45d176f5@canb.auug.org.au/ https://lore.kernel.org/all/c5872985-1a95-0bc8-9dcc-b6f23b439e9d@tessares.net/ Adjacent changes: net/can/isotp.c 051737439eae ("can: isotp: fix race between isotp_sendsmg() and isotp_release()") 96d1c81e6a04 ("can: isotp: add module parameter for maximum pdu size") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | | | | Merge tag 'nfsd-6.3-5' of ↵Linus Torvalds2023-04-044-10/+11
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash and a resource leak in NFSv4 COMPOUND processing - Fix issues with AUTH_SYS credential handling - Try again to address an NFS/NFSD/SUNRPC build dependency regression * tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: callback request does not use correct credential for AUTH_SYS NFS: Remove "select RPCSEC_GSS_KRB5 sunrpc: only free unix grouplist after RCU settles nfsd: call op_release, even when op_func returns an error NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
| | * | | | | | NFSD: callback request does not use correct credential for AUTH_SYSDai Ngo2023-04-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently callback request does not use the credential specified in CREATE_SESSION if the security flavor for the back channel is AUTH_SYS. Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error: DELEG5 st_delegation.testCBSecParms : FAILURE expected callback with uid, gid == 17, 19, got 0, 0 Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: 8276c902bbe9 ("SUNRPC: remove uid and gid from struct auth_cred") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| | * | | | | | NFS: Remove "select RPCSEC_GSS_KRB5Chuck Lever2023-04-041-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If CONFIG_CRYPTO=n (e.g. arm/shmobile_defconfig): WARNING: unmet direct dependencies detected for RPCSEC_GSS_KRB5 Depends on [n]: NETWORK_FILESYSTEMS [=y] && SUNRPC [=y] && CRYPTO [=n] Selected by [y]: - NFS_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFS_FS [=y] As NFSv4 can work without crypto enabled, remove the RPCSEC_GSS_KRB5 dependency altogether. Trond says: > It is possible to use the NFSv4.1 client with just AUTH_SYS, and > in fact there are plenty of people out there using only that. The > fact that RFC5661 gets its knickers in a twist about RPCSEC_GSS > support is largely irrelevant to those people. > > The other issue is that ’select’ enforces the strict dependency > that if the NFS client is compiled into the kernel, then the > RPCSEC_GSS and kerberos code needs to be compiled in as well: they > cannot exist as modules. Fixes: e57d06527738 ("NFS & NFSD: Update GSS dependencies") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se> Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| | * | | | | | nfsd: call op_release, even when op_func returns an errorJeff Layton2023-03-312-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For ops with "trivial" replies, nfsd4_encode_operation will shortcut most of the encoding work and skip to just marshalling up the status. One of the things it skips is calling op_release. This could cause a memory leak in the layoutget codepath if there is an error at an inopportune time. Have the compound processing engine always call op_release, even when op_func sets an error in op->status. With this change, we also need nfsd4_block_get_device_info_scsi to set the gd_device pointer to NULL on error to avoid a double free. Reported-by: Zhi Li <yieli@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2181403 Fixes: 34b1744c91cc ("nfsd4: define ->op_release for compound ops") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>