path: root/fs/cifs/file.c
Commit message (Author, Age, Files, Lines)
* smb: move client and server files to common directory fs/smb (Steve French, 2023-05-24, 1 file, -5097/+0)
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko and helper modules) to new fs/smb subdirectory:

    fs/cifs --> fs/smb/client
    fs/ksmbd --> fs/smb/server
    fs/smbfs_common --> fs/smb/common

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix cifs_limit_bvec_subset() to correctly check the maxmimum size (David Howells, 2023-05-23, 1 file, -1/+2)
Fix cifs_limit_bvec_subset() so that it limits the span to the maximum specified and won't return with a size greater than max_size.

Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Cc: stable@vger.kernel.org # 6.3
Reported-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <smfrench@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
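The clamping that this fix describes can be illustrated with a minimal userspace sketch in C. This is not the kernel function: limit_span() and its plain array of segment lengths are hypothetical stand-ins for cifs_limit_bvec_subset() and the bvec array, and the handling of the iterator's starting offset is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of limiting a span of segments.
 * seg_len[i] is the length in bytes of segment i.
 * Returns the number of bytes covered, never more than max_size,
 * and stores the number of segments touched in *used_segs.
 */
size_t limit_span(const size_t *seg_len, size_t nr_segs,
                  size_t max_size, size_t max_segs, size_t *used_segs)
{
    size_t span = 0, used = 0;

    assert(used_segs != NULL);

    while (used < nr_segs && used < max_segs && span < max_size) {
        size_t room = max_size - span;  /* bytes still allowed */
        size_t len = seg_len[used] < room ? seg_len[used] : room;

        span += len;  /* the last segment may be taken only partly */
        used++;
    }

    *used_segs = used;
    return span;
}
```

With three 4096-byte segments and max_size set to 6000, the sketch covers two segments and returns exactly 6000, taking only 1904 bytes of the second segment.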
* SMB3: drop reference to cfile before sending oplock break (Bharath SM, 2023-05-17, 1 file, -5/+12)
cifs_oplock_break() drops its reference to the cfile at the end of the function. As a result, the close command goes on the wire after the lease break acknowledgment, even if the application has already closed the file and we had only deferred the handle close. If another client with limited file share access is waiting on the lease break ack and proceeds to operate on that file as soon as the first client sends the ack, it may hit a status sharing violation error because of the still-open handle. The solution is to put the reference to the cfile first (sending the close on the wire if it is the last reference) and then send the oplock acknowledgment to the server.

Fixes: 9e31678fb403 ("SMB3: fix lease break timeout when multiple deferred close handles for the same file.")
Cc: stable@kernel.org
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
* SMB3: Close all deferred handles of inode in case of handle lease break (Bharath SM, 2023-05-17, 1 file, -8/+1)
An oplock break may occur for a different file handle than the deferred handle. Check the inode's deferred closes list; if it is not empty, close all the deferred handles of the inode, because we should not cache handles if we don't have a handle lease.

For example, if openfilelist has one deferred file handle and another file handle opened by an application for the same file, then on a lease break we choose the first handle in openfilelist. That first handle can be either the deferred handle or the handle actually opened by the application. If it is the actual open handle, then today we don't close the deferred handles when we lose the handle lease on the file. The problem is that if the application later closes its open handle, we keep caching the deferred handles until the deferred close timeout. Leaving a handle open may result in a sharing violation when a Windows client tries to open the file with limited file share access.

So we should walk the inode's list of deferred files and close all of them.

Fixes: 9e31678fb403 ("SMB3: fix lease break timeout when multiple deferred close handles for the same file.")
Cc: stable@kernel.org
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
* SMB3: Close deferred file handles in case of handle lease break (Bharath SM, 2023-04-27, 1 file, -0/+16)
We should not cache deferred file handles if we don't have a handle lease on a file, and we should immediately close all deferred handles in case of a handle lease break.

Fixes: 9e31678fb403 ("SMB3: fix lease break timeout when multiple deferred close handles for the same file.")
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix unbuffered read (David Howells, 2023-04-18, 1 file, -4/+0)
If read() is done in an unbuffered manner, such that, say, cifs_strict_readv() goes through cifs_user_readv() and thence __cifs_readv(), it doesn't recognise the EOF and keeps indicating to userspace that it is returning full buffers of data.

This is due to ctx->iter being advanced in cifs_send_async_read() as the buffer is split up amongst a number of rdata objects. The iterator count is then used in collect_uncached_read_data() in the non-DIO case to set the total length read - and thus the return value of sys_read(). But since the iterator normally gets used up completely during splitting, ctx->total_len gets overridden to the full amount.

However, prior to that in collect_uncached_read_data(), we've gone through the list of rdatas and added up the amount of data we actually received (which we then throw away).

Fix this by removing the bit that overrides the amount read in the non-DIO case and just going with the total added up in the aforementioned loop.

This was observed by mounting a cifs share with multiple channels, e.g.:

    mount //192.168.6.1/test /test/ -o user=shares,pass=...,max_channels=6

and then reading a 1MiB file on the share:

    strace cat /xfstest.test/1M >/dev/null

Through strace, the same data can be seen being read again and again.

Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
cc: Jérôme Glisse <jglisse@redhat.com>
cc: Long Li <longli@microsoft.com>
cc: Enzo Matsumiya <ematsumiya@suse.de>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
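The difference between the amount requested and the amount actually received can be shown with a small userspace sketch in C. The struct chunk type and the two helper functions are hypothetical; they only model the idea of several rdata objects, each covering part of one read, and are not the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one piece of a split-up read. */
struct chunk {
    size_t requested;  /* bytes this piece was asked to read */
    size_t got_bytes;  /* bytes the server actually returned */
};

/* Models the buggy result: the full size of the split-up buffer. */
size_t total_requested(const struct chunk *c, size_t n)
{
    size_t total = 0;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += c[i].requested;
    return total;
}

/* Models the fixed result: only what was really received. */
size_t total_received(const struct chunk *c, size_t n)
{
    size_t total = 0;

    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += c[i].got_bytes;
    return total;
}
```

For a read split into three 4096-byte pieces where the file ends 1000 bytes into the second piece, total_requested() gives 12288 while total_received() gives 5096. Returning the first value is what made read() appear never to reach EOF.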
* cifs: check only tcon status on tcon related functions (Shyam Prasad N, 2023-03-17, 1 file, -4/+4)
We had a couple of checks for session in cifs_tree_connect and cifs_mark_open_files_invalid, which were unnecessary. And that was done with ses_lock. Changed that to tc_lock too.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix memory leak in direct I/O (David Howells, 2023-03-01, 1 file, -2/+2)
When __cifs_readv() and __cifs_writev() extract pages from a user-backed iterator into a BVEC-type iterator, they set ->bv_need_unpin to note whether they need to unpin the pages later. However, in both cases they examine the BVEC-type iterator and not the source iterator - and so bv_need_unpin doesn't get set and the pages are leaked.

I think this may be responsible for the generic/208 xfstest failing occasionally with:

    WARNING: CPU: 0 PID: 3064 at mm/gup.c:218 try_grab_page+0x65/0x100
    RIP: 0010:try_grab_page+0x65/0x100
    follow_page_pte+0x1a7/0x570
    __get_user_pages+0x1a2/0x650
    __gup_longterm_locked+0xdc/0xb50
    internal_get_user_pages_fast+0x17f/0x310
    pin_user_pages_fast+0x46/0x60
    iov_iter_extract_pages+0xc9/0x510
    ? __kmalloc_large_node+0xb1/0x120
    ? __kmalloc_node+0xbe/0x130
    netfs_extract_user_iter+0xbf/0x200 [netfs]
    __cifs_writev+0x150/0x330 [cifs]
    vfs_write+0x2a8/0x3c0
    ksys_pwrite64+0x65/0xa0

with the page refcount going negative. This is less unlikely than it seems because the page is being pinned, not simply got, so the refcount is increased by 1024 each time and pinning only needs to happen around 2097152 times for the refcount to go negative. Further, the test program (aio-dio-invalidate-failure) uses a 32MiB static buffer and all the PTEs covering it refer to the same page because it's never written to.

The warning in try_grab_page():

    if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
        return -ENOMEM;

then trips and prevents us ever using the page again for DIO at least.

Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Link: https://lore.kernel.org/r/CAH2r5mvaTsJ---n=265a4zqRA7pP+o4MJ36WCQUS6oPrOij8cw@mail.gmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
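The figure of 2097152 quoted above follows from dividing the range of a signed 32-bit refcount by the per-pin increment of 1024. The following C sketch reproduces that arithmetic; the 4 KiB page size is an assumption that does not come from the commit message, the function names are made up for illustration, and the page's initial refcount is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define PIN_BIAS   1024u                  /* refcount increment per pin, as quoted above */
#define PAGE_BYTES 4096u                  /* assumption: 4 KiB pages */
#define BUF_BYTES  (32u * 1024u * 1024u)  /* 32 MiB test buffer, as quoted above */

/* Leaked pins needed before a signed 32-bit count reaches 2^31. */
uint64_t pins_until_negative(void)
{
    return ((uint64_t)1 << 31) / PIN_BIAS;
}

/*
 * Pages covering the buffer. When the buffer has never been written,
 * every one of them maps to the same physical page, so one direct I/O
 * over the whole buffer pins that page this many times.
 */
uint64_t pins_per_pass(void)
{
    return BUF_BYTES / PAGE_BYTES;
}

/* Whole-buffer direct I/O operations needed to leak that many pins. */
uint64_t passes_until_negative(void)
{
    assert(pins_per_pass() != 0);
    return pins_until_negative() / pins_per_pass();
}
```

Under these assumptions, one pass over the buffer leaks 8192 pins on a single page, so roughly 256 whole-buffer I/Os would be enough, which suggests why a test run could plausibly reach the warning.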
* cifs: Fix cifs_write_back_from_locked_folio() (David Howells, 2023-03-01, 1 file, -0/+1)
cifs_write_back_from_locked_folio() should return the number of bytes read, but returns the result of ->async_writev(), which will be 0 on success. As it happens, this doesn't prevent cifs_writepages_region() from working as it will then examine and ignore the pages that are no longer dirty rather than just skipping over them.

Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Add some missing xas_retry() calls (David Howells, 2023-03-01, 1 file, -0/+6)
The xas_for_each loops added into fs/cifs/file.c need to go round again if indicated by xas_retry().

Fixes: b8713c4dbfa3 ("cifs: Add some helper functions")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix cifs_writepages_region() (David Howells, 2023-02-24, 1 file, -1/+2)
Fix the cifs_writepages_region() to just jump over members of the batch that have been cleaned up rather than counting them as skipped.

Unlike the other "skip_write" cases, this situation happens even for WB_SYNC_ALL, simply because the page has either been cleaned by somebody else, or was truncated.

So in this case we're not "skipping" the write, we simply no longer need any write at all, so it's very different from the other skip_write cases.

And we definitely shouldn't stop writing the rest just because of too many of these cases (or because we want to be rescheduled).

Fixes: 3822a7c40997 ("Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/lkml/2213409.1677249075@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'mm-stable-2023-02-20-13-37' of ↵ (Linus Torvalds, 2023-02-23, 1 file, -52/+66)
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

- Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing.
- Several folioification patch series from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in the series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".

* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
    include/linux/migrate.h: remove unneeded externs
    mm/memory_hotplug: cleanup return value handing in do_migrate_range()
    mm/uffd: fix comment in handling pte markers
    mm: change to return bool for isolate_movable_page()
    mm: hugetlb: change to return bool for isolate_hugetlb()
    mm: change to return bool for isolate_lru_page()
    mm: change to return bool for folio_isolate_lru()
    objtool: add UACCESS exceptions for __tsan_volatile_read/write
    kmsan: disable ftrace in kmsan core code
    kasan: mark addr_has_metadata __always_inline
    mm: memcontrol: rename memcg_kmem_enabled()
    sh: initialize max_mapnr
    m68k/nommu: add missing definition of ARCH_PFN_OFFSET
    mm: percpu: fix incorrect size in pcpu_obj_full_size()
    maple_tree: reduce stack usage with gcc-9 and earlier
    mm: page_alloc: call panic() when memoryless node allocation fails
    mm: multi-gen LRU: avoid futile retries
    migrate_pages: move THP/hugetlb migration support check to simplify code
    migrate_pages: batch flushing TLB
    migrate_pages: share more code between _unmap and _move
    ...
| * fs: convert writepage_t callback to pass a folio (Matthew Wilcox (Oracle), 2023-02-02, 1 file, -4/+4)
Patch series "Convert writepage_t to use a folio".

More folioisation. I split out the mpage work from everything else because it completely dominated the patch, but some implementations I just converted outright.

This patch (of 2):

We always write back an entire folio, but that's currently passed as the head page. Convert all filesystems that use write_cache_pages() to expect a folio instead of a page.

Link: https://lkml.kernel.org/r/20230126201255.1681189-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230126201255.1681189-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * cifs: convert wdata_alloc_and_fillpages() to use filemap_get_folios_tag() (Vishal Moola (Oracle), 2023-02-02, 1 file, -3/+29)
This is in preparation for the removal of find_get_pages_range_tag(). Now also supports the use of large folios.

Since tofind might be larger than the max number of folios in a folio_batch (15), we loop through filling in wdata->pages pulling more batches until we either reach tofind pages or run out of folios.

This function may not return all pages in the last found folio before tofind pages are reached.

Link: https://lkml.kernel.org/r/20230104211448.4804-10-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: Tom Talpey <tom@talpey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
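The batching loop described above can be modelled in userspace with a short C sketch. It is only a model: fill_pages() and the array of per-folio page counts are hypothetical, and no page cache lookup or reference counting is performed. The batch size of 15 is the folio_batch capacity quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 15  /* folio_batch capacity quoted in the commit message */

/*
 * Hypothetical model: folio_pages[i] is the number of pages in folio i.
 * Returns how many pages were gathered, at most tofind.
 */
size_t fill_pages(const size_t *folio_pages, size_t nr_folios, size_t tofind)
{
    size_t found = 0, next = 0;

    assert(folio_pages != NULL || nr_folios == 0);

    /* Keep pulling batches until enough pages are found or folios run out. */
    while (found < tofind && next < nr_folios) {
        size_t batch_end = next + BATCH_SIZE;

        if (batch_end > nr_folios)
            batch_end = nr_folios;

        for (; next < batch_end && found < tofind; next++) {
            size_t room = tofind - found;
            size_t take = folio_pages[next] < room ? folio_pages[next] : room;

            found += take;  /* a large folio may be consumed only partly */
        }
    }
    return found;
}
```

With three folios of four pages each and tofind set to 10, the sketch gathers 4 + 4 + 2 pages, which mirrors the note that the last folio found may not contribute all of its pages.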
* | Merge tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 (Linus Torvalds, 2023-02-22, 1 file, -1012/+798)
Pull cifs client updates from Steve French:

"The largest subset of this is from David Howells et al: making the cifs/smb3 driver pass iov_iters down to the lowest layers, directly to the network transport rather than passing lists of pages around, helping multiple areas:

- Pin user pages, thereby fixing the race between concurrent DIO read and fork, where the pages containing the DIO read buffer may end up belonging to the child process and not the parent - with the result that the parent might not see the retrieved data.
- cifs shouldn't take refs on pages extracted from non-user-backed iterators (eg. KVEC). With these changes, cifs will apply the appropriate cleanup.
- Making it easier to transition to using folios in cifs rather than pages by dealing with them through BVEC and XARRAY iterators.
- Allowing cifs to use the new splice function

The remainder are:

- fixes for stable, including various fixes for uninitialized memory, wrong length field causing mount issue to very old servers, important directory lease fixes and reconnect fixes
- cleanups (unused code removal, change one element array usage, and a change from strtobool to kstrtobool, and Kconfig cleanups)
- SMBDIRECT (RDMA) fixes including iov_iter integration and UAF fixes
- reconnect fixes
- multichannel fixes, including improving channel allocation (to least used channel)
- remove the last use of lock_page_killable by moving to folio_lock_killable"

* tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (46 commits)
    update internal module version number for cifs.ko
    cifs: update ip_addr for ses only for primary chan setup
    cifs: use tcon allocation functions even for dummy tcon
    cifs: use the least loaded channel for sending requests
    cifs: DIO to/from KVEC-type iterators should now work
    cifs: Remove unused code
    cifs: Build the RDMA SGE list directly from an iterator
    cifs: Change the I/O paths to use an iterator rather than a page list
    cifs: Add a function to read into an iter from a socket
    cifs: Add some helper functions
    cifs: Add a function to Hash the contents of an iterator
    cifs: Add a function to build an RDMA SGE list from an iterator
    netfs: Add a function to extract an iterator into a scatterlist
    netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
    cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE
    splice: Export filemap/direct_splice_read()
    iov_iter: Add a function to extract a page list from an iterator
    iov_iter: Define flags to qualify page extraction.
    splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE
    splice: Add a func to do a splice from a buffered file without ITER_PIPE
    ...
| * | cifs: DIO to/from KVEC-type iterators should now work (David Howells, 2023-02-20, 1 file, -20/+0)
DIO to/from KVEC-type iterators should now work as the iterator is passed down to the socket in non-RDMA/non-crypto mode and in RDMA or crypto mode care is taken to handle vmap/vmalloc correctly and not take page refs when building a scatterlist.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: Remove unused code (David Howells, 2023-02-20, 1 file, -606/+0)
Remove a bunch of functions that are no longer used and are commented out after the conversion to use iterators throughout the I/O path.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164928621823.457102.8777804402615654773.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211421039.3154751.15199634443157779005.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348881165.2106726.2993852968344861224.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827876.3334034.9331465096417303889.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396915.708021.2010212654244139442.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697261080.61150.17513116912567922274.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732033255.3186319.5527423437137895940.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: Change the I/O paths to use an iterator rather than a page list (David Howells, 2023-02-20, 1 file, -447/+748)
Currently, the cifs I/O paths hand lists of pages from the VM interface routines at the top all the way through the intervening layers to the socket interface at the bottom.

This is a problem, however, for interfacing with netfslib which passes an iterator through to the ->issue_read() method (and will pass an iterator through to the ->issue_write() method in future). Netfslib takes over bounce buffering for direct I/O, async I/O and encrypted content, so cifs doesn't need to do that. Netfslib also converts IOVEC-type iterators into BVEC-type iterators if necessary.

Further, cifs needs foliating - and folios may come in a variety of sizes, so a page list pointing to an array of heterogeneous pages may cause problems in places such as where crypto is done.

Change the cifs I/O paths to hand iov_iter iterators all the way through instead.

Notes:

(1) Some old routines are #if'd out to be removed in a follow up patch so as to avoid confusing diff, thereby making the diff output easier to follow. I've removed functions that don't overlap with anything added.

(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and rq_tailsz which describe the pages forming the buffer; instead there's an rq_iter describing the source buffer and an rq_buffer which is used to hold the buffer for encryption.

(3) struct cifs_readdata and cifs_writedata are similarly modified to smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then replaced with passing the iterator directly to the socket.

    The iterators are stored in these structs so that they are persistent and don't get deallocated when the function returns (unlike if they were stack variables).

(4) Buffered writeback is overhauled, borrowing the code from the afs filesystem to gather up contiguous runs of folios. The XARRAY-type iterator is then used to refer directly to the pagecache and can be passed to the socket to transmit data directly from there.

    This includes:

        cifs_extend_writeback()
        cifs_write_back_from_locked_folio()
        cifs_writepages_region()
        cifs_writepages()

(5) Pages are converted to folios.

(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type iterator from an IOBUF/UBUF-type source iterator.

(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page fragments from the iterator into the scatterlists that the crypto layer prefers.

(8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an xarray, to use as a bounce buffer for encryption. An XARRAY-type iterator can then be used to pass the bounce buffer to lower layers.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: Add some helper functions (David Howells, 2023-02-20, 1 file, -0/+93)
Add some helper functions to manipulate the folio marks by iterating through a list of folios held in an xarray rather than using a page list.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164928616583.457102.15157033997163988344.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211418840.3154751.3090684430628501879.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348878940.2106726.204291614267188735.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364825674.3334034.3356201708659748648.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126394799.708021.10637797063862600488.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697258147.61150.9940790486999562110.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732030314.3186319.9209944805565413627.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE (David Howells, 2023-02-20, 1 file, -0/+16)
Provide cifs_splice_read() to use a bvec rather than a pipe iterator as the latter cannot so easily be split and advanced, which is necessary to pass an iterator down to the bottom levels. Upstream cifs gets around this problem by using iov_iter_get_pages() to prefill the pipe and then passing the list of pages down.

This is done by:

(1) Bulk-allocate a bunch of pages to carry as much of the requested amount of data as possible, but without overrunning the available slots in the pipe and add them to an ITER_BVEC.

(2) Synchronously call ->read_iter() to read into the buffer.

(3) Discard any unused pages.

(4) Load the remaining pages into the pipe in order and advance the head pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/166732028113.3186319.1793644937097301358.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
| * | cifs: Fix uninitialized memory reads for oparms.mode (Volker Lendecke, 2023-02-20, 1 file, -16/+19)
| | | | | | | | | | | | | | | | | | | | | | | | Use a struct assignment with implicit member initialization Signed-off-by: Volker Lendecke <vl@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
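The pattern named in this commit message can be illustrated outside the kernel. The sketch below uses a hypothetical `struct open_parms` (an assumption for illustration, not the real `struct cifs_open_parms`) to show that assigning a compound literal with designated initializers zeroes every member that is not named, so a field such as `mode` can no longer carry leftover stack contents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's open-parameter struct. */
struct open_parms {
	int desired_access;
	int create_options;
	unsigned int mode;	/* the kind of member that was left uninitialized */
	const char *path;
};

/*
 * Fill *p with a single struct assignment. Members that are not named
 * in the compound literal (here: mode) are implicitly set to zero.
 */
void init_open_parms(struct open_parms *p, const char *path, int access)
{
	assert(p != NULL);
	*p = (struct open_parms) {
		.desired_access = access,
		.create_options = 0,
		.path = path,
	};
}
```

Compared with assigning members one by one, this form cannot forget a field that is added to the struct later.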
| * | cifs: Use a folio in cifs_page_mkwrite()Matthew Wilcox (Oracle)2023-02-201-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Avoids many calls to compound_head() and removes calls to various compat functions. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* | | Merge tag 'locks-v6.3' of ↵Linus Torvalds2023-02-201-0/+1
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "The main change here is that I've broken out most of the file locking definitions into a new header file. I also went ahead and completed the removal of locks_inode function" * tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fs: remove locks_inode filelock: move file locking definitions to separate header file
| * | filelock: move file locking definitions to separate header fileJeff Layton2023-01-111-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The file locking definitions have lived in fs.h since the dawn of time, but they are only used by a small subset of the source files that include it. Move the file locking definitions to a new header file, and add the appropriate #include directives to the source files that need them. By doing this we trim down fs.h a bit and limit the amount of rebuilding that has to be done when we make changes to the file locking APIs. Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Steve French <stfrench@microsoft.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org>
* / cifs: Fix use-after-free in rdata->read_into_pages()ZhaoLong Wang2023-02-061-2/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the network status is unstable, a use-after-free may occur when reading data from the server. BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0 Call Trace: <TASK> dump_stack_lvl+0x38/0x4c print_report+0x16f/0x4a6 kasan_report+0xb7/0x130 readpages_fill_pages+0x14c/0x7e0 cifs_readv_receive+0x46d/0xa40 cifs_demultiplex_thread+0x121c/0x1490 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 </TASK> Allocated by task 2535: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x82/0x90 cifs_readdata_direct_alloc+0x2c/0x110 cifs_readdata_alloc+0x2d/0x60 cifs_readahead+0x393/0xfe0 read_pages+0x12f/0x470 page_cache_ra_unbounded+0x1b1/0x240 filemap_get_pages+0x1c8/0x9a0 filemap_read+0x1c0/0x540 cifs_strict_readv+0x21b/0x240 vfs_read+0x395/0x4b0 ksys_read+0xb8/0x150 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 79: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2e/0x50 __kasan_slab_free+0x10e/0x1a0 __kmem_cache_free+0x7a/0x1a0 cifs_readdata_release+0x49/0x60 process_one_work+0x46c/0x760 worker_thread+0x2a4/0x6f0 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 Last potentially related work creation: kasan_save_stack+0x22/0x50 __kasan_record_aux_stack+0x95/0xb0 insert_work+0x2b/0x130 __queue_work+0x1fe/0x660 queue_work_on+0x4b/0x60 smb2_readv_callback+0x396/0x800 cifs_abort_connection+0x474/0x6a0 cifs_reconnect+0x5cb/0xa50 cifs_readv_from_socket.cold+0x22/0x6c cifs_read_page_from_socket+0xc1/0x100 readpages_fill_pages.cold+0x2f/0x46 cifs_readv_receive+0x46d/0xa40 cifs_demultiplex_thread+0x121c/0x1490 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 The following function calls will cause UAF of the rdata pointer. 
readpages_fill_pages cifs_read_page_from_socket cifs_readv_from_socket cifs_reconnect __cifs_reconnect cifs_abort_connection mid->callback() --> smb2_readv_callback queue_work(&rdata->work) # if the worker completes first, # the rdata is freed cifs_readv_complete kref_put cifs_readdata_release kfree(rdata) return rdata->... # UAF in readpages_fill_pages() Similarly, this problem also occurs in uncached_fill_pages(). Fix this by adjusting the order of the condition checks in the return statement. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
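The reordering described above relies on short-circuit evaluation. The sketch below is a user-space illustration only: `struct read_data` and `fill_pages_result()` are hypothetical names, and it assumes the original return statement tested the byte count before the error code (the diff itself is not shown in this log). With the error code tested first, the possibly-freed pointer is never dereferenced when the connection was aborted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cifs_readdata. */
struct read_data {
	int got_bytes;
};

/*
 * Test the error code before touching rdata. When result is
 * -ECONNABORTED the && short-circuits, so rdata (which the reconnect
 * path may already have released) is never read.
 */
int fill_pages_result(const struct read_data *rdata, int result)
{
	/* rdata is only required to be valid when no abort happened. */
	assert(result == -ECONNABORTED || rdata != NULL);

	return result != -ECONNABORTED && rdata->got_bytes > 0 ?
		rdata->got_bytes : result;
}
```

Passing a null pointer together with `-ECONNABORTED` is safe here, which is exactly the property the unsafe ordering lacked.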
* Merge tag '6.2-rc-smb3-client-fixes-part1' of ↵Linus Torvalds2022-12-151-14/+22
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.samba.org/sfrench/cifs-2.6 Pull cifs client updates from Steve French: - SMB3.1.1 POSIX Extensions fixes - remove use of generic_writepages() and ->cifs_writepage(), in favor of ->cifs_writepages() and ->migrate_folio() - memory management fixes - mount parm parsing fixes - minor cleanup fixes * tag '6.2-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: Remove duplicated include in cifsglob.h cifs: fix oops during encryption cifs: print warning when conflicting soft vs. hard mount options specified cifs: fix missing display of three mount options cifs: fix various whitespace errors in headers cifs: minor cleanup of some headers cifs: skip alloc when request has no pages cifs: remove ->writepage cifs: stop using generic_writepages cifs: wire up >migrate_folio cifs: Parse owner/group for stat in smb311 posix extensions cifs: Add "extbuf" and "extbuflen" args to smb2_compound_op() Fix path in cifs/usage.rst
| * cifs: remove ->writepageChristoph Hellwig2022-12-081-9/+0
| | | | | | | | | | | | | | | | | | | | | | ->writepage is a very inefficient method to write back data, and only used through write_cache_pages or as a fallback when no ->migrate_folio method is present. Now that cifs implements ->migrate_folio and doesn't call generic_writepages, the writepage method can be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: stop using generic_writepagesChristoph Hellwig2022-12-081-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | generic_writepages is just a wrapper that calls ->writepages on a range, and thus in the way of eventually removing ->writepage. Switch cifs to just open code it in preparation of removing ->writepage. [note: I suspect just integrating the small wsize case with the rest of the writeback code might be a better idea here, but that needs someone more familiar with the code] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: wire up >migrate_folioChristoph Hellwig2022-12-081-3/+4
| | | | | | | | | | | | | | | | | | | | CIFS does not use page private data that needs migration, so it can just wire up filemap_migrate_folio. This prepares for removing ->writepage, which is used as a fallback if no migrate_folio method is set. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
* | Merge tag 'pull-iov_iter' of ↵Linus Torvalds2022-12-121-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: "iov_iter work; most of that is about getting rid of direction misannotations and (hopefully) preventing more of the same for the future" * tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: use less confusing names for iov_iter direction initializers iov_iter: saner checks for attempt to copy to/from iterator [xen] fix "direction" argument of iov_iter_kvec() [vhost] fix 'direction' argument of iov_iter_{init,bvec}() [target] fix iov_iter_bvec() "direction" argument [s390] memcpy_real(): WRITE is "data source", not destination... [s390] zcore: WRITE is "data source", not destination... [infiniband] READ is "data destination", not source... [fsi] WRITE is "data source", not destination... [s390] copy_oldmem_kernel() - WRITE is "data source", not destination csum_and_copy_to_iter(): handle ITER_DISCARD get rid of unlikely() on page_copy_sane() calls
| * | use less confusing names for iov_iter direction initializersAl Viro2022-11-251-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* / cifs: use locks_inode_context helperJeff Layton2022-11-301-1/+1
|/ | | | | | | | | | cifs currently doesn't access i_flctx safely. This requires a smp_load_acquire, as the pointer is set via cmpxchg (a release operation). Cc: Steve French <smfrench@samba.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@kernel.org>
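The pairing mentioned here (a pointer published by a compare-and-exchange with release semantics, then read with an acquire load) can be sketched with C11 atomics. This is a user-space analogy, not kernel code: `struct inode_like`, `install_context()` and `get_context()` are hypothetical names, and `atomic_load_explicit(..., memory_order_acquire)` plays the role that `smp_load_acquire()` plays in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lock_context {
	int nr_locks;
};

/* Hypothetical stand-in for an inode holding a lazily set context. */
struct inode_like {
	_Atomic(struct lock_context *) flctx;
};

/*
 * Publisher: install ctx only if no context is set yet. The successful
 * exchange is a release operation, so the initialized contents of *ctx
 * are visible to any reader that later observes the pointer.
 * Returns 1 if ctx was installed, 0 if another context was already set.
 */
int install_context(struct inode_like *inode, struct lock_context *ctx)
{
	struct lock_context *expected = NULL;

	assert(ctx != NULL);
	return atomic_compare_exchange_strong_explicit(&inode->flctx,
			&expected, ctx,
			memory_order_release, memory_order_relaxed);
}

/* Reader: the acquire load pairs with the release in install_context(). */
struct lock_context *get_context(struct inode_like *inode)
{
	return atomic_load_explicit(&inode->flctx, memory_order_acquire);
}
```

A plain, unordered read of the pointer would not guarantee that the fields behind it are visible yet, which is the hazard the helper in this commit is meant to avoid.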
* cifs: Fix pages leak when writedata alloc failed in cifs_write_from_iter()Zhang Xiaoxu2022-10-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a kmemleak when the writedata allocation fails: unreferenced object 0xffff888175ae4000 (size 4096): comm "dd", pid 19419, jiffies 4296028749 (age 739.396s) hex dump (first 32 bytes): 80 02 b0 04 00 ea ff ff c0 02 b0 04 00 ea ff ff ................ 80 22 4c 04 00 ea ff ff c0 22 4c 04 00 ea ff ff ."L......"L..... backtrace: [<0000000072fdbb86>] __kmalloc_node+0x50/0x150 [<0000000039faf56f>] __iov_iter_get_pages_alloc+0x605/0xdd0 [<00000000f862a9d4>] iov_iter_get_pages_alloc2+0x3b/0x80 [<000000008f226067>] cifs_write_from_iter+0x2ae/0xe40 [<000000001f78f2f1>] __cifs_writev+0x337/0x5c0 [<00000000257fcef5>] vfs_write+0x503/0x690 [<000000008778a238>] ksys_write+0xb9/0x150 [<00000000ed82047c>] do_syscall_64+0x35/0x80 [<000000003365551d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 __iov_iter_get_pages_alloc+0x605/0xdd0 is: want_pages_array at lib/iov_iter.c:1304 (inlined by) __iov_iter_get_pages_alloc at lib/iov_iter.c:1457 If the writedata allocation fails, the pages and pagevec should be cleaned up. Fixes: 8c5f9c1ab7cb ("CIFS: Add support for direct I/O write") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix pages array leak when writedata alloc failed in cifs_writedata_alloc()Zhang Xiaoxu2022-10-231-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a memory leak when writedata alloc failed: unreferenced object 0xffff888192364000 (size 8192): comm "sync", pid 22839, jiffies 4297313967 (age 60.230s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000027de0814>] __kmalloc+0x4d/0x150 [<00000000b21e81ab>] cifs_writepages+0x35f/0x14a0 [<0000000076f7d20e>] do_writepages+0x10a/0x360 [<00000000d6a36edc>] filemap_fdatawrite_wbc+0x95/0xc0 [<000000005751a323>] __filemap_fdatawrite_range+0xa7/0xe0 [<0000000088afb0ca>] file_write_and_wait_range+0x66/0xb0 [<0000000063dbc443>] cifs_strict_fsync+0x80/0x5f0 [<00000000c4624754>] __x64_sys_fsync+0x40/0x70 [<000000002c0dc744>] do_syscall_64+0x35/0x80 [<0000000052f46bee>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 cifs_writepages+0x35f/0x14a0 is: kmalloc_array at include/linux/slab.h:628 (inlined by) kcalloc at include/linux/slab.h:659 (inlined by) cifs_writedata_alloc at fs/cifs/file.c:2438 (inlined by) wdata_alloc_and_fillpages at fs/cifs/file.c:2527 (inlined by) cifs_writepages at fs/cifs/file.c:2705 If writedata alloc failed in cifs_writedata_alloc(), the pages array should be freed. Fixes: 8e7360f67e75 ("CIFS: Add support for direct pages in wdata") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix xid leak in cifs_flock()Zhang Xiaoxu2022-10-181-4/+7
| | | | | | | | | | If the request is not a flock, the xid must be freed before returning -ENOLCK; otherwise the xid is leaked. Fixes: d0677992d2af ("cifs: add support for flock") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
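A common way to avoid this kind of leak is to route every exit through one cleanup label. The sketch below is a simplified user-space model: `get_xid()`, `free_xid()` and `do_flock()` are hypothetical stand-ins with made-up bodies, and the `outstanding_xids` counter exists only so the balance can be checked.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical accounting: number of xids handed out and not yet freed. */
int outstanding_xids;

unsigned int get_xid(void)
{
	outstanding_xids++;
	return 1;	/* the value is irrelevant for this sketch */
}

void free_xid(unsigned int xid)
{
	(void)xid;
	assert(outstanding_xids > 0);
	outstanding_xids--;
}

#define LOCK_KIND_FLOCK 1
#define LOCK_KIND_POSIX 2

/*
 * Every return path goes through the "out" label, so the xid taken at
 * the top is released even on the early -ENOLCK exit.
 */
int do_flock(int kind)
{
	unsigned int xid = get_xid();
	int rc = 0;

	if (kind != LOCK_KIND_FLOCK) {
		rc = -ENOLCK;
		goto out;
	}

	/* ... the actual locking work would go here ... */

out:
	free_xid(xid);
	return rc;
}
```

After any call to `do_flock()`, `outstanding_xids` should be back at its previous value regardless of the return code.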
* cifs: lease key is uninitialized in two additional functions when smb1Steve French2022-10-151-2/+2
| | | | | | | | | | | | cifs_open and _cifsFileInfo_put also end up with lease_key uninitialized in smb1 mounts. It is cleaner to set lease key to zero in these places where leases are not supported (smb1 can not return lease keys so the field was uninitialized). Addresses-Coverity: 1514207 ("Uninitialized scalar variable") Addresses-Coverity: 1514331 ("Uninitialized scalar variable") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: improve symlink handling for smb2+Paulo Alcantara2022-10-131-19/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating inode for symlink, the client used to send below requests to fill it in: * create+query_info+close (STATUS_STOPPED_ON_SYMLINK) * create(+reparse_flag)+query_info+close (set file attrs) * create+ioctl(get_reparse)+close (query reparse tag) and then for every access to the symlink dentry, the ->link() method would send another: * create+ioctl(get_reparse)+close (parse symlink) So, in order to improve: (i) Get rid of unnecessary roundtrips and then resolve symlinks as follows: * create+query_info+close (STATUS_STOPPED_ON_SYMLINK + parse symlink + get reparse tag) * create(+reparse_flag)+query_info+close (set file attrs) (ii) Set the resolved symlink target directly in inode->i_link and use simple_get_link() for ->link() to simply return it. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: destage dirty pages before re-reading them for cache=noneRonnie Sahlberg2022-09-251-0/+9
| | | | | | | | | | | | | | | This is the opposite case of kernel bugzilla 216301. If we mmap a file using cache=none and then proceed to update the mmapped area these updates are not reflected in a later pread() of that part of the file. To fix this we must first destage any dirty pages in the range before we allow the pread() to proceed. Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: revalidate mapping when doing direct writesRonnie Sahlberg2022-09-121-0/+3
| | | | | | | | | | | | | | Kernel bugzilla: 216301 When doing direct writes we need to also invalidate the mapping in case we have a cached copy of the affected page(s) in memory or else subsequent reads of the data might return the old/stale content before we wrote an update to the server. Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* Merge tag '5.20-rc-smb3-client-fixes-part2' of ↵Linus Torvalds2022-08-131-31/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.samba.org/sfrench/cifs-2.6 Pull more cifs updates from Steve French: - two fixes for stable, one for a lock length miscalculation, and another fixes a lease break timeout bug - improvement to handle leases, allows the close timeout to be configured more safely - five restructuring/cleanup patches * tag '5.20-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not access tcon->cfids->cfid directly from is_path_accessible cifs: Add constructor/destructors for tcon->cfid SMB3: fix lease break timeout when multiple deferred close handles for the same file. smb3: allow deferred close timeout to be configurable cifs: Do not use tcon->cfid directly, use the cfid we get from open_cached_dir cifs: Move cached-dir functions into a separate file cifs: Remove {cifs,nfs}_fscache_release_page() cifs: fix lock length calculation
| * SMB3: fix lease break timeout when multiple deferred close handles for the ↵Bharath SM2022-08-111-19/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | same file. Solution is to send the lease break ack immediately even in the case of deferred close handles, to avoid the lease break request timing out, and let the deferred-close handles be closed as scheduled. Later patches could optimize cases where we then close some of these handles sooner for the cases where lease break is to 'none' Cc: stable@kernel.org Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb3: allow deferred close timeout to be configurableSteve French2022-08-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deferred close can be a very useful feature for allowing caching data for read, and for minimizing the number of reopens needed for a file that is repeatedly opened and closed, but there are workloads where its default (1 second, similar to actimeo/acregmax) is much too small. Allow the user to configure the amount of time we can defer sending the final smb3 close when we have a handle lease on the file (rather than forcing it to depend on value of actimeo which is often unrelated, and less safe). Adds new mount parameter "closetimeo=" which is the maximum number of seconds we can wait before sending an SMB3 close when we have a handle lease for it. Default value also is set to slightly larger at 5 seconds (although some other clients use a larger default, this should still help). Suggested-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: Move cached-dir functions into a separate fileRonnie Sahlberg2022-08-111-7/+2
| | | | | | | | | | | | | | | | | | | | Also rename crfid to cfid to have consistent naming for this variable. This commit does not change any logic. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: fix lock length calculationPaulo Alcantara2022-08-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lock length was wrongly set to 0 when fl_end == OFFSET_MAX, thus failing to lock the whole file when l_start=0 and l_len=0. This fixes test 2 from cthon04. Before patch: $ ./cthon04/lock/tlocklfs -t 2 /mnt Creating parent/child synchronization pipes. Test #1 - Test regions of an unlocked file. Parent: 1.1 - F_TEST [ 0, 1] PASSED. Parent: 1.2 - F_TEST [ 0, ENDING] PASSED. Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Parent: 1.4 - F_TEST [ 1, 1] PASSED. Parent: 1.5 - F_TEST [ 1, ENDING] PASSED. Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED. Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Test #2 - Try to lock the whole file. Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED. Child: 2.1 - F_TEST [ 0, 1] FAILED! Child: **** Expected EACCES, returned success... Child: **** Probably implementation error. ** CHILD pass 1 results: 0/0 pass, 0/0 warn, 1/1 fail (pass/total). Parent: Child died ** PARENT pass 1 results: 10/10 pass, 0/0 warn, 0/0 fail (pass/total). After patch: $ ./cthon04/lock/tlocklfs -t 2 /mnt Creating parent/child synchronization pipes. Test #2 - Try to lock the whole file. Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED. Child: 2.1 - F_TEST [ 0, 1] PASSED. Child: 2.2 - F_TEST [ 0, ENDING] PASSED. Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED. Child: 2.4 - F_TEST [ 1, 1] PASSED. Child: 2.5 - F_TEST [ 1, ENDING] PASSED. Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED. Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED. Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED. Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED. Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED. 
** PARENT pass 1 results: 2/2 pass, 0/0 warn, 0/0 fail (pass/total). ** CHILD pass 1 results: 9/9 pass, 0/0 warn, 0/0 fail (pass/total). Fixes: d80c69846ddf ("cifs: fix signed integer overflow when fl_end is OFFSET_MAX") Reported-by: Xiaoli Feng <xifeng@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
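The arithmetic behind this bug can be shown in isolation. The sketch below is an assumption about one way to compute the length safely, not a copy of the kernel change: `lock_len()` is a hypothetical helper, and `OFFSET_MAX` is defined locally to mirror the largest signed 64-bit file offset. Doing the subtraction in unsigned 64-bit arithmetic avoids both the signed overflow of `end - start + 1` and the special case that turned a whole-file lock into length 0.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition for this sketch: the largest signed 64-bit offset. */
#define OFFSET_MAX INT64_MAX

/*
 * Length of the inclusive byte range [start, end]. The operands are
 * converted to unsigned before subtracting, so end == OFFSET_MAX with
 * start == 0 yields 2^63 instead of overflowing or collapsing to 0.
 */
uint64_t lock_len(int64_t start, int64_t end)
{
	assert(start >= 0 && end >= start);
	return (uint64_t)end - (uint64_t)start + 1;
}
```

With this helper, a request for the whole file (`start = 0`, `end = OFFSET_MAX`) produces a non-zero length, which corresponds to the case exercised by test 2 in the output above.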
* | Merge tag 'pull-work.iov_iter-rebased' of ↵Linus Torvalds2022-08-081-5/+3
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more iov_iter updates from Al Viro: - more new_sync_{read,write}() speedups - ITER_UBUF introduction - ITER_PIPE cleanups - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and switching them to advancing semantics - making ITER_PIPE take high-order pages without splitting them - handling copy_page_from_iter() for high-order pages properly * tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits) fix copy_page_from_iter() for compound destinations hugetlbfs: copy_page_to_iter() can deal with compound pages copy_page_to_iter(): don't split high-order page in case of ITER_PIPE expand those iov_iter_advance()... pipe_get_pages(): switch to append_pipe() get rid of non-advancing variants ceph: switch the last caller of iov_iter_get_pages_alloc() 9p: convert to advancing variant of iov_iter_get_pages_alloc() af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages() iter_to_pipe(): switch to advancing variant of iov_iter_get_pages() block: convert to advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: advancing variants of iov_iter_get_pages{,_alloc}() iov_iter: saner helper for page array allocation fold __pipe_get_pages() into pipe_get_pages() ITER_XARRAY: don't open-code DIV_ROUND_UP() unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts unify xarray_get_pages() and xarray_get_pages_alloc() unify pipe_get_pages() and pipe_get_pages_alloc() iov_iter_get_pages(): sanity-check arguments iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper ...
| * iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()Al Viro2022-08-081-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the users immediately follow successful iov_iter_get_pages() with advancing by the amount it had returned. Provide inline wrappers doing that, convert trivial open-coded uses of those. BTW, iov_iter_get_pages() never returns more than it had been asked to; such checks in cifs ought to be removed someday... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * new iov_iter flavour - ITER_UBUFAl Viro2022-08-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(), checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC ones. We are going to expose the things like ->write_iter() et.al. to those in subsequent commits. New predicate (user_backed_iter()) that is true for ITER_IOVEC and ITER_UBUF; places like direct-IO handling should use that for checking that pages we modify after getting them from iov_iter_get_pages() would need to be dirtied. DO NOT assume that replacing iter_is_iovec() with user_backed_iter() will solve all problems - there's code that uses iter_is_iovec() to decide how to poke around in iov_iter guts and for that the predicate replacement obviously won't suffice. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | cifs: when insecure legacy is disabled shrink amount of SMB1 codeSteve French2022-08-051-3/+261
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently much of the smb1 code is built even when CONFIG_CIFS_ALLOW_INSECURE_LEGACY is disabled. Move cifssmb.c to only be compiled when insecure legacy is enabled, and move various SMB1/CIFS helper functions to that ifdef. Some functions that were not SMB1/CIFS specific needed to be moved out of cifssmb.c This shrinks cifs.ko by more than 10% which is good - but also will help with the eventual movement of the legacy code to a distinct module. Follow on patches can shrink the number of ifdefs by code restructuring where smb1 code is wedged in functions that should be calling dialect specific helper functions instead, and also by moving some functions from file.c/dir.c/inode.c into smb1 specific c files. Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
* | cifs: trivial style fixupSteve French2022-08-011-0/+1
| | | | | | | | | | | | missing blank line after declaration Signed-off-by: Steve French <stfrench@microsoft.com>
* | cifs: list_for_each() -> list_for_each_entry()Enzo Matsumiya2022-08-011-7/+3
| | | | | | | | | | | | | | | | Replace list_for_each() by list_for_each_entry() where appropriate. Remove no longer used list_head stack variables. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>