linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds	2020-12-17	22	-524/+922
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - NFSv3: Add emulation of lookupp() to improve open_by_filehandle() support - A series of patches to improve readdir performance, particularly with large directories - Basic support for using NFS/RDMA with the pNFS files and flexfiles drivers - Micro-optimisations for RDMA - RDMA tracing improvements Bugfixes: - Fix a long standing bug with xs_read_xdr_buf() when receiving partial pages (Dan Aloni) - Various fixes for getxattr and listxattr, when used over non-TCP transports - Fixes for containerised NFS from Sargun Dhillon - switch nfsiod to be an UNBOUND workqueue (Neil Brown) - READDIR should not ask for security label information if there is no LSM policy (Olga Kornievskaia) - Avoid using interval-based rebinding with TCP in lockd (Calum Mackay) - A series of RPC and NFS layer fixes to support the NFSv4.2 READ_PLUS code - A couple of fixes for pnfs/flexfiles read failover Cleanups: - Various cleanups for the SUNRPC xdr code in conjunction with the READ_PLUS fixes" * tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits) NFS/pNFS: Fix a typo in ff_layout_resend_pnfs_read() pNFS/flexfiles: Avoid spurious layout returns in ff_layout_choose_ds_for_read NFSv4/pnfs: Add tracing for the deviceid cache fs/lockd: convert comma to semicolon NFSv4.2: fix error return on memory allocation failure NFSv4.2/pnfs: Don't use READ_PLUS with pNFS yet NFSv4.2: Deal with potential READ_PLUS data extent buffer overflow NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow NFSv4.2: Handle hole lengths that exceed the READ_PLUS read buffer NFSv4.2: decode_read_plus_hole() needs to check the extent offset NFSv4.2: decode_read_plus_data() must skip padding after data segment NFSv4.2: Ensure we always reset the result->count in decode_read_plus() SUNRPC: When expanding the buffer, we may need grow the sparse pages SUNRPC: Cleanup - constify a number of xdr_buf helpers SUNRPC: Clean up open coded setting of the xdr_stream 'nwords' field SUNRPC: _copy_to/from_pages() now check for zero length SUNRPC: Cleanup xdr_shrink_bufhead() SUNRPC: Fix xdr_expand_hole() SUNRPC: Fixes for xdr_align_data() SUNRPC: _shift_data_left/right_pages should check the shift length ...
\| *	NFS/pNFS: Fix a typo in ff_layout_resend_pnfs_read()	Trond Myklebust	2020-12-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Don't bump the index twice. Fixes: 563c53e73b8b ("NFS: Fix flexfiles read failover") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	pNFS/flexfiles: Avoid spurious layout returns in ff_layout_choose_ds_for_read	Trond Myklebust	2020-12-16	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The callers of ff_layout_choose_ds_for_read() should decide whether or not they want to return the layout on error. Sometimes, we may just want to retry from the beginning. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4/pnfs: Add tracing for the deviceid cache	Trond Myklebust	2020-12-16	3	-8/+92
\| \| \| \| \| \| \| \| \| \| \| \|	Add tracepoints to allow debugging of the deviceid cache. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: fix error return on memory allocation failure	Colin Ian King	2020-12-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when an alloc_page fails the error return is not set in variable err and a garbage initialized value is returned. Fix this by setting err to -ENOMEM before taking the error return path. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: a1f26739ccdc ("NFSv4.2: improve page handling for GETXATTR") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2/pnfs: Don't use READ_PLUS with pNFS yet	Trond Myklebust	2020-12-14	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have no way of tracking server READ_PLUS support in pNFS for now, so just disable it. Reported-by: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: Deal with potential READ_PLUS data extent buffer overflow	Trond Myklebust	2020-12-14	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the server returns more data than we have buffer space for, then we need to truncate and exit early. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow	Trond Myklebust	2020-12-14	1	-19/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Expanding the READ_PLUS extents can cause the read buffer to overflow. If it does, then don't error, but just exit early. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: Handle hole lengths that exceed the READ_PLUS read buffer	Trond Myklebust	2020-12-14	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a hole extends beyond the READ_PLUS read buffer, then we want to fill just the remaining buffer with zeros. Also ignore eof... Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: decode_read_plus_hole() needs to check the extent offset	Trond Myklebust	2020-12-14	1	-3/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The server is allowed to return a hole extent with an offset that starts before the offset supplied in the READ_PLUS argument. Ensure that we support that case too. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: decode_read_plus_data() must skip padding after data segment	Trond Myklebust	2020-12-14	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	All XDR opaque object sizes are 32-bit aligned, and a data segment is no exception. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: Ensure we always reset the result->count in decode_read_plus()	Trond Myklebust	2020-12-14	1	-0/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.1: use BITS_PER_LONG macro in nfs4session.h	Geliang Tang	2020-12-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the existing BITS_PER_LONG macro instead of calculating the value. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: improve page handling for GETXATTR	Frank van der Linden	2020-12-14	2	-16/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	XDRBUF_SPARSE_PAGES can cause problems for the RDMA transport, and it's easy enough to allocate enough pages for the request up front, so do that. Also, since we've allocated the pages anyway, use the full page aligned length for the receive buffer. This will allow caching of valid replies that are too large for the caller, but that still fit in the allocated pages. Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4.2: Fix up the get/listxattr calls to rpc_prepare_reply_pages()	Trond Myklebust	2020-12-10	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that both getxattr and listxattr page array are correctly aligned, and that getxattr correctly accounts for the page padding word. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFS: switch nfsiod to be an UNBOUND workqueue.	NeilBrown	2020-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nfsiod is currently a concurrency-managed workqueue (CMWQ). This means that workitems scheduled to nfsiod on a given CPU are queued behind all other work items queued on any CMWQ on the same CPU. This can introduce unexpected latency. Occaionally nfsiod can even cause excessive latency. If the work item to complete a CLOSE request calls the final iput() on an inode, the address_space of that inode will be dismantled. This takes time proportional to the number of in-memory pages, which on a large host working on large files (e.g.. 5TB), can be a large number of pages resulting in a noticable number of seconds. We can avoid these latency problems by switching nfsiod to WQ_UNBOUND. This causes each concurrent work item to gets a dedicated thread which can be scheduled to an idle CPU. There is precedent for this as several other filesystems use WQ_UNBOUND workqueue for handling various async events. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: ada609ee2ac2 ("workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4: Refactor to use user namespaces for nfs4idmap	Sargun Dhillon	2020-12-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In several patches work has been done to enable NFSv4 to use user namespaces: 58002399da65: NFSv4: Convert the NFS client idmapper to use the container user namespace 3b7eb5e35d0f: NFS: When mounting, don't share filesystems between different user namespaces Unfortunately, the userspace APIs were only such that the userspace facing side of the filesystem (superblock s_user_ns) could be set to a non init user namespace. This furthers the fs_context related refactoring, and piggybacks on top of that logic, so the superblock user namespace, and the NFS user namespace are the same. Users can still use rpc.idmapd if they choose to, but there are complexities with user namespaces and request-key that have yet to be addresssed. Eventually, we will need to at least: * Come up with an upcall mechanism that can be triggered inside of the container, or safely triggered outside, with the requisite context to do the right mapping. * Handle whatever refactoring needs to be done in net/sunrpc. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Tested-by: Alban Crequy <alban.crequy@gmail.com> Fixes: 62a55d088cd8 ("NFS: Additional refactoring for fs_context conversion") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFS: NFSv2/NFSv3: Use cred from fs_context during mount	Sargun Dhillon	2020-12-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was refactoring done to use the fs_context for mounting done in: 62a55d088cd87: NFS: Additional refactoring for fs_context conversion This made it so that the net_ns is fetched from the fs_context (the netns that fsopen is called in). This change also makes it so that the credential fetched during fsopen is used as well as the net_ns. NFS has already had a number of changes to prepare it for user namespaces: 1a58e8a0e5c1: NFS: Store the credential of the mount process in the nfs_server 264d948ce7d0: NFS: Convert NFSv3 to use the container user namespace c207db2f5da5: NFS: Convert NFSv2 to use the container user namespace Previously, different credentials could be used for creation of the fs_context versus creation of the nfs_server, as FSCONFIG_CMD_CREATE did the actual credential check, and that's where current_creds() were fetched. This meant that the user namespace which fsopen was called in could be a non-init user namespace. This still requires that the user that calls FSCONFIG_CMD_CREATE has CAP_SYS_ADMIN in the init user ns. This roughly allows a privileged user to mount on behalf of an unprivileged usernamespace, by forking off and calling fsopen in the unprivileged user namespace. It can then pass back that fsfd to the privileged process which can configure the NFS mount, and then it can call FSCONFIG_CMD_CREATE before switching back into the mount namespace of the container, and finish up the mounting process and call fsmount and move_mount. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Tested-by: Alban Crequy <alban.crequy@gmail.com> Fixes: 62a55d088cd8 ("NFS: Additional refactoring for fs_context conversion") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode	Trond Myklebust	2020-12-02	3	-3/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4: Fix open coded xdr_stream_remaining()	Trond Myklebust	2020-12-02	1	-3/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	SUNRPC: Clean up the handling of page padding in rpc_prepare_reply_pages()	Trond Myklebust	2020-12-02	3	-39/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rpc_prepare_reply_pages() currently expects the 'hdrsize' argument to contain the length of the data that we expect to want placed in the head kvec plus a count of 1 word of padding that is placed after the page data. This is very confusing when trying to read the code, and sometimes leads to callers adding an arbitrary value of '1' just in order to satisfy the requirement (whether or not the page data actually needs such padding). This patch aims to clarify the code by changing the 'hdrsize' argument to remove that 1 word of padding. This means we need to subtract the padding from all the existing callers. Fixes: 02ef04e432ba ("NFS: Account for XDR pad of buf->pages") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4: Fix the alignment of page data in the getdeviceinfo reply	Trond Myklebust	2020-12-02	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can fit the device_addr4 opaque data padding in the pages. Fixes: cf500bac8fd4 ("SUNRPC: Introduce rpc_prepare_reply_pages()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	pNFS: Clean up open coded xdr string decoding	Trond Myklebust	2020-12-02	1	-36/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the existing xdr_stream_decode_string_dup() to safely decode into kmalloced strings. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	pNFS/flexfiles: Fix up layoutstats reporting for non-TCP transports	Trond Myklebust	2020-12-02	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that we report the correct netid when using UDP or RDMA transports to the DSes. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4/pNFS: Store the transport type in struct nfs4_pnfs_ds_addr	Trond Myklebust	2020-12-02	2	-15/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to enable RDMA and UDP as valid transport methods if a GETDEVICEINFO call specifies it. Do so by adding a parser for the netid that translates it to an appropriate argument for the RPC transport layer. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	pNFS: Add helpers for allocation/free of struct nfs4_pnfs_ds_addr	Trond Myklebust	2020-12-02	1	-5/+16
\| \| \| \| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4/pNFS: Use connections to a DS that are all of the same protocol family	Trond Myklebust	2020-12-02	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the pNFS metadata server advertises multiple addresses for the same data server, we should try to connect to just one protocol family and transport type on the assumption that homogeneity will improve performance. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFS: Switch mount code to use xprt_find_transport_ident()	Trond Myklebust	2020-12-02	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Switch the mount code to use xprt_find_transport_ident() and to check the results before allowing the mount to proceed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFS: Do uncached readdir when we're seeking a cookie in an empty page cache	Trond Myklebust	2020-12-02	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the directory is changing, causing the page cache to get invalidated while we are listing the contents, then the NFS client is currently forced to read in the entire directory contents from scratch, because it needs to perform a linear search for the readdir cookie. While this is not an issue for small directories, it does not scale to directories with millions of entries. In order to be able to deal with large directories that are changing, add a heuristic to ensure that if the page cache is empty, and we are searching for a cookie that is not the zero cookie, we just default to performing uncached readdir. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Reduce number of RPC calls when doing uncached readdir	Trond Myklebust	2020-12-02	1	-36/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're doing uncached readdir, allocate multiple pages in order to try to avoid duplicate RPC calls for the same getdents() call. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Optimisations for monotonically increasing readdir cookies	Trond Myklebust	2020-12-02	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the server is handing out monotonically increasing readdir cookie values, then we can optimise away searches through pages that contain cookies that lie outside our search range. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Improve handling of directory verifiers	Trond Myklebust	2020-12-02	2	-19/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the server insists on using the readdir verifiers in order to allow cookies to expire, then we should ensure that we cache the verifier with the cookie, so that we can return an error if the application tries to use the expired cookie. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Handle NFS4ERR_NOT_SAME and NFSERR_BADCOOKIE from readdir calls	Trond Myklebust	2020-12-02	2	-8/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the server returns NFS4ERR_NOT_SAME or tells us that the cookie is bad in response to a READDIR call, then we should empty the page cache so that we can fill it from scratch again. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Allow the NFS generic code to pass in a verifier to readdir	Trond Myklebust	2020-12-02	4	-51/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we're ever going to allow support for servers that use the readdir verifier, then that use needs to be managed by the middle layers as those need to be able to reject cookies from other verifiers. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Cleanup to remove nfs_readdir_descriptor_t typedef	Trond Myklebust	2020-12-02	1	-17/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Reduce readdir stack usage	Trond Myklebust	2020-12-02	1	-25/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The descriptor and the struct nfs_entry are both large structures, so don't allocate them from the stack. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: nfs_do_filldir() does not return a value	Trond Myklebust	2020-12-02	1	-14/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up nfs_do_filldir(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: More readdir cleanups	Trond Myklebust	2020-12-02	1	-14/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the redundant caching of the credential in struct nfs_open_dir_context. Pass the buffer size as an argument to nfs_readdir_xdr_filler(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Support larger readdir buffers	Trond Myklebust	2020-12-02	3	-22/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support readdir buffers of up to 1MB in size so that we can read large directories using few RPC calls. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Simplify struct nfs_cache_array_entry	Trond Myklebust	2020-12-02	1	-21/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't need to store a hash, so replace struct qstr with a simple const char pointer and length. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Replace kmap() with kmap_atomic() in nfs_readdir_search_array()	Trond Myklebust	2020-12-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Remove unnecessary kmap in nfs_readdir_xdr_to_array()	Trond Myklebust	2020-12-02	1	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kmapped pointer is only used once per loop to check if we need to exit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Don't discard readdir results	Trond Myklebust	2020-12-02	1	-4/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a readdir call returns more data than we can fit into one page cache page, then allocate a new one for that data rather than discarding the data. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Clean up directory array handling	Trond Myklebust	2020-12-02	1	-61/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactor to use pagecache_get_page() so that we can fill the page in multiple stages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Clean up nfs_readdir_page_filler()	Trond Myklebust	2020-12-02	1	-21/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clean up handling of the case where there are no entries in the readdir reply. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Clean up readdir struct nfs_cache_array	Trond Myklebust	2020-12-02	1	-17/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the 'eof_index' is only ever used as a flag, make it so. Also add a flag to detect if the page has been completely filled. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFS: Ensure contents of struct nfs_open_dir_context are consistent	Trond Myklebust	2020-12-02	1	-29/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure that the contents of struct nfs_open_dir_context are consistent by setting them under the file->f_lock from a private copy (that is known to be consistent). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
\| *	NFSv4.2: condition READDIR's mask for security label based on LSM state	Olga Kornievskaia	2020-12-02	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the client will always ask for security_labels if the server returns that it supports that feature regardless of any LSM modules (such as Selinux) enforcing security policy. This adds performance penalty to the READDIR operation. Client adjusts superblock's support of the security_label based on the server's support but also current client's configuration of the LSM modules. Thus, prior to using the default bitmask in READDIR, this patch checks the server's capabilities and then instructs READDIR to remove FATTR4_WORD2_SECURITY_LABEL from the bitmask. v5: fixing silly mistakes of the rushed v4 v4: simplifying logic v3: changing label's initialization per Ondrej's comment v2: dropping selinux hook and using the sb cap. Suggested-by: Ondrej Mosnacek <omosnace@redhat.com> Suggested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Fixes: 2b0143b5c986 ("VFS: normal filesystems (and lustre): d_inode() annotations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp	Trond Myklebust	2020-12-02	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to respect the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp, by timing out if the server is unavailable. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
\| *	NFSv3: Add emulation of the lookupp() operation	Trond Myklebust	2020-12-02	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to use the open_by_filehandle() operations on NFSv3, we need to be able to emulate lookupp() so that nfs_get_parent() can be used to convert disconnected dentries into connected ones. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>