path: root/fs/nfsd/nfs4state.c
Commit message (Author, Age, Files, Lines)
* NFSD: Enable write delegation support (Dai Ngo, 2023-08-29, 1 file, -20/+77)

This patch grants write delegations for OPEN with NFS4_SHARE_ACCESS_WRITE if there is no conflict with other OPENs.

Write delegation conflicts with another OPEN, REMOVE, RENAME and SETATTR are handled the same as read delegation using notify_change, try_break_deleg.

The NFSv4.0 protocol does not enable a server to determine that a conflicting GETATTR originated from the client holding the delegation versus coming from some other client. With NFSv4.1 and later, the SEQUENCE operation that begins each COMPOUND contains a client ID, so delegation recall can be safely squelched in this case.

With NFSv4.0, however, the server must recall or send a CB_GETATTR (per RFC 7530 Section 16.7.5) even when the GETATTR originates from the client holding that delegation.

An NFSv4.0 client can trigger a pathological situation if it always sends a DELEGRETURN preceded by a conflicting GETATTR in the same COMPOUND. COMPOUND execution will always stop at the GETATTR and the DELEGRETURN will never get executed. The server eventually revokes the delegation, which can result in loss of open or lock state.

Tracepoint added to track whether read or write delegation is granted.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: handle GETATTR conflict with write delegation (Dai Ngo, 2023-08-29, 1 file, -0/+65)

If a GETATTR request arrives for a file that has a write delegation in effect, and the requested attributes include the change info and size attributes, then the write delegation is recalled. If the delegation is returned within 30ms, the GETATTR is serviced as normal; otherwise the NFS4ERR_DELAY error is returned for the GETATTR.

Add a counter for write delegation recalls due to a conflicting GETATTR. This is used to evaluate the need to implement CB_GETATTR to avoid recalling the delegation on a conflicting GETATTR.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
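The decision described in this message can be sketched in userspace C. This is only an illustration, not the nfs4state.c code: the type and function names are hypothetical, and the sketch assumes that a request for either the change attribute or the size attribute is enough to count as a conflict.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RECALL_WAIT_MS 30   /* wait window quoted in the commit message */

/* Hypothetical outcome codes for this sketch; not the kernel's nfserr values. */
enum getattr_outcome { GETATTR_SERVE, GETATTR_DELAY };

/* Which attributes the GETATTR asked for (hypothetical, simplified). */
struct getattr_req {
	bool wants_change;
	bool wants_size;
};

/* A GETATTR conflicts only if a write delegation is outstanding and the
 * request asks for change info or size (assumption: either one suffices). */
static bool getattr_conflicts(bool write_deleg_held,
			      const struct getattr_req *req)
{
	return write_deleg_held && (req->wants_change || req->wants_size);
}

/* returned_after_ms: how long the client took to return the delegation
 * after the recall, or a negative value if it never returned it. */
static enum getattr_outcome resolve_getattr(bool write_deleg_held,
					    const struct getattr_req *req,
					    int returned_after_ms)
{
	assert(req != NULL);
	if (!getattr_conflicts(write_deleg_held, req))
		return GETATTR_SERVE;
	if (returned_after_ms >= 0 && returned_after_ms <= RECALL_WAIT_MS)
		return GETATTR_SERVE;
	return GETATTR_DELAY;   /* stands in for NFS4ERR_DELAY */
}
```

A GETATTR that asks for neither attribute is served without any recall in this sketch, which is why the conflict test comes first.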
* nfsd: Fix race to FREE_STATEID and cl_revoked (Benjamin Coddington, 2023-08-04, 1 file, -1/+1)

We have some reports of Linux NFS clients that cannot satisfy a Linux knfsd server that always sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED even though those clients repeatedly walk all their known state using TEST_STATEID and receive NFS4_OK for all.

It's possible for revoke_delegation() to set NFS4_REVOKED_DELEG_STID, then nfsd4_free_stateid() finds the delegation and returns NFS4_OK to FREE_STATEID. Afterward, revoke_delegation() moves the same delegation to cl_revoked. This would produce the observed client/server effect.

Fix this by ensuring that the setting of sc_type to NFS4_REVOKED_DELEG_STID and the move to cl_revoked happen within the same cl_lock. This will allow nfsd4_free_stateid() to properly remove the delegation from cl_revoked.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2217103
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2176575
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # v4.17+
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: Remove incorrect check in nfsd4_validate_stateid (Trond Myklebust, 2023-07-18, 1 file, -2/+0)

If the client is calling TEST_STATEID, then it is because some event occurred that requires it to check all the stateids for validity and call FREE_STATEID on the ones that have been revoked. In this case, either the stateid exists in the list of stateids associated with that nfs4_client, in which case it should be tested, or it does not. There are no additional conditions to be considered.

Reported-by: "Frank Ch. Eigler" <fche@redhat.com>
Fixes: 7df302f75ee2 ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2023-02-22, 1 file, -47/+83)

Pull nfsd updates from Chuck Lever:

"Two significant security enhancements are part of this release:

- NFSD's RPC header encoding and decoding, including RPCSEC GSS and gssproxy header parsing, has been overhauled to make it more memory-safe.

- Support for Kerberos AES-SHA2-based encryption types has been added for both the NFS client and server. This provides a clean path for deprecating and removing insecure encryption types based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with Kerberos may now be used on systems with fips enabled.

In addition to these, NFSD is now able to handle crossing into an auto-mounted mount point on an exported NFS mount. A number of fixes have been made to NFSD's server-side copy implementation.

RPC metrics have been converted to per-CPU variables. This helps reduce unnecessary cross-CPU and cross-node memory bus traffic, and significantly reduces noise when KCSAN is enabled"

* tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (121 commits)
  NFSD: Clean up nfsd_symlink()
  NFSD: copy the whole verifier in nfsd_copy_write_verifier
  nfsd: don't fsync nfsd_files on last close
  SUNRPC: Fix occasional warning when destroying gss_krb5_enctypes
  nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
  NFSD: fix problems with cleanup on errors in nfsd4_copy
  nfsd: fix race to check ls_layouts
  nfsd: don't hand out delegation on setuid files being opened for write
  SUNRPC: Remove ->xpo_secure_port()
  SUNRPC: Clean up the svc_xprt_flags() macro
  nfsd: remove fs/nfsd/fault_inject.c
  NFSD: fix leaked reference count of nfsd4_ssc_umount_item
  nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
  nfsd: zero out pointers after putting nfsd_files on COPY setup error
  SUNRPC: Fix whitespace damage in svcauth_unix.c
  nfsd: eliminate __nfs4_get_fd
  nfsd: add some kerneldoc comments for stateid preprocessing functions
  nfsd: eliminate find_deleg_file_locked
  nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS
  SUNRPC: Add encryption self-tests
  ...
| * nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open (Jeff Layton, 2023-02-20, 1 file, -10/+11)

The nested if statements here make no sense, as you can never reach "else" branch in the nested statement. Fix the error handling for when there is a courtesy client that holds a conflicting deny mode.

Fixes: 3d6942715180 ("NFSD: add support for share reservation conflict to courteous server")
Reported-by: 張智諺 <cc85nod@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * NFSD: fix problems with cleanup on errors in nfsd4_copy (Dai Ngo, 2023-02-20, 1 file, -2/+3)

When nfsd4_copy fails to allocate memory for async_copy->cp_src, or nfs4_init_copy_state fails, it calls cleanup_async_copy to do the cleanup for the async_copy, which causes a page fault since async_copy is not yet initialized.

This patch rearranges the order of initializing the fields in async_copy and adds checks in cleanup_async_copy to skip uninitialized fields.

Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Fixes: 87689df69491 ("NFSD: Shrink size of struct nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: don't hand out delegation on setuid files being opened for write (Jeff Layton, 2023-02-20, 1 file, -0/+27)

We had a bug report that xfstest generic/355 was failing on NFSv4.0. This test sets various combinations of setuid/setgid modes and tests whether DIO writes will cause them to be stripped.

What I found was that the server did properly strip those bits, but the client didn't notice because it held a delegation that was not recalled. The recall didn't occur because the client itself was the one generating the activity and we avoid recalls in that case.

Clearing setuid bits is an "implicit" activity. The client didn't specifically request that we do that, so we need the server to issue a CB_RECALL, or avoid the situation entirely by not issuing a delegation.

The easiest fix here is to simply not give out a delegation if the file is being opened for write, and the mode has the setuid and/or setgid bit set. Note that there is a potential race between the mode and lease being set, so we test for this condition both before and after setting the lease.

This patch fixes generic/355, generic/683 and generic/684 for me. (Note that 355 fails only on v4.0, and 683 and 684 require NFSv4.2 to run and fail).

Reported-by: Boyang Xue <bxue@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: eliminate __nfs4_get_fd (Jeff Layton, 2023-02-20, 1 file, -13/+7)

This wrapper is pointless, and just obscures what's going on.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: add some kerneldoc comments for stateid preprocessing functions (Jeff Layton, 2023-02-20, 1 file, -4/+25)

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: eliminate find_deleg_file_locked (Jeff Layton, 2023-02-20, 1 file, -10/+1)

We really don't need an accessor function here.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: fix potential race in nfs4_find_file (Jeff Layton, 2023-02-20, 1 file, -6/+9)

The WARN_ON_ONCE check is not terribly useful. It also seems possible for nfs4_find_file to race with the destruction of an fi_deleg_file while trying to take a reference to it.

Now that it's safe to pass nfs_get_file a NULL pointer, remove the WARN and NULL pointer check. Take the fi_lock when fetching fi_deleg_file.

Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * nfsd: allow nfsd_file_get to sanely handle a NULL pointer (Jeff Layton, 2023-02-20, 1 file, -3/+1)

...and remove some now-useless NULL pointer checks in its callers.

Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | Merge tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux (Linus Torvalds, 2023-02-20, 1 file, -2/+2)

Pull file locking updates from Jeff Layton:

"The main change here is that I've broken out most of the file locking definitions into a new header file. I also went ahead and completed the removal of locks_inode function"

* tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs: remove locks_inode
  filelock: move file locking definitions to separate header file
| * fs: remove locks_inode (Jeff Layton, 2023-01-11, 1 file, -2/+2)

locks_inode was turned into a wrapper around file_inode in de2a4a501e71 (Partially revert "locks: fix file locking on overlayfs"). Finish replacing locks_inode invocations everywhere with file_inode.

Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
* | Merge tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2023-02-15, 1 file, -1/+1)

Pull nfsd fix from Chuck Lever:

- Fix a teardown bug in the new nfs4_file hashtable

* tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: don't destroy global nfs4_file table in per-net shutdown
| * | nfsd: don't destroy global nfs4_file table in per-net shutdown (Jeff Layton, 2023-02-11, 1 file, -1/+1)

The nfs4_file table is global, so shutting it down when a containerized nfsd is shut down is wrong and can lead to double-frees. Tear down the nfs4_file_rhltable in nfs4_state_shutdown instead of nfs4_state_shutdown_net.

Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017
Reported-by: JianHong Yin <jiyin@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | | Merge tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2023-01-17, 1 file, -15/+15)

Pull nfsd fixes from Chuck Lever:

- Fix recently introduced use-after-free bugs

* tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: replace delayed_work with work_struct for nfsd_client_shrinker
  NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time
  NFSD: fix use-after-free in nfsd4_ssc_setup_dul()
| * | NFSD: replace delayed_work with work_struct for nfsd_client_shrinker (Dai Ngo, 2023-01-12, 1 file, -4/+4)

Since nfsd4_state_shrinker_count always calls mod_delayed_work with 0 delay, we can replace delayed_work with work_struct to save some space and overhead.

Also add the call to cancel_work after unregistering the shrinker in nfs4_state_shutdown_net.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * | NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time (Dai Ngo, 2023-01-11, 1 file, -11/+11)

Currently the nfsd-client shrinker is registered and unregistered at the time the nfsd module is loaded and unloaded. The problem with this is the shrinker is being registered before all of the relevant fields in nfsd_net are initialized when nfsd is started. This can lead to an oops when memory is low and the shrinker is called while nfsd is not running.

This patch moves the register/unregister of nfsd-client shrinker from module load/unload time to nfsd startup/shutdown time.

Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | | Merge tag 'nfsd-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2023-01-10, 1 file, -12/+4)

Pull nfsd fixes from Chuck Lever:

- Fix a race when creating NFSv4 files

- Revert the use of relaxed bitops

* tag 'nfsd-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Use set_bit(RQ_DROPME)
  Revert "SUNRPC: Use RMW bitops in single-threaded hot paths"
  nfsd: fix handling of cached open files in nfsd4_open codepath
| * nfsd: fix handling of cached open files in nfsd4_open codepath (Jeff Layton, 2023-01-06, 1 file, -12/+4)

Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") added the ability to cache an open fd over a compound. There are a couple of problems with the way this currently works:

It's racy, as a newly-created nfsd_file can end up with its PENDING bit cleared while the nf is hashed, and the nf_file pointer is still zeroed out. Other tasks can find it in this state and they expect to see a valid nf_file, and can oops if nf_file is NULL.

Also, there is no guarantee that we'll end up creating a new nfsd_file if one is already in the hash. If an extant entry is in the hash with a valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with the value of op_file and the old nf_file will leak.

Fix both issues by making a new nfsd_file_acquire_opened variant that takes an optional file pointer. If one is present when this is called, we'll take a new reference to it instead of trying to open the file. If the nfsd_file already has a valid nf_file, we'll just ignore the optional file and pass the nfsd_file back as-is.

Also rework the tracepoints a bit to allow for an "opened" variant and don't try to avoid counting acquisitions in the case where we already have a cached open file.

Fixes: fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file")
Cc: Trond Myklebust <trondmy@hammerspace.com>
Reported-by: Stanislav Saner <ssaner@redhat.com>
Reported-and-Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>
Reported-and-Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2022-12-12, 1 file, -109/+232)

Pull nfsd updates from Chuck Lever:

"This release introduces support for the CB_RECALL_ANY operation. NFSD can send this operation to request that clients return any delegations they choose. The server uses this operation to handle low memory scenarios or indicate to a client when that client has reached the maximum number of delegations the server supports.

The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst support for sparse files in local filesystems and the VFS is improved.

Two major data structure fixes appear in this release:

- The nfs4_file hash table is replaced with a resizable hash table to reduce the latency of NFSv4 OPEN operations.

- Reference counting in the NFSD filecache has been hardened against races.

In furtherance of removing support for NFSv2 in a subsequent kernel release, a new Kconfig option enables server-side support for NFSv2 to be left out of a kernel build.

MAINTAINERS has been updated to indicate that changes to fs/exportfs should go through the NFSD tree"

* tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
  NFSD: Avoid clashing function prototypes
  SUNRPC: Fix crasher in unwrap_integ_data()
  SUNRPC: Make the svc_authenticate tracepoint conditional
  NFSD: Use only RQ_DROPME to signal the need to drop a reply
  SUNRPC: Clean up xdr_write_pages()
  SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
  NFSD: add CB_RECALL_ANY tracepoints
  NFSD: add delegation reaper to react to low memory condition
  NFSD: add support for sending CB_RECALL_ANY
  NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
  trace: Relocate event helper files
  NFSD: pass range end to vfs_fsync_range() instead of count
  lockd: fix file selection in nlmsvc_cancel_blocked
  lockd: ensure we use the correct file descriptor when unlocking
  lockd: set missing fl_flags field when retrieving args
  NFSD: Use struct_size() helper in alloc_session()
  nfsd: return error if nfs4_setacl fails
  lockd: set other missing fields when unlocking files
  NFSD: Add an nfsd_file_fsync tracepoint
  sunrpc: svc: Remove an unused static function svc_ungetu32()
  ...
| * NFSD: add CB_RECALL_ANY tracepoints (Dai Ngo, 2022-12-10, 1 file, -0/+2)

Add tracepoints to trace start and end of CB_RECALL_ANY operation.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
[ cel: added show_rca_mask() macro ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * NFSD: add delegation reaper to react to low memory condition (Dai Ngo, 2022-12-10, 1 file, -4/+84)

The delegation reaper is called from the nfsd memory shrinker's 'count' callback. It scans the client list and sends the courtesy CB_RECALL_ANY to the clients that hold delegations.

To avoid flooding the clients with CB_RECALL_ANY requests, the delegation reaper sends only one CB_RECALL_ANY request to each client per 5 seconds.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
[ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
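The per-client rate limit in this message can be illustrated with a small userspace sketch. The struct and function names are hypothetical, and time is passed in as plain seconds rather than read from a kernel clock; only the 5-second interval and the "clients that hold delegations" filter come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RCA_MIN_INTERVAL_SEC 5   /* interval quoted in the commit message */

/* Hypothetical per-client bookkeeping for this sketch. */
struct client_sketch {
	bool holds_delegations;
	bool rca_sent_before;
	long last_rca_sec;
};

/* Returns true if a CB_RECALL_ANY would be sent to this client now. */
static bool maybe_send_recall_any(struct client_sketch *clp, long now_sec)
{
	assert(clp != NULL);
	if (!clp->holds_delegations)
		return false;
	if (clp->rca_sent_before &&
	    now_sec - clp->last_rca_sec < RCA_MIN_INTERVAL_SEC)
		return false;           /* too soon since the last request */
	clp->rca_sent_before = true;
	clp->last_rca_sec = now_sec;
	return true;
}

/* Scan every client once and count how many requests would go out. */
static size_t reap_delegations(struct client_sketch *clients, size_t n,
			       long now_sec)
{
	size_t sent = 0;

	for (size_t i = 0; i < n; i++)
		if (maybe_send_recall_any(&clients[i], now_sec))
			sent++;
	return sent;
}
```

Keeping the timestamp per client means a burst of shrinker invocations results in at most one request per client per interval, which is the flood-avoidance property the message describes.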
| * NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker (Dai Ngo, 2022-12-10, 1 file, -9/+16)

Refactoring courtesy_client_reaper to generic low memory shrinker so it can be used for other purposes.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * NFSD: Use struct_size() helper in alloc_session() (Xiu Jianfeng, 2022-12-10, 1 file, -5/+4)

Use struct_size() helper to simplify the code, no functional changes.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
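For readers unfamiliar with the helper, the following userspace sketch shows the kind of size computation it replaces: the size of a struct with a trailing flexible array, with an overflow guard. The struct layout and function names here are hypothetical, and saturating to `SIZE_MAX` on overflow reflects my understanding of the kernel helper rather than anything stated in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct slot_sketch;   /* opaque in this sketch */

/* Hypothetical session layout: a header followed by a flexible array
 * of slot pointers. */
struct session_sketch {
	unsigned int nslots;
	struct slot_sketch *slots[];
};

/* head + count * elem, saturating at SIZE_MAX instead of wrapping. */
static size_t flex_size(size_t head, size_t elem, size_t count)
{
	assert(elem > 0);
	if (count > (SIZE_MAX - head) / elem)
		return SIZE_MAX;
	return head + count * elem;
}

/* Allocation size for a session with nslots slot pointers. */
static size_t session_alloc_size(size_t nslots)
{
	return flex_size(sizeof(struct session_sketch),
			 sizeof(struct slot_sketch *), nslots);
}
```

A saturated result is meant to make the subsequent allocation fail cleanly rather than succeed with a buffer that is too small.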
| * NFSD: Use rhashtable for managing nfs4_file objects (Chuck Lever, 2022-11-28, 1 file, -35/+62)

fh_match() is costly, especially when filehandles are large (as is the case for NFSv4). It needs to be used sparingly when searching data structures. Unfortunately, with common workloads, I see multiple thousands of objects stored in file_hashtbl[], which has just 256 buckets, making its bucket hash chains quite lengthy.

Walking long hash chains with the state_lock held blocks other activity that needs that lock. Sizable hash chains are a common occurrence once the server has handed out some delegations, for example -- IIUC, each delegated file is held open on the server by an nfs4_file object.

To help mitigate the cost of searching with fh_match(), replace the nfs4_file hash table with an rhashtable, which can dynamically resize its bucket array to minimize hash chain length.

The result of this modification is an improvement in the latency of NFSv4 operations, and the reduction of nfsd CPU utilization due to eliminating the cost of multiple calls to fh_match() and reducing the CPU cache misses incurred while walking long hash chains in the nfs4_file hash table.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
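The arithmetic behind the chain-length concern can be made concrete with a short sketch. The doubling policy and the target chain length below are assumptions chosen for illustration; they are not the rhashtable resize rules, and the object count of 4096 used in the note after the code is a made-up stand-in for "multiple thousands".

```c
#include <assert.h>

/* Average number of entries per bucket, rounded up, assuming the hash
 * spreads objects evenly. */
static unsigned long avg_chain_len(unsigned long nobjs, unsigned long nbuckets)
{
	assert(nbuckets > 0);
	return (nobjs + nbuckets - 1) / nbuckets;
}

/* Hypothetical growth policy: double the bucket count until the average
 * chain is no longer than max_chain. */
static unsigned long grow_buckets(unsigned long nobjs, unsigned long nbuckets,
				  unsigned long max_chain)
{
	assert(nbuckets > 0 && max_chain > 0);
	while (avg_chain_len(nobjs, nbuckets) > max_chain)
		nbuckets *= 2;
	return nbuckets;
}
```

With 4096 objects in a fixed table of 256 buckets, each lookup walks 16 entries on average, and each step may cost an fh_match() call. A table that is allowed to grow to 4096 buckets brings that down to about one.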
| * NFSD: Refactor find_file() (Chuck Lever, 2022-11-28, 1 file, -21/+15)

find_file() is now the only caller of find_file_locked(), so just fold these two together.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Clean up find_or_add_file() (Chuck Lever, 2022-11-28, 1 file, -36/+28)

Remove the call to find_file_locked() in insert_nfs4_file(). Tracing shows that over 99% of these calls return NULL. Thus it is not worth the expense of the extra bucket list traversal. insert_file() already deals correctly with the case where the item is already in the hash bucket.

Since nfsd4_file_hash_insert() is now just a wrapper around insert_file(), move the meat of insert_file() into nfsd4_file_hash_insert() and get rid of it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
| * NFSD: Add a nfsd4_file_hash_remove() helper (Chuck Lever, 2022-11-28, 1 file, -1/+7)

Refactor to relocate hash deletion operation to a helper function that is close to most other nfs4_file data structure operations.

The "noinline" annotation will become useful in a moment when the hlist_del_rcu() is replaced with a more complex rhash remove operation. It also guarantees that hash remove operations can be traced with "-p function -l remove_nfs4_file_locked".

This also simplifies the organization of forward declarations: the to-be-added rhashtable and its param structure will be defined /after/ put_nfs4_file().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Clean up nfsd4_init_file() (Chuck Lever, 2022-11-28, 1 file, -6/+4)

Name this function more consistently. I'm going to use nfsd4_file_ and nfsd4_file_hash_ for these helpers.

Change the @fh parameter to be const pointer for better type safety.

Finally, move the hash insertion operation to the caller. This is typical for most other "init_object" type helpers, and it is where most of the other nfs4_file hash table operations are located.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Update file_hashtbl() helpers (Chuck Lever, 2022-11-28, 1 file, -2/+2)

Enable callers to use const pointers for type safety.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Trace delegation revocations (Chuck Lever, 2022-11-28, 1 file, -0/+2)

Delegation revocation is an exceptional event that is not otherwise visible externally (eg, no network traffic is emitted). Generate a trace record when it occurs so that revocation can be observed or other activity can be triggered. Example:

  nfsd-1104 [005] 1912.002544: nfsd_stid_revoke: client 633c9343:4e82788d stateid 00000003:00000001 ref=2 type=DELEG

Trace infrastructure is provided for subsequent additional tracing related to nfs4_stid activity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Trace stateids returned via DELEGRETURN (Chuck Lever, 2022-11-28, 1 file, -0/+1)

Handing out a delegation stateid is recorded with the nfsd_deleg_read tracepoint, but there isn't a matching tracepoint for recording when the stateid is returned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
| * NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately" (Chuck Lever, 2022-11-28, 1 file, -2/+2)

This reverts commit 5e138c4a750dc140d881dab4a8804b094bbc08d2.

That commit attempted to make files available to other users as soon as all NFSv4 clients were done with them, rather than waiting until the filecache LRU had garbage collected them.

It gets the reference counting wrong, for one thing.

But it also misses that DELEGRETURN should release a file in the same fashion. In fact, any nfsd_file_put() on a file held open by an NFSv4 client needs potentially to release the file immediately...

Clear the way for implementing that idea.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
| * nfsd: don't call nfsd_file_put from client states seqfile display (Jeff Layton, 2022-11-28, 1 file, -18/+33)

We had a report of this:

  BUG: sleeping function called from invalid context at fs/nfsd/filecache.c:440

...with a stack trace showing nfsd_file_put being called from nfs4_show_open. This code has always tried to call fput while holding a spinlock, but we recently changed this to use the filecache, and that started triggering the might_sleep() in nfsd_file_put.

states_start takes and holds the cl_lock while iterating over the client's states, and we can't sleep with that held.

Have the various nfs4_show_* functions hold the fi_lock instead of taking a nfsd_file reference.

Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138357
Reported-by: Zhi Li <yieli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | nfsd: use locks_inode_context helper (Jeff Layton, 2022-11-30, 1 file, -3/+3)

nfsd currently doesn't access i_flctx safely everywhere. This requires a smp_load_acquire, as the pointer is set via cmpxchg (a release operation).

Acked-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
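The pairing this message relies on, a pointer published by a compare-and-exchange with release semantics and read with an acquire load, can be sketched with C11 atomics. This is a userspace analogy with hypothetical names, not the kernel helper; C11 `stdatomic.h` stands in for `cmpxchg` and `smp_load_acquire`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lock context and the inode. */
struct lock_ctx_sketch { int nlocks; };

struct inode_sketch {
	_Atomic(struct lock_ctx_sketch *) flctx;
};

/* Publish ctx only if no context is installed yet. The successful exchange
 * has release semantics, so the initialised contents of *ctx are visible
 * to any reader that observes the pointer with an acquire load. */
static bool ctx_install(struct inode_sketch *ino, struct lock_ctx_sketch *ctx)
{
	struct lock_ctx_sketch *expected = NULL;

	assert(ino != NULL && ctx != NULL);
	return atomic_compare_exchange_strong_explicit(&ino->flctx, &expected,
			ctx, memory_order_acq_rel, memory_order_acquire);
}

/* Reader side: the acquire load pairs with the release in ctx_install(). */
static struct lock_ctx_sketch *ctx_get(struct inode_sketch *ino)
{
	assert(ino != NULL);
	return atomic_load_explicit(&ino->flctx, memory_order_acquire);
}
```

A plain, unordered read of the pointer could in principle see the pointer before the fields behind it, which is the unsafe access the commit removes by going through one helper everywhere.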
* Merge tag 'nfsd-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux (Linus Torvalds, 2022-11-11, 1 file, -0/+1)

Pull nfsd fixes from Chuck Lever:

- Fix an export leak

- Fix a potential tracepoint crash

* tag 'nfsd-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: put the export reference in nfsd4_verify_deleg_dentry
  nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint
| * nfsd: put the export reference in nfsd4_verify_deleg_dentry (Jeff Layton, 2022-11-08, 1 file, -0/+1)

nfsd_lookup_dentry returns an export reference in addition to the dentry ref. Ensure that we put it too.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866
Fixes: 876c553cb410 ("NFSD: verify the opened dentry after setting a delegation")
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | treewide: use get_random_u32() when possible (Jason A. Donenfeld, 2022-10-11, 1 file, -2/+2)

The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* nfsd: extra checks when freeing delegation stateids (Jeff Layton, 2022-09-26, 1 file, -1/+6)

We've had some reports of problems in the refcounting for delegation stateids that we've yet to track down. Add some extra checks to ensure that we've removed the object from various lists before freeing it.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: make nfsd4_run_cb a bool return function (Jeff Layton, 2022-09-26, 1 file, -3/+2)

queue_work can return false and not queue anything, if the work is already queued. If that happens in the case of a CB_RECALL, we'll have taken an extra reference to the stid that will never be put. Ensure we throw a warning in that case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
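The hazard described here, a reference taken on behalf of work that never gets queued, can be modelled in a few lines of userspace C. All names are hypothetical. The commit itself only says that a warning is raised; dropping the extra reference in the failure path, as this sketch does, is an assumption about one way a caller could react.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a work item that can only be queued once. */
struct work_sketch { bool pending; };

static bool queue_work_sketch(struct work_sketch *w)
{
	assert(w != NULL);
	if (w->pending)
		return false;   /* already queued: nothing new is scheduled */
	w->pending = true;
	return true;
}

/* Hypothetical stateid with a reference count and a recall work item. */
struct stid_sketch {
	int refcount;
	struct work_sketch recall;
	int warnings;           /* stands in for a kernel warning */
};

/* Take a reference for the callback, then try to queue it. */
static bool queue_recall(struct stid_sketch *s)
{
	assert(s != NULL);
	s->refcount++;          /* owned by the queued callback */
	if (queue_work_sketch(&s->recall))
		return true;
	s->warnings++;          /* the case the commit wants to be loud about */
	s->refcount--;          /* assumption: undo the reference nobody will put */
	return false;
}
```

Returning a bool from the queueing function is what lets the caller notice the failure at all, which is the point of the change in this commit.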
* nfsd: fix comments about spinlock handling with delegations (Jeff Layton, 2022-09-26, 1 file, -2/+2)

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: only fill out return pointer on success in nfsd4_lookup_stateid (Jeff Layton, 2022-09-26, 1 file, -4/+6)

In the case of a revoked delegation, we still fill out the pointer even when returning an error, which is bad form. Only overwrite the pointer on success.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
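The convention this commit enforces, writing the out-parameter only on the success path, can be shown with a small lookup function. The table layout, status codes and names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch. */
enum lookup_status { LOOKUP_OK, LOOKUP_NOT_FOUND, LOOKUP_REVOKED };

struct stid_entry {
	unsigned int id;
	bool revoked;
};

/* Look up id in table. *out is written only when LOOKUP_OK is returned;
 * on every error path the caller's pointer is left exactly as it was. */
static enum lookup_status lookup_stateid(struct stid_entry *table, size_t n,
					 unsigned int id,
					 struct stid_entry **out)
{
	assert(out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (table[i].id != id)
			continue;
		if (table[i].revoked)
			return LOOKUP_REVOKED;   /* found, but not handed out */
		*out = &table[i];
		return LOOKUP_OK;
	}
	return LOOKUP_NOT_FOUND;
}
```

Callers that check only the return value cannot end up holding a pointer to a revoked entry by accident, since the error paths never touch `*out`.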
* NFSD: fix use-after-free on source server when doing inter-server copy (Dai Ngo, 2022-09-26, 1 file, -0/+5)

Use-after-free occurred when the laundromat tried to free an expired cpntf_state entry on the s2s_cp_stateids list after an inter-server copy completed. The sc_cp_list that the expired copy state was inserted on was already freed.

When COPY completes, the Linux client normally sends LOCKU(lock_state x), FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server. The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state from the s2s_cp_stateids list before freeing the lock state's stid.

However, sometimes the CLOSE was sent before the FREE_STATEID request. When this happens, the nfsd4_close_open_stateid call from nfsd4_close frees all lock states on its st_locks list without cleaning up the copy state on the sc_cp_list list. By the time the FREE_STATEID arrives, the server returns BAD_STATEID since the lock state was freed. This causes the use-after-free error to occur when the laundromat tries to free the expired cpntf_state.

This patch adds a call to nfs4_free_cpntf_statelist in nfsd4_close_open_stateid to clean up the copy state before calling free_ol_stateid_reaplist to free the lock state's stid on the reaplist.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Rename the fields in copy_stateid_t (Chuck Lever, 2022-09-26, 1 file, -15/+15)

Code maintenance: The name of the copy_stateid_t::sc_count field collides with the sc_count field in struct nfs4_stid, making the latter difficult to grep for when auditing stateid reference counting.

No behavior change expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops (ChenXiaoSong, 2022-09-26, 1 file, -12/+2)

Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.

inode is converted from seq_file->file instead of seq_file->private in client_info_show().

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: add shrinker to reap courtesy clients on low memory condition (Dai Ngo, 2022-09-26, 1 file, -8/+86)

Add courtesy_client_reaper to react to low memory condition triggered by the system memory shrinker.

The delayed_work for the courtesy_client_reaper is scheduled on the shrinker's count callback using the laundry_wq.

The shrinker's scan callback is not used for expiring the courtesy clients due to potential deadlocks.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: keep track of the number of courtesy clients in the system (Dai Ngo, 2022-09-26, 1 file, -1/+16)

Add counter nfs4_courtesy_client_count to nfsd_net to keep track of the number of courtesy clients in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>