summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* NFSD: Move copy offload callback arguments into a separate structureChuck Lever2022-07-293-45/+47
| | | | | | | | Refactor so that CB_OFFLOAD arguments can be passed without allocating a whole struct nfsd4_copy object. On my system (x86_64) this removes another 96 bytes from struct nfsd4_copy. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Add nfsd4_send_cb_offload()Chuck Lever2022-07-291-15/+22
| | | | | | Refactor for legibility. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Remove kmalloc from nfsd4_do_async_copy()Chuck Lever2022-07-291-14/+14
| | | | | | | | Instead of manufacturing a phony struct nfsd_file, pass the struct file returned by nfs42_ssc_open() directly to nfsd4_do_copy(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor nfsd4_do_copy()Chuck Lever2022-07-291-8/+14
| | | | | | | | Refactor: Now that nfsd4_do_copy() no longer calls the cleanup helpers, plumb the use of struct file pointers all the way down to _nfsd_copy_file_range(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)Chuck Lever2022-07-291-8/+7
| | | | | | | | Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A subsequent patch will modify one of the new call sites to avoid the need to manufacture the phony struct nfsd_file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)Chuck Lever2022-07-291-5/+5
| | | | | | | | | The @src parameter is sometimes a pointer to a struct nfsd_file and sometimes a pointer to struct file hiding in a phony struct nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter is always an explicit struct file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Replace boolean fields in struct nfsd4_copyChuck Lever2022-07-293-43/+53
| | | | | | | Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy() with an atomic bitop. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Make nfs4_put_copy() staticChuck Lever2022-07-292-2/+1
| | | | | | Clean up: All call sites are in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Reorder the fields in struct nfsd4_opChuck Lever2022-07-291-2/+2
| | | | | | | | | | | Pack the fields to reduce the size of struct nfsd4_op, which is used an array in struct nfsd4_compoundargs. sizeof(struct nfsd4_op): Before: /* size: 672, cachelines: 11, members: 5 */ After: /* size: 640, cachelines: 10, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Shrink size of struct nfsd4_copyChuck Lever2022-07-293-4/+11
| | | | | | | | | | | struct nfsd4_copy is part of struct nfsd4_op, which resides in an 8-element array. sizeof(struct nfsd4_op): Before: /* size: 1696, cachelines: 27, members: 5 */ After: /* size: 672, cachelines: 11, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Shrink size of struct nfsd4_copy_notifyChuck Lever2022-07-293-6/+14
| | | | | | | | | | | struct nfsd4_copy_notify is part of struct nfsd4_op, which resides in an 8-element array. sizeof(struct nfsd4_op): Before: /* size: 2208, cachelines: 35, members: 5 */ After: /* size: 1696, cachelines: 27, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: nfserrno(-ENOMEM) is nfserr_jukeboxChuck Lever2022-07-291-2/+2
| | | | | Suggested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Fix strncpy() fortify warningChuck Lever2022-07-292-2/+2
| | | | | | | | | | | | | | In function ‘strncpy’, inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3, inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11: /home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation] 52 | #define __underlying_strncpy __builtin_strncpy | ^ /home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’ 89 | return __underlying_strncpy(p, q, size); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Clean up nfsd4_encode_readlink()Chuck Lever2022-07-291-15/+9
| | | | | | | | Similar changes to nfsd4_encode_readv(), all bundled into a single patch. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Use xdr_pad_size()Chuck Lever2022-07-291-7/+4
| | | | | | | | Clean up: Use a helper instead of open-coding the calculation of the XDR pad size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Simplify starting_lenChuck Lever2022-07-291-5/+4
| | | | | | | | | Clean-up: Now that nfsd4_encode_readv() does not have to encode the EOF or rd_length values, it no longer needs to subtract 8 from @starting_len. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Optimize nfsd4_encode_readv()Chuck Lever2022-07-291-12/+6
| | | | | | | | | | | | write_bytes_to_xdr_buf() is pretty expensive to use for inserting an XDR data item that is always 1 XDR_UNIT at an address that is always XDR word-aligned. Since both the readv and splice read paths encode EOF and maxcount values, move both to a common code path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Add an nfsd4_read::rd_eof fieldChuck Lever2022-07-292-8/+8
| | | | | | | | Refactor: Make the EOF result available in the entire NFSv4 READ path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Clean up SPLICE_OK in nfsd4_encode_read()Chuck Lever2022-07-291-5/+4
| | | | | | | | Do the test_bit() once -- this reduces the number of locked-bus operations and makes the function a little easier to read. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Optimize nfsd4_encode_fattr()Chuck Lever2022-07-291-7/+4
| | | | | | | | | | | | write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly. In nfsd4_encode_fattr(), it is unnecessary because the data item is fixed in size and the buffer destination address is always word-aligned. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Optimize nfsd4_encode_operation()Chuck Lever2022-07-291-2/+1
| | | | | | | | | | | | write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly, and here, it is unnecessary because the data item is fixed in size, the buffer destination address is always word-aligned, and the destination location is already in @p. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: silence extraneous printk on nfsd.ko insertionJeff Layton2022-07-291-1/+0
| | | | | | | | | | This printk pops every time nfsd.ko gets plugged in. Most kmods don't do that and this one is not very informative. Olaf's email address seems to be defunct at this point anyway. Just drop it. Cc: Olaf Kirch <okir@suse.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: limit the number of v4 clients to 1024 per 1GB of system memoryDai Ngo2022-07-293-6/+24
| | | | | | | | | | | | | | | | | | | | | | | Currently there is no limit on how many v4 clients are supported by the system. This can be a problem in systems with small memory configuration to function properly when a very large number of clients exist that creates memory shortage conditions. This patch enforces a limit of 1024 NFSv4 clients, including courtesy clients, per 1GB of system memory. When the number of the clients reaches the limit, requests that create new clients are returned with NFS4ERR_DELAY and the laundromat is kicked start to trim old clients. Due to the overhead of the upcall to remove the client record, the maximun number of clients the laundromat removes on each run is limited to 128. This is done to ensure the laundromat can still process the other tasks in a timely manner. Since there is now a limit of the number of clients, the 24-hr idle time limit of courtesy client is no longer needed and was removed. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: keep track of the number of v4 clients in the systemDai Ngo2022-07-292-2/+10
| | | | | | | | Add counter nfs4_client_count to keep track of the total number of v4 clients, including courtesy clients, in the system. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: refactoring v4 specific code to a helper in nfs4state.cDai Ngo2022-07-293-8/+17
| | | | | | | | This patch moves the v4 specific code from nfsd_init_net() to nfsd4_init_leases_net() helper in nfs4state.c Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Ensure nf_inode is never dereferencedChuck Lever2022-07-293-5/+4
| | | | | | | | | | | | | | | | | The documenting comment for struct nf_file states: /* * A representation of a file that has been opened by knfsd. These are hashed * in the hashtable by inode pointer value. Note that this object doesn't * hold a reference to the inode by itself, so the nf_inode pointer should * never be dereferenced, only used for comparison. */ Replace the two existing dereferences to make the comment always true. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: NFSv4 CLOSE should release an nfsd_file immediatelyChuck Lever2022-07-293-2/+21
| | | | | | | | | | | | The last close of a file should enable other accessors to open and use that file immediately. Leaving the file open in the filecache prevents other users from accessing that file until the filecache garbage-collects the file -- sometimes that takes several seconds. Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Move nfsd_file_trace_alloc() tracepointChuck Lever2022-07-292-2/+25
| | | | | | | | | Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Separate tracepoints for acquire and createChuck Lever2022-07-293-12/+52
| | | | | | | | These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Clean up unused code after rhashtable conversionChuck Lever2022-07-292-33/+1
| | | | | Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Convert the filecache to use rhashtableChuck Lever2022-07-292-149/+179
| | | | | | | | | | | | Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. Suggested-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Set up an rhashtable for the filecacheChuck Lever2022-07-292-21/+140
| | | | | | | | Add code to initialize and tear down an rhashtable. The rhashtable is not used yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Replace the "init once" mechanismChuck Lever2022-07-291-16/+26
| | | | | | | | | | | | | | In a moment, the nfsd_file_hashtbl global will be replaced with an rhashtable. Replace the one or two spots that need to check if the hash table is available. We can easily reuse the SHUTDOWN flag for this purpose. Document that this mechanism relies on callers to hold the nfsd_mutex to prevent init, shutdown, and purging to run concurrently. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Remove nfsd_file::nf_hashvalChuck Lever2022-07-292-5/+2
| | | | | | | | The value in this field can always be computed from nf_inode, thus it is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: nfsd_file_hash_remove can compute hashvalChuck Lever2022-07-291-5/+14
| | | | | | | Remove an unnecessary use of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor __nfsd_file_close_inode()Chuck Lever2022-07-292-30/+54
| | | | | | | | | | | The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: nfsd_file_unhash can compute hashval from nf->nf_inodeChuck Lever2022-07-291-2/+6
| | | | | | | Remove an unnecessary usage of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Remove lockdep assertion from unhash_and_release_locked()Chuck Lever2022-07-291-2/+0
| | | | | | | | IIUC, holding the hash bucket lock is needed only in nfsd_file_unhash, and there is already a lockdep assertion there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: No longer record nf_hashval in the trace logChuck Lever2022-07-292-31/+29
| | | | | | | | | I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Never call nfsd_file_gc() in foreground pathsChuck Lever2022-07-291-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The checks in nfsd_file_acquire() and nfsd_file_put() that directly invoke filecache garbage collection are intended to keep cache occupancy between a low- and high-watermark. The reason to limit the capacity of the filecache is to keep filecache lookups reasonably fast. However, invoking garbage collection at those points has some undesirable negative impacts. Files that are held open by NFSv4 clients often push the occupancy of the filecache over these watermarks. At that point: - Every call to nfsd_file_acquire() and nfsd_file_put() results in an LRU walk. This has the same effect on lookup latency as long chains in the hash table. - Garbage collection will then run on every nfsd thread, causing a lot of unnecessary lock contention. - Limiting cache capacity pushes out files used only by NFSv3 clients, which are the type of files the filecache is supposed to help. To address those negative impacts, remove the direct calls to the garbage collector. Subsequent patches will address maintaining lookup efficiency as cache capacity increases. Suggested-by: Wang Yugui <wangyugui@e16-tech.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Fix the filecache LRU shrinkerChuck Lever2022-07-291-2/+3
| | | | | | | | | | | | | Without LRU item rotation, the shrinker visits only a few items on the end of the LRU list, and those would always be long-term OPEN files for NFSv4 workloads. That makes the filecache shrinker completely ineffective. Adopt the same strategy as the inode LRU by using LRU_ROTATE. Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Leave open files out of the filecache LRUChuck Lever2022-07-292-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been reports of problems when running fstests generic/531 against Linux NFS servers with NFSv4. The NFS server that hosts the test's SCRATCH_DEV suffers from CPU soft lock-ups during the test. Analysis shows that: fs/nfsd/filecache.c 482 ret = list_lru_walk(&nfsd_file_lru, 483 nfsd_file_lru_cb, 484 &head, LONG_MAX); causes nfsd_file_gc() to walk the entire length of the filecache LRU list every time it is called (which is quite frequently). The walk holds a spinlock the entire time that prevents other nfsd threads from accessing the filecache. What's more, for NFSv4 workloads, none of the items that are visited during this walk may be evicted, since they are all files that are held OPEN by NFS clients. Address this by ensuring that open files are not kept on the LRU list. Reported-by: Frank van der Linden <fllinden@amazon.com> Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Trace filecache LRU activityChuck Lever2022-07-292-13/+70
| | | | | | | | Observe the operation of garbage collection and the lifetime of filecache items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: WARN when freeing an item still linked via nf_lruChuck Lever2022-07-291-2/+10
| | | | | | | | | | | | Add a guardrail to prevent freeing memory that is still on a list. This includes either a dispose list or the LRU list. This is the sign of a bug, but this class of bugs can be detected so that they don't endanger system stability, especially while debugging. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Hook up the filecache stat fileChuck Lever2022-07-291-0/+10
| | | | | | | | | There has always been the capability of exporting filecache metrics via /proc, but it was never hooked up. Let's surface these metrics to enable better observability of the filecache. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Zero counters when the filecache is re-initializedChuck Lever2022-07-291-0/+11
| | | | | | | | If nfsd_file_cache_init() is called after a shutdown, be sure the stat counters are reset. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Record number of flush callsChuck Lever2022-07-291-2/+11
| | | | | Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Report the number of items evicted by the LRU walkChuck Lever2022-07-292-3/+39
| | | | | Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor nfsd_file_lru_scan()Chuck Lever2022-07-291-18/+7
| | | | | Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor nfsd_file_gc()Chuck Lever2022-07-291-1/+5
| | | | | | | Refactor nfsd_file_gc() to use the new list_lru helper. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>