summaryrefslogtreecommitdiffstats
path: root/fs/nfsd/trace.h
Commit message (Collapse)AuthorAgeFilesLines
* NFSD: Move nfsd_file_trace_alloc() tracepointChuck Lever2022-07-291-1/+24
| | | | | | | | | Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Separate tracepoints for acquire and createChuck Lever2022-07-291-8/+46
| | | | | | | | These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Convert the filecache to use rhashtableChuck Lever2022-07-291-1/+62
| | | | | | | | | | | | Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. Suggested-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Refactor __nfsd_file_close_inode()Chuck Lever2022-07-291-11/+33
| | | | | | | | | | | The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: No longer record nf_hashval in the trace logChuck Lever2022-07-291-24/+21
| | | | | | | | | I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Leave open files out of the filecache LRUChuck Lever2022-07-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been reports of problems when running fstests generic/531 against Linux NFS servers with NFSv4. The NFS server that hosts the test's SCRATCH_DEV suffers from CPU soft lock-ups during the test. Analysis shows that: fs/nfsd/filecache.c 482 ret = list_lru_walk(&nfsd_file_lru, 483 nfsd_file_lru_cb, 484 &head, LONG_MAX); causes nfsd_file_gc() to walk the entire length of the filecache LRU list every time it is called (which is quite frequently). The walk holds a spinlock the entire time that prevents other nfsd threads from accessing the filecache. What's more, for NFSv4 workloads, none of the items that are visited during this walk may be evicted, since they are all files that are held OPEN by NFS clients. Address this by ensuring that open files are not kept on the LRU list. Reported-by: Frank van der Linden <fllinden@amazon.com> Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Trace filecache LRU activityChuck Lever2022-07-291-0/+39
| | | | | | | | Observe the operation of garbage collection and the lifetime of filecache items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Report the number of items evicted by the LRU walkChuck Lever2022-07-291-0/+29
| | | | | Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Instrument fh_verify()Chuck Lever2022-07-291-0/+46
| | | | | | | | Capture file handles and how they map to local inodes. In particular, NFSv4 PUTFH uses fh_verify() so we can now observe which file handles are the target of OPEN, LOOKUP, RENAME, and so on. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: eliminate the NFSD_FILE_BREAK_* flagsJeff Layton2022-07-291-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had a report from the spring Bake-a-thon of data corruption in some nfstest_interop tests. Looking at the traces showed the NFS server allowing a v3 WRITE to proceed while a read delegation was still outstanding. Currently, we only set NFSD_FILE_BREAK_* flags if NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc. NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for COMMIT ops, where we need a writeable filehandle but don't need to break read leases. It doesn't make any sense to consult that flag when allocating a file since the file may be used on subsequent calls where we do want to break the lease (and the usage of it here seems to be reverse from what it should be anyway). Also, after calling nfsd_open_break_lease, we don't want to clear the BREAK_* bits. A lease could end up being set on it later (more than once) and we need to be able to break those leases as well. This means that the NFSD_FILE_BREAK_* flags now just mirror NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just drop those flags and unconditionally call nfsd_open_break_lease every time. Reported-by: Olga Kornieskaia <kolga@netapp.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360 Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd) Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Clean up the show_nf_flags() macroChuck Lever2022-05-231-6/+0
| | | | | | | The flags are defined using C macros, so TRACE_DEFINE_ENUM is unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Trace filecache opensChuck Lever2022-05-231-0/+28
| | | | | | | Instrument calls to nfsd_open_verified() to get a sense of the filecache hit rate. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Remove NFSD_PROC_ARGS_* macrosChuck Lever2022-02-281-19/+9
| | | | | | | | | | | | Clean up. The PROC_ARGS macros were added when I thought that NFSD tracepoints would be reporting endpoint information. However, tracepoints in the RPC server now report transport endpoint information, so in general there's no need for the upper layers to do that any more, and these macros can be retired. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Use __sockaddr field to store socket addressesChuck Lever2022-02-281-40/+39
| | | | | | | As an example usage of the new __sockaddr field, convert some NFSD trace points to use it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Fix offset type in I/O trace pointsChuck Lever2022-02-091-7/+7
| | | | | | | NFSv3 and NFSv4 use u64 offset values on the wire. Record these values verbatim without the implicit type case to loff_t. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Trace boot verifier resetsChuck Lever2022-01-081-0/+28
| | | | | | | | | | | | | According to commit bbf2f098838a ("nfsd: Reset the boot verifier on all write I/O errors"), the Linux NFS server forces all clients to resend pending unstable writes if any server-side write or commit operation encounters an error (say, ENOSPC). This is a rare and quite exceptional event that could require administrative recovery action, so it should be made trace-able. Example trace event: nfsd-938 [002] 7174.945558: nfsd_writeverf_reset: boot_time= 61cc920d xid=0xdcd62036 error=-28 new verifier=0x08aecc6142515904 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: Add a tracepoint for errors in nfsd4_clone_file_range()Trond Myklebust2022-01-081-0/+50
| | | | | | | | | Since a clone error commit can cause the boot verifier to change, we should trace those errors. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [ cel: Addressed a checkpatch.pl splat in fs/nfsd/vfs.h ]
* NFSD: Combine XDR error tracepointsChuck Lever2022-01-081-21/+7
| | | | | | | | | Clean up: The garbage_args and cant_encode tracepoints report the same information as each other, so combine them into a single tracepoint class to reduce code duplication and slightly reduce the size of trace.o. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFS: Move NFS protocol display macros to global headerChuck Lever2021-11-021-0/+1
| | | | | | | | | | | | | | | Refactor: surface useful show_ macros so they can be shared between the client and server trace code. Additional clean up: - Housekeeping: ensure the correct #include files are pulled in and add proper TRACE_DEFINE_ENUM where they are missing - Use a consistent naming scheme for the helpers - Store values to be displayed symbolically as unsigned long, as that is the type that the __print_yada() functions take Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
* NFSD: Use new __string_len C macros for nfsd_clid_classChuck Lever2021-08-171-3/+2
| | | | | | Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Use new __string_len C macros for the nfs_dirent tracepointChuck Lever2021-08-171-7/+5
| | | | | | Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Prevent a possible oops in the nfs_dirent() tracepointChuck Lever2021-07-061-1/+0
| | | | | | | | | | The double copy of the string is a mistake, plus __assign_str() uses strlen(), which is wrong to do on a string that isn't guaranteed to be NUL-terminated. Fixes: 6019ce0742ca ("NFSD: Add a tracepoint to record directory entry encoding") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Update nfsd_cb_args tracepointChuck Lever2021-05-181-3/+3
| | | | | | | | Clean-up: Re-order the display of IP address and client ID to be consistent with other _cb_ tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepointsChuck Lever2021-05-181-48/+0
| | | | | | | | | Clean up: These are noise in properly working systems. If you really need to observe the operation of the callback mechanism, use the sunrpc:rpc\* tracepoints along with the workqueue tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add an nfsd_cb_probe tracepointChuck Lever2021-05-181-0/+1
| | | | | | | | | Record a tracepoint event when the server performs a callback probe. This event can be enabled as a group with other nfsd_cb tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Replace the nfsd_deleg_break tracepointChuck Lever2021-05-181-1/+31
| | | | | | | | | | Renamed so it can be enabled as a set with the other nfsd_cb_ tracepoints. And, consistent with those tracepoints, report the address of the client, the client ID the server has given it, and the state ID being recalled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add an nfsd_cb_offload tracepointChuck Lever2021-05-181-0/+36
| | | | | | | | | | | | | | Record the arguments of CB_OFFLOAD callbacks so we can better observe asynchronous copy-offload behavior. For example: nfsd-995 [008] 7721.934222: nfsd_cb_offload: addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8739113a count=116528 status=0 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Dai Ngo <Dai.Ngo@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add an nfsd_cb_lm_notify tracepointChuck Lever2021-05-181-0/+26
| | | | | | | | | | | | When the server kicks off a CB_LM_NOTIFY callback, record its arguments so we can better observe asynchronous locking behavior. For example: nfsd-998 [002] 1471.705873: nfsd_cb_notify_lock: addr=192.168.2.51:0 client 6092a47c:35a43fc1 fh_hash=0x8950b23a Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Enhance the nfsd_cb_setup tracepointChuck Lever2021-05-181-1/+26
| | | | | | | | Display the transport protocol and authentication flavor so admins can see what they might be getting wrong. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add cb_lost tracepointChuck Lever2021-05-181-0/+1
| | | | | | | Provide more clarity about when the callback channel is in trouble. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Drop TRACE_DEFINE_ENUM for NFSD4_CB_<state> macrosChuck Lever2021-05-181-5/+0
| | | | | | | TRACE_DEFINE_ENUM() is necessary for enum {} but not for C macros. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add tracepoints for EXCHANGEID edge casesChuck Lever2021-05-181-0/+1
| | | | | | | | Some of the most common cases are traced. Enough infrastructure is now in place that more can be added later, as needed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add tracepoints for SETCLIENTID edge casesChuck Lever2021-05-181-0/+37
| | | | | Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add a couple more nfsd_clid_expired call sitesChuck Lever2021-05-181-1/+2
| | | | | | | Improve observation of NFSv4 lease expiry. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add nfsd_clid_destroyed tracepointChuck Lever2021-05-181-0/+1
| | | | | | | Record client-requested termination of client IDs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add nfsd_clid_reclaim_complete tracepointChuck Lever2021-05-181-0/+1
| | | | | Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add nfsd_clid_confirmed tracepointChuck Lever2021-05-181-0/+1
| | | | | | | | | | | | | | | | This replaces a dprintk call site in order to get greater visibility on when client IDs are confirmed or re-used. Simple example: nfsd-995 [000] 126.622975: nfsd_compound: xid=0x3a34e2b1 opcnt=1 nfsd-995 [000] 126.623005: nfsd_cb_args: addr=192.168.2.51:45901 client 60958e3b:9213ef0e prog=1073741824 ident=1 nfsd-995 [000] 126.623007: nfsd_compound_status: op=1/1 OP_SETCLIENTID status=0 nfsd-996 [001] 126.623142: nfsd_compound: xid=0x3b34e2b1 opcnt=1 >>>> nfsd-996 [001] 126.623146: nfsd_clid_confirmed: client 60958e3b:9213ef0e nfsd-996 [001] 126.623148: nfsd_cb_probe: addr=192.168.2.51:45901 client 60958e3b:9213ef0e state=UNKNOWN nfsd-996 [001] 126.623154: nfsd_compound_status: op=1/1 OP_SETCLIENTID_CONFIRM status=0 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Remove trace_nfsd_clid_inuse_errChuck Lever2021-05-181-24/+0
| | | | | | | | This tracepoint has been replaced by nfsd_clid_cred_mismatch and nfsd_clid_verf_mismatch, and can simply be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add nfsd_clid_verf_mismatch tracepointChuck Lever2021-05-181-0/+32
| | | | | | | | | | Record when a client presents a different boot verifier than the one we know about. Typically this is a sign the client has rebooted, but sometimes it signals a conflicting client ID, which the client's administrator will need to address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add nfsd_clid_cred_mismatch tracepointChuck Lever2021-05-181-0/+28
| | | | | | | | | Record when a client tries to establish a lease record but uses an unexpected credential. This is often a sign of a configuration problem. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add an RPC authflavor tracepoint display helperChuck Lever2021-05-181-0/+16
| | | | | | | To be used in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Fix TP_printk() format specifier in nfsd_clid_classChuck Lever2021-05-181-29/+0
| | | | | | | | | | | | | | Since commit 9a6944fee68e ("tracing: Add a verifier to check string pointers for trace events"), which was merged in v5.13-rc1, TP_printk() no longer tacitly supports the "%.*s" format specifier. These are low value tracepoints, so just remove them. Reported-by: David Wysochanski <dwysocha@redhat.com> Fixes: dd5e3fbc1f47 ("NFSD: Add tracepoints to the NFSD state management code") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add a tracepoint to record directory entry encodingChuck Lever2021-03-221-0/+24
| | | | | | | | | | | | | | | | | | | Enable watching the progress of directory encoding to capture the timing of any issues with reading or encoding a directory. The new tracepoint captures dirent encoding for all NFS versions. For example, here's what a few NFSv4 directory entries might look like: nfsd-989 [002] 468.596265: nfsd_dirent: fh_hash=0x5d162594 ino=2 name=. nfsd-989 [002] 468.596267: nfsd_dirent: fh_hash=0x5d162594 ino=1 name=.. nfsd-989 [002] 468.596299: nfsd_dirent: fh_hash=0x5d162594 ino=3827 name=zlib.c nfsd-989 [002] 468.596325: nfsd_dirent: fh_hash=0x5d162594 ino=3811 name=xdiff nfsd-989 [002] 468.596351: nfsd_dirent: fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h nfsd-989 [002] 468.596377: nfsd_dirent: fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Add tracepoints in nfsd4_decode/encode_compound()Chuck Lever2020-11-301-0/+68
| | | | | | | | | | For troubleshooting purposes, record failures to decode NFSv4 operation arguments and encode operation results. trace_nfsd_compound_decode_err() replaces the dprintk() call sites that are embedded in READ_* macros that are about to be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Add tracepoints in nfsd_dispatch()Chuck Lever2020-11-301-0/+60
| | | | | | | For troubleshooting purposes, record GARBAGE_ARGS and CANT_ENCODE failures. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Remove extra "0x" in tracepoint format specifierChuck Lever2020-11-301-4/+4
| | | | | | Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Clean up the show_nf_may macroChuck Lever2020-11-301-14/+26
| | | | | | | | | | Display all currently possible NFSD_MAY permission flags. Move and rename show_nf_may with a more generic name because the NFSD_MAY permission flags are used in other places besides the file cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* nfsd: rename delegation related tracepoints to make them less confusingHou Tao2020-09-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Now when a read delegation is given, two delegation related traces will be printed: nfsd_deleg_open: client 5f45b854:e6058001 stateid 00000030:00000001 nfsd_deleg_none: client 5f45b854:e6058001 stateid 0000002f:00000001 Although the intention is to let developers know two stateid are returned, the traces are confusing about whether or not a read delegation is handled out. So renaming trace_nfsd_deleg_none() to trace_nfsd_open() and trace_nfsd_deleg_open() to trace_nfsd_deleg_read() to make the intension clearer. The patched traces will be: nfsd_deleg_read: client 5f48a967:b55b21cd stateid 00000003:00000001 nfsd_open: client 5f48a967:b55b21cd stateid 00000002:00000001 Suggested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* NFSD: Add tracepoints for monitoring NFSD callbacksChuck Lever2020-05-201-0/+153
| | | | Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* NFSD: Add tracepoints to the NFSD state management codeChuck Lever2020-05-201-0/+133
| | | | | | | | Capture obvious events and replace dprintk() call sites. Introduce infrastructure so that adding more tracepoints in this code later is simplified. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>