summaryrefslogtreecommitdiffstats
path: root/net/sunrpc
Commit message (Collapse)AuthorAgeFilesLines
* SUNRPC: RPC level errors should set task->tk_rpc_statusTrond Myklebust2022-08-311-1/+1
| | | | | | | | | | | [ Upstream commit ed06fce0b034b2e25bd93430f5c4cbb28036cc1a ] Fix up a case in call_encode() where we're failing to set task->tk_rpc_status when an RPC level error occurred. Fixes: 9c5948c24869 ("SUNRPC: task should be exit if encode return EKEYEXPIRED more times") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net/sunrpc: fix potential memory leaks in rpc_sysfs_xprt_state_change()Xin Xiong2022-08-251-2/+4
| | | | | | | | | | | | | | | | | | | | commit bfc48f1b0505ffcb03a6d749139b7577d6b81ae0 upstream. The issue happens on some error handling paths. When the function fails to grab the object `xprt`, it simply returns 0, forgetting to decrease the reference count of another object `xps`, which is increased by rpc_sysfs_xprt_kobj_get_xprt_switch(), causing refcount leaks. Also, the function forgets to check whether `xps` is valid before using it, which may result in NULL-dereferencing issues. Fix it by adding proper error handling code when either `xprt` or `xps` is NULL. Fixes: 5b7eb78486cd ("SUNRPC: take a xprt offline using sysfs") Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Reinitialise the backchannel request buffers before reuseTrond Myklebust2022-08-251-0/+14
| | | | | | | | | | | | commit 6622e3a73112fc336c1c2c582428fb5ef18e456a upstream. When we're reusing the backchannel requests instead of freeing them, then we should reinitialise any values of the send/receive xdr_bufs so that they reflect the available space. Fixes: 0d2a970d0ae5 ("SUNRPC: Fix a backchannel race") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* sunrpc: fix expiry of auth credsDan Aloni2022-08-251-1/+1
| | | | | | | | | | | | | commit f1bafa7375c01ff71fb7cb97c06caadfcfe815f3 upstream. Before this commit, with a large enough LRU of expired items (100), the loop skipped all the expired items and was entirely ineffectual in trimming the LRU list. Fixes: 95cd623250ad ('SUNRPC: Clean up the AUTH cache code') Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Fix READ_PLUS crasherChuck Lever2022-07-071-1/+1
| | | | | | | | | | | | | | | commit a23dd544debcda4ee4a549ec7de59e85c3c8345c upstream. Looks like there are still cases when "space_left - frag1bytes" can legitimately exceed PAGE_SIZE. Ensure that xdr->end always remains within the current encode buffer. Reported-by: Bruce Fields <bfields@fieldses.org> Reported-by: Zorro Lang <zlang@redhat.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216151 Fixes: 6c254bf3b637 ("SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* sunrpc: set cl_max_connect when cloning an rpc_clntScott Mayhew2022-06-221-0/+1
| | | | | | | | | | | | | | | | | | | | [ Upstream commit 304791255a2dc1c9be7e7c8a6cbdb31b6847b0e5 ] If the initial attempt at trunking detection using the krb5i auth flavor fails with -EACCES, -NFS4ERR_CLID_INUSE, or -NFS4ERR_WRONGSEC, then the NFS client tries again using auth_sys, cloning the rpc_clnt in the process. If this second attempt at trunking detection succeeds, then the resulting nfs_client->cl_rpcclient winds up having cl_max_connect=0 and subsequent attempts to add additional transport connections to the rpc_clnt will fail with a message similar to the following being logged: [502044.312640] SUNRPC: reached max allowed number (0) did not add transport to server: 192.168.122.3 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: dc48e0abee24 ("SUNRPC enforce creation of no more than max_connect xprts") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()Chuck Lever2022-06-141-1/+5
| | | | | | | | | | | | | | | | | | | | [ Upstream commit 6c254bf3b637dd4ef4f78eb78c7447419c0161d7 ] I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up right at the end of the page array. xdr_get_next_encode_buffer() does not compute the value of xdr->end correctly: * The check to see if we're on the final available page in xdr->buf needs to account for the space consumed by @nbytes. * The new xdr->end value needs to account for the portion of @nbytes that is to be encoded into the previous buffer. Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Trap RDMA segment overflowsChuck Lever2022-06-141-2/+2
| | | | | | | | | | | | | | [ Upstream commit f012e95b377c73c0283f009823c633104dedb337 ] Prevent svc_rdma_build_writes() from walking off the end of a Write chunk's segment array. Caught with KASAN. The test that this fix replaces is invalid, and might have been left over from an earlier prototype of the PCL work. Fixes: 7a1cbfa18059 ("svcrdma: Use parsed chunk lists to construct RDMA Writes") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* xprtrdma: treat all calls not a bcall when bc_serv is NULLKinglong Mee2022-06-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 11270e7ca268e8d61b5d9e5c3a54bd1550642c9c ] When a rdma server returns a fault format reply, nfs v3 client may treats it as a bcall when bc service is not exist. The debug message at rpcrdma_bc_receive_call are, [56579.837169] RPC: rpcrdma_bc_receive_call: callback XID 00000001, length=20 [56579.837174] RPC: rpcrdma_bc_receive_call: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04 After that, rpcrdma_bc_receive_call will meets NULL pointer as, [ 226.057890] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 ... [ 226.058704] RIP: 0010:_raw_spin_lock+0xc/0x20 ... [ 226.059732] Call Trace: [ 226.059878] rpcrdma_bc_receive_call+0x138/0x327 [rpcrdma] [ 226.060011] __ib_process_cq+0x89/0x170 [ib_core] [ 226.060092] ib_cq_poll_work+0x26/0x80 [ib_core] [ 226.060257] process_one_work+0x1a7/0x360 [ 226.060367] ? create_worker+0x1a0/0x1a0 [ 226.060440] worker_thread+0x30/0x390 [ 226.060500] ? create_worker+0x1a0/0x1a0 [ 226.060574] kthread+0x116/0x130 [ 226.060661] ? kthread_flush_work_fn+0x10/0x10 [ 226.060724] ret_from_fork+0x35/0x40 ... Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()Trond Myklebust2022-05-182-9/+14
| | | | | | | | | | | | | | | | | | | | | | commit f00432063db1a0db484e85193eccc6845435b80e upstream. We must ensure that all sockets are closed before we call xprt_free() and release the reference to the net namespace. The problem is that calling fput() will defer closing the socket until delayed_fput() gets called. Let's fix the situation by allowing rpciod and the transport teardown code (which runs on the system wq) to call __fput_sync(), and directly close the socket. Reported-by: Felix Fu <foyjog@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Fixes: a73881c96d73 ("SUNRPC: Fix an Oops in udp_poll()") Cc: stable@vger.kernel.org # 5.1.x: 3be232f11a3c: SUNRPC: Prevent immediate close+reconnect Cc: stable@vger.kernel.org # 5.1.x: 89f42494f92f: SUNRPC: Don't call connect() more than once on a TCP socket Cc: stable@vger.kernel.org # 5.1.x Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Meena Shanmugam <meenashanmugam@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Ensure that the gssproxy client can start in a connected stateTrond Myklebust2022-05-182-0/+34
| | | | | | | | | | | | | commit fd13359f54ee854f00134abc6be32da94ec53dbf upstream. Ensure that the gssproxy client connects to the server from the gssproxy daemon process context so that the AF_LOCAL socket connection is done using the correct path and namespaces. Fixes: 1d658336b05f ("SUNRPC: Add RPC based upcall mechanism for RPCGSS auth") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC release the transport of a relocated task with an assigned transportOlga Kornievskaia2022-05-121-4/+7
| | | | | | | | | | | commit e13433b4416fa31a24e621cbbbb39227a3d651dd upstream. A relocated task must release its previous transport. Fixes: 82ee41b85cef1 ("SUNRPC don't resend a task on an offlined transport") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Revert "SUNRPC: attempt AF_LOCAL connect on setup"Trond Myklebust2022-05-121-3/+0
| | | | | | | | | | | | | | | | commit a3d0562d4dc039bca39445e1cddde7951662e17d upstream. This reverts commit 7073ea8799a8cf73db60270986f14e4aae20fa80. We must not try to connect the socket while the transport is under construction, because the mechanisms to safely tear it down are not in place. As the code stands, we end up leaking the sockets on a connection error. Reported-by: wanghai (M) <wanghai38@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Fix NFSD's request deferral on RDMA transportsChuck Lever2022-04-202-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 773f91b2cf3f52df0d7508fdbf60f37567cdaee4 upstream. Trond Myklebust reports an NFSD crash in svc_rdma_sendto(). Further investigation shows that the crash occurred while NFSD was handling a deferred request. This patch addresses two inter-related issues that prevent request deferral from working correctly for RPC/RDMA requests: 1. Prevent the crash by ensuring that the original svc_rqst::rq_xprt_ctxt value is available when the request is revisited. Otherwise svc_rdma_sendto() does not have a Receive context available with which to construct its reply. 2. Possibly since before commit 71641d99ce03 ("svcrdma: Properly compute .len and .buflen for received RPC Calls"), svc_rdma_recvfrom() did not include the transport header in the returned xdr_buf. There should have been no need for svc_defer() and friends to save and restore that header, as of that commit. This issue is addressed in a backport-friendly way by simply having svc_rdma_recvfrom() set rq_xprt_hlen to zero unconditionally, just as svc_tcp_recvfrom() does. This enables svc_deferred_recv() to correctly reconstruct an RPC message received via RPC/RDMA. Reported-by: Trond Myklebust <trondmy@hammerspace.com> Link: https://lore.kernel.org/linux-nfs/82662b7190f26fb304eb0ab1bb04279072439d4e.camel@hammerspace.com/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Don't call connect() more than once on a TCP socketTrond Myklebust2022-04-131-10/+11
| | | | | | | | | | | | | | commit 89f42494f92f448747bd8a7ab1ae8b5d5520577d upstream. Avoid socket state races due to repeated calls to ->connect() using the same socket. If connect() returns 0 due to the connection having completed, but we are in fact in a closing state, then we may leave the XPRT_CONNECTING flag set on the transport. Reported-by: Enrico Scholz <enrico.scholz@sigma-chemnitz.de> Fixes: 3be232f11a3c ("SUNRPC: Prevent immediate close+reconnect") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Prevent immediate close+reconnectTrond Myklebust2022-04-132-2/+3
| | | | | | | | | | commit 3be232f11a3cc9b0ef0795e39fa11bdb8e422a06 upstream. If we have already set up the socket and are waiting for it to connect, then don't immediately close and retry. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()Trond Myklebust2022-04-131-1/+3
| | | | | | | | | | | [ Upstream commit b056fa070814897be32d83b079dbc311375588e7 ] The allocation is done with GFP_KERNEL, but it could still fail in a low memory situation. Fixes: 4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Handle low memory situations in call_status()Trond Myklebust2022-04-131-0/+5
| | | | | | | | | | | | [ Upstream commit 9d82819d5b065348ce623f196bf601028e22ed00 ] We need to handle ENFILE, ENOBUFS, and ENOMEM, because xprt_wake_pending_tasks() can be called with any one of these due to socket creation failures. Fixes: b61d59fffd3e ("SUNRPC: xs_tcp_connect_worker{4,6}: merge common code") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Handle ENOMEM in call_transmit_status()Trond Myklebust2022-04-131-0/+2
| | | | | | | | | | | [ Upstream commit d3c15033b240767d0287f1c4a529cbbe2d5ded8a ] Both call_transmit() and call_bc_transmit() can now return ENOMEM, so let's make sure that we handle the errors gracefully. Fixes: 0472e4766049 ("SUNRPC: Convert socket page send code to use iov_iter()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: Fix socket waits for write buffer spaceTrond Myklebust2022-04-131-15/+39
| | | | | | | | | | | [ Upstream commit 7496b59f588dd52886fdbac7633608097543a0a5 ] The socket layer requires that we use the socket lock to protect changes to the sock->sk_write_pending field and others. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: remove scheduling boost for "SWAPPER" tasks.NeilBrown2022-04-132-18/+0
| | | | | | | | | | | | | | | | | | | | [ Upstream commit a80a8461868905823609be97f91776a26befe839 ] Currently, tasks marked as "swapper" tasks get put to the front of non-priority rpc_queues, and are sorted earlier than non-swapper tasks on the transport's ->xmit_queue. This is pointless as currently *all* tasks for a mount that has swap enabled on *any* file are marked as "swapper" tasks. So the net result is that the non-priority rpc_queues are reverse-ordered (LIFO). This scheduling boost is not necessary to avoid deadlocks, and hurts fairness, so remove it. If there were a need to expedite some requests, the tk_priority mechanism is a more appropriate tool. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC/xprt: async tasks mustn't block waiting for memoryNeilBrown2022-04-132-2/+5
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit a721035477fb5fb8abc738fbe410b07c12af3dc5 ] When memory is short, new worker threads cannot be created and we depend on the minimum one rpciod thread to be able to handle everything. So it must not block waiting for memory. xprt_dynamic_alloc_slot can block indefinitely. This can tie up all workqueue threads and NFS can deadlock. So when called from a workqueue, set __GFP_NORETRY. The rdma alloc_slot already does not block. However it sets the error to -EAGAIN suggesting this will trigger a sleep. It does not. As we can see in call_reserveresult(), only -ENOMEM causes a sleep. -EAGAIN causes immediate retry. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC/call_alloc: async tasks mustn't block waiting for memoryNeilBrown2022-04-132-2/+6
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit c487216bec83b0c5a8803e5c61433d33ad7b104d ] When memory is short, new worker threads cannot be created and we depend on the minimum one rpciod thread to be able to handle everything. So it must not block waiting for memory. mempools are particularly a problem as memory can only be released back to the mempool by an async rpc task running. If all available workqueue threads are waiting on the mempool, no thread is available to return anything. rpc_malloc() can block, and this might cause deadlocks. So check RPC_IS_ASYNC(), rather than RPC_IS_SWAPPER() to determine if blocking is acceptable. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC don't resend a task on an offlined transportOlga Kornievskaia2022-04-081-1/+3
| | | | | | | | | | | | | [ Upstream commit 82ee41b85cef16b4be1f4732650012d9baaedddd ] When a task is being retried, due to an NFS error, if the assigned transport has been put offline and the task is relocatable pick a new transport. Fixes: 6f081693e7b2b ("sunrpc: remove an offlined xprt using sysfs") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: avoid race between mod_timer() and del_timer_sync()NeilBrown2022-04-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | commit 3848e96edf4788f772d83990022fa7023a233d83 upstream. xprt_destory() claims XPRT_LOCKED and then calls del_timer_sync(). Both xprt_unlock_connect() and xprt_release() call ->release_xprt() which drops XPRT_LOCKED and *then* xprt_schedule_autodisconnect() which calls mod_timer(). This may result in mod_timer() being called *after* del_timer_sync(). When this happens, the timer may fire long after the xprt has been freed, and run_timer_softirq() will probably crash. The pairing of ->release_xprt() and xprt_schedule_autodisconnect() is always called under ->transport_lock. So if we take ->transport_lock to call del_timer_sync(), we can be sure that mod_timer() will run first (if it runs at all). Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace pointChuck Lever2022-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit dc6c6fb3d639756a532bcc47d4a9bf9f3965881b ] While testing, I got an unexpected KASAN splat: Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: BUG: KASAN: stack-out-of-bounds in trace_event_raw_event_svc_xprt_create_err+0x190/0x210 [sunrpc] Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: Read of size 28 at addr ffffc9000008f728 by task mount.nfs/4628 The memcpy() in the TP_fast_assign section of this trace point copies the size of the destination buffer in order that the buffer won't be overrun. In other similar trace points, the source buffer for this memcpy is a "struct sockaddr_storage" so the actual length of the source buffer is always long enough to prevent the memcpy from reading uninitialized or unallocated memory. However, for this trace point, the source buffer can be as small as a "struct sockaddr_in". For AF_INET sockaddrs, the memcpy() reads memory that follows the source buffer, which is not always valid memory. To avoid copying past the end of the passed-in sockaddr, make the source address's length available to the memcpy(). It would be a little nicer if the tracing infrastructure was more friendly about storing socket addresses that are not AF_INET, but I could not find a way to make printk("%pIS") work with a dynamic array. Reported-by: KASAN Fixes: 4b8f380e46e4 ("SUNRPC: Tracepoint to record errors in svc_xpo_create()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()Chuck Lever2022-03-081-5/+6
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit dae9a6cab8009e526570e7477ce858dcdfeb256e ] Refactor. Now that the NFSv2 and NFSv3 XDR decoders have been converted to use xdr_streams, the WRITE decoder functions can use xdr_stream_subsegment() to extract the WRITE payload into its own xdr_buf, just as the NFSv4 WRITE XDR decoder currently does. That makes it possible to pass the first kvec, pages array + length, page_base, and total payload length via a single function parameter. The payload's page_base is not yet assigned or used, but will be in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_createDan Aloni2022-02-231-0/+3
| | | | | | | | | | | | | [ Upstream commit a9c10b5b3b67b3750a10c8b089b2e05f5e176e33 ] If there are failures then we must not leave the non-NULL pointers with the error value, otherwise `rpcrdma_ep_destroy` gets confused and tries free them, resulting in an Oops. Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* sunrpc: Fix potential race conditions in rpc_sysfs_xprt_state_change()Anna Schumaker2022-02-161-16/+19
| | | | | | | | | | [ Upstream commit 1a48db3fef499f615b56093947ec4b0d3d8e3021 ] We need to use test_and_set_bit() when changing xprt state flags to avoid potentially getting xps->xps_nactive out of sync. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net/sunrpc: fix reference count leaks in rpc_sysfs_xprt_state_changeXiyu Yang2022-02-161-2/+4
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 776d794f28c95051bc70405a7b1fa40115658a18 ] The refcount leak issues take place in an error handling path. When the 3rd argument buf doesn't match with "offline", "online" or "remove", the function simply returns -EINVAL and forgets to decrease the reference count of a rpc_xprt object and a rpc_xprt_switch object increased by rpc_sysfs_xprt_kobj_get_xprt() and rpc_sysfs_xprt_kobj_get_xprt_switch(), causing reference count leaks of both unused objects. Fix this issue by jumping to the error handling path labelled with out_put when buf matches none of "offline", "online" or "remove". Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC allow for unspecified transport time in rpc_clnt_add_xprtOlga Kornievskaia2022-02-161-1/+4
| | | | | | | | | | | [ Upstream commit b8a09619a56334414cbd7f935a0796240d0cc07e ] If the supplied argument doesn't specify the transport type, use the type of the existing rpc clnt and its existing transport. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* fsnotify: fix fsnotify hooks in pseudo filesystemsAmir Goldstein2022-02-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 29044dae2e746949ad4b9cbdbfb248994d1dcdb4 upstream. Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression in pseudo filesystems, convert d_delete() calls to d_drop() (see commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and move the fsnotify hook after d_drop(). Add a missing fsnotify_unlink() hook in nfsdfs that was found during the audit of fsnotify hooks in pseudo filesystems. Note that the fsnotify hooks in simple_recursive_removal() follow d_invalidate(), so they require no change. Link: https://lore.kernel.org/r/20220120215305.282577-2-amir73il@gmail.com Reported-by: Ivan Delalande <colona@arista.com> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: Partial revert of commit 6f9f17287e78Trond Myklebust2021-11-181-13/+15
| | | | | | | | | | | | | | | | commit ea7a1019d8baf8503ecd6e3ec8436dec283569e6 upstream. The premise of commit 6f9f17287e78 ("SUNRPC: Mitigate cond_resched() in xprt_transmit()") was that cond_resched() is expensive and unnecessary when there has been just a single send. The point of cond_resched() is to ensure that tasks that should pre-empt this one get a chance to do so when it is safe to do so. The code prior to commit 6f9f17287e78 failed to take into account that it was keeping a rpc_task pinned for longer than it needed to, and so rather than doing a full revert, let's just move the cond_resched. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* nfsd: don't alloc under spinlock in rpc_parse_scope_idJ. Bruce Fields2021-11-181-22/+18
| | | | | | | | | | | | | | | | | | [ Upstream commit 9b6e27d01adcec58e046c624874f8a124e8b07ec ] Dan Carpenter says: The patch d20c11d86d8f: "nfsd: Protect session creation and client confirm using client_lock" from Jul 30, 2014, leads to the following Smatch static checker warning: net/sunrpc/addr.c:178 rpc_parse_scope_id() warn: sleeping in atomic context Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: d20c11d86d8f ("nfsd: Protect session creation and client...") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Merge tag 'nfsd-5.15-3' of ↵Linus Torvalds2021-10-071-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Bug fixes for NFSD error handling paths" * tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Keep existing listeners on portlist error SUNRPC: fix sign error causing rpcsec_gss drops nfsd: Fix a warning for nfsd_file_close_inode nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
| * SUNRPC: fix sign error causing rpcsec_gss dropsJ. Bruce Fields2021-10-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number whenever sd_max is less than GSS_SEQ_WIN, and the comparison: seq_num <= sd->sd_max - GSS_SEQ_WIN in gss_check_seq_num is pretty much always true, even when that's clearly not what was intended. This was causing pynfs to hang when using krb5, because pynfs uses zero as the initial gss sequence number. That's perfectly legal, but this logic error causes knfsd to drop the rpc in that case. Out-of-order sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this. Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints") Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | Merge tag 'nfsd-5.15-1' of ↵Linus Torvalds2021-09-083-7/+10
|\| | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Restore performance on memory-starved servers * tag 'nfsd-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: improve error response to over-size gss credential SUNRPC: don't pause on incomplete allocation
| * SUNRPC: improve error response to over-size gss credentialNeilBrown2021-09-032-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the NFS server receives a large gss (kerberos) credential and tries to pass it up to rpc.svcgssd (which is deprecated), it triggers an infinite loop in cache_read(). cache_request() always returns -EAGAIN, and this causes a "goto again". This patch: - changes the error to -E2BIG to avoid the infinite loop, and - generates a WARN_ONCE when rsi_request first sees an over-sized credential. The warning suggests switching to gssproxy. Link: https://bugzilla.kernel.org/show_bug.cgi?id=196583 Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
| * SUNRPC: don't pause on incomplete allocationNeilBrown2021-09-011-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alloc_pages_bulk_array() attempts to allocate at least one page based on the provided pages, and then opportunistically allocates more if that can be done without dropping the spinlock. So if it returns fewer than requested, that could just mean that it needed to drop the lock. In that case, try again immediately. Only pause for a time if no progress could be made. Reported-and-tested-by: Mike Javorski <mike.javorski@gmail.com> Reported-and-tested-by: Lothar Paltins <lopa@mailbox.org> Fixes: f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator") Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Mel Gorman <mgorman@suse.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
* | Merge tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2021-09-0417-136/+191
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Anna Schumaker: "New Features: - Better client responsiveness when server isn't replying - Use refcount_t in sunrpc rpc_client refcount tracking - Add srcaddr and dst_port to the sunrpc sysfs info files - Add basic support for connection sharing between servers with multiple NICs` Bugfixes and Cleanups: - Sunrpc tracepoint cleanups - Disconnect after ib_post_send() errors to avoid deadlocks - Fix for tearing down rpcrdma_reps - Fix a potential pNFS layoutget livelock loop - pNFS layout barrier fixes - Fix a potential memory corruption in rpc_wake_up_queued_task_set_status() - Fix reconnection locking - Fix return value of get_srcport() - Remove rpcrdma_post_sends() - Remove pNFS dead code - Remove copy size restriction for inter-server copies - Overhaul the NFS callback service - Clean up sunrpc TCP socket shutdowns - Always provide aligned buffers to RPC read layers" * tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (39 commits) NFS: Always provide aligned buffers to the RPC read layers NFSv4.1 add network transport when session trunking is detected SUNRPC enforce creation of no more than max_connect xprts NFSv4 introduce max_connect mount options SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfs SUNRPC keep track of number of transports to unique addresses NFSv3: Delete duplicate judgement in nfs3_async_handle_jukebox SUNRPC: Tweak TCP socket shutdown in the RPC client SUNRPC: Simplify socket shutdown when not reusing TCP ports NFSv4.2: remove restriction of copy size for inter-server copy. NFS: Clean up the synopsis of callback process_op() NFS: Extract the xdr_init_encode/decode() calls from decode_compound NFS: Remove unused callback void decoder NFS: Add a private local dispatcher for NFSv4 callback operations SUNRPC: Eliminate the RQ_AUTHERR flag SUNRPC: Set rq_auth_stat in the pg_authenticate() callout SUNRPC: Add svc_rqst::rq_auth_stat SUNRPC: Add dst_port to the sysfs xprt info file SUNRPC: Add srcaddr as a file in sysfs sunrpc: Fix return value of get_srcport() ...
| * | SUNRPC enforce creation of no more than max_connect xprtsOlga Kornievskaia2021-08-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we are adding new transports via rpc_clnt_test_and_add_xprt() then check if we've reached the limit. Currently only pnfs path adds transports via that function but this is done in preparation when the client would add new transports when session trunking is detected. A warning is logged if the limit is reached. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfsOlga Kornievskaia2021-08-271-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | In sysfs's xprt_switch_info attribute also display the value of number of transports with unique destination addresses for this xprt_switch. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC keep track of number of transports to unique addressesOlga Kornievskaia2021-08-272-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, xprt_switch keeps a number of all xprts (xps_nxprts) that were added to the switch regardless of whethere it's an nconnect transport or a transport to a trunkable address. Introduce a new counter to keep track of transports to unique destination addresses per xprt_switch. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Tweak TCP socket shutdown in the RPC clientTrond Myklebust2021-08-271-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We only really need to call shutdown() if we're in the ESTABLISHED TCP state, since that is the only case where the client is initiating a close of an established connection. If the socket is in FIN_WAIT1 or FIN_WAIT2, then we've already initiated socket shutdown and are waiting for the server's reply, so do nothing. In all other cases where we've already received a FIN from the server, we should be able to just close the socket. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Simplify socket shutdown when not reusing TCP portsTrond Myklebust2021-08-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're not required to reuse the TCP port, then we can just immediately close the socket, and leave the cleanup details to the TCP layer. Fixes: e6237b6feb37 ("NFSv4.1: Don't rebind to the same source port when reconnecting to the server") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Eliminate the RQ_AUTHERR flagChuck Lever2021-08-101-20/+4
| | | | | | | | | | | | | | | | | | | | | | | | Now that there is an alternate method for returning an auth_stat value, replace the RQ_AUTHERR flag with use of that new method. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Set rq_auth_stat in the pg_authenticate() calloutChuck Lever2021-08-103-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a few moments, rq_auth_stat will need to be explicitly set to rpc_auth_ok before execution gets to the dispatcher. svc_authenticate() already sets it, but it often gets reset to rpc_autherr_badcred right after that call, even when authentication is successful. Let's ensure that the pg_authenticate callout and svc_set_client() set it properly in every case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Add svc_rqst::rq_auth_statChuck Lever2021-08-104-40/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'd like to take commit 4532608d71c8 ("SUNRPC: Clean up generic dispatcher code") even further by using only private local SVC dispatchers for all kernel RPC services. This change would enable the removal of the logic that switches between svc_generic_dispatch() and a service's private dispatcher, and simplify the invocation of the service's pc_release method so that humans can visually verify that it is always invoked properly. All that will come later. First, let's provide a better way to return authentication errors from SVC dispatcher functions. Instead of overloading the dispatch method's *statp argument, add a field to struct svc_rqst that can hold an error value. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Add dst_port to the sysfs xprt info fileAnna Schumaker2021-08-091-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | This is most likely going to be 2049 for NFS, but some servers might be configured to export on a non-standard port. Let's show this information just in case somebody needs it. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | SUNRPC: Add srcaddr as a file in sysfsAnna Schumaker2021-08-091-0/+26
| | | | | | | | | | | | | | | | | | | | | I don't support changing it right now, but it could be useful information for clients with multiple network cards. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>