summaryrefslogtreecommitdiffstats
path: root/net/sunrpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds2011-04-081-1/+1
|\ | | | | | | | | | | | | * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC NFS: Fix a signed vs. unsigned secinfo bug Revert "net/sunrpc: Use static const char arrays"
| * Revert "net/sunrpc: Use static const char arrays"Trond Myklebust2011-04-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 411b5e05617593efebc06241dbc56f42150f2abe. Olga Kornievskaia reports: Problem: linux client mounting linux server using rc4-hmac-md5 enctype. gssd fails with create a context after receiving a reply from the server. Diagnose: putting printout statements in the server kernel and kerberos libraries revealed that client and server derived different integrity keys. Server kernel code was at fault due the the commit [aglo@skydive linux-pnfs]$ git show 411b5e05617593efebc06241dbc56f42150f2abe Trond: The problem is that since it relies on virt_to_page(), you cannot call sg_set_buf() for data in the const section. Reported-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org [2.6.36+]
* | Fix common misspellingsLucas De Marchi2011-03-312-3/+3
|/ | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* NFS: Ensure that rpc_release_resources_task() can be called twice.OGAWA Hirofumi2011-03-271-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BUG: atomic_dec_and_test(): -1: atomic counter underflow at: Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1 Call Trace: [<ffffffffa02223a0>] ? put_rpccred+0x44/0x14e [sunrpc] [<ffffffffa021bbe9>] ? rpc_ping+0x4e/0x58 [sunrpc] [<ffffffffa021c4a5>] ? rpc_create+0x481/0x4fc [sunrpc] [<ffffffffa022298a>] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc] [<ffffffffa028be8c>] ? nfs_create_rpc_client+0xa6/0xeb [nfs] [<ffffffffa028c660>] ? nfs4_set_client+0xc2/0x1f9 [nfs] [<ffffffffa028cd3c>] ? nfs4_create_server+0xf2/0x2a6 [nfs] [<ffffffffa0295d07>] ? nfs4_remote_mount+0x4e/0x14a [nfs] [<ffffffff810dd570>] ? vfs_kern_mount+0x6e/0x133 [<ffffffffa029605a>] ? nfs_do_root_mount+0x76/0x95 [nfs] [<ffffffffa029643d>] ? nfs4_try_mount+0x56/0xaf [nfs] [<ffffffffa0297434>] ? nfs_get_sb+0x435/0x73c [nfs] [<ffffffff810dd59b>] ? vfs_kern_mount+0x99/0x133 [<ffffffff810dd693>] ? do_kern_mount+0x48/0xd8 [<ffffffff810f5b75>] ? do_mount+0x6da/0x741 [<ffffffff810f5c5f>] ? sys_mount+0x83/0xc0 [<ffffffff8100293b>] ? system_call_fastpath+0x16/0x1b Well, so, I think this is real bug of nfs codes somewhere. With some review, the code rpc_call_sync() rpc_run_task rpc_execute() __rpc_execute() rpc_release_task() rpc_release_resources_task() put_rpccred() <= release cred rpc_put_task rpc_do_put_task() rpc_release_resources_task() put_rpccred() <= release cred again seems to be release cred unintendedly. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'nfs-for-2.6.39' of ↵Linus Torvalds2011-03-252-0/+40
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (28 commits) Cleanup XDR parsing for LAYOUTGET, GETDEVICEINFO NFSv4.1 convert layoutcommit sync to boolean NFSv4.1 pnfs_layoutcommit_inode fixes NFS: Determine initial mount security NFS: use secinfo when crossing mountpoints NFS: Add secinfo procedure NFS: lookup supports alternate client NFS: convert call_sync() to a function NFSv4.1 remove temp code that prevented ds commits NFSv4.1: layoutcommit NFSv4.1: filelayout driver specific code for COMMIT NFSv4.1: remove GETATTR from ds commits NFSv4.1: add generic layer hooks for pnfs COMMIT NFSv4.1: alloc and free commit_buckets NFSv4.1: shift filelayout_free_lseg NFSv4.1: pull out code from nfs_commit_release NFSv4.1: pull error handling out of nfs_commit_list NFSv4.1: add callback to nfs4_commit_done NFSv4.1: rearrange nfs_commit_rpcsetup NFSv4.1: don't send COMMIT to ds for data sync writes ...
| * Merge branch 'nfs-for-2.6.39' into nfs-for-nextTrond Myklebust2011-03-241-0/+2
| |\
| | * SUNRPC: Never reuse the socket port after an xs_close()Trond Myklebust2011-03-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | If we call xs_close(), we're in one of two situations: - Autoclose, which means we don't expect to resend a request - bind+connect failed, which probably means the port is in use Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
| * | NFS: Determine initial mount securityBryan Schumaker2011-03-241-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | When sec=<something> is not presented as a mount option, we should attempt to determine what security flavor the server is using. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: use secinfo when crossing mountpointsBryan Schumaker2011-03-241-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | A submount may use different security than the parent mount does. We should figure out what sec flavor the submount uses at mount time. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2011-03-241-9/+9
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.39' of git://linux-nfs.org/~bfields/linux: SUNRPC: Remove resource leak in svc_rdma_send_error() nfsd: wrong index used in inner loop nfsd4: fix comment and remove unused nfsd4_file fields nfs41: make sure nfs server return right ca_maxresponsesize_cached nfsd: fix compile error svcrpc: fix bad argument in unix_domain_find nfsd4: fix struct file leak nfsd4: minor nfs4state.c reshuffling svcrpc: fix rare race on unix_domain creation nfsd41: modify the members value of nfsd4_op_flags nfsd: add proc file listing kernel's gss_krb5 enctypes gss:krb5 only include enctype numbers in gm_upcall_enctypes NFSD, VFS: Remove dead code in nfsd_rename() nfsd: kill unused macro definition locks: use assign_type()
| * | SUNRPC: Remove resource leak in svc_rdma_send_error()Jesper Juhl2011-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | We leak the memory allocated to 'ctxt' when we return after 'ib_dma_mapping_error()' returns !=0. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | svcrpc: fix bad argument in unix_domain_findJ. Bruce Fields2011-03-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "After merging the nfsd tree, today's linux-next build (powerpc ppc64_defconfig) produced this warning: net/sunrpc/svcauth_unix.c: In function 'unix_domain_find': net/sunrpc/svcauth_unix.c:58: warning: passing argument 1 of +'svcauth_unix_domain_release' from incompatible pointer type net/sunrpc/svcauth_unix.c:41: note: expected 'struct auth_domain *' but argument +is of type 'struct unix_domain *' Introduced by commit 8b3e07ac908d ("svcrpc: fix rare race on unix_domain creation")." Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | svcrpc: fix rare race on unix_domain creationJ. Bruce Fields2011-03-081-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that "new" here is not yet fully initialized; auth_domain_put should be called only on auth_domains that have actually been added to the hash. Before this fix, two attempts to add the same domain at once could cause the hlist_del in auth_domain_put to fail. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | gss:krb5 only include enctype numbers in gm_upcall_enctypesKevin Coffman2011-03-072-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Make the value in gm_upcall_enctypes just the enctype values. This allows the values to be used more easily elsewhere. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | Merge branch 'nfs-for-2.6.39' of ↵Linus Torvalds2011-03-178-96/+120
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits) RPC: killing RPC tasks races fixed xprt: remove redundant check SUNRPC: Convert struct rpc_xprt to use atomic_t counters SUNRPC: Ensure we always run the tk_callback before tk_action sunrpc: fix printk format warning xprt: remove redundant null check nfs: BKL is no longer needed, so remove the include NFS: Fix a warning in fs/nfs/idmap.c Cleanup: Factor out some cut-and-paste code. cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions. NFS: account direct-io into task io accounting gss:krb5 only include enctype numbers in gm_upcall_enctypes RPCRDMA: Fix FRMR registration/invalidate handling. RPCRDMA: Fix to XDR page base interpretation in marshalling logic. NFSv4: Send unmapped uid/gids to the server when using auth_sys NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr NFSv4: cleanup idmapper functions to take an nfs_server argument NFSv4: Send unmapped uid/gids to the server if the idmapper fails NFSv4: If the server sends us a numeric uid/gid then accept it NFSv4.1: reject zero layout with zeroed stripe unit ...
| * | RPC: killing RPC tasks races fixedStanislav Kinsbursky2011-03-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to NULL). Also, as Trond Myklebust mentioned, such approach (instead of checking tk_waitqueue to NULL) allows us to "optimise away the call to rpc_wake_up_queued_task() altogether for those tasks that aren't queued". Here is an example of dereferencing of tk_waitqueue equal to NULL: CPU 0 CPU 1 CPU 2 -------------------- --------------------- -------------------------- nfs4_run_open_task rpc_run_task rpc_execute rpc_set_active rpc_make_runnable (waiting) rpc_async_schedule nfs4_open_prepare nfs_wait_on_sequence nfs_umount_begin rpc_killall_tasks rpc_wake_up_task rpc_wake_up_queued_task spin_lock(tk_waitqueue == NULL) BUG() rpc_sleep_on spin_lock(&q->lock) __rpc_sleep_on task->tk_waitqueue = q Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | xprt: remove redundant checkj223yang@asset.uwaterloo.ca2011-03-171-1/+1
| | | | | | | | | | | | | | | | | | | | | remove redundant check. Signed-off-by: Jinqiu Yang <crindy646@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Convert struct rpc_xprt to use atomic_t countersTrond Myklebust2011-03-171-8/+8
| | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | SUNRPC: Ensure we always run the tk_callback before tk_actionTrond Myklebust2011-03-171-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a race in which the task->tk_callback() puts the rpc_task to sleep, setting a new callback. Under certain circumstances, the current code may end up executing the task->tk_action before it gets round to the callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
| * | sunrpc: fix printk format warningRandy Dunlap2011-03-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix printk format build warning: net/sunrpc/xprtrdma/verbs.c:1463: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | xprt: remove redundant null checkj223yang@asset.uwaterloo.ca2011-03-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | 'req' is dereferenced before checked for NULL. The patch simply removes the check. Signed-off-by: Jinqiu Yang<crindy646@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | gss:krb5 only include enctype numbers in gm_upcall_enctypesKevin Coffman2011-03-112-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Make the value in gm_upcall_enctypes just the enctype values. This allows the values to be used more easily elsewhere. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPCRDMA: Fix FRMR registration/invalidate handling.Tom Tucker2011-03-112-8/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the rpc_memreg_strategy is 5, FRMR are used to map RPC data. This mode uses an FRMR to map the RPC data, then invalidates (i.e. unregisers) the data in xprt_rdma_free. These FRMR are used across connections on the same mount, i.e. if the connection goes away on an idle timeout and reconnects later, the FRMR are not destroyed and recreated. This creates a problem for transport errors because the WR that invalidate an FRMR may be flushed (i.e. fail) leaving the FRMR valid. When the FRMR is later used to map an RPC it will fail, tearing down the transport and starting over. Over time, more and more of the FRMR pool end up in the wrong state resulting in seemingly random disconnects. This fix keeps track of the FRMR state explicitly by setting it's state based on the successful completion of a reg/inv WR. If the FRMR is ever used and found to be in the wrong state, an invalidate WR is prepended, re-syncing the FRMR state and avoiding the connection loss. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPCRDMA: Fix to XDR page base interpretation in marshalling logic.Tom Tucker2011-03-111-44/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RPCRDMA marshalling logic assumed that xdr->page_base was an offset into the first page of xdr->page_list. It is in fact an offset into the xdr->page_list itself, that is, it selects the first page in the page_list and the offset into that page. The symptom depended in part on the rpc_memreg_strategy, if it was FRMR, or some other one-shot mapping mode, the connection would get torn down on a base and bounds error. When the badly marshalled RPC was retransmitted it would reconnect, get the error, and tear down the connection again in a loop forever. This resulted in a hung-mount. For the other modes, it would result in silent data corruption. This bug is most easily reproduced by writing more data than the filesystem has space for. This fix corrects the page_base assumption and otherwise simplifies the iov mapping logic. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv4.1: filelayout async error handlerAndy Adamson2011-03-111-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use our own async error handler. Mark the layout as failed and retry i/o through the MDS on specified errors. Update the mds_offset in nfs_readpage_retry so that a failed short-read retry to a DS gets correctly resent through the MDS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC: clarify rpc_run_task error handlingFred Isaman2011-03-112-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | rpc_run_task can only fail if it is not passed in a preallocated task. However, that is not at all clear with the current code. So remove several impossible to occur failure checks. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | RPC: remove check for impossible condition in rpc_make_runnableFred Isaman2011-03-111-8/+1
| | | | | | | | | | | | | | | | | | | | | queue_work() only returns 0 or 1, never a negative value. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds2011-03-161-12/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits) bonding: enable netpoll without checking link status xfrm: Refcount destination entry on xfrm_lookup net: introduce rx_handler results and logic around that bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag bonding: wrap slave state work net: get rid of multiple bond-related netdevice->priv_flags bonding: register slave pointer for rx_handler be2net: Bump up the version number be2net: Copyright notice change. Update to Emulex instead of ServerEngines e1000e: fix kconfig for crc32 dependency netfilter ebtables: fix xt_AUDIT to work with ebtables xen network backend driver bonding: Improve syslog message at device creation time bonding: Call netif_carrier_off after register_netdevice bonding: Incorrect TX queue offset net_sched: fix ip_tos2prio xfrm: fix __xfrm_route_forward() be2net: Fix UDP packet detected status in RX compl Phonet: fix aligned-mode pipe socket buffer header reserve netxen: support for GbE port settings ... Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c with the staging updates.
| * | | net: add __rcu annotations to sk_wq and wqEric Dumazet2011-02-221-12/+20
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add proper RCU annotations/verbs to sk_wq and wq members Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access) Fix sunrpc sk_sleep() abuse too Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2011-03-161-1/+1
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix build failure introduced by s/freezeable/freezable/ workqueue: add system_freezeable_wq rds/ib: use system_wq instead of rds_ib_fmr_wq net/9p: replace p9_poll_task with a work net/9p: use system_wq instead of p9_mux_wq xfs: convert to alloc_workqueue() reiserfs: make commit_wq use the default concurrency level ocfs2: use system_wq instead of ocfs2_quota_wq ext4: convert to alloc_workqueue() scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path scsi/be2iscsi,qla2xxx: convert to alloc_workqueue() misc/iwmc3200top: use system_wq instead of dedicated workqueues i2o: use alloc_workqueue() instead of create_workqueue() acpi: kacpi*_wq don't need WQ_MEM_RECLAIM fs/aio: aio_wq isn't used in memory reclaim path input/tps6507x-ts: use system_wq instead of dedicated workqueue cpufreq: use system_wq instead of dedicated workqueues wireless/ipw2x00: use system_wq instead of dedicated workqueues arm/omap: use system_wq in mailbox workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
| * | Merge branch 'master' into for-2.6.39Tejun Heo2011-02-211-3/+1
| |\|
| * | workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUERTejun Heo2011-01-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WQ_RESCUER is now an internal flag and should only be used in the workqueue implementation proper. Use WQ_MEM_RECLAIM instead. This doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: dm-devel@redhat.com Cc: Neil Brown <neilb@suse.de>
* | | sunrpc: Propagate errors from xs_bind() through xs_create_sock()Ben Hutchings2011-03-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | xs_create_sock() is supposed to return a pointer or an ERR_PTR-encoded error, but it currently returns 0 if xs_bind() fails. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Cc: stable@kernel.org [v2.6.37] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | SUNRPC: Remove resource leak in svc_rdma_send_error()Jesper Juhl2011-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | We leak the memory allocated to 'ctxt' when we return after 'ib_dma_mapping_error()' returns !=0. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | SUNRPC: Close a race in __rpc_wait_for_completion_task()Trond Myklebust2011-03-101-14/+61
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although they run as rpciod background tasks, under normal operation (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck() and nfs4_do_close() want to be fully synchronous. This means that when we exit, we want all references to the rpc_task to be gone, and we want any dentry references etc. held by that task to be released. For this reason these functions call __rpc_wait_for_completion_task(), followed by rpc_put_task() in the expectation that the latter will be releasing the last reference to the rpc_task, and thus ensuring that the callback_ops->rpc_release() has been called synchronously. This patch fixes a race which exists due to the fact that rpciod calls rpc_complete_task() (in order to wake up the callers of __rpc_wait_for_completion_task()) and then subsequently calls rpc_put_task() without ensuring that these two steps are done atomically. In order to avoid adding new spin locks, the patch uses the existing waitqueue spin lock to order the rpc_task reference count releases between the waiting process and rpciod. The common case where nobody is waiting for completion is optimised for by checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task reference count is 1: in those cases we drop trying to grab the spin lock, and immediately free up the rpc_task. Those few processes that need to put the rpc_task from inside an asynchronous context and that do not care about ordering are given a new helper: rpc_put_task_async(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFS do not find client in NFSv4 pg_authenticateAndy Adamson2011-01-251-3/+1
|/ | | | | | | | | | The information required to find the nfs_client cooresponding to the incoming back channel request is contained in the NFS layer. Perform minimal checking in the RPC layer pg_authenticate method, and push more detailed checking into the NFS layer where the nfs_client can be found. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2011-01-1410-94/+141
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits) nfsd4: fix callback restarting nfsd: break lease on unlink, link, and rename nfsd4: break lease on nfsd setattr nfsd: don't support msnfs export option nfsd4: initialize cb_per_client nfsd4: allow restarting callbacks nfsd4: simplify nfsd4_cb_prepare nfsd4: give out delegations more quickly in 4.1 case nfsd4: add helper function to run callbacks nfsd4: make sure sequence flags are set after destroy_session nfsd4: re-probe callback on connection loss nfsd4: set sequence flag when backchannel is down nfsd4: keep finer-grained callback status rpc: allow xprt_class->setup to return a preexisting xprt rpc: keep backchannel xprt as long as server connection rpc: move sk_bc_xprt to svc_xprt nfsd4: allow backchannel recovery nfsd4: support BIND_CONN_TO_SESSION nfsd4: modify session list under cl_lock Documentation: fl_mylease no longer exists ... Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The vfs-scale work touched some msnfs cases, and this merge removes support for that entirely, so the conflict was trivial to resolve.
| * rpc: allow xprt_class->setup to return a preexisting xprtJ. Bruce Fields2011-01-112-9/+12
| | | | | | | | | | | | | | This allows us to reuse the xprt associated with a server connection if one has already been set up. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * rpc: keep backchannel xprt as long as server connectionJ. Bruce Fields2011-01-113-11/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple backchannels can share the same tcp connection; from rfc 5661 section 2.10.3.1: A connection's association with a session is not exclusive. A connection associated with the channel(s) of one session may be simultaneously associated with the channel(s) of other sessions including sessions associated with other client IDs. However, multiple backchannels share a connection, they must all share the same xid stream (hence the same rpc_xprt); the only way we have to match replies with calls at the rpc layer is using the xid. So, keep the rpc_xprt around as long as the connection lasts, in case we're asked to use the connection as a backchannel again. Requests to create new backchannel clients over a given server connection should results in creating new clients that reuse the existing rpc_xprt. But to start, just reject attempts to associate multiple rpc_xprt's with the same underlying bc_xprt. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * rpc: move sk_bc_xprt to svc_xprtJ. Bruce Fields2011-01-112-5/+7
| | | | | | | | | | | | | | This seems obviously transport-level information even if it's currently used only by the server socket code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * svcrpc: ensure cache_check caller sees updated entryJ. Bruce Fields2011-01-041-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Supposes cache_check runs simultaneously with an update on a different CPU: cache_check task doing update ^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^ 1. test for CACHE_VALID 1'. set entry->data & !CACHE_NEGATIVE 2. use entry->data 2'. set CACHE_VALID If the two memory writes performed in step 1' and 2' appear misordered with respect to the reads in step 1 and 2, then the caller could get stale data at step 2 even though it saw CACHE_VALID set on the cache entry. Add memory barriers to prevent this. Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * svcrpc: take lock on turning entry NEGATIVE in cache_checkJ. Bruce Fields2011-01-041-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We attempt to turn a cache entry negative in place. But that entry may already have been filled in by some other task since we last checked whether it was valid, so we could be modifying an already-valid entry. If nothing else there's a likely leak in such a case when the entry is eventually put() and contents are not freed because it has CACHE_NEGATIVE set. So, take the cache_lock just as sunrpc_cache_update() does. Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * svcrpc: simpler request droppingJ. Bruce Fields2011-01-042-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we use -EAGAIN returns to determine when to drop a deferred request. On its own, that is error-prone, as it makes us treat -EAGAIN returns from other functions specially to prevent inadvertent dropping. So, use a flag on the request instead. Returning an error on request deferral is still required, to prevent further processing, but we no longer need worry that an error return on its own could result in a drop. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * svcrpc: avoid double reply caused by deferral raceJ. Bruce Fields2011-01-041-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d29068c431599fa "sunrpc: Simplify cache_defer_req and related functions." asserted that cache_check() could determine success or failure of cache_defer_req() by checking the CACHE_PENDING bit. This isn't quite right. We need to know whether cache_defer_req() created a deferred request, in which case sending an rpc reply has become the responsibility of the deferred request, and it is important that we not send our own reply, resulting in two different replies to the same request. And the CACHE_PENDING bit doesn't tell us that; we could have succesfully created a deferred request at the same time as another thread cleared the CACHE_PENDING bit. So, partially revert that commit, to ensure that cache_check() returns -EAGAIN if and only if a deferred request has been created. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: NeilBrown <neilb@suse.de>
| * SUNRPC: Remove more code when NFSD_DEPRECATED is not configuredJ. Bruce Fields2011-01-041-0/+12
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de> [bfields@redhat.com: moved svcauth_unix_purge outside ifdef's.] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * svcrpc: modifying valid sunrpc cache entries is racyJ. Bruce Fields2011-01-041-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once a sunrpc cache entry is VALID, we should be replacing it (and allowing any concurrent users to destroy it on last put) instead of trying to update it in place. Otherwise someone referencing the ip_map we're modifying here could try to use the m_client just as we're putting the last reference. The bug should only be seen by users of the legacy nfsd interfaces. (Thanks to Neil for suggestion to use sunrpc_invalidate.) Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * net/sunrpc/auth_gss/gss_krb5_crypto.c: Use normal negative error value returnJoe Perches2010-12-171-1/+1
| | | | | | | | | | | | | | | | | | And remove unnecessary double semicolon too. No effect to code, as test is != 0. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * net: sunrpc: kill unused macrosShan Wei2010-12-173-5/+0
| | | | | | | | | | | | | | These macros never be used for several years. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * sunrpc: svc_sock_names should hold ref to socket being closed.NeilBrown2010-12-171-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | Currently svc_sock_names calls svc_close_xprt on a svc_sock to which it does not own a reference. As soon as svc_close_xprt sets XPT_CLOSE, the socket could be freed by a separate thread (though this is a very unlikely race). It is safer to hold a reference while calling svc_close_xprt. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * sunrpc: remove xpt_poolNeilBrown2010-12-171-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xpt_pool field is only used for reporting BUGs. And it isn't used correctly. In particular, when it is cleared in svc_xprt_received before XPT_BUSY is cleared, there is no guarantee that either the compiler or the CPU might not re-order to two assignments, just setting xpt_pool to NULL after XPT_BUSY is cleared. If a different cpu were running svc_xprt_enqueue at this moment, it might see XPT_BUSY clear and then xpt_pool non-NULL, and so BUG. This could be fixed by calling smp_mb__before_clear_bit() before the clear_bit. However as xpt_pool isn't really used, it seems safest to simply remove xpt_pool. Another alternate would be to change the clear_bit to clear_bit_unlock, and the test_and_set_bit to test_and_set_bit_lock. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>