summaryrefslogtreecommitdiffstats
path: root/fs/nfs/nfs4state.c
Commit message (Collapse)AuthorAgeFilesLines
...
* NFS: Use root's credential for lease management when keytab is missingChuck Lever2013-08-071-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 05f4c350 "NFS: Discover NFSv4 server trunking when mounting" Fri Sep 14 17:24:32 2012 introduced Uniform Client String support, which forces our NFS client to establish a client ID immediately during a mount operation rather than waiting until a user wants to open a file. Normally machine credentials (eg. from a keytab) are used to perform a mount operation that is protected by Kerberos. Before 05fc350, SETCLIENTID used a machine credential, or fell back to a regular user's credential if no keytab is available. On clients that don't have a keytab, performing SETCLIENTID early means there's no user credential to fall back on, since no regular user has kinit'd yet. 05f4c350 seems to have broken the ability to mount with sec=krb5 on clients that don't have a keytab in kernels 3.7 - 3.10. To address this regression, commit 4edaa308 (NFS: Use "krb5i" to establish NFSv4 state whenever possible), Sat Mar 16 15:56:20 2013, was merged in 3.10. This commit forces the NFS client to fall back to AUTH_SYS for lease management operations if no keytab is available. Neil Brown noticed that, since root is required to kinit to do a sec=krb5 mount when a client doesn't have a keytab, we can try to use root's Kerberos credential before AUTH_SYS. Now, when determining a principal and flavor to use for lease management, the NFS client tries in this order: 1. Flavor: AUTH_GSS, krb5i Principal: service principal (via keytab) 2. Flavor: AUTH_GSS, krb5i Principal: user principal established for UID 0 (via kinit) 3. Flavor: AUTH_SYS Principal: UID 0 / GID 0 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix return type of nfs4_end_drain_session() stubChuck Lever2013-07-231-1/+1
| | | | | | | | | Clean up: when NFSv4.1 support is compiled out, nfs4_end_drain_session() becomes a stub. Make the synopsis of the stub match the synopsis of the real version of the function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2013-07-091-15/+21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation Please note that Labeled NFS does require some additional support from the security subsystem. The relevant changesets have all been reviewed and acked by James Morris." * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits) NFS: Set NFS_CS_MIGRATION for NFSv4 mounts NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs nfs: have NFSv3 try server-specified auth flavors in turn nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it nfs: move server_authlist into nfs_try_mount_request nfs: refactor "need_mount" code out of nfs_try_mount SUNRPC: PipeFS MOUNT notification optimization for dying clients SUNRPC: split client creation routine into setup and registration SUNRPC: fix races on PipeFS UMOUNT notifications SUNRPC: fix races on PipeFS MOUNT notifications NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize NFS: Improve legacy idmapping fallback NFSv4.1 end back channel session draining NFS: Apply v4.1 capabilities to v4.2 NFSv4.1: Clean up layout segment comparison helper names NFSv4.1: layout segment comparison helpers should take 'const' parameters NFSv4: Move the DNS resolver into the NFSv4 module rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ...
| * NFSv4.1 end back channel session drainingAndy Adamson2013-06-201-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | We need to ensure that we clear NFS4_SLOT_TBL_DRAINING on the back channel when we're done recovering the session. Regression introduced by commit 774d5f14e (NFSv4.1 Fix a pNFS session draining deadlock) Signed-off-by: Andy Adamson <andros@netapp.com> [Trond: Changed order to start back-channel first. Minor code cleanup] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>=3.10]
| * NFSv4.1: Ensure that reclaim_complete uses the right credentialTrond Myklebust2013-06-061-3/+10
| | | | | | | | | | | | | | We want to use the same credential for reclaim_complete as we used for the exchange_id call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | drivers: avoid parsing names as kthread_run() format stringsKees Cook2013-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | locks: protect most of the file_lock handling with i_lockJeff Layton2013-06-291-4/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* NFSv4.1 Fix a pNFS session draining deadlockAndy Adamson2013-05-201-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | On a CB_RECALL the callback service thread flushes the inode using filemap_flush prior to scheduling the state manager thread to return the delegation. When pNFS is used and I/O has not yet gone to the data server servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async filemap_flush call, the LAYOUTGET must proceed to completion. If the state manager starts to recover data while the inode flush is sending the LAYOUTGET, a deadlock occurs as the callback service thread holds the single callback session slot until the flushing is done which blocks the state manager thread, and the state manager thread has set the session draining bit which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot table waitq. Separate the draining of the back channel from the draining of the fore channel by moving the NFS4_SESSION_DRAINING bit from session scope into the fore and back slot tables. Drain the back channel first allowing the LAYOUTGET call to proceed (and fail) so the callback service thread frees the callback slot. Then proceed with draining the forechannel. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Ensure that we free the lock stateid on the serverTrond Myklebust2013-05-061-4/+7
| | | | | | | | This ensures that the server doesn't need to keep huge numbers of lock stateids waiting around for the final CLOSE. See section 8.2.4 in RFC5661. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'bugfixes' into linux-nextTrond Myklebust2013-04-231-1/+7
|\ | | | | | | | | | | Fix up a conflict between the linux-next branch and mainline. Conflicts: fs/nfs/nfs4proc.c
| * NFSv4: Fix a memory leak in nfs4_discover_server_trunkingTrond Myklebust2013-04-051-1/+7
| | | | | | | | | | | | | | | | | | When we assign a new rpc_client to clp->cl_rpcclient, we need to destroy the old one. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org [>=3.7]
* | Merge branch 'rpcsec_gss-from_cel' into linux-nextTrond Myklebust2013-04-231-49/+11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * rpcsec_gss-from_cel: (21 commits) NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE NFSv4: Don't clear the machine cred when client establish returns EACCES NFSv4: Fix issues in nfs4_discover_server_trunking NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available NFS: Use server-recommended security flavor by default (NFSv3) SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR NFS: Use "krb5i" to establish NFSv4 state whenever possible NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC NFS: Use static list of security flavors during root FH lookup recovery NFS: Avoid PUTROOTFH when managing leases NFS: Clean up nfs4_proc_get_rootfh NFS: Handle missing rpc.gssd when looking up root FH SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch SUNRPC: Make gss_mech_get() static SUNRPC: Refactor nfsd4_do_encode_secinfo() SUNRPC: Consider qop when looking up pseudoflavors SUNRPC: Load GSS kernel module by OID SUNRPC: Introduce rpcauth_get_pseudoflavor() SUNRPC: Define rpcsec_gss_info structure NFS: Remove unneeded forward declaration ...
| * | NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONEChuck Lever2013-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently I changed the SETCLIENTID code to use AUTH_GSS(krb5i), and then retry with AUTH_NONE if that didn't work. This was to enable Kerberos NFS mounts to work without forcing Linux NFS clients to have a keytab on hand. Rick Macklem reports that the FreeBSD server accepts AUTH_NONE only for NULL operations (thus certainly not for SETCLIENTID). Falling back to AUTH_NONE means our proposed 3.10 NFS client will not interoperate with FreeBSD servers over NFSv4 unless Kerberos is fully configured on both ends. If the Linux client falls back to using AUTH_SYS instead for SETCLIENTID, all should work fine as long as the NFS server is configured to allow AUTH_SYS for SETCLIENTID. This may still prevent access to Kerberos-only FreeBSD servers by Linux clients with no keytab. Rick is of the opinion that the security settings the server applies to its pseudo-fs should also apply to the SETCLIENTID operation. Linux and Solaris NFS servers do not place that limitation on SETCLIENTID. The security settings for the server's pseudo-fs are determined automatically as the union of security flavors allowed on real exports, as recommended by RFC 3530bis; and the flavors allowed for SETCLIENTID are all flavors supported by the respective server implementation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFSv4: Don't clear the machine cred when client establish returns EACCESTrond Myklebust2013-04-051-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | The expected behaviour is that the client will decide at mount time whether or not to use a krb5i machine cred, or AUTH_NULL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com>
| * | NFSv4: Fix issues in nfs4_discover_server_trunkingTrond Myklebust2013-04-051-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | - Ensure that we exit with ENOENT if the call to ops->get_clid_cred() fails. - Handle the case where ops->detect_trunking() exits with an unexpected error, and return EIO. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | NFS: Use "krb5i" to establish NFSv4 state whenever possibleChuck Lever2013-03-291-32/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently our client uses AUTH_UNIX for state management on Kerberos NFS mounts in some cases. For example, if the first mount of a server specifies "sec=sys," the SETCLIENTID operation is performed with AUTH_UNIX. Subsequent mounts using stronger security flavors can not change the flavor used for lease establishment. This might be less security than an administrator was expecting. Dave Noveck's migration issues draft recommends the use of an integrity-protecting security flavor for the SETCLIENTID operation. Let's ignore the mount's sec= setting and use krb5i as the default security flavor for SETCLIENTID. If our client can't establish a GSS context (eg. because it doesn't have a keytab or the server doesn't support Kerberos) we fall back to using AUTH_NULL. For an operation that requires a machine credential (which never represents a particular user) AUTH_NULL is as secure as AUTH_UNIX. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | NFSv4: Use the open stateid if the delegation has the wrong modeTrond Myklebust2013-04-201-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | Fix nfs4_select_rw_stateid() so that it chooses the open stateid (or an all-zero stateid) if the delegation does not match the selected read/write mode. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | NFSv4: Handle timeouts correctly when probing for lease validityTrond Myklebust2013-04-081-0/+4
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | When we send a RENEW or SEQUENCE operation in order to probe if the lease is still valid, we want it to be able to time out since the lease we are probing is likely to time out too. Currently, because we use soft mount semantics for these RPC calls, the return value is EIO, which causes the state manager to exit with an "unhandled error" message. This patch changes the call semantics, so that the RPC layer returns ETIMEDOUT instead of EIO. We then have the state manager default to a simple retry instead of exiting. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Fix another reboot recovery raceTrond Myklebust2013-03-281-0/+2
| | | | | | | | | | | | | | | | | | | | If the open_context for the file is not yet fully initialised, then open recovery cannot succeed, and since nfs4_state_find_open_context returns an ENOENT, we end up treating the file as being irrecoverable. What we really want to do, is just defer the recovery until later. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4.1: Select the "most recent locking state" for read/write/setattr stateidsTrond Myklebust2013-03-251-0/+2
| | | | | | | | | | | | | | | | Follow the practice described in section 8.2.2 of RFC5661: When sending a read/write or setattr stateid, set the seqid field to zero in order to signal that the NFS server should apply the most recent locking state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Resend the READ/WRITE RPC call if a stateid change causes an errorTrond Myklebust2013-03-251-9/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | Adds logic to ensure that if the server returns a BAD_STATEID, or other state related error, then we check if the stateid has already changed. If it has, then rather than start state recovery, we should just resend the failed RPC call with the new stateid. Allow nfs4_select_rw_stateid to notify that the stateid is unstable by having it return -EWOULDBLOCK if an RPC is underway that might change the stateid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFS: Don't accept more reads/writes if the open context recovery failedTrond Myklebust2013-03-251-0/+16
| | | | | | | | | | | | | | If the state recovery failed, we want to ensure that the application doesn't try to use the same file descriptor for more reads or writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | NFSv4: Fail I/O if the state recovery fails irrevocablyTrond Myklebust2013-03-251-5/+14
|/ | | | | | | | | | | If state recovery fails with an ESTALE or a ENOENT, then we shouldn't keep retrying. Instead, mark the stateid as being invalid and fail the I/O with an EIO error. For other operations such as POSIX and BSD file locking, truncate etc, fail with an EBADF to indicate that this file descriptor is no longer valid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure delegation recall and byte range lock removal don't conflictTrond Myklebust2013-02-111-0/+1
| | | | | | | | | | Add a mutex to the struct nfs4_state_owner to ensure that delegation recall doesn't conflict with byte range lock removal. Note that we nest the new mutex _outside_ the state manager reclaim protection (nfsi->rwsem) in order to avoid deadlocks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Allow the state manager to mark an open_owner as being recoveredTrond Myklebust2013-02-111-1/+9
| | | | | | | | | | This patch adds a seqcount_t lock for use by the state manager to signal that an open owner has been recovered. This mechanism will be used by the delegation, open and byte range lock code in order to figure out if they need to replay requests due to collisions with lock recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 sessionTrond Myklebust2013-01-301-2/+12
| | | | | | | | | | | | NFS4ERR_DELAY is a legal reply when we call DESTROY_SESSION. It usually means that the server is busy handling an unfinished RPC request. Just sleep for a second and then retry. We also need to be able to handle the NFS4ERR_BACK_CHAN_BUSY return value. If the NFS server has outstanding callbacks, we just want to similarly sleep & retry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* NFSv4: Fix NFSv4 trunking discoveryTrond Myklebust2013-01-271-6/+2
| | | | | | | | | | | | | | If walking the list in nfs4[01]_walk_client_list fails, then the most likely explanation is that the server dropped the clientid before we actually managed to confirm it. As long as our nfs_client is the very last one in the list to be tested, the caller can be assured that this is the case when the final return value is NFS4ERR_STALE_CLIENTID. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org [>=3.7] Tested-by: Ben Greear <greearb@candelatech.com>
* nfs: Remove unused list nfs4_clientid_listYanchuan Nian2012-12-131-1/+0
| | | | | | | | | This list was designed to store struct nfs4_client in the client side. But nfs4_client was obsolete and has been removed from the source code. So remove the unused list. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* SUNRPC handle EKEYEXPIRED in call_refreshresultAndy Adamson2012-12-121-23/+0
| | | | | | | | | | | | | | | | | | Currently, when an RPCSEC_GSS context has expired or is non-existent and the users (Kerberos) credentials have also expired or are non-existent, the client receives the -EKEYEXPIRED error and tries to refresh the context forever. If an application is performing I/O, or other work against the share, the application hangs, and the user is not prompted to refresh/establish their credentials. This can result in a denial of service for other users. Users are expected to manage their Kerberos credential lifetimes to mitigate this issue. Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number of times to refresh the gss_context, and then return -EACCES to the application. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Ensure smooth handover of slots from one task to the next waitingTrond Myklebust2012-12-061-5/+1
| | | | | | | | Currently, we see a lot of bouncing for the value of highest_used_slotid due to the fact that slots are getting freed, instead of getting instantly transmitted to the next waiting task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove the 'FIFO' behaviour for nfs41_setup_sequenceTrond Myklebust2012-12-061-3/+1
| | | | | | | | It is more important to preserve the task priority behaviour, which ensures that things like reclaim writes take precedence over background and kupdate writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Ping server when our session table limits are too highTrond Myklebust2012-12-061-0/+5
| | | | | | | | | If the server requests a lower target_highest_slotid, then ensure that we ping it with at least one RPC call containing an appropriate SEQUENCE op. This ensures that the server won't need to send a recall callback in order to shrink the slot table. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Move slot table and session struct definitions to nfs4session.hTrond Myklebust2012-12-061-0/+1
| | | | | | Clean up. Gather NFSv4.1 slot definitions in fs/nfs/nfs4session.h. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Move nfs4_wait_clnt_recover and nfs4_client_recover_expired_leaseTrond Myklebust2012-12-061-0/+34
| | | | | | | | nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease are both generic state related functions. As such, they belong in nfs4state.c, and not nfs4proc.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Clean up session drainingTrond Myklebust2012-12-061-0/+10
| | | | | | | | | Coalesce nfs4_check_drain_bc_complete and nfs4_check_drain_fc_complete into a single function that can be called when the slot table is known to be empty, then change nfs4_callback_free_slot() and nfs4_free_slot() to use it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: CB_RECALL_SLOT must schedule a sequence op after updating targetsTrond Myklebust2012-12-061-0/+12
| | | | | | | | | | RFC5661 requires us to make sure that the server knows we've updated our slot table size by sending at least one SEQUENCE op containing the new 'highest_slotid' value. We can do so using the 'CHECK_LEASE' functionality of the state manager. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove the state manager code to resize the slot tableTrond Myklebust2012-12-061-33/+0
| | | | | | | | The state manager no longer needs any special machinery to stop the session flow and resize the slot table. It is all done on the fly by the SEQUENCE op code now. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Allow SEQUENCE to resize the slot table on the flyTrond Myklebust2012-12-061-18/+4
| | | | | | | Instead of an array of slots, use a singly linked list of slots that can be dynamically appended to or shrunk. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Support dynamic resizing of the session slot tableTrond Myklebust2012-12-061-3/+3
| | | | | | | | Allow the server to control the size of the session slot table by adjusting the value of sr_target_max_slots in the reply to the SEQUENCE operation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Ensure that the client tracks the server target_highest_slotidTrond Myklebust2012-12-061-4/+3
| | | | | | | | | | | | | | | | | Dynamic slot allocation in NFSv4.1 depends on the client being able to track the server's target value for the highest slotid in the slot table. See the reference in Section 2.10.6.1 of RFC5661. To avoid ordering problems in the case where 2 SEQUENCE replies contain conflicting updates to this target value, we also introduce a generation counter, to track whether or not an RPC containing a SEQUENCE operation was launched before or after the last update. Also rename the nfs4_slot_table target_max_slots field to 'target_highest_slotid' to avoid confusion with a slot table size or number of slots. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Shrink struct nfs4_sequence_res by moving the session pointerTrond Myklebust2012-11-261-1/+1
| | | | | | | Move the session pointer into the slot table, then have struct nfs4_slot point to that slot table. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: clean up nfs4_recall_slot to use nfs4_alloc_slotsTrond Myklebust2012-11-211-2/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Handle session reset and bind_conn_to_session before lease checkTrond Myklebust2012-11-211-9/+8
| | | | | | | | We can't send a SEQUENCE op unless the session is OK, so it is pointless to handle the CHECK_LEASE state before we've dealt with SESSION_RESET and BIND_CONN_TO_SESSION. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Get rid of unnecessary BUG_ON()sTrond Myklebust2012-11-041-1/+0
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: don't do two EXCHANGE_IDs on mountWeston Andros Adamson2012-10-021-0/+1
| | | | | | | | | | | | Since the addition of NFSv4 server trunking detection the mount context calls nfs4_proc_exchange_id then schedules the state manager, which also calls nfs4_proc_exchange_id. Setting the NFS4CLNT_LEASE_CONFIRM bit makes the state manager skip the unneeded EXCHANGE_ID and continue on with session creation. Reported-by: Jorge Mora <mora@netapp.com> Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.0 reclaim reboot state when re-establishing clientidAndy Adamson2012-10-011-2/+6
| | | | | | | We should reclaim reboot state when the clientid is stale. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Fix up a merge conflict between migration and container changesTrond Myklebust2012-10-011-2/+3
| | | | | | nfs_callback_tcpport is now per-net_namespace. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Discover NFSv4 server trunking when mountingChuck Lever2012-10-011-1/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "Server trunking" is a fancy named for a multi-homed NFS server. Trunking might occur if a client sends NFS requests for a single workload to multiple network interfaces on the same server. There are some implications for NFSv4 state management that make it useful for a client to know if a single NFSv4 server instance is multi-homed. (Note this is only a consideration for NFSv4, not for legacy versions of NFS, which are stateless). If a client cares about server trunking, no NFSv4 operations can proceed until that client determines who it is talking to. Thus server IP trunking discovery must be done when the client first encounters an unfamiliar server IP address. The nfs_get_client() function walks the nfs_client_list and matches on server IP address. The outcome of that walk tells us immediately if we have an unfamiliar server IP address. It invokes nfs_init_client() in this case. Thus, nfs4_init_client() is a good spot to perform trunking discovery. Discovery requires a client to establish a fresh client ID, so our client will now send SETCLIENTID or EXCHANGE_ID as the first NFS operation after a successful ping, rather than waiting for an application to perform an operation that requires NFSv4 state. The exact process for detecting trunking is different for NFSv4.0 and NFSv4.1, so a minorversion-specific init_client callout method is introduced. CLID_INUSE recovery is important for the trunking discovery process. CLID_INUSE is a sign the server recognizes the client's nfs_client_id4 id string, but the client is using the wrong principal this time for the SETCLIENTID operation. The SETCLIENTID must be retried with a series of different principals until one works, and then the rest of trunking discovery can proceed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Slow down state manager after an unhandled errorChuck Lever2012-10-011-0/+1
| | | | | | | | | | | | | | | | | | If the state manager thread is not actually able to fully recover from some situation, it wakes up waiters, who kick off a new state manager thread. Quite often the fresh invocation of the state manager is just as successful. This results in a livelock as the client dumps thousands of NFS requests a second on the network in a vain attempt to recover. Not very friendly. To mitigate this situation, add a delay in the state manager after an unhandled error, so that the client sends just a few requests every second in this case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: make nfs_callback_tcpport6 per network contextStanislav Kinsbursky2012-10-011-1/+1
| | | | | Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>