summaryrefslogtreecommitdiffstats
path: root/fs/nfs/nfs4proc.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2018-01-301-10/+32
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: Stable bugfixes: - Fix breakages in the nfsstat utility due to the inclusion of the NFSv4 LOOKUPP operation - Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to nfs_idmap_legacy_upcall() being called without an 'aux' parameter - Fix a refcount leak in the standard O_DIRECT error path - Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path - Fix CPU latency issues with nfs_commit_release_pages() - Fix the LAYOUTUNAVAILABLE error case in the file layout type - NFS: Fix a race between mmap() and O_DIRECT Features: - Support the statx() mask and query flags to enable optimisations when the user is requesting only attributes that are already up to date in the inode cache, or is specifying the AT_STATX_DONT_SYNC flag - Add a module alias for the SCSI pNFS layout type Bugfixes: - Automounting when resolving a NFSv4 referral should preserve the RDMA transport protocol settings - Various other RDMA bugfixes from Chuck - pNFS block layout fixes - Always set NFS_LOCK_LOST when a lock is lost" * tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (69 commits) NFS: Fix a race between mmap() and O_DIRECT NFS: Remove a redundant call to unmap_mapping_range() pnfs/blocklayout: Ensure disk address in block device map pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors lockd: Fix server refcounting SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint SUNRPC: Micro-optimize __rpc_execute SUNRPC: task_run_action should display tk_callback sunrpc: Format RPC events consistently for display SUNRPC: Trace xprt_timer events xprtrdma: Correct some documenting comments xprtrdma: Fix "bytes registered" accounting xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects xprtrdma: Add trace points to instrument QP and CQ access upcalls xprtrdma: Add trace points in the client-side backchannel code paths xprtrdma: Add trace points for connect events xprtrdma: Add trace points to instrument MR allocation and recovery xprtrdma: Add trace points to instrument memory invalidation xprtrdma: Add trace points in reply decoder path xprtrdma: Add trace points to instrument memory registration ..
| * NFSv4: always set NFS_LOCK_LOST when a lock is lost.NeilBrown2018-01-141-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are 2 comments in the NFSv4 code which suggest that SIGLOST should possibly be sent to a process. In these cases a lock has been lost. The current practice is to set NFS_LOCK_LOST so that read/write returns EIO when a lock is lost. So change these comments to code when sets NFS_LOCK_LOST. One case is when lock recovery after apparent server restart fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or NFS4ERRO_RECLAIM_CONFLICT. The other case is when a lock attempt as part of lease recovery fails with NFS4ERR_DENIED. In an ideal world, these should not happen. However I have a packet trace showing an NFSv4.1 session getting NFS4ERR_BADSESSION after an extended network parition. The NFSv4.1 client treats this like server reboot until/unless it get NFS4ERR_NO_GRACE, in which case it switches over to "nograce" recovery mode. In this network trace, the client attempts to recover a lock and the server (incorrectly) reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE. This leads to the ineffective comment and the client then continues to write using the OPEN stateid. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4: Convert LOCKU to use nfs4_async_handle_exception()Trond Myklebust2018-01-141-2/+8
| | | | | | | | | | | | | | Convert CLOSE so that it specifies the correct stateid and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4: Convert DELEGRETURN to use nfs4_handle_exception()Trond Myklebust2018-01-141-3/+8
| | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4: Convert CLOSE to use nfs4_async_handle_exception()Trond Myklebust2018-01-141-1/+8
| | | | | | | | | | | | | | Convert CLOSE so that it specifies the correct stateid, state and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | nfs: convert to new i_version APIJeff Layton2018-01-291-4/+6
|/ | | | | | | | | | For NFS, we just use the "raw" API since the i_version is mostly managed by the server. The exception there is when the client holds a write delegation, but we only need to bump it once there anyway to handle CB_GETATTR. Tested-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Jeff Layton <jlayton@redhat.com>
* NFS: Revert "NFS: Move the flock open mode check into nfs_flock()"Benjamin Coddington2017-11-171-0/+14
| | | | | | | | | | Commit e12937279c8b "NFS: Move the flock open mode check into nfs_flock()" changed NFSv3 behavior for flock() such that the open mode must match the lock type, however that requirement shouldn't be enforced for flock(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # v4.12 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* nfs: Fix ugly referral attributesChuck Lever2017-11-171-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Before traversing a referral and performing a mount, the mounted-on directory looks strange: dr-xr-xr-x. 2 4294967294 4294967294 0 Dec 31 1969 dir.0 nfs4_get_referral is wiping out any cached attributes with what was returned via GETATTR(fs_locations), but the bit mask for that operation does not request any file attributes. Retrieve owner and timestamp information so that the memcpy in nfs4_get_referral fills in more attributes. Changes since v1: - Don't request attributes that the client unconditionally replaces - Request only MOUNTED_ON_FILEID or FILEID attribute, not both - encode_fs_locations() doesn't use the third bitmask word Fixes: 6b97fd3da1ea ("NFSv4: Follow a referral") Suggested-by: Pradeep Thomas <pradeepthomas@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: nfs_set_open_stateid must not trigger state recovery for closed stateTrond Myklebust2017-11-171-1/+2
| | | | | | | | In nfs_set_open_stateid_locked, we must ignore stateids from closed state. Reported-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Clean up nfs4_delegreturn_doneTrond Myklebust2017-11-171-18/+14
| | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: cleanup nfs4_close_doneTrond Myklebust2017-11-171-19/+16
| | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Retry NFS4ERR_OLD_STATEID errors in layoutreturnTrond Myklebust2017-11-171-3/+12
| | | | | | | | If our layoutreturn returns an NFS4ERR_OLD_STATEID, then try to update the stateid and retry. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* pNFS: Retry NFS4ERR_OLD_STATEID errors in layoutreturn-on-closeTrond Myklebust2017-11-171-2/+16
| | | | | | | | | If our layoutreturn on close operation returns an NFS4ERR_OLD_STATEID, then try to update the stateid and retry. We know that there should be no further LAYOUTGET requests being launched. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Don't try to CLOSE if the stateid 'other' field has changedTrond Myklebust2017-11-171-10/+4
| | | | | | | | | | If the stateid is no longer recognised on the server, either due to a restart, or due to a competing CLOSE call, then we do not have to retry. Any open contexts that triggered a reopen of the file, will also act as triggers for any CLOSE for the updated stateids. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Retry CLOSE and DELEGRETURN on NFS4ERR_OLD_STATEID.Trond Myklebust2017-11-171-2/+19
| | | | | | | | | If we're racing with an OPEN, then retry the operation instead of declaring it a success. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [Andrew W Elble: Fix a typo in nfs4_refresh_open_stateid] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Fix open create exclusive when the server rebootsTrond Myklebust2017-11-171-15/+26
| | | | | | | | | | | | | | | | If the server that does not implement NFSv4.1 persistent session semantics reboots while we are performing an exclusive create, then the return value of NFS4ERR_DELAY when we replay the open during the grace period causes us to lose the verifier. When the grace period expires, and we present a new verifier, the server will then correctly reply NFS4ERR_EXIST. This commit ensures that we always present the same verifier when replaying the OPEN. Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Add a tracepoint to document open stateid updatesTrond Myklebust2017-11-171-0/+3
| | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4: Fix OPEN / CLOSE raceTrond Myklebust2017-11-171-35/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ben Coddington has noted the following race between OPEN and CLOSE on a single client. Process 1 Process 2 Server ========= ========= ====== 1) OPEN file 2) OPEN file 3) Process OPEN (1) seqid=1 4) Process OPEN (2) seqid=2 5) Reply OPEN (2) 6) Receive reply (2) 7) new stateid, seqid=2 8) CLOSE file, using stateid w/ seqid=2 9) Reply OPEN (1) 10( Process CLOSE (8) 11) Reply CLOSE (8) 12) Forget stateid file closed 13) Receive reply (7) 14) Forget stateid file closed. 15) Receive reply (1). 16) New stateid seqid=1 is really the same stateid that was closed. IOW: the reply to the first OPEN is delayed. Since "Process 2" does not wait before closing the file, and it does not cache the closed stateid, then when the delayed reply is finally received, it is treated as setting up a new stateid by the client. The fix is to ensure that the client processes the OPEN and CLOSE calls in the same order in which the server processed them. This commit ensures that we examine the seqid of the stateid returned by OPEN. If it is a new stateid, we assume the seqid must be equal to the value 1, and that each state transition increments the seqid value by 1 (See RFC7530, Section 9.1.4.2, and RFC5661, Section 8.2.2). If the tracker sees that an OPEN returns with a seqid that is greater than the cached seqid + 1, then it bumps a flag to ensure that the caller waits for the RPCs carrying the missing seqids to complete. Note that there can still be pathologies where the server crashes before it can even send us the missing seqids. Since the OPEN call is still holding a slot when it waits here, that could cause the recovery to stall forever. To avoid that, we time out after a 5 second wait. Reported-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* fs, nfs: convert nfs_client.cl_count from atomic_t to refcount_tElena Reshetova2017-11-171-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nfs_client.cl_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* fs, nfs: convert nfs4_lock_state.ls_count from atomic_t to refcount_tElena Reshetova2017-11-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nfs4_lock_state.ls_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4.1: Fix up replays of interrupted requestsTrond Myklebust2017-11-171-46/+102
| | | | | | | | | | | | | | | | | | | | If the previous request on a slot was interrupted before it was processed by the server, then our slot sequence number may be out of whack, and so we try the next operation using the old sequence number. The problem with this, is that not all servers check to see that the client is replaying the same operations as previously when they decide to go to the replay cache, and so instead of the expected error of NFS4ERR_SEQ_FALSE_RETRY, we get a replay of the old reply, which could (if the operations match up) be mistaken by the client for a new reply. To fix this, we attempt to send a COMPOUND containing only the SEQUENCE op in order to resync our slot sequence number. Cc: Olga Kornievskaia <olga.kornievskaia@gmail.com> [olga.kornievskaia@gmail.com: fix an Oops] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFS: Don't compare apples to elephants to determine access bitsAnna Schumaker2017-10-161-18/+1
| | | | | | | | | | The NFS_ACCESS_* flags aren't a 1:1 mapping to the MAY_* flags, so checking for MAY_WHATEVER might have surprising results in nfs*_proc_access(). Let's simplify this check when determining which bits to ask for, and do it in a generic place instead of copying code for each NFS version. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* NFSv4/pnfs: Fix an infinite layoutget loopTrond Myklebust2017-10-041-2/+1
| | | | | | | | | | | | | Since we can now use a lock stateid or a delegation stateid, that differs from the context stateid, we need to change the test in nfs4_layoutget_handle_exception() to take this into account. This fixes an infinite layoutget loop in the NFS client whereby it keeps retrying the initial layoutget using the same broken stateid. Fixes: 70d2f7b1ea19b ("pNFS: Use the standard I/O stateid when...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFS: remove jiffies field from access cacheNeilBrown2017-09-061-1/+0
| | | | | | | | This field hasn't been used since commit 57b691819ee2 ("NFS: Cache access checks more aggressively"). Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* Merge branch 'open_state'Trond Myklebust2017-08-151-22/+48
|\
| * NFSv4: Use the nfs4_state being recovered in _nfs4_opendata_to_nfs4_state()Trond Myklebust2017-08-131-16/+25
| | | | | | | | | | | | | | If we're recovering a nfs4_state, then we should try to use that instead of looking up a new stateid. Only do that if the inodes match, though. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state()Trond Myklebust2017-08-131-5/+23
| | | | | | | | | | | | | | When doing open by filehandle we don't really want to lookup a new inode, but rather update the one we've got. Add a helper which does this for us. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * Merge tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2017-08-111-2/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client fixes from Anna Schumaker: "A few more NFS client bugfixes from me for rc5. Dros has a stable fix for flexfiles to prevent leaking the nfs4_ff_ds_version arrays when freeing a layout, Trond fixed a potential recovery loop situation with the TEST_STATEID operation, and Christoph fixed up the pNFS blocklayout Kconfig options to prevent unsafe use with kernels that don't have large block device support. Summary: Stable fix: - fix leaking nfs4_ff_ds_version array Other fixes: - improve TEST_STATEID OLD_STATEID handling to prevent recovery loop - require 64-bit sector_t for pNFS blocklayout to prevent 32-bit compile errors" * tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfs: pnfs/blocklayout: require 64-bit sector_t NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid() nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
| | * NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid()Trond Myklebust2017-08-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | If the call to TEST_STATEID returns NFS4ERR_OLD_STATEID, then it just means we raced with other calls to OPEN. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | | NFSv4: Cleanup setting of the migration flags.Trond Myklebust2017-08-071-6/+3
| | | | | | | | | | | | Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode()Trond Myklebust2017-08-071-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | If the server changes, so that it no longer supports SP4_MACH_CRED, or that it doesn't support the same set of SP4_MACH_CRED functionality, then we want to ensure that we clear the unsupported flags. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | NFSv4: Refactor _nfs4_proc_exchange_id()Trond Myklebust2017-08-071-86/+85
|/ / | | | | | | | | | | | | | | | | Tease apart the functionality in nfs4_exchange_id_done() so that it is easier to debug exchange id vs trunking issues by moving all the processing out of nfs4_exchange_id_done() and into the callers. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | Merge tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2017-08-021-20/+7
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client fixes from Anna Schumaker: "Two fixes from Trond this time, now that he's back from his vacation. The first is a stable fix for the EXCHANGE_ID issue on the mailing list, and the other fixes a double-free situation that he found at the same time. Stable fix: - Fix EXCHANGE_ID corrupt verifier issue Other fix: - Fix double frees in nfs4_test_session_trunk()" * tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4: Fix double frees in nfs4_test_session_trunk() NFSv4: Fix EXCHANGE_ID corrupt verifier issue
| * NFSv4: Fix double frees in nfs4_test_session_trunk()Trond Myklebust2017-08-021-13/+3
| | | | | | | | | | | | | | | | | | rpc_clnt_add_xprt() expects the callback function to be synchronous, and expects to release the transport and switch references itself. Fixes: 04fa2c6bb51b1 ("NFS pnfs data server multipath session trunking") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFSv4: Fix EXCHANGE_ID corrupt verifier issueTrond Myklebust2017-08-011-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | The verifier is allocated on the stack, but the EXCHANGE_ID RPC call was changed to be asynchronous by commit 8d89bd70bc939. If we interrrupt the call to rpc_wait_for_completion_task(), we can therefore end up transmitting random stack contents in lieu of the verifier. Fixes: 8d89bd70bc939 ("NFS setup async exchange_id") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | Merge tag 'nfs-for-4.13-3' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2017-07-281-5/+9
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client fixes from Anna Schumaker: "More NFS client bugfixes for 4.13. Most of these fix locking bugs that Ben and Neil noticed, but I also have a patch to fix one more access bug that was reported after last week. Stable fixes: - Fix a race where CB_NOTIFY_LOCK fails to wake a waiter - Invalidate file size when taking a lock to prevent corruption Other fixes: - Don't excessively generate tiny writes with fallocate - Use the raw NFS access mask in nfs4_opendata_access()" * tag 'nfs-for-4.13-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter NFS: Optimize fallocate by refreshing mapping when needed. NFS: invalidate file size when taking a lock. NFS: Use raw NFS access mask in nfs4_opendata_access()
| * NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiterBenjamin Coddington2017-07-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nfs4_retry_setlk() sets the task's state to TASK_INTERRUPTIBLE within the same region protected by the wait_queue's lock after checking for a notification from CB_NOTIFY_LOCK callback. However, after releasing that lock, a wakeup for that task may race in before the call to freezable_schedule_timeout_interruptible() and set TASK_WAKING, then freezable_schedule_timeout_interruptible() will set the state back to TASK_INTERRUPTIBLE before the task will sleep. The result is that the task will sleep for the entire duration of the timeout. Since we've already set TASK_INTERRUPTIBLE in the locked section, just use freezable_schedule_timout() instead. Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFS: Use raw NFS access mask in nfs4_opendata_access()Anna Schumaker2017-07-261-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") changed how the access results are stored after an access() call. An NFS v4 OPEN might have access bits returned with the opendata, so we should use the NFS4_ACCESS values when determining the return value in nfs4_opendata_access(). Fixes: bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Tested-by: Takashi Iwai <tiwai@suse.de>
* | Merge tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2017-07-131-29/+80
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Anna Schumaker: "Stable bugfixes: - Fix -EACCESS on commit to DS handling - Fix initialization of nfs_page_array->npages - Only invalidate dentries that are actually invalid Features: - Enable NFSoRDMA transparent state migration - Add support for lookup-by-filehandle - Add support for nfs re-exporting Other bugfixes and cleanups: - Christoph cleaned up the way we declare NFS operations - Clean up various internal structures - Various cleanups to commits - Various improvements to error handling - Set the dt_type of . and .. entries in NFS v4 - Make slot allocation more reliable - Fix fscache stat printing - Fix uninitialized variable warnings - Fix potential list overrun in nfs_atomic_open() - Fix a race in NFSoRDMA RPC reply handler - Fix return size for nfs42_proc_copy() - Fix against MAC forgery timing attacks" * tag 'nfs-for-4.13-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (68 commits) NFS: Don't run wake_up_bit() when nobody is waiting... nfs: add export operations nfs4: add NFSv4 LOOKUPP handlers nfs: add a nfs_ilookup helper nfs: replace d_add with d_splice_alias in atomic_open sunrpc: use constant time memory comparison for mac NFSv4.2 fix size storage for nfs42_proc_copy xprtrdma: Fix documenting comments in frwr_ops.c xprtrdma: Replace PAGE_MASK with offset_in_page() xprtrdma: FMR does not need list_del_init() xprtrdma: Demote "connect" log messages NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration xprtrdma: Don't defer MR recovery if ro_map fails xprtrdma: Fix FRWR invalidation error recovery xprtrdma: Fix client lock-up after application signal fires xprtrdma: Rename rpcrdma_req::rl_free xprtrdma: Pass only the list of registered MRs to ro_unmap_sync xprtrdma: Pre-mark remotely invalidated MRs xprtrdma: On invalidation failure, remove MWs from rl_registered ...
| * nfs4: add NFSv4 LOOKUPP handlersJeff Layton2017-07-131-0/+49
| | | | | | | | | | | | | | | | This will be needed in order to implement the get_parent export op for nfsd. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFSv4.1: Use seqid returned by EXCHANGE_ID after state migrationChuck Lever2017-07-131-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Transparent State Migration copies a client's lease state from the server where a filesystem used to reside to the server where it now resides. When an NFSv4.1 client first contacts that destination server, it uses EXCHANGE_ID to detect trunking relationships. The lease that was copied there is returned to that client, but the destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to the client. This is because the lease was confirmed on the source server (before it was copied). When CONFIRMED_R is set, the client throws away the sequence ID returned by the server. During a Transparent State Migration, however there's no other way for the client to know what sequence ID to use with a lease that's been migrated. Therefore, the client must save and use the contrived slot sequence value returned by the destination server even when CONFIRMED_R is set. Note that some servers always return a seqid of 1 after a migration. Reported-by: Xuan Qi <xuan.qi@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Xuan Qi <xuan.qi@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFS: convert flags to boolBenjamin Coddington2017-07-131-21/+21
| | | | | | | | | | | | | | | | | | NFS uses some int, and unsigned int :1, and bool as flags in structs and args. Assert the preference for uniformly replacing these with the bool type. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * NFS: Set FATTR4_WORD0_TYPE for . and .. entriesAnna Schumaker2017-07-131-4/+7
| | | | | | | | | | | | | | | | The current code worked okay for getdents(), but getdents64() expects the d_type field to get filled out properly in the stat structure. Setting this field fixes xfstests generic/401. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2017-07-031-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
| * | sched/wait: Rename wait_queue_t => wait_queue_entry_tIngo Molnar2017-06-201-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename: wait_queue_t => wait_queue_entry_t 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | NFSv4.1: Fix a race in nfs4_proc_layoutgetTrond Myklebust2017-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | If the task calling layoutget is signalled, then it is possible for the calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race, in which case we leak a slot. The fix is to move the call to nfs4_sequence_free_slot() into the nfs4_layoutget_release() so that it gets called at task teardown time. Fixes: 2e80dbe7ac51 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umaskBenjamin Coddington2017-06-051-1/+2
|/ | | | | | | | | | | | | Now that we have umask support, we shouldn't re-send the mode in a SETATTR following an exclusive CREATE, or we risk having the same problem fixed in commit 5334c5bdac92 ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1"), which is that files with S_ISGID will have that bit stripped away. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: dff25ddb4808 ("nfs: add support for the umask attribute") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: Work around a Linux server bug...Trond Myklebust2017-05-091-0/+6
| | | | | | | | | | | It turns out the Linux server has a bug in its implementation of supattr_exclcreat; it returns the set of all attributes, whether or not they are supported by minor version 1. In order to avoid a regression, we therefore apply the supported_attrs as a mask on top of whatever the server sent us. Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* Fix nfs_client refcounting if kmalloc fails in nfs4_proc_exchange_id and ↵Dave Wysochanski2017-05-051-2/+6
| | | | | | | | | | nfs4_proc_async_renew If memory allocation fails for the callback data, we need to put the nfs_client or we end up with an elevated refcount. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSIONTrond Myklebust2017-05-051-1/+6
| | | | | | | | | | If the server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION because we are trunking, then RECLAIM_COMPLETE must handle that by calling nfs4_schedule_session_recovery() and then retrying. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>