summaryrefslogtreecommitdiffstats
path: root/fs/nfs/pagelist.c
Commit message (Collapse)AuthorAgeFilesLines
...
| * | NFS: Don't reset pg_moreio in __nfs_pageio_add_requestTrond Myklebust2014-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once we've started sending unstable NFS writes, we do not want to clear pg_moreio, or we may end up sending the very last request as a stable write if the commit lists are still empty. Do, however, reset pg_moreio in the case where we end up having to recoalesce the write if an attempt to use pNFS failed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: handle multiple reqs in nfs_page_async_flushWeston Andros Adamson2014-07-121-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple requests in a page. There is only one request for a page the first time nfs_page_async_flush is called, but if a write or commit fails, async_flush is called again and there may be multiple requests associated with the page. The solution is to merge all the requests in a page group into a single request before calling nfs_pageio_add_request. Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and change it to first lock all requests for the page, then cancel and merge all subrequests into the head request. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: nfs_page should take a ref on the head reqWeston Andros Adamson2014-07-121-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nfs_pages that aren't the the head of a group must take a reference on the head as long as ->wb_head is set to it. This stops the head from hitting a refcount of 0 while there is still an active nfs_page for the page group. This avoids kref warnings in the writeback code when the page group head is found and referenced. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | nfs: mark nfs_page reqs with flag for extra refWeston Andros Adamson2014-07-121-1/+3
| |/ | | | | | | | | | | | | | | Change the use of PG_INODE_REF - set it when taking extra reference on subrequests and take care to only release once for each request. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | sched: Allow wait_on_bit_action() functions to support a timeoutNeilBrown2014-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is currently not possible for various wait_on_bit functions to implement a timeout. While the "action" function that is called to do the waiting could certainly use schedule_timeout(), there is no way to carry forward the remaining timeout after a false wake-up. As false-wakeups a clearly possible at least due to possible hash collisions in bit_waitqueue(), this is a real problem. The 'action' function is currently passed a pointer to the word containing the bit being waited on. No current action functions use this pointer. So changing it to something else will be a little noisy but will have no immediate effect. This patch changes the 'action' function to take a pointer to the "struct wait_bit_key", which contains a pointer to the word containing the bit so nothing is really lost. It also adds a 'private' field to "struct wait_bit_key", which is initialized to zero. An action function can now implement a timeout with something like static int timed_out_waiter(struct wait_bit_key *key) { unsigned long waited; if (key->private == 0) { key->private = jiffies; if (key->private == 0) key->private -= 1; } waited = jiffies - key->private; if (waited > 10 * HZ) return -EAGAIN; schedule_timeout(waited - 10 * HZ); return 0; } If any other need for context in a waiter were found it would be easy to use ->private for some other purpose, or even extend "struct wait_bit_key". My particular need is to support timeouts in nfs_release_page() to avoid deadlocks with loopback mounted NFS. While wait_on_bit_timeout() would be a cleaner interface, it will not meet my need. I need the timeout to be sensitive to the state of the connection with the server, which could change. So I need to use an 'action' interface. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | sched: Remove proliferation of wait_on_bit() action functionsNeilBrown2014-07-161-10/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: David Howells <dhowells@redhat.com> (fscache, keys) Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2) Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2014-06-101-50/+583
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Highlights include: - massive cleanup of the NFS read/write code by Anna and Dros - support multiple NFS read/write requests per page in order to deal with non-page aligned pNFS striping. Also cleans up the r/wsize < page size code nicely. - stable fix for ensuring inode is declared uptodate only after all the attributes have been checked. - stable fix for a kernel Oops when remounting - NFS over RDMA client fixes - move the pNFS files layout driver into its own subdirectory" * tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits) NFS: populate ->net in mount data when remounting pnfs: fix lockup caused by pnfs_generic_pg_test NFSv4.1: Fix typo in dprintk NFSv4.1: Comment is now wrong and redundant to code NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state xprtrdma: Disconnect on registration failure xprtrdma: Remove BUG_ON() call sites xprtrdma: Avoid deadlock when credit window is reset SUNRPC: Move congestion window constants to header file xprtrdma: Reset connection timeout after successful reconnect xprtrdma: Use macros for reconnection timeout constants xprtrdma: Allocate missing pagelist xprtrdma: Remove Tavor MTU setting xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting xprtrdma: Reduce the number of hardway buffer allocations xprtrdma: Limit work done by completion handler xprtrmda: Reduce calls to ib_poll_cq() in completion handlers xprtrmda: Reduce lock contention in completion handlers xprtrdma: Split the completion queue xprtrdma: Make rpcrdma_ep_destroy() return void ...
| * NFS: Fix a potential busy wait in nfs_page_group_lockTrond Myklebust2014-05-291-10/+9
| | | | | | | | | | | | | | We cannot allow nfs_page_group_lock to use TASK_KILLABLE here, since the loop would cause a busy wait if somebody kills the task. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Fix error handling in __nfs_pageio_add_requestTrond Myklebust2014-05-291-0/+6
| | | | | | | | | | | | | | Handle the case where nfs_create_request() returns an error. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: remove data list from pgio headerWeston Andros Adamson2014-05-291-33/+6
| | | | | | | | | | | | | | | | Since the ability to split pages into subpage requests has been added, nfs_pgio_header->rpc_list only ever has one pgio data. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: use > 1 request to handle bsize < PAGE_SIZEWeston Andros Adamson2014-05-291-69/+11
| | | | | | | | | | | | | | | | | | | | | | Use the newly added support for multiple requests per page for rsize/wsize < PAGE_SIZE, instead of having multiple read / write data structures per pageio header. This allows us to get rid of nfs_pgio_multi. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: allow coalescing of subpage requestsWeston Andros Adamson2014-05-291-4/+0
| | | | | | | | | | | | | | Remove check that the request covers a whole page. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: page group syncing in write pathWeston Andros Adamson2014-05-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Operations that modify state for a whole page must be syncronized across all requests within a page group. In the write path, this is calling end_page_writeback and removing the head request from an inode. Both of these operations should not be called until all requests in a page group have reached the point where they would call them. This patch should have no effect yet since all page groups currently have one request, but will come into play when pg_test functions are modified to split pages into sub-page regions. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: page group syncing in read pathWeston Andros Adamson2014-05-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Operations that modify state for a whole page must be syncronized across all requests within a page group. In the read path, this is calling unlock_page and SetPageUptodate. Both of these functions should not be called until all requests in a page group have reached the point where they would call them. This patch should have no effect yet since all page groups currently have one request, but will come into play when pg_test functions are modified to split pages into sub-page regions. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: add support for multiple nfs reqs per pageWeston Andros Adamson2014-05-291-14/+206
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add "page groups" - a circular list of nfs requests (struct nfs_page) that all reference the same page. This gives nfs read and write paths the ability to account for sub-page regions independently. This somewhat follows the design of struct buffer_head's sub-page accounting. Only "head" requests are ever added/removed from the inode list in the buffered write path. "head" and "sub" requests are treated the same through the read path and the rest of the write/commit path. Requests are given an extra reference across the life of the list. Page groups are never rejoined after being split. If the read/write request fails and the client falls back to another path (ie revert to MDS in PNFS case), the already split requests are pushed through the recoalescing code again, which may split them further and then coalesce them into properly sized requests on the wire. Fragmentation shouldn't be a problem with the current design, because we flush all requests in page group when a non-contiguous request is added, so the only time resplitting should occur is on a resend of a read or write. This patch lays the groundwork for sub-page splitting, but does not actually do any splitting. For now all page groups have one request as pg_test functions don't yet split pages. There are several related patches that are needed support multiple requests per page group. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: call nfs_can_coalesce_requests for every reqWeston Andros Adamson2014-05-291-15/+19
| | | | | | | | | | | | | | | | | | | | | | Call nfs_can_coalesce_requests for every request, even the first one. This is needed for future patches to give pg_test a way to inform add_request to reduce the size of the request. Now @prev can be null in nfs_can_coalesce_requests and pg_test functions. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: modify pg_test interface to return size_tWeston Andros Adamson2014-05-291-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a step toward allowing pg_test to inform the the coalescing code to reduce the size of requests so they may fit in whatever scheme the pg_test callback wants to define. For now, just return the size of the request if there is space, or 0 if there is not. This shouldn't change any behavior as it acts the same as when the pg_test functions returned bool. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * nfs: remove unused arg from nfs_create_requestWeston Andros Adamson2014-05-291-3/+1
| | | | | | | | | | | | | | @inode is passed but not used. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common nfs_pageio_ops structAnna Schumaker2014-05-291-1/+6
| | | | | | | | | | | | | | | | At this point the read and write structures look identical, so combine them into something shared by both. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common generic_pg_pgios()Anna Schumaker2014-05-291-1/+25
| | | | | | | | | | | | | | | | What we have here is two functions that look identical. Let's share some more code! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common multiple_pgios() functionAnna Schumaker2014-05-291-0/+21
| | | | | | | | | | | | | | | | Once again, these two functions look identical in the read and write case. Time to combine them together! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common initiate_pgio() functionAnna Schumaker2014-05-291-0/+46
| | | | | | | | | | | | | | | | Most of this code is the same for both the read and write paths, so combine everything and use the rw_ops when necessary. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a generic_pgio functionAnna Schumaker2014-05-281-5/+101
| | | | | | | | | | | | | | | | | | These functions are almost identical on both the read and write side. FLUSH_COND_STABLE will never be set for the read path, so leaving it in the generic code won't hurt anything. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common pgio_error functionAnna Schumaker2014-05-281-0/+20
| | | | | | | | | | | | | | | | At this point, the read and write versions of this function look identical so both should use the same function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common rpcsetup function for reads and writesAnna Schumaker2014-05-281-0/+44
| | | | | | | | | | | | | | | | Write adds a little bit of code dealing with flush flags, but since "how" will always be 0 when reading we can share the code. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common rpc_call_ops structAnna Schumaker2014-05-281-3/+8
| | | | | | | | | | | | | | | | The read and write paths set up this struct in exactly the same way, so create a single shared struct. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common nfs_pgio_result_common functionAnna Schumaker2014-05-281-0/+23
| | | | | | | | | | | | | | | | Combining these functions will let me make a single nfs_rw_common_ops struct (see the next patch). Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common pgio_rpc_prepare functionAnna Schumaker2014-05-281-0/+26
| | | | | | | | | | | | | | | | The read and write paths do exactly the same thing for the rpc_prepare rpc_op. This patch combines them together into a single function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common rw_header_alloc and rw_header_free functionAnna Schumaker2014-05-281-0/+33
| | | | | | | | | | | | | | | | | | | | I create a new struct nfs_rw_ops to decide the differences between reads and writes. This struct will be set when initializing a new nfs_pgio_descriptor, and then passed on to the nfs_rw_header when a new header is allocated. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * NFS: Create a common pgio_alloc and pgio_release functionAnna Schumaker2014-05-281-1/+61
| | | | | | | | | | | | | | | | These functions are identical for the read and write paths so they can be combined. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | arch: Mass conversion of smp_mb__*()Peter Zijlstra2014-04-181-3/+3
|/ | | | | | | | | | | Mostly scripted conversion of the smp_mb__* barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* NFS: Don't check lock owner compatability unless file is locked (part 2)Trond Myklebust2013-09-061-6/+16
| | | | | | | | | When coalescing requests into a single READ or WRITE RPC call, and there is no file locking involved, we don't have to refuse coalescing for requests where the lock owner information doesn't match. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add functionality to allow waiting on all outstanding reads to completeTrond Myklebust2013-04-081-0/+51
| | | | | | | This will later allow NFS locking code to wait for readahead to complete before releasing byte range locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Don't accept more reads/writes if the open context recovery failedTrond Myklebust2013-03-251-0/+2
| | | | | | | If the state recovery failed, we want to ensure that the application doesn't try to use the same file descriptor for more reads or writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up helper function nfs4_select_rw_stateid()Trond Myklebust2012-09-281-1/+3
| | | | | | | | | | | We want to be able to pass on the information that the page was not dirtied under a lock. Instead of adding a flag parameter, do this by passing a pointer to a 'struct nfs_lock_owner' that may be NULL. Also reuse this structure in struct nfs_lock_context to carry the fl_owner_t and pid_t. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert nfs_get_lock_context to return an ERR_PTR on failureTrond Myklebust2012-09-281-3/+5
| | | | | | | | We want to be able to distinguish between allocation failures, and the case where the lock context is not needed (because there are no locks). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS41: add pg_layout_private to nfs_pageio_descriptorPeng Tao2012-08-021-0/+2
| | | | | | | | | To allow layout driver to pass private information around pg_init/pg_doio. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds2012-07-311-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge Andrew's second set of patches: - MM - a few random fixes - a couple of RTC leftovers * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits) rtc/rtc-88pm80x: remove unneed devm_kfree rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables tmpfs: distribute interleave better across nodes mm: remove redundant initialization mm: warn if pg_data_t isn't initialized with zero mips: zero out pg_data_t when it's allocated memcg: gix memory accounting scalability in shrink_page_list mm/sparse: remove index_init_lock mm/sparse: more checks on mem_section number mm/sparse: optimize sparse_index_alloc memcg: add mem_cgroup_from_css() helper memcg: further prevent OOM with too many dirty pages memcg: prevent OOM with too many dirty pages mm: mmu_notifier: fix freed page still mapped in secondary MMU mm: memcg: only check anon swapin page charges for swap cache mm: memcg: only check swap cache pages for repeated charging mm: memcg: split swapin charge function into private and public part mm: memcg: remove needless !mm fixup to init_mm when charging mm: memcg: remove unneeded shmem charge type ...
| * nfs: prevent page allocator recursions with swap over NFS.Mel Gorman2012-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GFP_NOFS is _more_ permissive than GFP_NOIO in that it will initiate IO, just not of any filesystem data. The problem is that previously NOFS was correct because that avoids recursion into the NFS code. With swap-over-NFS, it is no longer correct as swap IO can lead to this recursion. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Neil Brown <neilb@suse.de> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * nfs: teach the NFS client how to treat PG_swapcache pagesMel Gorman2012-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace all relevant occurences of page->index and page->mapping in the NFS client with the new page_file_index() and page_file_mapping() functions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Neil Brown <neilb@suse.de> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | NFS: Convert v4 into a moduleBryan Schumaker2012-07-301-0/+4
|/ | | | | | | | | | | | This patch exports symbols needed by the v4 module. In addition, I also switch over to using IS_ENABLED() to check if CONFIG_NFS_V4 or CONFIG_NFS_V4_MODULE are set. The module (nfs4.ko) will be created in the same directory as nfs.ko and will be automatically loaded the first time you try to mount over NFS v4. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up - Rename nfs_unlock_request and nfs_unlock_request_dont_releaseTrond Myklebust2012-05-091-5/+6
| | | | | | | | | Function rename to ensure that the functionality of nfs_unlock_request() mirrors that of nfs_lock_request(). Then let nfs_unlock_and_release_request() do the work of what used to be called nfs_unlock_request()... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: Prevent a deadlock in the new writeback codeTrond Myklebust2012-05-091-2/+10
| | | | | | | | | | | | | | | | We have to unlock the nfs_page before we call nfs_end_page_writeback to avoid races with functions that expect the page to be unlocked when PG_locked and PG_writeback are not set. The problem is that nfs_unlock_request also releases the nfs_page, causing a deadlock if the release of the nfs_open_context triggers an iput() while the PG_writeback flag is still set... The solution is to separate the unlocking and release of the nfs_page, so that we can do the former before nfs_end_page_writeback and the latter after. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: rewrite directio read to use async coalesce codeFred Isaman2012-04-271-3/+4
| | | | | | | This also has the advantage that it allows directio to use pnfs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: prepare coalesce testing for directioFred Isaman2012-04-271-2/+2
| | | | | | | | | The coalesce code made assumptions that will no longer be true once non-page aligned io occurs. This introduces no change in current behavior, but allows for more general situations to come. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove unused wb_complete field from struct nfs_pageFred Isaman2012-04-271-1/+0
| | | | | Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create completion structure to pass into page_init functionsFred Isaman2012-04-271-0/+3
| | | | | | | | | Factors out the code that will need to change when directio starts using these code paths. This will allow directio to use the generic pagein and flush routines Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: merge _full and _partial read rpc_opsFred Isaman2012-04-271-0/+24
| | | | | | | | | | | | | | | | | Decouple nfs_pgio_header and nfs_read_data, and have (possibly multiple) nfs_read_datas each take a refcount on nfs_pgio_header. For the moment keeps nfs_read_header as a way to preallocate a single nfs_read_data with the nfs_pgio_header. The code doesn't need this, and would be prettier without, but given the amount of churn I am already introducing I didn't want to play with tuning new mempools. This also fixes bug in pnfs_ld_handle_read_error. In the case of desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing replay attempt to do nothing. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create struct nfs_page_arrayFred Isaman2012-04-271-0/+13
| | | | | | | | Both nfs_read_data and nfs_write_data devote several fields which can be combined into a single shared struct. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove nfs_inode radix treeFred Isaman2012-03-101-61/+0
| | | | | | | | The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>