summaryrefslogtreecommitdiffstats
path: root/include/linux/nfs_page.h
Commit message (Collapse)AuthorAgeFilesLines
* NFS41: add pg_layout_private to nfs_pageio_descriptorPeng Tao2012-08-021-0/+1
| | | | | | | | | To allow layout driver to pass private information around pg_init/pg_doio. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup - only store the write verifier in struct nfs_pageTrond Myklebust2012-06-281-1/+1
| | | | | | | | | | The 'committed' field is not needed once we have put the struct nfs_page on the right list. Also correct the type of the verifier: it is not an array of __be32, but simply an 8 byte long opaque array. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up - Rename nfs_unlock_request and nfs_unlock_request_dont_releaseTrond Myklebust2012-05-091-1/+1
| | | | | | | | | Function rename to ensure that the functionality of nfs_unlock_request() mirrors that of nfs_lock_request(). Then let nfs_unlock_and_release_request() do the work of what used to be called nfs_unlock_request()... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: Clean up - simplify nfs_lock_request()Trond Myklebust2012-05-091-12/+2
| | | | | | | | We only have two places where we need to grab a reference when trying to lock the nfs_page. We're better off making that explicit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: Prevent a deadlock in the new writeback codeTrond Myklebust2012-05-091-0/+1
| | | | | | | | | | | | | | | | We have to unlock the nfs_page before we call nfs_end_page_writeback to avoid races with functions that expect the page to be unlocked when PG_locked and PG_writeback are not set. The problem is that nfs_unlock_request also releases the nfs_page, causing a deadlock if the release of the nfs_open_context triggers an iput() while the PG_writeback flag is still set... The solution is to separate the unlocking and release of the nfs_page, so that we can do the former before nfs_end_page_writeback and the latter after. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: rewrite directio read to use async coalesce codeFred Isaman2012-04-271-0/+1
| | | | | | | This also has the advantage that it allows directio to use pnfs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove unused wb_complete field from struct nfs_pageFred Isaman2012-04-271-1/+0
| | | | | Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: create completion structure to pass into page_init functionsFred Isaman2012-04-271-0/+2
| | | | | | | | | Factors out the code that will need to change when directio starts using these code paths. This will allow directio to use the generic pagein and flush routines Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: merge _full and _partial read rpc_opsFred Isaman2012-04-271-1/+0
| | | | | | | | | | | | | | | | | Decouple nfs_pgio_header and nfs_read_data, and have (possibly multiple) nfs_read_datas each take a refcount on nfs_pgio_header. For the moment keeps nfs_read_header as a way to preallocate a single nfs_read_data with the nfs_pgio_header. The code doesn't need this, and would be prettier without, but given the amount of churn I am already introducing I didn't want to play with tuning new mempools. This also fixes bug in pnfs_ld_handle_read_error. In the case of desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing replay attempt to do nothing. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit codeTrond Myklebust2012-03-171-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: remove nfs_inode radix treeFred Isaman2012-03-101-12/+1
| | | | | | | | The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove NFS_PAGE_TAG_LOCKEDFred Isaman2012-03-101-3/+0
| | | | | | | | The last real use of this tag was removed by commit 7f2f12d963 NFS: Simplify nfs_wb_page() Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Don't rely on PageError in nfs_readpage_release_partialTrond Myklebust2011-10-191-0/+1
| | | | | | | | | | | Don't rely on the PageError flag to tell us if one of the partial reads of the page failed. Instead, replace that with a dedicated flag in the struct nfs_page. Then clean out redundant uses of the PageError flag: the VM no longer checks it for reads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Move the pnfs write code into pnfs.cTrond Myklebust2011-07-151-3/+0
| | | | | | | ...and ensure that we recoalese to take into account differences in differences in block sizes when falling back to write through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Move the pnfs read code into pnfs.cTrond Myklebust2011-07-151-1/+0
| | | | | | | ...and ensure that we recoalese to take into account differences in block sizes when falling back to read through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is neededTrond Myklebust2011-07-151-1/+2
| | | | | | | | | If an attempt to do pNFS fails, and we have to fall back to writing through the MDS, then we may want to re-coalesce the requests that we already have since the block size for the MDS read/writes may be different to that of the DS read/writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cache rpc_ops in struct nfs_pageio_descriptorTrond Myklebust2011-07-151-0/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Add an initialisation callback for pNFSTrond Myklebust2011-07-121-0/+1
| | | | | | Ensure that we always get a layout before setting up the i/o request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup of the nfs_pageio code in preparation for a pnfs bugfixTrond Myklebust2011-07-121-3/+11
| | | | | | | | | | | | We need to ensure that the layouts are set up before we can decide to coalesce requests. To do so, we want to further split up the struct nfs_pageio_descriptor operations into an initialisation callback, a coalescing test callback, and a 'do i/o' callback. This patch cleans up the existing callback methods before adding the 'initialisation' callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: file layout must consider pg_bsize for coalescingBenny Halevy2011-06-201-0/+3
| | | | | | | Otherwise we end up overflowing the rpc buffer size on the receive end. Signed-off-by: Benny Halevy <benny@tonian.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: change pg_test return type to boolBenny Halevy2011-05-291-1/+1
| | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com>
* NFS: Fix a hang in the writeback pathTrond Myklebust2011-03-271-1/+0
| | | | | | | | | | | | | | Now that the inode scalability patches have been merged, it is no longer safe to call igrab() under the inode->i_lock. Now that we no longer call nfs_clear_request() until the nfs_page is being freed, we know that we are always holding a reference to the nfs_open_context, which again holds a reference to the path, and so the inode cannot be freed until the last nfs_page has been removed from the radix tree and freed. We can therefore skip the igrab()/iput() altogether. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: add generic layer hooks for pnfs COMMITFred Isaman2011-03-231-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | We create three major hooks for the pnfs code. pnfs_mark_request_commit() is called during writeback_done from nfs_mark_request_commit, which gives the driver an opportunity to claim it wants control over commiting a particular req. pnfs_choose_commit_list() is called from nfs_scan_list to choose which list a given req should be added to, based on where we intend to send it for COMMIT. It is up to the driver to have preallocated list headers for each destination it may need. pnfs_commit_list() is how the driver actually takes control, it is used instead of nfs_commit_list(). In order to pass information between the above functions, we create a union in nfs_page to hold a lseg (which is possible because the req is not on any list while in transition), and add some flags to indicate if we need to use the pnfs code. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* FS: Use stable writes when not doing a bulk flushTrond Myklebust2011-03-211-0/+1
| | | | | | | | | If we're only doing a single write, and there are no other unstable writes being queued up, we might want to just flip to using a stable write RPC call. Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: rearrange ->doio argsFred Isaman2011-03-111-2/+2
| | | | | | | | This will make it possible to clear the lseg pointer in the same function as it is put, instead of in the caller nfs_pageio_doio(). Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: shift pnfs_update_layout locationsFred Isaman2011-03-111-2/+2
| | | | | | | | | | | | | | | | | Move the pnfs_update_layout call location to nfs_pageio_do_add_request(). Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach it to each nfs_read_data so it can be sent to the layout driver. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: coelesce across layout stripesFred Isaman2011-03-111-0/+2
| | | | | | | | | | | | | | | | Add a pg_test layout driver hook which is used to avoid coelescing I/O across layout stripes. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: remove extraneous and problematic calls to nfs_clear_requestTrond Myklebust2010-12-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | When a nfs_page is freed, nfs_free_request is called which also calls nfs_clear_request to clean out the lock and open contexts and free the pagecache page. However, a couple of places in the nfs code call nfs_clear_request themselves. What happens here if the refcount on the request is still high? We'll be releasing contexts and freeing pointers while the request is possibly still in use. Remove those bare calls to nfs_clear_context. That should only be done when the request is being freed. Note that when doing this, we need to watch out for tests of req->wb_page. Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked() would check the value of req->wb_page to figure out if the page is mapped into the nfsi->nfs_page_tree. We now indicate the page is mapped using the new bit PG_MAPPED in req->wb_flags . Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Ensure that we track the NFSv4 lock state in read/write requests.Trond Myklebust2010-07-301-0/+1
| | | | | | | This patch fixes bugzilla entry 14501: https://bugzilla.kernel.org/show_bug.cgi?id=14501 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Allow redirtying of a completed unstable write.Trond Myklebust2008-07-091-3/+6
| | | | | | | | | | | | | Currently, if an unstable write completes, we cannot redirty the page in order to reflect a new change in the page data until after we've sent a COMMIT request. This patch allows a page rewrite to proceed without the unnecessary COMMIT step, putting it immediately back onto the dirty page list, undoing the VM unstable write accounting, and removing the NFS_PAGE_TAG_COMMIT tag from the NFS radix tree. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up the write request locking.Trond Myklebust2008-01-301-12/+1
| | | | | | Ensure that we set/clear NFS_PAGE_TAG_LOCKED when the nfs_page is hashed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up write code...Trond Myklebust2007-10-091-1/+0
| | | | | | | The addition of nfs_page_mkwrite means that We should no longer need to create requests inside nfs_writepage() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Remove the redundant 'dirty' and 'commit' lists from nfs_inodeTrond Myklebust2007-07-101-4/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS cleanup: speed up nfs_scan_commit using radix tree tagsTrond Myklebust2007-07-101-2/+3
| | | | | | Add a tag for requests that are waiting for a COMMIT Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS cleanup: Rename NFS_PAGE_TAG_WRITEBACK to NFS_PAGE_TAG_LOCKEDTrond Myklebust2007-07-101-3/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert struct nfs_page to use krefsTrond Myklebust2007-07-101-5/+5
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Avoid a deadlock situation on writeTrond Myklebust2007-05-241-0/+1
| | | | | | | | | | | | | When processes are allowed to attempt to lock a non-contiguous range of nfs write requests, it is possible for generic_writepages to 'wrap round' the address space, and call writepage() on a request that is already locked by the same process. We avoid the deadlock by checking if the page index is contiguous with the list of nfs write requests that is already held in our nfs_pageio_descriptor prior to attempting to lock a new request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Use pgoff_t in structures and functions that pass page cache offsetsTrond Myklebust2007-04-301-2/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix a buffer overflow in the allocation of struct nfs_read/writedataTrond Myklebust2007-04-301-2/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix a race when doing NFS write coalescingTrond Myklebust2007-04-301-7/+1
| | | | | | | | | | | | | | Currently we do write coalescing in a very inefficient manner: one pass in generic_writepages() in order to lock the pages for writing, then one pass in nfs_flush_mapping() and/or nfs_sync_mapping_wait() in order to gather the locked pages for coalescing into RPC requests of size "wsize". In fact, it turns out there is actually a deadlock possible here since we only start I/O on the second pass. If the user signals the process while we're in nfs_sync_mapping_wait(), for instance, then we may exit before starting I/O on all the requests that have been queued up. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup for nfs_readpages()Trond Myklebust2007-04-301-0/+2
| | | | | | | Do the coalescing of read requests into block sized requests at start of I/O as we scan through the pages instead of going through a second pass. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Another cleanup of the read/write request coalescing codeTrond Myklebust2007-04-301-2/+12
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Cleanup the coalescing codeTrond Myklebust2007-04-301-2/+11
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: clean up the unstable write codeTrond Myklebust2007-04-201-30/+0
| | | | | | | Get rid of the inlined #ifdefs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* NFS: Ensure PG_writeback is cleared when writeback failsTrond Myklebust2007-04-141-1/+0
| | | | | | | | | | | | | | | | If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must ensure that the PG_writeback flag is cleared. Also ensure that we actually own the PG_writeback flag whenever we schedule a new writeback by making nfs_set_page_writeback() return the value of test_set_page_writeback(). The PG_writeback page flag ends up replacing the functionality of the PG_FLUSHING nfs_page flag, so we rip that out too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* NFS: Make nfs_updatepage() mark the page as dirty.Trond Myklebust2006-12-061-0/+1
| | | | | | | This will ensure that we can call set_page_writeback() from within nfs_writepage(), which is always called with the page lock set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Add nfs_set_page_dirty()Trond Myklebust2006-12-061-0/+1
| | | | | | | | | | We will want to allow nfs_writepage() to distinguish between pages that have been marked as dirty by the VM, and those that have been marked as dirty by nfs_updatepage(). In the former case, the entire page will want to be written out, and so any requests that were pending need to be flushed out first. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up nfs_scan_dirty()Trond Myklebust2006-12-061-2/+3
| | | | | | Pass down struct writeback control. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Store the file system "fsid" value in the NFS super block.Trond Myklebust2006-06-091-1/+0
| | | | | | | This should enable us to detect if we are crossing a mountpoint in the case where the server is exporting "nohide" mounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Flesh out nfs_invalidate_page()Trond Myklebust2006-06-091-2/+2
| | | | | | | In the case of a call to truncate_inode_pages(), we should really try to cancel any pending writes on the page. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>