summaryrefslogtreecommitdiffstats
path: root/fs/nfs/pnfs.c
Commit message (Collapse)AuthorAgeFilesLines
* NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDSWeston Andros Adamson2013-02-281-4/+3
| | | | | | | | | | | | | | | | The client will currently try LAYOUTGETs forever if a server is returning NFS4ERR_LAYOUTTRYLATER or NFS4ERR_RECALLCONFLICT - even if the client no longer needs the layout (ie process killed, unmounted). This patch uses the DS timeout value (module parameter 'dataserver_timeo' via rpc layer) to set an upper limit of how long the client tries LATOUTGETs in this situation. Once the timeout is reached, IO is redirected to the MDS. This also changes how the client checks if a layout is on the clp list to avoid a double list_add. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* pnfs: fix resend_to_mds for directioBenny Halevy2013-02-241-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the directio request on pageio_init to clean up the API. Percolate pg_dreq from original nfs_pageio_descriptor to the pnfs_{read,write}_done_resend_to_mds and use it on respective call to nfs_pageio_init_{read,write} on the newly created nfs_pageio_descriptor. Reproduced by command: mount -o vers=4.1 server:/ /mnt dd bs=128k count=8 if=/dev/zero of=/mnt/dd.out oflag=direct BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 IP: [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs] PGD 34786067 PUD 34794067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: nfs_layout_nfsv41_files nfsv4 nfs nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc btrfs zlib_deflate libcrc32c ipv6 autofs4 CPU 1 Pid: 259, comm: kworker/1:2 Not tainted 3.8.0-rc6 #2 Bochs Bochs RIP: 0010:[<ffffffffa021a3a8>] [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs] RSP: 0018:ffff880038f8fa68 EFLAGS: 00010206 RAX: ffffffffa021a6a9 RBX: ffff880038f8fb48 RCX: 00000000000a0000 RDX: ffffffffa021e616 RSI: ffff8800385e9a40 RDI: 0000000000000028 RBP: ffff880038f8fa68 R08: ffffffff81ad6720 R09: ffff8800385e9510 R10: ffffffffa0228450 R11: ffff880038e87418 R12: ffff8800385e9a40 R13: ffff8800385e9a70 R14: ffff880038f8fb38 R15: ffffffffa0148878 FS: 0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000034789000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/1:2 (pid: 259, threadinfo ffff880038f8e000, task ffff880038302480) Stack: ffff880038f8fa78 ffffffffa021a6bf ffff880038f8fa88 ffffffffa021bb82 ffff880038f8fae8 ffffffffa021f454 ffff880038f8fae8 ffffffff8109689d ffff880038f8fab8 ffffffff00000006 0000000000000000 ffff880038f8fb48 Call Trace: [<ffffffffa021a6bf>] nfs_direct_pgio_init+0x16/0x18 [nfs] [<ffffffffa021bb82>] nfs_pgheader_init+0x6a/0x6c [nfs] [<ffffffffa021f454>] nfs_generic_pg_writepages+0x51/0xf8 [nfs] [<ffffffff8109689d>] ? mark_held_locks+0x71/0x99 [<ffffffffa0148878>] ? rpc_release_resources_task+0x37/0x37 [sunrpc] [<ffffffffa021bc25>] nfs_pageio_doio+0x1a/0x43 [nfs] [<ffffffffa021be7c>] nfs_pageio_complete+0x16/0x2c [nfs] [<ffffffffa02608be>] pnfs_write_done_resend_to_mds+0x95/0xc5 [nfsv4] [<ffffffffa0148878>] ? rpc_release_resources_task+0x37/0x37 [sunrpc] [<ffffffffa028e27f>] filelayout_reset_write+0x8c/0x99 [nfs_layout_nfsv41_files] [<ffffffffa028e5f9>] filelayout_write_done_cb+0x4d/0xc1 [nfs_layout_nfsv41_files] [<ffffffffa024587a>] nfs4_write_done+0x36/0x49 [nfsv4] [<ffffffffa021f996>] nfs_writeback_done+0x53/0x1cc [nfs] [<ffffffffa021fb1d>] nfs_writeback_done_common+0xe/0x10 [nfs] [<ffffffffa028e03d>] filelayout_write_call_done+0x28/0x2a [nfs_layout_nfsv41_files] [<ffffffffa01488a1>] rpc_exit_task+0x29/0x87 [sunrpc] [<ffffffffa014a0c9>] __rpc_execute+0x11d/0x3cc [sunrpc] [<ffffffff810969dc>] ? trace_hardirqs_on_caller+0x117/0x173 [<ffffffffa014a39f>] rpc_async_schedule+0x27/0x32 [sunrpc] [<ffffffffa014a378>] ? __rpc_execute+0x3cc/0x3cc [sunrpc] [<ffffffff8105f8c1>] process_one_work+0x226/0x422 [<ffffffff8105f7f4>] ? process_one_work+0x159/0x422 [<ffffffff81094757>] ? lock_acquired+0x210/0x249 [<ffffffffa014a378>] ? __rpc_execute+0x3cc/0x3cc [sunrpc] [<ffffffff810600d8>] worker_thread+0x126/0x1c4 [<ffffffff8105ffb2>] ? manage_workers+0x240/0x240 [<ffffffff81064ef8>] kthread+0xb1/0xb9 [<ffffffff81064e47>] ? __kthread_parkme+0x65/0x65 [<ffffffff815206ec>] ret_from_fork+0x7c/0xb0 [<ffffffff81064e47>] ? __kthread_parkme+0x65/0x65 Code: 00 83 38 02 74 12 48 81 4b 50 00 00 01 00 c7 83 60 07 00 00 01 00 00 00 48 89 df e8 55 fe ff ff 5b 41 5c 5d c3 66 90 55 48 89 e5 <f0> ff 07 5d c3 55 48 89 e5 f0 ff 0f 0f 94 c0 84 c0 0f 95 c0 0f RIP [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs] RSP <ffff880038f8fa68> CR2: 0000000000000028 Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@kernel.org [>= 3.6] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix bulk recall and destroy of layoutsTrond Myklebust2013-02-141-20/+130
| | | | | | | | | | | | | | | | | | | The current code in pnfs_destroy_all_layouts() assumes that removing the layout from the server->layouts list is sufficient to make it invisible to other processes. This ignores the fact that most users access the layout through the nfs_inode->layout... There is further breakage due to lack of reference counting of the layouts, meaning that the whole thing Oopses at the drop of a hat. The code in initiate_bulk_draining() is almost correct, and can be used as a model for pnfs_destroy_all_layouts(), so move that code to pnfs.c, and refactor the code to allow us to choose between a single filesystem bulk recall, and a recall of all layouts. Also note that initiate_bulk_draining() currently calls iput() while holding locks. Fix that too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* pnfs: Increase the refcount when LAYOUTGET fails the first timeYanchuan Nian2013-01-041-1/+1
| | | | | | | | | The layout will be set unusable if LAYOUTGET fails. Is it reasonable to increase the refcount iff LAYOUTGET fails the first time? Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>= 3.7]
* NFSv4.1: Remove assertion BUG_ON()s from the files and generic layout codeTrond Myklebust2012-11-041-4/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove unused function last_byte_offsetTrond Myklebust2012-11-041-11/+0
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: Check whether a layout pointer is NULL before free itYanchuan Nian2012-10-311-2/+2
| | | | | | | | | The new layout pointer in pnfs_find_alloc_layout() may be NULL because of out of memory. we must do some check work, otherwise pnfs_free_layout_hdr() will go wrong because it can not deal with a NULL pointer. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS41: send real read size in layoutgetPeng Tao2012-10-081-1/+9
| | | | | | | | | For buffer read, use offst-to-isize. For direct read, use dreq->bytes_left. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS41: send real write size in layoutgetPeng Tao2012-10-081-2/+4
| | | | | | | | | | | For buffer write, block layout client scan inode mapping to find next hole and use offset-to-hole as layoutget length. Object layout client uses offset-to-isize as layoutget length. For direct write, both block layout and object layout use dreq->bytes_left. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked()Trond Myklebust2012-10-051-11/+14
| | | | | | | Split it into two functions, one which checks if layoutgets are blocked, and one which checks if the layout stateid has expired. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Ensure that the layout sequence id stays 'close' to the currentTrond Myklebust2012-10-041-13/+8
| | | | | | | | | | Clamp the layout barrier sequence id to the current sequence id minus the maximum number of outstanding layoutget requests. Also ensure that we correctly initialise lo->plh_barrier if there are no layout segments associated to this layout header. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close codeTrond Myklebust2012-10-041-1/+1
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Deal with wraparound when updating the layout "barrier" seqidTrond Myklebust2012-10-021-4/+7
| | | | | | ...and fix a bug in pnfs_set_layout_stateid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Deal with wraparound issues when updating the layout stateidTrond Myklebust2012-10-021-1/+10
| | | | | | ...and add a helper function. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Always set the layout stateid if this is the first layoutgetTrond Myklebust2012-10-021-3/+5
| | | | | | | If the list of layout segments is empty, we must unconditionally set the layout stateid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layoutTrond Myklebust2012-10-021-7/+8
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: _pnfs_return_layout() shouldn't invalidate the layout on failureTrond Myklebust2012-09-281-2/+3
| | | | | | | | Failure of the layoutreturn allocation fails is not a good reason to mark the pnfs_layout_hdr as having failed a layoutget or i/o. Just exit cleanly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove the NFS_LAYOUT_RETURNED stateTrond Myklebust2012-09-281-6/+1
| | | | | | | It serves no purpose that the test for whether or not we have valid layout segments doesn't already serve. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Clear NFS_LAYOUT_BULK_RECALL when the layout segments are freedTrond Myklebust2012-09-281-0/+2
| | | | | | | Once all the affected layout segments have been freed up, clear the NFS_LAYOUT_BULK_RECALL flag so that we can reuse the pnfs_layout_hdr Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Get rid of the NFS_LAYOUT_DESTROYED stateTrond Myklebust2012-09-281-8/+1
| | | | | | | | | | | | | | We already have a mechanism for blocking LAYOUTGET by means of the plh_block_lgets counter. The only "service" that NFS_LAYOUT_DESTROYED provides at this point is to block layoutget once the layout segment list is empty, which basically means that you have to wait until the pnfs_layout_hdr is destroyed before you can do pNFS on that file again. This patch enables the reuse of the pnfs_layout_hdr if the layout segment list is empty. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove unused 'default allocation' for pnfs_alloc_layout_hdr()Trond Myklebust2012-09-281-3/+2
| | | | | | ...and ditto for pnfs_free_layout_hdr() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Get rid of pNFS spin lock debugging asserts...Trond Myklebust2012-09-281-3/+0
| | | | | | These are all in static declared functions that are called only once. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Balance pnfs_layout_hdr refcount in pnfs_layout_(insert|remove)_lsegTrond Myklebust2012-09-281-2/+4
| | | | | | | | | | Ensure that the reference count for pnfs_layout_hdr reverts to the original value after a call to pnfs_layout_remove_lseg(). Note that the caller is expected to hold a reference to the struct pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Clean up pnfs_put_lseg()Trond Myklebust2012-09-281-6/+3
| | | | | | | There is no longer a need to use pnfs_free_lseg_list(). Just call pnfs_free_lseg() directly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server listTrond Myklebust2012-09-281-19/+10
| | | | | | | | Move the code into pnfs_free_layout_hdr(), and add checks to get_layout_by_fh_locked to ensure that they don't reference a layout that is being freed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Free the pnfs_layout_hdr outside the inode->i_lockTrond Myklebust2012-09-281-12/+9
| | | | | | | None of the existing pNFS layout drivers seem to require the inode to be locked while they free the layout header. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Remove redundant reference to the pnfs_layout_hdrTrond Myklebust2012-09-281-9/+4
| | | | | | | | | | | | Each layout segment already holds a reference to the pnfs_layout_hdr, so there is no need to hold an extra reference that is released once the last layout segment is freed. Ensure that pnfs_find_alloc_layout() always returns a reference to the pnfs_layout_hdr, which will be matched by the final call to pnfs_put_layout_hdr() in pnfs_update_layout(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Rename the pnfs_put_lseg_common to pnfs_layout_remove_lsegTrond Myklebust2012-09-281-12/+15
| | | | | | | The latter name is more descriptive of the actual function. Also rename pnfs_insert_layout to pnfs_layout_insert_lseg. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: reset the inode MDS threshold counters on layout destructionTrond Myklebust2012-09-281-4/+5
| | | | | | | Instead of resetting the inode MDS threshold counters when we mark the layout for destruction, do it as part of freeing the layout. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix a race in the pNFS return-on-close codeTrond Myklebust2012-09-281-10/+12
| | | | | | | If we sleep after dropping the inode->i_lock, then we are no longer atomic with respect to the rpc_wake_up() call in pnfs_layout_remove_lseg(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: pnfs_layout_io_set_failed must clear invalid lsegsTrond Myklebust2012-09-281-0/+8
| | | | | | | | If pnfs_layout_io_test_failed() authorises a retry of the failed layoutgets, we should clear the existing layout segments so that we start afresh. Do this in pnfs_layout_io_set_failed(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Don't drop the pnfs_layout_hdr after a layoutget failureTrond Myklebust2012-09-281-8/+32
| | | | | | | | We want to cache the pnfs_layout_hdr after a layoutget or i/o failure so that pnfs_update_layout() can find it and know when it is time to retry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix a reference leak in pnfs_update_layoutTrond Myklebust2012-09-281-3/+6
| | | | | | | If we exit after the call to pnfs_find_alloc_layout(), we have to ensure that we put the struct pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Retry pNFS after a 2 minute timeoutTrond Myklebust2012-09-281-1/+14
| | | | | | | | If we had to fall back to read/write through MDS, then assume that we should retry pNFS after a suitable timeout period. The following patch sets a timeout of 2 minutes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Add helpers for setting/reading the I/O fail bitTrond Myklebust2012-09-281-12/+26
| | | | | | ...and make them local to the pnfs.c file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Replace dprintk() in pnfs_update_layout with something less buggyTrond Myklebust2012-09-281-7/+11
| | | | | | | | | | Dereferencing nfsi->layout in order to read plh_flags without holding a spin lock is bug prone. Furthermore, the dprintk() tells you nothing about whether or not the call succeeded. Replace it with something that tells you about whether or not a valid layout segment was returned for the inode in question. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Cleanup; add "pnfs_" prefix to put_lseg() and get_lseg()Trond Myklebust2012-09-281-18/+18
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()Trond Myklebust2012-09-281-15/+15
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Cleanup add a "pnfs_" prefix to mark_matching_lsegs_invalidTrond Myklebust2012-09-281-3/+3
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Clean up the pNFS layoutget interfaceTrond Myklebust2012-09-281-9/+16
| | | | | | | | Ensure that we do return errors from nfs4_proc_layoutget() and that we don't mark the layout as having failed if the error was due to a signal or resource problem on the client side. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* pnfs: defer release of pages in layoutgetIdan Kedar2012-08-021-38/+1
| | | | | | | | | | | | | | | | | | | we have encountered a bug whereby reading a lot of files (copying fedora's /bin) from a pNFS mount and hitting Ctrl+C in the middle caused a general protection fault in xdr_shrink_bufhead. this function is called when decoding the response from LAYOUTGET. the decoding is done by a worker thread, and the caller of LAYOUTGET waits for the worker thread to complete. hitting Ctrl+C caused the synchronous wait to end and the next thing the caller does is to free the pages, so when the worker thread calls xdr_shrink_bufhead, the pages are gone. therefore, the cleanup of these pages has been moved to nfs4_layoutget_release. Signed-off-by: Idan Kedar <idank@tonian.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Convert v4 into a moduleBryan Schumaker2012-07-301-0/+2
| | | | | | | | | | | | This patch exports symbols needed by the v4 module. In addition, I also switch over to using IS_ENABLED() to check if CONFIG_NFS_V4 or CONFIG_NFS_V4_MODULE are set. The module (nfs4.ko) will be created in the same directory as nfs.ko and will be automatically loaded the first time you try to mount over NFS v4. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 do not send LAYOUTRETURN on emtpy plh_segs listAndy Adamson2012-07-161-4/+19
| | | | | | | | | | | | mark_matching_lsegs_invalid() resets the mds_threshold counters and can dereference the layout hdr on an initial empty plh_segs list. It returns 0 both in the case of an initial empty list and in a non-emtpy list that was cleared by calls to mark_lseg_invalid. Don't send a LAYOUTRETURN if the list was initially empty. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 mark layout when already returnedAndy Adamson2012-07-161-2/+8
| | | | | | | | | | | | When the file layout driver is fencing a DS, _pnfs_return_layout can be called mulitple times per inode due to in-flight i/o referencing lsegs on it's plh_segs list. Remember that LAYOUTRETURN has been called, and do not call it again. Allow LAYOUTRETURNs after a subsequent LAYOUTGET. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Create an write_pageio_init() functionBryan Schumaker2012-06-291-6/+5
| | | | | | | | | pNFS needs to select a write function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing writes. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Create an read_pageio_init() functionBryan Schumaker2012-06-291-6/+5
| | | | | | | | | pNFS needs to select a read function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing reads. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix a race in set_pnfs_layoutdriverTrond Myklebust2012-06-191-4/+4
| | | | | | | | The call to try_module_get() dereferences ld_type outside the spin locks, which means that it may be pointing to garbage if a module unload was in progress. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix umount when filelayout DS is also the MDSTrond Myklebust2012-06-181-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | Currently there is a 'chicken and egg' issue when the DS is also the mounted MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client is not called to dereference the MDS usage of a deviceid which holds a reference to the DS nfs_client. The result is the umount program returns, but the nfs_client is not freed, and the cl_session hearbeat continues. The MDS (and all other nfs mounts) lose their last nfs_client reference in nfs_free_server when the last nfs_server (fsid) is umounted. The file layout DS lose their last nfs_client reference in destroy_ds when the last deviceid referencing the data server is put and destroy_ds is called. This is triggered by a call to nfs4_deviceid_purge_client which removes references to a pNFS deviceid used by an MDS mount. The fix is to track how many pnfs enabled filesystems are mounted from this server, and then to purge the device id cache once that count reaches zero. Reported-by: Jorge Mora <Jorge.Mora@netapp.com> Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 test the mdsthreshold hint parametersAndy Adamson2012-05-241-0/+79
| | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 add nfs_inode book keeping for mdsthresholdAndy Adamson2012-05-241-0/+3
| | | | | | | | Keep track of the number of bytes read or written via buffered, direct, and mem-mapped i/o for use by mdsthreshold size_io hints. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>