From fbcc025613590d7b1d15521555dcc6393a148a6b Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Mon, 15 Feb 2016 17:21:19 +1100 Subject: xfs: Introduce writeback context for writepages xfs_vm_writepages() calls generic_writepages to writeback a range of a file, but then xfs_vm_writepage() clusters pages itself as it does not have any context it can pass between->writepage calls from __write_cache_pages(). Introduce a writeback context for xfs_vm_writepages() and call __write_cache_pages directly with our own writepage callback so that we can pass that context to each writepage invocation. This encapsulates the current mapping, whether it is valid or not, the current ioend and it's IO type and the ioend chain being built. This requires us to move the ioend submission up to the level where the writepage context is declared. This does mean we do not submit IO until we packaged the entire writeback range, but with the block plugging in the writepages call this is the way IO is submitted, anyway. It also means that we need to handle discontiguous page ranges. If the pages sent down by write_cache_pages to the writepage callback are discontiguous, we need to detect this and put each discontiguous page range into individual ioends. This is needed to ensure that the ioend accurately represents the range of the file that it covers so that file size updates during IO completion set the size correctly. Failure to take into account the discontiguous ranges results in files being too small when writeback patterns are non-sequential. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/xfs/xfs_aops.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'fs/xfs/xfs_aops.h') diff --git a/fs/xfs/xfs_aops.h b/fs/xfs/xfs_aops.h index f6ffc9ae5ceb..3c3f1a37a1c7 100644 --- a/fs/xfs/xfs_aops.h +++ b/fs/xfs/xfs_aops.h @@ -24,12 +24,14 @@ extern mempool_t *xfs_ioend_pool; * Types of I/O for bmap clustering and I/O completion tracking. */ enum { + XFS_IO_INVALID, /* initial state */ XFS_IO_DELALLOC, /* covers delalloc region */ XFS_IO_UNWRITTEN, /* covers allocated but uninitialized data */ XFS_IO_OVERWRITE, /* covers already allocated extent */ }; #define XFS_IO_TYPES \ + { XFS_IO_INVALID, "invalid" }, \ { XFS_IO_DELALLOC, "delalloc" }, \ { XFS_IO_UNWRITTEN, "unwritten" }, \ { XFS_IO_OVERWRITE, "overwrite" } -- cgit v1.2.3 From e10de3723c53378e7cf441529f563c316fdc0dd3 Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Mon, 15 Feb 2016 17:23:12 +1100 Subject: xfs: don't chain ioends during writepage submission Currently we can build a long ioend chain during ->writepages that gets attached to the writepage context. IO submission only then occurs when we finish all the writepage processing. This means we can have many ioends allocated and pending, and this violates the mempool guarantees that we need to give about forwards progress. i.e. we really should only have one ioend being built at a time, otherwise we may drain the mempool trying to allocate a new ioend and that blocks submission, completion and freeing of ioends that are already in progress. To prevent this situation from happening, we need to submit ioends for IO as soon as they are ready for dispatch rather than queuing them for later submission. This means the ioends have bios built immediately and they get queued on any plug that is current active. Hence if we schedule away from writeback, the ioends that have been built will make forwards progress due to the plug flushing on context switch. This will also prevent context switches from creating unnecessary IO submission latency. We can't completely avoid having nested IO allocation - when we have a block size smaller than a page size, we still need to hold the ioend submission until after we have marked the current page dirty. Hence we may need multiple ioends to be held while the current page is completely mapped and made ready for IO dispatch. We cannot avoid this problem - the current code already has this ioend chaining within a page so we can mostly ignore that it occurs. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/xfs/xfs_aops.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/xfs/xfs_aops.h') diff --git a/fs/xfs/xfs_aops.h b/fs/xfs/xfs_aops.h index 3c3f1a37a1c7..4e01bd5b6426 100644 --- a/fs/xfs/xfs_aops.h +++ b/fs/xfs/xfs_aops.h @@ -41,7 +41,7 @@ enum { * It can manage several multi-page bio's at once. */ typedef struct xfs_ioend { - struct xfs_ioend *io_list; /* next ioend in chain */ + struct list_head io_list; /* next ioend in chain */ unsigned int io_type; /* delalloc / unwritten */ int io_error; /* I/O error code */ atomic_t io_remaining; /* hold count */ -- cgit v1.2.3