summaryrefslogtreecommitdiffstats
path: root/fs/xfs/xfs_log_priv.h
diff options
context:
space:
mode:
authorDave Chinner <dchinner@redhat.com>2021-08-10 18:00:44 -0700
committerDarrick J. Wong <djwong@kernel.org>2021-08-16 12:09:30 -0700
commit0020a190cf3eac16995143db41b21b82bacdcbe3 (patch)
treeffedd21903f36e369326aea9e8378c883923e6c4 /fs/xfs/xfs_log_priv.h
parent68a74dcae6737c27b524b680e070fe41f0cad43a (diff)
downloadlinux-stable-0020a190cf3eac16995143db41b21b82bacdcbe3.tar.gz
linux-stable-0020a190cf3eac16995143db41b21b82bacdcbe3.tar.bz2
linux-stable-0020a190cf3eac16995143db41b21b82bacdcbe3.zip
xfs: AIL needs asynchronous CIL forcing
The AIL pushing is stalling on log forces when it comes across pinned items. This is happening on removal workloads where the AIL is dominated by stale items that are removed from AIL when the checkpoint that marks the items stale is committed to the journal. This results is relatively few items in the AIL, but those that are are often pinned as directories items are being removed from are still being logged. As a result, many push cycles through the CIL will first issue a blocking log force to unpin the items. This can take some time to complete, with tracing regularly showing push delays of half a second and sometimes up into the range of several seconds. Sequences like this aren't uncommon: .... 399.829437: xfsaild: last lsn 0x11002dd000 count 101 stuck 101 flushing 0 tout 20 <wanted 20ms, got 270ms delay> 400.099622: xfsaild: target 0x11002f3600, prev 0x11002f3600, last lsn 0x0 400.099623: xfsaild: first lsn 0x11002f3600 400.099679: xfsaild: last lsn 0x1100305000 count 16 stuck 11 flushing 0 tout 50 <wanted 50ms, got 500ms delay> 400.589348: xfsaild: target 0x110032e600, prev 0x11002f3600, last lsn 0x0 400.589349: xfsaild: first lsn 0x1100305000 400.589595: xfsaild: last lsn 0x110032e600 count 156 stuck 101 flushing 30 tout 50 <wanted 50ms, got 460ms delay> 400.950341: xfsaild: target 0x1100353000, prev 0x110032e600, last lsn 0x0 400.950343: xfsaild: first lsn 0x1100317c00 400.950436: xfsaild: last lsn 0x110033d200 count 105 stuck 101 flushing 0 tout 20 <wanted 20ms, got 200ms delay> 401.142333: xfsaild: target 0x1100361600, prev 0x1100353000, last lsn 0x0 401.142334: xfsaild: first lsn 0x110032e600 401.142535: xfsaild: last lsn 0x1100353000 count 122 stuck 101 flushing 8 tout 10 <wanted 10ms, got 10ms delay> 401.154323: xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x1100353000 401.154328: xfsaild: first lsn 0x1100353000 401.154389: xfsaild: last lsn 0x1100353000 count 101 stuck 101 flushing 0 tout 20 <wanted 20ms, got 300ms delay> 401.451525: xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x0 401.451526: xfsaild: first lsn 0x1100353000 401.451804: xfsaild: last lsn 0x1100377200 count 170 stuck 22 flushing 122 tout 50 <wanted 50ms, got 500ms delay> 401.933581: xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x0 .... In each of these cases, every AIL pass saw 101 log items stuck on the AIL (pinned) with very few other items being found. Each pass, a log force was issued, and delay between last/first is the sleep time + the sync log force time. Some of these 101 items pinned the tail of the log. The tail of the log does slowly creep forward (first lsn), but the problem is that the log is actually out of reservation space because it's been running so many transactions that stale items that never reach the AIL but consume log space. Hence we have a largely empty AIL, with long term pins on items that pin the tail of the log that don't get pushed frequently enough to keep log space available. The problem is the hundreds of milliseconds that we block in the log force pushing the CIL out to disk. The AIL should not be stalled like this - it needs to run and flush items that are at the tail of the log with minimal latency. What we really need to do is trigger a log flush, but then not wait for it at all - we've already done our waiting for stuff to complete when we backed off prior to the log force being issued. Even if we remove the XFS_LOG_SYNC from the xfs_log_force() call, we still do a blocking flush of the CIL and that is what is causing the issue. Hence we need a new interface for the CIL to trigger an immediate background push of the CIL to get it moving faster but not to wait on that to occur. While the CIL is pushing, the AIL can also be pushing. We already have an internal interface to do this - xlog_cil_push_now() - but we need a wrapper for it to be used externally. xlog_cil_force_seq() can easily be extended to do what we need as it already implements the synchronous CIL push via xlog_cil_push_now(). Add the necessary flags and "push current sequence" semantics to xlog_cil_force_seq() and convert the AIL pushing to use it. One of the complexities here is that the CIL push does not guarantee that the commit record for the CIL checkpoint is written to disk. The current log force ensures this by submitting the current ACTIVE iclog that the commit record was written to. We need the CIL to actually write this commit record to disk for an async push to ensure that the checkpoint actually makes it to disk and unpins the pinned items in the checkpoint on completion. Hence we need to pass down to the CIL push that we are doing an async flush so that it can switch out the commit_iclog if necessary to get written to disk when the commit iclog is finally released. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Diffstat (limited to 'fs/xfs/xfs_log_priv.h')
-rw-r--r--fs/xfs/xfs_log_priv.h4
1 files changed, 4 insertions, 0 deletions
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 1ed299803904..014e0dc0ba97 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -277,6 +277,7 @@ struct xfs_cil {
spinlock_t xc_push_lock ____cacheline_aligned_in_smp;
xfs_csn_t xc_push_seq;
+ bool xc_push_commit_stable;
struct list_head xc_committing;
wait_queue_head_t xc_commit_wait;
wait_queue_head_t xc_start_wait;
@@ -520,6 +521,8 @@ int xlog_write(struct xlog *log, struct xfs_cil_ctx *ctx,
void xfs_log_ticket_ungrant(struct xlog *log, struct xlog_ticket *ticket);
void xfs_log_ticket_regrant(struct xlog *log, struct xlog_ticket *ticket);
+void xlog_state_switch_iclogs(struct xlog *log, struct xlog_in_core *iclog,
+ int eventual_size);
int xlog_state_release_iclog(struct xlog *log, struct xlog_in_core *iclog,
xfs_lsn_t log_tail_lsn);
@@ -594,6 +597,7 @@ void xlog_cil_set_ctx_write_state(struct xfs_cil_ctx *ctx,
/*
* CIL force routines
*/
+void xlog_cil_flush(struct xlog *log);
xfs_lsn_t xlog_cil_force_seq(struct xlog *log, xfs_csn_t sequence);
static inline void