summaryrefslogtreecommitdiffstats
path: root/fs/xfs/xfs_log_priv.h
diff options
context:
space:
mode:
authorDave Chinner <dchinner@redhat.com>2021-08-10 18:00:44 -0700
committerDarrick J. Wong <djwong@kernel.org>2021-08-16 12:09:30 -0700
commit68a74dcae6737c27b524b680e070fe41f0cad43a (patch)
tree80070d8327061b38c81e1cc7bb94eff08726184b /fs/xfs/xfs_log_priv.h
parentcaa80090d17c89d0caca1dcb4c8a9cdef5335e71 (diff)
downloadlinux-stable-68a74dcae6737c27b524b680e070fe41f0cad43a.tar.gz
linux-stable-68a74dcae6737c27b524b680e070fe41f0cad43a.tar.bz2
linux-stable-68a74dcae6737c27b524b680e070fe41f0cad43a.zip
xfs: order CIL checkpoint start records
Because log recovery depends on strictly ordered start records as well as strictly ordered commit records. This is a zero day bug in the way XFS writes pipelined transactions to the journal which is exposed by fixing the zero day bug that prevents the CIL from pipelining checkpoints. This re-introduces explicit concurrent commits back into the on-disk journal and hence out of order start records. The XFS journal commit code has never ordered start records and we have relied on strict commit record ordering for correct recovery ordering of concurrently written transactions. Unfortunately, root cause analysis uncovered the fact that log recovery uses the LSN of the start record for transaction commit processing. Hence, whilst the commits are processed in strict order by recovery, the LSNs associated with the commits can be out of order and so recovery may stamp incorrect LSNs into objects and/or misorder intents in the AIL for later processing. This can result in log recovery failures and/or on disk corruption, sometimes silent. Because this is a long standing log recovery issue, we can't just fix log recovery and call it good. This still leaves older kernels susceptible to recovery failures and corruption when replaying a log from a kernel that pipelines checkpoints. There is also the issue that in-memory ordering for AIL pushing and data integrity operations are based on checkpoint start LSNs, and if the start LSN is incorrect in the journal, it is also incorrect in memory. Hence there's really only one choice for fixing this zero-day bug: we need to strictly order checkpoint start records in ascending sequence order in the log, the same way we already strictly order commit records. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Diffstat (limited to 'fs/xfs/xfs_log_priv.h')
-rw-r--r--fs/xfs/xfs_log_priv.h1
1 files changed, 1 insertions, 0 deletions
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index e0934e6aaf8a..1ed299803904 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -279,6 +279,7 @@ struct xfs_cil {
xfs_csn_t xc_push_seq;
struct list_head xc_committing;
wait_queue_head_t xc_commit_wait;
+ wait_queue_head_t xc_start_wait;
xfs_csn_t xc_current_sequence;
struct work_struct xc_push_work;
wait_queue_head_t xc_push_wait; /* background push throttle */