author | Dave Chinner <dchinner@redhat.com> | 2021-08-10 18:00:44 -0700 |
---|---|---|
committer | Darrick J. Wong <djwong@kernel.org> | 2021-08-16 12:09:30 -0700 |
commit | 68a74dcae6737c27b524b680e070fe41f0cad43a (patch) | |
tree | 80070d8327061b38c81e1cc7bb94eff08726184b /fs/xfs/xfs_log_priv.h | |
parent | caa80090d17c89d0caca1dcb4c8a9cdef5335e71 (diff) | |
xfs: order CIL checkpoint start records
Log recovery depends on strictly ordered start records as well as
strictly ordered commit records.
This is a zero-day bug in the way XFS writes pipelined transactions
to the journal. It is exposed by fixing the zero-day bug that
prevents the CIL from pipelining checkpoints. That fix re-introduces
explicit concurrent commits into the on-disk journal and hence
out-of-order start records.
The XFS journal commit code has never ordered start records and we
have relied on strict commit record ordering for correct recovery
ordering of concurrently written transactions. Unfortunately, root
cause analysis uncovered the fact that log recovery uses the LSN of
the start record for transaction commit processing. Hence, whilst
the commits are processed in strict order by recovery, the LSNs
associated with the commits can be out of order and so recovery may
stamp incorrect LSNs into objects and/or misorder intents in the AIL
for later processing. This can result in log recovery failures
and/or on-disk corruption, sometimes silent.
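
The failure mode described above can be illustrated with a small
userspace sketch. It is not kernel code: `struct ckpt` and
`start_lsns_monotonic()` are hypothetical names invented for this
example, and the LSNs are plain integers. The function takes
checkpoints in commit-record order (the order recovery processes them
in) and reports whether the start LSNs recovery would use ever go
backwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one checkpoint as recovery sees it. */
struct ckpt {
	uint64_t seq;        /* CIL checkpoint sequence number */
	uint64_t start_lsn;  /* LSN of the checkpoint's start record */
	uint64_t commit_lsn; /* LSN of the checkpoint's commit record */
};

/*
 * Checkpoints are passed in commit-record order. Returns true if the
 * start LSNs, which recovery uses for commit processing, never go
 * backwards from one checkpoint to the next.
 */
static bool start_lsns_monotonic(const struct ckpt *c, size_t n)
{
	assert(c != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		if (c[i].start_lsn < c[i - 1].start_lsn)
			return false;
	}
	return true;
}
```

With two pipelined checkpoints whose commit records are ordered but
whose start records are not (sequence 1 starting at LSN 200, sequence 2
at LSN 100), the check fails even though the commit LSNs ascend.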
Because this is a long-standing log recovery issue, we can't just
fix log recovery and call it good. This still leaves older kernels
susceptible to recovery failures and corruption when replaying a log
from a kernel that pipelines checkpoints. There is also the issue
that in-memory ordering for AIL pushing and data integrity
operations is based on checkpoint start LSNs, and if the start LSN
is incorrect in the journal, it is also incorrect in memory.
Hence there's really only one choice for fixing this zero-day bug:
we need to strictly order checkpoint start records in ascending
sequence order in the log, the same way we already strictly order
commit records.
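
As a rough illustration of that ordering rule, here is a minimal
userspace sketch using POSIX threads. It is an analogy, not the kernel
implementation: `struct cil_order`, `write_start_record()` and
`run_reverse_pushes()` are hypothetical names, and a pthread condition
variable stands in for a kernel wait queue such as `xc_start_wait`.
Each push blocks until the previous sequence has written its start
record, so start records reach the "journal" in ascending sequence
order no matter which thread runs first.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_CKPTS 8

/* Hypothetical userspace stand-in for the CIL's start-record ordering state. */
struct cil_order {
	pthread_mutex_t lock;
	pthread_cond_t  start_wait;     /* plays the role of xc_start_wait */
	uint64_t        last_start_seq; /* highest sequence already started */
	uint64_t        log[MAX_CKPTS]; /* order start records were "written" */
	int             nlogged;
};

static void cil_order_init(struct cil_order *co)
{
	pthread_mutex_init(&co->lock, NULL);
	pthread_cond_init(&co->start_wait, NULL);
	co->last_start_seq = 0;
	co->nlogged = 0;
}

/* Wait until every lower sequence has written its start record, then write ours. */
static void write_start_record(struct cil_order *co, uint64_t seq)
{
	pthread_mutex_lock(&co->lock);
	while (co->last_start_seq != seq - 1)
		pthread_cond_wait(&co->start_wait, &co->lock);

	assert(co->nlogged < MAX_CKPTS);
	co->log[co->nlogged++] = seq; /* stand-in for the real journal write */
	co->last_start_seq = seq;

	pthread_cond_broadcast(&co->start_wait); /* wake the waiter for seq + 1 */
	pthread_mutex_unlock(&co->lock);
}

struct push_arg {
	struct cil_order *co;
	uint64_t seq;
};

static void *push_worker(void *p)
{
	struct push_arg *a = p;

	write_start_record(a->co, a->seq);
	return NULL;
}

/*
 * Launch pushes for sequences n..1 in reverse order. Returns 1 if the
 * start records were still logged as 1..n. Error handling is kept
 * minimal for the sake of the sketch.
 */
static int run_reverse_pushes(struct cil_order *co, int n)
{
	pthread_t tid[MAX_CKPTS];
	struct push_arg arg[MAX_CKPTS];

	if (n < 0 || n > MAX_CKPTS)
		return 0;

	for (int i = 0; i < n; i++) {
		arg[i].co = co;
		arg[i].seq = (uint64_t)(n - i);
		if (pthread_create(&tid[i], NULL, push_worker, &arg[i]) != 0)
			return 0;
	}
	for (int i = 0; i < n; i++)
		pthread_join(tid[i], NULL);

	for (int i = 0; i < n; i++) {
		if (co->log[i] != (uint64_t)i + 1)
			return 0;
	}
	return co->nlogged == n;
}
```

The broadcast wakes all waiters and each rechecks its own predicate,
which keeps the sketch simple at the cost of spurious wakeups when many
checkpoints are in flight.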
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Diffstat (limited to 'fs/xfs/xfs_log_priv.h')
-rw-r--r-- | fs/xfs/xfs_log_priv.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index e0934e6aaf8a..1ed299803904 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -279,6 +279,7 @@ struct xfs_cil {
 	xfs_csn_t		xc_push_seq;
 	struct list_head	xc_committing;
 	wait_queue_head_t	xc_commit_wait;
+	wait_queue_head_t	xc_start_wait;
 	xfs_csn_t		xc_current_sequence;
 	struct work_struct	xc_push_work;
 	wait_queue_head_t	xc_push_wait;	/* background push throttle */