Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o: "In addition to bug fixes and cleanups, there are two new features for ext4 in 5.14: - Allow applications to poll on changes to /sys/fs/ext4/*/errors_count - Add the ioctl EXT4_IOC_CHECKPOINT which allows the journal to be checkpointed, truncated and discarded or zero'ed" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (32 commits) jbd2: export jbd2_journal_[un]register_shrinker() ext4: notify sysfs on errors_count value change fs: remove bdev_try_to_free_page callback ext4: remove bdev_try_to_free_page() callback jbd2: simplify journal_clean_one_cp_list() jbd2,ext4: add a shrinker to release checkpointed buffers jbd2: remove redundant buffer io error checks jbd2: don't abort the journal when freeing buffers jbd2: ensure abort the journal if detect IO error when writing original buffer back jbd2: remove the out label in __jbd2_journal_remove_checkpoint() ext4: no need to verify new add extent block jbd2: clean up misleading comments for jbd2_fc_release_bufs ext4: add check to prevent attempting to resize an fs with sparse_super2 ext4: consolidate checks for resize of bigalloc into ext4_resize_begin ext4: remove duplicate definition of ext4_xattr_ibody_inline_set() ext4: fsmap: fix the block/inode bitmap comment ext4: fix comment for s_hash_unsigned ext4: use local variable ei instead of EXT4_I() macro ext4: fix avefreec in find_group_orlov ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2021-06-30 19:37:39 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> 2021-06-30 19:37:39 -0700
commit: a6ecc2a491e378e00e65e59a006d4005e1c2f4af (patch)
tree: c252ded3b08909bbab2a7bde76b60a0b225d3c3a /Documentation/filesystems/ext4
parent: 2cfa582be80081fb8db02d4d9b44bff34b82ac54 (diff)
parent: 16aa4c9a1fbe763c147a964cdc1f5be8ed98ed13 (diff)
download: linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.tar.gz
linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.tar.bz2
linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.zip
1 files changed, 31 insertions, 8 deletions
diff --git a/Documentation/filesystems/ext4/journal.rst b/Documentation/filesystems/ext4/journal.rst
index cdbfec473167..5fad38860f17 100644
--- a/Documentation/filesystems/ext4/journal.rst
+++ b/Documentation/filesystems/ext4/journal.rst
@@ -4,14 +4,14 @@ Journal (jbd2)
 --------------
 
 Introduced in ext3, the ext4 filesystem employs a journal to protect the
-filesystem against corruption in the case of a system crash. A small
-continuous region of disk (default 128MiB) is reserved inside the
-filesystem as a place to land “important” data writes on-disk as quickly
-as possible. Once the important data transaction is fully written to the
-disk and flushed from the disk write cache, a record of the data being
-committed is also written to the journal. At some later point in time,
-the journal code writes the transactions to their final locations on
-disk (this could involve a lot of seeking or a lot of small
+filesystem against metadata inconsistencies in the case of a system crash. Up
+to 10,240,000 file system blocks (see man mke2fs(8) for more details on journal
+size limits) can be reserved inside the filesystem as a place to land
+“important” data writes on-disk as quickly as possible. Once the important
+data transaction is fully written to the disk and flushed from the disk write
+cache, a record of the data being committed is also written to the journal. At
+some later point in time, the journal code writes the transactions to their
+final locations on disk (this could involve a lot of seeking or a lot of small
 read-write-erases) before erasing the commit record. Should the system
 crash during the second slow write, the journal can be replayed all the
 way to the latest commit record, guaranteeing the atomicity of whatever
@@ -731,3 +731,26 @@ point, the refcount for inode 11 is not reliable, but that gets fixed by the
 replay of last inode 11 tag. Thus, by converting a non-idempotent procedure
 into a series of idempotent outcomes, fast commits ensured idempotence during
 the replay.
+
+Journal Checkpoint
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Checkpointing the journal ensures all transactions and their associated buffers
+are submitted to the disk. In-progress transactions are waited upon and included
+in the checkpoint. Checkpointing is used internally during critical updates to
+the filesystem including journal recovery, filesystem resizing, and freeing of
+the journal_t structure.
+
+A journal checkpoint can be triggered from userspace via the ioctl
+EXT4_IOC_CHECKPOINT. This ioctl takes a single, u64 argument for flags.
+Currently, three flags are supported. First, EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN
+can be used to verify input to the ioctl. It returns error if there is any
+invalid input, otherwise it returns success without performing
+any checkpointing. This can be used to check whether the ioctl exists on a
+system and to verify there are no issues with arguments or flags. The
+other two flags are EXT4_IOC_CHECKPOINT_FLAG_DISCARD and
+EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT. These flags cause the journal blocks to be
+discarded or zero-filled, respectively, after the journal checkpoint is
+complete. EXT4_IOC_CHECKPOINT_FLAG_DISCARD and EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT
+cannot both be set. The ioctl may be useful when snapshotting a system or for
+complying with content deletion SLOs.
author	Linus Torvalds <torvalds@linux-foundation.org>	2021-06-30 19:37:39 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2021-06-30 19:37:39 -0700
commit	a6ecc2a491e378e00e65e59a006d4005e1c2f4af (patch)
tree	c252ded3b08909bbab2a7bde76b60a0b225d3c3a /Documentation/filesystems/ext4
parent	2cfa582be80081fb8db02d4d9b44bff34b82ac54 (diff)
parent	16aa4c9a1fbe763c147a964cdc1f5be8ed98ed13 (diff)
download	linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.tar.gz linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.tar.bz2 linux-a6ecc2a491e378e00e65e59a006d4005e1c2f4af.zip