author | Kent Overstreet <kent.overstreet@gmail.com> | 2020-11-14 09:59:58 -0500 |
---|---|---|
committer | Kent Overstreet <kent.overstreet@linux.dev> | 2023-10-22 17:08:49 -0400 |
commit | adbcada43fa79197224b5a522b1faaf222b43bcd | |
tree | 5df18388e19feb9d2f13f7d661ae9dea4d4f4782 /fs/bcachefs/journal_types.h | |
parent | b6df4325cd914d988e5b96016f64b879058d0bc6 | |
bcachefs: Don't require flush/fua on every journal write
This patch adds a flag to journal entries which, if set, indicates that
they weren't done as flush/fua writes.
- non-flush/fua journal writes don't update last_seq (i.e. they don't
  free up space in the journal), thus the journal free space
  calculations now check whether non-flush journal writes are currently
  allowed (i.e. are we low on free space, or would doing a flush write
  free up a lot of space in the journal)
- write_delay_ms, the user configurable option for when open journal
  entries are automatically written, is now interpreted as the max
  delay between flush journal writes (default 1 second).
- bch2_journal_flush_seq_async is changed to ensure a flush write >=
  the requested sequence number has happened
- journal read/replay must now ignore, and blacklist, any journal
  entries newer than the most recent flush entry in the journal. Also,
  the way the read_entire_journal option is handled has been improved;
  struct journal_replay now has a field, 'ignore', for entries that
  were read but should not be used.
- assorted refactoring and improvements related to journal read in
  journal_io.c and recovery.c
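The journal read rule in the list above can be illustrated with a minimal sketch. This is not the code in journal_io.c or recovery.c: the struct, the function name and the handling of a journal with no flush entry at all are assumptions made for illustration, and blacklisting is not modeled. Only the idea of a per-entry non-flush flag and an 'ignore' marker comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an entry found during journal read. */
struct sketch_entry {
	uint64_t seq;
	bool     noflush; /* entry was written without flush/fua */
	bool     ignore;  /* set here: entry was read but must not be used */
};

/*
 * Entries must be sorted by ascending seq. Marks every entry newer than
 * the most recent flush entry as ignored and returns how many were marked.
 * Assumption: if no flush entry exists at all, every entry is ignored.
 */
static size_t sketch_mark_ignored(struct sketch_entry *e, size_t nr)
{
	size_t last_flush = nr; /* nr means "no flush entry found" */
	size_t ignored = 0;

	assert(e != NULL || nr == 0);

	/* Scan backwards for the newest entry that was a flush write. */
	for (size_t i = nr; i-- > 0;) {
		if (!e[i].noflush) {
			last_flush = i;
			break;
		}
	}

	for (size_t i = 0; i < nr; i++) {
		e[i].ignore = (last_flush == nr) || (i > last_flush);
		ignored += e[i].ignore;
	}
	return ignored;
}
```

With entries 10 (flush), 11 (noflush), 12 (flush), 13 (noflush), 14 (noflush), this sketch keeps 10 through 12 and marks 13 and 14 as ignored, since nothing after 12 is known to have reached stable storage.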
Previously, we'd have to issue a flush/fua write every time we
accumulated a full journal entry - typically the bucket size. Now we
need to issue them much less frequently: when an fsync is requested, or
it's been more than write_delay_ms since the last flush, or when we need
to free up space in the journal. This is a significant performance
improvement on many write-heavy workloads.
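The three conditions in the paragraph above could be sketched roughly as follows. This is an illustration under stated assumptions, not the logic in the bcachefs journal code: the struct, its fields and the function name are hypothetical, times are plain milliseconds rather than jiffies, and the free space check is reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the journal state. */
struct sketch_journal {
	uint64_t write_delay_ms;      /* max delay between flush writes */
	uint64_t last_flush_write_ms; /* time of the last flush write */
	bool     low_on_space;        /* a flush write is needed to free space */
};

/* Returns true if the next journal write should be a flush/fua write. */
static bool sketch_write_needs_flush(const struct sketch_journal *j,
				     uint64_t now_ms, bool must_flush)
{
	assert(j != NULL);

	if (must_flush)		/* e.g. an fsync is waiting on this entry */
		return true;
	if (now_ms - j->last_flush_write_ms > j->write_delay_ms)
		return true;	/* too long since the last flush write */
	if (j->low_on_space)	/* only a flush write advances last_seq */
		return true;
	return false;
}
```

Every write for which this returns false can go out as a plain write, which is where the saving over flushing on every full journal entry comes from.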
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Diffstat (limited to 'fs/bcachefs/journal_types.h')
-rw-r--r-- | fs/bcachefs/journal_types.h | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/fs/bcachefs/journal_types.h b/fs/bcachefs/journal_types.h
index 6b525dc6ab7c..cf9675310f2b 100644
--- a/fs/bcachefs/journal_types.h
+++ b/fs/bcachefs/journal_types.h
@@ -29,6 +29,8 @@ struct journal_buf {
 	unsigned		disk_sectors;	/* maximum size entry could have been, if
 						   buf_size was bigger */
 	unsigned		u64s_reserved;
+	bool			noflush;	/* write has already been kicked off, and was noflush */
+	bool			must_flush;	/* something wants a flush */
 	/* bloom filter: */
 	unsigned long		has_inode[1024 / sizeof(unsigned long)];
 };
@@ -146,6 +148,7 @@ enum {
 	JOURNAL_RECLAIM_STARTED,
 	JOURNAL_NEED_WRITE,
 	JOURNAL_MAY_GET_UNRESERVED,
+	JOURNAL_MAY_SKIP_FLUSH,
 };
 
 /* Embedded in struct bch_fs */
@@ -203,6 +206,7 @@ struct journal {
 
 	/* seq, last_seq from the most recent journal entry successfully written */
 	u64			seq_ondisk;
+	u64			flushed_seq_ondisk;
 	u64			last_seq_ondisk;
 	u64			err_seq;
 	u64			last_empty_seq;
@@ -252,11 +256,15 @@ struct journal {
 
 	unsigned		write_delay_ms;
 	unsigned		reclaim_delay_ms;
+	unsigned long		last_flush_write;
 
 	u64			res_get_blocked_start;
 	u64			need_write_time;
 	u64			write_start_time;
 
+	u64			nr_flush_writes;
+	u64			nr_noflush_writes;
+
 	struct bch2_time_stats	*write_time;
 	struct bch2_time_stats	*delay_time;
 	struct bch2_time_stats	*blocked_time;