linux.git - Linux kernel mainline tree

	Commit message	Author	Age	Files	Lines
*	bcachefs: bch_fs_usage_base	Kent Overstreet	2024-01-21	1	-10/+5
	Split out base filesystem usage into its own type; prep work for breaking up bch2_trans_fs_usage_apply(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Kill dev_usage->buckets_ec	Kent Overstreet	2024-01-01	1	-2/+0
	This counter is redundant; it's simply the sum of BCH_DATA_stripe and BCH_DATA_parity buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Fragmentation LRU	Kent Overstreet	2023-10-22	1	-11/+0
	Now that we have much more efficient updates to the LRU btree, this patch adds a new LRU that indexes buckets by fragmentation. This means copygc no longer has to scan every bucket to find buckets that need to be evacuated. Changes: - A new field in bch_alloc_v4, fragmentation_lru - this corresponds to the bucket's position in the fragmentation LRU. We add a new field for this instead of calculating it as needed because we may make the fragmentation LRU optional; this field indicates whether a bucket is on the fragmentation LRU. Also, zoned devices will introduce variable bucket sizes; explicitly recording the LRU position will be safer for them. - A new copygc path for using the fragmentation LRU instead of scanning every bucket and building up an in-memory heap. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
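The idea of indexing buckets by fragmentation can be illustrated with a small sketch. This is not the kernel code: the class name, the choice of "live sectors" as the ordering key, and the convention that position 0 means "not on the index" are assumptions made for illustration. Only the field name fragmentation_lru and the goal (no full scan) come from the commit message.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class FragmentationIndex:
    """Hypothetical ordered index of buckets, emptiest first."""
    bucket_size: int
    _keys: list = field(default_factory=list)      # sorted (position, bucket)
    _position: dict = field(default_factory=dict)  # bucket -> position, 0 = not indexed

    def update(self, bucket: int, live_sectors: int) -> None:
        old = self._position.get(bucket, 0)
        if old:
            # Drop the stale entry; the recorded position tells us where it is.
            del self._keys[bisect.bisect_left(self._keys, (old, bucket))]
        # Assumption: empty and completely full buckets are not indexed.
        new = live_sectors if 0 < live_sectors < self.bucket_size else 0
        self._position[bucket] = new
        if new:
            bisect.insort(self._keys, (new, bucket))

    def most_fragmented(self, n: int) -> list:
        """Return up to n evacuation candidates without scanning every bucket."""
        return [bucket for _, bucket in self._keys[:n]]


idx = FragmentationIndex(bucket_size=128)
for bucket, live in [(1, 100), (2, 10), (3, 128), (4, 0), (5, 50)]:
    idx.update(bucket, live)
candidates = idx.most_fragmented(2)
```

Recording the position per bucket, rather than recomputing it, is what lets the update path find and remove the old entry directly.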
*	bcachefs: Copygc now uses backpointers	Kent Overstreet	2023-10-22	1	-1/+1
	Previously, copygc needed to walk the entire extents & reflink btrees to find extents that needed to be moved. Now that we have backpointers, this patch implements bch2_evacuate_bucket() in the move code, which copygc now uses for evacuating mostly empty buckets. Also, thanks to the new backpointers code, copygc can now move btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Fold bucket_state in to BCH_DATA_TYPES()	Kent Overstreet	2023-10-22	1	-1/+0
	Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be counted when checking the current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
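A rough sketch of why these states need their own counters: if every bucket is counted under exactly one data type, the allocator can compare the free count alone against its watermark. The enum values other than 0, the function name, and the `reserve` parameter are assumptions for illustration, not the on-disk definitions.

```python
from collections import Counter
from enum import Enum


class DataType(Enum):
    FREE = 0          # from the commit message: data type 0 is the free state
    NEED_GC_GENS = 1  # remaining values are illustrative only
    NEED_DISCARD = 2
    USER = 3
    CACHED = 4


def can_allocate(counts: Counter, reserve: int) -> bool:
    """Only buckets that are actually free count toward the watermark.

    Buckets in NEED_GC_GENS or NEED_DISCARD hold no data, but they still
    need other work before reuse, so they are deliberately excluded.
    """
    return counts[DataType.FREE] > reserve


counts = Counter({
    DataType.FREE: 10,
    DataType.NEED_DISCARD: 50,
    DataType.NEED_GC_GENS: 5,
    DataType.USER: 100,
})
```

Here 65 buckets hold no live data, yet only 10 are usable right now, which is the distinction the commit message describes.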
*	bcachefs: Kill struct bucket_mark	Kent Overstreet	2023-10-22	1	-22/+8
	This switches struct bucket to using a lock, instead of cmpxchg. And now that the protected members no longer need to fit into a u64, we can expand the sector counts to 32 bits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Kill main in-memory bucket array	Kent Overstreet	2023-10-22	1	-1/+0
	All code using the in-memory bucket array, excluding GC, has now been converted to use the alloc btree directly - so we can finally delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Kill allocator threads & freelists	Kent Overstreet	2023-10-22	1	-2/+0
	Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: KEY_TYPE_alloc_v4	Kent Overstreet	2023-10-22	1	-1/+1
	This introduces a new alloc key which doesn't use varints. Soon we'll be adding backpointers and storing them in alloc keys, which means our pack/unpack workflow for alloc keys won't really work - we'll need to be mutating alloc keys in place. Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that converts older types of alloc keys to v4 if needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: bch2_gc_gens() no longer uses bucket array	Kent Overstreet	2023-10-22	1	-1/+0
	Like the previous patches, this converts bch2_gc_gens() to use the alloc btree directly, and private arrays of generation numbers for its own recalculation of oldest_gen. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
*	bcachefs: New data structure for buckets waiting on journal commit	Kent Overstreet	2023-10-22	1	-9/+0
	Implement a hash table, using cuckoo hashing, for empty buckets that are waiting on a journal commit before they can be reused. This replaces the journal_seq field of bucket_mark, and is part of eventually getting rid of the in memory bucket array. We may need to make bch2_bucket_needs_journal_commit() lockless, pending profiling and testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
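For readers unfamiliar with cuckoo hashing, here is a minimal sketch of the technique applied to this problem: each bucket has two candidate slots, lookups probe only those two, and inserts evict an occupant to its alternate slot when both are taken. The class, the hash constants, the kick limit, and the grow-by-doubling policy are all assumptions for illustration; the kernel implementation will differ.

```python
class BucketsWaitingForJournal:
    """Hypothetical cuckoo hash table: bucket -> journal sequence number."""

    def __init__(self, size=8, max_kicks=16):
        self.size = size
        self.max_kicks = max_kicks
        self.slots = [None] * size  # each slot is (bucket, journal_seq) or None

    def _positions(self, bucket):
        # Two independent, deterministic hash functions (constants are arbitrary).
        h1 = (bucket * 0x9E3779B1) & 0xFFFFFFFF
        h2 = ((bucket ^ 0xDEADBEEF) * 0x85EBCA6B) & 0xFFFFFFFF
        return (h1 >> 16) % self.size, (h2 >> 16) % self.size

    def needs_journal_commit(self, bucket, flushed_seq):
        """True if the bucket was emptied by a journal entry not yet flushed."""
        for pos in self._positions(bucket):
            slot = self.slots[pos]
            if slot is not None and slot[0] == bucket:
                return slot[1] > flushed_seq
        return False

    def set(self, bucket, journal_seq, flushed_seq=0):
        for pos in self._positions(bucket):
            slot = self.slots[pos]
            if slot is not None and slot[0] == bucket:
                self.slots[pos] = (bucket, journal_seq)  # update in place
                return
        self._insert((bucket, journal_seq), flushed_seq)

    def _insert(self, entry, flushed_seq):
        pos = None
        for _ in range(self.max_kicks):
            p1, p2 = self._positions(entry[0])
            for p in (p1, p2):
                slot = self.slots[p]
                # Empty slots and already-flushed entries can be reused.
                if slot is None or slot[1] <= flushed_seq:
                    self.slots[p] = entry
                    return
            # Both slots busy: evict one occupant and retry with the evictee,
            # steering it away from the slot it was just removed from.
            pos = p2 if pos == p1 else p1
            entry, self.slots[pos] = self.slots[pos], entry
        # Too many evictions: double the table and reinsert everything.
        pending = [s for s in self.slots if s is not None] + [entry]
        self.size *= 2
        self.slots = [None] * self.size
        for e in pending:
            self._insert(e, flushed_seq)


table = BucketsWaitingForJournal()
for i in range(100):
    table.set(1000 + i, journal_seq=i + 1)
```

The appeal of this structure is that a lookup touches at most two slots, which keeps the "can this bucket be reused yet?" check cheap.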
*	bcachefs: New in-memory array for bucket gens	Kent Overstreet	2023-10-22	1	-0/+7
	The main in-memory bucket array is going away, but we'll still need to keep bucket generations in memory, at least for now - ptr_stale() needs to be an efficient operation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
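A compact per-bucket generation array makes the staleness check a single array lookup. The sketch below assumes 8-bit generation numbers compared with wraparound, and the class and method names are hypothetical; it shows the general technique rather than the kernel's ptr_stale().

```python
def gen_cmp(a: int, b: int) -> int:
    """Signed 8-bit difference a - b, so comparison survives wraparound."""
    d = (a - b) & 0xFF
    return d - 256 if d >= 128 else d


class BucketGens:
    """Hypothetical in-memory array holding one generation byte per bucket."""

    def __init__(self, nbuckets: int):
        self.gens = bytearray(nbuckets)

    def invalidate(self, bucket: int) -> None:
        # Bumping the generation makes every existing pointer into the bucket stale.
        self.gens[bucket] = (self.gens[bucket] + 1) & 0xFF

    def ptr_stale(self, bucket: int, ptr_gen: int) -> bool:
        return gen_cmp(self.gens[bucket], ptr_gen) > 0


gens = BucketGens(4)
ptr = (2, gens.gens[2])   # (bucket, generation captured when the pointer was made)
fresh_before = gens.ptr_stale(*ptr)
gens.invalidate(2)
stale_after = gens.ptr_stale(*ptr)
```

One byte per bucket is far smaller than a full per-bucket struct, which is why the array can stay in memory after the main bucket array is removed.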
*	bcachefs: Fix bch2_trans_mark_dev_sb()	Kent Overstreet	2023-10-22	1	-0/+5
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Kill bch2_fs_usage_scratch_get()	Kent Overstreet	2023-10-22	1	-16/+0
	This is an important cleanup, eliminating an unnecessary copy in the transaction commit path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Persist 64 bit io clocks	Kent Overstreet	2023-10-22	1	-1/+1
	Originally, bcachefs - going back to bcache - stored, for each bucket, a 16 bit counter corresponding to how long it had been since the bucket was read from. But, this required periodically rescaling counters on every bucket to avoid wraparound. That wasn't an issue in bcache, where we'd periodically rewrite the per bucket metadata all at once, but in bcachefs we're trying to avoid having to walk every single bucket. This patch switches to persisting 64 bit io clocks, corresponding to the 64 bit bucket timestamps introduced in the previous patch with KEY_TYPE_alloc_v2. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
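The cost difference between the two schemes can be sketched as follows. The halving-based rescale is only one plausible way to keep 16 bit values in range and is an assumption here, as are the class names; the point is that the narrow clock forces a pass over every bucket, while the 64 bit clock never does.

```python
MAX16 = 0xFFFF


class Clock16:
    """Hypothetical 16 bit scheme: values must be rescaled to avoid wraparound."""

    def __init__(self, nbuckets: int):
        self.now = 0
        self.last_read = [0] * nbuckets
        self.buckets_rewritten = 0

    def advance(self, ticks: int) -> None:
        # Assumes ticks <= MAX16 // 2, so a single halving always makes room.
        if self.now + ticks > MAX16:
            self.now //= 2
            self.last_read = [t // 2 for t in self.last_read]  # walks every bucket
            self.buckets_rewritten += len(self.last_read)
        self.now += ticks

    def touch(self, bucket: int) -> None:
        self.last_read[bucket] = self.now


class Clock64:
    """64 bit scheme: timestamps are written once and never rescaled."""

    def __init__(self):
        self.now = 0
        self.last_read = {}

    def advance(self, ticks: int) -> None:
        self.now += ticks

    def touch(self, bucket: int) -> None:
        self.last_read[bucket] = self.now

    def age(self, bucket: int) -> int:
        return self.now - self.last_read[bucket]


c16, c64 = Clock16(nbuckets=1000), Clock64()
c16.touch(0)
c64.touch(0)
for _ in range(10):
    c16.advance(30000)
    c64.advance(30000)
```

In this run the 16 bit clock rewrites bucket metadata on most advances, while the 64 bit clock keeps an exact age with no per-bucket work.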
*	bcachefs: KEY_TYPE_alloc_v2	Kent Overstreet	2023-10-22	1	-1/+2
	This introduces a new version of KEY_TYPE_alloc, which uses the new varint encoding introduced for inodes. This means we'll eventually be able to support much larger bucket sizes (for SMR devices), and the read/write time fields are expanded to 64 bits - which will be used in the next patch to get rid of the periodic rescaling of those fields. Also, for buckets that are members of erasure coded stripes, this adds persistent fields for the index of the stripe they're members of and the stripe redundancy. This is part of work to get rid of having to scan and read into memory the alloc and stripes btrees at mount time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
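Why a variable-length integer encoding allows wide fields without bloating every key can be shown with a generic example. The sketch below uses LEB128-style varints purely as a stand-in: it is not the bcachefs on-disk encoding, and the field names in the sample record are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Generic LEB128-style encoding: 7 payload bits per byte, high bit = more."""
    if value < 0:
        raise ValueError("only unsigned values are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Hypothetical record: small values stay small, large ones still fit.
fields = {"read_time": 5, "write_time": 2**40, "dirty_sectors": 300}
packed = b"".join(encode_varint(v) for v in fields.values())

unpacked, pos = [], 0
for _ in fields:
    value, pos = decode_varint(packed, pos)
    unpacked.append(value)
```

The trade-off, noted in the later KEY_TYPE_alloc_v4 entry, is that packed fields cannot be updated in place, since a changed value may need a different number of bytes.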
*	bcachefs: Refactor dev usage	Kent Overstreet	2023-10-22	1	-7/+6
	This is to make it more amenable for serialization. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Don't drop replicas when copygcing ec data	Kent Overstreet	2023-10-22	1	-0/+2
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Change copygc to consider bucket fragmentation	Kent Overstreet	2023-10-22	1	-0/+1
	When devices have different sized buckets this is more correct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Make copygc thread global	Kent Overstreet	2023-10-22	1	-0/+1
	Per device copygc threads don't move data to different devices and they make fragmentation worse - they don't make much sense anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Track sectors of erasure coded data	Kent Overstreet	2023-10-22	1	-1/+3
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Implement a new gc that only recalcs oldest gen	Kent Overstreet	2023-10-22	1	-0/+1
	Full mark and sweep gc doesn't (yet?) work with the new btree key cache code, but it also blocks updates to interior btree nodes for the duration and isn't really necessary in practice; we aren't currently attempting to repair errors in allocation info at runtime. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Make replicas_delta_list smaller	Kent Overstreet	2023-10-22	1	-1/+5
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Refactor bch2_alloc_write()	Kent Overstreet	2023-10-22	1	-1/+0
	Major simplification - gets rid of the need for marking buckets as dirty, instead we write buckets if the in memory mark is different from what's in the btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Don't use a fixed size buffer for fs_usage_deltas	Kent Overstreet	2023-10-22	1	-3/+2
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: bch2_trans_mark_update()	Kent Overstreet	2023-10-22	1	-0/+13
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Write out fs usage consistently	Kent Overstreet	2023-10-22	1	-7/+5
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: add ability to run gc on metadata only	Kent Overstreet	2023-10-22	1	-0/+1
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Add a mechanism for blocking the journal	Kent Overstreet	2023-10-22	1	-15/+13
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Fix oldest_gen handling	Kent Overstreet	2023-10-22	1	-0/+1
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Fix check for if extent update is allocating	Kent Overstreet	2023-10-22	1	-6/+8
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Include summarized counts in fs_usage	Kent Overstreet	2023-10-22	1	-4/+15
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: refactor bch_fs_usage	Kent Overstreet	2023-10-22	1	-5/+9
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: move dirty into bucket_mark	Kent Overstreet	2023-10-22	1	-1/+1
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Add new alloc fields	Kent Overstreet	2023-10-22	1	-20/+18
	Prep work for persistent alloc info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Track nr_inodes with the key marking machinery	Kent Overstreet	2023-10-22	1	-0/+2
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: gc now operates on second set of bucket marks	Kent Overstreet	2023-10-22	1	-2/+4
	This means we can now use gc to verify the allocation information - important for testing persistent alloc info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Erasure coding	Kent Overstreet	2023-10-22	1	-1/+4
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Change how replicated data is accounted	Kent Overstreet	2023-10-22	1	-1/+0
	Due to compression, the different replicas of a replicated extent don't necessarily have to take up the same amount of space - so replicated data sector counts shouldn't be stored divided by the number of replicas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
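A small arithmetic sketch shows how a divided count loses information once replicas differ in size. How the earlier code actually stored and reconstructed the count is not stated in the log, so the "divide, then multiply back" model below is an assumption, as are the function names.

```python
def account_divided(replica_sectors: list) -> int:
    """Assumed older style: store total // nr_replicas, multiply back on read."""
    n = len(replica_sectors)
    stored = sum(replica_sectors) // n
    return stored * n


def account_per_replica(replica_sectors: list) -> int:
    """Count the sectors each replica really occupies on disk."""
    return sum(replica_sectors)


# Two replicas of one extent, assumed here to occupy 8 and 5 sectors on disk.
uneven = [8, 5]
# Two replicas that occupy the same space, where both schemes agree.
even = [8, 8]
```

With equal-sized replicas the two approaches match, which is why the divided form only breaks down once replicas of the same extent can differ in on-disk size.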
*	bcachefs: Account for internal fragmentation better	Kent Overstreet	2023-10-22	1	-1/+3
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: kill s_alloc, use bch_data_type	Kent Overstreet	2023-10-22	1	-8/+2
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: don't call bch2_bucket_seq_cleanup from journal_buf_switch	Kent Overstreet	2023-10-22	1	-0/+2
	journal_buf_switch is called from the foreground when getting a journal reservation and thus is somewhat latency sensitive; bch2_bucket_seq_cleanup has to run infrequently but is a bit expensive when it does run. Call it from the journal write path instead, and punt the journal write to workqueue context. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
*	bcachefs: Initial commit	Kent Overstreet	2023-10-22	1	-0/+96
	Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>