summaryrefslogtreecommitdiffstats
path: root/fs/bcachefs/alloc_foreground.h
Commit message (Collapse)AuthorAgeFilesLines
* bcachefs: sb-members.cKent Overstreet2023-10-221-1/+1
| | | | | | | Split out a new file for bch_sb_field_members - we'll likely want to move more code here in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename enum alloc_reserve -> bch_watermarkKent Overstreet2023-10-221-4/+4
| | | | | | This is prep work for consolidating with JOURNAL_WATERMARK. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: New erasure coding shutdown pathKent Overstreet2023-10-221-5/+1
| | | | | | | | | | | | | | | | | | This implements a new shutdown path for erasure coding, which is needed for the upcoming BCH_WRITE_WAIT_FOR_EC write path. The process is: - Cancel new stripes being built up - Close out/cancel open buckets on write points or the partial list that are for stripes - Shutdown rebalance/copygc - Then wait for in flight new stripes to finish With BCH_WRITE_WAIT_FOR_EC, move ops will be waiting on stripes to fill up before they complete; the new ec shutdown path is needed for shutting down copygc/rebalance without deadlocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rework open bucket partial list allocationKent Overstreet2023-10-221-4/+4
| | | | | | | | | | | | Now, any open_bucket can go on the partial list: allocating from the partial list has been moved to its own dedicated function, open_bucket_add_bucets() -> bucket_alloc_set_partial(). In particular, this means that erasure coded buckets can safely go on the partial list; the new location works with the "allocate an ec bucket first, then the rest" logic. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_open_bucket_to_text()Kent Overstreet2023-10-221-0/+1
| | | | | | Factor out a common helper Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Erasure coding now uses bch2_bucket_alloc_transKent Overstreet2023-10-221-1/+1
| | | | | | | | This code predates plumbing btree_trans through the bucket allocation path: switching to it fixes a deadlock due to using multiple btree_trans at the same time, which we never want to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Better inlining in core write pathKent Overstreet2023-10-221-0/+49
| | | | | | | | | | | | | | | | Provide inline versions of some allocation functions - bch2_alloc_sectors_done_inlined() - bch2_alloc_sectors_append_ptrs_inlined() and use them in the core IO path. Also, inline bch2_extent_update_i_size_sectors() and bch2_bkey_append_ptr(). In the core write path, function call overhead matters - every function call is a jump to a new location and a potential cache miss. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill bch2_alloc_sectors_start()Kent Overstreet2023-10-221-9/+0
| | | | | | Only used in one place, just inline it there. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill allocator threads & freelistsKent Overstreet2023-10-221-0/+11
| | | | | | | | | | | | | | | Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Freespace, need_discard btreesKent Overstreet2023-10-221-0/+14
| | | | | | | | | | | This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Run btree updates after write out of write_pointKent Overstreet2023-10-221-8/+11
| | | | | | | | | | | | | | | | | | | | In the write path, after the write to the block device(s) complete we have to punt to process context to do the btree update. Instead of using the work item embedded in op->cl, this patch switches to a per write-point work item. This helps with two different issues: - lock contention: btree updates to the same writepoint will (usually) be updating the same alloc keys - context switch overhead: when we're bottlenecked on btree updates, having a thread (running out of a work item) checking the write point for completed ops is cheaper than queueing up a new work item and waking up a kworker. In an arbitrary benchmark, 4k random writes with fio running inside a VM, this patch resulted in a 10% improvement in total iops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: x-macroize alloc_reserve enumKent Overstreet2023-10-221-0/+2
| | | | | | This makes an array of strings available, like our other enums. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Put open_buckets in a hashtableKent Overstreet2023-10-221-0/+24
| | | | | | | | This is so that the copygc code doesn't have to refer to bucket_mark.owned_by_allocator - assisting in getting rid of the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Refactor open_bucket codeKent Overstreet2023-10-221-1/+4
| | | | | | | | | Prep work for adding a hash table of open buckets - instead of embedding a bch_extent_ptr, we need to refer to the bucket directly so that we're not calling sector_to_bucket() in the hash table lookup code, which has an expensive divide. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: bch2_alloc_sectors_append_ptrs() now takes cached flagKent Overstreet2023-10-221-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert bucket_alloc_ret to negative error codesKent Overstreet2023-10-221-9/+1
| | | | | | | | | | | Start a new header, errcode.h, for bcachefs-private error codes - more error codes will be converted later. This patch just converts bucket_alloc_ret so that they can be mixed with standard error codes and passed as ERR_PTR errors - the ec.c code was doing this already, but incorrectly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Don't let copygc buckets be stolen by other threadsKent Overstreet2023-10-221-7/+0
| | | | | | | And assorted other copygc fixes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Delete unused argumentsKent Overstreet2023-10-221-2/+1
| | | | | Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Don't restrict copygc writes to the same deviceKent Overstreet2023-10-221-4/+12
| | | | | | | | This no longer makes any sense, since copygc is now one thread per filesystem, not per device, with a single write point. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Refactor stripe creationKent Overstreet2023-10-221-0/+5
| | | | | | | | | | Prep work for the patch to update existing stripes with new data blocks. This moves allocating new stripes into ec.c, and also sets up the data structures so that we can handly only allocating some of the blocks in a stripe. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Drop unused arg to bch2_open_buckets_stop_dev()Kent Overstreet2023-10-221-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix some reserve calculationsKent Overstreet2023-10-221-0/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Make bkey types globally uniqueKent Overstreet2023-10-221-1/+1
| | | | | | | | | | this lets us get rid of a lot of extra switch statements - in a lot of places we dispatch on the btree node type, and then the key type, so this is a nice cleanup across a lot of code. Also improve the on disk format versioning stuff. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Erasure codingKent Overstreet2023-10-221-6/+25
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Scale down number of writepoints when low on spaceKent Overstreet2023-10-221-9/+2
| | | | | | | this means we don't have to reserve space for them when calculating filesystem capacity Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: fix missing includeKent Overstreet2023-10-221-0/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Allocation code refactoringKent Overstreet2023-10-221-21/+23
| | | | | | | | | | | | | | | | | | | | | bch2_alloc_sectors_start() was a nightmare to work with - it's got some tricky stuff to do, since it wants to use the buckets the writepoint already has, unless they're not in the target it wants to write to, unless it can't allocate from any other devices in which case it will use those buckets if it has to - et cetera. This restructures the code to start with a new empty list of open buckets we're going to use for the new allocation, pulling buckets from the write point's list as we decide that we really are going to use them - making the code somewhat more functional and drastically easier to understand. Also fixes a bug where we could end up waiting on c->freelist_wait (because allocating from one device failed) but return success from bch2_bucket_alloc(), because allocating from a different device succeeded. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Split out alloc_background.cKent Overstreet2023-10-221-0/+116
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>