path: root/fs/bcachefs/move.c
Commit message | Author | Age | Files | Lines
* bcachefs: Fix shift-by-64 in bformat_needs_redo() | Kent Overstreet | 2024-05-06 | 1 | -8/+14
  Ancient versions of bcachefs produced packed formats that could represent keys that our in memory format cannot represent; bformat_needs_redo() has some tricky shifts to check for this sort of overflow.
  Reported-by: syzbot+594427aebfefeebe91c6@syzkaller.appspotmail.com
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
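The kind of overflow check described above can be sketched without ever shifting by 64, which is undefined behaviour in C for a 64-bit operand. This is a minimal illustration, not the code in bformat_needs_redo(); `field_max()` and `packed_field_overflows()` are hypothetical names, and the model of a packed field as "offset plus an N-bit value" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest value that fits in `bits` bits; never evaluates 1 << 64. */
static uint64_t field_max(unsigned bits)
{
	assert(bits <= 64);
	if (bits == 64)
		return UINT64_MAX;
	return (UINT64_C(1) << bits) - 1;	/* bits == 0 yields 0 */
}

/*
 * Hypothetical check: can a packed field (field offset plus a
 * packed_bits-wide value) encode something that does not fit in an
 * unpacked field of unpacked_bits bits?
 */
static bool packed_field_overflows(unsigned packed_bits, uint64_t offset,
				   unsigned unpacked_bits)
{
	uint64_t packed_max   = field_max(packed_bits);
	uint64_t unpacked_max = field_max(unpacked_bits);

	if (packed_max > unpacked_max)
		return true;
	/* Rearranged so the comparison itself cannot wrap around. */
	return offset > unpacked_max - packed_max;
}
```

The special case for `bits == 64` is the whole point: the naive `(1ULL << bits) - 1` is only defined for shift counts below the operand width.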
* bcachefs: opts->compression can now also be applied in the background | Kent Overstreet | 2024-01-21 | 1 | -1/+1
  The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
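The fallback rule in this message can be sketched as a one-line helper. The struct and function names below are hypothetical, and treating 0 as "not set" is an assumption made for the illustration.

```c
#include <assert.h>

/* Hypothetical per-extent IO options; 0 means "not set" in this sketch. */
struct sample_io_opts {
	unsigned compression;
	unsigned background_compression;
};

/* Use background_compression when set, otherwise fall back to compression. */
static unsigned sample_background_compression(struct sample_io_opts opts)
{
	return opts.background_compression
		? opts.background_compression
		: opts.compression;
}
```

With this rule, changing only `compression` changes what the background paths apply, which is why existing data gets recompressed.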
* bcachefs: Prep work for variable size btree node buffers | Kent Overstreet | 2024-01-21 | 1 | -4/+6
  bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots.
  And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analogous to XFS's AGIs.
  Thus, we need to start allocating smaller btree node buffers when we can.
  This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve move_extent tracepoint | Kent Overstreet | 2024-01-21 | 1 | -2/+38
  Also print out the data_opts, so that we can see what specifically is being done to an extent.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Re-add move_extent_write tracepoint | Kent Overstreet | 2024-01-21 | 1 | -0/+9
  It appears this was accidentally deleted at some point - also, do a bit of cleanup.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: helpers for printing data types | Kent Overstreet | 2024-01-21 | 1 | -3/+3
  We need bounds checking since new versions may introduce new data types.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bkey_for_each_ptr() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -4/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_btree_key() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -3/+0
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: darray_for_each() now declares loop iter | Kent Overstreet | 2024-01-01 | 1 | -3/+1
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch_err_(fn|msg) check if should print | Kent Overstreet | 2024-01-01 | 1 | -3/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename for_each_btree_key2() -> for_each_btree_key() | Kent Overstreet | 2024-01-01 | 1 | -2/+2
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill for_each_btree_key() | Kent Overstreet | 2024-01-01 | 1 | -7/+4
  for_each_btree_key() handles transaction restarts, like for_each_btree_key2(), but only calls bch2_trans_begin() after a transaction restart - for_each_btree_key2() wraps every loop iteration in a transaction.
  The for_each_btree_key() behaviour is problematic when it leads to holding the SRCU lock that prevents key cache reclaim for an unbounded amount of time - there's no real need to keep it around.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
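The "wrap every loop iteration in a transaction" pattern that this message keeps can be modelled in a few lines. This is a toy simulation under stated assumptions, not the bcachefs macro: `struct sample_txn`, `sample_txn_begin()`, `sample_process_key()` and `sample_for_each_key()` are hypothetical, and restarts are injected by a counter rather than caused by lock contention.

```c
#include <assert.h>
#include <stddef.h>

enum { SAMPLE_TXN_RESTART = 1 };

/* Toy transaction: counts begin calls and injects a number of restarts. */
struct sample_txn {
	unsigned begins;
	unsigned injected_restarts;
};

static void sample_txn_begin(struct sample_txn *t)
{
	t->begins++;
}

/* Per-key work; fails with a restart while injected restarts remain. */
static int sample_process_key(struct sample_txn *t, int key, long *sum)
{
	if (t->injected_restarts) {
		t->injected_restarts--;
		return SAMPLE_TXN_RESTART;
	}
	*sum += key;
	return 0;
}

/* Begin a transaction on every iteration; retry the same key on restart. */
static int sample_for_each_key(struct sample_txn *t, const int *keys,
			       size_t nr, long *sum)
{
	size_t i = 0;

	while (i < nr) {
		sample_txn_begin(t);
		int ret = sample_process_key(t, keys[i], sum);
		if (ret == SAMPLE_TXN_RESTART)
			continue;	/* same key, fresh transaction */
		if (ret)
			return ret;
		i++;
	}
	return 0;
}
```

Because begin is called once per iteration rather than only after a restart, nothing acquired at begin time is held across an unbounded number of keys.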
* bcachefs: rebalance should wakeup on shutdown if disabled | Daniel Hill | 2024-01-01 | 1 | -1/+1
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: remove dead bch2_evacuate_bucket() | Daniel Hill | 2024-01-01 | 1 | -19/+1
  Signed-off-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Replace zero-length arrays with flexible-array members | Gustavo A. R. Silva | 2024-01-01 | 1 | -1/+1
  Fake flexible arrays (zero-length and one-element arrays) are deprecated, and should be replaced by flexible-array members. So, replace zero-length arrays with flexible-array members in multiple structures.
  Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: count_event() | Kent Overstreet | 2024-01-01 | 1 | -1/+2
  Small helper for event counters.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush() | Kent Overstreet | 2024-01-01 | 1 | -4/+3
  More accurate naming.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: New bucket sector count helpers | Kent Overstreet | 2024-01-01 | 1 | -1/+1
  This introduces bch2_bucket_sectors() and bch2_bucket_sectors_dirty(), prep work for separately accounting stripe sectors.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: BCH_DATA_OP_drop_extra_replicas | Kent Overstreet | 2024-01-01 | 1 | -6/+44
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert bch2_move_btree() to bbpos | Kent Overstreet | 2024-01-01 | 1 | -25/+19
  Minor cleanup.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: x-macro-ify bch_data_ops enum | Kent Overstreet | 2024-01-01 | 1 | -12/+17
  This will let us add an enum -> string table for a to_text() fn.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
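The x-macro technique named in this subject generates an enum and its string table from one list, so the two cannot drift apart. The sketch below is illustrative only: the `SAMPLE_DATA_OPS` list, its entries and numbering, and `sample_data_op_str()` are assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <string.h>

/* Single source of truth: one x(name, number) entry per operation. */
#define SAMPLE_DATA_OPS()	\
	x(scrub,	0)	\
	x(rereplicate,	1)	\
	x(migrate,	2)

/* Expand the list into enum constants. */
enum sample_data_op {
#define x(name, nr) SAMPLE_DATA_OP_##name = nr,
	SAMPLE_DATA_OPS()
#undef x
	SAMPLE_DATA_OP_NR
};

/* Expand the same list into an enum -> string table. */
static const char * const sample_data_op_names[] = {
#define x(name, nr) [SAMPLE_DATA_OP_##name] = #name,
	SAMPLE_DATA_OPS()
#undef x
};

/* Bounds-checked lookup, so values outside the list do not index the table. */
static const char *sample_data_op_str(unsigned op)
{
	return op < SAMPLE_DATA_OP_NR ? sample_data_op_names[op] : "(unknown)";
}
```

Adding an operation then means adding one `x()` line; both the enum and the table pick it up.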
* bcachefs: Extra kthread_should_stop() calls for copygc | Kent Overstreet | 2023-11-28 | 1 | -3/+9
  This fixes a bug where going read-only was taking longer than it should have due to copygc forgetting to check kthread_should_stop().
  Additionally: fix a missing is_kthread check in bch2_move_ratelimit().
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: -EROFS doesn't count as move_extent_start_fail | Kent Overstreet | 2023-11-28 | 1 | -0/+4
  The automated tests check if we've hit too many slowpath/error path events and fail the test - if we're just shutting down, that naturally shouldn't count.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: trace_move_extent_start_fail() now includes errcode | Kent Overstreet | 2023-11-28 | 1 | -13/+10
  Renamed from trace_move_extent_alloc_mem_fail, because there are other reasons we could fail (disk space allocation failure).
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Data update path won't accidentaly grow replicas | Kent Overstreet | 2023-11-25 | 1 | -53/+5
  Previously, there was a bug where if an extent had greater durability than required (because we needed to move a durability=1 pointer and ended up putting it on a durability 2 device), we would submit a write for replicas=2 - the durability of the pointer being rewritten - instead of the number of replicas required to bring it back up to the data_replicas option.
  This, plus the allocation path sometimes allocating on a greater durability device than requested, meant that extents could continue having more and more replicas added as they were being rewritten.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
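The arithmetic behind this fix can be illustrated with a small helper. This is a sketch of the idea as the message describes it, not the code in the data update path; `sample_replicas_to_write()` and its parameters are hypothetical, and it assumes durability simply adds up across pointers.

```c
#include <assert.h>

/*
 * How many replicas to request when rewriting one pointer of an extent:
 * enough to bring the extent back up to data_replicas, rather than the
 * durability of the pointer being rewritten.
 */
static unsigned sample_replicas_to_write(unsigned data_replicas,
					 unsigned extent_durability,
					 unsigned rewritten_ptr_durability)
{
	assert(rewritten_ptr_durability <= extent_durability);

	/* Durability left once the rewritten pointer is dropped. */
	unsigned remaining = extent_durability - rewritten_ptr_durability;

	return remaining >= data_replicas ? 0 : data_replicas - remaining;
}
```

Take data_replicas=2 and an extent with one durability=1 pointer and one durability=2 pointer. Rewriting the durability=2 pointer leaves durability 1, so one more replica is needed; asking for 2 (the rewritten pointer's durability) is how the extent kept growing.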
* bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops | Kent Overstreet | 2023-11-24 | 1 | -13/+4
  This adds move_ctxt_wait_event_timeout(), which can sleep for a timeout while also issuing pending moves as reads complete.
  Co-developed-by: Daniel Hill <daniel@gluo.nz>
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_moving_ctxt_flush_all() | Kent Overstreet | 2023-11-24 | 1 | -5/+11
  Introduce a new helper to flush all move IOs, and use it in a few places where we should have been.
  The new helper also drops btree locks before waiting on outstanding move writes, avoiding potential deadlocks.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Data move path now uses bch2_trans_unlock_long() | Kent Overstreet | 2023-11-04 | 1 | -5/+8
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: move: move_stats refactoring | Kent Overstreet | 2023-10-31 | 1 | -45/+53
  data_progress_list is gone - it was redundant with moving_context_list.
  The upcoming rebalance rewrite is going to have it using two different move_stats objects with the same moving_context, depending on whether it's scanning or using the rebalance_work btree - this patch plumbs stats around a bit differently so that will work.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: move: convert to bbpos | Kent Overstreet | 2023-10-31 | 1 | -11/+8
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: moving_context now owns a btree_trans | Kent Overstreet | 2023-10-31 | 1 | -51/+42
  btree_trans and moving_context are used together, and having the moving_context own the transaction object reduces some plumbing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: move.c exports, refactoring | Kent Overstreet | 2023-10-31 | 1 | -55/+64
  Prep work for the new rebalance code - we need a few helpers exported.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve io option handling in data move path | Kent Overstreet | 2023-10-31 | 1 | -50/+81
  The data move path now correctly picks IO options when inodes in different snapshots have different options applied.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_id_str() | Kent Overstreet | 2023-10-31 | 1 | -1/+1
  Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: More minor smatch fixes | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  - fix a few uninitialized return values
  - return a proper error code in lookup_lostfound()
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Heap allocate btree_trans | Kent Overstreet | 2023-10-22 | 1 | -21/+18
  We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate.
  But we have to do a heap allocation to initialize it anyways, so there's no real downside to heap allocating the entire thing.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix W=12 build errors | Kent Overstreet | 2023-10-22 | 1 | -1/+0
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Break up io.c | Kent Overstreet | 2023-10-22 | 1 | -1/+2
  More reorganization, this splits up io.c into
  - io_read.c
  - io_misc.c - fallocate, fpunch, truncate
  - io_write.c
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve bch2_moving_ctxt_to_text() | Kent Overstreet | 2023-10-22 | 1 | -25/+19
  Print more information out about moving contexts - fold in the output of the redundant bch2_data_jobs_to_text(), and also include information relevant to whether move_data() should be blocked.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Allow for unknown btree IDs | Kent Overstreet | 2023-10-22 | 1 | -2/+8
  We need to allow filesystems with metadata from newer versions to be mountable and usable by older versions.
  This patch enables us to roll out new btrees without a new major version number; we can now handle btree roots for unknown btree types.
  The unknown btree roots will be retained, and fsck (including backpointers) will check them, the same as other btree types.
  We add a dynamic array for the extra, unknown btree roots, in addition to the fixed size btree root array, and add new helpers for looking up btree roots.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
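The "fixed array plus dynamic array" lookup described in this message can be sketched as follows. Everything here is hypothetical and simplified for illustration: `SAMPLE_BTREE_ID_NR`, `struct sample_roots` and `sample_root_lookup()` are not the bcachefs helpers, and a real root holds far more than one field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SAMPLE_BTREE_ID_NR 4	/* number of btree IDs this build knows about */

struct sample_root {
	unsigned long long addr;
};

struct sample_roots {
	struct sample_root known[SAMPLE_BTREE_ID_NR];	/* fixed-size array */
	struct sample_root *extra;			/* unknown IDs, grown on demand */
	size_t nr_extra;
};

/*
 * Known IDs index the fixed array; anything larger lives in the dynamic
 * array at (id - SAMPLE_BTREE_ID_NR). With create == 0, an unknown ID that
 * has no slot yet returns NULL.
 */
static struct sample_root *sample_root_lookup(struct sample_roots *r,
					      unsigned id, int create)
{
	if (id < SAMPLE_BTREE_ID_NR)
		return &r->known[id];

	size_t idx = id - SAMPLE_BTREE_ID_NR;

	if (idx >= r->nr_extra) {
		if (!create)
			return NULL;

		struct sample_root *n = realloc(r->extra, (idx + 1) * sizeof(*n));
		if (!n)
			return NULL;
		/* Zero the newly added slots. */
		memset(n + r->nr_extra, 0, (idx + 1 - r->nr_extra) * sizeof(*n));
		r->extra = n;
		r->nr_extra = idx + 1;
	}
	return &r->extra[idx];
}
```

Callers never index either array directly, which is what lets an older build carry roots for btree types it does not recognise.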
* bcachefs: New error message helpers | Kent Overstreet | 2023-10-22 | 1 | -3/+5
  Add two new helpers for printing error messages with __func__ and bch2_err_str():
  - bch_err_fn
  - bch_err_msg
  Also kill the old error strings in the recovery path, which were causing us to incorrectly report memory allocation failures - they're not needed anymore.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert -ENOENT to private error codes | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  As with previous conversions, replace -ENOENT uses with more informative private error codes.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Delete an incorrect bch2_trans_unlock() | Kent Overstreet | 2023-10-22 | 1 | -1/+0
  This deletes a bch2_trans_unlock() call from __bch2_move_data().
  It was redundant; bch2_move_extent() has the correct unlock call, and it was buggy because when move_extent calls bch2_extent_drop_ptrs() we don't want the transaction to be unlocked yet - this fixes a btree_iter.c assertion.
  Fixes https://github.com/koverstreet/bcachefs/issues/511.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_bkey_make_mut() now calls bch2_trans_update() | Kent Overstreet | 2023-10-22 | 1 | -1/+1
  It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time.
  This patch folds the bch2_trans_update() call into bch2_bkey_make_mut(), eliminating a bit of boilerplate.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill bch2_verify_bucket_evacuated() | Kent Overstreet | 2023-10-22 | 1 | -79/+0
  With backpointers, it's now impossible for bch2_evacuate_bucket() to be completely reliable: it can race with an extent being partially overwritten or split, which needs a new write buffer flush for the backpointer to be seen.
  This shouldn't be a real issue in practice; the previous patch added a new tracepoint so we'll be able to see more easily if it is.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve move path tracepoints | Kent Overstreet | 2023-10-22 | 1 | -3/+40
  Move path tracepoints now include the key being moved. Also, add new tracepoints for the start of move_extent, and evacuate_bucket.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rip out code for storing backpointers in alloc keys | Kent Overstreet | 2023-10-22 | 1 | -14/+12
  We don't store backpointers in alloc keys anymore, since we gained the btree write buffer.
  This patch drops support for backpointers in alloc keys, and revs the on disk format version so that we know a fsck is required.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Use BTREE_ITER_INTENT in ec_stripe_update_extent() | Kent Overstreet | 2023-10-22 | 1 | -2/+2
  This adds a flags param to bch2_backpointer_get_key() so that we can pass BTREE_ITER_INTENT, since ec_stripe_update_extent() is updating the extent immediately.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix bch2_verify_bucket_evacuated() | Kent Overstreet | 2023-10-22 | 1 | -0/+5
  We were going into an infinite loop when printing out backpointers, due to never incrementing bp_offset - whoops.
  Also limit the number of backpointers we print to 10; this is debug code and we only need to print a sample, not all of them.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: verify_bucket_evacuated() -> set_btree_iter_dontneed() | Kent Overstreet | 2023-10-22 | 1 | -0/+3
  This should help with excessive 'would deadlock' transaction restarts.
  Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>