summaryrefslogtreecommitdiffstats
path: root/fs/bcachefs/fs.c
Commit message (Collapse)AuthorAgeFilesLines
...
* bcachefs: Unwritten extents supportKent Overstreet2023-10-221-0/+3
| | | | | | | | | | | | | - bch2_extent_merge checks unwritten bit - read path returns 0s for unwritten extents without actually reading - reflink path skips over unwritten extents - bch2_bkey_ptrs_invalid() checks for extents with both written and unwritten extents, and non-normal extents (stripes, btree ptrs) with unwritten ptrs - fiemap checks for unwritten extents and returns FIEMAP_EXTENT_UNWRITTEN Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: More errcode cleanupKent Overstreet2023-10-221-2/+5
| | | | | | | | We shouldn't be overloading standard error codes now that we have provisions for bcachefs-specific errorcodes: this patch converts super.c and super-io.c to per error site errcodes, with a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Factor out two_state_shared_lockKent Overstreet2023-10-221-53/+1
| | | | | | | | | | | We have a unique lock used for controlling adding to the pagecache: the lock has two states, where both states are shared - the lock may be held multiple times for either state - but not both states at the same time. This is exactly what we need for nocow mode locking, so this patch pulls it out of fs.c into its own file. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Assorted checkpatch fixesKent Overstreet2023-10-221-4/+3
| | | | | | | | | | | | | | | | | checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Errcodes can now subtype standard error codesKent Overstreet2023-10-221-10/+13
| | | | | | | | | | The next patch is going to be adding private error codes for all the places we return -ENOSPC. Additionally, this patch updates return paths at all module boundaries to call bch2_err_class(), to return the standard error code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: EINTR -> BCH_ERR_transaction_restartKent Overstreet2023-10-221-5/+5
| | | | | | | | | Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Use bch2_err_str() in error messagesKent Overstreet2023-10-221-4/+4
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Rename __bch2_trans_do() -> commit_do()Kent Overstreet2023-10-221-3/+3
| | | | | | | Better/more descriptive naming, and prep for adding nested_lockrestart_do() and nested_commit_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Delete bch_writepageKent Overstreet2023-10-221-1/+0
| | | | | | | Per Dave Chinner and the xfs folks, .writepage is no longer needed, and it's better not to define it if .writepages is the intended path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: darraysKent Overstreet2023-10-221-1/+1
| | | | | | Inspired by CCAN darray - simple, stupid resizable (dynamic) arrays. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: bch2_btree_iter_peek_upto()Kent Overstreet2023-10-221-3/+2
| | | | | | | | | | | | | In BTREE_ITER_FILTER_SNAPHOTS mode, we skip over keys in unrelated snapshots. When we hit the end of an inode, if the next inode(s) are in a different subvolume, we could potentially have to skip past many keys before finding a key we can return to the caller, so they can terminate the iteration. This adds a peek_upto() variant to solve this problem, to be used when we know the range we're searching within. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Convert bch2_sb_to_text to master option listKent Overstreet2023-10-221-1/+1
| | | | | | | | | Options no longer have to be manually added to bch2_sb_to_text() - it now uses the master list of options in opts.h. Also, improve some of the formatting by converting it to tabstops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix transaction path overflow in fiemapKent Overstreet2023-10-221-1/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Heap allocate printbufsKent Overstreet2023-10-221-4/+9
| | | | | | | | | | | This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree_id_cached()Kent Overstreet2023-10-221-2/+2
| | | | | | | | Add a new helper that returns true if the given btree ID uses the btree key cache. This enables some new cleanups, since the helper can check the options for whether caching is enabled on a given btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Switch to __func__for recording where btree_trans was initializedKent Overstreet2023-10-221-1/+0
| | | | | | | | Symbol decoding, via %ps, isn't supported in userspace - this will also be faster when we're using trans->fn in the fast path, as with the new BCH_JSET_ENTRY_log journal messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Option improvementsKent Overstreet2023-10-221-3/+3
| | | | | | | | | | | | This adds flags for options that must be a power of two (block size and btree node size), and options that are stored in the superblock as a power of two (encoded extent max). Also: options are now stored in memory in the same units they're displayed in (bytes): we now convert when getting and setting from the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Fix quota support for snapshotsKent Overstreet2023-10-221-8/+20
| | | | | | | | Quota support was disabled when snapshots were released, because of some tricky interactions with snpashots. We're sidestepping that for now - we're simply disabling quota accounting on snapshot subvolumes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Update export_operations for snapshotsKent Overstreet2023-10-221-23/+207
| | | | | | | | | | When support for snapshots was merged, export operations weren't updated yet. This patch adds new filehandle types for bcachefs that include the subvolume ID and updates export operations for subvolumes - and also .get_parent, support for which was added just prior to snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Tweak vfs cache shrinker behaviourKent Overstreet2023-10-221-0/+2
| | | | | | | In bcachefs, inodes and dentries are also cached - more compactly - by the btree node cache, they don't require seeks to recreate. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: bch2_assert_pos_locked()Kent Overstreet2023-10-221-23/+35
| | | | | | | | | This adds a new assertion to be used by bch2_inode_update_after_write(), which updates the VFS inode based on the update to the btree inode we just did - we require that the btree inode still be locked when we do that update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Switch fsync to use bi_journal_seqKent Overstreet2023-10-221-45/+7
| | | | | | | | Now that we're recording in each inode the journal sequence number of the most recent update, fsync becomes a lot simpler and we can delete all the plumbing for ei_journal_seq. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Kill journal buf bloom filterKent Overstreet2023-10-221-4/+0
| | | | | | | This was used for recording which inodes have been modified by in flight journal writes, but was broken and has been superceded. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Add journal_seq to inode & alloc keysKent Overstreet2023-10-221-1/+1
| | | | | | | | | | | | | | | | Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Move bch2_evict_subvolume_inodes() to fs.cKent Overstreet2023-10-221-7/+47
| | | | | | | This fixes building in userspace - code that's coupled to the kernel VFS interface should live in fs.c Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Add BCH_SUBVOLUME_UNLINKEDKent Overstreet2023-10-221-2/+9
| | | | | | | | Snapshot deletion needs to become a multi step process, where we unlink, then tear down the page cache, then delete the subvolume - the deleting flag is equivalent to an inode with i_nlink = 0. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: bch2_trans_exit() no longer returns errorsKent Overstreet2023-10-221-1/+1
| | | | | | | | Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Snapshot creation, deletionKent Overstreet2023-10-221-13/+16
| | | | | | | | | | | This is the final patch in the patch series implementing snapshots. This patch implements two new ioctls that work like creation and deletion of directories, but fancier. - BCH_IOCTL_SUBVOLUME_CREATE, for creating new subvolumes and snaphots - BCH_IOCTL_SUBVOLUME_DESTROY, for deleting subvolumes and snapshots Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Plumb through subvolume idKent Overstreet2023-10-221-31/+46
| | | | | | | | | | | | | | To implement snapshots, we need every filesystem btree operation (every btree operation without a subvolume) to start by looking up the subvolume and getting the current snapshot ID, with bch2_subvolume_get_snapshot() - then, that snapshot ID is used for doing btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. This patch adds those bch2_subvolume_get_snapshot() calls, and also switches to passing around a subvol_inum instead of just an inode number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Add subvolume to ei_inode_infoKent Overstreet2023-10-221-23/+62
| | | | | | | | | | | | | | | | | Filesystem operations generally operate within a subvolume: at the start of every btree transaction we'll be looking up (and locking) the subvolume to get the current snapshot ID, which we then use for our other btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. But inodes don't record what subvolume they're in - they can't, because if they did we'd have to update every single inode within a subvolume when taking a snapshot in order to keep that field up to date. So it needs to be tracked in memory, based on how we got to that inode. Hence this patch adds a subvolume field to ei_inode_info, and switches to iget5() so we can index by it in the inode hash table. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: btree_pathKent Overstreet2023-10-221-23/+21
| | | | | | | | | | | | | | | This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill BTREE_INSERT_NOUNLOCKKent Overstreet2023-10-221-9/+3
| | | | | | | With the recent transaction restart changes, it's no longer needed - all transaction commits have BTREE_INSERT_NOUNLOCK semantics. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Use bch2_trans_begin() more consistentlyKent Overstreet2023-10-221-0/+2
| | | | | | | Upcoming patch will require that a transaction restart is always immediately followed by bch2_trans_begin(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* Revert "bcachefs: statfs bfree and bavail should be the same"Kent Overstreet2023-10-221-2/+2
| | | | | | This reverts commit 664f9847bec525d396d62d2db094ca9020289ae0. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: statfs bfree and bavail should be the sameDan Robertson2023-10-221-2/+2
| | | | | | | | The value of f_bfree and f_bavail should be the same. The value of f_bfree is not currently scaled by the availability factor. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: fix truncate with ATTR_MODEKent Overstreet2023-10-221-4/+7
| | | | | | | | | After the v5.12 rebase, we started oopsing when truncate was passed ATTR_MODE, due to not passing mnt_userns to setattr_copy(). This refactors things so that truncate/extend finish by using bch2_setattr_nonsize(), which solves the problem. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: mount: fix null deref with null devnameDan Robertson2023-10-221-3/+3
| | | | | | | | - Fix null deref on mount when given a null device name. - Move the dev_name checks to return EINVAL when it is invalid. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Preallocate transaction memKent Overstreet2023-10-221-1/+1
| | | | | | This helps avoid transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Don't use uuid in tracepointsKent Overstreet2023-10-221-0/+2
| | | | | | %pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: statfs resports incorrect avail blocksDan Robertson2023-10-221-2/+2
| | | | | | | | | The current implementation of bch_statfs does not scale the number of available blocks provided in f_bavail by the reserve factor. This causes an allocation of a file of this size to fail. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: avoid out-of-bounds in split_devsStijn Tintel2023-10-221-0/+4
| | | | | | | | Calling mount with an empty source string causes an out-of-bounds error in split_devs. Check the length of the source string to avoid this. Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix time handlingKent Overstreet2023-10-221-1/+3
| | | | | | | | | | | There were some overflows in the time conversion functions - fix this by converting tv_sec and tv_nsec separately. Also, set sb->time_min and sb->time_max. Fixes xfstest generic/258. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Ensure that fpunch updates inode timestampsKent Overstreet2023-10-221-1/+1
| | | | | | | Fixes xfstests generic/059 Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Replace bch2_btree_iter_next() calls with bch2_btree_iter_advanceKent Overstreet2023-10-221-1/+1
| | | | | | | | | | | The way btree iterators work internally has been changing, particularly with the iter->real_pos changes, and bch2_btree_iter_next() is no longer hyper optimized - it's just advance followed by peek, so it's more efficient to just call advance where we're not using the return value of bch2_btree_iter_next(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Require all btree iterators to be freedKent Overstreet2023-10-221-0/+3
| | | | | | | | We keep running into occasional bugs with btree transaction iterators overflowing - this will make those bugs more visible. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix read retry path for indirect extentsKent Overstreet2023-10-221-1/+3
| | | | | | | | | | In the read path, for retry of indirect extents to work we need to differentiate between the location in the btree the read was for, vs. the location where we found the data. This patch adds that plumbing to bch_read_bio. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill ei_str_hashKent Overstreet2023-10-221-4/+3
| | | | | Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Use __bch2_trans_do() in a few more placesKent Overstreet2023-10-221-33/+19
| | | | | | | Minor cleanup, it was being open coded. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename BTREE_ID enums for consistency with other enumsKent Overstreet2023-10-221-1/+1
| | | | | Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Reduce/kill BKEY_PADDED useKent Overstreet2023-10-221-8/+8
| | | | | | | | | | | | With various newer key types - stripe keys, inline data extents - the old approach of calculating the maximum size of the value is becoming more and more error prone. Better to switch to bkey_on_stack, which can dynamically allocate if necessary to handle any size bkey. In particular we also want to get rid of BKEY_EXTENT_VAL_U64s_MAX. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>