summaryrefslogtreecommitdiffstats
path: root/fs/bcachefs/opts.h
Commit message (Collapse)AuthorAgeFilesLines
* bcachefs: Standardize helpers for printing enum strs with bounds checksKent Overstreet2024-04-131-4/+6
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Repair pass for scanning for btree nodesKent Overstreet2024-04-031-2/+2
| | | | | | | | | | | | | | | If a btree root or interior btree node goes bad, we're going to lose a lot of data, unless we can recover the nodes that it pointed to by scanning. Fortunately btree node headers are fully self describing, and additionally the magic number is xored with the filesytem UUID, so we can do so safely. This implements the scanning - next patch will rework topology repair to make use of the found nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve -o norecovery; opts.recovery_pass_limitKent Overstreet2024-03-311-1/+6
| | | | | | | | | | | | | | | | | | | | This adds opts.recovery_pass_limit, and redoes -o norecovery to make use of it; this fixes some issues with -o norecovery so it can be safely used for data recovery. Norecovery means "don't do journal replay"; it's an important data recovery tool when we're getting stuck in journal replay. When using it this way we need to make sure we don't free journal keys after startup, so we continue to overlay them: thus it needs to imply retain_recovery_info, as well as nochanges. recovery_pass_limit is an explicit option for telling recovery to exit after a specific recovery pass; this is a much cleaner way of implementing -o norecovery, as well as being a useful debug feature in its own right. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Pin btree cache in ram for random access in fsckKent Overstreet2024-03-131-0/+5
| | | | | | | | | | | | | | | | | | | | | | | Various phases of fsck involve checking references from one btree to another: this means doing a sequential scan of one btree, and then mostly random access into the second. This is particularly painful for checking extents <-> backpointers; we can prefetch btree node access on the sequential scan, but not on the random access portion, and this is particularly painful on spinning rust, where we'd like to keep the pipeline fairly full of btree node reads so that the elevator can reduce seeking. This patch implements prefetching and pinning of the portion of the btree that we'll be doing random access to. We already calculate how much of the random access btree will fit in memory so it's a fairly straightforward change. This will put more pressure on system memory usage, so we introduce a new option, fsck_memory_usage_percent, which is the percentage of total system ram that fsck is allowed to pin. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: no_splitbrain_check optionKent Overstreet2024-03-101-0/+5
| | | | | | | | This adds an option to disable kicking out devices when splitbrain is detected - it seems there's some issues with splitbrain detection and we're kicking out devices erronously. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: opts->compression can now also be applied in the backgroundKent Overstreet2024-01-211-0/+5
| | | | | | | | | The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_prt_compression_type()Kent Overstreet2024-01-211-1/+1
| | | | | | bounds checking helper, since compression types are extensible Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: helpers for printing data typesKent Overstreet2024-01-211-1/+1
| | | | | | We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add an option to control btree node prefetchingKent Overstreet2024-01-051-1/+7
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: factor out thread_with_file, thread_with_stdioKent Overstreet2024-01-051-2/+2
| | | | | | | thread_with_stdio now knows how to handle input - fsck can now prompt to fix errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix nochanges/read_only interactionKent Overstreet2024-01-051-1/+1
| | | | | | | | | | | | | nochanges means "we cannot issue writes at all"; it's possible to go into a pseudo read-write mode where we pin dirty metadata in memory, which is used for fsck in dry run mode and doing journal replay on a read only mount, but we do not want to allow an actual read-write mount in nochanges mode. But we do always want to allow early read-write, during recovery - this patch clarifies that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree write buffer now slurps keys from journalKent Overstreet2024-01-011-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previosuly, the transaction commit path would have to add keys to the btree write buffer as a separate operation, requiring additional global synchronization. This patch introduces a new journal entry type, which indicates that the keys need to be copied into the btree write buffer prior to being written out. We switch the journal entry type back to JSET_ENTRY_btree_keys prior to write, so this is not an on disk format change. Flushing the btree write buffer may require pulling keys out of journal entries yet to be written, and quiescing outstanding journal reservations; we previously added journal->buf_lock for synchronization with the journal write path. We also can't put strict bounds on the number of keys in the journal destined for the write buffer, which means we might overflow the size of the preallocated buffer and have to reallocate - this introduces a potentially fatal memory allocation failure. This is something we'll have to watch for, if it becomes an issue in practice we can do additional mitigation. The transaction commit path no longer has to explicitly check if the write buffer is full and wait on flushing; this is another performance optimization. Instead, when the btree write buffer is close to full we change the journal watermark, so that only reservations for journal reclaim are allowed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add ability to redirect log outputKent Overstreet2024-01-011-0/+5
| | | | | | | | | | | | | | | | | | | | Upcoming patches are going to add two new ioctls for running fsck in the kernel, but pretending that we're running our normal userspace fsck. This patch adds some plumbing for redirecting our normal log messages away from the dmesg log to a thread_with_file file descriptor - via a struct log_output, which will be consumed by the fsck f_op's read method. The new ioctls will allow for running fsck in the kernel against an offline filesystem (without mounting it), and an online filesystem. For an offline filesystem we need a way to pass in a pointer to the log_output, which is done via a new hidden opts.h option. For online fsck, we can set c->output directly, but only want to redirect log messages from the thread running fsck - hence the new c->output_filter method. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add IO error counts to bch_memberKent Overstreet2023-11-011-1/+0
| | | | | | | | | We now track IO errors per device since filesystem creation. IO error counts can be viewed in sysfs, or with the 'bcachefs show-super' command. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Guard against unknown compression optionsKent Overstreet2023-10-311-0/+1
| | | | | | | | | | Since compression options now include compression level, proper validation is a bit more involved. This adds bch2_compression_opt_valid(), and plumbs it around appropriately. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_id_str()Kent Overstreet2023-10-311-1/+1
| | | | | | | Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add iops fields to bch_memberHunter Shaffer2023-10-221-0/+1
| | | | | Signed-off-by: Hunter Shaffer <huntershaffer182456@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix W=12 build errorsKent Overstreet2023-10-221-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Compression levelsKent Overstreet2023-10-221-2/+2
| | | | | | | | | | | | | | | | | | | | This allows including a compression level when specifying a compression type, e.g. compression=zstd:15 Values from 1 through 15 indicate compression levels, 0 or unspecified indicates the default. For LZ4, values 3-15 specify that the HC algorithm should be used. Note that for compatibility, extents themselves only include the compression type, not the compression level. This means that specifying the same compression algorithm but different compression levels for the compression and background_compression options will have no effect. XXX: perhaps we could add a warning for this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: fix_errors option is now a proper enumKent Overstreet2023-10-221-2/+15
| | | | | | | | | | Before, it was parsed as a bool but internally it was really an enum: this lets us pass in all the possible values. But we special case the option parsing: no supplied value is parsed as FSCK_FIX_yes, to match the previous behaviour. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch_opt_fnKent Overstreet2023-10-221-2/+9
| | | | | | Minor refactoring to get rid of some unneeded token pasting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: version_upgrade is now an enumKent Overstreet2023-10-221-2/+3
| | | | | | | | | | | | | The version_upgrade parameter is now an enum, not a bool, and it's persistent in the superblock: - compatible (default): upgrade to the latest compatible version - incompatible: upgrade to latest incompatible version - none Currently all upgrades are incompatible upgrades, but the next release will introduce major:minor versions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_version_to_text()Kent Overstreet2023-10-221-1/+0
| | | | | | | Add a new helper for printing out metadata versions in a standard format. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Verbose on by default when CONFIG_BCACHEFS_DEBUG=yKent Overstreet2023-10-221-1/+7
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add option for completely disabling nocowKent Overstreet2023-10-221-0/+6
| | | | | | | This adds an option for completely disabling nocow mode, including the locking in the data move path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add max nr of IOs in flight to the move pathKent Overstreet2023-10-221-1/+6
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Nocow supportKent Overstreet2023-10-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for nocow mode, where we do writes in-place when possible. Patch components: - New boolean filesystem and inode option, nocow: note that when nocow is enabled, data checksumming and compression are implicitly disabled - To prevent in-place writes from racing with data moves (data_update.c) or bucket reuse (i.e. a bucket being reused and re-allocated while a nocow write is in flight, we have a new locking mechanism. Buckets can be locked for either data update or data move, using a fixed size hash table of two_state_shared locks. We don't have any chaining, meaning updates and moves to different buckets that hash to the same lock will wait unnecessarily - we'll want to watch for this becoming an issue. - The allocator path also needs to check for in-place writes in flight to a given bucket before giving it out: thus we add another counter to bucket_alloc_state so we can track this. - Fsync now may need to issue cache flushes to block devices instead of flushing the journal. We add a device bitmask to bch_inode_info, ei_devs_need_flush, which tracks devices that need to have flushes issued - note that this will lead to unnecessary flushes when other codepaths have already issued flushes, we may want to replace this with a sequence number. - New nocow write path: look up extents, and if they're writable write to them - otherwise fall back to the normal COW write path. XXX: switch to sequence numbers instead of bitmask for devs needing journal flush XXX: ei_quota_lock being a mutex means bch2_nocow_write_done() needs to run in process context - see if we can improve this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Btree write bufferKent Overstreet2023-10-221-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array. This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree. This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance. Changes: - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer. - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path. - Transaction commit path changes: The update to the btree write buffer both mutates global, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry. - A new persistent option, for specifying the number of entries in the write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_inode_opts_get()Kent Overstreet2023-10-221-5/+0
| | | | | | | | | | | | | This improves io_opts() and makes it a non-inline function - it's big enough that it probably shouldn't be. Also, bch_io_opts no longer needs fields for whether options are defined, so we can slim it down a bit. We'd like to stop passing around the full bch_io_opts, but that'll be tricky because of bch2_rebalance_add_key(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve bch2_inode_opts_to_opts()Kent Overstreet2023-10-221-1/+0
| | | | | | | | | | | | It turns out the *_defined entries of bch_io_opts are only used in one place - in the xattr get path - and there we immediately convert to a bch_opts struct, which also has the *_defined entries. This patch changes bch2_inode_opts_to_opts() to go directly from bch_inode_unpacked to bch_opts, which is a minor simplification and will also let us slim down struct bch_io_opts in another patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add an O_DIRECT option (for userspace)Kent Overstreet2023-10-221-0/+5
| | | | | | | Sometimes we see IO errors due to O_DIRECT alignment issues - having an option to use buffered IO will be helpful. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Make verbose option settable at runtimeKent Overstreet2023-10-221-1/+1
| | | | | | | | -o verbose is very useful, and we're starting to use it more for runtime debug statements - making it possible to enable at runtime is a no brainer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Make IO in flight by copygc/rebalance configurableKent Overstreet2023-10-221-0/+5
| | | | | | | | | | | This adds a new option, move_bytes_in_flight, for configuring the amount of IO in flight by copygc/rebalance - users with many devices in their filesystem will want to increase this. In the future we should be smarter about this, but this is an easy improvement. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Rename group to label for remaining strings.Daniel Hill2023-10-221-4/+4
| | | | | Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Make bch_option compatible with Rust ffiBrett Holman2023-10-221-11/+3
| | | | | | | | Rust FFI lacks support for unnamed structs and unions. The space saved in bch_option is not enough to be significant. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Kill old rebuild_replicas optionKent Overstreet2023-10-221-5/+0
| | | | | | | This option was useful when the replicas mechism was new and still being debugged, but hasn't been used in ages - let's delete it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: New discard implementationKent Overstreet2023-10-221-1/+1
| | | | | | | | | | | | | | | | In the old allocator code, buckets would be discarded just prior to being used - this made sense in bcache where we were discarding buckets just after invalidating the cached data they contain, but in a filesystem where we typically have more free space we want to be discarding buckets when they become empty. This patch implements the new behaviour - it checks the need_discard btree for buckets awaiting discards, and then clears the appropriate bit in the alloc btree, which moves the buckets to the freespace btree. Additionally, discards are now enabled by default. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Make minimum journal_flush_delay nonzeroKent Overstreet2023-10-221-1/+1
| | | | | | | | We're seeing a very strange bug where journal_flush_delay sometimes gets set to 0 in the superblock. Together with the preceding patch, this should help us track it down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Better superblock opt validationKent Overstreet2023-10-221-2/+3
| | | | | | | This moves validation of superblock options to bch2_sb_validate(), so they'll be checked in the write path as well. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: x-macro metadata version enumKent Overstreet2023-10-221-0/+1
| | | | | | | Now we've got strings for metadata versions - this changes bch2_sb_to_text() and our mount log message to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Convert bch2_sb_to_text to master option listKent Overstreet2023-10-221-30/+32
| | | | | | | | | Options no longer have to be manually added to bch2_sb_to_text() - it now uses the master list of options in opts.h. Also, improve some of the formatting by converting it to tabstops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: opts.read_journal_onlyKent Overstreet2023-10-221-0/+5
| | | | | | | Add an option that tells recovery to only read the journal, to be used by the list_journal command. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Only allocate buckets_nouse when requestedKent Overstreet2023-10-221-0/+5
| | | | | | | It's only needed by the migrate tool - this patch adds an option to enable allocating it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: bch2_journal_entry_to_text()Kent Overstreet2023-10-221-0/+2
| | | | | | | | This adds a _to_text() pretty printer for journal entries - including every subtype - which will shortly be used by the 'bcachefs list_journal' subcommand. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: BCH_JSET_ENTRY_logKent Overstreet2023-10-221-0/+5
| | | | | | | | | Add a journal entry type for logging messages, and add an option to use it to log the transaction name - this makes for a very handy debugging tool, as with it we can use the 'bcachefs list_journal' command to see not only what updates were done, but what was doing them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Kill non-lru cache replacement policiesKent Overstreet2023-10-221-1/+0
| | | | | | | Prep work for persistent LRUs and getting rid of the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Turn encoded_extent_max into a regular optionKent Overstreet2023-10-221-0/+6
| | | | | | | It'll now be handled at format time and in sysfs like other options - it still can only be set at format time, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Option improvementsKent Overstreet2023-10-221-15/+25
| | | | | | | | | | | | This adds flags for options that must be a power of two (block size and btree node size), and options that are stored in the superblock as a power of two (encoded extent max). Also: options are now stored in memory in the same units they're displayed in (bytes): we now convert when getting and setting from the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Add more time_statsKent Overstreet2023-10-221-3/+3
| | | | | | | | | | This adds more latency/event measurements and breaks some apart into more events. Journal writes are broken apart into flush writes and noflush writes, btree compactions are broken out from btree splits, btree mergers are added, as well as btree_interior_updates - foreground and total. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Specify filesystem optionsKent Overstreet2023-10-221-53/+54
| | | | | | | | | | | We've got three types of options now - filesystem, device and inode, and a given option may belong to more than one of those types. This patch changes the options to specify explicitly when they're a filesystem option - in the future we'll probably be adding more device options. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>