summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* xfs: scrub in-core metadataDarrick J. Wong2018-01-176-0/+57
| | | | | | | | Whenever we load a buffer, explicitly re-call the structure verifier to ensure that memory isn't corrupting things. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference the block mappings when possibleDarrick J. Wong2018-01-171-0/+34
| | | | | | | Use an inode's block mappings to cross-reference inode block counters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference the realtime bitmapDarrick J. Wong2018-01-175-0/+57
| | | | | | | | While we're scrubbing various btrees, cross-reference the records with the other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference refcount btree during scrubDarrick J. Wong2018-01-178-14/+186
| | | | | | | | During metadata btree scrub, we should cross-reference with the reference counts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference the rmapbt data with the refcountbtDarrick J. Wong2018-01-171-2/+334
| | | | | | | | | | Cross reference the refcount data with the rmap data to check that the number of rmaps for a given block match the refcount of that block, and that CoW blocks (which are owned entirely by the refcountbt) are tracked as well. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference reverse-mapping btreeDarrick J. Wong2018-01-1710-4/+440
| | | | | | | | | | When scrubbing various btrees, we should cross-reference the records with the reverse mapping btree and ensure that traversing the btree finds the same number of blocks that the rmapbt thinks are owned by that btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference inode btrees during scrubDarrick J. Wong2018-01-178-0/+160
| | | | | | | | Cross-reference the inode btrees with the other metadata when we scrub the filesystem. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference bnobt records with cntbtDarrick J. Wong2018-01-172-0/+78
| | | | | | | | Scrub should make sure that each bnobt record has a corresponding cntbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cross-reference with the bnobtDarrick J. Wong2018-01-179-0/+176
| | | | | | | | When we're scrubbing various btrees, cross-reference the records with the bnobt to ensure that we don't also think the space is free. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: introduce scrubber cross-referencing stubsDarrick J. Wong2018-01-177-1/+157
| | | | | | | | Create some stubs that will be used to cross-reference metadata records. The actual cross-referencing will be filled in by subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: check btree block ownership with bnobt/rmapbt when scrubbing btreeDarrick J. Wong2018-01-171-0/+93
| | | | | | | | | | | | | | | When scanning a metadata btree block, cross-reference the block location with the free space btree and the reverse mapping btree to ensure that the rmapbt knows about the block and the bnobt does not. Add a mechanism to defer checks when we happen to be scanning the bnobt/rmapbt itself because it's less efficient to repeatedly clone and destroy the cursor. This patch provides the framework to make btree block owner checks happen; the actual meat will be added in subsequent patches. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: fix a few erroneous process_error calls in the scrubbersDarrick J. Wong2018-01-173-5/+5
| | | | | | | | | | | | | There are a few places where we make a libxfs api call on behalf of some object other than the one we're scrubbing but inadvertently call the regular process_error function. When this happens we mark the object corrupt even though it was corruption in /some other/ object that actually produced the -EFSCORRUPTED code. The correct output flag for these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix this now that we also have a helper to set these. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: set up scrub cross-referencing helpersDarrick J. Wong2018-01-176-22/+240
| | | | | | | | Create some helper functions that we'll use later to deal with problems we might encounter while cross referencing metadata with other metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: add scrub cross-referencing helpers for the refcount btreesDarrick J. Wong2018-01-172-0/+22
| | | | | | | | Add a couple of functions to the refcount btrees that will be used to cross-reference metadata against the refcountbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: add scrub cross-referencing helpers for the rmap btreesDarrick J. Wong2018-01-172-0/+72
| | | | | | | | Add a couple of functions to the rmap btrees that will be used to cross-reference metadata against the rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: add scrub cross-referencing helpers for the inode btreesDarrick J. Wong2018-01-172-0/+105
| | | | | | | | Add a couple of functions to the inode btrees that will be used to cross-reference metadata against the inobt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: add scrub cross-referencing helpers for the free space btreesDarrick J. Wong2018-01-174-1/+62
| | | | | | | | | Add a couple of functions to the free space btrees that will be used to cross-reference metadata against the bnobt/cntbt, and a generic btree function that provides the real implementation. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cancel tx on xfs_defer_finish() error during xattr set/removeBrian Foster2018-01-161-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chris Dunlop reports a problem where an xattr operation fails, reports the following error to syslog and hangs during unmount: ================================================ [ BUG: lock held when returning to user space! ] ... ------------------------------------------------ <PID> is leaving the kernel with locks still held! 1 lock held by <PID>: #0: (sb_internal){......}, at: [<ffffffffa07692a3>] xfs_trans_alloc+0xe3/0x130 [xfs] The failure/shutdown occurs during deferred ops processing which leads to an error return from xfs_defer_finish() via xfs_attr_leaf_addname(). While the root cause of the failure is unknown corruption, the cause of the subsequent BUG above and unmount hang is failure to cancel the transaction before returning to userspace. The transaction is not cancelled because the out_defer_cancel error handling paths in the xfs_attr_[leaf|node]_[add|remove]name() functions clear args.trans without releasing the transaction. The callers therefore lose the reference to the transaction and fail to cancel it. Since xfs_attr_[set|remove]() always cancel args.trans when != NULL and xfs_defer_finish()->...->xfs_trans_roll() should always return with a valid transaction, update the leaf/node xattr functions to not reset args.trans in the error path responsible for cancelling deferred ops. Reported-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: account finobt blocks properly in perag reservationBrian Foster2018-01-121-4/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XFS started using the perag metadata reservation pool for free inode btree blocks in commit 76d771b4cbe33 ("xfs: use per-AG reservations for the finobt"). To handle backwards compatibility, finobt blocks are accounted against the pool so long as the full reservation is available at mount time. Otherwise the ->m_inotbt_nores flag is set and the filesystem falls back to the traditional per-transaction finobt reservation. This commit has two problems: - finobt blocks are always accounted against the metadata reservation on allocation, regardless of ->m_inotbt_nores state - finobt blocks are never returned to the reservation pool on free The first problem affects reflink+finobt filesystems where the full finobt reservation is not available at mount time. finobt blocks are essentially stolen from the reflink reservation, putting refcountbt management at risk of allocation failure. The second problem is an unconditional leak of metadata reservation whenever finobt is enabled. Update the finobt block allocation callouts to consider ->m_inotbt_nores and account blocks appropriately. Blocks should be consistently accounted against the metadata pool when ->m_inotbt_nores is false and otherwise tagged as RESV_NONE. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: fix check on struct_version for versions 4 or greaterColin Ian King2018-01-121-1/+1
| | | | | | | | | | | | It appears that the check for versions 4 or more is incorrect and is off-by-one. Fix this. Detected by CoverityScan, CID#1463775 ("Logically dead code") Fixes: ac503a4cc9e8 ("xfs: refactor the geometry structure filling function") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: destroy mutex pag_ici_reclaim_lock before freeXiongwei Song2018-01-121-0/+3
| | | | | | | | | | | | The mutex pag_ici_reclaim_lock of xfs_perag_t structure is initialized in xfs_initialize_perag. If happen errors in xfs_initialize_perag, or free resources in xfs_free_perag, wo need to destroy the mutex before free perag. Signed-off-by: Xiongwei Song <sxwjean@me.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: use %px for data pointers when debuggingDarrick J. Wong2018-01-129-25/+37
| | | | | | | | | | | | | | | | | | Starting with commit 57e734423ad ("vsprintf: refactor %pK code out of pointer"), the behavior of the raw '%p' printk format specifier was changed to print a 32-bit hash of the pointer value to avoid leaking kernel pointers into dmesg. For most situations that's good. This is /undesirable/ behavior when we're trying to debug XFS, however, so define a PTR_FMT that prints the actual pointer when we're in debug mode. Note that %p for tracepoints still prints the raw pointer, so in the long run we could consider rewriting some of these messages as tracepoints. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: use %pS printk format for direct instruction addressesDarrick J. Wong2018-01-123-23/+23
| | | | | | | | | | | Use the %pS instead of the %pF printk format specifier for printing symbols from direct addresses. This is needed for the ia64, ppc64 and parisc64 architectures. While we're at it, be consistent with the capitalization of the 'S'. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: change 0x%p -> %p in print messagesDarrick J. Wong2018-01-126-28/+28
| | | | | | | Since %p prepends "0x" to the outputted string, we can drop the prefix. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: clarify units in the failed metadata io messageDarrick J. Wong2018-01-091-5/+6
| | | | | | | | | If a metadata IO error happens, we report the location of the failed IO request in units of daddrs. However, the printk message misleads people into thinking that the units are fs blocks, so fix the reported units. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: harden directory integrity checks some moreDarrick J. Wong2018-01-092-1/+9
| | | | | | | | | | | | | | | | | | | | | If a malicious filesystem image contains a block+ format directory wherein the directory inode's core.mode is set such that S_ISDIR(core.mode) == 0, and if there are subdirectories of the corrupted directory, an attempt to traverse up the directory tree will crash the kernel in __xfs_dir3_data_check. Running the online scrub's parent checks will tend to do this. The crash occurs because the directory inode's d_ops get set to xfs_dir[23]_nondir_ops (it's not a directory) but the parent pointer scrubber's indiscriminate call to xfs_readdir proceeds past the ASSERT if we have non fatal asserts configured. Fix the null pointer dereference crash in __xfs_dir3_data_check by looking for S_ISDIR or wrong d_ops; and teach the parent scrubber to bail out if it is fed a non-directory "parent". Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: refactor the geometry structure filling functionDarrick J. Wong2018-01-086-78/+92
| | | | | | | | | Refactor the geometry structure filling function to use the superblock to fill the fields. While we're at it, make the function less indenty and use some whitespace to make the function easier to read. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: hoist xfs_fs_geometry to libxfsDarrick J. Wong2018-01-086-78/+84
| | | | | | | | Move xfs_fs_geometry to libxfs so that we can clean up the fs geometry reporting in xfsprogs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: trace log reservations at mount timeDarrick J. Wong2018-01-084-1/+52
| | | | | | | | | | At each mount, emit the transaction reservation type information via tracepoints. This makes it easier to compare the log reservation info calculated by the kernel and xfsprogs so that we can more easily diagnose minimum log size failures on freshly formatted filesystems. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: dump the first 128 bytes of any corrupt bufferDarrick J. Wong2018-01-083-4/+9
| | | | | | | | Increase the corrupt buffer dump to the first 128 bytes since v5 filesystems have larger block headers than before. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: teach error reporting functions to take xfs_failaddr_tDarrick J. Wong2018-01-082-6/+7
| | | | | | | | | Convert the two other error reporting functions to take xfs_failaddr_t when the caller wishes to capture a code pointer instead of the classic void * pointer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: standardize quota verification function outputsDarrick J. Wong2018-01-085-122/+86
| | | | | | | | | | | Rename xfs_dqcheck to xfs_dquot_verify and make it return an xfs_failaddr_t like every other structure verifier function. This enables us to check on-disk quotas in the same way that we check everything else. Callers are now responsible for logging errors, as XFS_QMOPT_DOWARN goes away. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: separate dquot repair into a separate functionDarrick J. Wong2018-01-084-64/+22
| | | | | | | | | Move the dquot repair code into a separate function and remove XFS_QMOPT_DQREPAIR in favor of calling the helper directly. Remove other dead code because quotacheck is the only caller of DQREPAIR. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: create a new buf_ops pointer to verify structure metadataDarrick J. Wong2018-01-0817-21/+125
| | | | | | | | | Expose all metadata structure buffer verifier functions via buf_ops. These will be used by the online scrub mechanism to look for problems with buffers that are already sitting around in memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: fail out of xfs_attr3_leaf_lookup_int if it looks corruptDarrick J. Wong2018-01-081-3/+6
| | | | | | | | | If the xattr leaf block looks corrupt, return -EFSCORRUPTED to userspace instead of ASSERTing on debug kernels or running off the end of the buffer on regular kernels. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: provide a centralized method for verifying inline fork dataDarrick J. Wong2018-01-086-24/+99
| | | | | | | | | | | | | Replace the current haphazard dir2 shortform verifier callsites with a centralized verifier function that can be called either with the default verifier functions or with a custom set. This helps us strengthen integrity checking while providing us with flexibility for repair tools. xfs_repair wants this to be able to supply its own verifier functions when trying to fix possibly corrupt metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: refactor short form directory structure verifier functionDarrick J. Wong2018-01-083-17/+16
| | | | | | | | Change the short form directory structure verifier function to return the instruction pointer of a failing check or NULL if everything's ok. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: create structure verifier function for short form symlinksDarrick J. Wong2018-01-082-0/+35
| | | | | | | Create a function to check the structure of short form symlink targets. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: create structure verifier function for shortform xattrsDarrick J. Wong2018-01-082-0/+75
| | | | | | | | Create a function to perform structure verification for short form extended attributes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: move inode fork verifiers to xfs_dinode_verifyDarrick J. Wong2018-01-082-89/+69
| | | | | | | | Consolidate the fork size and format verifiers to xfs_dinode_verify so that we can reject bad inodes earlier and in a single place. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: verify dinode header firstDarrick J. Wong2018-01-081-10/+13
| | | | | | | | | | Move the v3 inode integrity information (crc, owner, metauuid) before we look at anything else in the inode so that we don't waste time on a torn write or a totally garbled block. This makes xfs_dinode_verify more consistent with the other verifiers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: refactor verifier callers to print address of failing checkDarrick J. Wong2018-01-0820-102/+209
| | | | | | | | Refactor the callers of verifiers to print the instruction address of a failing check. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: have buffer verifier functions report failing addressDarrick J. Wong2018-01-0821-275/+323
| | | | | | | | | Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: refactor xfs_verifier_error and xfs_buf_ioerrorDarrick J. Wong2018-01-0823-151/+89
| | | | | | | | | | Since all verification errors also mark the buffer as having an error, we can combine these two calls. Later we'll add a xfs_failaddr_t parameter to promote the idea of reporting corruption errors and the address of the failing check to enable better debugging reports. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: remove XFS_WANT_CORRUPTED_RETURN from dir3 data verifiersDarrick J. Wong2018-01-083-53/+61
| | | | | | | | | | | Since __xfs_dir3_data_check verifies on-disk metadata, we can't have it noisily blowing asserts and hanging the system on corrupt data coming in off the disk. Instead, have it return a boolean like all the other checker functions, and only have it noisily fail if we fail in debug mode. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: refactor short form btree pointer verificationDarrick J. Wong2018-01-081-6/+6
| | | | | | | | Now that we have xfs_verify_agbno, use it to verify short form btree pointers instead of open-coding them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: refactor long-format btree header verification routinesDarrick J. Wong2018-01-083-20/+50
| | | | | | | | Create two helper functions to verify the headers of a long format btree block. We'll use this later for the realtime rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: remove XFS_FSB_SANITY_CHECKDarrick J. Wong2018-01-084-9/+5
| | | | | | | | We already have a function to verify fsb pointers, so get rid of the last users of the (less robust) macro. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: distinguish between corrupt inode and invalid inum in xfs_scrub_get_inodeDarrick J. Wong2018-01-081-4/+28
| | | | | | | | | | | | | | | | In xfs_scrub_get_inode, we don't do a good enough job distinguishing EINVAL returns from xfs_iget w/ IGET_UNTRUSTED -- this can happen if the passed in inode number is invalid (past eofs, inobt says it isn't an inode) or if the inum is actually valid but the inode buffer fails verifier. In the first case we still want to return ENOENT, but in the second case we want to capture the corruption error. Therefore, if xfs_iget returns EINVAL, try the raw imap lookup. If that succeeds, we conclude it's a corruption error, otherwise we just bounce out to userspace. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: always grab transaction when scrubbing inodeDarrick J. Wong2018-01-081-1/+1
| | | | | | | | | | | Always allocate a transaction for inode scrubbing, even if the _iget fails. This is something that is nice to have now for consistency with the other scrubbers but will become critical when we get to online repair where we'll actually use the transaction + raw buffer read to fix the verifier errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>