summaryrefslogtreecommitdiffstats
path: root/fs/xfs/libxfs/xfs_format.h
Commit message (Collapse)AuthorAgeFilesLines
* xfs: introduce XFS_MAX_FILEOFFDarrick J. Wong2020-01-141-0/+7
| | | | | | | | | | | Introduce a new #define for the maximum supported file block offset. We'll use this in the next patch to make it more obvious that we're doing some operation for all possible inode fork mappings after a given offset. We can't use ULLONG_MAX here because bunmapi uses that to detect when it's done. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: remove unused typedef definitionsEric Sandeen2019-11-131-2/+2
| | | | | | | | | Remove some typdefs for type_t's that are no longer referred to by their typedef'd types. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: remove the xfs_disk_dquot_t and xfs_dquot_tPavel Reichl2019-11-131-5/+5
| | | | | | | Signed-off-by: Pavel Reichl <preichl@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix some of the comments] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: move xfs_ino_geometry to xfs_shared.hDarrick J. Wong2019-06-281-41/+0
| | | | | | | | | The inode geometry structure isn't related to ondisk format; it's support for the mount structure. Move it to xfs_shared.h. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: fix inode_cluster_size rounding mayhemDarrick J. Wong2019-06-121-2/+7
| | | | | | | | | | | | | | | | inode_cluster_size is supposed to represent the size (in bytes) of an inode cluster buffer. We avoid having to handle multiple clusters per filesystem block on filesystems with large blocks by openly rounding this value up to 1 FSB when necessary. However, we never reset inode_cluster_size to reflect this new rounded value, which adds to the potential for mistakes in calculating geometries. Fix this by setting inode_cluster_size to reflect the rounded-up size if needed, and special-case the few places in the sparse inodes code where we actually need the smaller value to validate on-disk metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: separate inode geometryDarrick J. Wong2019-06-121-1/+37
| | | | | | | Separate the inode geometry information into a distinct structure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: move XFS_INODE_FORMAT_STR mappings to libxfsDarrick J. Wong2018-12-191-0/+10
| | | | | | | | Move XFS_INODE_FORMAT_STR to libxfs so that we don't forget to keep it updated, and add necessary TRACE_DEFINE_ENUM. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
* xfs: add a block to inode count converterDarrick J. Wong2018-12-121-0/+2
| | | | | | | | | | Add new helpers to convert units of fs blocks into inodes, and AG blocks into AG inodes, respectively. Convert all the open-coded conversions and XFS_OFFBNO_TO_AGINO(, , 0) calls to use them, as appropriate. The OFFBNO_TO_AGINO macro is retained for xfs_repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: remove suport for filesystems without unwritten extent flagChristoph Hellwig2018-10-181-6/+2
| | | | | | | | | | | | | | The option to enable unwritten extents was made default in 2003, removed from mkfs in 2007, and cannot be disabled in v5. We also rely on it for a lot of common functionality, so filesystems without it will run a completely untested and buggy code path. Enabling the support also is a simple bit flip using xfs_db, so legacy file systems can still be brought forward. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't treat unknown di_flags2 as corruption in scrubEric Sandeen2018-09-291-0/+2
| | | | | | | | | | | | | | | | | | xchk_inode_flags2() currently treats any di_flags2 values that the running kernel doesn't recognize as corruption, and calls xchk_ino_set_corrupt() if they are set. However, it's entirely possible that these flags were set in some newer kernel and are quite valid, but ignored in this kernel. (Validators don't care one bit about unknown di_flags2.) Call xchk_ino_set_warning instead, because this may or may not actually indicate a problem. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: don't allow insert-range to shift extents past the maximum offsetDarrick J. Wong2018-06-241-0/+2
| | | | | | | | | | | | | | | | | Zorro Lang reports that generic/485 blows an assert on a filesystem with 512 byte blocks. The test tries to fallocate a post-eof extent at the maximum file size and calls insert range to shift the extents right by two blocks. On a 512b block filesystem this causes startoff to overflow the 54-bit startoff field, leading to the assert. Therefore, always check the rightmost extent to see if it would overflow prior to invoking the insert range machinery. Reported-by: zlang@redhat.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200137 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: More robust inode extent count validationDave Chinner2018-06-211-0/+3
| | | | | | | | | | | | | | | | When the inode is in extent format, it can't have more extents that fit in the inode fork. We don't currenty check this, and so this corruption goes unnoticed by the inode verifiers. This can lead to crashes operating on invalid in-memory structures. Attempts to access such a inode will now error out in the verifier rather than allowing modification operations to proceed. Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: fix a typedef, add some braces and breaks to shut up compiler warnings] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: convert to SPDX license tagsDave Chinner2018-06-061-13/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/ This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command: for f in `git grep -l "GNU General" fs/xfs/` ; do echo $f cat $f | awk -f hdr.awk > $f.new mv -f $f.new $f done And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows: $ cat hdr.awk BEGIN { hdr = 1.0 tag = "GPL-2.0" str = "" } /^ \* This program is free software/ { hdr = 2.0; next } /any later version./ { tag = "GPL-2.0+" next } /^ \*\// { if (hdr > 0.0) { print "// SPDX-License-Identifier: " tag print str print $0 str="" hdr = 0.0 next } print $0 next } /^ \* / { if (hdr > 1.0) next if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 next } /^ \*/ { if (hdr > 0.0) next print $0 next } // { if (hdr > 0.0) { if (str != "") str = str "\n" str = str $0 next } print $0 } END { } $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: implement online get/set fs labelEric Sandeen2018-05-161-2/+5
| | | | | | | | | | | | | | | | | | | | | The GET ioctl is trivial, just return the current label. The SET ioctl is more involved: It transactionally modifies the superblock to write a new filesystem label to the primary super. A new variant of xfs_sync_sb then writes the superblock buffer immediately to disk so that the change is visible from userspace. It then invalidates any page cache that userspace might have previously read on the block device so that i.e. blkid can see the change immediately, and updates all secondary superblocks as userspace relable does. Signed-off-by: Eric Sandeen <sandeen@redhat.com> [darrick: use dchinner's new xfs_update_secondary_sbs function] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: convert XFS_AGFL_SIZE to a helper functionDave Chinner2018-03-111-12/+1
| | | | | | | | | | The AGFL size calculation is about to get more complex, so lets turn the macro into a function first and remove the macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> [darrick: forward port to newer kernel, simplify the helper] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: remove u_int* type usageDarrick J. Wong2017-11-091-1/+1
| | | | | | | | Use the uint* types instead of the u_int* types. This will (hopefully) pair with an xfsprogs cleanup. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: move xfs_bmbt_irec and xfs_exntst_t to xfs_types.hChristoph Hellwig2017-11-061-18/+0
| | | | | | | | Neither defines an on-disk format, so move them out of xfs_format.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: use a b+tree for the in-core extent listChristoph Hellwig2017-11-061-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the current linear list and the indirection array for the in-core extent list with a b+tree to avoid the need for larger memory allocations for the indirection array when lots of extents are present. The current extent list implementations leads to heavy pressure on the memory allocator when modifying files with a high extent count, and can lead to high latencies because of that. The replacement is a b+tree with a few quirks. The leaf nodes directly store the extent record in two u64 values. The encoding is a little bit different from the existing in-core extent records so that the start offset and length which are required for lookups can be retreived with simple mask operations. The inner nodes store a 64-bit key containing the start offset in the first half of the node, and the pointers to the next lower level in the second half. In either case we walk the node from the beginninig to the end and do a linear search, as that is more efficient for the low number of cache lines touched during a search (2 for the inner nodes, 4 for the leaf nodes) than a binary search. We store termination markers (zero length for the leaf nodes, an otherwise impossible high bit for the inner nodes) to terminate the key list / records instead of storing a count to use the available cache lines as efficiently as possible. One quirk of the algorithm is that while we normally split a node half and half like usual btree implementations we just spill over entries added at the very end of the list to a new node on its own. This means we get a 100% fill grade for the common cases of bulk insertion when reading an inode into memory, and when only sequentially appending to a file. The downside is a slightly higher chance of splits on the first random insertions. Both insert and removal manually recurse into the lower levels, but the bulk deletion of the whole tree is still implemented as a recursive function call, although one limited by the overall depth and with very little stack usage in every iteration. For the first few extents we dynamically grow the list from a single extent to the next powers of two until we have a first full leaf block and that building the actual tree. The code started out based on the generic lib/btree.c code from Joern Engel based on earlier work from Peter Zijlstra, but has since been rewritten beyond recognition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: convert remaining xfs_sb_version_... checks to boolDave Chinner2017-11-011-2/+2
| | | | | | | | | Some were missed in the pass that converted the function return values from int to bool. Update the remaining ones for consistency. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: remove the never fully implemented UUID fork formatChristoph Hellwig2017-10-261-1/+1
| | | | | | | | | Remove the dead code dealing with the UUID fork format that was never implemented in Linux (and neither in IRIX as far as I know). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: scrub realtime bitmap/summaryDarrick J. Wong2017-10-261-0/+5
| | | | | | | Perform simple tests of the realtime bitmap and summary. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: scrub inode btreesDarrick J. Wong2017-10-261-1/+1
| | | | | | | | Check the records of the inode btrees to make sure that the values make sense given the inode records themselves. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLENDarrick J. Wong2017-07-071-0/+1
| | | | | | | | | | | | | | | | | XFS has a maximum symlink target length of 1024 bytes; this is a holdover from the Irix days. Unfortunately, the constant establishing this is 'MAXPATHLEN' and is /not/ the same as the Linux MAXPATHLEN, which is 4096. The kernel enforces its 1024 byte MAXPATHLEN on symlink targets, but xfsprogs picks up the (Linux) system 4096 byte MAXPATHLEN, which means that xfs_repair doesn't complain about oversized symlinks. Since this is an on-disk format constraint, put the define in the XFS namespace and move everything over to use the new name. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
* xfs: remove double-underscore integer typesDarrick J. Wong2017-06-191-56/+56
| | | | | | | | | | | | | | | | | | | | This is a purely mechanical patch that removes the private __{u,}int{8,16,32,64}_t typedefs in favor of using the system {u,}int{8,16,32,64}_t typedefs. This is the sed script used to perform the transformation and fix the resulting whitespace and indentation errors: s/typedef\t__uint8_t/typedef __uint8_t\t/g s/typedef\t__uint/typedef __uint/g s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g s/__uint8_t\t/__uint8_t\t\t/g s/__uint/uint/g s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g s/__int/int/g /^typedef.*int[0-9]*_t;$/d Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: simplify validation of the unwritten extent bitChristoph Hellwig2017-04-251-8/+0
| | | | | | | | | | | | | | | | | | | | | | | XFS only supports the unwritten extent bit in the data fork, and only if the file system has a version 5 superblock or the unwritten extent feature bit. We currently have two routines that validate the invariant: xfs_check_nostate_extents which return -EFSCORRUPTED when it's not met, and xfs_validate_extent that triggers and assert in debug build. Both of them iterate over all extents of an inode fork when called, which isn't very efficient. This patch instead adds a new helper that verifies the invariant one extent at a time, and calls it from the places where we iterate over all extents to converted them from or two the in-memory format. The callers then return -EFSCORRUPTED when reading invalid extents from disk, or trigger an assert when writing them to disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: remove unused values from xfs_exntst_tChristoph Hellwig2017-04-251-1/+0
| | | | | | | | | | We only ever use the normal and unwritten states. And the actual ondisk format (this enum isn't despite being in xfs_format.h) only has space for the unwritten bit anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* xfs: remove the unused XFS_MAXLINK_1 defineChristoph Hellwig2017-04-251-2/+0
| | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* libxfs: v3 inodes are only valid on crc-enabled filesystemsRoger Willcocks2016-10-201-1/+0
| | | | | | | | | | | | | | | | | | | | | xfs_repair was not detecting that version 3 inodes are invalid for for non-CRC filesystems. The result is specific inode corruptions go undetected and hence aren't repaired if only the version number is out of range. The core of the problem is that the XFS_DINODE_GOOD_VERSION() macro doesn't know that valid inode versions are dependent on a superblock version number. Fix this in libxfs, and propagate the new function out into the rest of xfsprogs to fix the issue. [Darrick: port to kernel from xfsprogs] Reported-by: Leslie Rhorer <lrhorer@mygrande.net> Signed-off-by: Roger Willcocks <roger@filmlight.ltd.uk> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: recognize the reflink feature bitDarrick J. Wong2016-10-051-1/+2
| | | | | | | | Add the reflink feature flag to the set of recognized feature flags. This enables users to write to reflink filesystems. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: create a separate cow extent size hint for the allocatorDarrick J. Wong2016-10-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | Create a per-inode extent size allocator hint for copy-on-write. This hint is separate from the existing extent size hint so that CoW can take advantage of the fragmentation-reducing properties of extent size hints without disabling delalloc for regular writes. The extent size hint that's fed to the allocator during a copy on write operation is the greater of the cowextsize and regular extsize hint. During reflink, if we're sharing the entire source file to the entire destination file and the destination file doesn't already have a cowextsize hint, propagate the source file's cowextsize hint to the destination file. Furthermore, zero the bulkstat buffer prior to setting the fields so that we don't copy kernel memory contents into userspace. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: store in-progress CoW allocations in the refcount btreeDarrick J. Wong2016-10-051-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Due to the way the CoW algorithm in XFS works, there's an interval during which blocks allocated to handle a CoW can be lost -- if the FS goes down after the blocks are allocated but before the block remapping takes place. This is exacerbated by the cowextsz hint -- allocated reservations can sit around for a while, waiting to get used. Since the refcount btree doesn't normally store records with refcount of 1, we can use it to record these in-progress extents. In-progress blocks cannot be shared because they're not user-visible, so there shouldn't be any conflicts with other programs. This is a better solution than holding EFIs during writeback because (a) EFIs can't be relogged currently, (b) even if they could, EFIs are bound by available log space, which puts an unnecessary upper bound on how much CoW we can have in flight, and (c) we already have a mechanism to track blocks. At mount time, read the refcount records and free anything we find with a refcount of 1 because those were in-progress when the FS went down. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: add refcount btree operationsDarrick J. Wong2016-10-031-2/+8
| | | | | | | | | | | | | Implement the generic btree operations required to manipulate refcount btree blocks. The implementation is similar to the bmapbt, though it will only allocate and free blocks from the AG. Since the refcount root and level fields are separate from the existing roots and levels array, they need a separate logging flag. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: fix logging of AGF refcount btree fields] Signed-off-by: Christoph Hellwig <hch@lst.de>
* xfs: define the on-disk refcount btree formatDarrick J. Wong2016-10-031-0/+36
| | | | | | | | | | Start constructing the refcount btree implementation by establishing the on-disk format and everything needed to read, write, and manipulate the refcount btree blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: refcount btree add more reserved blocksDarrick J. Wong2016-10-031-0/+2
| | | | | | | | | Since XFS reserves a small amount of space in each AG as the minimum free space needed for an operation, save some more space in case we touch the refcount btree. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: introduce refcount btree definitionsDarrick J. Wong2016-10-031-4/+27
| | | | | | | | Add new per-AG refcount btree definitions to the per-AG structures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: don't log the entire end of the AGFDarrick J. Wong2016-08-261-2/+4
| | | | | | | | | | | | When we're logging the last non-spare field in the AGF, we don't need to log the spare fields, so plumb in a new AGF logging flag to help us avoid that. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: store rmapbt block count in the AGFDarrick J. Wong2016-08-171-3/+8
| | | | | | | | | | | | Track the number of blocks used for the rmapbt in the AGF. When we get to the AG reservation code we need this counter to quickly make our reservation during mount. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: enable the rmap btree functionalityDarrick J. Wong2016-08-031-1/+2
| | | | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> Add the feature flag to the supported matrix so that the kernel can mount and use rmap btree enabled filesystems Signed-off-by: Dave Chinner <dchinner@redhat.com> [darrick.wong@oracle.com: move the experimental tag] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: define the on-disk rmap btree formatDarrick J. Wong2016-08-031-0/+73
| | | | | | | | | | | | | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> Now we have all the surrounding call infrastructure in place, we can start filling out the rmap btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate rmap btree blocks. This prepares the way for adding the btree operations implementation. [darrick: record owner and offset info in rmap btree] [darrick: fork, bmbt and unwritten state in rmap btree] [darrick: flags are a separate field in xfs_rmap_irec] [darrick: calculate maxlevels separately] [darrick: move the 'unwritten' bit into unused parts of rm_offset] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: add owner field to extent allocation and freeingDarrick J. Wong2016-08-031-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the rmap btree to work, we have to feed the extent owner information to the the allocation and freeing functions. This information is what will end up in the rmap btree that tracks allocated extents. While we technically don't need the owner information when freeing extents, passing it allows us to validate that the extent we are removing from the rmap btree actually belonged to the owner we expected it to belong to. We also define a special set of owner values for internal metadata that would otherwise have no owner. This allows us to tell the difference between metadata owned by different per-ag btrees, as well as static fs metadata (e.g. AG headers) and internal journal blocks. There are also a couple of special cases we need to take care of - during EFI recovery, we don't actually know who the original owner was, so we need to pass a wildcard to indicate that we aren't checking the owner for validity. We also need special handling in growfs, as we "free" the space in the last AG when extending it, but because it's new space it has no actual owner... While touching the xfs_bmap_add_free() function, re-order the parameters to put the struct xfs_mount first. Extend the owner field to include both the owner type and some sort of index within the owner. The index field will be used to support reverse mappings when reflink is enabled. When we're freeing extents from an EFI, we don't have the owner information available (rmap updates have their own redo items). xfs_free_extent therefore doesn't need to do an rmap update. Make sure that the log replay code signals this correctly. This is based upon a patch originally from Dave Chinner. It has been extended to add more owner information with the intent of helping recovery operations when things go wrong (e.g. offset of user data block in a file). [dchinner: de-shout the xfs_rmap_*_owner helpers] [darrick: minor style fixes suggested by Christoph Hellwig] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: rmap btree add more reserved blocksDarrick J. Wong2016-08-031-8/+1
| | | | | | | | | | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> XFS reserves a small amount of space in each AG for the minimum number of free blocks needed for operation. Adding the rmap btree increases the number of reserved blocks, but it also increases the complexity of the calculation as the free inode btree is optional (like the rmbt). Rather than calculate the prealloc blocks every time we need to check it, add a function to calculate it at mount time and store it in the struct xfs_mount, and convert the XFS_PREALLOC_BLOCKS macro just to use the xfs-mount variable directly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: introduce rmap btree definitionsDarrick J. Wong2016-08-031-5/+17
| | | | | | | | | | | | | | | Originally-From: Dave Chinner <dchinner@redhat.com> Add new per-ag rmap btree definitions to the per-ag structures. The rmap btree will sit in the empty slots on disk after the free space btrees, and hence form a part of the array of space management btrees. This requires the definition of the btree to be contiguous with the free space btrees. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: remove the magic numbers in xfs_btree_block-related len macrosHou Tao2016-07-201-25/+41
| | | | | | | | | | | | | replace the magic numbers by offsetof(...) and sizeof(...), and add two extra checks on xfs_check_ondisk_structs() [dchinner: renamed header structures to be more descriptive] Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'xfs-setxattr-promotion' into for-nextDave Chinner2016-01-191-2/+9
|\
| * xfs: introduce per-inode DAX enablementDave Chinner2016-01-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than just being able to turn DAX on and off via a mount option, some applications may only want to enable DAX for certain performance critical files in a filesystem. This patch introduces a new inode flag to enable DAX in the v3 inode di_flags2 field. It adds support for setting and clearing flags in the di_flags2 field via the XFS_IOC_FSSETXATTR ioctl, and sets the S_DAX inode flag appropriately when it is seen. When this flag is set on a directory, it acts as an "inherit flag". That is, inodes created in the directory will automatically inherit the on-disk inode DAX flag, enabling administrators to set up directory heirarchies that automatically use DAX. Setting this flag on an empty root directory will make the entire filesystem use DAX by default. Signed-off-by: Dave Chinner <dchinner@redhat.com>
| * xfs: use FS_XFLAG definitions directlyDave Chinner2016-01-041-2/+0
| | | | | | | | | | | | | | | | | | | | | | Now that the ioctls have been hoisted up to the VFS level, use the VFs definitions directly and remove the XFS specific definitions completely. Userspace is going to have to handle the change of this interface separately, so removing the definitions from xfs_fs.h is not an issue here at all. Signed-off-by: Dave Chinner <dchinner@redhat.com>
* | libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correctDarrick J. Wong2016-01-041-1/+1
|/ | | | | | | | | | | | | | | | | | | Because struct xfs_agfl is 36 bytes long and has a 64-bit integer inside it, gcc will quietly round the structure size up to the nearest 64 bits -- in this case, 40 bytes. This results in the XFS_AGFL_SIZE macro returning incorrect results for v5 filesystems on 64-bit machines (118 items instead of 119). As a result, a 32-bit xfs_repair will see garbage in AGFL item 119 and complain. Therefore, tell gcc not to pad the structure so that the AGFL size calculation is correct. cc: <stable@vger.kernel.org> # 3.10 - 4.4 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'xfs-misc-fixes-for-4.4-2' into for-nextDave Chinner2015-11-031-2/+6
|\
| * xfs: Validate the length of on-disk ACLsAndreas Gruenbacher2015-11-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | In xfs_acl_from_disk, instead of trusting that xfs_acl.acl_cnt is correct, make sure that the length of the attributes is correct as well. Also, turn the aclp parameter into a const pointer. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: avoid dependency on Linux XATTR_SIZE_MAXJan Tulak2015-10-121-1/+9
|/ | | | | | | | | | | | | | | | | Currently, we depends on Linux XATTR value for on disk definition. Which causes trouble on other platforms and maybe also if this value was to change. Fix it by creating a custom definition independent from those in Linux (although with the same values), so it is OK with the be16 fields used for holding these attributes. This patch reflects a change in xfsprogs. Signed-off-by: Jan Tulak <jtulak@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>