summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/volumes.c
Commit message (Collapse)AuthorAgeFilesLines
* Btrfs: unplug every once and a whileChris Mason2011-12-151-0/+6
| | | | | | | | The btrfs io submission threads can build up massive plug lists. This keeps things more reasonable so we don't hand over huge dumps of IO at once. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix btrfs_end_bio to deal with write errors to a single mirrorChris Mason2011-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | btrfs_end_bio checks the number of errors on a bio against the max number of errors allowed before sending any EIOs up to the higher levels. If we got enough copies of the bio done for a given raid level, it is supposed to clear the bio error flag and return success. We have pointers to the original bio sent down by the higher layers and pointers to any cloned bios we made for raid purposes. If the original bio happens to be the one that got an io error, but not the last one to finish, it might not have the BIO_UPTODATE bit set. Then, when the last bio does finish, we'll call bio_end_io on the original bio. It won't have the uptodate bit set and we'll end up sending EIO to the higher layers. We already had a check for this, it just was conditional on getting the IO error on the very last bio. Make the check unconditional so we eat the EIOs properly. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: check if the to-be-added device is writableLi Zefan2011-12-081-1/+1
| | | | | | | | | | | | | If we call ioctl(BTRFS_IOC_ADD_DEV) directly, we'll succeed in adding a readonly device to a btrfs filesystem, and btrfs will write to that device, emitting kernel errors: [ 3109.833692] lost page write due to I/O error on loop2 [ 3109.833720] lost page write due to I/O error on loop2 ... Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix nocow when deleting the itemMiao Xie2011-11-101-1/+4
| | | | | | | | btrfs_previous_item() just search the b+ tree, do not COW the nodes or leaves, if we modify the result of it, the meta-data will be broken. fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge git://git.jan-o-sch.net/btrfs-unstable into integrationChris Mason2011-11-061-60/+70
|\ | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/Makefile fs/btrfs/extent_io.c fs/btrfs/extent_io.h fs/btrfs/scrub.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * btrfs: Put mirror_num in bi_bdevJan Schmidt2011-09-291-0/+2
| | | | | | | | | | | | | | | | | | | | The error correction code wants to make sure that only the bad mirror is rewritten. Thus, we need to know which mirror is the bad one. I did not find a more apropriate field than bi_bdev. But I think using this is fine, because it is modified by the block layer, anyway, and should not be read after the bio returned. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
| * btrfs: btrfs_multi_bio replaced with btrfs_bioJan Schmidt2011-09-291-60/+68
| | | | | | | | | | | | | | | | | | btrfs_bio is a bio abstraction able to split and not complete after the last bio has returned (like the old btrfs_multi_bio). Additionally, btrfs_bio tracks the mirror_num used to read data which can be used for error correction purposes. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
* | Merge branch 'for-chris' of git://github.com/sensille/linux into integrationChris Mason2011-11-061-0/+8
|\ \ | | | | | | | | | | | | | | | | | | Conflicts: fs/btrfs/ctree.h Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | btrfs: state information for readaheadArne Jansen2011-10-021-0/+8
| |/ | | | | | | | | | | | | | | | | | | | | Add state information for readahead to btrfs_fs_info and btrfs_device Changes v2: - don't wait in radix_trees - add own set of workers for readahead Reviewed-by: Josef Bacik <josef@redhat.com> Signed-off-by: Arne Jansen <sensille@gmx.net>
* | btrfs: separate superblock items out of fs_infoDavid Sterba2011-11-061-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | fs_info has now ~9kb, more than fits into one page. This will cause mount failure when memory is too fragmented. Top space consumers are super block structures super_copy and super_for_commit, ~2.8kb each. Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64) Add a wrapper for freeing fs_info and all of it's dynamically allocated members. Signed-off-by: David Sterba <dsterba@suse.cz>
* | Btrfs: close all bdevs on mount failureIlya Dryomov2011-10-201-4/+2
| | | | | | | | | | | | | | | | | | | | Fix a bug introduced by 20b45077. We have to return EINVAL on mount failure, but doing that too early in the sequence leaves all of the devices opened exclusively. This also fixes an issue where under some scenarios only a second mount -o degraded <devices> command would succeed. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | Btrfs: allow us to overcommit our enospc reservationsJosef Bacik2011-10-191-4/+35
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the things that kills us is the fact that our ENOSPC reservations are horribly over the top in most normal cases. There isn't too much that can be done about this because when we are completely full we really need them to work like this so we don't under reserve. However if there is plenty of unallocated chunks on the disk we can use that to gauge how much we can overcommit. So this patch adds chunk free space accounting so we always know how much unallocated space we have. Then if we fail to make a reservation within our allocated space, check to see if we can overcommit. In the normal flushing case (like with delalloc metadata reservations) we'll take the free space and divide it by 2 if our metadata profile is setup for DUP or any of those, and then divide it by 8 to make sure we don't overcommit too much. Then if we're in a non-flushing case (we really need this reservation now!) we only limit ourselves to half of the free space. This makes this fio test [torrent] filename=torrent-test rw=randwrite size=4g ioengine=sync directory=/mnt/btrfs-test go from taking around 45 minutes to 10 seconds on my freshly formatted 3 TiB file system. This doesn't seem to break my other enospc tests, but could really use some more testing as this is a super scary change. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
* Btrfs: fix uninitialized sync_pendingMiao Xie2011-08-161-1/+1
| | | | | | | sync_pending is uninitialized before it be used, fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: fix a bug of balance on full multi-disk partitionsliubo2011-08-161-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When balancing, we'll first try to shrink devices for some space, but if it is working on a full multi-disk partition with raid protection, we may encounter a bug, that is, while shrinking, total_bytes may be less than bytes_used, and btrfs may allocate a dev extent that accesses out of device's bounds. Then we will not be able to write or read the data which stores at the end of the device, and get the followings: device fsid 0939f071-7ea3-46c8-95df-f176d773bfb6 devid 1 transid 10 /dev/sdb5 Btrfs detected SSD devices, enabling SSD mode btrfs: relocating block group 476315648 flags 9 btrfs: found 4 extents attempt to access beyond end of device sdb5: rw=145, want=546176, limit=546147 attempt to access beyond end of device sdb5: rw=145, want=546304, limit=546147 attempt to access beyond end of device sdb5: rw=145, want=546432, limit=546147 attempt to access beyond end of device sdb5: rw=145, want=546560, limit=546147 attempt to access beyond end of device Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: detect wether a device supports discardJosef Bacik2011-08-161-0/+17
| | | | | | | | | | | | | | We have a problem where if a user specifies discard but doesn't actually support it we will return EOPNOTSUPP from btrfs_discard_extent. This is a problem because this gets called (in a fashion) from the tree log recovery code, which has a nice little BUG_ON(ret) after it, which causes us to fail the tree log replay. So instead detect wether our devices support discard when we're adding them and then don't issue discards if we know that the device doesn't support it. And just for good measure set ret = 0 in btrfs_issue_discard just in case we still get EOPNOTSUPP so we don't screw anybody up like this again. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: force unplugs when switching from high to regular priority biosChris Mason2011-08-051-0/+17
| | | | | | | | | | | | Btrfs does bio submissions from a worker thread, and each device has a list of high priority bios and regular priority bios. Synchronous writes go to the high priority thread while async writes go to regular list. This commit brings back an explicit unplug any time we switch from high to regular priority, which makes it easier for the block layer to give us low latencies. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge branch 'alloc_path' of ↵Chris Mason2011-08-011-4/+8
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/btrfs-error-handling into for-linus
| * btrfs: Don't BUG_ON alloc_path errors in find_next_chunkMark Fasheh2011-07-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I also removed the BUG_ON from error return of find_next_chunk in init_first_rw_device(). It turns out that the only caller of init_first_rw_device() also BUGS on any nonzero return so no actual behavior change has occurred here. do_chunk_alloc() also needed an update since it calls btrfs_alloc_chunk() which can now return -ENOMEM. Instead of setting space_info->full on any error from btrfs_alloc_chunk() I catch and return every error value _except_ -ENOSPC. Thanks goes to Tsutomu Itoh for pointing that issue out. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
| * btrfs: Don't BUG_ON alloc_path errors in btrfs_balance()Mark Fasheh2011-07-141-2/+4
| | | | | | | | | | | | | | | | | | Dealing with this seems trivial - the only caller of btrfs_balance() is btrfs_ioctl() which passes the error code directly back to userspace. There also isn't much state to unwind (if I'm wrong about this point, we can always safely move the allocation to the top of btrfs_balance() anyway). Signed-off-by: Mark Fasheh <mfasheh@suse.com>
* | Btrfs: make a lockdep class for each rootChris Mason2011-07-271-1/+1
|/ | | | | | | | | | | | This patch was originally from Tejun Heo. lockdep complains about the btrfs locking because we sometimes take btree locks from two different trees at the same time. The current classes are based only on level in the btree, which isn't enough information for lockdep to figure out if the lock is safe. This patch makes a class for each type of tree, and lumps all the FS trees that actually have files and directories into the same class. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs: don't panic if we get an error while balancing V2Josef Bacik2011-07-061-1/+2
| | | | | | | | | | A user reported an error where if we try to balance an fs after a device has been removed it will blow up. This is because we get an EIO back and this is where BUG_ON(ret) bites us in the ass. To fix we just exit. Thanks, Reported-by: Anand Jain <Anand.Jain@oracle.com> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Btrfs - use %pU to print fsidIlya Dryomov2011-06-101-6/+2
| | | | | | | | Get rid of FIXME comment. Uuids from dmesg are now the same as uuids given by btrfs-progs. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* btrfs: false BUG_ON when degradedArne Jansen2011-06-041-1/+1
| | | | | | | | | In degraded mode the struct btrfs_device of missing devs don't have device->name set. A kstrdup of NULL correctly returns NULL. Don't BUG in this case. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* Merge branch 'cleanups_and_fixes' into inode_numbersChris Mason2011-05-231-36/+83
|\ | | | | | | | | | | | | | | Conflicts: fs/btrfs/tree-log.c fs/btrfs/volumes.c Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: using rcu lock in the reader side of devices listXiao Guangrong2011-05-231-26/+59
| | | | | | | | | | | | | | | | fs_devices->devices is only updated on remove and add device paths, so we can use rcu to protect it in the reader side Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: drop unnecessary device lockXiao Guangrong2011-05-231-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop device_list_mutex for the reader side on clone_fs_devices and btrfs_rm_device pathes since the fs_info->volume_mutex can ensure the device list is not updated btrfs_close_extra_devices is the initialized path, we can not add or remove device at this time, so we can simply drop the mutex safely, like other initialized function does(add_missing_dev, __find_device, __btrfs_open_devices ...). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix the race between remove dev and alloc chunkXiao Guangrong2011-05-231-0/+6
| | | | | | | | | | | | | | | | On remove device path, it updates device->dev_alloc_list but does not hold chunk lock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix the race between reading and updating devicesXiao Guangrong2011-05-231-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | On btrfs_congested_fn and __unplug_io_fn paths, we should hold device_list_mutex to avoid remove/add device path to update fs_devices->devices On __btrfs_close_devices and btrfs_prepare_sprout paths, the devices in fs_devices->devices or fs_devices->devices is updated, so we should hold the mutex to avoid the reader side to reach them Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: fix bh leak on __btrfs_open_devices pathXiao Guangrong2011-05-231-0/+1
| | | | | | | | | | | | | | 'bh' is forgot to release if no error is detected Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: return error code to caller when btrfs_del_item failsTsutomu Itoh2011-05-231-3/+1
| | | | | | | | | | | | | | | | The error code is returned instead of calling BUG_ON when btrfs_del_item returns the error. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: return error code to caller when btrfs_previous_item failsTsutomu Itoh2011-05-231-2/+3
| | | | | | | | | | | | | | | | The error code is returned instead of calling BUG_ON when btrfs_previous_item returns the error. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'for-chris' of ↵Chris Mason2011-05-231-3/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arne/btrfs-unstable-arne into inode_numbers Conflicts: fs/btrfs/Makefile fs/btrfs/ctree.h fs/btrfs/volumes.h Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | btrfs: scrubArne Jansen2011-05-121-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds an initial implementation for scrub. It works quite straightforward. The usermode issues an ioctl for each device in the fs. For each device, it enumerates the allocated device chunks. For each chunk, the contained extents are enumerated and the data checksums fetched. The extents are read sequentially and the checksums verified. If an error occurs (checksum or EIO), a good copy is searched for. If one is found, the bad copy will be rewritten. All enumerations happen from the commit roots. During a transaction commit, the scrubs get paused and afterwards continue from the new roots. This commit is based on the series originally posted to linux-btrfs with some improvements that resulted from comments from David Sterba, Ilya Dryomov and Jan Schmidt. Signed-off-by: Arne Jansen <sensille@gmx.net>
* | | Merge branch 'allocator' of ↵Chris Mason2011-05-221-319/+174
|\ \ \ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arne/btrfs-unstable-arne into inode_numbers Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * | | btrfs: quasi-round-robin for chunk allocationArne Jansen2011-05-131-305/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a multi device setup, the chunk allocator currently always allocates chunks on the devices in the same order. This leads to a very uneven distribution, especially with RAID1 or RAID10 and an uneven number of devices. This patch always sorts the devices before allocating, and allocates the stripes on the devices with the most available space, as long as there is enough space available. In a low space situation, it first tries to maximize striping. The patch also simplifies the allocator and reduces the checks for corner cases. The simplification is done by several means. First, it defines the properties of each RAID type upfront. These properties are used afterwards instead of differentiating cases in several places. Second, the old allocator defined a minimum stripe size for each block group type, tried to find a large enough chunk, and if this fails just allocates a smaller one. This is now done in one step. The largest possible chunk (up to max_chunk_size) is searched and allocated. Because we now have only one pass, the allocation of the map (struct map_lookup) is moved down to the point where the number of stripes is already known. This way we avoid reallocation of the map. We still avoid allocating stripes that are not a multiple of STRIPE_SIZE.
| * | | btrfs: heed alloc_startArne Jansen2011-05-131-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | currently alloc_start is disregarded if the requested chunk size is bigger than (device size - alloc_start), but smaller than the device size. The only situation where I see this could have made sense was when a chunk equal the size of the device has been requested. This was possible as the allocator failed to take alloc_start into account when calculating the request chunk size. As this gets fixed by this patch, the workaround is not necessary anymore.
| * | | btrfs: move btrfs_cmp_device_free_bytes to super.cArne Jansen2011-05-131-13/+0
| |/ / | | | | | | | | | | | | this function won't be used here anymore, so move it super.c where it is used for df-calculation
* | | btrfs: remove all unused functionsDavid Sterba2011-05-061-19/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove static and global declarations and/or definitions. Reduces size of btrfs.ko by ~3.4kB. text data bss dec hex filename 402081 7464 200 409745 64091 btrfs.ko.base 398620 7144 200 405964 631cc btrfs.ko.remove-all Signed-off-by: David Sterba <dsterba@suse.cz>
* | | btrfs: drop unused parameter from btrfs_release_pathDavid Sterba2011-05-021-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | parameter tree root it's not used since commit 5f39d397dfbe140a14edecd4e73c34ce23c4f9ee ("Btrfs: Create extent_buffer interface for large blocksizes") Signed-off-by: David Sterba <dsterba@suse.cz>
* | | btrfs: drop gfp parameter from alloc_extent_mapDavid Sterba2011-05-021-2/+2
| | | | | | | | | | | | | | | | | | pass GFP_NOFS directly to kmem_cache_alloc Signed-off-by: David Sterba <dsterba@suse.cz>
* | | btrfs: drop unused parameter from extent_map_tree_initDavid Sterba2011-05-021-1/+1
| |/ |/| | | | | | | | | | | the GFP flags are not stored anywhere and all allocations are done via alloc_extent_map(GFP_NOFS). Signed-off-by: David Sterba <dsterba@suse.cz>
* | Btrfs: do some plugging in the submit_bio threadsChris Mason2011-04-191-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Btrfs submit bio threads have a small number of threads responsible for pushing down bios we've collected for a large number of devices. Since we do all the bios for a single device at once, we want to make sure we unplug and send down the bios for each device as we're done processing them. The new plugging API removed the btrfs code to unplug while processing bios, this adds it back with the new API. Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'for-linus-unmerged' of ↵Linus Torvalds2011-03-281-26/+138
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (45 commits) Btrfs: fix __btrfs_map_block on 32 bit machines btrfs: fix possible deadlock by clearing __GFP_FS flag btrfs: check link counter overflow in link(2) btrfs: don't mess with i_nlink of unlocked inode in rename() Btrfs: check return value of btrfs_alloc_path() Btrfs: fix OOPS of empty filesystem after balance Btrfs: fix memory leak of empty filesystem after balance Btrfs: fix return value of setflags ioctl Btrfs: fix uncheck memory allocations btrfs: make inode ref log recovery faster Btrfs: add btrfs_trim_fs() to handle FITRIM Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytes Btrfs: make btrfs_map_block() return entire free extent for each device of RAID0/1/10/DUP Btrfs: make update_reserved_bytes() public btrfs: return EXDEV when linking from different subvolumes Btrfs: Per file/directory controls for COW and compression Btrfs: add datacow flag in inode flag btrfs: use GFP_NOFS instead of GFP_KERNEL Btrfs: check return value of read_tree_block() btrfs: properly access unaligned checksum buffer ... Fix up trivial conflicts in fs/btrfs/volumes.c due to plug removal in the block layer.
| * Btrfs: fix __btrfs_map_block on 32 bit machinesChris Mason2011-03-281-6/+20
| | | | | | | | | | | | | | Recent changes for discard support didn't compile, this fixes them not to try and % 64 bit numbers. Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: make btrfs_map_block() return entire free extent for each device of ↵Li Dongyang2011-03-281-22/+128
| | | | | | | | | | | | | | | | | | | | | | RAID0/1/10/DUP btrfs_map_block() will only return a single stripe length, but we want the full extent be mapped to each disk when we are trimming the extent, so we add length to btrfs_bio_stripe and fill it if we are mapping for REQ_DISCARD. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
| * Btrfs: add initial tracepoint support for btrfsliubo2011-03-281-11/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tracepoints can provide insight into why btrfs hits bugs and be greatly helpful for debugging, e.g dd-7822 [000] 2121.641088: btrfs_inode_request: root = 5(FS_TREE), gen = 4, ino = 256, blocks = 8, disk_i_size = 0, last_trans = 8, logged_trans = 0 dd-7822 [000] 2121.641100: btrfs_inode_new: root = 5(FS_TREE), gen = 8, ino = 257, blocks = 0, disk_i_size = 0, last_trans = 0, logged_trans = 0 btrfs-transacti-7804 [001] 2146.935420: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29368320 (orig_level = 0), cow_buf = 29388800 (cow_level = 0) btrfs-transacti-7804 [001] 2146.935473: btrfs_cow_block: root = 1(ROOT_TREE), refs = 2, orig_buf = 29364224 (orig_level = 0), cow_buf = 29392896 (cow_level = 0) btrfs-transacti-7804 [001] 2146.972221: btrfs_transaction_commit: root = 1(ROOT_TREE), gen = 8 flush-btrfs-2-7821 [001] 2155.824210: btrfs_chunk_alloc: root = 3(CHUNK_TREE), offset = 1103101952, size = 1073741824, num_stripes = 1, sub_stripes = 0, type = DATA flush-btrfs-2-7821 [001] 2155.824241: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29388800 (orig_level = 0), cow_buf = 29396992 (cow_level = 0) flush-btrfs-2-7821 [001] 2155.824255: btrfs_cow_block: root = 4(DEV_TREE), refs = 2, orig_buf = 29372416 (orig_level = 0), cow_buf = 29401088 (cow_level = 0) flush-btrfs-2-7821 [000] 2155.824329: btrfs_cow_block: root = 3(CHUNK_TREE), refs = 2, orig_buf = 20971520 (orig_level = 0), cow_buf = 20975616 (cow_level = 0) btrfs-endio-wri-7800 [001] 2155.898019: btrfs_cow_block: root = 5(FS_TREE), refs = 2, orig_buf = 29384704 (orig_level = 0), cow_buf = 29405184 (cow_level = 0) btrfs-endio-wri-7800 [001] 2155.898043: btrfs_cow_block: root = 7(CSUM_TREE), refs = 2, orig_buf = 29376512 (orig_level = 0), cow_buf = 29409280 (cow_level = 0) Here is what I have added: 1) ordere_extent: btrfs_ordered_extent_add btrfs_ordered_extent_remove btrfs_ordered_extent_start btrfs_ordered_extent_put These provide critical information to understand how ordered_extents are updated. 2) extent_map: btrfs_get_extent extent_map is used in both read and write cases, and it is useful for tracking how btrfs specific IO is running. 3) writepage: __extent_writepage btrfs_writepage_end_io_hook Pages are cirtical resourses and produce a lot of corner cases during writeback, so it is valuable to know how page is written to disk. 4) inode: btrfs_inode_new btrfs_inode_request btrfs_inode_evict These can show where and when a inode is created, when a inode is evicted. 5) sync: btrfs_sync_file btrfs_sync_fs These show sync arguments. 6) transaction: btrfs_transaction_commit In transaction based filesystem, it will be useful to know the generation and who does commit. 7) back reference and cow: btrfs_delayed_tree_ref btrfs_delayed_data_ref btrfs_delayed_ref_head btrfs_cow_block Btrfs natively supports back references, these tracepoints are helpful on understanding btrfs's COW mechanism. 8) chunk: btrfs_chunk_alloc btrfs_chunk_free Chunk is a link between physical offset and logical offset, and stands for space infomation in btrfs, and these are helpful on tracing space things. 9) reserved_extent: btrfs_reserved_extent_alloc btrfs_reserved_extent_free These can show how btrfs uses its space. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
* | Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe2011-03-101-80/+11
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * block: remove per-queue pluggingJens Axboe2011-03-101-80/+11
| | | | | | | | | | | | | | | | Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds2011-02-251-3/+10
|\ \ | |/ |/| | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix fiemap bugs with delalloc Btrfs: set FMODE_EXCL in btrfs_device->mode Btrfs: make btrfs_rm_device() fail gracefully Btrfs: Avoid accessing unmapped kernel address Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl Btrfs: allow balance to explicitly allocate chunks as it relocates Btrfs: put ENOSPC debugging under a mount option
| * Btrfs: set FMODE_EXCL in btrfs_device->modeIlya Dryomov2011-02-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | This fixes a bug introduced in d4d77629, where the device added online (and therefore initialized via btrfs_init_new_device()) would be left with the positive bdev->bd_holders after unmount. Since d4d77629 we no longer OR FMODE_EXCL explicitly on blkdev_put(), set it in btrfs_device->mode. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>