summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/free-space-cache.c
Commit message (Collapse)AuthorAgeFilesLines
* Btrfs: fix use-after-free when dumping free spaceFilipe Manana2018-11-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9084cb6a24bf5838a665af92ded1af8363f9e563 upstream. We were iterating a block group's free space cache rbtree without locking first the lock that protects it (the free_space_ctl->free_space_offset rbtree is protected by the free_space_ctl->tree_lock spinlock). KASAN reported an use-after-free problem when iterating such a rbtree due to a concurrent rbtree delete: [ 9520.359168] ================================================================== [ 9520.359656] BUG: KASAN: use-after-free in rb_next+0x13/0x90 [ 9520.359949] Read of size 8 at addr ffff8800b7ada500 by task btrfs-transacti/1721 [ 9520.360357] [ 9520.360530] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G L 4.19.0-rc8-nbor #555 [ 9520.360990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 9520.362682] Call Trace: [ 9520.362887] dump_stack+0xa4/0xf5 [ 9520.363146] print_address_description+0x78/0x280 [ 9520.363412] kasan_report+0x263/0x390 [ 9520.363650] ? rb_next+0x13/0x90 [ 9520.363873] __asan_load8+0x54/0x90 [ 9520.364102] rb_next+0x13/0x90 [ 9520.364380] btrfs_dump_free_space+0x146/0x160 [btrfs] [ 9520.364697] dump_space_info+0x2cd/0x310 [btrfs] [ 9520.364997] btrfs_reserve_extent+0x1ee/0x1f0 [btrfs] [ 9520.365310] __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs] [ 9520.365646] ? btrfs_update_time+0x180/0x180 [btrfs] [ 9520.365923] ? _raw_spin_unlock+0x27/0x40 [ 9520.366204] ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs] [ 9520.366549] btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs] [ 9520.366880] cache_save_setup+0x42e/0x580 [btrfs] [ 9520.367220] ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs] [ 9520.367518] ? lock_downgrade+0x2f0/0x2f0 [ 9520.367799] ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs] [ 9520.368104] ? kasan_check_read+0x11/0x20 [ 9520.368349] ? do_raw_spin_unlock+0xa8/0x140 [ 9520.368638] btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs] [ 9520.368978] ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs] [ 9520.369282] ? do_raw_spin_unlock+0xa8/0x140 [ 9520.369534] ? _raw_spin_unlock+0x27/0x40 [ 9520.369811] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs] [ 9520.370137] commit_cowonly_roots+0x4b9/0x610 [btrfs] [ 9520.370560] ? commit_fs_roots+0x350/0x350 [btrfs] [ 9520.370926] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs] [ 9520.371285] btrfs_commit_transaction+0x5e5/0x10e0 [btrfs] [ 9520.371612] ? btrfs_apply_pending_changes+0x90/0x90 [btrfs] [ 9520.371943] ? start_transaction+0x168/0x6c0 [btrfs] [ 9520.372257] transaction_kthread+0x21c/0x240 [btrfs] [ 9520.372537] kthread+0x1d2/0x1f0 [ 9520.372793] ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs] [ 9520.373090] ? kthread_park+0xb0/0xb0 [ 9520.373329] ret_from_fork+0x3a/0x50 [ 9520.373567] [ 9520.373738] Allocated by task 1804: [ 9520.373974] kasan_kmalloc+0xff/0x180 [ 9520.374208] kasan_slab_alloc+0x11/0x20 [ 9520.374447] kmem_cache_alloc+0xfc/0x2d0 [ 9520.374731] __btrfs_add_free_space+0x40/0x580 [btrfs] [ 9520.375044] unpin_extent_range+0x4f7/0x7a0 [btrfs] [ 9520.375383] btrfs_finish_extent_commit+0x15f/0x4d0 [btrfs] [ 9520.375707] btrfs_commit_transaction+0xb06/0x10e0 [btrfs] [ 9520.376027] btrfs_alloc_data_chunk_ondemand+0x237/0x5c0 [btrfs] [ 9520.376365] btrfs_check_data_free_space+0x81/0xd0 [btrfs] [ 9520.376689] btrfs_delalloc_reserve_space+0x25/0x80 [btrfs] [ 9520.377018] btrfs_direct_IO+0x42e/0x6d0 [btrfs] [ 9520.377284] generic_file_direct_write+0x11e/0x220 [ 9520.377587] btrfs_file_write_iter+0x472/0xac0 [btrfs] [ 9520.377875] aio_write+0x25c/0x360 [ 9520.378106] io_submit_one+0xaa0/0xdc0 [ 9520.378343] __se_sys_io_submit+0xfa/0x2f0 [ 9520.378589] __x64_sys_io_submit+0x43/0x50 [ 9520.378840] do_syscall_64+0x7d/0x240 [ 9520.379081] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 9520.379387] [ 9520.379557] Freed by task 1802: [ 9520.379782] __kasan_slab_free+0x173/0x260 [ 9520.380028] kasan_slab_free+0xe/0x10 [ 9520.380262] kmem_cache_free+0xc1/0x2c0 [ 9520.380544] btrfs_find_space_for_alloc+0x4cd/0x4e0 [btrfs] [ 9520.380866] find_free_extent+0xa99/0x17e0 [btrfs] [ 9520.381166] btrfs_reserve_extent+0xd5/0x1f0 [btrfs] [ 9520.381474] btrfs_get_blocks_direct+0x60b/0xbd0 [btrfs] [ 9520.381761] __blockdev_direct_IO+0x10ee/0x58a1 [ 9520.382059] btrfs_direct_IO+0x25a/0x6d0 [btrfs] [ 9520.382321] generic_file_direct_write+0x11e/0x220 [ 9520.382623] btrfs_file_write_iter+0x472/0xac0 [btrfs] [ 9520.382904] aio_write+0x25c/0x360 [ 9520.383172] io_submit_one+0xaa0/0xdc0 [ 9520.383416] __se_sys_io_submit+0xfa/0x2f0 [ 9520.383678] __x64_sys_io_submit+0x43/0x50 [ 9520.383927] do_syscall_64+0x7d/0x240 [ 9520.384165] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 9520.384439] [ 9520.384610] The buggy address belongs to the object at ffff8800b7ada500 which belongs to the cache btrfs_free_space of size 72 [ 9520.385175] The buggy address is located 0 bytes inside of 72-byte region [ffff8800b7ada500, ffff8800b7ada548) [ 9520.385691] The buggy address belongs to the page: [ 9520.385957] page:ffffea0002deb680 count:1 mapcount:0 mapping:ffff880108a1d700 index:0x0 compound_mapcount: 0 [ 9520.388030] flags: 0x8100(slab|head) [ 9520.388281] raw: 0000000000008100 ffffea0002deb608 ffffea0002728808 ffff880108a1d700 [ 9520.388722] raw: 0000000000000000 0000000000130013 00000001ffffffff 0000000000000000 [ 9520.389169] page dumped because: kasan: bad access detected [ 9520.389473] [ 9520.389658] Memory state around the buggy address: [ 9520.389943] ffff8800b7ada400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 9520.390368] ffff8800b7ada480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 9520.390796] >ffff8800b7ada500: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc [ 9520.391223] ^ [ 9520.391461] ffff8800b7ada580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 9520.391885] ffff8800b7ada600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 9520.392313] ================================================================== [ 9520.392772] BTRFS critical (device vdc): entry offset 2258497536, bytes 131072, bitmap no [ 9520.393247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011 [ 9520.393705] PGD 800000010dbab067 P4D 800000010dbab067 PUD 107551067 PMD 0 [ 9520.394059] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 9520.394378] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G B L 4.19.0-rc8-nbor #555 [ 9520.394858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 9520.395350] RIP: 0010:rb_next+0x3c/0x90 [ 9520.396461] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292 [ 9520.396762] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c [ 9520.397115] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011 [ 9520.397468] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc [ 9520.397821] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000 [ 9520.398188] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000 [ 9520.398555] FS: 0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000 [ 9520.399007] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9520.399335] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0 [ 9520.399679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9520.400023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9520.400400] Call Trace: [ 9520.400648] btrfs_dump_free_space+0x146/0x160 [btrfs] [ 9520.400974] dump_space_info+0x2cd/0x310 [btrfs] [ 9520.401287] btrfs_reserve_extent+0x1ee/0x1f0 [btrfs] [ 9520.401609] __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs] [ 9520.401952] ? btrfs_update_time+0x180/0x180 [btrfs] [ 9520.402232] ? _raw_spin_unlock+0x27/0x40 [ 9520.402522] ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs] [ 9520.402882] btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs] [ 9520.403261] cache_save_setup+0x42e/0x580 [btrfs] [ 9520.403570] ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs] [ 9520.403871] ? lock_downgrade+0x2f0/0x2f0 [ 9520.404161] ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs] [ 9520.404481] ? kasan_check_read+0x11/0x20 [ 9520.404732] ? do_raw_spin_unlock+0xa8/0x140 [ 9520.405026] btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs] [ 9520.405375] ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs] [ 9520.405694] ? do_raw_spin_unlock+0xa8/0x140 [ 9520.405958] ? _raw_spin_unlock+0x27/0x40 [ 9520.406243] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs] [ 9520.406574] commit_cowonly_roots+0x4b9/0x610 [btrfs] [ 9520.406899] ? commit_fs_roots+0x350/0x350 [btrfs] [ 9520.407253] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs] [ 9520.407589] btrfs_commit_transaction+0x5e5/0x10e0 [btrfs] [ 9520.407925] ? btrfs_apply_pending_changes+0x90/0x90 [btrfs] [ 9520.408262] ? start_transaction+0x168/0x6c0 [btrfs] [ 9520.408582] transaction_kthread+0x21c/0x240 [btrfs] [ 9520.408870] kthread+0x1d2/0x1f0 [ 9520.409138] ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs] [ 9520.409440] ? kthread_park+0xb0/0xb0 [ 9520.409682] ret_from_fork+0x3a/0x50 [ 9520.410508] Dumping ftrace buffer: [ 9520.410764] (ftrace buffer empty) [ 9520.411007] CR2: 0000000000000011 [ 9520.411297] ---[ end trace 01a0863445cf360a ]--- [ 9520.411568] RIP: 0010:rb_next+0x3c/0x90 [ 9520.412644] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292 [ 9520.412932] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c [ 9520.413274] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011 [ 9520.413616] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc [ 9520.414007] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000 [ 9520.414349] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000 [ 9520.416074] FS: 0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000 [ 9520.416536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9520.416848] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0 [ 9520.418477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 9520.418846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 9520.419204] Kernel panic - not syncing: Fatal exception [ 9520.419666] Dumping ftrace buffer: [ 9520.419930] (ftrace buffer empty) [ 9520.420168] Kernel Offset: disabled [ 9520.420406] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fix this by acquiring the respective lock before iterating the rbtree. Reported-by: Nikolay Borisov <nborisov@suse.com> CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: set max_extent_size properlyJosef Bacik2018-11-131-10/+20
| | | | | | | | | | | | | | | commit ad22cf6ea47fa20fbe11ac324a0a15c0a9a4a2a9 upstream. We can't use entry->bytes if our entry is a bitmap entry, we need to use entry->max_extent_size in that case. Fix up all the logic to make this consistent. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: reset max_extent_size on clear in a bitmapJosef Bacik2018-11-131-0/+2
| | | | | | | | | | | | | | | commit 553cceb49681d60975d00892877d4c871bf220f9 upstream. We need to clear the max_extent_size when we clear bits from a bitmap since it could have been from the range that contains the max_extent_size. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: protect space cache inode alloc with GFP_NOFSJosef Bacik2018-11-131-0/+8
| | | | | | | | | | | | | | | | | commit 84de76a2fb217dc1b6bc2965cc397d1648aa1404 upstream. If we're allocating a new space cache inode it's likely going to be under a transaction handle, so we need to use memalloc_nofs_save() in order to avoid deadlocks, and more importantly lockdep messages that make xfstests fail. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: fix deadlock when writing out space cacheJosef Bacik2018-02-031-1/+2
| | | | | | | | | | | | | | | | | | [ Upstream commit b77000ed558daa3bef0899d29bf171b8c9b5e6a8 ] If we fail to prepare our pages for whatever reason (out of memory in our case) we need to make sure to drop the block_group->data_rwsem, otherwise hilarity ensues. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add label and use existing unlocking code ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: drop newlines from strings when using btrfs_* helpersDavid Sterba2017-08-161-1/+1
| | | | | | | | | | The helpers append "\n" so we can keep the actual strings shorter. The extra newline will print an empty line. Some messages have been slightly modified to be more consistent with the rest (lowercase first letter). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: use clear_page where appropriateDavid Sterba2017-04-181-1/+1
| | | | | | | | There's a helper to clear whole page, with a arch-specific optimized code. The replaced cases do not seem to be in performace critical code, but we still might get some percent gain. Signed-off-by: David Sterba <dsterba@suse.com>
* Merge branch 'WIP.sched-core-for-linus' of ↵Linus Torvalds2017-03-031-0/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched.h split-up from Ingo Molnar: "The point of these changes is to significantly reduce the <linux/sched.h> header footprint, to speed up the kernel build and to have a cleaner header structure. After these changes the new <linux/sched.h>'s typical preprocessed size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K lines), which is around 40% faster to build on typical configs. Not much changed from the last version (-v2) posted three weeks ago: I eliminated quirks, backmerged fixes plus I rebased it to an upstream SHA1 from yesterday that includes most changes queued up in -next plus all sched.h changes that were pending from Andrew. I've re-tested the series both on x86 and on cross-arch defconfigs, and did a bisectability test at a number of random points. I tried to test as many build configurations as possible, but some build breakage is probably still left - but it should be mostly limited to architectures that have no cross-compiler binaries available on kernel.org, and non-default configurations" * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits) sched/headers: Clean up <linux/sched.h> sched/headers: Remove #ifdefs from <linux/sched.h> sched/headers: Remove the <linux/topology.h> include from <linux/sched.h> sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h> sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h> sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h> sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h> sched/headers: Remove <linux/sched.h> from <linux/sched/init.h> sched/core: Remove unused prefetch_stack() sched/headers: Remove <linux/rculist.h> from <linux/sched.h> sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h> sched/headers: Remove <linux/signal.h> from <linux/sched.h> sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> sched/headers: Remove the runqueue_is_locked() prototype sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h> sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h> sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h> sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h> ...
| * sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API ↵Ingo Molnar2017-03-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dependency Instead of including the full <linux/signal.h>, we are going to include the types-only <linux/signal_types.h> header in <linux/sched.h>, to further decouple the scheduler header from the signal headers. This means that various files which relied on the full <linux/signal.h> need to be updated to gain an explicit dependency on it. Update the code that relies on sched.h's inclusion of the <linux/signal.h> header. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | btrfs: all btrfs_delalloc_release_metadata take btrfs_inodeNikolay Borisov2017-02-281-1/+2
| | | | | | | | | | Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | btrfs: Make btrfs_i_size_write take btrfs_inodeNikolay Borisov2017-02-281-1/+1
|/ | | | | Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: btrfs_truncate_free_space_cache always allocates pathJeff Mahoney2017-02-171-7/+7
| | | | | | | | | btrfs_truncate_free_space_cache always allocates a btrfs_path structure but only uses it when the caller passes a block group. Let's move the allocation and free into the conditional. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: free-space-cache, clean up unnecessary root argumentsJeff Mahoney2017-02-171-14/+11
| | | | | | | | | | The free space cache APIs accept a root but always use the tree root. Also, btrfs_truncate_free_space_cache accepts a root AND an inode but the inode always points to the root anyway, so let's just pass the inode. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: remove unused parameters from __btrfs_write_out_cacheDavid Sterba2017-02-171-8/+3
| | | | | | | | Both unused after the call to update_cache_item has been moved to __btrfs_wait_cache_io. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: remove unused parameter from cleanup_write_cache_enospcDavid Sterba2017-02-171-3/+2
| | | | | | | bitmap_list is unused since the io_ctl framework. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: make space cache inode readahead failure nonfatalDavid Sterba2017-02-171-7/+3
| | | | | | | | | We do a readahead of the free space cache inode to speed things up but the failure is not fatal, like in other readahead cases. Proper reads would need to happen anyway and any errors would be caught there. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: Make btrfs_ino take a struct btrfs_inodeNikolay Borisov2017-02-141-1/+1
| | | | | | | | | | | | | | | | Currently btrfs_ino takes a struct inode and this causes a lot of internal btrfs functions which consume this ino to take a VFS inode, rather than btrfs' own struct btrfs_inode. In order to fix this "leak" of VFS structs into the internals of btrfs first it's necessary to eliminate all uses of struct inode for the purpose of inode. This patch does that by using BTRFS_I to convert an inode to btrfs_inode. With this problem eliminated subsequent patches will start eliminating the passing of struct inode altogether, eventually resulting in a lot cleaner code. Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> [ fix btrfs_get_extent tracepoint prototype ] Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: opencode chunk locking, remove helpersDavid Sterba2016-12-061-2/+2
| | | | | | The helpers are trivial and we don't use them consistently. Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: take an fs_info directly when the root is not used otherwiseJeff Mahoney2016-12-061-14/+10
| | | | | | | | | There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: simplify btrfs_wait_cache_io prototypeJeff Mahoney2016-12-061-12/+28
| | | | | | | | | | | With the exception of the one case where btrfs_wait_cache_io is called without a block group, it's called with the same arguments. The root argument is only used in the special case, so let's factor out the core and simplify the call in the normal case to require a trans, block group, and path. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: root->fs_info cleanup, add fs_info convenience variablesJeff Mahoney2016-12-061-38/+51
| | | | | | | | | In routines where someptr->fs_info is referenced multiple times, we introduce a convenience variable. This makes the code considerably more readable. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: root->fs_info cleanup, lock/unlock_chunksJeff Mahoney2016-12-061-2/+2
| | | | | Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_sizeJeff Mahoney2016-12-061-2/+2
| | | | | Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: pull node/sector/stripe sizes out of root and into fs_infoJeff Mahoney2016-12-061-4/+4
| | | | | | | | | | We track the node sizes per-root, but they never vary from the values in the superblock. This patch messes with the 80-column style a bit, but subsequent patches to factor out root->fs_info into a convenience variable fix it up again. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: root->fs_info cleanup, io_ctl_initJeff Mahoney2016-12-061-6/+6
| | | | | | | The io_ctl->root member was only being used to access root->fs_info. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: call functions that overwrite their root parameter with fs_infoJeff Mahoney2016-12-061-3/+2
| | | | | | | | There are 11 functions that accept a root parameter and immediately overwrite it. We can pass those an fs_info pointer instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: remove redundant check of btrfs_iget return valueChristophe JAILLET2016-11-301-2/+0
| | | | | | | | 'btrfs_iget()' can not return NULL, so this test can be removed. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: change btrfs_csum_final result param type to u8Domagoj Tršan2016-11-301-2/+2
| | | | | | | | | | | csum member of struct btrfs_super_block has array type of u8. It makes sense that function btrfs_csum_final should be also declared to accept u8 *. I changed the declaration of method void btrfs_csum_final(u32 crc, char *result); to void btrfs_csum_final(u32 crc, u8 *result); Signed-off-by: Domagoj Tršan <domagoj.trsan@gmail.com> [ changed cast to u8 at several call sites ] Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: remove constant parameter to memset_extent_buffer and rename itDavid Sterba2016-11-301-2/+2
| | | | | | | The only memset we do is to 0, so sink the parameter to the function and simplify all calls. Rename the function to reflect the behaviour. Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: convert pr_* to btrfs_* where possibleJeff Mahoney2016-09-261-3/+5
| | | | | | | | | | | | | | For many printks, we want to know which file system issued the message. This patch converts most pr_* calls to use the btrfs_* versions instead. In some cases, this means adding plumbing to allow call sites access to an fs_info pointer. fs/btrfs/check-integrity.c is left alone for another day. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: convert printk(KERN_* to use pr_* callsJeff Mahoney2016-09-261-1/+1
| | | | | | | | | | This patch converts printk(KERN_* style messages to use the pr_* versions. One side effect is that anything that was KERN_DEBUG is now automatically a dynamic debug message. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: unsplit printed stringsJeff Mahoney2016-09-261-6/+7
| | | | | | | | | | | CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: btrfs_abort_transaction, drop root parameterJeff Mahoney2016-07-261-1/+1
| | | | | | | | | __btrfs_abort_transaction doesn't use its root parameter except to obtain an fs_info pointer. We can obtain that from trans->root->fs_info for now and from trans->fs_info in a later patch. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: btrfs_test_opt and friends should take a btrfs_fs_infoJeff Mahoney2016-07-261-3/+3
| | | | | | | | | btrfs_test_opt and friends only use the root pointer to access the fs_info. Let's pass the fs_info directly in preparation to eliminate similar patterns all over btrfs. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* Btrfs: self-tests: Support non-4k page sizeFeifei Xu2016-06-021-1/+1
| | | | | | | | | | | self-tests code assumes 4k as the sectorsize and nodesize. This commit fix hardcoded 4K. Enables the self-tests code to be executed on non-4k page sized systems (e.g. ppc64). Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
* Btrfs: Fix integer overflow when calculating bytes_per_bitmapFeifei Xu2016-06-021-6/+6
| | | | | | | | | | On ppc64, bytes_per_bitmap will be (65536*8*65536). Hence append UL to fix integer overflow. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
* Btrfs: test_check_exists: Fix infinite loop when searching for free space ↵Feifei Xu2016-06-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | entries On a ppc64 machine using 64K as the block size, assume that the RB tree at btrfs_free_space_ctl->free_space_offset contains following two entries: 1. A bitmap entry having an offset value of 0 and having the bits corresponding to the address range [128M+512K, 128M+768K] set. 2. An extent entry corresponding to the address range [128M-256K, 128M-128K] In such a scenario, test_check_exists() invoked for checking the existence of address range [128M+768K, 256M] can lead to an infinite loop as explained below: - Checking for the extent entry fails. - Checking for a bitmap entry results in the free space info in range [128M+512K, 128M+768K] beng returned. - rb_prev(info) returns NULL because the bitmap entry starting from offset 0 comes first in the RB tree. - current_node = bitmap node. - while (current_node) tmp = rb_next(bitmap_node);/*tmp is extent based free space entry*/ Since extent based free space entry's last address is smaller than the address being searched for (i.e. 128M+768K) we incorrectly again obtain the extent node as the "next right node" of the RB tree and thus end up looping infinitely. This patch fixes the issue by checking the "tmp" variable which point to the most recently searched free space node. Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Feifei Xu <xufeifei@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
* btrfs: fix string and comment grammatical issues and typosNicholas D Steeves2016-05-251-1/+1
| | | | | Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
* mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov2016-04-041-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'misc-cleanups-4.5' of ↵Chris Mason2016-01-111-16/+8
|\ | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 Signed-off-by: Chris Mason <clm@fb.com>
| * btrfs: use list_for_each_entry_safe in free-space-cache.cGeliang Tang2016-01-071-10/+4
| | | | | | | | | | | | | | | | | | Use list_for_each_entry_safe() instead of list_for_each_safe() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@163.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * Btrfs: use linux/sizes.h to represent constantsByongho Lee2016-01-071-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use many constants to represent size and offset value. And to make code readable we use '256 * 1024 * 1024' instead of '268435456' to represent '256MB'. However we can make far more readable with 'SZ_256MB' which is defined in the 'linux/sizes.h'. So this patch replaces 'xxx * 1024 * 1024' kind of expression with single 'SZ_xxxMB' if 'xxx' is a power of 2 then 'xxx * SZ_1M' if 'xxx' is not a power of 2. And I haven't touched to '4096' & '8192' because it's more intuitive than 'SZ_4KB' & 'SZ_8KB'. Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge branch 'misc-for-4.5' of ↵Chris Mason2016-01-111-1/+1
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5
| * | btrfs: constify remaining structs with function pointersDavid Sterba2016-01-071-1/+1
| |/ | | | | | | | | | | | | * struct extent_io_ops * struct btrfs_free_space_op Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge branch 'dev/simplify-set-bit' of ↵Chris Mason2015-12-231-1/+1
|\ \ | |/ |/| | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 Signed-off-by: Chris Mason <clm@fb.com>
| * btrfs: drop unused parameter from lock_extent_bitsDavid Sterba2015-12-031-1/+1
| | | | | | | | | | | | We've always passed 0. Stack usage will slightly decrease. Signed-off-by: David Sterba <dsterba@suse.com>
* | Merge branch 'for-linus-4.4' of ↵Linus Torvalds2015-12-181-4/+6
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "A couple of small fixes" * 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: check prepare_uptodate_page() error code earlier Btrfs: check for empty bitmap list in setup_cluster_bitmaps btrfs: fix misleading warning when space cache failed to load Btrfs: fix transaction handle leak in balance Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list
| * Merge branch 'for-chris-4.4' of ↵Chris Mason2015-12-151-1/+1
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux into for-linus-4.4
| | * btrfs: fix misleading warning when space cache failed to loadHolger Hoffstätte2015-12-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an inconsistent space cache is detected during loading we log a warning that users frequently mistake as instruction to invalidate the cache manually, even though this is not required. Fix the message to indicate that the cache will be rebuilt automatically. Signed-off-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Acked-by: Filipe Manana <fdmanana@suse.com>
| * | Btrfs: check for empty bitmap list in setup_cluster_bitmapsChris Mason2015-12-151-3/+5
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dave Jones found a warning from kasan in setup_cluster_bitmaps() ================================================================== BUG: KASAN: stack-out-of-bounds in setup_cluster_bitmap+0xc4/0x5a0 at addr ffff88039bef6828 Read of size 8 by task nfsd/1009 page:ffffea000e6fbd80 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x8000000000000000() page dumped because: kasan: bad access detected CPU: 1 PID: 1009 Comm: nfsd Tainted: G W 4.4.0-rc3-backup-debug+ #1 ffff880065647b50 000000006bb712c2 ffff88039bef6640 ffffffffa680a43e 0000004559c00000 ffff88039bef66c8 ffffffffa62638d1 ffffffffa61121c0 ffff8803a5769de8 0000000000000296 ffff8803a5769df0 0000000000046280 Call Trace: [<ffffffffa680a43e>] dump_stack+0x4b/0x6d [<ffffffffa62638d1>] kasan_report_error+0x501/0x520 [<ffffffffa61121c0>] ? debug_show_all_locks+0x1e0/0x1e0 [<ffffffffa6263948>] kasan_report+0x58/0x60 [<ffffffffa6814b00>] ? rb_last+0x10/0x40 [<ffffffffa66f8af4>] ? setup_cluster_bitmap+0xc4/0x5a0 [<ffffffffa6262ead>] __asan_load8+0x5d/0x70 [<ffffffffa66f8af4>] setup_cluster_bitmap+0xc4/0x5a0 [<ffffffffa66f675a>] ? setup_cluster_no_bitmap+0x6a/0x400 [<ffffffffa66fcd16>] btrfs_find_space_cluster+0x4b6/0x640 [<ffffffffa66fc860>] ? btrfs_alloc_from_cluster+0x4e0/0x4e0 [<ffffffffa66fc36e>] ? btrfs_return_cluster_to_free_space+0x9e/0xb0 [<ffffffffa702dc37>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa666a1a1>] find_free_extent+0xba1/0x1520 Andrey noticed this was because we were doing list_first_entry on a list that might be empty. Rework the tests a bit so we don't do that. Signed-off-by: Chris Mason <clm@fb.com> Reprorted-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Reported-by: Dave Jones <dsj@fb.com>