summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'pstore-v4.8-rc1' of ↵Linus Torvalds2016-08-051-19/+10
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore fixes from Kees Cook: "Fixes for pstore ramoops driver to catch bad kfree() and to use better DT bindings" * tag 'pstore-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ramoops: use persistent_ram_free() instead of kfree() for freeing prz ramoops: use DT reserved-memory bindings
| * ramoops: use persistent_ram_free() instead of kfree() for freeing przHiraku Toyooka2016-08-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | persistent_ram_zone(=prz) structures are allocated by persistent_ram_new(), which includes vmap() or ioremap(). But they are currently freed by kfree(). This uses persistent_ram_free() for correct this asymmetry usage. Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.kw@hitachi.com> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Seiji Aguchi <seiji.aguchi.tr@hitachi.com> Signed-off-by: Kees Cook <keescook@chromium.org>
| * ramoops: use DT reserved-memory bindingsKees Cook2016-08-051-16/+7
| | | | | | | | | | | | | | | | | | | | | | Instead of a ramoops-specific node, use a child node of /reserved-memory. This requires that of_platform_device_create() be explicitly called for the node, though, since "/reserved-memory" does not have its own "compatible" property. Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rob Herring <robh@kernel.org>
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2016-08-055-12/+7
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: "Here's the second round of block updates for this merge window. It's a mix of fixes for changes that went in previously in this round, and fixes in general. This pull request contains: - Fixes for loop from Christoph - A bdi vs gendisk lifetime fix from Dan, worth two cookies. - A blk-mq timeout fix, when on frozen queues. From Gabriel. - Writeback fix from Jan, ensuring that __writeback_single_inode() does the right thing. - Fix for bio->bi_rw usage in f2fs from me. - Error path deadlock fix in blk-mq sysfs registration from me. - Floppy O_ACCMODE fix from Jiri. - Fix to the new bio op methods from Mike. One more followup will be coming here, ensuring that we don't propagate the block types outside of block. That, and a rename of bio->bi_rw is coming right after -rc1 is cut. - Various little fixes" * 'for-linus' of git://git.kernel.dk/linux-block: mm/block: convert rw_page users to bio op use loop: make do_req_filebacked more robust loop: don't try to use AIO for discards blk-mq: fix deadlock in blk_mq_register_disk() error path Include: blkdev: Removed duplicate 'struct request;' declaration. Fixup direct bi_rw modifiers block: fix bdi vs gendisk lifetime mismatch blk-mq: Allow timeouts to run while queue is freezing nbd: fix race in ioctl block: fix use-after-free in seq file f2fs: drop bio->bi_rw manual assignment block: add missing group association in bio-cloning functions blkcg: kill unused field nr_undestroyed_grps writeback: Write dirty times for WB_SYNC_ALL writeback floppy: fix open(O_ACCMODE) for ioctl-only open
| * | mm/block: convert rw_page users to bio op useMike Christie2016-08-042-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rw_page users were not converted to use bio/req ops. As a result bdev_write_page is not passing down REQ_OP_WRITE and the IOs will be sent down as reads. Signed-off-by: Mike Christie <mchristi@redhat.com> Fixes: 4e1b2d52a80d ("block, fs, drivers: remove REQ_OP compat defs and related code") Modified by me to: 1) Drop op_flags passing into ->rw_page(), as we don't use it. 2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK Signed-off-by: Jens Axboe <axboe@fb.com>
| * | Fixup direct bi_rw modifiersShaun Tancheff2016-08-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bi_rw should be using bio_set_op_attrs to set bi_rw. Signed-off-by: Shaun Tancheff <shaun@tancheff.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | f2fs: drop bio->bi_rw manual assignmentJens Axboe2016-08-041-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Merge 4fc29c1aa375 included this extra line, but it's not needed (or useful) since we'll bio_set_op_attrs() right after to properly set the op and flags for the bio. Signed-off-by: Jens Axboe <axboe@fb.com>
| * | block: add missing group association in bio-cloning functionsPaolo Valente2016-08-041-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a bio is cloned, the newly created bio must be associated with the same blkcg as the original bio (if BLK_CGROUP is enabled). If this operation is not performed, then the new bio is not associated with any group, and the group of the current task is returned when the group of the bio is requested. Depending on the cloning frequency, this may cause a large percentage of the bios belonging to a given group to be treated as if belonging to other groups (in most cases as if belonging to the root group). The expected group isolation may thereby be broken. This commit adds the missing association in bio-cloning functions. Fixes: da2f0f74cf7d ("Btrfs: add support for blkio controllers") Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Reviewed-by: Nikolay Borisov <kernel@kyup.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | writeback: Write dirty times for WB_SYNC_ALL writebackJan Kara2016-08-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we take care to handle I_DIRTY_TIME in vfs_fsync() and queue_io() so that inodes which have only dirty timestamps are properly written on fsync(2) and sync(2). However there are other call sites - most notably going through write_inode_now() - which expect inode to be clean after WB_SYNC_ALL writeback. This is not currently true as we do not clear I_DIRTY_TIME in __writeback_single_inode() even for WB_SYNC_ALL writeback in all the cases. This then resulted in the following oops because bdev_write_inode() did not clean the inode and writeback code later stumbled over a dirty inode with detached wb. general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Modules linked in: CPU: 3 PID: 32 Comm: kworker/u10:1 Not tainted 4.6.0-rc3+ #349 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: writeback wb_workfn (flush-11:0) task: ffff88006ccf1840 ti: ffff88006cda8000 task.ti: ffff88006cda8000 RIP: 0010:[<ffffffff818884d2>] [<ffffffff818884d2>] locked_inode_to_wb_and_lock_list+0xa2/0x750 RSP: 0018:ffff88006cdaf7d0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88006ccf2050 RDX: 0000000000000000 RSI: 000000114c8a8484 RDI: 0000000000000286 RBP: ffff88006cdaf820 R08: ffff88006ccf1840 R09: 0000000000000000 R10: 000229915090805f R11: 0000000000000001 R12: ffff88006a72f5e0 R13: dffffc0000000000 R14: ffffed000d4e5eed R15: ffffffff8830cf40 FS: 0000000000000000(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000003301bf8 CR3: 000000006368f000 CR4: 00000000000006e0 DR0: 0000000000001ec9 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Stack: ffff88006a72f680 ffff88006a72f768 ffff8800671230d8 03ff88006cdaf948 ffff88006a72f668 ffff88006a72f5e0 ffff8800671230d8 ffff88006cdaf948 ffff880065b90cc8 ffff880067123100 ffff88006cdaf970 ffffffff8188e12e Call Trace: [< inline >] inode_to_wb_and_lock_list fs/fs-writeback.c:309 [<ffffffff8188e12e>] writeback_sb_inodes+0x4de/0x1250 fs/fs-writeback.c:1554 [<ffffffff8188efa4>] __writeback_inodes_wb+0x104/0x1e0 fs/fs-writeback.c:1600 [<ffffffff8188f9ae>] wb_writeback+0x7ce/0xc90 fs/fs-writeback.c:1709 [< inline >] wb_do_writeback fs/fs-writeback.c:1844 [<ffffffff81891079>] wb_workfn+0x2f9/0x1000 fs/fs-writeback.c:1884 [<ffffffff813bcd1e>] process_one_work+0x78e/0x15c0 kernel/workqueue.c:2094 [<ffffffff813bdc2b>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2228 [<ffffffff813cdeef>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303 [<ffffffff867bc5d2>] ret_from_fork+0x22/0x50 arch/x86/entry/entry_64.S:392 Code: 05 94 4a a8 06 85 c0 0f 85 03 03 00 00 e8 07 15 d0 ff 41 80 3e 00 0f 85 64 06 00 00 49 8b 9c 24 88 01 00 00 48 89 d8 48 c1 e8 03 <42> 80 3c 28 00 0f 85 17 06 00 00 48 8b 03 48 83 c0 50 48 39 c3 RIP [< inline >] wb_get include/linux/backing-dev-defs.h:212 RIP [<ffffffff818884d2>] locked_inode_to_wb_and_lock_list+0xa2/0x750 fs/fs-writeback.c:281 RSP <ffff88006cdaf7d0> ---[ end trace 986a4d314dcb2694 ]--- Fix the problem by making sure __writeback_single_inode() writes inode only with dirty times in WB_SYNC_ALL mode. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* | | Merge tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2016-08-0426-185/+575
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull nfsd updates from Bruce Fields: "Highlights: - Trond made a change to the server's tcp logic that allows a fast client to better take advantage of high bandwidth networks, but may increase the risk that a single client could starve other clients; a new sunrpc.svc_rpc_per_connection_limit parameter should help mitigate this in the (hopefully unlikely) event this becomes a problem in practice. - Tom Haynes added a minimal flex-layout pnfs server, which is of no use in production for now--don't build it unless you're doing client testing or further server development" * tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd: remove some dead code in nfsd_create_locked() nfsd: drop unnecessary MAY_EXEC check from create nfsd: clean up bad-type check in nfsd_create_locked nfsd: remove unnecessary positive-dentry check nfsd: reorganize nfsd_create nfsd: check d_can_lookup in fh_verify of directories nfsd: remove redundant zero-length check from create nfsd: Make creates return EEXIST instead of EACCES SUNRPC: Detect immediate closure of accepted sockets SUNRPC: accept() may return sockets that are still in SYN_RECV nfsd: allow nfsd to advertise multiple layout types nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock nfsd/blocklayout: Make sure calculate signature/designator length aligned xfs: abstract block export operations from nfsd layouts SUNRPC: Remove unused callback xpo_adjust_wspace() SUNRPC: Change TCP socket space reservation SUNRPC: Add a server side per-connection limit SUNRPC: Micro optimisation for svc_data_ready SUNRPC: Call the default socket callbacks instead of open coding SUNRPC: lock the socket while detaching it ...
| * | | nfsd: remove some dead code in nfsd_create_locked()Dan Carpenter2016-08-041-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We changed this around in f135af1041f ('nfsd: reorganize nfsd_create') so "dchild" can't be an error pointer any more. Also, dchild can't be NULL here (and dput would already handle this even if it was). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: drop unnecessary MAY_EXEC check from createJ. Bruce Fields2016-08-042-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We need an fh_verify to make sure we at least have a dentry, but actual permission checks happen later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: clean up bad-type check in nfsd_create_lockedJ. Bruce Fields2016-08-041-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | Minor cleanup, no change in behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: remove unnecessary positive-dentry checkJ. Bruce Fields2016-08-041-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vfs_{create,mkdir,mknod} each begin with a call to may_create(), which returns EEXIST if the object already exists. This check is therefore unnecessary. (In the NFSv2 case, nfsd_proc_create also has such a check. Contrary to RFC 1094, our code seems to believe that a CREATE of an existing file should succeed. I'm leaving that behavior alone.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: reorganize nfsd_createJ. Bruce Fields2016-08-043-55/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's some odd logic in nfsd_create() that allows it to be called with the parent directory either locked or unlocked. The only already-locked caller is NFSv2's nfsd_proc_create(). It's less confusing to split out the unlocked case into a separate function which the NFSv2 code can call directly. Also fix some comments while we're here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: check d_can_lookup in fh_verify of directoriesJ. Bruce Fields2016-08-042-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create and other nfsd ops generally assume we can call lookup_one_len on inodes with S_IFDIR set. Al says that this assumption isn't true in general, though it should be for the filesystem objects nfsd sees. Add a check just to make sure our assumption isn't violated. Remove a couple checks for i_op->lookup in create code. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: remove redundant zero-length check from createJ. Bruce Fields2016-08-042-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lookup_one_len already has this check. The only effect of this patch is to return access instead of perm in the 0-length-filename case. I actually prefer nfserr_perm (or _inval?), but I doubt anyone cares. The isdotent check seems redundant too, but I worry that some client might actually care about that strange nfserr_exist error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: Make creates return EEXIST instead of EACCESOleg Drokin2016-08-042-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing a create (mkdir/mknod) on a name, it's worth checking the name exists first before returning EACCES in case the directory is not writeable by the user. This makes return values on the client more consistent regardless of whenever the entry there is cached in the local cache or not. Another positive side effect is certain programs only expect EEXIST in that case even despite POSIX allowing any valid error to be returned. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: allow nfsd to advertise multiple layout typesJeff Layton2016-07-155-24/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the underlying filesystem supports multiple layout types, then there is little reason not to advertise that fact to clients and let them choose what type to use. Turn the ex_layout_type field into a bitfield. For each supported layout type, we set a bit in that field. When the client requests a layout, ensure that the bit for that layout type is set. When the client requests attributes, send back a list of supported types. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Reviewed-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: Close race between nfsd4_release_lockowner and nfsd4_lockChuck Lever2016-07-151-23/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nfsd4_release_lockowner finds a lock owner that has no lock state, and drops cl_lock. Then release_lockowner picks up cl_lock and unhashes the lock owner. During the window where cl_lock is dropped, I don't see anything preventing a concurrent nfsd4_lock from finding that same lock owner and adding lock state to it. Move release_lockowner() into nfsd4_release_lockowner and hang onto the cl_lock until after the lock owner's state cannot be found again. Found by inspection, we don't currently have a reproducer. Fixes: 2c41beb0e5cf ("nfsd: reduce cl_lock thrashing in ... ") Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd/blocklayout: Make sure calculate signature/designator length alignedKinglong Mee2016-07-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These values are all multiples of 4 already, so there's no change in behavior from this patch. But perhaps this will prevent mistakes in the future. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | xfs: abstract block export operations from nfsd layoutsBenjamin Coddington2016-07-155-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of creeping pnfs layout configuration into filesystems, move the definition of block-based export operations under a more abstract configuration. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Dave Chinner <david@fromorbit.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: Fix some indent inconsistancyChristophe JAILLET2016-07-133-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Silent a few smatch warnings about indentation Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: Correct a comment for NFSD_MAY_ defines locationOleg Drokin2016-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Those are now defined in fs/nfsd/vfs.h Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Reviewed-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: Add a super simple flex file serverTom Haynes2016-07-137-1/+329
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have a simple flex file server where the mds (NFSv4.1 or NFSv4.2) is also the ds (NFSv3). I.e., the metadata and the data file are the exact same file. This will allow testing of the flex file client. Simply add the "pnfs" export option to your export in /etc/exports and mount from a client that supports flex files. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: flex file device id encoding will need the server addressTom Haynes2016-07-133-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Tom Haynes <loghyr@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: implement machine credential support for some operationsAndrew Elble2016-07-137-28/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses the conundrum referenced in RFC5661 18.35.3, and will allow clients to return state to the server using the machine credentials. The biggest part of the problem is that we need to allow the client to send a compound op with integrity/privacy on mounts that don't have it enabled. Add server support for properly decoding and using spo_must_enforce and spo_must_allow bits. Add support for machine credentials to be used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN, and TEST/FREE STATEID. Implement a check so as to not throw WRONGSEC errors when these operations are used if integrity/privacy isn't turned on. Without this, Linux clients with credentials that expired while holding delegations were getting stuck in an endless loop. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
| * | | nfsd: allow mach_creds_match to be used more broadlyAndrew Elble2016-07-132-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename mach_creds_match() to nfsd4_mach_creds_match() and un-staticify Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | Merge branch 'for-linus-4.8' of ↵Linus Torvalds2016-08-0443-686/+912
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull more btrfs updates from Chris Mason: "This is part two of my btrfs pull, which is some cleanups and a batch of fixes. Most of the code here is from Jeff Mahoney, making the pointers we pass around internally more consistent and less confusing overall. I noticed a small problem right before I sent this out yesterday, so I fixed it up and re-tested overnight" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (40 commits) Btrfs: fix __MAX_CSUM_ITEMS btrfs: btrfs_abort_transaction, drop root parameter btrfs: add btrfs_trans_handle->fs_info pointer btrfs: btrfs_relocate_chunk pass extent_root to btrfs_end_transaction btrfs: convert nodesize macros to static inlines btrfs: introduce BTRFS_MAX_ITEM_SIZE btrfs: cleanup, remove prototype for btrfs_find_root_ref btrfs: copy_to_sk drop unused root parameter btrfs: simpilify btrfs_subvol_inherit_props btrfs: tests, use BTRFS_FS_STATE_DUMMY_FS_INFO instead of dummy root btrfs: tests, require fs_info for root btrfs: tests, move initialization into tests/ btrfs: btrfs_test_opt and friends should take a btrfs_fs_info btrfs: prefix fsid to all trace events btrfs: plumb fs_info into btrfs_work btrfs: remove obsolete part of comment in statfs btrfs: hide test-only member under ifdef btrfs: Ratelimit "no csum found" info message btrfs: Add ratelimit to btrfs printing Btrfs: fix unexpected balance crash due to BUG_ON ...
| * | | | Btrfs: fix __MAX_CSUM_ITEMSChris Mason2016-08-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jeff Mahoney's cleanup commit (14a1e067b4) wasn't correct for csums on machines where the pagesize >= metadata blocksize. This just reverts the relevant hunks to bring the old math back. Signed-off-by: Chris Mason <clm@fb.com>
| * | | | btrfs: btrfs_abort_transaction, drop root parameterJeff Mahoney2016-07-2617-152/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __btrfs_abort_transaction doesn't use its root parameter except to obtain an fs_info pointer. We can obtain that from trans->root->fs_info for now and from trans->fs_info in a later patch. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: add btrfs_trans_handle->fs_info pointerJeff Mahoney2016-07-264-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_trans_handle->root is documented as for use for confirming that the root passed in to start the transaction is the same as the one ending it. It's used in several places when an fs_info pointer is needed, so let's just add an fs_info pointer directly. Eventually, the root pointer can be removed. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: btrfs_relocate_chunk pass extent_root to btrfs_end_transactionJeff Mahoney2016-07-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In btrfs_relocate_chunk, we get a transaction handle via btrfs_start_trans_remove_block_group, which starts the transaction using the extent root. When we call btrfs_end_transaction, we're calling it using the chunk root. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: convert nodesize macros to static inlinesJeff Mahoney2016-07-261-15/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts the macros used to calculate various node size limits to static inlines. That way we get type checking for free. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: introduce BTRFS_MAX_ITEM_SIZEJeff Mahoney2016-07-264-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use BTRFS_LEAF_DATA_SIZE - sizeof(struct btrfs_item) in several places. This introduces a BTRFS_MAX_ITEM_SIZE macro to do the same. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: cleanup, remove prototype for btrfs_find_root_refJeff Mahoney2016-07-261-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function isn't implemented anywhere. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: copy_to_sk drop unused root parameterJeff Mahoney2016-07-261-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The root parameter for copy_to_sk is not used at all. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: simpilify btrfs_subvol_inherit_propsJeff Mahoney2016-07-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We just need a superblock, but we look it up using two different roots depending on the call site. Let's just use a superblock pointer initialized at the outset. This is mostly for Coccinelle not to choke on my root push up set. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: tests, use BTRFS_FS_STATE_DUMMY_FS_INFO instead of dummy rootJeff Mahoney2016-07-267-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a dummy fs_info associated with each test that uses a root, we don't need the DUMMY_ROOT bit anymore. This lets us make choices without needing an actual root like in e.g. btrfs_find_create_tree_block. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: tests, require fs_info for rootJeff Mahoney2016-07-2610-61/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the upcoming patchset to push nodesize and sectorsize into fs_info. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: tests, move initialization into tests/Jeff Mahoney2016-07-263-77/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have all these stubs that only exist because they're called from btrfs_run_sanity_tests, which is a static inside super.c. Let's just move it all into tests/btrfs-tests.c and only have one stub. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: btrfs_test_opt and friends should take a btrfs_fs_infoJeff Mahoney2016-07-2613-130/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_test_opt and friends only use the root pointer to access the fs_info. Let's pass the fs_info directly in preparation to eliminate similar patterns all over btrfs. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: prefix fsid to all trace eventsJeff Mahoney2016-07-265-22/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using trace events to debug a problem, it's impossible to determine which file system generated a particular event. This patch adds a macro to prefix standard information to the head of a trace event. The extent_state alloc/free events are all that's left without an fs_info available. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: plumb fs_info into btrfs_workJeff Mahoney2016-07-264-31/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to provide an fsid for trace events, we'll need a btrfs_fs_info pointer. The most lightweight way to do that for btrfs_work structures is to associate it with the __btrfs_workqueue structure. Each queued btrfs_work structure has a workqueue associated with it, so that's a natural fit. It's a privately defined structures, so we add accessors to retrieve the fs_info pointer. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: remove obsolete part of comment in statfsDavid Sterba2016-07-261-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mixed blockgroup reporting has been fixed by commit ae02d1bd070767e109f4a6f1bb1f466e9698a355 "btrfs: fix mixed block count of available space" Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: hide test-only member under ifdefDavid Sterba2016-07-262-0/+4
| | | | | | | | | | | | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: Ratelimit "no csum found" info messageNikolay Borisov2016-07-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently during a crash it became apparent that this particular message can be printed so many times that it causes the softlockup detector to trigger. Fix it by ratelimiting it. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | btrfs: Add ratelimit to btrfs printingNikolay Borisov2016-07-261-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds ratelimiting to all messages which are not using the _rl version of the various printing APIs in btrfs. This is designed to be used as a safety net, since a flood messages might cause the softlockup detector to trigger. To reduce interference between different classes of messages use a separate ratelimit state for every class of message. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | Btrfs: fix unexpected balance crash due to BUG_ONLiu Bo2016-07-261-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mounting a btrfs can resume previous balance operations asynchronously. An user got a crash when one drive has some corrupt sectors. Since balance can cancel itself in case of any error, we can gracefully return errors to upper layers and let balance do the cancel job. Reported-by: sash <master.b.at.raven@chefmail.de> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | | | Btrfs: fix panic in balance due to EIOLiu Bo2016-07-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During build_backref_tree(), if we fail to read a btree node, we can eventually run into BUG_ON(cache->nr_nodes) that we put in backref_cache_cleanup(), meaning we have at least one memory leak. This frees the backref_node that we's allocated at the very beginning of build_backref_tree(). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>