summaryrefslogtreecommitdiffstats
path: root/drivers/md/bcache
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-3.11/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds2013-07-2221-801/+869
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block IO driver bits from Jens Axboe: "As I mentioned in the core block pull request, due to real life circumstances the driver pull request would be late. Now it looks like -rc2 late... On the plus side, apart form the rsxx update, these are all things that I could argue could go in later in the cycle as they are fixes and not features. So even though things are late, it's not ALL bad. The pull request contains: - Updates to bcache, all bug fixes, from Kent. - A pile of drbd bug fixes (no big features this time!). - xen blk front/back fixes. - rsxx driver updates, some of them deferred form 3.10. So should be well cooked by now" * 'for-3.11/drivers' of git://git.kernel.dk/linux-block: (63 commits) bcache: Allocation kthread fixes bcache: Fix GC_SECTORS_USED() calculation bcache: Journal replay fix bcache: Shutdown fix bcache: Fix a sysfs splat on shutdown bcache: Advertise that flushes are supported bcache: check for allocation failures bcache: Fix a dumb race bcache: Use standard utility code bcache: Update email address bcache: Delete fuzz tester bcache: Document shrinker reserve better bcache: FUA fixes drbd: Allow online change of al-stripes and al-stripe-size drbd: Constants should be UPPERCASE drbd: Ignore the exit code of a fence-peer handler if it returns too late drbd: Fix rcu_read_lock balance on error path drbd: fix error return code in drbd_init() drbd: Do not sleep inside rcu bcache: Refresh usage docs ...
| * bcache: Allocation kthread fixesKent Overstreet2013-07-123-18/+15
| | | | | | | | | | | | | | | | The alloc kthread should've been using try_to_freeze() - and also there was the potential for the alloc kthread to get woken up after it had shut down, which would have been bad. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
| * bcache: Fix GC_SECTORS_USED() calculationKent Overstreet2013-07-121-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Part of the job of garbage collection is to add up however many sectors of live data it finds in each bucket, but that doesn't work very well if it doesn't reset GC_SECTORS_USED() when it starts. Whoops. This wouldn't have broken anything horribly, but allocation tries to preferentially reclaim buckets that are mostly empty and that's not gonna work with an incorrect GC_SECTORS_USED() value. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: Journal replay fixKent Overstreet2013-07-121-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The journal replay code starts by finding something that looks like a valid journal entry, then it does a binary search over the unchecked region of the journal for the journal entries with the highest sequence numbers. Trouble is, the logic was wrong - journal_read_bucket() returns true if it found journal entries we need, but if the range of journal entries we're looking for loops around the end of the journal - in that case journal_read_bucket() could return true when it hadn't found the highest sequence number we'd seen yet, and in that case the binary search did the wrong thing. Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: Shutdown fixKent Overstreet2013-07-121-7/+11
| | | | | | | | | | | | | | | | | | Stopping a cache set is supposed to make it stop attached backing devices, but somewhere along the way that code got lost. Fixing this mainly has the effect of fixing our reboot notifier. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: Fix a sysfs splat on shutdownKent Overstreet2013-07-122-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | If we stopped a bcache device when we were already detaching (or something like that), bcache_device_unlink() would try to remove a symlink from sysfs that was already gone because the bcache dev kobject had already been removed from sysfs. So keep track of whether we've removed stuff from sysfs. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: Advertise that flushes are supportedKent Overstreet2013-07-122-1/+9
| | | | | | | | | | | | | | | | Whoops - bcache's flush/FUA was mostly correct, but flushes get filtered out unless we say we support them... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: check for allocation failuresDan Carpenter2013-07-121-0/+2
| | | | | | | | | | | | There is a missing NULL check after the kzalloc(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| * bcache: Fix a dumb raceKent Overstreet2013-07-121-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the far-too-complicated closure code - closures can have destructors, for probably dubious reasons; they get run after the closure is no longer waiting on anything but before dropping the parent ref, intended just for freeing whatever memory the closure is embedded in. Trouble is, when remaining goes to 0 and we've got nothing more to run - we also have to unlock the closure, setting remaining to -1. If there's a destructor, that unlock isn't doing anything - nobody could be trying to lock it if we're about to free it - but if the unlock _is needed... that check for a destructor was racy. Argh. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
| * bcache: Use standard utility codeKent Overstreet2013-07-018-144/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of bcache's utility code has made it into the rest of the kernel, so drop the bcache versions. Bcache used to have a workaround for allocating from a bio set under generic_make_request() (if you allocated more than once, the bios you already allocated would get stuck on current->bio_list when you submitted, and you'd risk deadlock) - bcache would mask out __GFP_WAIT when allocating bios under generic_make_request() so that allocation could fail and it could retry from workqueue. But bio_alloc_bioset() has a workaround now, so we can drop this hack and the associated error handling. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Delete fuzz testerKent Overstreet2013-07-013-152/+2
| | | | | | | | | | | | This code has rotted and it hasn't been used in ages anyways. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
| * bcache: Document shrinker reserve betterKent Overstreet2013-07-011-0/+7
| | | | | | | | Signed-off-by: Kent Overstreet <kmo@daterainc.com>
| * bcache: FUA fixesKent Overstreet2013-07-013-5/+35
| | | | | | | | | | | | | | Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node writes have... weird ordering requirements. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Send label ueventsGabriel de Perthuis2013-06-262-1/+17
| | | | | | | | | | Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Send a uevent with a cached device's UUIDGabriel de Perthuis2013-06-261-3/+9
| | | | | | | | Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com>
| * bcache: Write out full stripesKent Overstreet2013-06-269-37/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | Now that we're tracking dirty data per stripe, we can add two optimizations for raid5/6: * If a stripe is already dirty, force writes to that stripe to writeback mode - to help build up full stripes of dirty data * When flushing dirty data, preferentially write out full stripes first if there are any. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Track dirty data by stripeKent Overstreet2013-06-267-26/+105
| | | | | | | | | | | | | | | | To make background writeback aware of raid5/6 stripes, we first need to track the amount of dirty data within each stripe - we do this by breaking up the existing sectors_dirty into per stripe atomic_ts Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Initialize sectors_dirty when attachingKent Overstreet2013-06-264-29/+39
| | | | | | | | | | | | | | | | | | | | | | Previously, dirty_data wouldn't get initialized until the first garbage collection... which was a bit of a problem for background writeback (as the PD controller keys off of it) and also confusing for users. This is also prep work for making background writeback aware of raid5/6 stripes. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Improve lazy sortingKent Overstreet2013-06-263-17/+26
| | | | | | | | | | | | | | | | | | The old lazy sorting code was kind of hacky - rewrite in a way that mathematically makes more sense; the idea is that the size of the sets of keys in a btree node should increase by a more or less fixed ratio from smallest to biggest. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Rip out pkey()/pbtree()Kent Overstreet2013-06-267-45/+57
| | | | | | | | | | | | | | | | Old gcc doesnt like the struct hack, and it is kind of ugly. So finish off the work to convert pr_debug() statements to tracepoints, and delete pkey()/pbtree(). Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Fix/revamp tracepointsKent Overstreet2013-06-2614-110/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracepoints were reworked to be more sensible, and fixed a null pointer deref in one of the tracepoints. Converted some of the pr_debug()s to tracepoints - this is partly a performance optimization; it used to be that with DEBUG or CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it was changed to an empty inline function. Some of the pr_debug() statements had rather expensive function calls as part of the arguments, so this code was getting run unnecessarily even on non debug kernels - in some fast paths, too. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Refactor btree ioKent Overstreet2013-06-266-176/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The most significant change is that btree reads are now done synchronously, instead of asynchronously and doing the post read stuff from a workqueue. This was originally done because we can't block on IO under generic_make_request(). But - we already have a mechanism to punt cache lookups to workqueue if needed, so if we just use that we don't have to deal with the complexity of doing things asynchronously. The main benefit is this makes the locking situation saner; we can hold our write lock on the btree node until we're finished reading it, and we don't need that btree_node_read_done() flag anymore. Also, for writes, btree_write() was broken out into btree_node_write() and btree_leaf_dirty() - the old code with the boolean argument was dumb and confusing. The prio_blocked mechanism was improved a bit too, now the only counter is in struct btree_write, we don't mess with transfering a count from struct btree anymore. This required changing garbage collection to block prios at the start and unblock when it finishes, which is cleaner than what it was doing anyways (the old code had mostly the same effect, but was doing it in a convoluted way) And the btree iter btree_node_read_done() uses was converted to a real mempool. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Convert allocator thread to kthreadKent Overstreet2013-06-264-33/+43
| | | | | | | | | | | | Using a workqueue when we just want a single thread is a bit silly. Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: Warn when a device is already registered.Gabriel de Perthuis2013-06-261-2/+37
| | | | | | | | | | Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * bcache: fix a spurious gcc complaint, use scnprintfKent Overstreet2013-06-261-21/+22
| | | | | | | | | | | | | | | | | | | | An old version of gcc was complaining about using a const int as the size of a stack allocated array. Which should be fine - but using ARRAY_SIZE() is better, anyways. Also, refactor the code to use scnprintf(). Signed-off-by: Kent Overstreet <koverstreet@google.com>
| * md: bcache: io.c: fix a potential NULL pointer dereferenceKumar Amit Mehta2013-06-261-0/+2
| | | | | | | | | | | | | | | | bio_alloc_bioset returns NULL on failure. This fix adds a missing check for potential NULL pointer dereferencing. Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2013-07-041-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "The usual stuff from trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) treewide: relase -> release Documentation/cgroups/memory.txt: fix stat file documentation sysctl/net.txt: delete reference to obsolete 2.4.x kernel spinlock_api_smp.h: fix preprocessor comments treewide: Fix typo in printk doc: device tree: clarify stuff in usage-model.txt. open firmware: "/aliasas" -> "/aliases" md: bcache: Fixed a typo with the word 'arithmetic' irq/generic-chip: fix a few kernel-doc entries frv: Convert use of typedef ctl_table to struct ctl_table sgi: xpc: Convert use of typedef ctl_table to struct ctl_table doc: clk: Fix incorrect wording Documentation/arm/IXP4xx fix a typo Documentation/networking/ieee802154 fix a typo Documentation/DocBook/media/v4l fix a typo Documentation/video4linux/si476x.txt fix a typo Documentation/virtual/kvm/api.txt fix a typo Documentation/early-userspace/README fix a typo Documentation/video4linux/soc-camera.txt fix a typo lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment ...
| * md: bcache: Fixed a typo with the word 'arithmetic'Phil Viana2013-06-181-2/+2
| | | | | | | | | | | | | | The word 'arithmetic' was typed as 'arithmatic' Signed-off-by: Phil Viana <phillip.l.viana@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | bcache: Fix error handling in init codeKent Overstreet2013-05-154-121/+99
| | | | | | | | | | | | | | This code appears to have rotted... fix various bugs and do some refactoring. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* | bcache: drop "select CLOSURES"Paul Bolle2013-05-151-1/+0
| | | | | | | | | | | | | | | | | | The Kconfig entry for BCACHE selects CLOSURES. But there's no Kconfig symbol CLOSURES. That symbol was used in development versions of bcache, but was removed when the closures code was no longer provided as a kernel library. It can safely be dropped. Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
* | bcache: Fix incompatible pointer type warningEmil Goode2013-05-151-2/+1
|/ | | | | | | | | | | | | | | | | | | | | | | The function pointer release in struct block_device_operations should point to functions declared as void. Sparse warnings: drivers/md/bcache/super.c:656:27: warning: incorrect type in initializer (different base types) drivers/md/bcache/super.c:656:27: expected void ( *release )( ... ) drivers/md/bcache/super.c:656:27: got int ( static [toplevel] *<noident> )( ... ) drivers/md/bcache/super.c:656:2: warning: initialization from incompatible pointer type [enabled by default] drivers/md/bcache/super.c:656:2: warning: (near initialization for ‘bcache_ops.release’) [enabled by default] Signed-off-by: Emil Goode <emilgoode@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Use bd_link_disk_holder()Kent Overstreet2013-04-301-17/+35
| | | | Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Allocator cleanup/fixesKent Overstreet2013-04-302-25/+50
| | | | | | | | | The main fix is that bch_allocator_thread() wasn't waiting on garbage collection to finish (if invalidate_buckets had set ca->invalidate_needs_gc); we need that to make sure the allocator doesn't spin and potentially block gc from finishing. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Make sure blocksize isn't smaller than device blocksizeKent Overstreet2013-04-241-2/+6
| | | | | | | Sanity check to make sure we don't end up doing IO the device doesn't support. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Fix merge_bvec_fn usage for when it modifies the bvmKent Overstreet2013-04-221-9/+8
| | | | | | | | Stacked md devices reuse the bvm for the subordinate device, causing problems... Reported-by: Michael Balser <michael.balser@profitbricks.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Correctly check against BIO_MAX_PAGESKent Overstreet2013-04-201-5/+4
| | | | | | | bch_bio_max_sectors() was checking against BIO_MAX_PAGES as if the limit was for the total bytes in the bio, not the number of segments. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Hack around stuff that clones up to bi_max_vecsKent Overstreet2013-04-201-0/+9
| | | | Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Set ra_pages based on backing device's ra_pagesKent Overstreet2013-04-201-0/+4
| | | | Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Take data offset from the bdev superblock.Kent Overstreet2013-04-203-57/+100
| | | | | | | Add a new superblock version, and consolidate related defines. Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Disable broken btree fuzz testerKent Overstreet2013-04-081-2/+4
| | | | | Reported-by: <sasha.levin@oracle.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Fix a format string overflowKent Overstreet2013-04-081-2/+2
| | | | | Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Fix a minor memory leak on device teardownKent Overstreet2013-04-081-1/+3
| | | | | Reported-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Use WARN_ONCE() instead of __WARN()Kent Overstreet2013-04-081-1/+1
| | | | Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Add missing #include <linux/prefetch.h>Geert Uytterhoeven2013-04-082-0/+2
| | | | | | | | | | | | | m68k/allmodconfig: drivers/md/bcache/bset.c: In function ‘bset_search_tree’: drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’ drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’: drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’ Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Sparse fixesKent Overstreet2013-04-084-90/+92
| | | | Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Don't export utility code, prefix with bch_Kent Overstreet2013-03-2813-101/+89
| | | | | | Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: Fix for the build fixesKent Overstreet2013-03-251-1/+0
| | | | | | | | Commit 82a84eaf7e51ba3da0c36cbc401034a4e943492d left a return 0 in closure_debug_init(). Whoops. Signed-off-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: Style/checkpatch fixesKent Overstreet2013-03-2510-56/+51
| | | | | | | | | Took out some nested functions, and fixed some more checkpatch complaints. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: Build fixes from test robotKent Overstreet2013-03-255-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | config: make ARCH=i386 allmodconfig All error/warnings: drivers/md/bcache/bset.c: In function 'bch_ptr_bad': >> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/debug.c: In function 'bch_pbtree': >> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/btree.c: In function 'bch_btree_read_done': >> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/closure.o: In function `closure_debug_init': >> (.init.text+0x0): multiple definition of `init_module' >> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: A block layer cacheKent Overstreet2013-03-2327-0/+15680
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>