linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'for-3.0-important' of git://git.drbd.org/linux-2.6-drbd into for-linus	Jens Axboe	2011-06-30	4	-21/+31
\|\
\| *	drbd: we should write meta data updates with FLUSH FUA	Lars Ellenberg	2011-06-30	1	-1/+1
\| \|	We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless disabled in the configuration). The correct semantic now would be to write with FLUSH/FUA. For example, with activity log transactions, FUA alone is not enough, we need the corresponding bitmap update (and all related application updates) on stable storage as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: when receive times out on meta socket, also check last receive time on data socket	Lars Ellenberg	2011-06-30	1	-0/+6
\| \|	If we have an asymmetrically congested network, we may send P_PING, but due to congestion, the corresponding P_PING_ACK would time out, and we would drop a (congested, but otherwise) healthy connection ("PingAck did not arrive in time."). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
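The check this commit describes can be modelled in a few lines. This is a user-space sketch in Python, not the kernel code: the function name, its parameters, and the use of plain seconds instead of jiffies are all assumptions made for illustration.

```python
def should_drop_connection(now, last_data_recv, ping_timeout):
    """Decide what to do when the P_PING_ACK wait on the meta socket expires.

    All names here are hypothetical; times are plain seconds.
    Returns True if the connection should be dropped.
    """
    # If the data socket received something within the ping-timeout window,
    # the peer is alive and only the ack is delayed: keep the connection.
    return (now - last_data_recv) >= ping_timeout
```

With a 0.5 s ping timeout, data received 0.2 s ago keeps the connection, while a data socket that has been silent for a full second does not.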
\| *	drbd: account bitmap IO during resync as resync-(related-)-io	Lars Ellenberg	2011-06-30	1	-0/+3
\| \|	If we have a good resync rate, we will frequently update the on-disk bitmap, which, if not accounted for as resync io, may let an otherwise idle device appear to be "busy", and cause us to throttle resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: don't cond_resched_lock with IRQs disabled	Lars Ellenberg	2011-06-30	1	-1/+3
\| \|	The last commit, "drbd: add missing spinlock to bitmap receive", introduced a cond_resched_lock(), where the lock in question is taken with irqs disabled. As we must not schedule with IRQs disabled, and cond_resched_lock_irq() does not exist yet, we re-acquire the spin_lock_irq() for each bitmap page processed in turn. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
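The "take the lock once per page" idea can be illustrated outside the kernel. The sketch below is a Python model using `threading.Lock` in place of `spin_lock_irq()`; the page size, the list-based bitmap, and the function name are assumptions for illustration only.

```python
import threading

PAGE_BITS = 8  # hypothetical, tiny "page" so the example stays readable


def set_all_bits(bitmap, lock, page_bits=PAGE_BITS):
    """Set every bit, holding the lock for one page worth of bits at a time.

    Returns how many times the lock was acquired.
    """
    acquisitions = 0
    for start in range(0, len(bitmap), page_bits):
        # Lock is taken here and released at the end of each page, so other
        # bitmap users (and the scheduler) get a chance to run in between.
        with lock:
            acquisitions += 1
            for i in range(start, min(start + page_bits, len(bitmap))):
                bitmap[i] = 1
    return acquisitions
```

The trade-off is more lock traffic in exchange for a bounded hold time per acquisition.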
\| *	drbd: add missing spinlock to bitmap receive	Lars Ellenberg	2011-06-30	1	-15/+19
\| \|	During bitmap exchange, when using the RLE bitmap compression scheme, we have a code path that can set the whole bitmap at once. To avoid holding spin_lock_irq() for too long, we used to lock out other bitmap modifications during bitmap exchange by other means, and then, knowing we have exclusive access to the bitmap, modify it without the spinlock, and with IRQs enabled. Since we now allow local IO to continue, potentially setting additional bits during the bitmap receive phase, this is no longer true, and we get uncoordinated updates of bitmap members, causing bm_set to no longer accurately reflect the total number of set bits. To actually see this, you'd need to have a large bitmap, use RLE bitmap compression, and have busy IO during sync handshake and bitmap exchange. Fix this by taking the spin_lock_irq() in this code path as well, but calling cond_resched_lock() after each page worth of bits processed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Use the correct max_bio_size when creating resync requests	Philipp Reisner	2011-06-30	1	-6/+1
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* \|	Merge branch 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2011-05-25	9	-139/+243
\|\\|	* 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block: (110 commits)
	  loop: handle on-demand devices correctly
	  loop: limit 'max_part' module param to DISK_MAX_PARTS
	  drbd: fix warning
	  drbd: fix warning
	  drbd: Fix spelling
	  drbd: fix schedule in atomic
	  drbd: Take a more conservative approach when deciding max_bio_size
	  drbd: Fixed state transitions after async outdate-peer-handler returned
	  drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
	  drbd: Fix for the connection problems on high latency links
	  drbd: fix potential activity log refcount imbalance in error path
	  drbd: Only downgrade the disk state in case of disk failures
	  drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int
	  drbd: fix potential distributed deadlock
	  lru_cache.h: fix comments referring to ts_ instead of lc_
	  drbd: Fix for application IO with the on-io-error=pass-on policy
	  xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions.
	  xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.
	  xen/blkback: don't fail empty barrier requests
	  xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end()
	  ...
\| *	drbd: fix warning	Andrew Morton	2011-05-24	2	-2/+1
\| \|	In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
\| *	drbd: fix warning	Philipp Reisner	2011-05-24	2	-7/+1
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
\| *	drbd: Fix spelling	Bart Van Assche	2011-05-24	9	-40/+40
\| \|	Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
\| *	drbd: fix schedule in atomic	Lars Ellenberg	2011-05-24	2	-3/+15
\| \|	An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occurring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary): we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider renaming it D_DETACHING. An alternative approach would be to create yet another work item to be scheduled on the worker, do the destructor work from there, and get the timing right. Manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Take a more conservative approach when deciding max_bio_size	Philipp Reisner	2011-05-24	4	-50/+97
\| \|	The old (optimistic) implementation could shrink the bio size on a primary device. Shrinking the bio size on a primary device is bad, since we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that to make the operation of single nodes more efficient. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
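A simplified way to picture the conservative policy is a function that only ever grows the limit on a primary, and only once both limits are known. This is a Python model under stated assumptions, not the driver's logic: the function name, the use of `None` for "unknown", and the `min()` combination rule are illustrative guesses.

```python
def next_max_bio_size(current, local_limit, peer_limit, is_primary):
    """Return the max_bio_size to publish next (hypothetical model).

    `None` for a limit means that limit is not known yet.
    """
    # Without both limits there is nothing safe to change.
    if local_limit is None or peer_limit is None:
        return current
    candidate = min(local_limit, peer_limit)
    # Growing is always valid. Shrinking is refused on a primary, because
    # BIOs built against the old, bigger size may still arrive.
    if is_primary and candidate < current:
        return current
    return candidate
```

In this model a primary starting at 4096 stays there until the peer limit is learned, then grows to the smaller of the two limits, and never shrinks afterwards.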
\| *	drbd: Fixed state transitions after async outdate-peer-handler returned	Philipp Reisner	2011-05-24	1	-1/+14
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Disallow the peer_disk_state to be D_OUTDATED while connected	Philipp Reisner	2011-05-24	1	-0/+3
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Fix for the connection problems on high latency links	Philipp Reisner	2011-05-24	1	-1/+1
\| \|	It seems that the real cause of all the issues was that we did not notice in drbd_try_connect() when the other guy closes one socket if the round trip time gets higher than 100ms. That 100ms was hard coded! Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: fix potential activity log refcount imbalance in error path	Lars Ellenberg	2011-05-24	1	-1/+1
\| \|	It is no longer sufficient to trigger on local WRITE, we need to check on (rq_state & RQ_IN_ACT_LOG) before calling drbd_al_complete_io also in the error path. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Only downgrade the disk state in case of disk failures	Philipp Reisner	2011-05-24	1	-1/+2
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int	Lars Ellenberg	2011-05-24	1	-2/+8
\| \|	If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
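The difference between comparing timeout values and tracking the active timeout explicitly can be shown with two small functions. This is a Python illustration with hypothetical names and return strings, not a transcription of the driver code.

```python
def buggy_action(current_timeout, ping_timeout):
    """Old approach: infer the active timeout by comparing values."""
    # Breaks when ping-int == ping-timeout: an idle timeout then looks
    # exactly like a missing P_PING_ACK.
    return "reconnect" if current_timeout == ping_timeout else "send_ping"


def fixed_action(ping_timeout_active):
    """New approach: a local flag records which timeout is running."""
    return "reconnect" if ping_timeout_active else "send_ping"
```

With both settings at 10 seconds, `buggy_action(10, 10)` asks for a reconnect on a merely idle link, while `fixed_action(False)` sends the ping first.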
\| *	drbd: fix potential distributed deadlock	Lars Ellenberg	2011-05-24	1	-35/+59
\| \|	We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Fix for application IO with the on-io-error=pass-on policy	Philipp Reisner	2011-05-24	2	-0/+5
\| \|	In case a write fails on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now, since the application is still shielded from the error (as long as we have only a single disk failing). The difference to detach is that we keep the disk, and therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
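The behaviour described for on-io-error=pass-on can be pictured as a small state model. The sketch below is a toy Python model: the class, its methods, and the per-block bookkeeping are assumptions for illustration and do not mirror the driver's data structures.

```python
class PassOnDisk:
    """Toy model of a local disk under on-io-error=pass-on (hypothetical)."""

    def __init__(self):
        self.state = "D_UP_TO_DATE"
        self.failed_blocks = set()

    def write(self, block, ok=True):
        if not ok:
            # A failed local write marks the disk inconsistent,
            # but the disk stays attached (unlike a detach).
            self.state = "D_INCONSISTENT"
            self.failed_blocks.add(block)

    def read_source(self, block):
        # Blocks whose local write failed are read from the peer;
        # in this model, every other block is still served locally.
        return "peer" if block in self.failed_blocks else "local"
```

The point of the model is the contrast with detach: the disk remains in use, and only the affected block is redirected.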
* \|	Add appropriate <linux/prefetch.h> include for prefetch users	Paul Gortmaker	2011-05-22	1	-0/+1
\|/	After discovering that wide use of prefetch on modern CPUs could be a net loss instead of a win, net drivers which were relying on the implicit inclusion of prefetch.h via the list headers showed up in the resulting cleanup fallout. Give them an explicit include via the following $0.02 script.
	=========================================
	#!/bin/bash
	MANUAL=""
	for i in `git grep -l 'prefetch(.*)' .` ; do
	    grep -q '<linux/prefetch.h>' $i
	    if [ $? = 0 ] ; then
	        continue
	    fi
	    (   echo '?^#include <linux/?a'
	        echo '#include <linux/prefetch.h>'
	        echo .
	        echo w
	        echo q
	    ) | ed -s $i > /dev/null 2>&1
	    if [ $? != 0 ]; then
	        echo $i needs manual fixup
	        MANUAL="$i $MANUAL"
	    fi
	done
	echo ------------------- 8\<----------------------
	echo vi $MANUAL
	=========================================
	Signed-off-by: Paul <paul.gortmaker@windriver.com> [ Fixed up some incorrect #include placements, and added some non-network drivers and the fib_trie.c case - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Fix common misspellings	Lucas De Marchi	2011-03-31	5	-10/+10
\|	Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
*	drbd: fix up merge error	Linus Torvalds	2011-03-28	1	-8/+6
\|	In commit 95a0f10cddbf ("drbd: store in-core bitmap little endian, regardless of architecture") drbd had made the sane choice to use little-endian bitmap functions everywhere. However, it used the horrible old function names from <asm-generic/bitops/le.h>, that were never really meant to be exported. In the meantime, things got cleaned up, and in commit c4945b9ed472 ("asm-generic: rename generic little-endian bitops functions") we renamed the LE bitops to something sane, exactly so that they could be used in random code without people gouging their eyes out when seeing the crazy jumble of letters that were the old internal names. As a result the drbd thing merged cleanly (commit 8d49a77568d1: "Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block"), since there was no data conflict - but the end result obviously doesn't actually compile. Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jaxboe@fusionio.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2011-03-27	12	-1376/+2132
\|\	* 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block: (122 commits)
	  cciss: fix lost command issue
	  drbd: need include for bitops functions declarations
	  Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation"
	  cciss: fix missed command status value CMD_UNABORTABLE
	  cciss: remove unnecessary casts
	  cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent
	  cciss: Inform controller we are using 32-bit tags.
	  cciss: hoist tag masking out of loop
	  cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation
	  cciss: export resettable host attribute
	  drbd: drop code present under #ifdef which is relevant to 2.6.28 and below
	  drbd: Fixed handling of read errors on a 'VerifyS' node
	  drbd: Fixed handling of read errors on a 'VerifyT' node
	  drbd: Implemented real timeout checking for request processing time
	  drbd: Remove unused function atodb_endio()
	  drbd: improve log message if received sector offset exceeds local capacity
	  drbd: kill dead code
	  drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails
	  drbd: Removed left over, now wrong comments
	  drbd: serialize admin requests for new verify run with pending bitmap io
	  ...
\| *	drbd: need include for bitops functions declarations	Stephen Rothwell	2011-03-17	1	-0/+3
\| \|	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| *	drbd: drop code present under #ifdef which is relevant to 2.6.28 and below	Or Gerlitz	2011-03-10	1	-5/+1
\| \|	Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Fixed handling of read errors on a 'VerifyS' node	Philipp Reisner	2011-03-10	1	-4/+0
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Fixed handling of read errors on a 'VerifyT' node	Philipp Reisner	2011-03-10	1	-13/+15
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Implemented real timeout checking for request processing time	Philipp Reisner	2011-03-10	5	-0/+47
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Remove unused function atodb_endio()	Andreas Gruenbacher	2011-03-10	2	-36/+6
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: improve log message if received sector offset exceeds local capacity	Lars Ellenberg	2011-03-10	1	-1/+2
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: kill dead code	Lars Ellenberg	2011-03-10	1	-93/+0
\| \|	This code became obsolete and unused last December with "drbd: bitmap keep track of changes vs on-disk bitmap". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails	Lars Ellenberg	2011-03-10	2	-18/+34
\| \|	Just deal with it more gracefully, if we fail to add even a single page to an empty bio. We used to BUG_ON() there, but it has been observed in some Xen deployment, so we need to handle that case more robustly now. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Removed left over, now wrong comments	Philipp Reisner	2011-03-10	1	-7/+1
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: serialize admin requests for new verify run with pending bitmap io	Lars Ellenberg	2011-03-10	1	-0/+5
\| \|	This is an addendum to "drbd: serialize admin requests for new resync with pending bitmap io". It avoids a race that could trigger "FIXME" assert log messages. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: fix potential imbalance of ap_in_flight	Lars Ellenberg	2011-03-10	2	-29/+5
\| \|	When we receive a barrier ack, we walk the ring list of drbd requests in the transfer log of the respective epoch, do some housekeeping, and free those objects. We tried to keep epochs of mirrored and unmirrored drbd requests separate, and assert that no local-only requests are present in a barrier_acked epoch. It turns out that this has quite a number of corner cases and would add bloated code without functional benefit. We now revert the (insufficient) commits "drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions" and "drbd: Ensure that an epoch contains only requests of one kind", and instead fix the processing of barrier acks to cope with a mix of local-only and mirrored requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: silence some noisy log messages during disconnect	Lars Ellenberg	2011-03-10	2	-20/+31
\| \|	If we fail to send the information that we lost our disk, we have no connection, and no disk: no access to data anymore. That is either expected (deconfiguration), or there will be so much noise in the logs that "Sending state failed" is not useful at all. Drop it. If the reason for a shorter than expected receive was a signal, which we sent because we already decided to disconnect, these additional log messages are confusing and useless. This patch follows this pattern:
	- dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);
	+ if (!signal_pending(current))
	+         dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);
	Also make them all dev_warn for consistency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: describe bitmap locking for bulk operation in finer detail	Lars Ellenberg	2011-03-10	5	-63/+115
\| \|	Now that we no longer in-place endian-swap the bitmap, we allow selected bitmap operations (testing bits, sometimes even setting bits) during some bulk operations. This caused us to hit a lot of FIXME asserts similar to "FIXME asender in drbd_bm_count_bits, bitmap locked for 'write from resync_finished' by worker", which now is nonsense: looking at the bitmap is perfectly legal as long as it is not being resized. This cosmetic patch defines some flags to describe expectations in finer detail, so the asserts in e.g. bm_change_bits_to() can be skipped if appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: log UUIDs whenever they change	Lars Ellenberg	2011-03-10	5	-51/+58
\| \|	All decisions about sync, sync direction, and whether or not to allow a connect or attach are based on our set of UUIDs to tag a data generation. Log changes to the UUIDs whenever they occur, logging "new current UUID P:Q:R:S" is more useful than "Creating new current UUID". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: We can not process BIOs with a size of 0	Philipp Reisner	2011-03-10	1	-0/+1
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Provide hints with the error message when clearing the sync pause flag	Philipp Reisner	2011-03-10	1	-2/+10
\| \|	When the user clears the sync-pause flag, and sync stays in pause state, give hints to the user, why it still is in pause state. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: queue bitmap writeout more intelligently	Lars Ellenberg	2011-03-10	2	-1/+12
\| \|	The "lazy writeout" of cleared bitmap pages happens during resync, and should happen again once the resync finishes cleanly, or is aborted. If resync finished cleanly, or was aborted because of peer disk failure, we trigger the writeout from worker context in the after state change work. If resync was aborted because of connection failure, we should not immediately trigger bitmap writeout, but rather postpone the writeout to after the connection cleanup happened. We now do it in the receiver context from drbd_disconnect(). If resync was aborted because of local disk failure, well, there is nothing to write to anymore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
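The four cases in this message amount to a small decision table. The sketch below restates them as a Python lookup; the outcome keys and the returned labels are hypothetical names chosen for illustration, not identifiers from the driver.

```python
# Where the bitmap writeout is triggered for each way a resync can end.
# Keys and values are illustrative labels only.
WRITEOUT_CONTEXT = {
    "finished_cleanly": "worker",        # after state change work
    "peer_disk_failure": "worker",       # after state change work
    "connection_failure": "receiver",    # from drbd_disconnect(), after cleanup
    "local_disk_failure": None,          # nothing left to write to
}


def bitmap_writeout_context(outcome):
    """Return the context that queues the writeout, or None if skipped."""
    return WRITEOUT_CONTEXT[outcome]
```

An unknown outcome raises `KeyError`, which keeps the table exhaustive by construction.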
\| *	drbd: don't pointlessly queue bitmap send, if we lost connection	Lars Ellenberg	2011-03-10	1	-2/+7
\| \|	This is a minor optimization and cleanup, and also considerably reduces some harmless (but noisy) race with the connection cleanup code. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: serialize admin requests for new resync with pending bitmap io	Lars Ellenberg	2011-03-10	1	-1/+8
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: only generate and send a new sync uuid after a successful state change	Lars Ellenberg	2011-03-10	1	-13/+12
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: cleaned up __set_current_state() followed by schedule_timeout() calls	Philipp Reisner	2011-03-10	3	-10/+5
\| \|	Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Ensure that an epoch contains only requests of one kind	Philipp Reisner	2011-03-10	3	-26/+28
\| \|	The assert in drbd_req.c:755 forces us to have only requests of one kind in an epoch. The two kinds we distinguish here are: local-only or mirrored. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Fixed P_NEG_ACK processing for protocol A and B	Philipp Reisner	2011-03-10	1	-12/+33
\| \|	Protocol A has no P_WRITE_ACKs, but has P_NEG_ACKs. The master bio might already be completed, therefore the request is no longer in the collision hash. => Do not try to validate block_id as request. In Protocol B we might already have got a P_RECV_ACK but then get a P_NEG_ACK afterwards. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
\| *	drbd: Killed an assert that is no longer valid	Philipp Reisner	2011-03-10	1	-3/+0
\| \|	The point is that drbd_disconnect() can be called with a cstate of WFConnection. That happens if the user issues "drbdsetup disconnect" while the drbd_connect() function executes. Then drbdd_init() will call drbdd(), which in turn will return without receiving any packets. Then drbdd_init() will end up calling drbd_disconnect() with a cstate of WFConnection. Bottom line: This assertion is wrong as it is, and we do not see value in fixing it. => Removing it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>