md: reduce the number of synchronize_rcu() calls when multiple devices fail.

Every time a device is removed with ->hot_remove_disk(), a synchronize_rcu() call is made, which can take several milliseconds in some cases. If lots of devices fail at once - as could happen with a large RAID10 where one set of devices is removed all at once - these delays can add up to be very inconvenient.

As failure is not reversible we can check for that first, setting a separate flag if it is found, and then call synchronize_rcu() once for all the flagged devices. The ->hot_remove_disk() function can then skip the synchronize_rcu() step if the flag is set.

fix build error(Shaohua)

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
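The batching idea described in the message can be illustrated with a small userspace sketch. This is not the kernel code: struct fake_rdev, fake_synchronize_rcu(), fake_hot_remove_disk(), fake_remove_failed() and count_grace_periods_for() are hypothetical stand-ins invented for illustration, and the grace period is only simulated by a counter. It assumes the caller flags every failed, idle device first, waits once, and then removes each flagged device without waiting again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NDEVS 8

/* Hypothetical stand-in for the per-device state the commit relies on. */
struct fake_rdev {
	bool present;             /* still attached to the array */
	bool faulty;              /* device has failed; this never reverts */
	bool remove_synchronized; /* a grace period was already waited for */
	int nr_pending;           /* in-flight requests using the device */
};

static int grace_periods; /* number of simulated synchronize_rcu() calls */

static void fake_synchronize_rcu(void)
{
	grace_periods++; /* a real call would block for a grace period */
}

/* Per-device removal: waits only if the caller has not already done so. */
static int fake_hot_remove_disk(struct fake_rdev *rdev)
{
	rdev->present = false;
	if (!rdev->remove_synchronized) {
		fake_synchronize_rcu();
		if (rdev->nr_pending) {
			/* lost the race, try later */
			rdev->present = true;
			return -EBUSY;
		}
	}
	return 0;
}

/* Caller side: flag all failed idle devices, wait once, then remove. */
static int fake_remove_failed(struct fake_rdev *devs, size_t n)
{
	bool remove_some = false;
	int removed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].present && devs[i].faulty &&
		    devs[i].nr_pending == 0) {
			devs[i].remove_synchronized = true;
			remove_some = true;
		}
	}

	if (remove_some)
		fake_synchronize_rcu(); /* one wait covers every flagged device */

	for (i = 0; i < n; i++) {
		if (devs[i].present && devs[i].faulty &&
		    devs[i].nr_pending == 0 &&
		    fake_hot_remove_disk(&devs[i]) == 0)
			removed++;
		devs[i].remove_synchronized = false; /* flag is single-use */
	}
	return removed;
}

/* Helper: fail the first n_failed devices and report the waits needed. */
static int count_grace_periods_for(size_t n_failed)
{
	struct fake_rdev devs[NDEVS];
	size_t i;

	assert(n_failed <= NDEVS);
	for (i = 0; i < NDEVS; i++) {
		devs[i].present = true;
		devs[i].faulty = i < n_failed;
		devs[i].remove_synchronized = false;
		devs[i].nr_pending = 0;
	}
	grace_periods = 0;
	fake_remove_failed(devs, NDEVS);
	return grace_periods;
}
```

In this sketch, failing four of the eight devices costs a single simulated grace period instead of four, while a direct call to fake_hot_remove_disk() on an unflagged device still waits on its own, mirroring the fallback path kept in the diff below.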
author: NeilBrown <neilb@suse.com> 2016-06-02 16:19:53 +1000
committer: Shaohua Li <shli@fb.com> 2016-06-13 11:54:22 -0700
commit: d787be4092e27728cb4c012bee9762098ef3c662 (patch)
tree: b8de57ed842d3c01f6fdd4f6ee5be6408763a993 /drivers/md/raid1.c
parent: f5b67ae86ee317db20c0e10d54f16a0bbbd3207d (diff)
download: linux-d787be4092e27728cb4c012bee9762098ef3c662.tar.gz
linux-d787be4092e27728cb4c012bee9762098ef3c662.tar.bz2
linux-d787be4092e27728cb4c012bee9762098ef3c662.zip
1 files changed, 10 insertions, 7 deletions
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 34f20c03d1f6..5027ef4752ac 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1656,13 +1656,16 @@ static int raid1_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 			goto abort;
 		}
 		p->rdev = NULL;
-		synchronize_rcu();
-		if (atomic_read(&rdev->nr_pending)) {
-			/* lost the race, try later */
-			err = -EBUSY;
-			p->rdev = rdev;
-			goto abort;
-		} else if (conf->mirrors[conf->raid_disks + number].rdev) {
+		if (!test_bit(RemoveSynchronized, &rdev->flags)) {
+			synchronize_rcu();
+			if (atomic_read(&rdev->nr_pending)) {
+				/* lost the race, try later */
+				err = -EBUSY;
+				p->rdev = rdev;
+				goto abort;
+			}
+		}
+		if (conf->mirrors[conf->raid_disks + number].rdev) {
 			/* We just removed a device that is being replaced.
 			 * Move down the replacement.  We drain all IO before
 			 * doing this to avoid confusion.