linux.git - Linux kernel mainline tree

	Commit message	Author	Age	Files	Lines
*	md/raid6 algorithms: delta syndrome functions	Markus Stockhausen	2015-04-22	1	-7/+34
\|	v3: s-o-b comment, explanation of performance and decision for the start/stop implementation
\|
\|	Implementing rmw functionality for RAID6 requires optimized syndrome calculation. Up to now we can only generate a complete syndrome. The target P/Q pages are always overwritten. With this patch we provide a framework for in-place P/Q modification. In the first place simply fill those functions with NULL values.
\|
\|	xor_syndrome() has two additional parameters: start & stop. These will indicate the first and last page that are changing during a rmw run. That makes it possible to avoid several unnecessary loops and speed up calculation. The caller needs to implement the following logic to make the functions work.
\|
\|	1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source blocks inside P/Q between (and including) start and stop.
\|	2) modify any block with start <= block <= stop
\|	3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of source blocks into P/Q between (and including) start and stop.
\|
\|	Pages between start and stop that won't be changed should be filled with a pointer to the kernel zero page. The reasons for not taking NULL pages are:
\|
\|	1) Algorithms cross the whole source data line by line. Thus avoid additional branches.
\|	2) Having a NULL page avoids calculating the XOR P parity but still needs calculation steps for the Q parity. Depending on the algorithm unrolling that might be only a difference of 2 instructions per loop.
\|
\|	The benchmark numbers of the gen_syndrome() functions are displayed in the kernel log. Do the same for the xor_syndrome() functions. This will help to analyze performance problems and give a rough estimate of how well the algorithm works. The choice of the fastest algorithm will still depend on the gen_syndrome() performance.
\|
\|	With the start/stop page implementation the speed can vary a lot in real life. E.g. a change of page 0 & page 15 on a stripe will be harder to compute than the case where page 0 & page 1 are XOR candidates. To not be too enthusiastic about the expected speeds we will run a worst-case test that simulates a change on the upper half of the stripe. So we do:
\|
\|	1) calculation of P/Q for the upper pages
\|	2) continuation of Q for the lower (empty) pages
\|
\|	Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
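The remove / modify / reinsert sequence described in the message above can be modelled in a few lines. What follows is a minimal byte-at-a-time Python sketch, not the kernel implementation. It assumes the usual RAID-6 field (generator 2, polynomial 0x11d) and a `ptrs` layout of data blocks followed by P and Q. `gf_mul2` and `rmw_update` are hypothetical helper names, and the byte-count parameter is dropped for brevity.

```python
def gf_mul2(b: int) -> int:
    """Multiply one byte by 2 in GF(2^8), reducing by the polynomial 0x11d."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11d
    return b


def xor_syndrome(disks, start, stop, ptrs):
    """XOR the P/Q contribution of data blocks start..stop into P and Q.

    ptrs holds disks-2 data blocks followed by P and Q (assumed layout).
    """
    p, q = ptrs[disks - 2], ptrs[disks - 1]
    for i in range(len(p)):
        wp = wq = 0
        # Horner's scheme from the highest changing block down to start.
        for z in range(stop, start - 1, -1):
            wq = gf_mul2(wq) ^ ptrs[z][i]
            wp ^= ptrs[z][i]
        # Blocks below start add no data; Q only needs to keep being scaled
        # (the "continuation of Q for the lower pages" in the message).
        for _ in range(start):
            wq = gf_mul2(wq)
        p[i] ^= wp
        q[i] ^= wq


def rmw_update(disks, ptrs, changes):
    """Apply {block index: new bytes} using the three-step rmw protocol."""
    start, stop = min(changes), max(changes)
    zero = bytes(len(ptrs[0]))  # stand-in for the kernel zero page
    # Unchanged data blocks are presented as zero pages, as the message asks.
    view = [blk if (z in changes or z >= disks - 2) else zero
            for z, blk in enumerate(ptrs)]
    xor_syndrome(disks, start, stop, view)   # 1) remove old data from P/Q
    for z, new in changes.items():           # 2) modify the blocks in place
        ptrs[z][:] = new
    xor_syndrome(disks, start, stop, view)   # 3) reinsert new data into P/Q
```

In this model a full syndrome is the special case of zeroed P/Q followed by `xor_syndrome(disks, 0, disks - 3, ptrs)`, which is why a delta update and a full recomputation must agree.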
*	x86/raid6: correctly check for assembler capabilities	Jan Beulich	2015-02-04	1	-1/+1
\|	Just like for AVX2 (which simply needs an #if -> #ifdef conversion), SSSE3 assembler support should be checked for before using it.
\|
\|	Signed-off-by: Jan Beulich <jbeulich@suse.com>
\|	Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Acked-by: Thomas Gleixner <tglx@linutronix.de>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	lib/raid6: Add log level to printks	Anton Blanchard	2014-10-14	1	-6/+6
\|	Signed-off-by: Anton Blanchard <anton@samba.org>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	Merge tag 'md/3.12' of git://neil.brown.name/md	Linus Torvalds	2013-09-10	1	-0/+3
\|\
\| \|	Pull md update from Neil Brown:
\| \|
\| \|	"Headline item is multithreading for RAID5 so that more IO/sec can be supported on fast (SSD) devices. Also TILE-Gx SIMD support for RAID6 calculations and an assortment of bug fixes"
\| \|
\| \|	* tag 'md/3.12' of git://neil.brown.name/md:
\| \|	  raid5: only wakeup necessary threads
\| \|	  md/raid5: flush out all pending requests before proceeding with reshape.
\| \|	  md/raid5: use seqcount to protect access to shape in make_request.
\| \|	  raid5: sysfs entry to control worker thread number
\| \|	  raid5: offload stripe handle to workqueue
\| \|	  raid5: fix stripe release order
\| \|	  raid5: make release_stripe lockless
\| \|	  md: avoid deadlock when dirty buffers during md_stop.
\| \|	  md: Don't test all of mddev->flags at once.
\| \|	  md: Fix apparent cut-and-paste error in super_90_validate
\| \|	  raid6/test: replace echo -e with printf
\| \|	  RAID: add tilegx SIMD implementation of raid6
\| \|	  md: fix safe_mode buglet.
\| \|	  md: don't call md_allow_write in get_bitmap_file.
\| *	RAID: add tilegx SIMD implementation of raid6	Ken Steele	2013-08-27	1	-0/+3
\| \|	This change adds TILE-Gx SIMD instructions to the software raid (md), modeling the Altivec implementation. This is only for syndrome generation; there is more that could be done to improve recovery, as in the recent Intel SSSE3 recovery implementation.
\| \|
\| \|	The code unrolls 8 times; this turns out to be the best on tilegx hardware among the set 1, 2, 4, 8 or 16. The code reads one cache-line of data from each disk, stores P and Q, then goes to the next cache-line.
\| \|
\| \|	The test code in sys/linux/lib/raid6/test reports 2008 MB/s data read rate for syndrome generation using 18 disks (16 data and 2 parity). It was 1512 MB/s before these SIMD optimizations. This is running on 1 core with all the data in cache.
\| \|
\| \|	This is based on the paper "The Mathematics of RAID-6" (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf).
\| \|
\| \|	Signed-off-by: Ken Steele <ken@tilera.com>
\| \|	Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
\| \|	Signed-off-by: NeilBrown <neilb@suse.de>
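To illustrate what "SIMD syndrome generation" means here, the sketch below computes P and Q eight bytes at a time by packing them into one integer and multiplying every byte lane by 2 in GF(2^8) with masks. This is the generic word-at-a-time trick, written in Python for readability; it is not the TILE-Gx code, it does no unrolling, and the names `mul2_packed` and `gen_syndrome` as used here are illustrative. The field polynomial 0x11d is an assumption taken from the usual RAID-6 construction.

```python
TOP = 0x8080808080808080    # top bit of each of the 8 byte lanes
LOW7 = 0xfefefefefefefefe   # clears bits shifted in from the lane below
POLY = 0x1d1d1d1d1d1d1d1d   # low byte of 0x11d, replicated per lane


def mul2_packed(w: int) -> int:
    """Multiply each of the 8 bytes packed in w by 2 in GF(2^8)."""
    carry = w & TOP
    # 0xff in every lane whose top bit was set, 0x00 elsewhere.
    mask = (carry << 1) - (carry >> 7)
    return ((w << 1) & LOW7) ^ (mask & POLY)


def gen_syndrome(data):
    """Return (P, Q) for equal-length data blocks, length a multiple of 8."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for off in range(0, size, 8):
        wp = wq = 0
        # Horner's scheme: start at the highest-numbered data block.
        for blk in reversed(data):
            wd = int.from_bytes(blk[off:off + 8], "little")
            wp ^= wd
            wq = mul2_packed(wq) ^ wd
        p[off:off + 8] = wp.to_bytes(8, "little")
        q[off:off + 8] = wq.to_bytes(8, "little")
    return bytes(p), bytes(q)
```

A real SIMD implementation does the same lane-wise work in vector registers, and wider registers or more unrolling simply process more lanes per loop iteration.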
* \|	lib/raid6: add ARM-NEON accelerated syndrome calculation	Ard Biesheuvel	2013-07-08	1	-0/+6
\|/
\|	Rebased/reworked a patch contributed by Rob Herring that uses NEON intrinsics to perform the RAID-6 syndrome calculations. It uses the existing unroll.awk code to generate several unrolled versions of which the best performing one is selected at boot time.
\|
\|	Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
\|	Acked-by: Nicolas Pitre <nico@linaro.org>
\|	Cc: hpa@linux.intel.com
*	lib/raid6: Add AVX2 optimized gen_syndrome functions	Yuanhan Liu	2012-12-13	1	-0/+9
\|	Add AVX2 optimized gen_syndrome functions, which are simply based on sse2.c written by hpa.
\|
\|	Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
\|	Reviewed-by: H. Peter Anvin <hpa@zytor.com>
\|	Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	lib/raid6: Add AVX2 optimized recovery functions	Jim Kukunas	2012-12-13	1	-0/+3
\|	Optimize RAID6 recovery functions to take advantage of the 256-bit YMM integer instructions introduced in AVX2.
\|
\|	The patch was tested and benchmarked before submission. However hardware is not yet released so benchmark numbers cannot be reported.
\|
\|	Acked-by: "H. Peter Anvin" <hpa@zytor.com>
\|	Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	lib/raid6: cleanup gen_syndrome function selection	Jim Kukunas	2012-05-22	1	-47/+57
\|	Reorders functions in raid6_algos as well as the preference check to reduce the number of functions tested on initialization. Also, creates symmetry between choosing the gen_syndrome functions and choosing the recovery functions.
\|
\|	Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
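The selection idea in this entry (order the candidates, skip the ones that are not valid or are less preferred, benchmark the rest, keep the fastest) can be sketched as below. This is a hedged Python illustration of that general approach, not the kernel's selection code. The dictionary keys `name`, `prefer`, `valid` and `gen`, the helper `choose_algo`, and the two toy XOR-parity routines are all assumptions made up for the example.

```python
import time


def xor_bytewise(stripe):
    """Toy candidate: XOR parity computed one byte at a time."""
    p = bytearray(len(stripe[0]))
    for blk in stripe:
        for i, b in enumerate(blk):
            p[i] ^= b
    return bytes(p)


def xor_wide(stripe):
    """Toy candidate: XOR parity computed on whole blocks as big integers."""
    acc = 0
    for blk in stripe:
        acc ^= int.from_bytes(blk, "little")
    return acc.to_bytes(len(stripe[0]), "little")


def choose_algo(algos, stripe, rounds=200):
    """Benchmark candidates in list order and return the fastest usable one."""
    best, best_rate = None, 0.0
    for algo in algos:
        # Preference check: never benchmark something less preferred
        # than the candidate already selected.
        if best is not None and algo["prefer"] < best["prefer"]:
            continue
        # Skip candidates the (hypothetical) capability check rejects.
        if not algo["valid"]():
            continue
        t0 = time.perf_counter()
        for _ in range(rounds):
            algo["gen"](stripe)
        elapsed = time.perf_counter() - t0
        rate = rounds / elapsed if elapsed > 0 else float("inf")
        if rate > best_rate:
            best, best_rate = algo, rate
    return best
```

Listing the most preferred candidates first means the preference check prunes everything below them without running a benchmark, which is one way to reduce the number of functions tested at initialization.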
*	lib/raid6: Add SSSE3 optimized recovery functions	Jim Kukunas	2012-05-22	1	-0/+37
\|	Add SSSE3 optimized recovery functions, as well as a system for selecting the most appropriate recovery functions to use.
\|
\|	Originally-by: H. Peter Anvin <hpa@zytor.com>
\|	Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	lib/raid6: fix test program build	Jim Kukunas	2012-05-22	1	-1/+1
\|	<linux/module.h> drags in headers which are not visible to userspace, thus breaking the build for the test program.
\|
\|	Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	md: Add module.h to all files using it implicitly	Paul Gortmaker	2011-10-31	1	-0/+1
\|	A pending cleanup will mean that module.h won't be implicitly included everywhere anymore. Make sure the modular drivers in the md dir are actually calling out for <module.h> explicitly in advance.
\|
\|	Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
*	Further tidyup of raid6 naming in lib/raid6	NeilBrown	2010-08-12	1	-1/+1
\|	Rename raid6/raid6x86.h to raid6/x86.h and modify some comments.
\|
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	Make lib/raid6/test build correctly.	NeilBrown	2010-08-12	1	-1/+1
\|	Some bit-rot needs to be cleaned out.
\|
\|	Signed-off-by: NeilBrown <neilb@suse.de>
*	Rename raid6 files now they're in a 'raid6' directory.	David Woodhouse	2010-08-11	1	-0/+154
	Linus asks 'why "raid6" twice?'. No reason.

	Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>