author Mathias Krause <minipli@googlemail.com> 2014-09-28 22:23:59 +0200
committer Herbert Xu <herbert@gondor.apana.org.au> 2014-10-02 14:35:03 +0800
commit 80dca4734b3561be59879b02bce359b6f661e921 (patch)
tree 4822ed2121ea7371691ab6a5b14e5e42584a7ae8 /arch/x86/math-emu/fpu_proto.h
parent 7a1ae9c0ce39d839044745956f08eabbea00d420 (diff)
crypto: aesni - fix counter overflow handling in "by8" variant
The "by8" CTR AVX implementation fails to properly handle counter overflows. That was the reason it got disabled in commit 7da4b29d496b ("crypto: aesni - disable "by8" AVX CTR optimization").

Fix the overflow handling by incrementing the counter block as a double quad word, i.e. as a 128-bit value, and testing for overflows afterwards. We need to use VPTEST to do so as VPADD* does not set the flags itself and silently drops the carry bit.

As this change adds branches to the hot path, minor performance regressions might be a side effect. But, OTOH, we now have a conforming implementation -- the preferable goal.

A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical numbers for the old and this version with differences within the noise range. A dm-crypt test with the fixed version gave even slightly better results for this version. So the performance impact might not be as big as expected.

Tested-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
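For illustration only, the idea of treating the counter block as one 128-bit value can be sketched in portable C. This is not the code from the patch, which is AVX assembly using VPADD* and VPTEST. The helper names `load_be64`, `store_be64` and `ctr128_add` are hypothetical, and the sketch assumes the counter block is stored as a big-endian 128-bit integer. The point it shows is that an addition on the low quad word alone loses the carry, so the carry has to be detected and propagated into the high quad word explicitly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit integer. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: write a 64-bit integer as 8 big-endian bytes. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/*
 * Add n to a 16-byte counter block, treating it as one 128-bit
 * big-endian integer (an assumption of this sketch). The low quad
 * word is incremented first; a wrap-around is detected afterwards
 * and carried into the high quad word.
 */
static void ctr128_add(uint8_t block[16], uint64_t n)
{
    uint64_t hi = load_be64(block);
    uint64_t lo = load_be64(block + 8);
    uint64_t sum = lo + n;

    if (sum < lo)   /* unsigned wrap-around: the low quad word overflowed */
        hi++;

    store_be64(block, hi);
    store_be64(block + 8, sum);
}
```

Dropping the `if (sum < lo)` check gives the kind of behaviour the commit message describes for the unfixed code: the low 64 bits wrap to zero while the high 64 bits stay unchanged.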
Diffstat (limited to 'arch/x86/math-emu/fpu_proto.h')
0 files changed, 0 insertions, 0 deletions