author | Mathias Krause <minipli@googlemail.com> | 2014-09-28 22:23:59 +0200 |
---|---|---|
committer | Herbert Xu <herbert@gondor.apana.org.au> | 2014-10-02 14:35:03 +0800 |
commit | 80dca4734b3561be59879b02bce359b6f661e921 (patch) | |
tree | 4822ed2121ea7371691ab6a5b14e5e42584a7ae8 /arch/x86/math-emu/fpu_proto.h | |
parent | 7a1ae9c0ce39d839044745956f08eabbea00d420 (diff) | |
crypto: aesni - fix counter overflow handling in "by8" variant
The "by8" CTR AVX implementation fails to properly handle counter
overflows. That was the reason it got disabled in commit 7da4b29d496b
("crypto: aesni - disable "by8" AVX CTR optimization").
Fix the overflow handling by incrementing the counter block as a double
quad word, i.e. as a 128-bit value, and testing for overflows afterwards.
We need to use VPTEST to do so as VPADD* does not set the flags itself and
silently drops the carry bit.
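For illustration only, the carry handling described above can be modelled in scalar C. This is a sketch under assumptions, not the patch's assembly: `struct ctr128` and `ctr128_add` are hypothetical names, and the counter is held here as two native-endian 64-bit halves, whereas the real code keeps it in an XMM register and checks the result with VPTEST.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 128-bit counter block as two 64-bit halves. */
struct ctr128 {
	uint64_t hi;	/* upper quad word */
	uint64_t lo;	/* lower quad word */
};

/*
 * Add n to the counter as a single 128-bit integer.
 *
 * A plain 64-bit add on the low half wraps silently, just as a packed
 * quad word add drops the carry, so the wrap has to be detected
 * explicitly and propagated into the high half.
 */
static void ctr128_add(struct ctr128 *c, uint64_t n)
{
	assert(c != NULL);

	c->lo += n;
	if (c->lo < n)		/* low half wrapped: carry out */
		c->hi++;	/* high half wraps modulo 2^128 by design */
}
```

The comparison `c->lo < n` stands in for the flag test: after an unsigned add, the sum is smaller than the addend exactly when the low half overflowed.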
As this change adds branches to the hot path, minor performance
regressions might be a side effect. But, OTOH, we now have a conforming
implementation -- the preferable goal.
A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical
numbers for the old and this version with differences within the noise
range. A dm-crypt test gave even slightly better results for the fixed
version. So the performance impact might not be as big as expected.
Tested-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Diffstat (limited to 'arch/x86/math-emu/fpu_proto.h')
0 files changed, 0 insertions, 0 deletions