summaryrefslogtreecommitdiffstats
path: root/arch/arm64/crypto/chacha-neon-glue.c
Commit message (Collapse)AuthorAgeFilesLines
* crypto: arch/lib - limit simd usage to 4k chunksJason A. Donenfeld2020-04-301-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial Zinc patchset, after some mailing list discussion, contained code to ensure that kernel_fpu_enable would not be kept on for more than a 4k chunk, since it disables preemption. The choice of 4k isn't totally scientific, but it's not a bad guess either, and it's what's used in both the x86 poly1305, blake2s, and nhpoly1305 code already (in the form of PAGE_SIZE, which this commit corrects to be explicitly 4k for the former two). Ard did some back of the envelope calculations and found that at 5 cycles/byte (overestimate) on a 1ghz processor (pretty slow), 4k means we have a maximum preemption disabling of 20us, which Sebastian confirmed was probably a good limit. Unfortunately the chunking appears to have been left out of the final patchset that added the glue code. So, this commit adds it back in. Fixes: 84e03fa39fbe ("crypto: x86/chacha - expose SIMD ChaCha routine as library function") Fixes: b3aad5bad26a ("crypto: arm64/chacha - expose arm64 ChaCha routine as library function") Fixes: a44a3430d71b ("crypto: arm/chacha - expose ARM ChaCha routine as library function") Fixes: d7d7b8535662 ("crypto: x86/poly1305 - wire up faster implementations for kernel") Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") Fixes: a6b803b3ddc7 ("crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") Fixes: ed0356eda153 ("crypto: blake2s - x86_64 SIMD implementation") Cc: Eric Biggers <ebiggers@google.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha - correctly walk through blocksJason A. Donenfeld2020-03-201-4/+4
| | | | | | | | | | | | | | | | | | | | Prior, passing in chunks of 2, 3, or 4, followed by any additional chunks would result in the chacha state counter getting out of sync, resulting in incorrect encryption/decryption, which is a pretty nasty crypto vuln: "why do images look weird on webpages?" WireGuard users never experienced this prior, because we have always, out of tree, used a different crypto library, until the recent Frankenzinc addition. This commit fixes the issue by advancing the pointers and state counter by the actual size processed. It also fixes up a bug in the (optional, costly) stride test that prevented it from running on arm64. Fixes: b3aad5bad26a ("crypto: arm64/chacha - expose arm64 ChaCha routine as library function") Reported-and-tested-by: Emil Renner Berthing <kernel@esmil.dk> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arch - conditionalize crypto api in arch glue for lib codeJason A. Donenfeld2019-11-271-2/+3
| | | | | | | | | | | | For glue code that's used by Zinc, the actual Crypto API functions might not necessarily exist, and don't need to exist either. Before this patch, there are valid build configurations that lead to a unbuildable kernel. This fixes it to conditionalize those symbols on the existence of the proper config entry. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm/chacha - remove dependency on generic ChaCha driverArd Biesheuvel2019-11-171-1/+1
| | | | | | | | | | | Instead of falling back to the generic ChaCha skcipher driver for non-SIMD cases, use a fast scalar implementation for ARM authored by Eric Biggers. This removes the module dependency on chacha-generic altogether, which also simplifies things when we expose the ChaCha library interface from this module. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha - expose arm64 ChaCha routine as library functionArd Biesheuvel2019-11-171-11/+42
| | | | | | | | | | | | | | Expose the accelerated NEON ChaCha routine directly as a symbol export so that users of the ChaCha library API can use it directly. Given that calls into the library API will always go through the routines in this module if it is enabled, switch to static keys to select the optimal implementation available (which may be none at all, in which case we defer to the generic implementation for all invocations). Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha - depend on generic chacha library instead of crypto driverArd Biesheuvel2019-11-171-18/+22
| | | | | | | | | | | | | | | Depend on the generic ChaCha library routines instead of pulling in the generic ChaCha skcipher driver, which is more than we need, and makes managing the dependencies between the generic library, generic driver, accelerated library and driver more complicated. While at it, drop the logic to prefer the scalar code on short inputs. Turning the NEON on and off is cheap these days, and one major use case for ChaCha20 is ChaCha20-Poly1305, which is guaranteed to hit the scalar path upon every invocation (when doing the Poly1305 nonce generation) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: chacha - move existing library code into lib/cryptoArd Biesheuvel2019-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | Currently, our generic ChaCha implementation consists of a permute function in lib/chacha.c that operates on the 64-byte ChaCha state directly [and which is always included into the core kernel since it is used by the /dev/random driver], and the crypto API plumbing to expose it as a skcipher. In order to support in-kernel users that need the ChaCha streamcipher but have no need [or tolerance] for going through the abstractions of the crypto API, let's expose the streamcipher bits via a library API as well, in a way that permits the implementation to be superseded by an architecture specific one if provided. So move the streamcipher code into a separate module in lib/crypto, and expose the init() and crypt() routines to users of the library. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: chacha - constify ctx and iv argumentsEric Biggers2019-06-131-1/+1
| | | | | | | | | | Constify the ctx and iv arguments to crypto_chacha_init() and the various chacha*_stream_xor() functions. This makes it clear that they are not modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Merge branch 'linus' of ↵Linus Torvalds2019-05-061-2/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Add support for AEAD in simd - Add fuzz testing to testmgr - Add panic_on_fail module parameter to testmgr - Use per-CPU struct instead multiple variables in scompress - Change verify API for akcipher Algorithms: - Convert x86 AEAD algorithms over to simd - Forbid 2-key 3DES in FIPS mode - Add EC-RDSA (GOST 34.10) algorithm Drivers: - Set output IV with ctr-aes in crypto4xx - Set output IV in rockchip - Fix potential length overflow with hashing in sun4i-ss - Fix computation error with ctr in vmx - Add SM4 protected keys support in ccree - Remove long-broken mxc-scc driver - Add rfc4106(gcm(aes)) cipher support in cavium/nitrox" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (179 commits) crypto: ccree - use a proper le32 type for le32 val crypto: ccree - remove set but not used variable 'du_size' crypto: ccree - Make cc_sec_disable static crypto: ccree - fix spelling mistake "protedcted" -> "protected" crypto: caam/qi2 - generate hash keys in-place crypto: caam/qi2 - fix DMA mapping of stack memory crypto: caam/qi2 - fix zero-length buffer DMA mapping crypto: stm32/cryp - update to return iv_out crypto: stm32/cryp - remove request mutex protection crypto: stm32/cryp - add weak key check for DES crypto: atmel - remove set but not used variable 'alg_name' crypto: picoxcell - Use dev_get_drvdata() crypto: crypto4xx - get rid of redundant using_sd variable crypto: crypto4xx - use sync skcipher for fallback crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues crypto: crypto4xx - fix ctr-aes missing output IV crypto: ecrdsa - select ASN1 and OID_REGISTRY for EC-RDSA crypto: ux500 - use ccflags-y instead of CFLAGS_<basename>.o crypto: ccree - handle tee fips error during power management resume crypto: ccree - add function to handle cryptocell tee fips error ...
| * crypto: arm64 - convert to use crypto_simd_usable()Eric Biggers2019-03-221-2/+3
| | | | | | | | | | | | | | | | | | Replace all calls to may_use_simd() in the arm64 crypto code with crypto_simd_usable(), in order to allow testing the no-SIMD code paths. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | arm64: HWCAP: add support for AT_HWCAP2Andrew Murray2019-04-161-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | As we will exhaust the first 32 bits of AT_HWCAP let's start exposing AT_HWCAP2 to userspace to give us up to 64 caps. Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we prefer to expand into AT_HWCAP2 in order to provide a consistent view to userspace between ILP32 and LP64. However internal to the kernel we prefer to continue to use the full space of elf_hwcap. To reduce complexity and allow for future expansion, we now represent hwcaps in the kernel as ordinals and use a KERNEL_HWCAP_ prefix. This allows us to support automatic feature based module loading for all our hwcaps. We introduce cpu_set_feature to set hwcaps which complements the existing cpu_have_feature helper. These helpers allow us to clean up existing direct uses of elf_hwcap and reduce any future effort required to move beyond 64 caps. For convenience we also introduce cpu_{have,set}_named_feature which makes use of the cpu_feature macro to allow providing a hwcap name without a {KERNEL_}HWCAP_ prefix. Signed-off-by: Andrew Murray <andrew.murray@arm.com> [will: use const_ilog2() and tweak documentation] Signed-off-by: Will Deacon <will.deacon@arm.com>
* crypto: arm64/chacha - use combined SIMD/ALU routine for more speedArd Biesheuvel2018-12-131-19/+20
| | | | | | | | | | | To some degree, most known AArch64 micro-architectures appear to be able to issue ALU instructions in parellel to SIMD instructions without affecting the SIMD throughput. This means we can use the ALU to process a fifth ChaCha block while the SIMD is processing four blocks in parallel. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha - optimize for arbitrary length inputsArd Biesheuvel2018-12-131-24/+14
| | | | | | | | | | | | | | | Update the 4-way NEON ChaCha routine so it can handle input of any length >64 bytes in its entirety, rather than having to call into the 1-way routine and/or memcpy()s via temp buffers to handle the tail of a ChaCha invocation that is not a multiple of 256 bytes. On inputs that are a multiple of 256 bytes (and thus in tcrypt benchmarks), performance drops by around 1% on Cortex-A57, while performance for inputs drawn randomly from the range [64, 1024) increases by around 30%. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha - add XChaCha12 supportEric Biggers2018-12-131-0/+18
| | | | | | | | | | | Now that the ARM64 NEON implementation of ChaCha20 and XChaCha20 has been refactored to support varying the number of rounds, add support for XChaCha12. This is identical to XChaCha20 except for the number of rounds, which is 12 instead of 20. This can be used by Adiantum. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/chacha20 - refactor to allow varying number of roundsEric Biggers2018-12-131-0/+189
In preparation for adding XChaCha12 support, rename/refactor the ARM64 NEON implementation of ChaCha20 to support different numbers of rounds. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>