path: root/drivers/crypto/intel/iaa/iaa_crypto_main.c
Commit message | Author | Age | Files | Lines
* crypto: iaa - Use acomp stack fallback | Herbert Xu | 2025-03-21 | 1 | -22/+6
  Use ACOMP_REQUEST_ON_STACK instead of allocating legacy fallback compression transform.

  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Remove dst_null support | Herbert Xu | 2025-03-21 | 1 | -130/+6
  Remove the unused dst_null support.

  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Test the correct request flag | Herbert Xu | 2025-03-08 | 1 | -2/+2
  Test the correct flags for the MAY_SLEEP bit.

  Fixes: 2ec6761df889 ("crypto: iaa - Add support for deflate-iaa compression algorithm")
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Fix IAA disabling that occurs when sync_mode is set to 'async' | Kanchana P Sridhar | 2024-12-28 | 1 | -1/+1
  With the latest mm-unstable, setting the iaa_crypto sync_mode to 'async' causes crypto testmgr.c test_acomp() failure and dmesg call traces, and zswap being unable to use 'deflate-iaa' as a compressor:

  echo async > /sys/bus/dsa/drivers/crypto/sync_mode

  [ 255.271030] zswap: compressor deflate-iaa not available
  [ 369.960673] INFO: task cryptomgr_test:4889 blocked for more than 122 seconds.
  [ 369.970127] Not tainted 6.13.0-rc1-mm-unstable-12-16-2024+ #324
  [ 369.977411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 369.986246] task:cryptomgr_test state:D stack:0 pid:4889 tgid:4889 ppid:2 flags:0x00004000
  [ 369.986253] Call Trace:
  [ 369.986256] <TASK>
  [ 369.986260] __schedule+0x45c/0xfa0
  [ 369.986273] schedule+0x2e/0xb0
  [ 369.986277] schedule_timeout+0xe7/0x100
  [ 369.986284] ? __prepare_to_swait+0x4e/0x70
  [ 369.986290] wait_for_completion+0x8d/0x120
  [ 369.986293] test_acomp+0x284/0x670
  [ 369.986305] ? __pfx_cryptomgr_test+0x10/0x10
  [ 369.986312] alg_test_comp+0x263/0x440
  [ 369.986315] ? sched_balance_newidle+0x259/0x430
  [ 369.986320] ? __pfx_cryptomgr_test+0x10/0x10
  [ 369.986323] alg_test.part.27+0x103/0x410
  [ 369.986326] ? __schedule+0x464/0xfa0
  [ 369.986330] ? __pfx_cryptomgr_test+0x10/0x10
  [ 369.986333] cryptomgr_test+0x20/0x40
  [ 369.986336] kthread+0xda/0x110
  [ 369.986344] ? __pfx_kthread+0x10/0x10
  [ 369.986346] ret_from_fork+0x2d/0x40
  [ 369.986355] ? __pfx_kthread+0x10/0x10
  [ 369.986358] ret_from_fork_asm+0x1a/0x30
  [ 369.986365] </TASK>

  This happens because the only async polling without interrupts that iaa_crypto currently implements is with the 'sync' mode. With 'async', iaa_crypto calls to compress/decompress submit the descriptor and return -EINPROGRESS, without any mechanism in the driver to poll for completions. Hence callers such as test_acomp() in crypto/testmgr.c or zswap, that wrap the calls to crypto_acomp_compress() and crypto_acomp_decompress() in synchronous wrappers, will block indefinitely. Even before zswap can notice this problem, the crypto testmgr.c's test_acomp() will fail and prevent registration of "deflate-iaa" as a valid crypto acomp algorithm, thereby disallowing the use of "deflate-iaa" as a zswap compressor (zswap will fall back to the default compressor in this case).

  To fix this issue, this patch modifies the iaa_crypto sync_mode set function to treat 'async' as equivalent to 'sync', so that the correct and only supported driver async polling without interrupts implementation is enabled, and zswap can use 'deflate-iaa' as the compressor.

  Hence, with this patch, this is what will happen:

  echo async > /sys/bus/dsa/drivers/crypto/sync_mode
  cat /sys/bus/dsa/drivers/crypto/sync_mode
  sync

  There are no crypto/testmgr.c test_acomp() errors, no call traces and zswap can use 'deflate-iaa' without any errors. The iaa_crypto documentation has also been updated to mention this caveat with 'async' and what to expect with this fix.

  True iaa_crypto async polling without interrupts is enabled in patch "crypto: iaa - Implement batch_compress(), batch_decompress() API in iaa_crypto." [1] which is under review as part of the "zswap IAA compress batching" patch-series [2]. Until this is merged, we would appreciate it if this current patch can be considered for a hotfix.

  [1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-5-kanchana.p.sridhar@intel.com/
  [2]: https://patchwork.kernel.org/project/linux-mm/list/?series=920084

  Fixes: 09646c98d ("crypto: iaa - Add irq support for the crypto async interface")
  Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
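The sync_mode behaviour described in the commit above can be pictured with a small user-space sketch. This is not the driver's code: the struct, the function names and the use of strcmp() are assumptions made for illustration (the kernel would compare sysfs input with a newline-tolerant helper such as sysfs_streq()). It only shows the mapping the message describes, where 'async' now selects the same settings as 'sync'.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the driver's mode state; names are illustrative. */
struct sync_mode_state {
	bool async_mode; /* return -EINPROGRESS instead of waiting */
	bool use_irq;    /* request a completion interrupt */
};

/*
 * Parse a sync_mode string. Following the fix described above, "async"
 * selects the same settings as "sync", since polling without interrupts
 * is only implemented on the synchronous path.
 */
static int set_sync_mode(struct sync_mode_state *st, const char *name)
{
	if (strcmp(name, "sync") == 0 || strcmp(name, "async") == 0) {
		st->async_mode = false;
		st->use_irq = false;
	} else if (strcmp(name, "async_irq") == 0) {
		st->async_mode = true;
		st->use_irq = true;
	} else {
		return -EINVAL; /* unknown mode name */
	}
	return 0;
}

/* Return the mode name that reading the attribute back would report. */
static const char *show_sync_mode(const struct sync_mode_state *st)
{
	if (st->async_mode && st->use_irq)
		return "async_irq";
	if (st->async_mode)
		return "async";
	return "sync";
}
```

Under these assumptions, writing "async" and then reading the mode back yields "sync", which matches the echo/cat sequence quoted in the commit message.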
* module: Convert symbol namespace to string literal | Peter Zijlstra | 2024-12-02 | 1 | -1/+1
  Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself.

  Scripted using

    git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
    do
      awk -i inplace '
        /^#define EXPORT_SYMBOL_NS/ {
          gsub(/__stringify\(ns\)/, "ns");
          print;
          next;
        }
        /^#define MODULE_IMPORT_NS/ {
          gsub(/__stringify\(ns\)/, "ns");
          print;
          next;
        }
        /MODULE_IMPORT_NS/ {
          $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
        }
        /EXPORT_SYMBOL_NS/ {
          if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
            if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
                $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
                $0 !~ /^my/) {
              getline line;
              gsub(/[[:space:]]*\\$/, "");
              gsub(/[[:space:]]/, "", line);
              $0 = $0 " " line;
            }
            $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g");
          }
        }
        { print }' $file;
    done

  Requested-by: Masahiro Yamada <masahiroy@kernel.org>
  Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
  Acked-by: Greg KH <gregkh@linuxfoundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* crypto: iaa - Remove potential infinite loop in check_completion() | Zanussi, Tom | 2024-10-05 | 1 | -0/+10
  For iaa_crypto operations, it's assumed that if an operation doesn't make progress, the IAA watchdog timer will kick in and set the completion status bit to failure and the reason to completion timeout.

  Some systems may have broken hardware that doesn't even do that, which can result in an infinite status-checking loop. Add a check for that in the loop, and disable the driver if it occurs.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
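The bounded status check described in the commit above might look roughly like the following user-space sketch. The iteration limit, the flag name and the function name are hypothetical, and the real loop would also relax the CPU between reads; the point is only that the spin gives up after a fixed number of iterations and disables further use instead of looping forever.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define POLL_LIMIT 1000000UL /* hypothetical bound, not taken from the driver */

static bool driver_enabled = true; /* hypothetical global enable flag */

/*
 * Spin until the device writes a non-zero completion status. If the status
 * never changes within POLL_LIMIT iterations, assume the hardware is broken,
 * mark the driver disabled and report a timeout.
 */
static int wait_for_status(const volatile uint8_t *status)
{
	unsigned long spins = 0;

	while (*status == 0) {
		if (++spins > POLL_LIMIT) {
			driver_enabled = false;
			return -ETIMEDOUT;
		}
	}
	return 0;
}
```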
* crypto: iaa - Fix potential use after free bug | Dan Carpenter | 2024-08-02 | 1 | -2/+2
  The free_device_compression_mode(iaa_device, device_mode) function frees "device_mode" but it is passed to iaa_compression_modes[i]->free() a few lines later resulting in a use after free.

  The good news is that, so far as I can tell, nothing implements the ->free() function and the use after free happens in dead code. But, with this fix, when something does implement it, we'll be ready. :)

  Fixes: b190447e0fa3 ("crypto: iaa - Add compression mode management along with fixed mode")
  Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
  Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
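The ordering issue in the commit above can be illustrated with a minimal user-space sketch. All types and names here are hypothetical stand-ins; the only point is that an optional per-mode free hook has to run while the object it receives is still allocated, and the object itself is released afterwards.

```c
#include <stdlib.h>

/* Hypothetical per-device mode state. */
struct device_mode {
	int *table;
};

/* Hypothetical mode descriptor with an optional cleanup hook. */
struct mode_ops {
	void (*free)(struct device_mode *dm);
};

static int hook_calls; /* counts hook invocations, for illustration only */

static void example_free_hook(struct device_mode *dm)
{
	/* Dereferences dm, so it must run before dm itself is freed. */
	free(dm->table);
	dm->table = NULL;
	hook_calls++;
}

/* Safe ordering: run the optional hook first, then release the object. */
static void release_device_mode(const struct mode_ops *ops, struct device_mode *dm)
{
	if (!dm)
		return;
	if (ops && ops->free)
		ops->free(dm);
	free(dm);
}
```

Swapping the last two steps of release_device_mode() would reproduce the bug class the commit describes: the hook would be handed a pointer to memory that has already been freed.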
* crypto: iaa - Use kmemdup() instead of kzalloc() and memcpy() | Thorsten Blum | 2024-05-10 | 1 | -4/+2
  Fixes the following two Coccinelle/coccicheck warnings reported by memdup.cocci:

  iaa_crypto_main.c:350:19-26: WARNING opportunity for kmemdup
  iaa_crypto_main.c:358:18-25: WARNING opportunity for kmemdup

  Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
  Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
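For readers unfamiliar with the pattern the warnings above refer to, here is a user-space analogue. dup_buffer() is a hypothetical helper built on malloc() and memcpy(); it stands in for what kmemdup() does in the kernel (allocate, then copy, in one call) and is not the driver's code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a new buffer of len bytes and copy src into it.
 * Returns NULL if the allocation fails; the caller owns the result.
 * Zeroing the buffer first (as kzalloc() would) is unnecessary because
 * every byte is overwritten by the copy.
 */
static void *dup_buffer(const void *src, size_t len)
{
	void *copy = malloc(len);

	if (copy)
		memcpy(copy, src, len);
	return copy;
}
```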
* crypto: iaa - Use cpumask_weight() when rebalancing | Tom Zanussi | 2024-04-12 | 1 | -2/+2
  If some cpus are offlined, or if the node mask is smaller than expected, the 'nonexistent cpu' warning in rebalance_wq_table() may be erroneously triggered.

  Use cpumask_weight() to make sure we only iterate over the exact number of cpus in the mask.

  Also use num_possible_cpus() instead of num_online_cpus() to make sure all slots in the wq table are initialized.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Remove comp/decomp delay statistics | Tom Zanussi | 2024-04-02 | 1 | -9/+0
  As part of the simplification/cleanup of the iaa statistics, remove the comp/decomp delay statistics.

  They're actually not really useful and can be/are being more flexibly generated using standard kernel tracing infrastructure.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - fix decomp_bytes_in stats | Tom Zanussi | 2024-04-02 | 1 | -2/+2
  Decomp stats should use slen, not dlen. Change both the global and per-wq stats to use the correct value.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Fix nr_cpus < nr_iaa case | Tom Zanussi | 2024-03-22 | 1 | -3/+7
  If nr_cpus < nr_iaa, the calculated cpus_per_iaa will be 0, which causes a divide-by-0 in rebalance_wq_table().

  Make sure cpus_per_iaa is 1 in that case, and also in the nr_iaa == 0 case, even though cpus_per_iaa is never used if nr_iaa == 0, for paranoia.

  Cc: <stable@vger.kernel.org> # v6.8+
  Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
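The clamping described in the commit above can be sketched in a few lines of standalone C. The function names are hypothetical, and cpu_to_iaa() is only an assumed example of a later division by cpus_per_iaa; the sketch shows why integer division yields 0 when there are fewer cpus than IAA devices and how forcing the result to at least 1 avoids the divide-by-0.

```c
/*
 * Compute how many cpus share one IAA device. Integer division gives 0
 * when nr_cpus < nr_iaa, so the result is clamped to 1. The nr_iaa == 0
 * case also returns 1, purely as a precaution.
 */
static unsigned int calc_cpus_per_iaa(unsigned int nr_cpus, unsigned int nr_iaa)
{
	unsigned int cpus_per_iaa;

	if (nr_iaa == 0)
		return 1;

	cpus_per_iaa = nr_cpus / nr_iaa;
	return cpus_per_iaa ? cpus_per_iaa : 1;
}

/*
 * Hypothetical consumer: map a cpu number to a device index. This division
 * is the kind of operation that would fault if cpus_per_iaa were 0.
 */
static unsigned int cpu_to_iaa(unsigned int cpu, unsigned int cpus_per_iaa)
{
	return cpu / cpus_per_iaa;
}
```

With 2 cpus and 8 devices, the unclamped value would be 0; with the clamp, each cpu simply maps to its own device index.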
* crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags | Barry Song | 2024-03-08 | 1 | -0/+1
  Add the missing CRYPTO_ALG_ASYNC flag since intel iaa driver works asynchronously.

  Signed-off-by: Barry Song <v-songbaohua@oppo.com>
  Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Fix comp/decomp delay statistics | Tom Zanussi | 2024-03-01 | 1 | -0/+9
  The comp/decomp delay statistics currently have no callers; somehow they were dropped during refactoring. There originally were also two sets, one for the async algorithm, the other for the synchronous version. Because the synchronous algorithm was dropped, one set should be removed. To keep it consistent with the rest of the stats, and since there's no ambiguity, remove the acomp/adecomp versions. Also add back the callers.

  Reported-by: Rex Zhang <rex.zhang@intel.com>
  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Fix async_disable descriptor leak | Tom Zanussi | 2024-03-01 | 1 | -2/+2
  The disable_async paths of iaa_compress/decompress() don't free idxd descriptors in the async_disable case. Currently this only happens in the testcases where req->dst is set to null. Add a test to free them in those paths.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Remove header table code | Tom Zanussi | 2024-01-26 | 1 | -105/+3
  The header table and related code is currently unused - it was included and used for canned mode, but canned mode has been removed, so this code can be safely removed as well.

  This indirectly fixes a bug reported by Dan Carpenter.

  Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
  Closes: https://lore.kernel.org/linux-crypto/b2e0bd974981291e16882686a2b9b1db3986abe4.camel@linux.intel.com/T/#m4403253d6a4347a925fab4fc1cdb4ef7c095fb86
  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Reviewed-by: Dave Jiang <dave.jiang@intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Account for cpu-less numa nodes | Tom Zanussi | 2023-12-29 | 1 | -2/+13
  In some configurations e.g. systems with CXL, a numa node can have 0 cpus and cpumask_nth() will return a cpu value that doesn't exist, which will result in an attempt to add an entry to the wq table at a bad index.

  To fix this, when iterating the cpus for a node, skip any node that doesn't have cpus.

  Also, as a precaution, add a warning and bail if cpumask_nth() returns a nonexistent cpu.

  Reported-by: Zhang, Rex <rex.zhang@intel.com>
  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - remove unneeded semicolon | Jiapeng Chong | 2023-12-29 | 1 | -1/+1
  No functional modification involved.

  ./drivers/crypto/intel/iaa/iaa_crypto_main.c:979:2-3: Unneeded semicolon.

  Reported-by: Abaci Robot <abaci@linux.alibaba.com>
  Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7772
  Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
  Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Change desc->priv to 0 | Tom Zanussi | 2023-12-29 | 1 | -4/+4
  In order for shared workqueues to work properly, desc->priv should be set to 0 rather than 1. The need for this is described in commit f5ccf55e1028 (dmaengine/idxd: Re-enable kernel workqueue under DMA API), so we need to make IAA consistent with IOMMU settings, otherwise we get:

  [ 141.948389] IOMMU: dmar15: Page request in Privilege Mode
  [ 141.948394] dmar15: Invalid page request: 2000026a100101 ffffb167

  Dedicated workqueues ignore this field and are unaffected.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Reviewed-by: Dave Jiang <dave.jiang@intel.com>
  Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Add IAA Compression Accelerator stats | Tom Zanussi | 2023-12-15 | 1 | -2/+37
  Add optional debugfs statistics support for the IAA Compression Accelerator. This is enabled by the kernel config item:

  CRYPTO_DEV_IAA_CRYPTO_STATS

  When enabled, the IAA crypto driver will generate statistics which can be accessed at /sys/kernel/debug/iaa-crypto/.

  See Documentation/driver-api/crypto/iax/iax-crypto.rst for details.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Add irq support for the crypto async interface | Tom Zanussi | 2023-12-15 | 1 | -2/+264
  The existing iaa crypto async support provides an implementation that satisfies the interface but does so in a synchronous manner - it fills and submits the IDXD descriptor and then waits for it to complete before returning. This isn't a problem at the moment, since all existing callers (e.g. zswap) wrap any asynchronous callees in a synchronous wrapper anyway.

  This change makes the iaa crypto async implementation truly asynchronous: it fills and submits the IDXD descriptor, then returns immediately with -EINPROGRESS. It also sets the descriptor's 'request completion irq' bit and sets up a callback with the IDXD driver which is called when the operation completes and the irq fires. The existing callers such as zswap use synchronous wrappers to deal with -EINPROGRESS and so work as expected without any changes.

  This mode can be enabled by writing 'async_irq' to the sync_mode iaa_crypto driver attribute:

  echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode

  Async mode without interrupts (caller must poll) can be enabled by writing 'async' to it:

  echo async > /sys/bus/dsa/drivers/crypto/sync_mode

  The default sync mode can be enabled by writing 'sync' to it:

  echo sync > /sys/bus/dsa/drivers/crypto/sync_mode

  The sync_mode value setting at the time the IAA algorithms are registered is captured in each algorithm's crypto_ctx and used for all compresses and decompresses when using a given algorithm.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Add support for deflate-iaa compression algorithm | Tom Zanussi | 2023-12-15 | 1 | -18/+1033
  This patch registers the deflate-iaa deflate compression algorithm and hooks it up to the IAA hardware using the 'fixed' compression mode introduced in the previous patch.

  Because the IAA hardware has a 4k history-window limitation, only buffers <= 4k, or that have been compressed using a <= 4k history window, are technically compliant with the deflate spec, which allows for a window of up to 32k. Because of this limitation, the IAA fixed mode deflate algorithm is given its own algorithm name, 'deflate-iaa'.

  With this change, the deflate-iaa crypto algorithm is registered and operational, and compression and decompression operations are fully enabled following the successful binding of the first IAA workqueue to the iaa_crypto sub-driver.

  When there are no IAA workqueues bound to the driver, the IAA crypto algorithm can be unregistered by removing the module.

  A new iaa_crypto 'verify_compress' driver attribute is also added, allowing the user to toggle compression verification. If set, each compress will be internally decompressed and the contents verified, returning error codes if unsuccessful. This can be toggled with 0/1:

  echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress

  The default setting is '1' - verify all compresses.

  The verify_compress value setting at the time the algorithm is registered is captured in the algorithm's crypto_ctx and used for all compresses when using the algorithm.

  [ Based on work originally by George Powley, Jing Lin and Kyung Min Park ]

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Add compression mode management along with fixed mode | Tom Zanussi | 2023-12-15 | 1 | -1/+326
  Define an in-kernel API for adding and removing compression modes, which can be used by kernel modules or other kernel code that implements IAA compression modes.

  Also add a separate file, iaa_crypto_comp_fixed.c, containing huffman tables generated for the IAA 'fixed' compression mode. Future compression modes can be added in a similar fashion.

  One or more crypto compression algorithms will be created for each compression mode, each of which can be selected as the compression algorithm to be used by a particular facility.

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: iaa - Add per-cpu workqueue table with rebalancing | Tom Zanussi | 2023-12-15 | 1 | -0/+222
  The iaa compression/decompression algorithms in later patches need a way to retrieve an appropriate IAA workqueue depending on how close the associated IAA device is to the current cpu.

  For this purpose, add a per-cpu array of workqueues such that an appropriate workqueue can be retrieved by simply accessing the per-cpu array.

  Whenever a new workqueue is bound to or unbound from the iaa_crypto driver, the available workqueues are 'rebalanced' such that work submitted from a particular CPU is given to the most appropriate workqueue available. There currently isn't any way for the user to tweak the way this is done internally - if necessary, knobs can be added later for that purpose. Current best practice is to configure and bind at least one workqueue for each IAA device, but as long as there is at least one workqueue configured and bound to any IAA device in the system, the iaa_crypto driver will work, albeit most likely not as efficiently.

  [ Based on work originally by George Powley, Jing Lin and Kyung Min Park ]

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
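The idea of a per-cpu lookup table that is rebuilt whenever the set of bound workqueues changes can be sketched as follows. This is a deliberately simplified user-space model: the fixed cpu count, the struct, the function names and the even block split are all assumptions, and it ignores NUMA proximity, which the commit message says is the basis for choosing the "most appropriate" workqueue.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS 8 /* hypothetical fixed cpu count for the sketch */

struct wq {
	int id; /* hypothetical workqueue identifier */
};

/* One slot per cpu; a lookup is a single array access. */
static struct wq *wq_table[SKETCH_NR_CPUS];

/*
 * Rebuild the table after a workqueue is bound or unbound. Cpus are split
 * into contiguous blocks, one block per bound workqueue. With no bound
 * workqueues every slot is cleared.
 */
static void rebalance_wq_table(struct wq **bound, size_t nr_bound)
{
	size_t cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (nr_bound == 0)
			wq_table[cpu] = NULL;
		else
			wq_table[cpu] = bound[cpu * nr_bound / SKETCH_NR_CPUS];
	}
}

/* Return the workqueue assigned to a cpu, or NULL if none or out of range. */
static struct wq *wq_for_cpu(size_t cpu)
{
	return cpu < SKETCH_NR_CPUS ? wq_table[cpu] : NULL;
}
```

In this model, binding a second workqueue and rebalancing moves the upper half of the cpus onto it, while a single bound workqueue serves every cpu, which mirrors the "will work, albeit most likely not as efficiently" remark in the message.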
* crypto: iaa - Add Intel IAA Compression Accelerator crypto driver core | Tom Zanussi | 2023-12-15 | 1 | -0/+323
  The Intel Analytics Accelerator (IAA) is a hardware accelerator that provides very high throughput compression/decompression compatible with the DEFLATE compression standard described in RFC 1951, which is the compression/decompression algorithm exported by this module.

  Users can select IAA compress/decompress acceleration by specifying one of the deflate-iaa* algorithms as the compression algorithm to use by whatever facility allows asynchronous compression algorithms to be selected.

  For example, zswap can select the IAA fixed deflate algorithm 'deflate-iaa' via:

  # echo deflate-iaa > /sys/module/zswap/parameters/compressor

  This patch adds iaa_crypto as an idxd sub-driver and tracks iaa devices and workqueues as they are probed or removed.

  [ Based on work originally by George Powley, Jing Lin and Kyung Min Park ]

  Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>