| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Use ACOMP_REQUEST_ON_STACK instead of allocating legacy fallback
compression transform.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
| |
Remove the unused dst_null support.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
Test the correct flags for the MAY_SLEEP bit.
Fixes: 2ec6761df889 ("crypto: iaa - Add support for deflate-iaa compression algorithm")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the latest mm-unstable, setting the iaa_crypto sync_mode to 'async'
causes crypto testmgr.c test_acomp() failure and dmesg call traces, and
zswap being unable to use 'deflate-iaa' as a compressor:
echo async > /sys/bus/dsa/drivers/crypto/sync_mode
[ 255.271030] zswap: compressor deflate-iaa not available
[ 369.960673] INFO: task cryptomgr_test:4889 blocked for more than 122 seconds.
[ 369.970127] Not tainted 6.13.0-rc1-mm-unstable-12-16-2024+ #324
[ 369.977411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.986246] task:cryptomgr_test state:D stack:0 pid:4889 tgid:4889 ppid:2 flags:0x00004000
[ 369.986253] Call Trace:
[ 369.986256] <TASK>
[ 369.986260] __schedule+0x45c/0xfa0
[ 369.986273] schedule+0x2e/0xb0
[ 369.986277] schedule_timeout+0xe7/0x100
[ 369.986284] ? __prepare_to_swait+0x4e/0x70
[ 369.986290] wait_for_completion+0x8d/0x120
[ 369.986293] test_acomp+0x284/0x670
[ 369.986305] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986312] alg_test_comp+0x263/0x440
[ 369.986315] ? sched_balance_newidle+0x259/0x430
[ 369.986320] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986323] alg_test.part.27+0x103/0x410
[ 369.986326] ? __schedule+0x464/0xfa0
[ 369.986330] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986333] cryptomgr_test+0x20/0x40
[ 369.986336] kthread+0xda/0x110
[ 369.986344] ? __pfx_kthread+0x10/0x10
[ 369.986346] ret_from_fork+0x2d/0x40
[ 369.986355] ? __pfx_kthread+0x10/0x10
[ 369.986358] ret_from_fork_asm+0x1a/0x30
[ 369.986365] </TASK>
This happens because the only async polling without interrupts that
iaa_crypto currently implements is with the 'sync' mode. With 'async',
iaa_crypto calls to compress/decompress submit the descriptor and return
-EINPROGRESS, without any mechanism in the driver to poll for
completions. Hence callers such as test_acomp() in crypto/testmgr.c or
zswap, that wrap the calls to crypto_acomp_compress() and
crypto_acomp_decompress() in synchronous wrappers, will block
indefinitely. Even before zswap can notice this problem, the crypto
testmgr.c's test_acomp() will fail and prevent registration of
"deflate-iaa" as a valid crypto acomp algorithm, thereby disallowing the
use of "deflate-iaa" as a zswap compress (zswap will fall-back to the
default compressor in this case).
To fix this issue, this patch modifies the iaa_crypto sync_mode set
function to treat 'async' equivalent to 'sync', so that the correct and
only supported driver async polling without interrupts implementation is
enabled, and zswap can use 'deflate-iaa' as the compressor.
Hence, with this patch, this is what will happen:
echo async > /sys/bus/dsa/drivers/crypto/sync_mode
cat /sys/bus/dsa/drivers/crypto/sync_mode
sync
There are no crypto/testmgr.c test_acomp() errors, no call traces and zswap
can use 'deflate-iaa' without any errors. The iaa_crypto documentation has
also been updated to mention this caveat with 'async' and what to expect
with this fix.
True iaa_crypto async polling without interrupts is enabled in patch
"crypto: iaa - Implement batch_compress(), batch_decompress() API in
iaa_crypto." [1] which is under review as part of the "zswap IAA compress
batching" patch-series [2]. Until this is merged, we would appreciate it if
this current patch can be considered for a hotfix.
[1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-5-kanchana.p.sridhar@intel.com/
[2]: https://patchwork.kernel.org/project/linux-mm/list/?series=920084
Fixes: 09646c98d ("crypto: iaa - Add irq support for the crypto async interface")
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the existing export namespace code along the same lines of
commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.
Scripted using
git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
do
awk -i inplace '
/^#define EXPORT_SYMBOL_NS/ {
gsub(/__stringify\(ns\)/, "ns");
print;
next;
}
/^#define MODULE_IMPORT_NS/ {
gsub(/__stringify\(ns\)/, "ns");
print;
next;
}
/MODULE_IMPORT_NS/ {
$0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
}
/EXPORT_SYMBOL_NS/ {
if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
$0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
$0 !~ /^my/) {
getline line;
gsub(/[[:space:]]*\\$/, "");
gsub(/[[:space:]]/, "", line);
$0 = $0 " " line;
}
$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
"\\1(\\2, \"\\3\")", "g");
}
}
{ print }' $file;
done
Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For iaa_crypto operations, it's assumed that if an operation doesn't
make progress, the IAA watchdog timer will kick in and set the
completion status bit to failure and the reason to completion timeout.
Some systems may have broken hardware that doesn't even do that, which
can result in an infinite status-checking loop. Add a check for that
in the loop, and disable the driver if it occurs.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The free_device_compression_mode(iaa_device, device_mode) function frees
"device_mode" but it iss passed to iaa_compression_modes[i]->free() a few
lines later resulting in a use after free.
The good news is that, so far as I can tell, nothing implements the
->free() function and the use after free happens in dead code. But, with
this fix, when something does implement it, we'll be ready. :)
Fixes: b190447e0fa3 ("crypto: iaa - Add compression mode management along with fixed mode")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following two Coccinelle/coccicheck warnings reported by
memdup.cocci:
iaa_crypto_main.c:350:19-26: WARNING opportunity for kmemdup
iaa_crypto_main.c:358:18-25: WARNING opportunity for kmemdup
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If some cpus are offlined, or if the node mask is smaller than
expected, the 'nonexistent cpu' warning in rebalance_wq_table() may be
erroneously triggered.
Use cpumask_weight() to make sure we only iterate over the exact
number of cpus in the mask.
Also use num_possible_cpus() instead of num_online_cpus() to make sure
all slots in the wq table are initialized.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
| |
As part of the simplification/cleanup of the iaa statistics, remove
the comp/decomp delay statistics.
They're actually not really useful and can be/are being more flexibly
generated using standard kernel tracing infrastructure.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
Decomp stats should use slen, not dlen. Change both the global and
per-wq stats to use the correct value.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If nr_cpus < nr_iaa, the calculated cpus_per_iaa will be 0, which
causes a divide-by-0 in rebalance_wq_table().
Make sure cpus_per_iaa is 1 in that case, and also in the nr_iaa == 0
case, even though cpus_per_iaa is never used if nr_iaa == 0, for
paranoia.
Cc: <stable@vger.kernel.org> # v6.8+
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
| |
Add the missing CRYPTO_ALG_ASYNC flag since intel iaa driver
works asynchronously.
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comp/decomp delay statistics currently have no callers; somehow
they were dropped during refactoring. There originally were also two
sets, one for the async algorithm, the other for the synchronous
version. Because the synchronous algorithm was dropped, one set should
be removed. To keep it consistent with the rest of the stats, and
since there's no ambiguity, remove the acomp/adecomp versions. Also
add back the callers.
Reported-by: Rex Zhang <rex.zhang@intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
| |
The disable_async paths of iaa_compress/decompress() don't free idxd
descriptors in the async_disable case. Currently this only happens in
the testcases where req->dst is set to null. Add a test to free them
in those paths.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The header table and related code is currently unused - it was
included and used for canned mode, but canned mode has been removed,
so this code can be safely removed as well.
This indirectly fixes a bug reported by Dan Carpenter.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-crypto/b2e0bd974981291e16882686a2b9b1db3986abe4.camel@linux.intel.com/T/#m4403253d6a4347a925fab4fc1cdb4ef7c095fb86
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some configurations e.g. systems with CXL, a numa node can have 0
cpus and cpumask_nth() will return a cpu value that doesn't exist,
which will result in an attempt to add an entry to the wq table at a
bad index.
To fix this, when iterating the cpus for a node, skip any node that
doesn't have cpus.
Also, as a precaution, add a warning and bail if cpumask_nth() returns
a nonexistent cpu.
Reported-by: Zhang, Rex <rex.zhang@intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
No functional modification involved.
./drivers/crypto/intel/iaa/iaa_crypto_main.c:979:2-3: Unneeded semicolon.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7772
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order for shared workqeues to work properly, desc->priv should be
set to 0 rather than 1. The need for this is described in commit
f5ccf55e1028 (dmaengine/idxd: Re-enable kernel workqueue under DMA
API), so we need to make IAA consistent with IOMMU settings, otherwise
we get:
[ 141.948389] IOMMU: dmar15: Page request in Privilege Mode
[ 141.948394] dmar15: Invalid page request: 2000026a100101 ffffb167
Dedicated workqueues ignore this field and are unaffected.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for optional debugfs statistics support for the IAA
Compression Accelerator. This is enabled by the kernel config item:
CRYPTO_DEV_IAA_CRYPTO_STATS
When enabled, the IAA crypto driver will generate statistics which can
be accessed at /sys/kernel/debug/iaa-crypto/.
See Documentation/driver-api/crypto/iax/iax-crypto.rst for details.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing iaa crypto async support provides an implementation that
satisfies the interface but does so in a synchronous manner - it fills
and submits the IDXD descriptor and then waits for it to complete
before returning. This isn't a problem at the moment, since all
existing callers (e.g. zswap) wrap any asynchronous callees in a
synchronous wrapper anyway.
This change makes the iaa crypto async implementation truly
asynchronous: it fills and submits the IDXD descriptor, then returns
immediately with -EINPROGRESS. It also sets the descriptor's 'request
completion irq' bit and sets up a callback with the IDXD driver which
is called when the operation completes and the irq fires. The
existing callers such as zswap use synchronous wrappers to deal with
-EINPROGRESS and so work as expected without any changes.
This mode can be enabled by writing 'async_irq' to the sync_mode
iaa_crypto driver attribute:
echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode
Async mode without interrupts (caller must poll) can be enabled by
writing 'async' to it:
echo async > /sys/bus/dsa/drivers/crypto/sync_mode
The default sync mode can be enabled by writing 'sync' to it:
echo sync > /sys/bus/dsa/drivers/crypto/sync_mode
The sync_mode value setting at the time the IAA algorithms are
registered is captured in each algorithm's crypto_ctx and used for all
compresses and decompresses when using a given algorithm.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch registers the deflate-iaa deflate compression algorithm and
hooks it up to the IAA hardware using the 'fixed' compression mode
introduced in the previous patch.
Because the IAA hardware has a 4k history-window limitation, only
buffers <= 4k, or that have been compressed using a <= 4k history
window, are technically compliant with the deflate spec, which allows
for a window of up to 32k. Because of this limitation, the IAA fixed
mode deflate algorithm is given its own algorithm name, 'deflate-iaa'.
With this change, the deflate-iaa crypto algorithm is registered and
operational, and compression and decompression operations are fully
enabled following the successful binding of the first IAA workqueue
to the iaa_crypto sub-driver.
when there are no IAA workqueues bound to the driver, the IAA crypto
algorithm can be unregistered by removing the module.
A new iaa_crypto 'verify_compress' driver attribute is also added,
allowing the user to toggle compression verification. If set, each
compress will be internally decompressed and the contents verified,
returning error codes if unsuccessful. This can be toggled with 0/1:
echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress
The default setting is '1' - verify all compresses.
The verify_compress value setting at the time the algorithm is
registered is captured in the algorithm's crypto_ctx and used for all
compresses when using the algorithm.
[ Based on work originally by George Powley, Jing Lin and Kyung Min
Park ]
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Define an in-kernel API for adding and removing compression modes,
which can be used by kernel modules or other kernel code that
implements IAA compression modes.
Also add a separate file, iaa_crypto_comp_fixed.c, containing huffman
tables generated for the IAA 'fixed' compression mode. Future
compression modes can be added in a similar fashion.
One or more crypto compression algorithms will be created for each
compression mode, each of which can be selected as the compression
algorithm to be used by a particular facility.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The iaa compression/decompression algorithms in later patches need a
way to retrieve an appropriate IAA workqueue depending on how close
the associated IAA device is to the current cpu.
For this purpose, add a per-cpu array of workqueues such that an
appropriate workqueue can be retrieved by simply accessing the per-cpu
array.
Whenever a new workqueue is bound to or unbound from the iaa_crypto
driver, the available workqueues are 'rebalanced' such that work
submitted from a particular CPU is given to the most appropriate
workqueue available. There currently isn't any way for the user to
tweak the way this is done internally - if necessary, knobs can be
added later for that purpose. Current best practice is to configure
and bind at least one workqueue for each IAA device, but as long as
there is at least one workqueue configured and bound to any IAA device
in the system, the iaa_crypto driver will work, albeit most likely not
as efficiently.
[ Based on work originally by George Powley, Jing Lin and Kyung Min
Park ]
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The Intel Analytics Accelerator (IAA) is a hardware accelerator that
provides very high thoughput compression/decompression compatible with
the DEFLATE compression standard described in RFC 1951, which is the
compression/decompression algorithm exported by this module.
Users can select IAA compress/decompress acceleration by specifying
one of the deflate-iaa* algorithms as the compression algorithm to use
by whatever facility allows asynchronous compression algorithms to be
selected.
For example, zswap can select the IAA fixed deflate algorithm
'deflate-iaa' via:
# echo deflate-iaa > /sys/module/zswap/parameters/compressor
This patch adds iaa_crypto as an idxd sub-driver and tracks iaa
devices and workqueues as they are probed or removed.
[ Based on work originally by George Powley, Jing Lin and Kyung Min
Park ]
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|