diff options
author | Nick Terrell <terrelln@meta.com> | 2025-03-08 12:09:33 -0800 |
---|---|---|
committer | Nick Terrell <terrelln@meta.com> | 2025-03-13 13:25:58 -0700 |
commit | 65d1f5507ed2c78c64fce40e44e5574a9419eb09 (patch) | |
tree | 4a1b819db2ea7f2a322e9bcc7582d946d0a4ea29 /lib/zstd/common/portability_macros.h | |
parent | 7eb172143d5508b4da468ed59ee857c6e5e01da6 (diff) | |
download | linux-65d1f5507ed2c78c64fce40e44e5574a9419eb09.tar.gz linux-65d1f5507ed2c78c64fce40e44e5574a9419eb09.tar.bz2 linux-65d1f5507ed2c78c64fce40e44e5574a9419eb09.zip |
zstd: Import upstream v1.5.7
In addition to keeping the kernel's copy of zstd up to date, this update
was requested by Intel to expose upstream's APIs that allow QAT to accelerate
the LZ match finding stage of Zstd.
This patch is imported from the upstream tag v1.5.7-kernel [0], which is signed
with upstream's signing key EF8FE99528B52FFD [1]. It was imported from upstream
using this command:
export ZSTD=/path/to/repo/zstd/
export LINUX=/path/to/repo/linux/
cd "$ZSTD/contrib/linux-kernel"
git checkout v1.5.7-kernel
make import LINUX="$LINUX"
This patch has been tested on x86-64, and has been boot tested with
a zstd compressed kernel & initramfs on i386 and aarch64. I benchmarked
the patch on x86-64 with gcc-14.2.1 on an Intel i9-9900K by measruing the
performance of compressed filesystem reads and writes.
Component, Level, Size delta, C. time delta, D. time delta
Btrfs , 1, +0.00%, -6.1%, +1.4%
Btrfs , 3, +0.00%, -9.8%, +3.0%
Btrfs , 5, +0.00%, +1.7%, +1.4%
Btrfs , 7, +0.00%, -1.9%, +2.7%
Btrfs , 9, +0.00%, -3.4%, +3.7%
Btrfs , 15, +0.00%, -0.3%, +3.6%
SquashFS , 1, +0.00%, N/A, +1.9%
The major changes that impact the kernel use cases for each version are:
v1.5.7: https://github.com/facebook/zstd/releases/tag/v1.5.7
* Add zstd_compress_sequences_and_literals() for use by Intel's QAT driver
to implement Zstd compression acceleration in the kernel.
* Fix an underflow bug in 32-bit builds that can cause data corruption when
processing more than 4GB of data with a single `ZSTD_CCtx` object, when an
input crosses the 4GB boundry. I don't believe this impacts any current kernel
use cases, because the `ZSTD_CCtx` is typically reconstructed between
compressions.
* Levels 1-4 see 5-10% compression speed improvements for inputs smaller than
128KB.
v1.5.6: https://github.com/facebook/zstd/releases/tag/v1.5.6
* Improved compression ratio for the highest compression levels. I don't expect
these see much use however, due to their slow speeds.
v1.5.5: https://github.com/facebook/zstd/releases/tag/v1.5.5
* Fix a rare corruption bug that can trigger on levels 13 and above.
* Improve compression speed of levels 5-11 on incompressible data.
v1.5.4: https://github.com/facebook/zstd/releases/tag/v1.5.4
* Improve copmression speed of levels 5-11 on ARM.
* Improve dictionary compression speed.
Signed-off-by: Nick Terrell <terrelln@fb.com>
Diffstat (limited to 'lib/zstd/common/portability_macros.h')
-rw-r--r-- | lib/zstd/common/portability_macros.h | 45 |
1 files changed, 35 insertions, 10 deletions
diff --git a/lib/zstd/common/portability_macros.h b/lib/zstd/common/portability_macros.h index 0e3b2c0a527d..05286af72683 100644 --- a/lib/zstd/common/portability_macros.h +++ b/lib/zstd/common/portability_macros.h @@ -1,5 +1,6 @@ +/* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */ /* - * Copyright (c) Facebook, Inc. + * Copyright (c) Meta Platforms, Inc. and affiliates. * All rights reserved. * * This source code is licensed under both the BSD-style license (found in the @@ -12,7 +13,7 @@ #define ZSTD_PORTABILITY_MACROS_H /* - * This header file contains macro defintions to support portability. + * This header file contains macro definitions to support portability. * This header is shared between C and ASM code, so it MUST only * contain macro definitions. It MUST not contain any C code. * @@ -45,30 +46,35 @@ /* Mark the internal assembly functions as hidden */ #ifdef __ELF__ # define ZSTD_HIDE_ASM_FUNCTION(func) .hidden func +#elif defined(__APPLE__) +# define ZSTD_HIDE_ASM_FUNCTION(func) .private_extern func #else # define ZSTD_HIDE_ASM_FUNCTION(func) #endif +/* Compile time determination of BMI2 support */ + + /* Enable runtime BMI2 dispatch based on the CPU. * Enabled for clang & gcc >=4.8 on x86 when BMI2 isn't enabled by default. */ #ifndef DYNAMIC_BMI2 - #if ((defined(__clang__) && __has_attribute(__target__)) \ +# if ((defined(__clang__) && __has_attribute(__target__)) \ || (defined(__GNUC__) \ && (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)))) \ - && (defined(__x86_64__) || defined(_M_X64)) \ + && (defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)) \ && !defined(__BMI2__) - # define DYNAMIC_BMI2 1 - #else - # define DYNAMIC_BMI2 0 - #endif +# define DYNAMIC_BMI2 1 +# else +# define DYNAMIC_BMI2 0 +# endif #endif /* - * Only enable assembly for GNUC comptabile compilers, + * Only enable assembly for GNU C compatible compilers, * because other platforms may not support GAS assembly syntax. * - * Only enable assembly for Linux / MacOS, other platforms may + * Only enable assembly for Linux / MacOS / Win32, other platforms may * work, but they haven't been tested. This could likely be * extended to BSD systems. * @@ -90,4 +96,23 @@ */ #define ZSTD_ENABLE_ASM_X86_64_BMI2 0 +/* + * For x86 ELF targets, add .note.gnu.property section for Intel CET in + * assembly sources when CET is enabled. + * + * Additionally, any function that may be called indirectly must begin + * with ZSTD_CET_ENDBRANCH. + */ +#if defined(__ELF__) && (defined(__x86_64__) || defined(__i386__)) \ + && defined(__has_include) +# if __has_include(<cet.h>) +# include <cet.h> +# define ZSTD_CET_ENDBRANCH _CET_ENDBR +# endif +#endif + +#ifndef ZSTD_CET_ENDBRANCH +# define ZSTD_CET_ENDBRANCH +#endif + #endif /* ZSTD_PORTABILITY_MACROS_H */ |