summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2024-11-18 18:10:37 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2024-11-18 18:10:37 -0800
commitba1f9c8fe3d443a78814cdf8ac8f9829b5ca7095 (patch)
tree121eccce6f426f5d2adfc7f285144fd7956a4357 /Documentation
parent9aa4c37f71b9a0a4f51692fbb874888c139805ce (diff)
parent83ef4a378e563d085ddd7214c2a393116b5f3435 (diff)
downloadlinux-stable-ba1f9c8fe3d443a78814cdf8ac8f9829b5ca7095.tar.gz
linux-stable-ba1f9c8fe3d443a78814cdf8ac8f9829b5ca7095.tar.bz2
linux-stable-ba1f9c8fe3d443a78814cdf8ac8f9829b5ca7095.zip
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas: - Support for running Linux in a protected VM under the Arm Confidential Compute Architecture (CCA) - Guarded Control Stack user-space support. Current patches follow the x86 ABI of implicitly creating a shadow stack on clone(). Subsequent patches (already on the list) will add support for clone3() allowing finer-grained control of the shadow stack size and placement from libc - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are getting close with the upcoming dpISA support) - Other arch features: - In-kernel use of the memcpy instructions, FEAT_MOPS (previously only exposed to user; uaccess support not merged yet) - MTE: hugetlbfs support and the corresponding kselftests - Optimise CRC32 using the PMULL instructions - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG - Optimise the kernel TLB flushing to use the range operations - POE/pkey (permission overlays): further cleanups after bringing the signal handler in line with the x86 behaviour for 6.12 - arm64 perf updates: - Support for the NXP i.MX91 PMU in the existing IMX driver - Support for Ampere SoCs in the Designware PCIe PMU driver - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC - Support for Samsung's 'Mongoose' CPU PMU - Support for PMUv3.9 finer-grained userspace counter access control - Switch back to platform_driver::remove() now that it returns 'void' - Add some missing events for the CXL PMU driver - Miscellaneous arm64 fixes/cleanups: - Page table accessors cleanup: type updates, drop unused macros, reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity check addresses before runtime P4D/PUD folding - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the FEAT_ECV for the generic timers) allowing Linux to boot with firmware deployments that don't set SCTLR_EL3.ECVEn - ACPI/arm64: tighten the check for the array of platform timer structures and adjust the error handling procedure in gtdt_parse_timer_block() - Optimise the cache flush for the uprobes xol slot (skip if no change) and other uprobes/kprobes cleanups - Fix the context switching of tpidrro_el0 when kpti is enabled - Dynamic shadow call stack fixes - Sysreg updates - Various arm64 kselftest improvements * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits) arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled kselftest/arm64: Try harder to generate different keys during PAC tests kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() arm64/ptrace: Clarify documentation of VL configuration via ptrace kselftest/arm64: Corrupt P0 in the irritator when testing SSVE acpi/arm64: remove unnecessary cast arm64/mm: Change protval as 'pteval_t' in map_range() kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c kselftest/arm64: Add FPMR coverage to fp-ptrace kselftest/arm64: Expand the set of ZA writes fp-ptrace does kselftets/arm64: Use flag bits for features in fp-ptrace assembler code kselftest/arm64: Enable build of PAC tests with LLVM=1 kselftest/arm64: Check that SVCR is 0 in signal handlers selftests/mm: Fix unused function warning for aarch64_write_signal_pkey() kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests kselftest/arm64: Fix build with stricter assemblers arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux() arm64/scs: Deal with 64-bit relative offsets in FDE frames ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt3
-rw-r--r--Documentation/admin-guide/perf/index.rst1
-rw-r--r--Documentation/admin-guide/perf/mrvl-pem-pmu.rst56
-rw-r--r--Documentation/arch/arm64/arm-cca.rst69
-rw-r--r--Documentation/arch/arm64/booting.rst38
-rw-r--r--Documentation/arch/arm64/elf_hwcaps.rst10
-rw-r--r--Documentation/arch/arm64/gcs.rst227
-rw-r--r--Documentation/arch/arm64/index.rst3
-rw-r--r--Documentation/arch/arm64/mops.rst44
-rw-r--r--Documentation/arch/arm64/sme.rst4
-rw-r--r--Documentation/arch/arm64/sve.rst4
-rw-r--r--Documentation/devicetree/bindings/arm/pmu.yaml1
-rw-r--r--Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml4
-rw-r--r--Documentation/filesystems/proc.rst2
14 files changed, 461 insertions, 5 deletions
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d401577b5a6a..f2875484795a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -446,6 +446,9 @@
arm64.nobti [ARM64] Unconditionally disable Branch Target
Identification support
+ arm64.nogcs [ARM64] Unconditionally disable Guarded Control Stack
+ support
+
arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
Set instructions support
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 8502bc174640..a58bd3f7e190 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -26,3 +26,4 @@ Performance monitor support
meson-ddr-pmu
cxl
ampere_cspmu
+ mrvl-pem-pmu
diff --git a/Documentation/admin-guide/perf/mrvl-pem-pmu.rst b/Documentation/admin-guide/perf/mrvl-pem-pmu.rst
new file mode 100644
index 000000000000..c39007149b97
--- /dev/null
+++ b/Documentation/admin-guide/perf/mrvl-pem-pmu.rst
@@ -0,0 +1,56 @@
+=================================================================
+Marvell Odyssey PEM Performance Monitoring Unit (PMU UNCORE)
+=================================================================
+
+The PCI Express Interface Units(PEM) are associated with a corresponding
+monitoring unit. This includes performance counters to track various
+characteristics of the data that is transmitted over the PCIe link.
+
+The counters track inbound and outbound transactions which
+includes separate counters for posted/non-posted/completion TLPs.
+Also, inbound and outbound memory read requests along with their
+latencies can also be monitored. Address Translation Services(ATS)events
+such as ATS Translation, ATS Page Request, ATS Invalidation along with
+their corresponding latencies are also tracked.
+
+There are separate 64 bit counters to measure posted/non-posted/completion
+tlps in inbound and outbound transactions. ATS events are measured by
+different counters.
+
+The PMU driver exposes the available events and format options under sysfs,
+/sys/bus/event_source/devices/mrvl_pcie_rc_pmu_<>/events/
+/sys/bus/event_source/devices/mrvl_pcie_rc_pmu_<>/format/
+
+Examples::
+
+ # perf list | grep mrvl_pcie_rc_pmu
+ mrvl_pcie_rc_pmu_<>/ats_inv/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ats_inv_latency/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ats_pri/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ats_pri_latency/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ats_trans/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ats_trans_latency/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_inflight/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_reads/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_req_no_ro_ebus/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_req_no_ro_ncb/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_cpl_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_cpl_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_npr/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_pr/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_npr/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ib_tlp_pr/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_inflight_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_merges_cpl_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_merges_npr_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_merges_pr_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_reads_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_cpl_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_cpl_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_npr_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_pr_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_npr_partid/ [Kernel PMU event]
+ mrvl_pcie_rc_pmu_<>/ob_tlp_pr_partid/ [Kernel PMU event]
+
+
+ # perf stat -e ib_inflight,ib_reads,ib_req_no_ro_ebus,ib_req_no_ro_ncb <workload>
diff --git a/Documentation/arch/arm64/arm-cca.rst b/Documentation/arch/arm64/arm-cca.rst
new file mode 100644
index 000000000000..c48b7d4ab6bd
--- /dev/null
+++ b/Documentation/arch/arm64/arm-cca.rst
@@ -0,0 +1,69 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+Arm Confidential Compute Architecture
+=====================================
+
+Arm systems that support the Realm Management Extension (RME) contain
+hardware to allow a VM guest to be run in a way which protects the code
+and data of the guest from the hypervisor. It extends the older "two
+world" model (Normal and Secure World) into four worlds: Normal, Secure,
+Root and Realm. Linux can then also be run as a guest to a monitor
+running in the Realm world.
+
+The monitor running in the Realm world is known as the Realm Management
+Monitor (RMM) and implements the Realm Management Monitor
+specification[1]. The monitor acts a bit like a hypervisor (e.g. it runs
+in EL2 and manages the stage 2 page tables etc of the guests running in
+Realm world), however much of the control is handled by a hypervisor
+running in the Normal World. The Normal World hypervisor uses the Realm
+Management Interface (RMI) defined by the RMM specification to request
+the RMM to perform operations (e.g. mapping memory or executing a vCPU).
+
+The RMM defines an environment for guests where the address space (IPA)
+is split into two. The lower half is protected - any memory that is
+mapped in this half cannot be seen by the Normal World and the RMM
+restricts what operations the Normal World can perform on this memory
+(e.g. the Normal World cannot replace pages in this region without the
+guest's cooperation). The upper half is shared, the Normal World is free
+to make changes to the pages in this region, and is able to emulate MMIO
+devices in this region too.
+
+A guest running in a Realm may also communicate with the RMM using the
+Realm Services Interface (RSI) to request changes in its environment or
+to perform attestation about its environment. In particular it may
+request that areas of the protected address space are transitioned
+between 'RAM' and 'EMPTY' (in either direction). This allows a Realm
+guest to give up memory to be returned to the Normal World, or to
+request new memory from the Normal World. Without an explicit request
+from the Realm guest the RMM will otherwise prevent the Normal World
+from making these changes.
+
+Linux as a Realm Guest
+----------------------
+
+To run Linux as a guest within a Realm, the following must be provided
+either by the VMM or by a `boot loader` run in the Realm before Linux:
+
+ * All protected RAM described to Linux (by DT or ACPI) must be marked
+ RIPAS RAM before handing control over to Linux.
+
+ * MMIO devices must be either unprotected (e.g. emulated by the Normal
+ World) or marked RIPAS DEV.
+
+ * MMIO devices emulated by the Normal World and used very early in boot
+ (specifically earlycon) must be specified in the upper half of IPA.
+ For earlycon this can be done by specifying the address on the
+ command line, e.g. with an IPA size of 33 bits and the base address
+ of the emulated UART at 0x1000000: ``earlycon=uart,mmio,0x101000000``
+
+ * Linux will use bounce buffers for communicating with unprotected
+ devices. It will transition some protected memory to RIPAS EMPTY and
+ expect to be able to access unprotected pages at the same IPA address
+ but with the highest valid IPA bit set. The expectation is that the
+ VMM will remove the physical pages from the protected mapping and
+ provide those pages as unprotected pages.
+
+References
+----------
+[1] https://developer.arm.com/documentation/den0137/
diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
index b57776a68f15..3278fb4bf219 100644
--- a/Documentation/arch/arm64/booting.rst
+++ b/Documentation/arch/arm64/booting.rst
@@ -41,6 +41,9 @@ to automatically locate and size all RAM, or it may use knowledge of
the RAM in the machine, or any other method the boot loader designer
sees fit.)
+For Arm Confidential Compute Realms this includes ensuring that all
+protected RAM has a Realm IPA state (RIPAS) of "RAM".
+
2. Setup the device tree
-------------------------
@@ -385,6 +388,9 @@ Before jumping into the kernel, the following conditions must be met:
- HCRX_EL2.MSCEn (bit 11) must be initialised to 0b1.
+ - HCRX_EL2.MCE2 (bit 10) must be initialised to 0b1 and the hypervisor
+ must handle MOPS exceptions as described in :ref:`arm64_mops_hyp`.
+
For CPUs with the Extended Translation Control Register feature (FEAT_TCR2):
- If EL3 is present:
@@ -411,6 +417,38 @@ Before jumping into the kernel, the following conditions must be met:
- HFGRWR_EL2.nPIRE0_EL1 (bit 57) must be initialised to 0b1.
+ - For CPUs with Guarded Control Stacks (FEAT_GCS):
+
+ - GCSCR_EL1 must be initialised to 0.
+
+ - GCSCRE0_EL1 must be initialised to 0.
+
+ - If EL3 is present:
+
+ - SCR_EL3.GCSEn (bit 39) must be initialised to 0b1.
+
+ - If EL2 is present:
+
+ - GCSCR_EL2 must be initialised to 0.
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - HCRX_EL2.GCSEn must be initialised to 0b1.
+
+ - HFGITR_EL2.nGCSEPP (bit 59) must be initialised to 0b1.
+
+ - HFGITR_EL2.nGCSSTR_EL1 (bit 58) must be initialised to 0b1.
+
+ - HFGITR_EL2.nGCSPUSHM_EL1 (bit 57) must be initialised to 0b1.
+
+ - HFGRTR_EL2.nGCS_EL1 (bit 53) must be initialised to 0b1.
+
+ - HFGRTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
+
+ - HFGWTR_EL2.nGCS_EL1 (bit 53) must be initialised to 0b1.
+
+ - HFGWTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
+
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented
diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index 694f67fa07d1..2ff922a406ad 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -16,9 +16,9 @@ architected discovery mechanism available to userspace code at EL0. The
kernel exposes the presence of these features to userspace through a set
of flags called hwcaps, exposed in the auxiliary vector.
-Userspace software can test for features by acquiring the AT_HWCAP or
-AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
-flags are set, e.g.::
+Userspace software can test for features by acquiring the AT_HWCAP,
+AT_HWCAP2 or AT_HWCAP3 entry of the auxiliary vector, and testing
+whether the relevant flags are set, e.g.::
bool floating_point_is_present(void)
{
@@ -170,6 +170,10 @@ HWCAP_PACG
ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
Documentation/arch/arm64/pointer-authentication.rst.
+HWCAP_GCS
+ Functionality implied by ID_AA64PFR1_EL1.GCS == 0b1, as
+ described by Documentation/arch/arm64/gcs.rst.
+
HWCAP2_DCPODP
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
diff --git a/Documentation/arch/arm64/gcs.rst b/Documentation/arch/arm64/gcs.rst
new file mode 100644
index 000000000000..1f65a3193e77
--- /dev/null
+++ b/Documentation/arch/arm64/gcs.rst
@@ -0,0 +1,227 @@
+===============================================
+Guarded Control Stack support for AArch64 Linux
+===============================================
+
+This document outlines briefly the interface provided to userspace by Linux in
+order to support use of the ARM Guarded Control Stack (GCS) feature.
+
+This is an outline of the most important features and issues only and not
+intended to be exhaustive.
+
+
+
+1. General
+-----------
+
+* GCS is an architecture feature intended to provide greater protection
+ against return oriented programming (ROP) attacks and to simplify the
+ implementation of features that need to collect stack traces such as
+ profiling.
+
+* When GCS is enabled a separate guarded control stack is maintained by the
+ PE which is writeable only through specific GCS operations. This
+ stores the call stack only, when a procedure call instruction is
+ performed the current PC is pushed onto the GCS and on RET the
+ address in the LR is verified against that on the top of the GCS.
+
+* When active the current GCS pointer is stored in the system register
+ GCSPR_EL0. This is readable by userspace but can only be updated
+ via specific GCS instructions.
+
+* The architecture provides instructions for switching between guarded
+ control stacks with checks to ensure that the new stack is a valid
+ target for switching.
+
+* The functionality of GCS is similar to that provided by the x86 Shadow
+ Stack feature, due to sharing of userspace interfaces the ABI refers to
+ shadow stacks rather than GCS.
+
+* Support for GCS is reported to userspace via HWCAP_GCS in the aux vector
+ AT_HWCAP2 entry.
+
+* GCS is enabled per thread. While there is support for disabling GCS
+ at runtime this should be done with great care.
+
+* GCS memory access faults are reported as normal memory access faults.
+
+* GCS specific errors (those reported with EC 0x2d) will be reported as
+ SIGSEGV with a si_code of SEGV_CPERR (control protection error).
+
+* GCS is supported only for AArch64.
+
+* On systems where GCS is supported GCSPR_EL0 is always readable by EL0
+ regardless of the GCS configuration for the thread.
+
+* The architecture supports enabling GCS without verifying that return values
+ in LR match those in the GCS, the LR will be ignored. This is not supported
+ by Linux.
+
+
+
+2. Enabling and disabling Guarded Control Stacks
+-------------------------------------------------
+
+* GCS is enabled and disabled for a thread via the PR_SET_SHADOW_STACK_STATUS
+ prctl(), this takes a single flags argument specifying which GCS features
+ should be used.
+
+* When set PR_SHADOW_STACK_ENABLE flag allocates a Guarded Control Stack
+ and enables GCS for the thread, enabling the functionality controlled by
+ GCSCRE0_EL1.{nTR, RVCHKEN, PCRSEL}.
+
+* When set the PR_SHADOW_STACK_PUSH flag enables the functionality controlled
+ by GCSCRE0_EL1.PUSHMEn, allowing explicit GCS pushes.
+
+* When set the PR_SHADOW_STACK_WRITE flag enables the functionality controlled
+ by GCSCRE0_EL1.STREn, allowing explicit stores to the Guarded Control Stack.
+
+* Any unknown flags will cause PR_SET_SHADOW_STACK_STATUS to return -EINVAL.
+
+* PR_LOCK_SHADOW_STACK_STATUS is passed a bitmask of features with the same
+ values as used for PR_SET_SHADOW_STACK_STATUS. Any future changes to the
+ status of the specified GCS mode bits will be rejected.
+
+* PR_LOCK_SHADOW_STACK_STATUS allows any bit to be locked, this allows
+ userspace to prevent changes to any future features.
+
+* There is no support for a process to remove a lock that has been set for
+ it.
+
+* PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS affect only the
+ thread that called them, any other running threads will be unaffected.
+
+* New threads inherit the GCS configuration of the thread that created them.
+
+* GCS is disabled on exec().
+
+* The current GCS configuration for a thread may be read with the
+ PR_GET_SHADOW_STACK_STATUS prctl(), this returns the same flags that
+ are passed to PR_SET_SHADOW_STACK_STATUS.
+
+* If GCS is disabled for a thread after having previously been enabled then
+ the stack will remain allocated for the lifetime of the thread. At present
+ any attempt to reenable GCS for the thread will be rejected, this may be
+ revisited in future.
+
+* It should be noted that since enabling GCS will result in GCS becoming
+ active immediately it is not normally possible to return from the function
+ that invoked the prctl() that enabled GCS. It is expected that the normal
+ usage will be that GCS is enabled very early in execution of a program.
+
+
+
+3. Allocation of Guarded Control Stacks
+----------------------------------------
+
+* When GCS is enabled for a thread a new Guarded Control Stack will be
+ allocated for it of half the standard stack size or 2 gigabytes,
+ whichever is smaller.
+
+* When a new thread is created by a thread which has GCS enabled then a
+ new Guarded Control Stack will be allocated for the new thread with
+ half the size of the standard stack.
+
+* When a stack is allocated by enabling GCS or during thread creation then
+ the top 8 bytes of the stack will be initialised to 0 and GCSPR_EL0 will
+ be set to point to the address of this 0 value, this can be used to
+ detect the top of the stack.
+
+* Additional Guarded Control Stacks can be allocated using the
+ map_shadow_stack() system call.
+
+* Stacks allocated using map_shadow_stack() can optionally have an end of
+ stack marker and cap placed at the top of the stack. If the flag
+ SHADOW_STACK_SET_TOKEN is specified a cap will be placed on the stack,
+ if SHADOW_STACK_SET_MARKER is not specified the cap will be the top 8
+ bytes of the stack and if it is specified then the cap will be the next
+ 8 bytes. While specifying just SHADOW_STACK_SET_MARKER by itself is
+ valid since the marker is all bits 0 it has no observable effect.
+
+* Stacks allocated using map_shadow_stack() must have a size which is a
+ multiple of 8 bytes larger than 8 bytes and must be 8 bytes aligned.
+
+* An address can be specified to map_shadow_stack(), if one is provided then
+ it must be aligned to a page boundary.
+
+* When a thread is freed the Guarded Control Stack initially allocated for
+ that thread will be freed. Note carefully that if the stack has been
+ switched this may not be the stack currently in use by the thread.
+
+
+4. Signal handling
+--------------------
+
+* A new signal frame record gcs_context encodes the current GCS mode and
+ pointer for the interrupted context on signal delivery. This will always
+ be present on systems that support GCS.
+
+* The record contains a flag field which reports the current GCS configuration
+ for the interrupted context as PR_GET_SHADOW_STACK_STATUS would.
+
+* The signal handler is run with the same GCS configuration as the interrupted
+ context.
+
+* When GCS is enabled for the interrupted thread a signal handling specific
+ GCS cap token will be written to the GCS, this is an architectural GCS cap
+ with the token type (bits 0..11) all clear. The GCSPR_EL0 reported in the
+ signal frame will point to this cap token.
+
+* The signal handler will use the same GCS as the interrupted context.
+
+* When GCS is enabled on signal entry a frame with the address of the signal
+ return handler will be pushed onto the GCS, allowing return from the signal
+ handler via RET as normal. This will not be reported in the gcs_context in
+ the signal frame.
+
+
+5. Signal return
+-----------------
+
+When returning from a signal handler:
+
+* If there is a gcs_context record in the signal frame then the GCS flags
+ and GCSPR_EL0 will be restored from that context prior to further
+ validation.
+
+* If there is no gcs_context record in the signal frame then the GCS
+ configuration will be unchanged.
+
+* If GCS is enabled on return from a signal handler then GCSPR_EL0 must
+ point to a valid GCS signal cap record, this will be popped from the
+ GCS prior to signal return.
+
+* If the GCS configuration is locked when returning from a signal then any
+ attempt to change the GCS configuration will be treated as an error. This
+ is true even if GCS was not enabled prior to signal entry.
+
+* GCS may be disabled via signal return but any attempt to enable GCS via
+ signal return will be rejected.
+
+
+6. ptrace extensions
+---------------------
+
+* A new regset NT_ARM_GCS is defined for use with PTRACE_GETREGSET and
+ PTRACE_SETREGSET.
+
+* The GCS mode, including enable and disable, may be configured via ptrace.
+ If GCS is enabled via ptrace no new GCS will be allocated for the thread.
+
+* Configuration via ptrace ignores locking of GCS mode bits.
+
+
+7. ELF coredump extensions
+---------------------------
+
+* NT_ARM_GCS notes will be added to each coredump for each thread of the
+ dumped process. The contents will be equivalent to the data that would
+ have been read if a PTRACE_GETREGSET of the corresponding type were
+ executed for each thread when the coredump was generated.
+
+
+
+8. /proc extensions
+--------------------
+
+* Guarded Control Stack pages will include "ss" in their VmFlags in
+ /proc/<pid>/smaps.
diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
index 78544de0a8a9..6a012c98bdcd 100644
--- a/Documentation/arch/arm64/index.rst
+++ b/Documentation/arch/arm64/index.rst
@@ -10,16 +10,19 @@ ARM64 Architecture
acpi_object_usage
amu
arm-acpi
+ arm-cca
asymmetric-32bit
booting
cpu-feature-registers
cpu-hotplug
elf_hwcaps
+ gcs
hugetlbpage
kdump
legacy_instructions
memory
memory-tagging-extension
+ mops
perf
pointer-authentication
ptdump
diff --git a/Documentation/arch/arm64/mops.rst b/Documentation/arch/arm64/mops.rst
new file mode 100644
index 000000000000..2ef5b147f8dc
--- /dev/null
+++ b/Documentation/arch/arm64/mops.rst
@@ -0,0 +1,44 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Memory copy/set instructions (MOPS)
+===================================
+
+A MOPS memory copy/set operation consists of three consecutive CPY* or SET*
+instructions: a prologue, main and epilogue (for example: CPYP, CPYM, CPYE).
+
+A main or epilogue instruction can take a MOPS exception for various reasons,
+for example when a task is migrated to a CPU with a different MOPS
+implementation, or when the instruction's alignment and size requirements are
+not met. The software exception handler is then expected to reset the registers
+and restart execution from the prologue instruction. Normally this is handled
+by the kernel.
+
+For more details refer to "D1.3.5.7 Memory Copy and Memory Set exceptions" in
+the Arm Architecture Reference Manual DDI 0487K.a (Arm ARM).
+
+.. _arm64_mops_hyp:
+
+Hypervisor requirements
+-----------------------
+
+A hypervisor running a Linux guest must handle all MOPS exceptions from the
+guest kernel, as Linux may not be able to handle the exception at all times.
+For example, a MOPS exception can be taken when the hypervisor migrates a vCPU
+to another physical CPU with a different MOPS implementation.
+
+To do this, the hypervisor must:
+
+ - Set HCRX_EL2.MCE2 to 1 so that the exception is taken to the hypervisor.
+
+ - Have an exception handler that implements the algorithm from the Arm ARM
+ rules CNTMJ and MWFQH.
+
+ - Set the guest's PSTATE.SS to 0 in the exception handler, to handle a
+ potential step of the current instruction.
+
+ Note: Clearing PSTATE.SS is needed so that a single step exception is taken
+ on the next instruction (the prologue instruction). Otherwise prologue
+ would get silently stepped over and the single step exception taken on the
+ main instruction. Note that if the guest instruction is not being stepped
+ then clearing PSTATE.SS has no effect.
diff --git a/Documentation/arch/arm64/sme.rst b/Documentation/arch/arm64/sme.rst
index be317d457417..b2fa01f85cb5 100644
--- a/Documentation/arch/arm64/sme.rst
+++ b/Documentation/arch/arm64/sme.rst
@@ -346,6 +346,10 @@ The regset data starts with struct user_za_header, containing:
* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
+* If any register data is provided along with SME_PT_VL_ONEXEC then the
+ registers data will be interpreted with the current vector length, not
+ the vector length configured for use on exec.
+
8. ELF coredump extensions
---------------------------
diff --git a/Documentation/arch/arm64/sve.rst b/Documentation/arch/arm64/sve.rst
index 8d8837fc39ec..28152492c29c 100644
--- a/Documentation/arch/arm64/sve.rst
+++ b/Documentation/arch/arm64/sve.rst
@@ -402,6 +402,10 @@ The regset data starts with struct user_sve_header, containing:
streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode
if the target was not in streaming mode.
+* If any register data is provided along with SVE_PT_VL_ONEXEC then the
+ registers data will be interpreted with the current vector length, not
+ the vector length configured for use on exec.
+
* The effect of writing a partial, incomplete payload is unspecified.
diff --git a/Documentation/devicetree/bindings/arm/pmu.yaml b/Documentation/devicetree/bindings/arm/pmu.yaml
index 528544d0a161..a148ff54f2b8 100644
--- a/Documentation/devicetree/bindings/arm/pmu.yaml
+++ b/Documentation/devicetree/bindings/arm/pmu.yaml
@@ -74,6 +74,7 @@ properties:
- qcom,krait-pmu
- qcom,scorpion-pmu
- qcom,scorpion-mp-pmu
+ - samsung,mongoose-pmu
interrupts:
# Don't know how many CPUs, so no constraints to specify
diff --git a/Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml b/Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
index 37e8b98f2cdc..8597ea625edb 100644
--- a/Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
+++ b/Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
@@ -31,7 +31,9 @@ properties:
- const: fsl,imx8dxl-ddr-pmu
- const: fsl,imx8-ddr-pmu
- items:
- - const: fsl,imx95-ddr-pmu
+ - enum:
+ - fsl,imx91-ddr-pmu
+ - fsl,imx95-ddr-pmu
- const: fsl,imx93-ddr-pmu
reg:
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index e834779d9611..6a882c57a7e7 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -579,7 +579,7 @@ encoded manner. The codes are the following:
mt arm64 MTE allocation tags are enabled
um userfaultfd missing tracking
uw userfaultfd wr-protect tracking
- ss shadow stack page
+ ss shadow/guarded control stack page
sl sealed
== =======================================