Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Support for running Linux in a protected VM under the Arm
   Confidential Compute Architecture (CCA)

 - Guarded Control Stack user-space support. Current patches follow the
   x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
   patches (already on the list) will add support for clone3() allowing
   finer-grained control of the shadow stack size and placement from
   libc

 - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
   getting close with the upcoming dpISA support)

 - Other arch features:

     - In-kernel use of the memcpy instructions, FEAT_MOPS (previously
       only exposed to user; uaccess support not merged yet)

     - MTE: hugetlbfs support and the corresponding kselftests

     - Optimise CRC32 using the PMULL instructions

     - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG

     - Optimise the kernel TLB flushing to use the range operations

     - POE/pkey (permission overlays): further cleanups after bringing
       the signal handler in line with the x86 behaviour for 6.12

 - arm64 perf updates:

     - Support for the NXP i.MX91 PMU in the existing IMX driver

     - Support for Ampere SoCs in the Designware PCIe PMU driver

     - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC

     - Support for Samsung's 'Mongoose' CPU PMU

     - Support for PMUv3.9 finer-grained userspace counter access
       control

     - Switch back to platform_driver::remove() now that it returns
       'void'

     - Add some missing events for the CXL PMU driver

 - Miscellaneous arm64 fixes/cleanups:

     - Page table accessors cleanup: type updates, drop unused macros,
       reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
       check addresses before runtime P4D/PUD folding

     - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
       FEAT_ECV for the generic timers) allowing Linux to boot with
       firmware deployments that don't set SCTLR_EL3.ECVEn

     - ACPI/arm64: tighten the check for the array of platform timer
       structures and adjust the error handling procedure in
       gtdt_parse_timer_block()

     - Optimise the cache flush for the uprobes xol slot (skip if no
       change) and other uprobes/kprobes cleanups

     - Fix the context switching of tpidrro_el0 when kpti is enabled

     - Dynamic shadow call stack fixes

     - Sysreg updates

     - Various arm64 kselftest improvements

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits)
  arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
  kselftest/arm64: Try harder to generate different keys during PAC tests
  kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
  arm64/ptrace: Clarify documentation of VL configuration via ptrace
  kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
  acpi/arm64: remove unnecessary cast
  arm64/mm: Change protval as 'pteval_t' in map_range()
  kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
  kselftest/arm64: Add FPMR coverage to fp-ptrace
  kselftest/arm64: Expand the set of ZA writes fp-ptrace does
  kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
  kselftest/arm64: Enable build of PAC tests with LLVM=1
  kselftest/arm64: Check that SVCR is 0 in signal handlers
  selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()
  kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
  kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
  kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
  kselftest/arm64: Fix build with stricter assemblers
  arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
  arm64/scs: Deal with 64-bit relative offsets in FDE frames
  ...
Commit ba1f9c8fe3 by Linus Torvalds, 2024-11-18 18:10:37 -08:00
214 changed files with 7916 additions and 587 deletions

View File

@ -446,6 +446,9 @@
arm64.nobti [ARM64] Unconditionally disable Branch Target
Identification support
arm64.nogcs [ARM64] Unconditionally disable Guarded Control Stack
support
arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
Set instructions support

View File

@ -26,3 +26,4 @@ Performance monitor support
meson-ddr-pmu
cxl
ampere_cspmu
mrvl-pem-pmu

View File

@ -0,0 +1,56 @@
=================================================================
Marvell Odyssey PEM Performance Monitoring Unit (PMU UNCORE)
=================================================================
Each PCI Express Interface Unit (PEM) is associated with a corresponding
monitoring unit, which includes performance counters to track various
characteristics of the data transmitted over the PCIe link.

The counters track inbound and outbound transactions, with separate
counters for posted/non-posted/completion TLPs. Inbound and outbound
memory read requests, along with their latencies, can also be monitored.
Address Translation Services (ATS) events such as ATS Translation, ATS
Page Request and ATS Invalidation are tracked, along with their
corresponding latencies.

There are separate 64-bit counters to measure posted/non-posted/completion
TLPs in inbound and outbound transactions. ATS events are measured by
different counters.

The PMU driver exposes the available events and format options under sysfs:
/sys/bus/event_source/devices/mrvl_pcie_rc_pmu_<>/events/
/sys/bus/event_source/devices/mrvl_pcie_rc_pmu_<>/format/
Examples::
# perf list | grep mrvl_pcie_rc_pmu
mrvl_pcie_rc_pmu_<>/ats_inv/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ats_inv_latency/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ats_pri/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ats_pri_latency/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ats_trans/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ats_trans_latency/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_inflight/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_reads/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_req_no_ro_ebus/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_req_no_ro_ncb/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_cpl_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_cpl_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_npr/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_dwords_pr/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_npr/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ib_tlp_pr/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_inflight_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_merges_cpl_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_merges_npr_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_merges_pr_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_reads_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_cpl_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_cpl_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_npr_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_dwords_pr_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_npr_partid/ [Kernel PMU event]
mrvl_pcie_rc_pmu_<>/ob_tlp_pr_partid/ [Kernel PMU event]
# perf stat -e ib_inflight,ib_reads,ib_req_no_ro_ebus,ib_req_no_ro_ncb <workload>

View File

@ -0,0 +1,69 @@
.. SPDX-License-Identifier: GPL-2.0
=====================================
Arm Confidential Compute Architecture
=====================================
Arm systems that support the Realm Management Extension (RME) contain
hardware to allow a VM guest to be run in a way which protects the code
and data of the guest from the hypervisor. It extends the older "two
world" model (Normal and Secure World) into four worlds: Normal, Secure,
Root and Realm. Linux can then also be run as a guest to a monitor
running in the Realm world.

The monitor running in the Realm world is known as the Realm Management
Monitor (RMM) and implements the Realm Management Monitor
specification[1]. The monitor acts a bit like a hypervisor (e.g. it runs
in EL2 and manages the stage 2 page tables etc of the guests running in
Realm world), however much of the control is handled by a hypervisor
running in the Normal World. The Normal World hypervisor uses the Realm
Management Interface (RMI) defined by the RMM specification to request
the RMM to perform operations (e.g. mapping memory or executing a vCPU).

The RMM defines an environment for guests where the address space (IPA)
is split into two. The lower half is protected - any memory that is
mapped in this half cannot be seen by the Normal World and the RMM
restricts what operations the Normal World can perform on this memory
(e.g. the Normal World cannot replace pages in this region without the
guest's cooperation). The upper half is shared, the Normal World is free
to make changes to the pages in this region, and is able to emulate MMIO
devices in this region too.

A guest running in a Realm may also communicate with the RMM using the
Realm Services Interface (RSI) to request changes in its environment or
to perform attestation about its environment. In particular it may
request that areas of the protected address space are transitioned
between 'RAM' and 'EMPTY' (in either direction). This allows a Realm
guest to give up memory to be returned to the Normal World, or to
request new memory from the Normal World. Without an explicit request
from the Realm guest the RMM will otherwise prevent the Normal World
from making these changes.
Linux as a Realm Guest
----------------------
To run Linux as a guest within a Realm, the following must be provided
either by the VMM or by a `boot loader` run in the Realm before Linux:
* All protected RAM described to Linux (by DT or ACPI) must be marked
  RIPAS RAM before handing control over to Linux.

* MMIO devices must be either unprotected (e.g. emulated by the Normal
  World) or marked RIPAS DEV.

* MMIO devices emulated by the Normal World and used very early in boot
  (specifically earlycon) must be specified in the upper half of IPA.
  For earlycon this can be done by specifying the address on the
  command line, e.g. with an IPA size of 33 bits and the base address
  of the emulated UART at 0x1000000: ``earlycon=uart,mmio,0x101000000``

* Linux will use bounce buffers for communicating with unprotected
  devices. It will transition some protected memory to RIPAS EMPTY and
  expect to be able to access unprotected pages at the same IPA address
  but with the highest valid IPA bit set. The expectation is that the
  VMM will remove the physical pages from the protected mapping and
  provide those pages as unprotected pages.
References
----------
[1] https://developer.arm.com/documentation/den0137/

View File

@ -41,6 +41,9 @@ to automatically locate and size all RAM, or it may use knowledge of
the RAM in the machine, or any other method the boot loader designer
sees fit.)
For Arm Confidential Compute Realms this includes ensuring that all
protected RAM has a Realm IPA state (RIPAS) of "RAM".
2. Setup the device tree
-------------------------
@ -385,6 +388,9 @@ Before jumping into the kernel, the following conditions must be met:
- HCRX_EL2.MSCEn (bit 11) must be initialised to 0b1.
- HCRX_EL2.MCE2 (bit 10) must be initialised to 0b1 and the hypervisor
must handle MOPS exceptions as described in :ref:`arm64_mops_hyp`.
For CPUs with the Extended Translation Control Register feature (FEAT_TCR2):
- If EL3 is present:
@ -411,6 +417,38 @@ Before jumping into the kernel, the following conditions must be met:
- HFGRWR_EL2.nPIRE0_EL1 (bit 57) must be initialised to 0b1.
- For CPUs with Guarded Control Stacks (FEAT_GCS):
- GCSCR_EL1 must be initialised to 0.
- GCSCRE0_EL1 must be initialised to 0.
- If EL3 is present:
- SCR_EL3.GCSEn (bit 39) must be initialised to 0b1.
- If EL2 is present:
- GCSCR_EL2 must be initialised to 0.
- If the kernel is entered at EL1 and EL2 is present:
- HCRX_EL2.GCSEn must be initialised to 0b1.
- HFGITR_EL2.nGCSEPP (bit 59) must be initialised to 0b1.
- HFGITR_EL2.nGCSSTR_EL1 (bit 58) must be initialised to 0b1.
- HFGITR_EL2.nGCSPUSHM_EL1 (bit 57) must be initialised to 0b1.
- HFGRTR_EL2.nGCS_EL1 (bit 53) must be initialised to 0b1.
- HFGRTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
- HFGWTR_EL2.nGCS_EL1 (bit 53) must be initialised to 0b1.
- HFGWTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented

View File

@ -16,9 +16,9 @@ architected discovery mechanism available to userspace code at EL0. The
kernel exposes the presence of these features to userspace through a set
of flags called hwcaps, exposed in the auxiliary vector.
Userspace software can test for features by acquiring the AT_HWCAP or
AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
flags are set, e.g.::
Userspace software can test for features by acquiring the AT_HWCAP,
AT_HWCAP2 or AT_HWCAP3 entry of the auxiliary vector, and testing
whether the relevant flags are set, e.g.::
bool floating_point_is_present(void)
{
@ -170,6 +170,10 @@ HWCAP_PACG
ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
Documentation/arch/arm64/pointer-authentication.rst.
HWCAP_GCS
Functionality implied by ID_AA64PFR1_EL1.GCS == 0b1, as
described by Documentation/arch/arm64/gcs.rst.
HWCAP2_DCPODP
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
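
Tying the AT_HWCAP/AT_HWCAP2/AT_HWCAP3 description above to code, a minimal
sketch of the usual getauxval() pattern follows. The fallback definitions of
AT_HWCAP3 and HWCAP_GCS are assumptions taken from the uapi headers of this
series and are only used when the installed headers do not yet provide them::

    #include <stdio.h>
    #include <sys/auxv.h>
    #include <asm/hwcap.h>      /* HWCAP_* bits on arm64 */

    #ifndef AT_HWCAP3
    #define AT_HWCAP3 29              /* assumption: value from include/uapi/linux/auxvec.h */
    #endif
    #ifndef HWCAP_GCS
    #define HWCAP_GCS (1UL << 32)     /* assumption: value from the GCS series uapi */
    #endif

    int main(void)
    {
        unsigned long hwcap  = getauxval(AT_HWCAP);
        unsigned long hwcap3 = getauxval(AT_HWCAP3); /* 0 if the kernel predates AT_HWCAP3 */

        printf("GCS %s\n", (hwcap & HWCAP_GCS) ? "supported" : "not supported");
        printf("AT_HWCAP3 = 0x%lx\n", hwcap3);
        return 0;
    }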

View File

@ -0,0 +1,227 @@
===============================================
Guarded Control Stack support for AArch64 Linux
===============================================
This document outlines briefly the interface provided to userspace by Linux in
order to support use of the ARM Guarded Control Stack (GCS) feature.
This is an outline of the most important features and issues only and not
intended to be exhaustive.
1. General
-----------
* GCS is an architecture feature intended to provide greater protection
against return oriented programming (ROP) attacks and to simplify the
implementation of features that need to collect stack traces such as
profiling.
* When GCS is enabled a separate guarded control stack is maintained by the
PE which is writeable only through specific GCS operations. This stack
stores the call stack only: when a procedure call instruction is
performed the current PC is pushed onto the GCS, and on RET the
address in the LR is verified against the value at the top of the GCS.
* When active the current GCS pointer is stored in the system register
GCSPR_EL0. This is readable by userspace but can only be updated
via specific GCS instructions.
* The architecture provides instructions for switching between guarded
control stacks with checks to ensure that the new stack is a valid
target for switching.
* The functionality of GCS is similar to that provided by the x86 Shadow
Stack feature, due to sharing of userspace interfaces the ABI refers to
shadow stacks rather than GCS.
* Support for GCS is reported to userspace via HWCAP_GCS in the aux vector
AT_HWCAP entry (see the HWCAP_GCS entry added to elf_hwcaps above).
* GCS is enabled per thread. While there is support for disabling GCS
at runtime this should be done with great care.
* GCS memory access faults are reported as normal memory access faults.
* GCS specific errors (those reported with EC 0x2d) will be reported as
SIGSEGV with a si_code of SEGV_CPERR (control protection error).
* GCS is supported only for AArch64.
* On systems where GCS is supported GCSPR_EL0 is always readable by EL0
regardless of the GCS configuration for the thread.
* The architecture supports enabling GCS without verifying that return values
in the LR match those on the GCS; in that mode the LR is ignored. This is not
supported by Linux.
2. Enabling and disabling Guarded Control Stacks
-------------------------------------------------
* GCS is enabled and disabled for a thread via the PR_SET_SHADOW_STACK_STATUS
prctl(), this takes a single flags argument specifying which GCS features
should be used.
* When set, the PR_SHADOW_STACK_ENABLE flag allocates a Guarded Control Stack
and enables GCS for the thread, enabling the functionality controlled by
GCSCRE0_EL1.{nTR, RVCHKEN, PCRSEL}.
* When set the PR_SHADOW_STACK_PUSH flag enables the functionality controlled
by GCSCRE0_EL1.PUSHMEn, allowing explicit GCS pushes.
* When set the PR_SHADOW_STACK_WRITE flag enables the functionality controlled
by GCSCRE0_EL1.STREn, allowing explicit stores to the Guarded Control Stack.
* Any unknown flags will cause PR_SET_SHADOW_STACK_STATUS to return -EINVAL.
* PR_LOCK_SHADOW_STACK_STATUS is passed a bitmask of features with the same
values as used for PR_SET_SHADOW_STACK_STATUS. Any future changes to the
status of the specified GCS mode bits will be rejected.
* PR_LOCK_SHADOW_STACK_STATUS allows any bit to be locked; this allows
userspace to prevent changes to any future features.
* There is no support for a process to remove a lock that has been set for
it.
* PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS affect only the
thread that called them, any other running threads will be unaffected.
* New threads inherit the GCS configuration of the thread that created them.
* GCS is disabled on exec().
* The current GCS configuration for a thread may be read with the
PR_GET_SHADOW_STACK_STATUS prctl(), this returns the same flags that
are passed to PR_SET_SHADOW_STACK_STATUS.
* If GCS is disabled for a thread after having previously been enabled then
the stack will remain allocated for the lifetime of the thread. At present
any attempt to reenable GCS for the thread will be rejected, this may be
revisited in future.
* It should be noted that, since enabling GCS results in GCS becoming
active immediately, it is not normally possible to return from the function
that invoked the prctl() that enabled GCS. It is expected that the normal
usage will be that GCS is enabled very early in the execution of a program.
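
To make the above concrete, a minimal sketch of querying the per-thread GCS
state from userspace is shown below. The PR_* fallback values are assumptions
taken from the prctl() uapi accompanying this series and should be checked
against the installed headers; as noted above, enabling GCS from the middle of
a running program is not generally safe, so the sketch only reads and reports
the status::

    #include <stdio.h>
    #include <sys/prctl.h>

    /* Assumed fallback values from the shadow stack prctl() uapi. */
    #ifndef PR_GET_SHADOW_STACK_STATUS
    #define PR_GET_SHADOW_STACK_STATUS   74
    #define PR_SHADOW_STACK_ENABLE       (1UL << 0)
    #define PR_SHADOW_STACK_WRITE        (1UL << 1)
    #define PR_SHADOW_STACK_PUSH         (1UL << 2)
    #endif

    int main(void)
    {
        unsigned long status = 0;

        /* Read the current GCS configuration for the calling thread. */
        if (prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0, 0, 0) != 0) {
            perror("PR_GET_SHADOW_STACK_STATUS");
            return 1;
        }

        printf("GCS status:%s%s%s\n",
               (status & PR_SHADOW_STACK_ENABLE) ? " ENABLE" : " (disabled)",
               (status & PR_SHADOW_STACK_WRITE)  ? " WRITE"  : "",
               (status & PR_SHADOW_STACK_PUSH)   ? " PUSH"   : "");
        return 0;
    }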
3. Allocation of Guarded Control Stacks
----------------------------------------
* When GCS is enabled for a thread a new Guarded Control Stack will be
allocated for it of half the standard stack size or 2 gigabytes,
whichever is smaller.
* When a new thread is created by a thread which has GCS enabled then a
new Guarded Control Stack will be allocated for the new thread with
half the size of the standard stack.
* When a stack is allocated by enabling GCS or during thread creation then
the top 8 bytes of the stack will be initialised to 0 and GCSPR_EL0 will
be set to point to the address of this 0 value; this can be used to
detect the top of the stack.
* Additional Guarded Control Stacks can be allocated using the
map_shadow_stack() system call.
* Stacks allocated using map_shadow_stack() can optionally have an end of
stack marker and cap placed at the top of the stack. If the flag
SHADOW_STACK_SET_TOKEN is specified a cap will be placed on the stack:
if SHADOW_STACK_SET_MARKER is not also specified the cap will occupy the
top 8 bytes of the stack, otherwise the marker occupies the top 8 bytes
and the cap the next 8 bytes below it. Specifying just
SHADOW_STACK_SET_MARKER by itself is valid but, since the marker is all
bits 0, it has no observable effect.
* Stacks allocated using map_shadow_stack() must have a size which is a
multiple of 8 bytes larger than 8 bytes and must be 8 bytes aligned.
* An address can be specified to map_shadow_stack(), if one is provided then
it must be aligned to a page boundary.
* When a thread is freed the Guarded Control Stack initially allocated for
that thread will be freed. Note carefully that if the stack has been
switched this may not be the stack currently in use by the thread.
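
As an illustration of the allocation rules above, the sketch below creates a
small additional Guarded Control Stack with map_shadow_stack(). There is no
libc wrapper, so the raw syscall is used; the syscall number and flag values
are assumptions taken from the uapi accompanying this series and should be
verified against the installed headers. The stack is only allocated and
inspected, not switched to::

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #ifndef __NR_map_shadow_stack
    #define __NR_map_shadow_stack 453            /* assumption: asm-generic syscall number */
    #endif
    #ifndef SHADOW_STACK_SET_TOKEN
    #define SHADOW_STACK_SET_TOKEN  (1ULL << 0)  /* assumption: place a cap token */
    #endif
    #ifndef SHADOW_STACK_SET_MARKER
    #define SHADOW_STACK_SET_MARKER (1ULL << 1)  /* assumption: place an end-of-stack marker */
    #endif

    int main(void)
    {
        size_t size = 8 * 4096;   /* multiple of 8 bytes, page granularity */
        uint64_t *gcs;
        long ret;

        ret = syscall(__NR_map_shadow_stack, NULL, size,
                      SHADOW_STACK_SET_TOKEN | SHADOW_STACK_SET_MARKER);
        if (ret == -1) {
            perror("map_shadow_stack");
            return 1;
        }
        gcs = (uint64_t *)ret;

        /* With both flags the top 8 bytes hold the zero marker, the cap sits below it. */
        printf("new GCS at %p, top marker %#llx\n",
               (void *)gcs, (unsigned long long)gcs[size / 8 - 1]);

        munmap(gcs, size);
        return 0;
    }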
4. Signal handling
--------------------
* A new signal frame record gcs_context encodes the current GCS mode and
pointer for the interrupted context on signal delivery. This will always
be present on systems that support GCS.
* The record contains a flag field which reports the current GCS configuration
for the interrupted context as PR_GET_SHADOW_STACK_STATUS would.
* The signal handler is run with the same GCS configuration as the interrupted
context.
* When GCS is enabled for the interrupted thread a signal handling specific
GCS cap token will be written to the GCS; this is an architectural GCS cap
with the token type (bits 0..11) all clear. The GCSPR_EL0 reported in the
signal frame will point to this cap token.
* The signal handler will use the same GCS as the interrupted context.
* When GCS is enabled on signal entry a frame with the address of the signal
return handler will be pushed onto the GCS, allowing return from the signal
handler via RET as normal. This will not be reported in the gcs_context in
the signal frame.
5. Signal return
-----------------
When returning from a signal handler:
* If there is a gcs_context record in the signal frame then the GCS flags
and GCSPR_EL0 will be restored from that context prior to further
validation.
* If there is no gcs_context record in the signal frame then the GCS
configuration will be unchanged.
* If GCS is enabled on return from a signal handler then GCSPR_EL0 must
point to a valid GCS signal cap record; this will be popped from the
GCS prior to signal return.
* If the GCS configuration is locked when returning from a signal then any
attempt to change the GCS configuration will be treated as an error. This
is true even if GCS was not enabled prior to signal entry.
* GCS may be disabled via signal return but any attempt to enable GCS via
signal return will be rejected.
6. ptrace extensions
---------------------
* A new regset NT_ARM_GCS is defined for use with PTRACE_GETREGSET and
PTRACE_SETREGSET.
* The GCS mode, including enable and disable, may be configured via ptrace.
If GCS is enabled via ptrace no new GCS will be allocated for the thread.
* Configuration via ptrace ignores locking of GCS mode bits.
7. ELF coredump extensions
---------------------------
* NT_ARM_GCS notes will be added to each coredump for each thread of the
dumped process. The contents will be equivalent to the data that would
have been read if a PTRACE_GETREGSET of the corresponding type were
executed for each thread when the coredump was generated.
8. /proc extensions
--------------------
* Guarded Control Stack pages will include "ss" in their VmFlags in
/proc/<pid>/smaps.

View File

@ -10,16 +10,19 @@ ARM64 Architecture
acpi_object_usage
amu
arm-acpi
arm-cca
asymmetric-32bit
booting
cpu-feature-registers
cpu-hotplug
elf_hwcaps
gcs
hugetlbpage
kdump
legacy_instructions
memory
memory-tagging-extension
mops
perf
pointer-authentication
ptdump

View File

@ -0,0 +1,44 @@
.. SPDX-License-Identifier: GPL-2.0
===================================
Memory copy/set instructions (MOPS)
===================================
A MOPS memory copy/set operation consists of three consecutive CPY* or SET*
instructions: a prologue, main and epilogue (for example: CPYP, CPYM, CPYE).

A main or epilogue instruction can take a MOPS exception for various reasons,
for example when a task is migrated to a CPU with a different MOPS
implementation, or when the instruction's alignment and size requirements are
not met. The software exception handler is then expected to reset the registers
and restart execution from the prologue instruction. Normally this is handled
by the kernel.

For more details refer to "D1.3.5.7 Memory Copy and Memory Set exceptions" in
the Arm Architecture Reference Manual DDI 0487K.a (Arm ARM).
.. _arm64_mops_hyp:
Hypervisor requirements
-----------------------
A hypervisor running a Linux guest must handle all MOPS exceptions from the
guest kernel, as Linux may not be able to handle the exception at all times.
For example, a MOPS exception can be taken when the hypervisor migrates a vCPU
to another physical CPU with a different MOPS implementation.
To do this, the hypervisor must:
- Set HCRX_EL2.MCE2 to 1 so that the exception is taken to the hypervisor.
- Have an exception handler that implements the algorithm from the Arm ARM
rules CNTMJ and MWFQH.
- Set the guest's PSTATE.SS to 0 in the exception handler, to handle a
potential step of the current instruction.
Note: Clearing PSTATE.SS is needed so that a single step exception is taken
on the next instruction (the prologue instruction). Otherwise the prologue
would be silently stepped over and the single step exception would be taken
on the main instruction. If the guest instruction is not being single
stepped then clearing PSTATE.SS has no effect.
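
As an illustration of the prologue/main/epilogue structure described above,
here is a rough user-space sketch (a hypothetical mops_memcpy() helper, not
how the kernel's own routines are written). It assumes a toolchain that
accepts ``.arch_extension mops`` (the same probe used by the kernel's
AS_HAS_MOPS Kconfig symbol) and a CPU implementing FEAT_MOPS::

    #include <stdio.h>

    /* Copy n bytes using the CPYP/CPYM/CPYE prologue/main/epilogue sequence. */
    static void mops_memcpy(void *dst, const void *src, unsigned long n)
    {
        asm volatile(
            ".arch_extension mops\n"
            "cpyp    [%0]!, [%1]!, %2!\n"   /* prologue */
            "cpym    [%0]!, [%1]!, %2!\n"   /* main */
            "cpye    [%0]!, [%1]!, %2!\n"   /* epilogue */
            : "+r" (dst), "+r" (src), "+r" (n)
            :
            : "memory");
    }

    int main(void)
    {
        char buf[32];

        mops_memcpy(buf, "hello, FEAT_MOPS", sizeof("hello, FEAT_MOPS"));
        puts(buf);
        return 0;
    }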

View File

@ -346,6 +346,10 @@ The regset data starts with struct user_za_header, containing:
* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
* If any register data is provided along with SME_PT_VL_ONEXEC then the
registers data will be interpreted with the current vector length, not
the vector length configured for use on exec.
8. ELF coredump extensions
---------------------------

View File

@ -402,6 +402,10 @@ The regset data starts with struct user_sve_header, containing:
streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode
if the target was not in streaming mode.
* If any register data is provided along with SVE_PT_VL_ONEXEC then the
registers data will be interpreted with the current vector length, not
the vector length configured for use on exec.
* The effect of writing a partial, incomplete payload is unspecified.

View File

@ -74,6 +74,7 @@ properties:
- qcom,krait-pmu
- qcom,scorpion-pmu
- qcom,scorpion-mp-pmu
- samsung,mongoose-pmu
interrupts:
# Don't know how many CPUs, so no constraints to specify

View File

@ -31,7 +31,9 @@ properties:
- const: fsl,imx8dxl-ddr-pmu
- const: fsl,imx8-ddr-pmu
- items:
- const: fsl,imx95-ddr-pmu
- enum:
- fsl,imx91-ddr-pmu
- fsl,imx95-ddr-pmu
- const: fsl,imx93-ddr-pmu
reg:

View File

@ -579,7 +579,7 @@ encoded manner. The codes are the following:
mt arm64 MTE allocation tags are enabled
um userfaultfd missing tracking
uw userfaultfd wr-protect tracking
ss shadow stack page
ss shadow/guarded control stack page
sl sealed
== =======================================

View File

@ -13816,6 +13816,12 @@ S: Supported
F: Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
F: drivers/net/ethernet/marvell/octeontx2/af/
MARVELL PEM PMU DRIVER
M: Linu Cherian <lcherian@marvell.com>
M: Gowthami Thiagarajan <gthiagarajan@marvell.com>
S: Supported
F: drivers/perf/marvell_pem_pmu.c
MARVELL PRESTERA ETHERNET SWITCH DRIVER
M: Taras Chornyi <taras.chornyi@plvision.eu>
S: Supported

View File

@ -212,6 +212,8 @@ static inline void write_pmuserenr(u32 val)
write_sysreg(val, PMUSERENR);
}
static inline void write_pmuacr(u64 val) {}
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
@ -231,6 +233,7 @@ static inline void kvm_vcpu_pmu_resync_el0(void) {}
#define ARMV8_PMU_DFR_VER_V3P1 0x4
#define ARMV8_PMU_DFR_VER_V3P4 0x5
#define ARMV8_PMU_DFR_VER_V3P5 0x6
#define ARMV8_PMU_DFR_VER_V3P9 0x9
#define ARMV8_PMU_DFR_VER_IMP_DEF 0xF
static inline bool pmuv3_implemented(int pmuver)
@ -249,6 +252,11 @@ static inline bool is_pmuv3p5(int pmuver)
return pmuver >= ARMV8_PMU_DFR_VER_V3P5;
}
static inline bool is_pmuv3p9(int pmuver)
{
return pmuver >= ARMV8_PMU_DFR_VER_V3P9;
}
static inline u64 read_pmceid0(void)
{
u64 val = read_sysreg(PMCEID0);

View File

@ -21,6 +21,7 @@ config ARM64
select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
select ARCH_HAS_CACHE_LINE_SIZE
select ARCH_HAS_CC_PLATFORM
select ARCH_HAS_CURRENT_STACK_POINTER
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEBUG_VM_PGTABLE
@ -38,12 +39,15 @@ config ARM64
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_FORCE_DMA_UNENCRYPTED
select ARCH_STACKWALK
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
@ -2159,6 +2163,9 @@ config ARM64_EPAN
if the cpu does not implement the feature.
endmenu # "ARMv8.7 architectural features"
config AS_HAS_MOPS
def_bool $(as-instr,.arch_extension mops)
menu "ARMv8.9 architectural features"
config ARM64_POE
@ -2180,8 +2187,44 @@ config ARCH_PKEY_BITS
int
default 3
config ARM64_HAFT
bool "Support for Hardware managed Access Flag for Table Descriptors"
depends on ARM64_HW_AFDBM
default y
help
ARMv8.9/ARMv9.5 introduces the Hardware managed Access Flag for
Table descriptors feature (FEAT_HAFT). When enabled, an
architecturally executed memory access will update the Access Flag
in each table descriptor which is accessed during the translation
table walk and for which the Access Flag is 0. The Access Flag of
the table descriptor uses the same bit as PTE_AF.
The feature will only be enabled if all the CPUs in the system
support this feature. If unsure, say Y.
endmenu # "ARMv8.9 architectural features"
menu "v9.4 architectural features"
config ARM64_GCS
bool "Enable support for Guarded Control Stack (GCS)"
default y
select ARCH_HAS_USER_SHADOW_STACK
select ARCH_USES_HIGH_VMA_FLAGS
depends on !UPROBES
help
Guarded Control Stack (GCS) provides support for a separate
stack with restricted access which contains only return
addresses. This can be used to harden against some attacks
by comparing the return address used by the program with what is
stored in the GCS, and may also be used to efficiently obtain
the call stack for applications such as profiling.
The feature is detected at runtime, and will remain disabled
if the system does not implement the feature.
endmenu # "v9.4 architectural features"
config ARM64_SVE
bool "ARM Scalable Vector Extension support"
default y

View File

@ -152,6 +152,11 @@ static inline void write_pmuserenr(u32 val)
write_sysreg(val, pmuserenr_el0);
}
static inline void write_pmuacr(u64 val)
{
write_sysreg_s(val, SYS_PMUACR_EL1);
}
static inline u64 read_pmceid0(void)
{
return read_sysreg(pmceid0_el0);
@ -178,4 +183,9 @@ static inline bool is_pmuv3p5(int pmuver)
return pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P5;
}
static inline bool is_pmuv3p9(int pmuver)
{
return pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P9;
}
#endif

View File

@ -248,13 +248,6 @@ alternative_endif
ldr \dst, [\dst, \tmp]
.endm
/*
* vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
*/
.macro vma_vm_mm, rd, rn
ldr \rd, [\rn, #VMA_VM_MM]
.endm
/*
* read_ctr - read CTR_EL0. If the system has mismatched register fields,
* provide the system wide safe value from arm64_ftr_reg_ctrel0.sys_val

View File

@ -42,6 +42,8 @@ cpucap_is_possible(const unsigned int cap)
return IS_ENABLED(CONFIG_ARM64_BTI);
case ARM64_HAS_TLB_RANGE:
return IS_ENABLED(CONFIG_ARM64_TLB_RANGE);
case ARM64_HAS_S1POE:
return IS_ENABLED(CONFIG_ARM64_POE);
case ARM64_UNMAP_KERNEL_AT_EL0:
return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0);
case ARM64_WORKAROUND_843419:

View File

@ -12,7 +12,7 @@
#include <asm/hwcap.h>
#include <asm/sysreg.h>
#define MAX_CPU_FEATURES 128
#define MAX_CPU_FEATURES 192
#define cpu_feature(x) KERNEL_HWCAP_ ## x
#define ARM64_SW_FEATURE_OVERRIDE_NOKASLR 0
@ -438,6 +438,7 @@ void cpu_set_feature(unsigned int num);
bool cpu_have_feature(unsigned int num);
unsigned long cpu_get_elf_hwcap(void);
unsigned long cpu_get_elf_hwcap2(void);
unsigned long cpu_get_elf_hwcap3(void);
#define cpu_set_named_feature(name) cpu_set_feature(cpu_feature(name))
#define cpu_have_named_feature(name) cpu_have_feature(cpu_feature(name))
@ -834,8 +835,19 @@ static inline bool system_supports_lpa2(void)
static inline bool system_supports_poe(void)
{
return IS_ENABLED(CONFIG_ARM64_POE) &&
alternative_has_cap_unlikely(ARM64_HAS_S1POE);
return alternative_has_cap_unlikely(ARM64_HAS_S1POE);
}
static inline bool system_supports_gcs(void)
{
return IS_ENABLED(CONFIG_ARM64_GCS) &&
alternative_has_cap_unlikely(ARM64_HAS_GCS);
}
static inline bool system_supports_haft(void)
{
return IS_ENABLED(CONFIG_ARM64_HAFT) &&
cpus_have_final_cap(ARM64_HAFT);
}
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);

View File

@ -132,7 +132,7 @@ static inline void local_daif_inherit(struct pt_regs *regs)
trace_hardirqs_on();
if (system_uses_irq_prio_masking())
gic_write_pmr(regs->pmr_save);
gic_write_pmr(regs->pmr);
/*
* We can't use local_daif_restore(regs->pstate) here as

View File

@ -105,6 +105,7 @@ void kernel_enable_single_step(struct pt_regs *regs);
void kernel_disable_single_step(void);
int kernel_active_single_step(void);
void kernel_rewind_single_step(struct pt_regs *regs);
void kernel_fastforward_single_step(struct pt_regs *regs);
#ifdef CONFIG_HAVE_HW_BREAKPOINT
int reinstall_suspended_bps(struct pt_regs *regs);

View File

@ -27,6 +27,14 @@
ubfx x0, x0, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
cbz x0, .Lskip_hcrx_\@
mov_q x0, HCRX_HOST_FLAGS
/* Enable GCS if supported */
mrs_s x1, SYS_ID_AA64PFR1_EL1
ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
cbz x1, .Lset_hcrx_\@
orr x0, x0, #HCRX_EL2_GCSEn
.Lset_hcrx_\@:
msr_s SYS_HCRX_EL2, x0
.Lskip_hcrx_\@:
.endm
@ -200,6 +208,16 @@
orr x0, x0, #HFGxTR_EL2_nPOR_EL0
.Lskip_poe_fgt_\@:
/* GCS depends on PIE so we don't check it if PIE is absent */
mrs_s x1, SYS_ID_AA64PFR1_EL1
ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
cbz x1, .Lset_fgt_\@
/* Disable traps of access to GCS registers at EL0 and EL1 */
orr x0, x0, #HFGxTR_EL2_nGCS_EL1_MASK
orr x0, x0, #HFGxTR_EL2_nGCS_EL0_MASK
.Lset_fgt_\@:
msr_s SYS_HFGRTR_EL2, x0
msr_s SYS_HFGWTR_EL2, x0
msr_s SYS_HFGITR_EL2, xzr
@ -215,6 +233,17 @@
.Lskip_fgt_\@:
.endm
.macro __init_el2_gcs
mrs_s x1, SYS_ID_AA64PFR1_EL1
ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
cbz x1, .Lskip_gcs_\@
/* Ensure GCS is not enabled when we start trying to do BLs */
msr_s SYS_GCSCR_EL1, xzr
msr_s SYS_GCSCRE0_EL1, xzr
.Lskip_gcs_\@:
.endm
.macro __init_el2_nvhe_prepare_eret
mov x0, #INIT_PSTATE_EL1
msr spsr_el2, x0
@ -240,6 +269,7 @@
__init_el2_nvhe_idregs
__init_el2_cptr
__init_el2_fgt
__init_el2_gcs
.endm
#ifndef __KVM_NVHE_HYPERVISOR__

View File

@ -51,7 +51,8 @@
#define ESR_ELx_EC_FP_EXC32 UL(0x28)
/* Unallocated EC: 0x29 - 0x2B */
#define ESR_ELx_EC_FP_EXC64 UL(0x2C)
/* Unallocated EC: 0x2D - 0x2E */
#define ESR_ELx_EC_GCS UL(0x2D)
/* Unallocated EC: 0x2E */
#define ESR_ELx_EC_SERROR UL(0x2F)
#define ESR_ELx_EC_BREAKPT_LOW UL(0x30)
#define ESR_ELx_EC_BREAKPT_CUR UL(0x31)
@ -386,6 +387,31 @@
#define ESR_ELx_MOPS_ISS_SRCREG(esr) (((esr) & (UL(0x1f) << 5)) >> 5)
#define ESR_ELx_MOPS_ISS_SIZEREG(esr) (((esr) & (UL(0x1f) << 0)) >> 0)
/* ISS field definitions for GCS */
#define ESR_ELx_ExType_SHIFT (20)
#define ESR_ELx_ExType_MASK GENMASK(23, 20)
#define ESR_ELx_Raddr_SHIFT (10)
#define ESR_ELx_Raddr_MASK GENMASK(14, 10)
#define ESR_ELx_Rn_SHIFT (5)
#define ESR_ELx_Rn_MASK GENMASK(9, 5)
#define ESR_ELx_Rvalue_SHIFT 5
#define ESR_ELx_Rvalue_MASK GENMASK(9, 5)
#define ESR_ELx_IT_SHIFT (0)
#define ESR_ELx_IT_MASK GENMASK(4, 0)
#define ESR_ELx_ExType_DATA_CHECK 0
#define ESR_ELx_ExType_EXLOCK 1
#define ESR_ELx_ExType_STR 2
#define ESR_ELx_IT_RET 0
#define ESR_ELx_IT_GCSPOPM 1
#define ESR_ELx_IT_RET_KEYA 2
#define ESR_ELx_IT_RET_KEYB 3
#define ESR_ELx_IT_GCSSS1 4
#define ESR_ELx_IT_GCSSS2 5
#define ESR_ELx_IT_GCSPOPCX 6
#define ESR_ELx_IT_GCSPOPX 7
#ifndef __ASSEMBLY__
#include <asm/types.h>

View File

@ -57,6 +57,8 @@ void do_el0_undef(struct pt_regs *regs, unsigned long esr);
void do_el1_undef(struct pt_regs *regs, unsigned long esr);
void do_el0_bti(struct pt_regs *regs);
void do_el1_bti(struct pt_regs *regs, unsigned long esr);
void do_el0_gcs(struct pt_regs *regs, unsigned long esr);
void do_el1_gcs(struct pt_regs *regs, unsigned long esr);
void do_debug_exception(unsigned long addr_if_watchpoint, unsigned long esr,
struct pt_regs *regs);
void do_fpsimd_acc(unsigned long esr, struct pt_regs *regs);
@ -73,6 +75,7 @@ void do_el0_svc_compat(struct pt_regs *regs);
void do_el0_fpac(struct pt_regs *regs, unsigned long esr);
void do_el1_fpac(struct pt_regs *regs, unsigned long esr);
void do_el0_mops(struct pt_regs *regs, unsigned long esr);
void do_el1_mops(struct pt_regs *regs, unsigned long esr);
void do_serror(struct pt_regs *regs, unsigned long esr);
void do_signal(struct pt_regs *regs);

View File

@ -0,0 +1,107 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 ARM Ltd.
*/
#ifndef __ASM_GCS_H
#define __ASM_GCS_H
#include <asm/types.h>
#include <asm/uaccess.h>
struct kernel_clone_args;
struct ksignal;
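/*
 * GCSB DSYNC barrier; encoded as a raw .inst so that assemblers without
 * GCS support can still build this header.
 */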
static inline void gcsb_dsync(void)
{
asm volatile(".inst 0xd503227f" : : : "memory");
}
static inline void gcsstr(u64 *addr, u64 val)
{
register u64 *_addr __asm__ ("x0") = addr;
register long _val __asm__ ("x1") = val;
/* GCSSTTR x1, x0 */
asm volatile(
".inst 0xd91f1c01\n"
:
: "rZ" (_val), "r" (_addr)
: "memory");
}
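/* GCSSS1 Xt: first step of the two-instruction GCS stack switch sequence, operating on the new stack referenced by Xt. */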
static inline void gcsss1(u64 Xt)
{
asm volatile (
"sys #3, C7, C7, #2, %0\n"
:
: "rZ" (Xt)
: "memory");
}
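/* GCSSS2 Xt: second step of the two-instruction GCS stack switch sequence (SYSL form, result returned in Xt). */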
static inline u64 gcsss2(void)
{
u64 Xt;
asm volatile(
"SYSL %0, #3, C7, C7, #3\n"
: "=r" (Xt)
:
: "memory");
return Xt;
}
#define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK \
(PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | PR_SHADOW_STACK_PUSH)
#ifdef CONFIG_ARM64_GCS
static inline bool task_gcs_el0_enabled(struct task_struct *task)
{
return current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE;
}
void gcs_set_el0_mode(struct task_struct *task);
void gcs_free(struct task_struct *task);
void gcs_preserve_current_state(void);
unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
const struct kernel_clone_args *args);
static inline int gcs_check_locked(struct task_struct *task,
unsigned long new_val)
{
unsigned long cur_val = task->thread.gcs_el0_mode;
cur_val &= task->thread.gcs_el0_locked;
new_val &= task->thread.gcs_el0_locked;
if (cur_val != new_val)
return -EBUSY;
return 0;
}
#else
static inline bool task_gcs_el0_enabled(struct task_struct *task)
{
return false;
}
static inline void gcs_set_el0_mode(struct task_struct *task) { }
static inline void gcs_free(struct task_struct *task) { }
static inline void gcs_preserve_current_state(void) { }
static inline unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
const struct kernel_clone_args *args)
{
return -ENOTSUPP;
}
static inline int gcs_check_locked(struct task_struct *task,
unsigned long new_val)
{
return 0;
}
#endif
#endif

View File

@ -11,6 +11,7 @@
#define __ASM_HUGETLB_H
#include <asm/cacheflush.h>
#include <asm/mte.h>
#include <asm/page.h>
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
@ -21,6 +22,13 @@ extern bool arch_hugetlb_migration_supported(struct hstate *h);
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &folio->flags);
#ifdef CONFIG_ARM64_MTE
if (system_supports_mte()) {
clear_bit(PG_mte_tagged, &folio->flags);
clear_bit(PG_mte_lock, &folio->flags);
}
#endif
}
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags

View File

@ -92,6 +92,7 @@
#define KERNEL_HWCAP_SB __khwcap_feature(SB)
#define KERNEL_HWCAP_PACA __khwcap_feature(PACA)
#define KERNEL_HWCAP_PACG __khwcap_feature(PACG)
#define KERNEL_HWCAP_GCS __khwcap_feature(GCS)
#define __khwcap2_feature(x) (const_ilog2(HWCAP2_ ## x) + 64)
#define KERNEL_HWCAP_DCPODP __khwcap2_feature(DCPODP)
@ -159,17 +160,21 @@
#define KERNEL_HWCAP_SME_SF8DP2 __khwcap2_feature(SME_SF8DP2)
#define KERNEL_HWCAP_POE __khwcap2_feature(POE)
#define __khwcap3_feature(x) (const_ilog2(HWCAP3_ ## x) + 128)
/*
* This yields a mask that user programs can use to figure out what
* instruction set this cpu supports.
*/
#define ELF_HWCAP cpu_get_elf_hwcap()
#define ELF_HWCAP2 cpu_get_elf_hwcap2()
#define ELF_HWCAP3 cpu_get_elf_hwcap3()
#ifdef CONFIG_COMPAT
#define COMPAT_ELF_HWCAP (compat_elf_hwcap)
#define COMPAT_ELF_HWCAP2 (compat_elf_hwcap2)
extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
#define COMPAT_ELF_HWCAP3 (compat_elf_hwcap3)
extern unsigned int compat_elf_hwcap, compat_elf_hwcap2, compat_elf_hwcap3;
#endif
enum {

View File

@ -353,6 +353,7 @@ __AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
__AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000)
__AARCH64_INSN_FUNCS(load_ex, 0x3F400000, 0x08400000)
__AARCH64_INSN_FUNCS(store_ex, 0x3F400000, 0x08000000)
__AARCH64_INSN_FUNCS(mops, 0x3B200C00, 0x19000400)
__AARCH64_INSN_FUNCS(stp, 0x7FC00000, 0x29000000)
__AARCH64_INSN_FUNCS(ldp, 0x7FC00000, 0x29400000)
__AARCH64_INSN_FUNCS(stp_post, 0x7FC00000, 0x28800000)
@ -575,6 +576,11 @@ static __always_inline u32 aarch64_insn_gen_nop(void)
return aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
}
static __always_inline bool aarch64_insn_is_nop(u32 insn)
{
return insn == aarch64_insn_gen_nop();
}
u32 aarch64_insn_gen_branch_reg(enum aarch64_insn_register reg,
enum aarch64_insn_branch_type type);
u32 aarch64_insn_gen_load_store_reg(enum aarch64_insn_register reg,

View File

@ -17,6 +17,7 @@
#include <asm/early_ioremap.h>
#include <asm/alternative.h>
#include <asm/cpufeature.h>
#include <asm/rsi.h>
/*
* Generic IO read/write. These perform native-endian accesses.
@ -318,4 +319,11 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, size_t size,
unsigned long flags);
#define arch_memremap_can_ram_remap arch_memremap_can_ram_remap
static inline bool arm64_is_protected_mmio(phys_addr_t phys_addr, size_t size)
{
if (unlikely(is_realm_world()))
return __arm64_is_protected_mmio(phys_addr, size);
return false;
}
#endif /* __ASM_IO_H */

View File

@ -26,7 +26,6 @@
#define SWAPPER_SKIP_LEVEL 0
#endif
#define SWAPPER_BLOCK_SIZE (UL(1) << SWAPPER_BLOCK_SHIFT)
#define SWAPPER_TABLE_SHIFT (SWAPPER_BLOCK_SHIFT + PAGE_SHIFT - 3)
#define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - SWAPPER_SKIP_LEVEL)
#define INIT_IDMAP_PGTABLE_LEVELS (IDMAP_LEVELS - SWAPPER_SKIP_LEVEL)

View File

@ -2,6 +2,8 @@
#ifndef __ASM_MEM_ENCRYPT_H
#define __ASM_MEM_ENCRYPT_H
#include <asm/rsi.h>
struct arm64_mem_crypt_ops {
int (*encrypt)(unsigned long addr, int numpages);
int (*decrypt)(unsigned long addr, int numpages);
@ -12,4 +14,11 @@ int arm64_mem_crypt_ops_register(const struct arm64_mem_crypt_ops *ops);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
int realm_register_memory_enc_ops(void);
static inline bool force_dma_unencrypted(struct device *dev)
{
return is_realm_world();
}
#endif /* __ASM_MEM_ENCRYPT_H */

View File

@ -41,9 +41,12 @@ static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
* backed by tags-capable memory. The vm_flags may be overridden by a
* filesystem supporting MTE (RAM-based).
*/
if (system_supports_mte() &&
((flags & MAP_ANONYMOUS) || shmem_file(file)))
return VM_MTE_ALLOWED;
if (system_supports_mte()) {
if (flags & (MAP_ANONYMOUS | MAP_HUGETLB))
return VM_MTE_ALLOWED;
if (shmem_file(file))
return VM_MTE_ALLOWED;
}
return 0;
}
@ -66,11 +69,26 @@ static inline bool arch_validate_prot(unsigned long prot,
static inline bool arch_validate_flags(unsigned long vm_flags)
{
if (!system_supports_mte())
return true;
if (system_supports_mte()) {
/*
* only allow VM_MTE if VM_MTE_ALLOWED has been set
* previously
*/
if ((vm_flags & VM_MTE) && !(vm_flags & VM_MTE_ALLOWED))
return false;
}
if (system_supports_gcs() && (vm_flags & VM_SHADOW_STACK)) {
/* An executable GCS isn't a good idea. */
if (vm_flags & VM_EXEC)
return false;
/* The memory management core should prevent this */
VM_WARN_ON(vm_flags & VM_SHARED);
}
return true;
/* only allow VM_MTE if VM_MTE_ALLOWED has been set previously */
return !(vm_flags & VM_MTE) || (vm_flags & VM_MTE_ALLOWED);
}
#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags)

View File

@ -20,6 +20,7 @@
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/gcs.h>
#include <asm/proc-fns.h>
#include <asm/cputype.h>
#include <asm/sysreg.h>
@ -311,6 +312,14 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
return por_el0_allows_pkey(vma_pkey(vma), write, execute);
}
#define deactivate_mm deactivate_mm
static inline void deactivate_mm(struct task_struct *tsk,
struct mm_struct *mm)
{
gcs_free(tsk);
}
#include <asm-generic/mmu_context.h>
#endif /* !__ASSEMBLY__ */

View File

@ -41,6 +41,8 @@ void mte_free_tag_storage(char *storage);
static inline void set_page_mte_tagged(struct page *page)
{
VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
/*
* Ensure that the tags written prior to this function are visible
* before the page flags update.
@ -53,6 +55,8 @@ static inline bool page_mte_tagged(struct page *page)
{
bool ret = test_bit(PG_mte_tagged, &page->flags);
VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
/*
* If the page is tagged, ensure ordering with a likely subsequent
* read of the tags.
@ -76,6 +80,8 @@ static inline bool page_mte_tagged(struct page *page)
*/
static inline bool try_page_mte_tagging(struct page *page)
{
VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
if (!test_and_set_bit(PG_mte_lock, &page->flags))
return true;
@ -157,6 +163,67 @@ static inline int mte_ptrace_copy_tags(struct task_struct *child,
#endif /* CONFIG_ARM64_MTE */
#if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_ARM64_MTE)
static inline void folio_set_hugetlb_mte_tagged(struct folio *folio)
{
VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
/*
* Ensure that the tags written prior to this function are visible
* before the folio flags update.
*/
smp_wmb();
set_bit(PG_mte_tagged, &folio->flags);
}
static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
{
bool ret = test_bit(PG_mte_tagged, &folio->flags);
VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
/*
* If the folio is tagged, ensure ordering with a likely subsequent
* read of the tags.
*/
if (ret)
smp_rmb();
return ret;
}
static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
{
VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
if (!test_and_set_bit(PG_mte_lock, &folio->flags))
return true;
/*
* The tags are either being initialised or may have been initialised
* already. Check if the PG_mte_tagged flag has been set or wait
* otherwise.
*/
smp_cond_load_acquire(&folio->flags, VAL & (1UL << PG_mte_tagged));
return false;
}
#else
static inline void folio_set_hugetlb_mte_tagged(struct folio *folio)
{
}
static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
{
return false;
}
static inline bool folio_try_hugetlb_mte_tagging(struct folio *folio)
{
return false;
}
#endif
static inline void mte_disable_tco_entry(struct task_struct *task)
{
if (!system_supports_mte())

View File

@ -28,7 +28,7 @@ static inline void __pud_populate(pud_t *pudp, phys_addr_t pmdp, pudval_t prot)
static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
{
pudval_t pudval = PUD_TYPE_TABLE;
pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_AF;
pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
__pud_populate(pudp, __pa(pmdp), pudval);
@ -50,7 +50,7 @@ static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
{
p4dval_t p4dval = P4D_TYPE_TABLE;
p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_AF;
p4dval |= (mm == &init_mm) ? P4D_TABLE_UXN : P4D_TABLE_PXN;
__p4d_populate(p4dp, __pa(pudp), p4dval);
@ -79,7 +79,7 @@ static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
{
pgdval_t pgdval = PGD_TYPE_TABLE;
pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_AF;
pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN;
__pgd_populate(pgdp, __pa(p4dp), pgdval);
@ -127,14 +127,16 @@ static inline void
pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
{
VM_BUG_ON(mm && mm != &init_mm);
__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
__pmd_populate(pmdp, __pa(ptep),
PMD_TYPE_TABLE | PMD_TABLE_AF | PMD_TABLE_UXN);
}
static inline void
pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
{
VM_BUG_ON(mm == &init_mm);
__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
__pmd_populate(pmdp, page_to_phys(ptep),
PMD_TYPE_TABLE | PMD_TABLE_AF | PMD_TABLE_PXN);
}
#endif

View File

@ -99,6 +99,7 @@
#define PGD_TYPE_TABLE (_AT(pgdval_t, 3) << 0)
#define PGD_TABLE_BIT (_AT(pgdval_t, 1) << 1)
#define PGD_TYPE_MASK (_AT(pgdval_t, 3) << 0)
#define PGD_TABLE_AF (_AT(pgdval_t, 1) << 10) /* Ignored if no FEAT_HAFT */
#define PGD_TABLE_PXN (_AT(pgdval_t, 1) << 59)
#define PGD_TABLE_UXN (_AT(pgdval_t, 1) << 60)
@ -110,6 +111,7 @@
#define P4D_TYPE_MASK (_AT(p4dval_t, 3) << 0)
#define P4D_TYPE_SECT (_AT(p4dval_t, 1) << 0)
#define P4D_SECT_RDONLY (_AT(p4dval_t, 1) << 7) /* AP[2] */
#define P4D_TABLE_AF (_AT(p4dval_t, 1) << 10) /* Ignored if no FEAT_HAFT */
#define P4D_TABLE_PXN (_AT(p4dval_t, 1) << 59)
#define P4D_TABLE_UXN (_AT(p4dval_t, 1) << 60)
@ -121,6 +123,7 @@
#define PUD_TYPE_MASK (_AT(pudval_t, 3) << 0)
#define PUD_TYPE_SECT (_AT(pudval_t, 1) << 0)
#define PUD_SECT_RDONLY (_AT(pudval_t, 1) << 7) /* AP[2] */
#define PUD_TABLE_AF (_AT(pudval_t, 1) << 10) /* Ignored if no FEAT_HAFT */
#define PUD_TABLE_PXN (_AT(pudval_t, 1) << 59)
#define PUD_TABLE_UXN (_AT(pudval_t, 1) << 60)
@ -131,6 +134,7 @@
#define PMD_TYPE_TABLE (_AT(pmdval_t, 3) << 0)
#define PMD_TYPE_SECT (_AT(pmdval_t, 1) << 0)
#define PMD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
#define PMD_TABLE_AF (_AT(pmdval_t, 1) << 10) /* Ignored if no FEAT_HAFT */
/*
* Section

View File

@ -35,7 +35,6 @@
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
#define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
#define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_MAYBE_NG | PTE_MAYBE_SHARED | PTE_AF)
#define PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_MAYBE_NG | PMD_MAYBE_SHARED | PMD_SECT_AF)
@ -68,8 +67,12 @@
#include <asm/cpufeature.h>
#include <asm/pgtable-types.h>
#include <asm/rsi.h>
extern bool arm64_use_ng_mappings;
extern unsigned long prot_ns_shared;
#define PROT_NS_SHARED (is_realm_world() ? prot_ns_shared : 0)
#define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
#define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)
@ -144,15 +147,23 @@ static inline bool __pure lpa2_is_enabled(void)
/* 6: PTE_PXN | PTE_WRITE */
/* 7: PAGE_SHARED_EXEC PTE_PXN | PTE_WRITE | PTE_USER */
/* 8: PAGE_KERNEL_ROX PTE_UXN */
/* 9: PTE_UXN | PTE_USER */
/* 9: PAGE_GCS_RO PTE_UXN | PTE_USER */
/* a: PAGE_KERNEL_EXEC PTE_UXN | PTE_WRITE */
/* b: PTE_UXN | PTE_WRITE | PTE_USER */
/* b: PAGE_GCS PTE_UXN | PTE_WRITE | PTE_USER */
/* c: PAGE_KERNEL_RO PTE_UXN | PTE_PXN */
/* d: PAGE_READONLY PTE_UXN | PTE_PXN | PTE_USER */
/* e: PAGE_KERNEL PTE_UXN | PTE_PXN | PTE_WRITE */
/* f: PAGE_SHARED PTE_UXN | PTE_PXN | PTE_WRITE | PTE_USER */
#define _PAGE_GCS (_PAGE_DEFAULT | PTE_NG | PTE_UXN | PTE_WRITE | PTE_USER)
#define _PAGE_GCS_RO (_PAGE_DEFAULT | PTE_NG | PTE_UXN | PTE_USER)
#define PAGE_GCS __pgprot(_PAGE_GCS)
#define PAGE_GCS_RO __pgprot(_PAGE_GCS_RO)
#define PIE_E0 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS), PIE_GCS) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS_RO), PIE_R) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_X_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RWX_O) | \
@ -160,6 +171,8 @@ static inline bool __pure lpa2_is_enabled(void)
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW_O))
#define PIE_E1 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS), PIE_NONE_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS_RO), PIE_NONE_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_NONE_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_R) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RW) | \

View File

@ -265,8 +265,7 @@ static inline pte_t pte_mkspecial(pte_t pte)
static inline pte_t pte_mkcont(pte_t pte)
{
pte = set_pte_bit(pte, __pgprot(PTE_CONT));
return set_pte_bit(pte, __pgprot(PTE_TYPE_PAGE));
return set_pte_bit(pte, __pgprot(PTE_CONT));
}
static inline pte_t pte_mknoncont(pte_t pte)
@ -338,7 +337,7 @@ static inline pte_t __ptep_get(pte_t *ptep)
}
extern void __sync_icache_dcache(pte_t pteval);
bool pgattr_change_is_safe(u64 old, u64 new);
bool pgattr_change_is_safe(pteval_t old, pteval_t new);
/*
* PTE bits configuration in the presence of hardware Dirty Bit Management
@ -438,11 +437,6 @@ static inline void __set_ptes(struct mm_struct *mm,
}
}
/*
* Huge pte definitions.
*/
#define pte_mkhuge(pte) (__pte(pte_val(pte) & ~PTE_TABLE_BIT))
/*
* Hugetlb definitions.
*/
@ -684,6 +678,11 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
#define pgprot_nx(prot) \
__pgprot_modify(prot, PTE_MAYBE_GP, PTE_PXN)
#define pgprot_decrypted(prot) \
__pgprot_modify(prot, PROT_NS_SHARED, PROT_NS_SHARED)
#define pgprot_encrypted(prot) \
__pgprot_modify(prot, PROT_NS_SHARED, 0)
/*
* Mark the prot value as uncacheable and unbufferable.
*/
@ -927,6 +926,9 @@ static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
static inline pud_t *p4d_to_folded_pud(p4d_t *p4dp, unsigned long addr)
{
/* Ensure that 'p4dp' indexes a page table according to 'addr' */
VM_BUG_ON(((addr >> P4D_SHIFT) ^ ((u64)p4dp >> 3)) % PTRS_PER_P4D);
return (pud_t *)PTR_ALIGN_DOWN(p4dp, PAGE_SIZE) + pud_index(addr);
}
@ -1051,6 +1053,9 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
static inline p4d_t *pgd_to_folded_p4d(pgd_t *pgdp, unsigned long addr)
{
/* Ensure that 'pgdp' indexes a page table according to 'addr' */
VM_BUG_ON(((addr >> PGDIR_SHIFT) ^ ((u64)pgdp >> 3)) % PTRS_PER_PGD);
return (p4d_t *)PTR_ALIGN_DOWN(pgdp, PAGE_SIZE) + p4d_index(addr);
}
@ -1259,15 +1264,17 @@ static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
return young;
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)
#define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
unsigned long address,
pmd_t *pmdp)
{
/* Operation applies to PMD table entry only if FEAT_HAFT is enabled */
VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_supports_haft());
return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
unsigned long address, pte_t *ptep)
@ -1502,6 +1509,10 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
*/
#define arch_has_hw_pte_young cpu_has_hw_af
#ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
#define arch_has_hw_nonleaf_pmd_young system_supports_haft
#endif
/*
* Experimentally, it's cheap to set the access flag in hardware and we
* benefit from prefaulting mappings as 'old' to start with.


@ -9,21 +9,18 @@
#include <asm/insn.h>
typedef u32 probe_opcode_t;
typedef void (probes_handler_t) (u32 opcode, long addr, struct pt_regs *);
/* architecture specific copy of original instruction */
struct arch_probe_insn {
probe_opcode_t *insn;
pstate_check_t *pstate_cc;
probes_handler_t *handler;
/* restore address after step xol */
unsigned long restore;
};
#ifdef CONFIG_KPROBES
typedef u32 kprobe_opcode_t;
typedef __le32 kprobe_opcode_t;
struct arch_specific_insn {
struct arch_probe_insn api;
kprobe_opcode_t *xol_insn;
/* restore address after step xol */
unsigned long xol_restore;
};
#endif


@ -185,6 +185,13 @@ struct thread_struct {
u64 svcr;
u64 tpidr2_el0;
u64 por_el0;
#ifdef CONFIG_ARM64_GCS
unsigned int gcs_el0_mode;
unsigned int gcs_el0_locked;
u64 gcspr_el0;
u64 gcs_base;
u64 gcs_size;
#endif
};
static inline unsigned int thread_get_vl(struct thread_struct *thread,
@ -285,22 +292,44 @@ void tls_preserve_current_state(void);
.fpsimd_cpu = NR_CPUS, \
}
static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
static inline void start_thread_common(struct pt_regs *regs, unsigned long pc,
unsigned long pstate)
{
s32 previous_syscall = regs->syscallno;
memset(regs, 0, sizeof(*regs));
regs->syscallno = previous_syscall;
regs->pc = pc;
/*
* Ensure all GPRs are zeroed, and initialize PC + PSTATE.
* The SP (or compat SP) will be initialized later.
*/
regs->user_regs = (struct user_pt_regs) {
.pc = pc,
.pstate = pstate,
};
/*
* To allow the syscalls:sys_exit_execve tracepoint we need to preserve
* syscallno, but do not need orig_x0 or the original GPRs.
*/
regs->orig_x0 = 0;
/*
* An exec from a kernel thread won't have an existing PMR value.
*/
if (system_uses_irq_prio_masking())
regs->pmr_save = GIC_PRIO_IRQON;
regs->pmr = GIC_PRIO_IRQON;
/*
* The pt_regs::stackframe field must remain valid throughout this
* function as a stacktrace can be taken at any time. Any user or
* kernel task should have a valid final frame.
*/
WARN_ON_ONCE(regs->stackframe.record.fp != 0);
WARN_ON_ONCE(regs->stackframe.record.lr != 0);
WARN_ON_ONCE(regs->stackframe.type != FRAME_META_TYPE_FINAL);
}
static inline void start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
{
start_thread_common(regs, pc);
regs->pstate = PSR_MODE_EL0t;
start_thread_common(regs, pc, PSR_MODE_EL0t);
spectre_v4_enable_task_mitigation(current);
regs->sp = sp;
}
@ -309,15 +338,13 @@ static inline void start_thread(struct pt_regs *regs, unsigned long pc,
static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
{
start_thread_common(regs, pc);
regs->pstate = PSR_AA32_MODE_USR;
unsigned long pstate = PSR_AA32_MODE_USR;
if (pc & 1)
regs->pstate |= PSR_AA32_T_BIT;
#ifdef __AARCH64EB__
regs->pstate |= PSR_AA32_E_BIT;
#endif
pstate |= PSR_AA32_T_BIT;
if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
pstate |= PSR_AA32_E_BIT;
start_thread_common(regs, pc, pstate);
spectre_v4_enable_task_mitigation(current);
regs->compat_sp = sp;
}


@ -98,6 +98,8 @@
#include <linux/bug.h>
#include <linux/types.h>
#include <asm/stacktrace/frame.h>
/* sizeof(struct user) for AArch32 */
#define COMPAT_USER_SZ 296
@ -149,8 +151,7 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
/*
* This struct defines the way the registers are stored on the stack during an
* exception. Note that sizeof(struct pt_regs) has to be a multiple of 16 (for
* stack alignment). struct user_pt_regs must form a prefix of struct pt_regs.
* exception. struct user_pt_regs must form a prefix of struct pt_regs.
*/
struct pt_regs {
union {
@ -163,23 +164,20 @@ struct pt_regs {
};
};
u64 orig_x0;
#ifdef __AARCH64EB__
u32 unused2;
s32 syscallno;
#else
s32 syscallno;
u32 unused2;
#endif
u32 pmr;
u64 sdei_ttbr1;
/* Only valid when ARM64_HAS_GIC_PRIO_MASKING is enabled. */
u64 pmr_save;
u64 stackframe[2];
struct frame_record_meta stackframe;
/* Only valid for some EL1 exceptions. */
u64 lockdep_hardirqs;
u64 exit_rcu;
};
/* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
static_assert(IS_ALIGNED(sizeof(struct pt_regs), 16));
static inline bool in_syscall(struct pt_regs const *regs)
{
return regs->syscallno != NO_SYSCALL;
@ -213,7 +211,7 @@ static inline void forget_syscall(struct pt_regs *regs)
#define irqs_priority_unmasked(regs) \
(system_uses_irq_prio_masking() ? \
(regs)->pmr_save == GIC_PRIO_IRQON : \
(regs)->pmr == GIC_PRIO_IRQON : \
true)
#define interrupts_enabled(regs) \


@ -0,0 +1,68 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2024 ARM Ltd.
*/
#ifndef __ASM_RSI_H_
#define __ASM_RSI_H_
#include <linux/errno.h>
#include <linux/jump_label.h>
#include <asm/rsi_cmds.h>
DECLARE_STATIC_KEY_FALSE(rsi_present);
void __init arm64_rsi_init(void);
bool __arm64_is_protected_mmio(phys_addr_t base, size_t size);
static inline bool is_realm_world(void)
{
return static_branch_unlikely(&rsi_present);
}
static inline int rsi_set_memory_range(phys_addr_t start, phys_addr_t end,
enum ripas state, unsigned long flags)
{
unsigned long ret;
phys_addr_t top;
while (start != end) {
ret = rsi_set_addr_range_state(start, end, state, flags, &top);
if (ret || top < start || top > end)
return -EINVAL;
start = top;
}
return 0;
}
/*
* Convert the specified range to RAM. Do not use this if you rely on the
* contents of a page that may already be in RAM state.
*/
static inline int rsi_set_memory_range_protected(phys_addr_t start,
phys_addr_t end)
{
return rsi_set_memory_range(start, end, RSI_RIPAS_RAM,
RSI_CHANGE_DESTROYED);
}
/*
* Convert the specified range to RAM. Do not convert, without our permission,
* any pages that may have been DESTROYED.
*/
static inline int rsi_set_memory_range_protected_safe(phys_addr_t start,
phys_addr_t end)
{
return rsi_set_memory_range(start, end, RSI_RIPAS_RAM,
RSI_NO_CHANGE_DESTROYED);
}
static inline int rsi_set_memory_range_shared(phys_addr_t start,
phys_addr_t end)
{
return rsi_set_memory_range(start, end, RSI_RIPAS_EMPTY,
RSI_CHANGE_DESTROYED);
}
#endif /* __ASM_RSI_H_ */
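
These helpers are the building blocks for sharing memory with the host. A minimal sketch of how a caller might flip a granule-aligned, physically contiguous buffer to the shared state and later reclaim it; the function names below are illustrative and not part of this series:

#include <linux/types.h>
#include <asm/rsi.h>

/* Illustrative only: make a granule-aligned buffer visible to the host. */
static int example_share_buffer(phys_addr_t base, size_t size)
{
	if (!is_realm_world())
		return 0;	/* nothing to do outside a realm */

	/* The host can observe (and destroy) the old contents from here on. */
	return rsi_set_memory_range_shared(base, base + size);
}

/*
 * Illustrative only: reclaim the buffer as private RAM; do not rely on its
 * previous contents afterwards.
 */
static int example_unshare_buffer(phys_addr_t base, size_t size)
{
	if (!is_realm_world())
		return 0;

	return rsi_set_memory_range_protected(base, base + size);
}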


@ -0,0 +1,160 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 ARM Ltd.
*/
#ifndef __ASM_RSI_CMDS_H
#define __ASM_RSI_CMDS_H
#include <linux/arm-smccc.h>
#include <asm/rsi_smc.h>
#define RSI_GRANULE_SHIFT 12
#define RSI_GRANULE_SIZE (_AC(1, UL) << RSI_GRANULE_SHIFT)
enum ripas {
RSI_RIPAS_EMPTY = 0,
RSI_RIPAS_RAM = 1,
RSI_RIPAS_DESTROYED = 2,
RSI_RIPAS_DEV = 3,
};
static inline unsigned long rsi_request_version(unsigned long req,
unsigned long *out_lower,
unsigned long *out_higher)
{
struct arm_smccc_res res;
arm_smccc_smc(SMC_RSI_ABI_VERSION, req, 0, 0, 0, 0, 0, 0, &res);
if (out_lower)
*out_lower = res.a1;
if (out_higher)
*out_higher = res.a2;
return res.a0;
}
static inline unsigned long rsi_get_realm_config(struct realm_config *cfg)
{
struct arm_smccc_res res;
arm_smccc_smc(SMC_RSI_REALM_CONFIG, virt_to_phys(cfg),
0, 0, 0, 0, 0, 0, &res);
return res.a0;
}
static inline unsigned long rsi_ipa_state_get(phys_addr_t start,
phys_addr_t end,
enum ripas *state,
phys_addr_t *top)
{
struct arm_smccc_res res;
arm_smccc_smc(SMC_RSI_IPA_STATE_GET,
start, end, 0, 0, 0, 0, 0,
&res);
if (res.a0 == RSI_SUCCESS) {
if (top)
*top = res.a1;
if (state)
*state = res.a2;
}
return res.a0;
}
static inline long rsi_set_addr_range_state(phys_addr_t start,
phys_addr_t end,
enum ripas state,
unsigned long flags,
phys_addr_t *top)
{
struct arm_smccc_res res;
arm_smccc_smc(SMC_RSI_IPA_STATE_SET, start, end, state,
flags, 0, 0, 0, &res);
if (top)
*top = res.a1;
if (res.a2 != RSI_ACCEPT)
return -EPERM;
return res.a0;
}
/**
* rsi_attestation_token_init - Initialise the operation to retrieve an
* attestation token.
*
* @challenge: The challenge data to be used in the attestation token
* generation.
* @size: Size of the challenge data in bytes.
*
* Initialises the attestation token generation and returns an upper bound
* on the attestation token size that can be used to allocate an adequate
* buffer. The caller is expected to subsequently call
* rsi_attestation_token_continue() to retrieve the attestation token data on
* the same CPU.
*
* Returns:
* On success, returns the upper limit of the attestation report size.
* Otherwise, -EINVAL
*/
static inline long
rsi_attestation_token_init(const u8 *challenge, unsigned long size)
{
struct arm_smccc_1_2_regs regs = { 0 };
/* The challenge must be at least 32 bytes and at most 64 bytes */
if (!challenge || size < 32 || size > 64)
return -EINVAL;
regs.a0 = SMC_RSI_ATTESTATION_TOKEN_INIT;
memcpy(&regs.a1, challenge, size);
arm_smccc_1_2_smc(&regs, &regs);
if (regs.a0 == RSI_SUCCESS)
return regs.a1;
return -EINVAL;
}
/**
* rsi_attestation_token_continue - Continue the operation to retrieve an
* attestation token.
*
* @granule: {I}PA of the Granule to which the token will be written.
* @offset: Offset within Granule to start of buffer in bytes.
* @size: The size of the buffer.
* @len: The number of bytes written to the buffer.
*
* Retrieves up to a RSI_GRANULE_SIZE worth of token data per call. The caller
* is expected to call rsi_attestation_token_init() before calling this
* function to retrieve the attestation token.
*
* Return:
* * %RSI_SUCCESS - Attestation token retrieved successfully.
* * %RSI_INCOMPLETE - Token generation is not complete.
* * %RSI_ERROR_INPUT - A parameter was not valid.
* * %RSI_ERROR_STATE - Attestation not in progress.
*/
static inline unsigned long rsi_attestation_token_continue(phys_addr_t granule,
unsigned long offset,
unsigned long size,
unsigned long *len)
{
struct arm_smccc_res res;
arm_smccc_1_1_invoke(SMC_RSI_ATTESTATION_TOKEN_CONTINUE,
granule, offset, size, 0, &res);
if (len)
*len = res.a1;
return res.a0;
}
#endif /* __ASM_RSI_CMDS_H */
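
Taken together, the two attestation calls form a simple loop: initialise with a 32-64 byte challenge, check the returned size bound, then pull the token one granule at a time until the RMM stops returning RSI_INCOMPLETE. A sketch under the assumptions that 'buf' is a granule-aligned physical buffer and that the same-CPU requirement is handled elsewhere; the function is illustrative, not the in-tree consumer:

#include <linux/errno.h>
#include <linux/types.h>
#include <asm/rsi_cmds.h>

static long example_get_attestation_token(const u8 *challenge, size_t challenge_len,
					   phys_addr_t buf, size_t buf_size)
{
	unsigned long status, len, offset = 0;
	long max_size;

	max_size = rsi_attestation_token_init(challenge, challenge_len);
	if (max_size < 0)
		return max_size;
	if ((unsigned long)max_size > buf_size)
		return -ENOSPC;

	do {
		unsigned long in_granule = offset & (RSI_GRANULE_SIZE - 1);

		/* Each call fills at most the remainder of one granule. */
		status = rsi_attestation_token_continue(buf + (offset - in_granule),
							in_granule,
							RSI_GRANULE_SIZE - in_granule,
							&len);
		offset += len;
	} while (status == RSI_INCOMPLETE);

	return status == RSI_SUCCESS ? (long)offset : -EIO;
}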


@ -0,0 +1,193 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 ARM Ltd.
*/
#ifndef __ASM_RSI_SMC_H_
#define __ASM_RSI_SMC_H_
#include <linux/arm-smccc.h>
/*
* This file describes the Realm Services Interface (RSI) Application Binary
* Interface (ABI) for SMC calls made from within the Realm to the RMM and
* serviced by the RMM.
*/
/*
* The major version number of the RSI implementation. This is increased when
* the binary format or semantics of the SMC calls change.
*/
#define RSI_ABI_VERSION_MAJOR UL(1)
/*
* The minor version number of the RSI implementation. This is increased when
* a bug is fixed, or a feature is added without breaking binary compatibility.
*/
#define RSI_ABI_VERSION_MINOR UL(0)
#define RSI_ABI_VERSION ((RSI_ABI_VERSION_MAJOR << 16) | \
RSI_ABI_VERSION_MINOR)
#define RSI_ABI_VERSION_GET_MAJOR(_version) ((_version) >> 16)
#define RSI_ABI_VERSION_GET_MINOR(_version) ((_version) & 0xFFFF)
#define RSI_SUCCESS UL(0)
#define RSI_ERROR_INPUT UL(1)
#define RSI_ERROR_STATE UL(2)
#define RSI_INCOMPLETE UL(3)
#define RSI_ERROR_UNKNOWN UL(4)
#define SMC_RSI_FID(n) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_STANDARD, \
n)
/*
* Returns RSI version.
*
* arg1 == Requested interface revision
* ret0 == Status / error
* ret1 == Lower implemented interface revision
* ret2 == Higher implemented interface revision
*/
#define SMC_RSI_ABI_VERSION SMC_RSI_FID(0x190)
/*
* Read feature register.
*
* arg1 == Feature register index
* ret0 == Status / error
* ret1 == Feature register value
*/
#define SMC_RSI_FEATURES SMC_RSI_FID(0x191)
/*
* Read measurement for the current Realm.
*
* arg1 == Index, which measurements slot to read
* ret0 == Status / error
* ret1 == Measurement value, bytes: 0 - 7
* ret2 == Measurement value, bytes: 8 - 15
* ret3 == Measurement value, bytes: 16 - 23
* ret4 == Measurement value, bytes: 24 - 31
* ret5 == Measurement value, bytes: 32 - 39
* ret6 == Measurement value, bytes: 40 - 47
* ret7 == Measurement value, bytes: 48 - 55
* ret8 == Measurement value, bytes: 56 - 63
*/
#define SMC_RSI_MEASUREMENT_READ SMC_RSI_FID(0x192)
/*
* Extend Realm Extensible Measurement (REM) value.
*
* arg1 == Index, which measurements slot to extend
* arg2 == Size of realm measurement in bytes, max 64 bytes
* arg3 == Measurement value, bytes: 0 - 7
* arg4 == Measurement value, bytes: 8 - 15
* arg5 == Measurement value, bytes: 16 - 23
* arg6 == Measurement value, bytes: 24 - 31
* arg7 == Measurement value, bytes: 32 - 39
* arg8 == Measurement value, bytes: 40 - 47
* arg9 == Measurement value, bytes: 48 - 55
* arg10 == Measurement value, bytes: 56 - 63
* ret0 == Status / error
*/
#define SMC_RSI_MEASUREMENT_EXTEND SMC_RSI_FID(0x193)
/*
* Initialize the operation to retrieve an attestation token.
*
* arg1 == Challenge value, bytes: 0 - 7
* arg2 == Challenge value, bytes: 8 - 15
* arg3 == Challenge value, bytes: 16 - 23
* arg4 == Challenge value, bytes: 24 - 31
* arg5 == Challenge value, bytes: 32 - 39
* arg6 == Challenge value, bytes: 40 - 47
* arg7 == Challenge value, bytes: 48 - 55
* arg8 == Challenge value, bytes: 56 - 63
* ret0 == Status / error
* ret1 == Upper bound of token size in bytes
*/
#define SMC_RSI_ATTESTATION_TOKEN_INIT SMC_RSI_FID(0x194)
/*
* Continue the operation to retrieve an attestation token.
*
* arg1 == The IPA of token buffer
* arg2 == Offset within the granule of the token buffer
* arg3 == Size of the granule buffer
* ret0 == Status / error
* ret1 == Length of token bytes copied to the granule buffer
*/
#define SMC_RSI_ATTESTATION_TOKEN_CONTINUE SMC_RSI_FID(0x195)
#ifndef __ASSEMBLY__
struct realm_config {
union {
struct {
unsigned long ipa_bits; /* Width of IPA in bits */
unsigned long hash_algo; /* Hash algorithm */
};
u8 pad[0x200];
};
union {
u8 rpv[64]; /* Realm Personalization Value */
u8 pad2[0xe00];
};
/*
* The RMM requires the configuration structure to be aligned to a 4k
* boundary, ensure this happens by aligning this structure.
*/
} __aligned(0x1000);
#endif /* __ASSEMBLY__ */
/*
* Read configuration for the current Realm.
*
* arg1 == struct realm_config addr
* ret0 == Status / error
*/
#define SMC_RSI_REALM_CONFIG SMC_RSI_FID(0x196)
/*
* Request RIPAS of a target IPA range to be changed to a specified value.
*
* arg1 == Base IPA address of target region
* arg2 == Top of the region
* arg3 == RIPAS value
* arg4 == flags
* ret0 == Status / error
* ret1 == Top of modified IPA range
* ret2 == Whether the Host accepted or rejected the request
*/
#define SMC_RSI_IPA_STATE_SET SMC_RSI_FID(0x197)
#define RSI_NO_CHANGE_DESTROYED UL(0)
#define RSI_CHANGE_DESTROYED UL(1)
#define RSI_ACCEPT UL(0)
#define RSI_REJECT UL(1)
/*
* Get RIPAS of a target IPA range.
*
* arg1 == Base IPA of target region
* arg2 == End of target IPA region
* ret0 == Status / error
* ret1 == Top of IPA region which has the reported RIPAS value
* ret2 == RIPAS value
*/
#define SMC_RSI_IPA_STATE_GET SMC_RSI_FID(0x198)
/*
* Make a Host call.
*
* arg1 == IPA of host call structure
* ret0 == Status / error
*/
#define SMC_RSI_HOST_CALL SMC_RSI_FID(0x199)
#endif /* __ASM_RSI_SMC_H_ */
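
For orientation, a guest would probe this interface roughly as follows at boot: negotiate the ABI version, then read the realm configuration to learn the IPA width. This is only a sketch of the flow implied by the definitions above; the real probing lives in the new rsi.c and differs in detail:

#include <asm/rsi_cmds.h>
#include <asm/rsi_smc.h>

static struct realm_config example_config;	/* 4k-aligned by its type */

static bool example_rsi_probe(void)
{
	unsigned long ret, lo, hi;

	ret = rsi_request_version(RSI_ABI_VERSION, &lo, &hi);
	if (ret != RSI_SUCCESS ||
	    RSI_ABI_VERSION_GET_MAJOR(lo) != RSI_ABI_VERSION_MAJOR)
		return false;

	if (rsi_get_realm_config(&example_config) != RSI_SUCCESS)
		return false;

	/* example_config.ipa_bits now holds the realm's IPA width. */
	return true;
}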


@ -46,8 +46,14 @@ static inline void dynamic_scs_init(void)
static inline void dynamic_scs_init(void) {}
#endif
enum {
EDYNSCS_INVALID_CIE_HEADER = 1,
EDYNSCS_INVALID_CIE_SDATA_SIZE = 2,
EDYNSCS_INVALID_FDE_AUGM_DATA_SIZE = 3,
EDYNSCS_INVALID_CFA_OPCODE = 4,
};
int __pi_scs_patch(const u8 eh_frame[], int size);
asmlinkage void __pi_scs_patch_vmlinux(void);
#endif /* __ASSEMBLY __ */


@ -15,4 +15,7 @@ int set_direct_map_invalid_noflush(struct page *page);
int set_direct_map_default_noflush(struct page *page);
bool kernel_page_present(struct page *page);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
#endif /* _ASM_ARM64_SET_MEMORY_H */


@ -60,13 +60,27 @@ static inline void unwind_init_common(struct unwind_state *state)
state->stack = stackinfo_get_unknown();
}
static struct stack_info *unwind_find_next_stack(const struct unwind_state *state,
unsigned long sp,
unsigned long size)
/**
* unwind_find_stack() - Find the accessible stack which entirely contains an
* object.
*
* @state: the current unwind state.
* @sp: the base address of the object.
* @size: the size of the object.
*
* Return: a pointer to the relevant stack_info if found; NULL otherwise.
*/
static struct stack_info *unwind_find_stack(struct unwind_state *state,
unsigned long sp,
unsigned long size)
{
for (int i = 0; i < state->nr_stacks; i++) {
struct stack_info *info = &state->stacks[i];
struct stack_info *info = &state->stack;
if (stackinfo_on_stack(info, sp, size))
return info;
for (int i = 0; i < state->nr_stacks; i++) {
info = &state->stacks[i];
if (stackinfo_on_stack(info, sp, size))
return info;
}
@ -75,36 +89,31 @@ static struct stack_info *unwind_find_next_stack(const struct unwind_state *stat
}
/**
* unwind_consume_stack() - Check if an object is on an accessible stack,
* updating stack boundaries so that future unwind steps cannot consume this
* object again.
* unwind_consume_stack() - Update stack boundaries so that future unwind steps
* cannot consume this object again.
*
* @state: the current unwind state.
* @info: the stack_info of the stack containing the object.
* @sp: the base address of the object.
* @size: the size of the object.
*
* Return: 0 upon success, an error code otherwise.
*/
static inline int unwind_consume_stack(struct unwind_state *state,
unsigned long sp,
unsigned long size)
static inline void unwind_consume_stack(struct unwind_state *state,
struct stack_info *info,
unsigned long sp,
unsigned long size)
{
struct stack_info *next;
if (stackinfo_on_stack(&state->stack, sp, size))
goto found;
next = unwind_find_next_stack(state, sp, size);
if (!next)
return -EINVAL;
struct stack_info tmp;
/*
* Stack transitions are strictly one-way, and once we've
* transitioned from one stack to another, it's never valid to
* unwind back to the old stack.
*
* Remove the current stack from the list of stacks so that it cannot
* be found on a subsequent transition.
* Destroy the old stack info so that it cannot be found upon a
* subsequent transition. If the stack has not changed, we'll
* immediately restore the current stack info.
*
* Note that stacks can nest in several valid orders, e.g.
*
@ -115,16 +124,15 @@ static inline int unwind_consume_stack(struct unwind_state *state,
* ... so we do not check the specific order of stack
* transitions.
*/
state->stack = *next;
*next = stackinfo_get_unknown();
tmp = *info;
*info = stackinfo_get_unknown();
state->stack = tmp;
found:
/*
* Future unwind steps can only consume stack above this frame record.
* Update the current stack to start immediately above it.
*/
state->stack.low = sp + size;
return 0;
}
/**
@ -137,21 +145,25 @@ static inline int unwind_consume_stack(struct unwind_state *state,
static inline int
unwind_next_frame_record(struct unwind_state *state)
{
struct stack_info *info;
struct frame_record *record;
unsigned long fp = state->fp;
int err;
if (fp & 0x7)
return -EINVAL;
err = unwind_consume_stack(state, fp, 16);
if (err)
return err;
info = unwind_find_stack(state, fp, sizeof(*record));
if (!info)
return -EINVAL;
unwind_consume_stack(state, info, fp, sizeof(*record));
/*
* Record this frame record's values.
*/
state->fp = READ_ONCE(*(unsigned long *)(fp));
state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
record = (struct frame_record *)fp;
state->fp = READ_ONCE(record->fp);
state->pc = READ_ONCE(record->lr);
return 0;
}
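
The driver loop that sits on top of unwind_next_frame_record() is outside this hunk; conceptually it is just the following, with the real callers layering their own per-frame filtering on top (illustrative sketch only):

/* Illustrative only: walk frame records until the final frame or an error. */
static void example_walk_stack(struct unwind_state *state,
			       bool (*consume)(void *cookie, unsigned long pc),
			       void *cookie)
{
	while (consume(cookie, state->pc)) {
		if (unwind_next_frame_record(state))
			break;
	}
}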


@ -0,0 +1,48 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef __ASM_STACKTRACE_FRAME_H
#define __ASM_STACKTRACE_FRAME_H
/*
* - FRAME_META_TYPE_NONE
*
* This value is reserved.
*
* - FRAME_META_TYPE_FINAL
*
* The record is the last entry on the stack.
* Unwinding should terminate successfully.
*
* - FRAME_META_TYPE_PT_REGS
*
* The record is embedded within a struct pt_regs, recording the registers at
* an arbitrary point in time.
* Unwinding should consume pt_regs::pc, followed by pt_regs::lr.
*
* Note: all other values are reserved and should result in unwinding
* terminating with an error.
*/
#define FRAME_META_TYPE_NONE 0
#define FRAME_META_TYPE_FINAL 1
#define FRAME_META_TYPE_PT_REGS 2
#ifndef __ASSEMBLY__
/*
* A standard AAPCS64 frame record.
*/
struct frame_record {
u64 fp;
u64 lr;
};
/*
* A metadata frame record indicating a special unwind.
* The record::{fp,lr} fields must be zero to indicate the presence of
* metadata.
*/
struct frame_record_meta {
struct frame_record record;
u64 type;
};
#endif /* __ASSEMBLY */
#endif /* __ASM_STACKTRACE_FRAME_H */
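
A consumer of these records would behave roughly as follows: a non-zero fp/lr pair is a normal frame, and only an all-zero pair makes the type field meaningful. This is a sketch of the convention documented above, not the actual unwinder code:

#include <linux/errno.h>
#include <asm/stacktrace/frame.h>

static int example_classify_frame(const struct frame_record_meta *meta)
{
	/* A non-zero fp/lr pair is an ordinary AAPCS64 frame record. */
	if (meta->record.fp || meta->record.lr)
		return 1;

	switch (meta->type) {
	case FRAME_META_TYPE_FINAL:
		return 0;	/* last record: terminate successfully */
	case FRAME_META_TYPE_PT_REGS:
		return 2;	/* exception boundary: consume pt_regs::pc, then ::lr */
	default:
		return -EINVAL;	/* reserved value: terminate with an error */
	}
}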


@ -1101,6 +1101,26 @@
/* Initial value for Permission Overlay Extension for EL0 */
#define POR_EL0_INIT POE_RXW
/*
* Definitions for Guarded Control Stack
*/
#define GCS_CAP_ADDR_MASK GENMASK(63, 12)
#define GCS_CAP_ADDR_SHIFT 12
#define GCS_CAP_ADDR_WIDTH 52
#define GCS_CAP_ADDR(x) FIELD_GET(GCS_CAP_ADDR_MASK, x)
#define GCS_CAP_TOKEN_MASK GENMASK(11, 0)
#define GCS_CAP_TOKEN_SHIFT 0
#define GCS_CAP_TOKEN_WIDTH 12
#define GCS_CAP_TOKEN(x) FIELD_GET(GCS_CAP_TOKEN_MASK, x)
#define GCS_CAP_VALID_TOKEN 0x1
#define GCS_CAP_IN_PROGRESS_TOKEN 0x5
#define GCS_CAP(x) ((((unsigned long)x) & GCS_CAP_ADDR_MASK) | \
GCS_CAP_VALID_TOKEN)
#define ARM64_FEATURE_FIELD_BITS 4
/* Defined for compatibility only, do not add new users. */
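
The cap encoding above can be illustrated with a tiny helper: a valid cap token stores bits [63:12] of the cap slot's own address plus the 'valid' marker in the low 12 bits. The helpers below are purely illustrative and not part of this series:

#include <linux/bitfield.h>
#include <linux/types.h>

static inline u64 example_gcs_make_cap(unsigned long addr)
{
	return GCS_CAP(addr);
}

static inline bool example_gcs_cap_valid(u64 cap, unsigned long addr)
{
	return GCS_CAP_ADDR(cap) == (addr >> GCS_CAP_ADDR_SHIFT) &&
	       GCS_CAP_TOKEN(cap) == GCS_CAP_VALID_TOKEN;
}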


@ -431,6 +431,23 @@ do { \
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
static inline bool __flush_tlb_range_limit_excess(unsigned long start,
unsigned long end, unsigned long pages, unsigned long stride)
{
/*
* When the system does not support TLB range based flush
* operation, (MAX_DVM_OPS - 1) pages can be handled. But
* with TLB range based operation, MAX_TLBI_RANGE_PAGES
* pages can be handled.
*/
if ((!system_supports_tlb_range() &&
(end - start) >= (MAX_DVM_OPS * stride)) ||
pages > MAX_TLBI_RANGE_PAGES)
return true;
return false;
}
static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
unsigned long stride, bool last_level,
@ -442,15 +459,7 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
end = round_up(end, stride);
pages = (end - start) >> PAGE_SHIFT;
/*
* When not uses TLB range ops, we can handle up to
* (MAX_DVM_OPS - 1) pages;
* When uses TLB range ops, we can handle up to
* MAX_TLBI_RANGE_PAGES pages.
*/
if ((!system_supports_tlb_range() &&
(end - start) >= (MAX_DVM_OPS * stride)) ||
pages > MAX_TLBI_RANGE_PAGES) {
if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
flush_tlb_mm(vma->vm_mm);
return;
}
@ -492,19 +501,21 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
unsigned long addr;
const unsigned long stride = PAGE_SIZE;
unsigned long pages;
if ((end - start) > (MAX_DVM_OPS * PAGE_SIZE)) {
start = round_down(start, stride);
end = round_up(end, stride);
pages = (end - start) >> PAGE_SHIFT;
if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
flush_tlb_all();
return;
}
start = __TLBI_VADDR(start, 0);
end = __TLBI_VADDR(end, 0);
dsb(ishst);
for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
__tlbi(vaale1is, addr);
__flush_tlb_range_op(vaale1is, start, pages, stride, 0,
TLBI_TTL_UNKNOWN, false, lpa2_is_enabled());
dsb(ish);
isb();
}


@ -502,4 +502,44 @@ static inline size_t probe_subpage_writeable(const char __user *uaddr,
#endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
#ifdef CONFIG_ARM64_GCS
static inline int gcssttr(unsigned long __user *addr, unsigned long val)
{
register unsigned long __user *_addr __asm__ ("x0") = addr;
register unsigned long _val __asm__ ("x1") = val;
int err = 0;
/* GCSSTTR x1, x0 */
asm volatile(
"1: .inst 0xd91f1c01\n"
"2: \n"
_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)
: "+r" (err)
: "rZ" (_val), "r" (_addr)
: "memory");
return err;
}
static inline void put_user_gcs(unsigned long val, unsigned long __user *addr,
int *err)
{
int ret;
if (!access_ok((char __user *)addr, sizeof(u64))) {
*err = -EFAULT;
return;
}
uaccess_ttbr0_enable();
ret = gcssttr(addr, val);
if (ret != 0)
*err = ret;
uaccess_ttbr0_disable();
}
#endif /* CONFIG_ARM64_GCS */
#endif /* __ASM_UACCESS_H */
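
put_user_gcs() exists because GCS pages only accept GCS-flavoured stores, so a plain put_user() would fault. A minimal usage sketch; the call site and value are made up for illustration:

/* Illustrative only: store one value into a user shadow-stack slot. */
static int example_write_user_gcs(unsigned long __user *slot, unsigned long val)
{
	int err = 0;

	put_user_gcs(val, slot, &err);

	return err;
}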


@ -21,7 +21,7 @@
* HWCAP flags - for AT_HWCAP
*
* Bits 62 and 63 are reserved for use by libc.
* Bits 32-61 are unallocated for potential use by libc.
* Bits 33-61 are unallocated for potential use by libc.
*/
#define HWCAP_FP (1 << 0)
#define HWCAP_ASIMD (1 << 1)
@ -55,6 +55,7 @@
#define HWCAP_SB (1 << 29)
#define HWCAP_PACA (1 << 30)
#define HWCAP_PACG (1UL << 31)
#define HWCAP_GCS (1UL << 32)
/*
* HWCAP2 flags - for AT_HWCAP2
@ -124,4 +125,8 @@
#define HWCAP2_SME_SF8DP2 (1UL << 62)
#define HWCAP2_POE (1UL << 63)
/*
* HWCAP3 flags - for AT_HWCAP3
*/
#endif /* _UAPI__ASM_HWCAP_H */
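
From userspace the new bit is probed like any other hwcap, and AT_HWCAP3 is fetched with a third getauxval() key once the libc headers expose it. A small user-side sketch, guarding the constants in case the toolchain headers predate them:

#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP_GCS
#define HWCAP_GCS	(1UL << 32)
#endif

int main(void)
{
	if (getauxval(AT_HWCAP) & HWCAP_GCS)
		printf("GCS supported\n");

#ifdef AT_HWCAP3
	/* No HWCAP3 bits are allocated yet, so this currently reads as 0. */
	printf("AT_HWCAP3 = %#lx\n", getauxval(AT_HWCAP3));
#endif
	return 0;
}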


@ -324,6 +324,14 @@ struct user_za_header {
#define ZA_PT_SIZE(vq) \
(ZA_PT_ZA_OFFSET + ZA_PT_ZA_SIZE(vq))
/* GCS state (NT_ARM_GCS) */
struct user_gcs {
__u64 features_enabled;
__u64 features_locked;
__u64 gcspr_el0;
};
#endif /* __ASSEMBLY__ */
#endif /* _UAPI__ASM_PTRACE_H */
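
A debugger reads this state through the new NT_ARM_GCS regset. A minimal sketch, assuming a stopped tracee; the NT_ARM_GCS fallback value below is an assumption and should really come from <linux/elf.h>:

#include <elf.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <asm/ptrace.h>

#ifndef NT_ARM_GCS
#define NT_ARM_GCS	0x410	/* assumed value */
#endif

static void example_dump_gcs(pid_t pid)
{
	struct user_gcs gcs;
	struct iovec iov = { .iov_base = &gcs, .iov_len = sizeof(gcs) };

	if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_GCS, &iov) == 0)
		printf("gcspr_el0=%#llx enabled=%#llx locked=%#llx\n",
		       (unsigned long long)gcs.gcspr_el0,
		       (unsigned long long)gcs.features_enabled,
		       (unsigned long long)gcs.features_locked);
}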


@ -183,6 +183,15 @@ struct zt_context {
__u16 __reserved[3];
};
#define GCS_MAGIC 0x47435300
struct gcs_context {
struct _aarch64_ctx head;
__u64 gcspr;
__u64 features_enabled;
__u64 reserved;
};
#endif /* !__ASSEMBLY__ */
#include <asm/sve_context.h>
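
A signal handler finds this record by walking the __reserved area of the sigcontext like any other extension block and matching on GCS_MAGIC. Sketch only; it assumes updated UAPI headers and omits the usual extra_context and bounds checks:

#include <stddef.h>
#include <asm/sigcontext.h>

static struct gcs_context *example_find_gcs_context(struct sigcontext *sc)
{
	struct _aarch64_ctx *head = (struct _aarch64_ctx *)sc->__reserved;

	while (head->magic) {
		if (head->magic == GCS_MAGIC)
			return (struct gcs_context *)head;
		head = (struct _aarch64_ctx *)((char *)head + head->size);
	}
	return NULL;
}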


@ -33,7 +33,8 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
return_address.o cpuinfo.o cpu_errata.o \
cpufeature.o alternative.o cacheinfo.o \
smp.o smp_spin_table.o topology.o smccc-call.o \
syscall.o proton-pack.o idle.o patching.o pi/
syscall.o proton-pack.o idle.o patching.o pi/ \
rsi.o
obj-$(CONFIG_COMPAT) += sys32.o signal32.o \
sys_compat.o


@ -12,15 +12,12 @@
#include <linux/ftrace.h>
#include <linux/kexec.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
#include <linux/kvm_host.h>
#include <linux/preempt.h>
#include <linux/suspend.h>
#include <asm/cpufeature.h>
#include <asm/fixmap.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/signal32.h>
#include <asm/smp_plat.h>
#include <asm/suspend.h>
#include <linux/kbuild.h>
@ -28,8 +25,6 @@
int main(void)
{
DEFINE(TSK_ACTIVE_MM, offsetof(struct task_struct, active_mm));
BLANK();
DEFINE(TSK_TI_CPU, offsetof(struct task_struct, thread_info.cpu));
DEFINE(TSK_TI_FLAGS, offsetof(struct task_struct, thread_info.flags));
DEFINE(TSK_TI_PREEMPT, offsetof(struct task_struct, thread_info.preempt_count));
@ -79,8 +74,9 @@ int main(void)
DEFINE(S_PSTATE, offsetof(struct pt_regs, pstate));
DEFINE(S_SYSCALLNO, offsetof(struct pt_regs, syscallno));
DEFINE(S_SDEI_TTBR1, offsetof(struct pt_regs, sdei_ttbr1));
DEFINE(S_PMR_SAVE, offsetof(struct pt_regs, pmr_save));
DEFINE(S_PMR, offsetof(struct pt_regs, pmr));
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
DEFINE(S_STACKFRAME_TYPE, offsetof(struct pt_regs, stackframe.type));
DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs));
BLANK();
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
@ -99,25 +95,6 @@ int main(void)
DEFINE(FREGS_SIZE, sizeof(struct ftrace_regs));
BLANK();
#endif
#ifdef CONFIG_COMPAT
DEFINE(COMPAT_SIGFRAME_REGS_OFFSET, offsetof(struct compat_sigframe, uc.uc_mcontext.arm_r0));
DEFINE(COMPAT_RT_SIGFRAME_REGS_OFFSET, offsetof(struct compat_rt_sigframe, sig.uc.uc_mcontext.arm_r0));
BLANK();
#endif
DEFINE(MM_CONTEXT_ID, offsetof(struct mm_struct, context.id.counter));
BLANK();
DEFINE(VMA_VM_MM, offsetof(struct vm_area_struct, vm_mm));
DEFINE(VMA_VM_FLAGS, offsetof(struct vm_area_struct, vm_flags));
BLANK();
DEFINE(VM_EXEC, VM_EXEC);
BLANK();
DEFINE(PAGE_SZ, PAGE_SIZE);
BLANK();
DEFINE(DMA_TO_DEVICE, DMA_TO_DEVICE);
DEFINE(DMA_FROM_DEVICE, DMA_FROM_DEVICE);
BLANK();
DEFINE(PREEMPT_DISABLE_OFFSET, PREEMPT_DISABLE_OFFSET);
BLANK();
DEFINE(CPU_BOOT_TASK, offsetof(struct secondary_data, task));
BLANK();
DEFINE(FTR_OVR_VAL_OFFSET, offsetof(struct arm64_ftr_override, val));


@ -103,6 +103,7 @@ static DECLARE_BITMAP(elf_hwcap, MAX_CPU_FEATURES) __read_mostly;
COMPAT_HWCAP_LPAE)
unsigned int compat_elf_hwcap __read_mostly = COMPAT_ELF_HWCAP_DEFAULT;
unsigned int compat_elf_hwcap2 __read_mostly;
unsigned int compat_elf_hwcap3 __read_mostly;
#endif
DECLARE_BITMAP(system_cpucaps, ARM64_NCAPS);
@ -228,6 +229,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_BF16_SHIFT, 4, 0),
@ -291,6 +293,8 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_GCS),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_GCS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_SME_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_EL1_MPAM_frac_SHIFT, 4, 0),
@ -2358,6 +2362,14 @@ static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
}
#endif
#ifdef CONFIG_ARM64_GCS
static void cpu_enable_gcs(const struct arm64_cpu_capabilities *__unused)
{
/* GCSPR_EL0 is always readable */
write_sysreg_s(GCSCRE0_EL1_nTR, SYS_GCSCRE0_EL1);
}
#endif
/* Internal helper functions to match cpu capability type */
static bool
cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@ -2590,6 +2602,21 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.cpus = &dbm_cpus,
ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
},
#endif
#ifdef CONFIG_ARM64_HAFT
{
.desc = "Hardware managed Access Flag for Table Descriptors",
/*
* Contrary to the page/block access flag, the table access flag
* cannot be emulated in software (no access fault will occur).
* Therefore this should be used only if it's supported system
* wide.
*/
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.capability = ARM64_HAFT,
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
},
#endif
{
.desc = "CRC32 instructions",
@ -2889,6 +2916,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.cpu_enable = cpu_enable_poe,
ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
},
#endif
#ifdef CONFIG_ARM64_GCS
{
.desc = "Guarded Control Stack (GCS)",
.capability = ARM64_HAS_GCS,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.cpu_enable = cpu_enable_gcs,
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, GCS, IMP)
},
#endif
{},
};
@ -3005,6 +3042,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
HWCAP_CAP(ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
HWCAP_CAP(ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
#endif
#ifdef CONFIG_ARM64_GCS
HWCAP_CAP(ID_AA64PFR1_EL1, GCS, IMP, CAP_HWCAP, KERNEL_HWCAP_GCS),
#endif
HWCAP_CAP(ID_AA64PFR1_EL1, SSBS, SSBS2, CAP_HWCAP, KERNEL_HWCAP_SSBS),
#ifdef CONFIG_ARM64_BTI
@ -3499,6 +3539,11 @@ unsigned long cpu_get_elf_hwcap2(void)
return elf_hwcap[1];
}
unsigned long cpu_get_elf_hwcap3(void)
{
return elf_hwcap[2];
}
static void __init setup_boot_cpu_capabilities(void)
{
/*


@ -80,6 +80,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SB] = "sb",
[KERNEL_HWCAP_PACA] = "paca",
[KERNEL_HWCAP_PACG] = "pacg",
[KERNEL_HWCAP_GCS] = "gcs",
[KERNEL_HWCAP_DCPODP] = "dcpodp",
[KERNEL_HWCAP_SVE2] = "sve2",
[KERNEL_HWCAP_SVEAES] = "sveaes",


@ -303,7 +303,6 @@ static int call_break_hook(struct pt_regs *regs, unsigned long esr)
{
struct break_hook *hook;
struct list_head *list;
int (*fn)(struct pt_regs *regs, unsigned long esr) = NULL;
list = user_mode(regs) ? &user_break_hook : &kernel_break_hook;
@ -313,10 +312,10 @@ static int call_break_hook(struct pt_regs *regs, unsigned long esr)
*/
list_for_each_entry_rcu(hook, list, node) {
if ((esr_brk_comment(esr) & ~hook->mask) == hook->imm)
fn = hook->fn;
return hook->fn(regs, esr);
}
return fn ? fn(regs, esr) : DBG_HOOK_ERROR;
return DBG_HOOK_ERROR;
}
NOKPROBE_SYMBOL(call_break_hook);
@ -441,6 +440,11 @@ void kernel_rewind_single_step(struct pt_regs *regs)
set_regs_spsr_ss(regs);
}
void kernel_fastforward_single_step(struct pt_regs *regs)
{
clear_regs_spsr_ss(regs);
}
/* ptrace API */
void user_enable_single_step(struct task_struct *task)
{


@ -34,8 +34,16 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
u64 attr = md->attribute;
u32 type = md->type;
if (type == EFI_MEMORY_MAPPED_IO)
return PROT_DEVICE_nGnRE;
if (type == EFI_MEMORY_MAPPED_IO) {
pgprot_t prot = __pgprot(PROT_DEVICE_nGnRE);
if (arm64_is_protected_mmio(md->phys_addr,
md->num_pages << EFI_PAGE_SHIFT))
prot = pgprot_encrypted(prot);
else
prot = pgprot_decrypted(prot);
return pgprot_val(prot);
}
if (region_is_misaligned(md)) {
static bool __initdata code_is_misaligned;


@ -463,6 +463,24 @@ static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr)
exit_to_kernel_mode(regs);
}
static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr)
{
enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_gcs(regs, esr);
local_daif_mask();
exit_to_kernel_mode(regs);
}
static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr)
{
enter_from_kernel_mode(regs);
local_daif_inherit(regs);
do_el1_mops(regs, esr);
local_daif_mask();
exit_to_kernel_mode(regs);
}
static void noinstr el1_dbg(struct pt_regs *regs, unsigned long esr)
{
unsigned long far = read_sysreg(far_el1);
@ -505,6 +523,12 @@ asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_BTI:
el1_bti(regs, esr);
break;
case ESR_ELx_EC_GCS:
el1_gcs(regs, esr);
break;
case ESR_ELx_EC_MOPS:
el1_mops(regs, esr);
break;
case ESR_ELx_EC_BREAKPT_CUR:
case ESR_ELx_EC_SOFTSTP_CUR:
case ESR_ELx_EC_WATCHPT_CUR:
@ -684,6 +708,14 @@ static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
exit_to_user_mode(regs);
}
static void noinstr el0_gcs(struct pt_regs *regs, unsigned long esr)
{
enter_from_user_mode(regs);
local_daif_restore(DAIF_PROCCTX);
do_el0_gcs(regs, esr);
exit_to_user_mode(regs);
}
static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
{
enter_from_user_mode(regs);
@ -766,6 +798,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_MOPS:
el0_mops(regs, esr);
break;
case ESR_ELx_EC_GCS:
el0_gcs(regs, esr);
break;
case ESR_ELx_EC_BREAKPT_LOW:
case ESR_ELx_EC_SOFTSTP_LOW:
case ESR_ELx_EC_WATCHPT_LOW:


@ -25,6 +25,7 @@
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/scs.h>
#include <asm/stacktrace/frame.h>
#include <asm/thread_info.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
@ -284,15 +285,16 @@ alternative_else_nop_endif
stp lr, x21, [sp, #S_LR]
/*
* For exceptions from EL0, create a final frame record.
* For exceptions from EL1, create a synthetic frame record so the
* interrupted code shows up in the backtrace.
* Create a metadata frame record. The unwinder will use this to
* identify and unwind exception boundaries.
*/
.if \el == 0
stp xzr, xzr, [sp, #S_STACKFRAME]
.if \el == 0
mov x0, #FRAME_META_TYPE_FINAL
.else
stp x29, x22, [sp, #S_STACKFRAME]
mov x0, #FRAME_META_TYPE_PT_REGS
.endif
str x0, [sp, #S_STACKFRAME_TYPE]
add x29, sp, #S_STACKFRAME
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
@ -315,7 +317,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
alternative_else_nop_endif
mrs_s x20, SYS_ICC_PMR_EL1
str x20, [sp, #S_PMR_SAVE]
str w20, [sp, #S_PMR]
mov x20, #GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET
msr_s SYS_ICC_PMR_EL1, x20
@ -342,7 +344,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
b .Lskip_pmr_restore\@
alternative_else_nop_endif
ldr x20, [sp, #S_PMR_SAVE]
ldr w20, [sp, #S_PMR]
msr_s SYS_ICC_PMR_EL1, x20
/* Ensure priority change is seen by redistributor */


@ -386,7 +386,7 @@ static void task_fpsimd_load(void)
* fpsimd_save_user_state() or memory corruption, we
* should always record an explicit format
* when we save. We always at least have the
* memory allocated for FPSMID registers so
* memory allocated for FPSIMD registers so
* try that and hope for the best.
*/
WARN_ON_ONCE(1);


@ -32,6 +32,7 @@
#include <asm/scs.h>
#include <asm/smp.h>
#include <asm/sysreg.h>
#include <asm/stacktrace/frame.h>
#include <asm/thread_info.h>
#include <asm/virt.h>
@ -199,6 +200,8 @@ SYM_CODE_END(preserve_boot_args)
sub sp, sp, #PT_REGS_SIZE
stp xzr, xzr, [sp, #S_STACKFRAME]
mov \tmp1, #FRAME_META_TYPE_FINAL
str \tmp1, [sp, #S_STACKFRAME_TYPE]
add x29, sp, #S_STACKFRAME
scs_load_current


@ -266,9 +266,15 @@ static int swsusp_mte_save_tags(void)
max_zone_pfn = zone_end_pfn(zone);
for (pfn = zone->zone_start_pfn; pfn < max_zone_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
struct folio *folio;
if (!page)
continue;
folio = page_folio(page);
if (folio_test_hugetlb(folio) &&
!folio_test_hugetlb_mte_tagged(folio))
continue;
if (!page_mte_tagged(page))
continue;


@ -462,14 +462,20 @@ int module_finalize(const Elf_Ehdr *hdr,
struct module *me)
{
const Elf_Shdr *s;
int ret;
s = find_section(hdr, sechdrs, ".altinstructions");
if (s)
apply_alternatives_module((void *)s->sh_addr, s->sh_size);
if (scs_is_dynamic()) {
s = find_section(hdr, sechdrs, ".init.eh_frame");
if (s)
__pi_scs_patch((void *)s->sh_addr, s->sh_size);
if (s) {
ret = __pi_scs_patch((void *)s->sh_addr, s->sh_size);
if (ret)
pr_err("module %s: error occurred during dynamic SCS patching (%d)\n",
me->name, ret);
}
}
return module_init_ftrace_plt(hdr, sechdrs, me);


@ -38,7 +38,24 @@ EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode);
void mte_sync_tags(pte_t pte, unsigned int nr_pages)
{
struct page *page = pte_page(pte);
unsigned int i;
struct folio *folio = page_folio(page);
unsigned long i;
if (folio_test_hugetlb(folio)) {
unsigned long nr = folio_nr_pages(folio);
/* Hugetlb MTE flags are set for head page only */
if (folio_try_hugetlb_mte_tagging(folio)) {
for (i = 0; i < nr; i++, page++)
mte_clear_page_tags(page_address(page));
folio_set_hugetlb_mte_tagged(folio);
}
/* ensure the tags are visible before the PTE is set */
smp_wmb();
return;
}
/* if PG_mte_tagged is set, tags have already been initialised */
for (i = 0; i < nr_pages; i++, page++) {
@ -410,6 +427,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
void *maddr;
struct page *page = get_user_page_vma_remote(mm, addr,
gup_flags, &vma);
struct folio *folio;
if (IS_ERR(page)) {
err = PTR_ERR(page);
@ -428,7 +446,12 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
put_page(page);
break;
}
WARN_ON_ONCE(!page_mte_tagged(page));
folio = page_folio(page);
if (folio_test_hugetlb(folio))
WARN_ON_ONCE(!folio_test_hugetlb_mte_tagged(folio));
else
WARN_ON_ONCE(!page_mte_tagged(page));
/* limit access to the end of the page */
offset = offset_in_page(addr);


@ -38,6 +38,15 @@ struct ftr_set_desc {
#define FIELD(n, s, f) { .name = n, .shift = s, .width = 4, .filter = f }
static const struct ftr_set_desc mmfr0 __prel64_initconst = {
.name = "id_aa64mmfr0",
.override = &id_aa64mmfr0_override,
.fields = {
FIELD("ecv", ID_AA64MMFR0_EL1_ECV_SHIFT, NULL),
{}
},
};
static bool __init mmfr1_vh_filter(u64 val)
{
/*
@ -133,6 +142,7 @@ static const struct ftr_set_desc pfr1 __prel64_initconst = {
.override = &id_aa64pfr1_override,
.fields = {
FIELD("bt", ID_AA64PFR1_EL1_BT_SHIFT, NULL ),
FIELD("gcs", ID_AA64PFR1_EL1_GCS_SHIFT, NULL),
FIELD("mte", ID_AA64PFR1_EL1_MTE_SHIFT, NULL),
FIELD("sme", ID_AA64PFR1_EL1_SME_SHIFT, pfr1_sme_filter),
{}
@ -196,6 +206,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
static const
PREL64(const struct ftr_set_desc, reg) regs[] __prel64_initconst = {
{ &mmfr0 },
{ &mmfr1 },
{ &mmfr2 },
{ &pfr0 },
@ -215,6 +226,7 @@ static const struct {
{ "arm64.nosve", "id_aa64pfr0.sve=0" },
{ "arm64.nosme", "id_aa64pfr1.sme=0" },
{ "arm64.nobti", "id_aa64pfr1.bt=0" },
{ "arm64.nogcs", "id_aa64pfr1.gcs=0" },
{ "arm64.nopauth",
"id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 "
"id_aa64isar1.api=0 id_aa64isar1.apa=0 "
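
For reference, the aliases above map onto the usual ID-register override syntax, so the new knobs are exercised from the kernel command line like this (illustrative boot parameters):

	arm64.nogcs            (equivalent to id_aa64pfr1.gcs=0, hides GCS)
	id_aa64mmfr0.ecv=0     (hides FEAT_ECV when firmware leaves SCTLR_EL3.ECVEn clear)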


@ -30,7 +30,7 @@ void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
int level, pte_t *tbl, bool may_use_cont, u64 va_offset)
{
u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 : U64_MAX;
u64 protval = pgprot_val(prot) & ~PTE_TYPE_MASK;
pteval_t protval = pgprot_val(prot) & ~PTE_TYPE_MASK;
int lshift = (3 - level) * (PAGE_SHIFT - 3);
u64 lmask = (PAGE_SIZE << lshift) - 1;


@ -50,6 +50,10 @@ bool dynamic_scs_is_enabled;
#define DW_CFA_GNU_negative_offset_extended 0x2f
#define DW_CFA_hi_user 0x3f
#define DW_EH_PE_sdata4 0x0b
#define DW_EH_PE_sdata8 0x0c
#define DW_EH_PE_pcrel 0x10
enum {
PACIASP = 0xd503233f,
AUTIASP = 0xd50323bf,
@ -120,7 +124,12 @@ struct eh_frame {
union {
struct { // CIE
u8 version;
u8 augmentation_string[];
u8 augmentation_string[3];
u8 code_alignment_factor;
u8 data_alignment_factor;
u8 return_address_register;
u8 augmentation_data_size;
u8 fde_pointer_format;
};
struct { // FDE
@ -128,30 +137,39 @@ struct eh_frame {
s32 range;
u8 opcodes[];
};
struct { // FDE
s64 initial_loc64;
s64 range64;
u8 opcodes64[];
};
};
};
static int scs_handle_fde_frame(const struct eh_frame *frame,
bool fde_has_augmentation_data,
int code_alignment_factor,
bool use_sdata8,
bool dry_run)
{
int size = frame->size - offsetof(struct eh_frame, opcodes) + 4;
u64 loc = (u64)offset_to_ptr(&frame->initial_loc);
const u8 *opcode = frame->opcodes;
int l;
if (fde_has_augmentation_data) {
int l;
// assume single byte uleb128_t
if (WARN_ON(*opcode & BIT(7)))
return -ENOEXEC;
l = *opcode++;
opcode += l;
size -= l + 1;
if (use_sdata8) {
loc = (u64)&frame->initial_loc64 + frame->initial_loc64;
opcode = frame->opcodes64;
size -= 8;
}
// assume single byte uleb128_t for augmentation data size
if (*opcode & BIT(7))
return EDYNSCS_INVALID_FDE_AUGM_DATA_SIZE;
l = *opcode++;
opcode += l;
size -= l + 1;
/*
* Starting from 'loc', apply the CFA opcodes that advance the location
* pointer, and identify the locations of the PAC instructions.
@ -201,7 +219,7 @@ static int scs_handle_fde_frame(const struct eh_frame *frame,
break;
default:
return -ENOEXEC;
return EDYNSCS_INVALID_CFA_OPCODE;
}
}
return 0;
@ -209,12 +227,12 @@ static int scs_handle_fde_frame(const struct eh_frame *frame,
int scs_patch(const u8 eh_frame[], int size)
{
int code_alignment_factor = 1;
bool fde_use_sdata8 = false;
const u8 *p = eh_frame;
while (size > 4) {
const struct eh_frame *frame = (const void *)p;
bool fde_has_augmentation_data = true;
int code_alignment_factor = 1;
int ret;
if (frame->size == 0 ||
@ -223,28 +241,47 @@ int scs_patch(const u8 eh_frame[], int size)
break;
if (frame->cie_id_or_pointer == 0) {
const u8 *p = frame->augmentation_string;
/* a 'z' in the augmentation string must come first */
fde_has_augmentation_data = *p == 'z';
/*
* Require presence of augmentation data (z) with a
* specifier for the size of the FDE initial_loc and
* range fields (R), and nothing else.
*/
if (strcmp(frame->augmentation_string, "zR"))
return EDYNSCS_INVALID_CIE_HEADER;
/*
* The code alignment factor is a uleb128 encoded field
* but given that the only sensible values are 1 or 4,
* there is no point in decoding the whole thing.
* there is no point in decoding the whole thing. Also
* sanity check the size of the data alignment factor
* field, and the values of the return address register
* and augmentation data size fields.
*/
p += strlen(p) + 1;
if (!WARN_ON(*p & BIT(7)))
code_alignment_factor = *p;
if ((frame->code_alignment_factor & BIT(7)) ||
(frame->data_alignment_factor & BIT(7)) ||
frame->return_address_register != 30 ||
frame->augmentation_data_size != 1)
return EDYNSCS_INVALID_CIE_HEADER;
code_alignment_factor = frame->code_alignment_factor;
switch (frame->fde_pointer_format) {
case DW_EH_PE_pcrel | DW_EH_PE_sdata4:
fde_use_sdata8 = false;
break;
case DW_EH_PE_pcrel | DW_EH_PE_sdata8:
fde_use_sdata8 = true;
break;
default:
return EDYNSCS_INVALID_CIE_SDATA_SIZE;
}
} else {
ret = scs_handle_fde_frame(frame,
fde_has_augmentation_data,
code_alignment_factor,
true);
ret = scs_handle_fde_frame(frame, code_alignment_factor,
fde_use_sdata8, true);
if (ret)
return ret;
scs_handle_fde_frame(frame, fde_has_augmentation_data,
code_alignment_factor, false);
scs_handle_fde_frame(frame, code_alignment_factor,
fde_use_sdata8, false);
}
p += sizeof(frame->size) + frame->size;


@ -58,10 +58,13 @@ static bool __kprobes aarch64_insn_is_steppable(u32 insn)
* Instructions which load PC relative literals are not going to work
* when executed from an XOL slot. Instructions doing an exclusive
* load/store are not going to complete successfully when single-step
* exception handling happens in the middle of the sequence.
* exception handling happens in the middle of the sequence. Memory
* copy/set instructions require that all three instructions be placed
* consecutively in memory.
*/
if (aarch64_insn_uses_literal(insn) ||
aarch64_insn_is_exclusive(insn))
aarch64_insn_is_exclusive(insn) ||
aarch64_insn_is_mops(insn))
return false;
return true;
@ -73,8 +76,17 @@ static bool __kprobes aarch64_insn_is_steppable(u32 insn)
* INSN_GOOD_NO_SLOT If instruction is supported but doesn't use its slot.
*/
enum probe_insn __kprobes
arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api)
arm_probe_decode_insn(u32 insn, struct arch_probe_insn *api)
{
/*
* While 'nop' instruction can execute in the out-of-line slot,
* simulating them in breakpoint handling offers better performance.
*/
if (aarch64_insn_is_nop(insn)) {
api->handler = simulate_nop;
return INSN_GOOD_NO_SLOT;
}
/*
* Instructions reading or modifying the PC won't work from the XOL
* slot.
@ -133,8 +145,8 @@ enum probe_insn __kprobes
arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
{
enum probe_insn decoded;
probe_opcode_t insn = le32_to_cpu(*addr);
probe_opcode_t *scan_end = NULL;
u32 insn = le32_to_cpu(*addr);
kprobe_opcode_t *scan_end = NULL;
unsigned long size = 0, offset = 0;
struct arch_probe_insn *api = &asi->api;


@ -28,6 +28,6 @@ enum probe_insn __kprobes
arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi);
#endif
enum probe_insn __kprobes
arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *asi);
arm_probe_decode_insn(u32 insn, struct arch_probe_insn *asi);
#endif /* _ARM_KERNEL_KPROBES_ARM64_H */


@ -43,7 +43,7 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
{
kprobe_opcode_t *addr = p->ainsn.api.insn;
kprobe_opcode_t *addr = p->ainsn.xol_insn;
/*
* Prepare insn slot, Mark Rutland points out it depends on a couple of
@ -64,20 +64,20 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
* the BRK exception handler, so it is unnecessary to generate
* Context-Synchronization-Event via ISB again.
*/
aarch64_insn_patch_text_nosync(addr, p->opcode);
aarch64_insn_patch_text_nosync(addr, le32_to_cpu(p->opcode));
aarch64_insn_patch_text_nosync(addr + 1, BRK64_OPCODE_KPROBES_SS);
/*
* Needs restoring of return address after stepping xol.
*/
p->ainsn.api.restore = (unsigned long) p->addr +
p->ainsn.xol_restore = (unsigned long) p->addr +
sizeof(kprobe_opcode_t);
}
static void __kprobes arch_prepare_simulate(struct kprobe *p)
{
/* This instruction is not executed xol. No need to adjust the PC */
p->ainsn.api.restore = 0;
p->ainsn.xol_restore = 0;
}
static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
@ -85,7 +85,7 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
if (p->ainsn.api.handler)
p->ainsn.api.handler((u32)p->opcode, (long)p->addr, regs);
p->ainsn.api.handler(le32_to_cpu(p->opcode), (long)p->addr, regs);
/* single step simulated, now go for post processing */
post_kprobe_handler(p, kcb, regs);
@ -99,7 +99,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return -EINVAL;
/* copy instruction */
p->opcode = le32_to_cpu(*p->addr);
p->opcode = *p->addr;
if (search_exception_tables(probe_addr))
return -EINVAL;
@ -110,18 +110,18 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return -EINVAL;
case INSN_GOOD_NO_SLOT: /* insn need simulation */
p->ainsn.api.insn = NULL;
p->ainsn.xol_insn = NULL;
break;
case INSN_GOOD: /* instruction uses slot */
p->ainsn.api.insn = get_insn_slot();
if (!p->ainsn.api.insn)
p->ainsn.xol_insn = get_insn_slot();
if (!p->ainsn.xol_insn)
return -ENOMEM;
break;
}
/* prepare the instruction */
if (p->ainsn.api.insn)
if (p->ainsn.xol_insn)
arch_prepare_ss_slot(p);
else
arch_prepare_simulate(p);
@ -142,15 +142,16 @@ void __kprobes arch_arm_kprobe(struct kprobe *p)
void __kprobes arch_disarm_kprobe(struct kprobe *p)
{
void *addr = p->addr;
u32 insn = le32_to_cpu(p->opcode);
aarch64_insn_patch_text(&addr, &p->opcode, 1);
aarch64_insn_patch_text(&addr, &insn, 1);
}
void __kprobes arch_remove_kprobe(struct kprobe *p)
{
if (p->ainsn.api.insn) {
free_insn_slot(p->ainsn.api.insn, 0);
p->ainsn.api.insn = NULL;
if (p->ainsn.xol_insn) {
free_insn_slot(p->ainsn.xol_insn, 0);
p->ainsn.xol_insn = NULL;
}
}
@ -205,9 +206,9 @@ static void __kprobes setup_singlestep(struct kprobe *p,
}
if (p->ainsn.api.insn) {
if (p->ainsn.xol_insn) {
/* prepare for single stepping */
slot = (unsigned long)p->ainsn.api.insn;
slot = (unsigned long)p->ainsn.xol_insn;
kprobes_save_local_irqflag(kcb, regs);
instruction_pointer_set(regs, slot);
@ -245,8 +246,8 @@ static void __kprobes
post_kprobe_handler(struct kprobe *cur, struct kprobe_ctlblk *kcb, struct pt_regs *regs)
{
/* return addr restore if non-branching insn */
if (cur->ainsn.api.restore != 0)
instruction_pointer_set(regs, cur->ainsn.api.restore);
if (cur->ainsn.xol_restore != 0)
instruction_pointer_set(regs, cur->ainsn.xol_restore);
/* restore back original saved kprobe variables and continue */
if (kcb->kprobe_status == KPROBE_REENTER) {
@ -348,7 +349,7 @@ kprobe_breakpoint_ss_handler(struct pt_regs *regs, unsigned long esr)
struct kprobe *cur = kprobe_running();
if (cur && (kcb->kprobe_status & (KPROBE_HIT_SS | KPROBE_REENTER)) &&
((unsigned long)&cur->ainsn.api.insn[1] == addr)) {
((unsigned long)&cur->ainsn.xol_insn[1] == addr)) {
kprobes_restore_local_irqflag(kcb, regs);
post_kprobe_handler(cur, kcb, regs);


@ -196,3 +196,9 @@ simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}
void __kprobes
simulate_nop(u32 opcode, long addr, struct pt_regs *regs)
{
arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
}

View File

@ -16,5 +16,6 @@ void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
void simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs);
void simulate_nop(u32 opcode, long addr, struct pt_regs *regs);
#endif /* _ARM_KERNEL_KPROBES_SIMULATE_INSN_H */


@ -17,12 +17,20 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
void *xol_page_kaddr = kmap_atomic(page);
void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);
/*
* Initial cache maintenance of the xol page done via set_pte_at().
* Subsequent CMOs only needed if the xol slot changes.
*/
if (!memcmp(dst, src, len))
goto done;
/* Initialize the slot */
memcpy(dst, src, len);
/* flush caches (dcache/icache) */
sync_icache_aliases((unsigned long)dst, (unsigned long)dst + len);
done:
kunmap_atomic(xol_page_kaddr);
}
@ -34,7 +42,7 @@ unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
unsigned long addr)
{
probe_opcode_t insn;
u32 insn;
/* TODO: Currently we do not support AARCH32 instruction probing */
if (mm->context.flags & MMCF_AARCH32)
@ -102,7 +110,7 @@ bool arch_uprobe_xol_was_trapped(struct task_struct *t)
bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
{
probe_opcode_t insn;
u32 insn;
unsigned long addr;
if (!auprobe->simulate)


@ -49,6 +49,7 @@
#include <asm/cacheflush.h>
#include <asm/exec.h>
#include <asm/fpsimd.h>
#include <asm/gcs.h>
#include <asm/mmu_context.h>
#include <asm/mte.h>
#include <asm/processor.h>
@ -227,7 +228,7 @@ void __show_regs(struct pt_regs *regs)
printk("sp : %016llx\n", sp);
if (system_uses_irq_prio_masking())
printk("pmr_save: %08llx\n", regs->pmr_save);
printk("pmr: %08x\n", regs->pmr);
i = top_reg;
@ -280,6 +281,51 @@ static void flush_poe(void)
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
}
#ifdef CONFIG_ARM64_GCS
static void flush_gcs(void)
{
if (!system_supports_gcs())
return;
gcs_free(current);
current->thread.gcs_el0_mode = 0;
write_sysreg_s(GCSCRE0_EL1_nTR, SYS_GCSCRE0_EL1);
write_sysreg_s(0, SYS_GCSPR_EL0);
}
static int copy_thread_gcs(struct task_struct *p,
const struct kernel_clone_args *args)
{
unsigned long gcs;
if (!system_supports_gcs())
return 0;
p->thread.gcs_base = 0;
p->thread.gcs_size = 0;
gcs = gcs_alloc_thread_stack(p, args);
if (IS_ERR_VALUE(gcs))
return PTR_ERR((void *)gcs);
p->thread.gcs_el0_mode = current->thread.gcs_el0_mode;
p->thread.gcs_el0_locked = current->thread.gcs_el0_locked;
return 0;
}
#else
static void flush_gcs(void) { }
static int copy_thread_gcs(struct task_struct *p,
const struct kernel_clone_args *args)
{
return 0;
}
#endif
void flush_thread(void)
{
fpsimd_flush_thread();
@ -287,11 +333,13 @@ void flush_thread(void)
flush_ptrace_hw_breakpoint(current);
flush_tagged_addr_state();
flush_poe();
flush_gcs();
}
void arch_release_task_struct(struct task_struct *tsk)
{
fpsimd_release_task(tsk);
gcs_free(tsk);
}
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
@ -355,6 +403,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
unsigned long stack_start = args->stack;
unsigned long tls = args->tls;
struct pt_regs *childregs = task_pt_regs(p);
int ret;
memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context));
@ -399,6 +448,10 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.uw.tp_value = tls;
p->thread.tpidr2_el0 = 0;
}
ret = copy_thread_gcs(p, args);
if (ret != 0)
return ret;
} else {
/*
* A kthread has no context to ERET to, so ensure any buggy
@ -409,6 +462,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
*/
memset(childregs, 0, sizeof(struct pt_regs));
childregs->pstate = PSR_MODE_EL1h | PSR_IL_BIT;
childregs->stackframe.type = FRAME_META_TYPE_FINAL;
p->thread.cpu_context.x19 = (unsigned long)args->fn;
p->thread.cpu_context.x20 = (unsigned long)args->fn_arg;
@ -422,7 +476,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
* For the benefit of the unwinder, set up childregs->stackframe
* as the final frame for the new task.
*/
p->thread.cpu_context.fp = (unsigned long)childregs->stackframe;
p->thread.cpu_context.fp = (unsigned long)&childregs->stackframe;
ptrace_hw_copy_thread(p);
@ -442,7 +496,7 @@ static void tls_thread_switch(struct task_struct *next)
if (is_compat_thread(task_thread_info(next)))
write_sysreg(next->thread.uw.tp_value, tpidrro_el0);
else if (!arm64_kernel_unmapped_at_el0())
else
write_sysreg(0, tpidrro_el0);
write_sysreg(*task_user_tls(next), tpidr_el0);
@ -487,6 +541,46 @@ static void entry_task_switch(struct task_struct *next)
__this_cpu_write(__entry_task, next);
}
#ifdef CONFIG_ARM64_GCS
void gcs_preserve_current_state(void)
{
current->thread.gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
}
static void gcs_thread_switch(struct task_struct *next)
{
if (!system_supports_gcs())
return;
/* GCSPR_EL0 is always readable */
gcs_preserve_current_state();
write_sysreg_s(next->thread.gcspr_el0, SYS_GCSPR_EL0);
if (current->thread.gcs_el0_mode != next->thread.gcs_el0_mode)
gcs_set_el0_mode(next);
/*
* Ensure that GCS memory effects of the 'prev' thread are
* ordered before other memory accesses with release semantics
* (or preceded by a DMB) on the current PE. In addition, any
* memory accesses with acquire semantics (or succeeded by a
* DMB) are ordered before GCS memory effects of the 'next'
* thread. This will ensure that the GCS memory effects are
* visible to other PEs in case of migration.
*/
if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
gcsb_dsync();
}
#else
static void gcs_thread_switch(struct task_struct *next)
{
}
#endif
/*
* Handle sysreg updates for ARM erratum 1418040 which affects the 32bit view of
* CNTVCT, various other errata which require trapping all CNTVCT{,_EL0}
@ -583,6 +677,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
cntkctl_thread_switch(prev, next);
ptrauth_thread_switch_user(next);
permission_overlay_switch(next);
gcs_thread_switch(next);
/*
* Complete any pending TLB or cache maintenance on this CPU in case


@ -34,6 +34,7 @@
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
#include <asm/fpsimd.h>
#include <asm/gcs.h>
#include <asm/mte.h>
#include <asm/pointer_auth.h>
#include <asm/stacktrace.h>
@ -898,7 +899,11 @@ static int sve_set_common(struct task_struct *target,
if (ret)
goto out;
/* Actual VL set may be less than the user asked for: */
/*
* Actual VL set may be different from what the user asked
* for, or we may have configured the _ONEXEC VL not the
* current VL:
*/
vq = sve_vq_from_vl(task_get_vl(target, type));
/* Enter/exit streaming mode */
@ -1125,7 +1130,11 @@ static int za_set(struct task_struct *target,
if (ret)
goto out;
/* Actual VL set may be less than the user asked for: */
/*
* Actual VL set may be different from what the user asked
* for, or we may have configured the _ONEXEC rather than
* current VL:
*/
vq = sve_vq_from_vl(task_get_sme_vl(target));
/* Ensure there is some SVE storage for streaming mode */
@ -1473,6 +1482,52 @@ static int poe_set(struct task_struct *target, const struct
}
#endif
#ifdef CONFIG_ARM64_GCS
static int gcs_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
{
struct user_gcs user_gcs;
if (!system_supports_gcs())
return -EINVAL;
if (target == current)
gcs_preserve_current_state();
user_gcs.features_enabled = target->thread.gcs_el0_mode;
user_gcs.features_locked = target->thread.gcs_el0_locked;
user_gcs.gcspr_el0 = target->thread.gcspr_el0;
return membuf_write(&to, &user_gcs, sizeof(user_gcs));
}
static int gcs_set(struct task_struct *target, const struct
user_regset *regset, unsigned int pos,
unsigned int count, const void *kbuf, const
void __user *ubuf)
{
int ret;
struct user_gcs user_gcs;
if (!system_supports_gcs())
return -EINVAL;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_gcs, 0, -1);
if (ret)
return ret;
if (user_gcs.features_enabled & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
return -EINVAL;
target->thread.gcs_el0_mode = user_gcs.features_enabled;
target->thread.gcs_el0_locked = user_gcs.features_locked;
target->thread.gcspr_el0 = user_gcs.gcspr_el0;
return 0;
}
#endif
enum aarch64_regset {
REGSET_GPR,
REGSET_FPR,
@ -1503,7 +1558,10 @@ enum aarch64_regset {
REGSET_TAGGED_ADDR_CTRL,
#endif
#ifdef CONFIG_ARM64_POE
REGSET_POE
REGSET_POE,
#endif
#ifdef CONFIG_ARM64_GCS
REGSET_GCS,
#endif
};
@ -1674,6 +1732,16 @@ static const struct user_regset aarch64_regsets[] = {
.set = poe_set,
},
#endif
#ifdef CONFIG_ARM64_GCS
[REGSET_GCS] = {
.core_note_type = NT_ARM_GCS,
.n = sizeof(struct user_gcs) / sizeof(u64),
.size = sizeof(u64),
.align = sizeof(u64),
.regset_get = gcs_get,
.set = gcs_set,
},
#endif
};
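For context, the NT_ARM_GCS regset registered above is consumed through the usual PTRACE_GETREGSET/PTRACE_SETREGSET interface. Below is a minimal debugger-side sketch; it is not part of this series and assumes the uapi headers added here export struct user_gcs (via <asm/ptrace.h>) and NT_ARM_GCS (via <linux/elf.h>), with the tracee already stopped.

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <linux/elf.h>		/* NT_ARM_GCS (assumed from this series' uapi) */
#include <asm/ptrace.h>		/* struct user_gcs (assumed from this series' uapi) */

/* Dump the GCS state of a stopped tracee; mirrors the fields gcs_get() fills in. */
static int dump_gcs(pid_t pid)
{
	struct user_gcs gcs;
	struct iovec iov = {
		.iov_base = &gcs,
		.iov_len  = sizeof(gcs),
	};

	if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_GCS, &iov) < 0) {
		perror("PTRACE_GETREGSET(NT_ARM_GCS)");
		return -1;
	}

	printf("features_enabled=%#llx features_locked=%#llx gcspr_el0=%#llx\n",
	       (unsigned long long)gcs.features_enabled,
	       (unsigned long long)gcs.features_locked,
	       (unsigned long long)gcs.gcspr_el0);
	return 0;
}

Writes go the other way with PTRACE_SETREGSET and are subject to the same PR_SHADOW_STACK_SUPPORTED_STATUS_MASK check that gcs_set() applies.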
static const struct user_regset_view user_aarch64_view = {

arch/arm64/kernel/rsi.c (new file, 142 lines)

@ -0,0 +1,142 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2023 ARM Ltd.
*/
#include <linux/jump_label.h>
#include <linux/memblock.h>
#include <linux/psci.h>
#include <linux/swiotlb.h>
#include <linux/cc_platform.h>
#include <asm/io.h>
#include <asm/mem_encrypt.h>
#include <asm/rsi.h>
static struct realm_config config;
unsigned long prot_ns_shared;
EXPORT_SYMBOL(prot_ns_shared);
DEFINE_STATIC_KEY_FALSE_RO(rsi_present);
EXPORT_SYMBOL(rsi_present);
bool cc_platform_has(enum cc_attr attr)
{
switch (attr) {
case CC_ATTR_MEM_ENCRYPT:
return is_realm_world();
default:
return false;
}
}
EXPORT_SYMBOL_GPL(cc_platform_has);
static bool rsi_version_matches(void)
{
unsigned long ver_lower, ver_higher;
unsigned long ret = rsi_request_version(RSI_ABI_VERSION,
&ver_lower,
&ver_higher);
if (ret == SMCCC_RET_NOT_SUPPORTED)
return false;
if (ret != RSI_SUCCESS) {
pr_err("RME: RMM doesn't support RSI version %lu.%lu. Supported range: %lu.%lu-%lu.%lu\n",
RSI_ABI_VERSION_MAJOR, RSI_ABI_VERSION_MINOR,
RSI_ABI_VERSION_GET_MAJOR(ver_lower),
RSI_ABI_VERSION_GET_MINOR(ver_lower),
RSI_ABI_VERSION_GET_MAJOR(ver_higher),
RSI_ABI_VERSION_GET_MINOR(ver_higher));
return false;
}
pr_info("RME: Using RSI version %lu.%lu\n",
RSI_ABI_VERSION_GET_MAJOR(ver_lower),
RSI_ABI_VERSION_GET_MINOR(ver_lower));
return true;
}
static void __init arm64_rsi_setup_memory(void)
{
u64 i;
phys_addr_t start, end;
/*
* Iterate over the available memory ranges and convert the state to
* protected memory. We should take extra care to ensure that we DO NOT
* permit any "DESTROYED" pages to be converted to "RAM".
*
* panic() is used because if the attempt to switch the memory to
* protected has failed here, then future accesses to the memory are
* simply going to be reflected as a SEA (Synchronous External Abort)
* which we can't handle. Bailing out early prevents the guest limping
* on and dying later.
*/
for_each_mem_range(i, &start, &end) {
if (rsi_set_memory_range_protected_safe(start, end)) {
panic("Failed to set memory range to protected: %pa-%pa",
&start, &end);
}
}
}
bool __arm64_is_protected_mmio(phys_addr_t base, size_t size)
{
enum ripas ripas;
phys_addr_t end, top;
/* Overflow ? */
if (WARN_ON(base + size <= base))
return false;
end = ALIGN(base + size, RSI_GRANULE_SIZE);
base = ALIGN_DOWN(base, RSI_GRANULE_SIZE);
while (base < end) {
if (WARN_ON(rsi_ipa_state_get(base, end, &ripas, &top)))
break;
if (WARN_ON(top <= base))
break;
if (ripas != RSI_RIPAS_DEV)
break;
base = top;
}
return base >= end;
}
EXPORT_SYMBOL(__arm64_is_protected_mmio);
static int realm_ioremap_hook(phys_addr_t phys, size_t size, pgprot_t *prot)
{
if (__arm64_is_protected_mmio(phys, size))
*prot = pgprot_encrypted(*prot);
else
*prot = pgprot_decrypted(*prot);
return 0;
}
void __init arm64_rsi_init(void)
{
if (arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_SMC)
return;
if (!rsi_version_matches())
return;
if (WARN_ON(rsi_get_realm_config(&config)))
return;
prot_ns_shared = BIT(config.ipa_bits - 1);
if (arm64_ioremap_prot_hook_register(realm_ioremap_hook))
return;
if (realm_register_memory_enc_ops())
return;
arm64_rsi_setup_memory();
static_branch_enable(&rsi_present);
}


@ -43,6 +43,7 @@
#include <asm/cpu_ops.h>
#include <asm/kasan.h>
#include <asm/numa.h>
#include <asm/rsi.h>
#include <asm/scs.h>
#include <asm/sections.h>
#include <asm/setup.h>
@ -351,6 +352,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
else
psci_acpi_init();
arm64_rsi_init();
init_bootcpu_ops();
smp_init_cpus();
smp_build_mpidr_hash();


@ -26,6 +26,7 @@
#include <asm/elf.h>
#include <asm/exception.h>
#include <asm/cacheflush.h>
#include <asm/gcs.h>
#include <asm/ucontext.h>
#include <asm/unistd.h>
#include <asm/fpsimd.h>
@ -35,6 +36,15 @@
#include <asm/traps.h>
#include <asm/vdso.h>
#ifdef CONFIG_ARM64_GCS
#define GCS_SIGNAL_CAP(addr) (((unsigned long)addr) & GCS_CAP_ADDR_MASK)
static bool gcs_signal_cap_valid(u64 addr, u64 val)
{
return val == GCS_SIGNAL_CAP(addr);
}
#endif
/*
* Do a signal return; undo the signal stack. These are aligned to 128-bit.
*/
@ -43,11 +53,6 @@ struct rt_sigframe {
struct ucontext uc;
};
struct frame_record {
u64 fp;
u64 lr;
};
struct rt_sigframe_user_layout {
struct rt_sigframe __user *sigframe;
struct frame_record __user *next_frame;
@ -57,6 +62,7 @@ struct rt_sigframe_user_layout {
unsigned long fpsimd_offset;
unsigned long esr_offset;
unsigned long gcs_offset;
unsigned long sve_offset;
unsigned long tpidr2_offset;
unsigned long za_offset;
@ -79,7 +85,6 @@ struct user_access_state {
u64 por_el0;
};
#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
@ -242,6 +247,8 @@ struct user_ctxs {
u32 fpmr_size;
struct poe_context __user *poe;
u32 poe_size;
struct gcs_context __user *gcs;
u32 gcs_size;
};
static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@ -689,6 +696,82 @@ extern int restore_zt_context(struct user_ctxs *user);
#endif /* ! CONFIG_ARM64_SME */
#ifdef CONFIG_ARM64_GCS
static int preserve_gcs_context(struct gcs_context __user *ctx)
{
int err = 0;
u64 gcspr = read_sysreg_s(SYS_GCSPR_EL0);
/*
* If GCS is enabled we will add a cap token to the frame,
* include it in the GCSPR_EL0 we report to support stack
* switching via sigreturn if GCS is enabled. We do not allow
* enabling via sigreturn so the token is only relevant for
* threads with GCS enabled.
*/
if (task_gcs_el0_enabled(current))
gcspr -= 8;
__put_user_error(GCS_MAGIC, &ctx->head.magic, err);
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
__put_user_error(gcspr, &ctx->gcspr, err);
__put_user_error(0, &ctx->reserved, err);
__put_user_error(current->thread.gcs_el0_mode,
&ctx->features_enabled, err);
return err;
}
static int restore_gcs_context(struct user_ctxs *user)
{
u64 gcspr, enabled;
int err = 0;
if (user->gcs_size != sizeof(*user->gcs))
return -EINVAL;
__get_user_error(gcspr, &user->gcs->gcspr, err);
__get_user_error(enabled, &user->gcs->features_enabled, err);
if (err)
return err;
/* Don't allow unknown modes */
if (enabled & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
return -EINVAL;
err = gcs_check_locked(current, enabled);
if (err != 0)
return err;
/* Don't allow enabling */
if (!task_gcs_el0_enabled(current) &&
(enabled & PR_SHADOW_STACK_ENABLE))
return -EINVAL;
/* If we are disabling disable everything */
if (!(enabled & PR_SHADOW_STACK_ENABLE))
enabled = 0;
current->thread.gcs_el0_mode = enabled;
/*
* We let userspace set GCSPR_EL0 to anything here, we will
* validate later in gcs_restore_signal().
*/
write_sysreg_s(gcspr, SYS_GCSPR_EL0);
return 0;
}
#else /* ! CONFIG_ARM64_GCS */
/* Turn any non-optimised out attempts to use these into a link error: */
extern int preserve_gcs_context(void __user *ctx);
extern int restore_gcs_context(struct user_ctxs *user);
#endif /* ! CONFIG_ARM64_GCS */
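A signal handler can find this record the same way it finds the other optional sigcontext records: walk uc_mcontext.__reserved until GCS_MAGIC shows up. A rough, non-authoritative sketch follows; it assumes <asm/sigcontext.h> from this series provides GCS_MAGIC and struct gcs_context, skips EXTRA_CONTEXT handling, and uses printf only for brevity (it is not async-signal-safe).

#include <signal.h>
#include <stdio.h>
#include <ucontext.h>
#include <asm/sigcontext.h>	/* GCS_MAGIC, struct gcs_context (assumed from this series) */

static void handler(int sig, siginfo_t *info, void *ucontext)
{
	ucontext_t *uc = ucontext;
	struct _aarch64_ctx *head =
		(struct _aarch64_ctx *)uc->uc_mcontext.__reserved;

	(void)sig;
	(void)info;

	/* Walk the optional records until the zero terminator. */
	while (head->magic && head->size) {
		if (head->magic == GCS_MAGIC) {
			struct gcs_context *gcs = (struct gcs_context *)head;

			printf("GCSPR_EL0 in frame: %#llx\n",
			       (unsigned long long)gcs->gcspr);
			break;
		}
		head = (struct _aarch64_ctx *)((char *)head + head->size);
	}
}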
static int parse_user_sigframe(struct user_ctxs *user,
struct rt_sigframe __user *sf)
{
@ -707,6 +790,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->zt = NULL;
user->fpmr = NULL;
user->poe = NULL;
user->gcs = NULL;
if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
@ -823,6 +907,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->fpmr_size = size;
break;
case GCS_MAGIC:
if (!system_supports_gcs())
goto invalid;
if (user->gcs)
goto invalid;
user->gcs = (struct gcs_context __user *)head;
user->gcs_size = size;
break;
case EXTRA_MAGIC:
if (have_extra_context)
goto invalid;
@ -943,6 +1038,9 @@ static int restore_sigframe(struct pt_regs *regs,
err = restore_fpsimd_context(&user);
}
if (err == 0 && system_supports_gcs() && user.gcs)
err = restore_gcs_context(&user);
if (err == 0 && system_supports_tpidr2() && user.tpidr2)
err = restore_tpidr2_context(&user);
@ -961,6 +1059,58 @@ static int restore_sigframe(struct pt_regs *regs,
return err;
}
#ifdef CONFIG_ARM64_GCS
static int gcs_restore_signal(void)
{
unsigned long __user *gcspr_el0;
u64 cap;
int ret;
if (!system_supports_gcs())
return 0;
if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))
return 0;
gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
/*
* Ensure that any changes to the GCS done via GCS operations
* are visible to the normal reads we do to validate the
* token.
*/
gcsb_dsync();
/*
* GCSPR_EL0 should be pointing at a capped GCS, read the cap.
* We don't enforce that this is in a GCS page, if it is not
* then faults will be generated on GCS operations - the main
* concern is to protect GCS pages.
*/
ret = copy_from_user(&cap, gcspr_el0, sizeof(cap));
if (ret)
return -EFAULT;
/*
* Check that the cap is the actual GCS before replacing it.
*/
if (!gcs_signal_cap_valid((u64)gcspr_el0, cap))
return -EINVAL;
/* Invalidate the token to prevent reuse */
put_user_gcs(0, (__user void*)gcspr_el0, &ret);
if (ret != 0)
return -EFAULT;
write_sysreg_s(gcspr_el0 + 1, SYS_GCSPR_EL0);
return 0;
}
#else
static int gcs_restore_signal(void) { return 0; }
#endif
SYSCALL_DEFINE0(rt_sigreturn)
{
struct pt_regs *regs = current_pt_regs();
@ -985,6 +1135,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
if (restore_sigframe(regs, frame, &ua_state))
goto badframe;
if (gcs_restore_signal())
goto badframe;
if (restore_altstack(&frame->uc.uc_stack))
goto badframe;
@ -1024,6 +1177,15 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
return err;
}
#ifdef CONFIG_ARM64_GCS
if (system_supports_gcs() && (add_all || current->thread.gcspr_el0)) {
err = sigframe_alloc(user, &user->gcs_offset,
sizeof(struct gcs_context));
if (err)
return err;
}
#endif
if (system_supports_sve() || system_supports_sme()) {
unsigned int vq = 0;
@ -1132,6 +1294,12 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
}
if (system_supports_gcs() && err == 0 && user->gcs_offset) {
struct gcs_context __user *gcs_ctx =
apply_user_offset(user, user->gcs_offset);
err |= preserve_gcs_context(gcs_ctx);
}
/* Scalable Vector Extension state (including streaming), if present */
if ((system_supports_sve() || system_supports_sme()) &&
err == 0 && user->sve_offset) {
@ -1154,7 +1322,7 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
err |= preserve_fpmr_context(fpmr_ctx);
}
if (system_supports_poe() && err == 0 && user->poe_offset) {
if (system_supports_poe() && err == 0) {
struct poe_context __user *poe_ctx =
apply_user_offset(user, user->poe_offset);
@ -1249,7 +1417,48 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,
return 0;
}
static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
#ifdef CONFIG_ARM64_GCS
static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
{
unsigned long __user *gcspr_el0;
int ret = 0;
if (!system_supports_gcs())
return 0;
if (!task_gcs_el0_enabled(current))
return 0;
/*
* We are entering a signal handler, current register state is
* active.
*/
gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
/*
* Push a cap and the GCS entry for the trampoline onto the GCS.
*/
put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
if (ret != 0)
return ret;
gcspr_el0 -= 2;
write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0);
return 0;
}
#else
static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
{
return 0;
}
#endif
static int setup_return(struct pt_regs *regs, struct ksignal *ksig,
struct rt_sigframe_user_layout *user, int usig)
{
__sigrestore_t sigtramp;
@ -1257,7 +1466,7 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
regs->regs[0] = usig;
regs->sp = (unsigned long)user->sigframe;
regs->regs[29] = (unsigned long)&user->next_frame->fp;
regs->pc = (unsigned long)ka->sa.sa_handler;
regs->pc = (unsigned long)ksig->ka.sa.sa_handler;
/*
* Signal delivery is a (wacky) indirect function call in
@ -1297,12 +1506,14 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
sme_smstop();
}
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
if (ksig->ka.sa.sa_flags & SA_RESTORER)
sigtramp = ksig->ka.sa.sa_restorer;
else
sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
regs->regs[30] = (unsigned long)sigtramp;
return gcs_signal_entry(sigtramp, ksig);
}
static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
@ -1327,7 +1538,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
err |= setup_sigframe(&user, regs, set, &ua_state);
if (err == 0) {
setup_return(regs, &ksig->ka, &user, usig);
err = setup_return(regs, ksig, &user, usig);
if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
err |= copy_siginfo_to_user(&frame->info, &ksig->info);
regs->regs[1] = (unsigned long)&frame->info;


@ -20,6 +20,23 @@
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>
enum kunwind_source {
KUNWIND_SOURCE_UNKNOWN,
KUNWIND_SOURCE_FRAME,
KUNWIND_SOURCE_CALLER,
KUNWIND_SOURCE_TASK,
KUNWIND_SOURCE_REGS_PC,
KUNWIND_SOURCE_REGS_LR,
};
union unwind_flags {
unsigned long all;
struct {
unsigned long fgraph : 1,
kretprobe : 1;
};
};
/*
* Kernel unwind state
*
@ -37,6 +54,9 @@ struct kunwind_state {
#ifdef CONFIG_KRETPROBES
struct llist_node *kr_cur;
#endif
enum kunwind_source source;
union unwind_flags flags;
struct pt_regs *regs;
};
static __always_inline void
@ -45,6 +65,9 @@ kunwind_init(struct kunwind_state *state,
{
unwind_init_common(&state->common);
state->task = task;
state->source = KUNWIND_SOURCE_UNKNOWN;
state->flags.all = 0;
state->regs = NULL;
}
/*
@ -60,8 +83,10 @@ kunwind_init_from_regs(struct kunwind_state *state,
{
kunwind_init(state, current);
state->regs = regs;
state->common.fp = regs->regs[29];
state->common.pc = regs->pc;
state->source = KUNWIND_SOURCE_REGS_PC;
}
/*
@ -79,6 +104,7 @@ kunwind_init_from_caller(struct kunwind_state *state)
state->common.fp = (unsigned long)__builtin_frame_address(1);
state->common.pc = (unsigned long)__builtin_return_address(0);
state->source = KUNWIND_SOURCE_CALLER;
}
/*
@ -99,6 +125,7 @@ kunwind_init_from_task(struct kunwind_state *state,
state->common.fp = thread_saved_fp(task);
state->common.pc = thread_saved_pc(task);
state->source = KUNWIND_SOURCE_TASK;
}
static __always_inline int
@ -114,6 +141,7 @@ kunwind_recover_return_address(struct kunwind_state *state)
if (WARN_ON_ONCE(state->common.pc == orig_pc))
return -EINVAL;
state->common.pc = orig_pc;
state->flags.fgraph = 1;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
@ -124,12 +152,110 @@ kunwind_recover_return_address(struct kunwind_state *state)
(void *)state->common.fp,
&state->kr_cur);
state->common.pc = orig_pc;
state->flags.kretprobe = 1;
}
#endif /* CONFIG_KRETPROBES */
return 0;
}
static __always_inline
int kunwind_next_regs_pc(struct kunwind_state *state)
{
struct stack_info *info;
unsigned long fp = state->common.fp;
struct pt_regs *regs;
regs = container_of((u64 *)fp, struct pt_regs, stackframe.record.fp);
info = unwind_find_stack(&state->common, (unsigned long)regs, sizeof(*regs));
if (!info)
return -EINVAL;
unwind_consume_stack(&state->common, info, (unsigned long)regs,
sizeof(*regs));
state->regs = regs;
state->common.pc = regs->pc;
state->common.fp = regs->regs[29];
state->source = KUNWIND_SOURCE_REGS_PC;
return 0;
}
static __always_inline int
kunwind_next_regs_lr(struct kunwind_state *state)
{
/*
* The stack for the regs was consumed by kunwind_next_regs_pc(), so we
* cannot consume that again here, but we know the regs are safe to
* access.
*/
state->common.pc = state->regs->regs[30];
state->common.fp = state->regs->regs[29];
state->regs = NULL;
state->source = KUNWIND_SOURCE_REGS_LR;
return 0;
}
static __always_inline int
kunwind_next_frame_record_meta(struct kunwind_state *state)
{
struct task_struct *tsk = state->task;
unsigned long fp = state->common.fp;
struct frame_record_meta *meta;
struct stack_info *info;
info = unwind_find_stack(&state->common, fp, sizeof(*meta));
if (!info)
return -EINVAL;
meta = (struct frame_record_meta *)fp;
switch (READ_ONCE(meta->type)) {
case FRAME_META_TYPE_FINAL:
if (meta == &task_pt_regs(tsk)->stackframe)
return -ENOENT;
WARN_ON_ONCE(1);
return -EINVAL;
case FRAME_META_TYPE_PT_REGS:
return kunwind_next_regs_pc(state);
default:
WARN_ON_ONCE(1);
return -EINVAL;
}
}
static __always_inline int
kunwind_next_frame_record(struct kunwind_state *state)
{
unsigned long fp = state->common.fp;
struct frame_record *record;
struct stack_info *info;
unsigned long new_fp, new_pc;
if (fp & 0x7)
return -EINVAL;
info = unwind_find_stack(&state->common, fp, sizeof(*record));
if (!info)
return -EINVAL;
record = (struct frame_record *)fp;
new_fp = READ_ONCE(record->fp);
new_pc = READ_ONCE(record->lr);
if (!new_fp && !new_pc)
return kunwind_next_frame_record_meta(state);
unwind_consume_stack(&state->common, info, fp, sizeof(*record));
state->common.fp = new_fp;
state->common.pc = new_pc;
state->source = KUNWIND_SOURCE_FRAME;
return 0;
}
/*
* Unwind from one frame record (A) to the next frame record (B).
*
@ -140,15 +266,24 @@ kunwind_recover_return_address(struct kunwind_state *state)
static __always_inline int
kunwind_next(struct kunwind_state *state)
{
struct task_struct *tsk = state->task;
unsigned long fp = state->common.fp;
int err;
/* Final frame; nothing to unwind */
if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
return -ENOENT;
state->flags.all = 0;
switch (state->source) {
case KUNWIND_SOURCE_FRAME:
case KUNWIND_SOURCE_CALLER:
case KUNWIND_SOURCE_TASK:
case KUNWIND_SOURCE_REGS_LR:
err = kunwind_next_frame_record(state);
break;
case KUNWIND_SOURCE_REGS_PC:
err = kunwind_next_regs_lr(state);
break;
default:
err = -EINVAL;
}
err = unwind_next_frame_record(&state->common);
if (err)
return err;
@ -294,10 +429,33 @@ noinline noinstr void arch_bpf_stack_walk(bool (*consume_entry)(void *cookie, u6
kunwind_stack_walk(arch_bpf_unwind_consume_entry, &data, current, NULL);
}
static bool dump_backtrace_entry(void *arg, unsigned long where)
static const char *state_source_string(const struct kunwind_state *state)
{
switch (state->source) {
case KUNWIND_SOURCE_FRAME: return NULL;
case KUNWIND_SOURCE_CALLER: return "C";
case KUNWIND_SOURCE_TASK: return "T";
case KUNWIND_SOURCE_REGS_PC: return "P";
case KUNWIND_SOURCE_REGS_LR: return "L";
default: return "U";
}
}
static bool dump_backtrace_entry(const struct kunwind_state *state, void *arg)
{
const char *source = state_source_string(state);
union unwind_flags flags = state->flags;
bool has_info = source || flags.all;
char *loglvl = arg;
printk("%s %pSb\n", loglvl, (void *)where);
printk("%s %pSb%s%s%s%s%s\n", loglvl,
(void *)state->common.pc,
has_info ? " (" : "",
source ? source : "",
flags.fgraph ? "F" : "",
flags.kretprobe ? "K" : "",
has_info ? ")" : "");
return true;
}
@ -316,7 +474,7 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
return;
printk("%sCall trace:\n", loglvl);
arch_stack_walk(dump_backtrace_entry, (void *)loglvl, tsk, regs);
kunwind_stack_walk(dump_backtrace_entry, (void *)loglvl, tsk, regs);
put_task_stack(tsk);
}


@ -506,6 +506,16 @@ void do_el1_bti(struct pt_regs *regs, unsigned long esr)
die("Oops - BTI", regs, esr);
}
void do_el0_gcs(struct pt_regs *regs, unsigned long esr)
{
force_signal_inject(SIGSEGV, SEGV_CPERR, regs->pc, 0);
}
void do_el1_gcs(struct pt_regs *regs, unsigned long esr)
{
die("Oops - GCS", regs, esr);
}
void do_el0_fpac(struct pt_regs *regs, unsigned long esr)
{
force_signal_inject(SIGILL, ILL_ILLOPN, regs->pc, esr);
@ -531,6 +541,13 @@ void do_el0_mops(struct pt_regs *regs, unsigned long esr)
user_fastforward_single_step(current);
}
void do_el1_mops(struct pt_regs *regs, unsigned long esr)
{
arm64_mops_reset_regs(&regs->user_regs, esr);
kernel_fastforward_single_step(regs);
}
#define __user_cache_maint(insn, address, res) \
if (address >= TASK_SIZE_MAX) { \
res = -EFAULT; \
@ -852,6 +869,7 @@ static const char *esr_class_str[] = {
[ESR_ELx_EC_MOPS] = "MOPS",
[ESR_ELx_EC_FP_EXC32] = "FP (AArch32)",
[ESR_ELx_EC_FP_EXC64] = "FP (AArch64)",
[ESR_ELx_EC_GCS] = "Guarded Control Stack",
[ESR_ELx_EC_SERROR] = "SError",
[ESR_ELx_EC_BREAKPT_LOW] = "Breakpoint (lower EL)",
[ESR_ELx_EC_BREAKPT_CUR] = "Breakpoint (current EL)",


@ -287,6 +287,9 @@ SECTIONS
__initdata_end = .;
__init_end = .;
.data.rel.ro : { *(.data.rel.ro) }
ASSERT(SIZEOF(.data.rel.ro) == 0, "Unexpected RELRO detected!")
_data = .;
_sdata = .;
RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
@ -343,9 +346,6 @@ SECTIONS
*(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
}
ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
.data.rel.ro : { *(.data.rel.ro) }
ASSERT(SIZEOF(.data.rel.ro) == 0, "Unexpected RELRO detected!")
}
#include "image-vars.h"


@ -1055,6 +1055,7 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
void *maddr;
unsigned long num_tags;
struct page *page;
struct folio *folio;
if (is_error_noslot_pfn(pfn)) {
ret = -EFAULT;
@ -1068,10 +1069,13 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
ret = -EFAULT;
goto out;
}
folio = page_folio(page);
maddr = page_address(page);
if (!write) {
if (page_mte_tagged(page))
if ((folio_test_hugetlb(folio) &&
folio_test_hugetlb_mte_tagged(folio)) ||
page_mte_tagged(page))
num_tags = mte_copy_tags_to_user(tags, maddr,
MTE_GRANULES_PER_PAGE);
else
@ -1085,14 +1089,20 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
* __set_ptes() in the VMM but still overriding the
* tags, hence ignoring the return value.
*/
try_page_mte_tagging(page);
if (folio_test_hugetlb(folio))
folio_try_hugetlb_mte_tagging(folio);
else
try_page_mte_tagging(page);
num_tags = mte_copy_tags_from_user(maddr, tags,
MTE_GRANULES_PER_PAGE);
/* uaccess failed, don't leave stale tags */
if (num_tags != MTE_GRANULES_PER_PAGE)
mte_clear_page_tags(maddr);
set_page_mte_tagged(page);
if (folio_test_hugetlb(folio))
folio_set_hugetlb_mte_tagged(folio);
else
set_page_mte_tagged(page);
kvm_release_pfn_dirty(pfn);
}


@ -1402,10 +1402,21 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
{
unsigned long i, nr_pages = size >> PAGE_SHIFT;
struct page *page = pfn_to_page(pfn);
struct folio *folio = page_folio(page);
if (!kvm_has_mte(kvm))
return;
if (folio_test_hugetlb(folio)) {
/* Hugetlb has MTE flags set on head page only */
if (folio_try_hugetlb_mte_tagging(folio)) {
for (i = 0; i < nr_pages; i++, page++)
mte_clear_page_tags(page_address(page));
folio_set_hugetlb_mte_tagged(folio);
}
return;
}
for (i = 0; i < nr_pages; i++, page++) {
if (try_page_mte_tagging(page)) {
mte_clear_page_tags(page_address(page));


@ -13,7 +13,7 @@ endif
lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
obj-$(CONFIG_CRC32) += crc32.o
obj-$(CONFIG_CRC32) += crc32.o crc32-glue.o
obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o


@ -15,6 +15,19 @@
* x0 - dest
*/
SYM_FUNC_START(__pi_clear_page)
#ifdef CONFIG_AS_HAS_MOPS
.arch_extension mops
alternative_if_not ARM64_HAS_MOPS
b .Lno_mops
alternative_else_nop_endif
mov x1, #PAGE_SIZE
setpn [x0]!, x1!, xzr
setmn [x0]!, x1!, xzr
seten [x0]!, x1!, xzr
ret
.Lno_mops:
#endif
mrs x1, dczid_el0
tbnz x1, #4, 2f /* Branch if DC ZVA is prohibited */
and w1, w1, #0xf


@ -18,6 +18,19 @@
* x1 - src
*/
SYM_FUNC_START(__pi_copy_page)
#ifdef CONFIG_AS_HAS_MOPS
.arch_extension mops
alternative_if_not ARM64_HAS_MOPS
b .Lno_mops
alternative_else_nop_endif
mov x2, #PAGE_SIZE
cpypwn [x0]!, [x1]!, x2!
cpymwn [x0]!, [x1]!, x2!
cpyewn [x0]!, [x1]!, x2!
ret
.Lno_mops:
#endif
ldp x2, x3, [x1]
ldp x4, x5, [x1, #16]
ldp x6, x7, [x1, #32]


@ -0,0 +1,82 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/crc32.h>
#include <linux/linkage.h>
#include <asm/alternative.h>
#include <asm/cpufeature.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/internal/simd.h>
// The minimum input length to consider the 4-way interleaved code path
static const size_t min_len = 1024;
asmlinkage u32 crc32_le_arm64(u32 crc, unsigned char const *p, size_t len);
asmlinkage u32 crc32c_le_arm64(u32 crc, unsigned char const *p, size_t len);
asmlinkage u32 crc32_be_arm64(u32 crc, unsigned char const *p, size_t len);
asmlinkage u32 crc32_le_arm64_4way(u32 crc, unsigned char const *p, size_t len);
asmlinkage u32 crc32c_le_arm64_4way(u32 crc, unsigned char const *p, size_t len);
asmlinkage u32 crc32_be_arm64_4way(u32 crc, unsigned char const *p, size_t len);
u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len)
{
if (!alternative_has_cap_likely(ARM64_HAS_CRC32))
return crc32_le_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
kernel_neon_begin();
crc = crc32_le_arm64_4way(crc, p, len);
kernel_neon_end();
p += round_down(len, 64);
len %= 64;
if (!len)
return crc;
}
return crc32_le_arm64(crc, p, len);
}
u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len)
{
if (!alternative_has_cap_likely(ARM64_HAS_CRC32))
return __crc32c_le_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
kernel_neon_begin();
crc = crc32c_le_arm64_4way(crc, p, len);
kernel_neon_end();
p += round_down(len, 64);
len %= 64;
if (!len)
return crc;
}
return crc32c_le_arm64(crc, p, len);
}
u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len)
{
if (!alternative_has_cap_likely(ARM64_HAS_CRC32))
return crc32_be_base(crc, p, len);
if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) {
kernel_neon_begin();
crc = crc32_be_arm64_4way(crc, p, len);
kernel_neon_end();
p += round_down(len, 64);
len %= 64;
if (!len)
return crc;
}
return crc32_be_arm64(crc, p, len);
}


@ -1,54 +1,60 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Accelerated CRC32(C) using AArch64 CRC instructions
* Accelerated CRC32(C) using AArch64 CRC and PMULL instructions
*
* Copyright (C) 2016 - 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
* Copyright (C) 2016 - 2018 Linaro Ltd.
* Copyright (C) 2024 Google LLC
*
* Author: Ard Biesheuvel <ardb@kernel.org>
*/
#include <linux/linkage.h>
#include <asm/alternative.h>
#include <asm/assembler.h>
.arch armv8-a+crc
.cpu generic+crc+crypto
.macro byteorder, reg, be
.if \be
CPU_LE( rev \reg, \reg )
.else
CPU_BE( rev \reg, \reg )
.endif
.macro bitle, reg
.endm
.macro byteorder16, reg, be
.if \be
CPU_LE( rev16 \reg, \reg )
.else
CPU_BE( rev16 \reg, \reg )
.endif
.endm
.macro bitorder, reg, be
.if \be
.macro bitbe, reg
rbit \reg, \reg
.endif
.endm
.macro bitorder16, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #16
.endif
.macro bytele, reg
.endm
.macro bitorder8, reg, be
.if \be
.macro bytebe, reg
rbit \reg, \reg
lsr \reg, \reg, #24
.endif
.endm
.macro __crc32, c, be=0
bitorder w0, \be
.macro hwordle, reg
CPU_BE( rev16 \reg, \reg )
.endm
.macro hwordbe, reg
CPU_LE( rev \reg, \reg )
rbit \reg, \reg
CPU_BE( lsr \reg, \reg, #16 )
.endm
.macro le, regs:vararg
.irp r, \regs
CPU_BE( rev \r, \r )
.endr
.endm
.macro be, regs:vararg
.irp r, \regs
CPU_LE( rev \r, \r )
.endr
.irp r, \regs
rbit \r, \r
.endr
.endm
.macro __crc32, c, order=le
bit\order w0
cmp x2, #16
b.lt 8f // less than 16 bytes
@ -61,14 +67,7 @@ CPU_BE( rev16 \reg, \reg )
add x8, x8, x1
add x1, x1, x7
ldp x5, x6, [x8]
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
\order x3, x4, x5, x6
tst x7, #8
crc32\c\()x w8, w0, x3
@ -96,65 +95,268 @@ CPU_BE( rev16 \reg, \reg )
32: ldp x3, x4, [x1], #32
sub x2, x2, #32
ldp x5, x6, [x1, #-16]
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
\order x3, x4, x5, x6
crc32\c\()x w0, w0, x3
crc32\c\()x w0, w0, x4
crc32\c\()x w0, w0, x5
crc32\c\()x w0, w0, x6
cbnz x2, 32b
0: bitorder w0, \be
0: bit\order w0
ret
8: tbz x2, #3, 4f
ldr x3, [x1], #8
byteorder x3, \be
bitorder x3, \be
\order x3
crc32\c\()x w0, w0, x3
4: tbz x2, #2, 2f
ldr w3, [x1], #4
byteorder w3, \be
bitorder w3, \be
\order w3
crc32\c\()w w0, w0, w3
2: tbz x2, #1, 1f
ldrh w3, [x1], #2
byteorder16 w3, \be
bitorder16 w3, \be
hword\order w3
crc32\c\()h w0, w0, w3
1: tbz x2, #0, 0f
ldrb w3, [x1]
bitorder8 w3, \be
byte\order w3
crc32\c\()b w0, w0, w3
0: bitorder w0, \be
0: bit\order w0
ret
.endm
.align 5
SYM_FUNC_START(crc32_le)
alternative_if_not ARM64_HAS_CRC32
b crc32_le_base
alternative_else_nop_endif
SYM_FUNC_START(crc32_le_arm64)
__crc32
SYM_FUNC_END(crc32_le)
SYM_FUNC_END(crc32_le_arm64)
.align 5
SYM_FUNC_START(__crc32c_le)
alternative_if_not ARM64_HAS_CRC32
b __crc32c_le_base
alternative_else_nop_endif
SYM_FUNC_START(crc32c_le_arm64)
__crc32 c
SYM_FUNC_END(__crc32c_le)
SYM_FUNC_END(crc32c_le_arm64)
.align 5
SYM_FUNC_START(crc32_be)
alternative_if_not ARM64_HAS_CRC32
b crc32_be_base
alternative_else_nop_endif
__crc32 be=1
SYM_FUNC_END(crc32_be)
SYM_FUNC_START(crc32_be_arm64)
__crc32 order=be
SYM_FUNC_END(crc32_be_arm64)
in .req x1
len .req x2
/*
* w0: input CRC at entry, output CRC at exit
* x1: pointer to input buffer
* x2: length of input in bytes
*/
.macro crc4way, insn, table, order=le
bit\order w0
lsr len, len, #6 // len := # of 64-byte blocks
/* Process up to 64 blocks of 64 bytes at a time */
.La\@: mov x3, #64
cmp len, #64
csel x3, x3, len, hi // x3 := min(len, 64)
sub len, len, x3
/* Divide the input into 4 contiguous blocks */
add x4, x3, x3, lsl #1 // x4 := 3 * x3
add x7, in, x3, lsl #4 // x7 := in + 16 * x3
add x8, in, x3, lsl #5 // x8 := in + 32 * x3
add x9, in, x4, lsl #4 // x9 := in + 16 * x4
/* Load the folding coefficients from the lookup table */
adr_l x5, \table - 12 // entry 0 omitted
add x5, x5, x4, lsl #2 // x5 += 12 * x3
ldp s0, s1, [x5]
ldr s2, [x5, #8]
/* Zero init partial CRCs for this iteration */
mov w4, wzr
mov w5, wzr
mov w6, wzr
mov x17, xzr
.Lb\@: sub x3, x3, #1
\insn w6, w6, x17
ldp x10, x11, [in], #16
ldp x12, x13, [x7], #16
ldp x14, x15, [x8], #16
ldp x16, x17, [x9], #16
\order x10, x11, x12, x13, x14, x15, x16, x17
/* Apply the CRC transform to 4 16-byte blocks in parallel */
\insn w0, w0, x10
\insn w4, w4, x12
\insn w5, w5, x14
\insn w6, w6, x16
\insn w0, w0, x11
\insn w4, w4, x13
\insn w5, w5, x15
cbnz x3, .Lb\@
/* Combine the 4 partial results into w0 */
mov v3.d[0], x0
mov v4.d[0], x4
mov v5.d[0], x5
pmull v0.1q, v0.1d, v3.1d
pmull v1.1q, v1.1d, v4.1d
pmull v2.1q, v2.1d, v5.1d
eor v0.8b, v0.8b, v1.8b
eor v0.8b, v0.8b, v2.8b
mov x5, v0.d[0]
eor x5, x5, x17
\insn w0, w6, x5
mov in, x9
cbnz len, .La\@
bit\order w0
ret
.endm
.align 5
SYM_FUNC_START(crc32c_le_arm64_4way)
crc4way crc32cx, .L0
SYM_FUNC_END(crc32c_le_arm64_4way)
.align 5
SYM_FUNC_START(crc32_le_arm64_4way)
crc4way crc32x, .L1
SYM_FUNC_END(crc32_le_arm64_4way)
.align 5
SYM_FUNC_START(crc32_be_arm64_4way)
crc4way crc32x, .L1, be
SYM_FUNC_END(crc32_be_arm64_4way)
.section .rodata, "a", %progbits
.align 6
.L0: .long 0xddc0152b, 0xba4fc28e, 0x493c7d27
.long 0x0715ce53, 0x9e4addf8, 0xba4fc28e
.long 0xc96cfdc0, 0x0715ce53, 0xddc0152b
.long 0xab7aff2a, 0x0d3b6092, 0x9e4addf8
.long 0x299847d5, 0x878a92a7, 0x39d3b296
.long 0xb6dd949b, 0xab7aff2a, 0x0715ce53
.long 0xa60ce07b, 0x83348832, 0x47db8317
.long 0xd270f1a2, 0xb9e02b86, 0x0d3b6092
.long 0x65863b64, 0xb6dd949b, 0xc96cfdc0
.long 0xb3e32c28, 0xbac2fd7b, 0x878a92a7
.long 0xf285651c, 0xce7f39f4, 0xdaece73e
.long 0x271d9844, 0xd270f1a2, 0xab7aff2a
.long 0x6cb08e5c, 0x2b3cac5d, 0x2162d385
.long 0xcec3662e, 0x1b03397f, 0x83348832
.long 0x8227bb8a, 0xb3e32c28, 0x299847d5
.long 0xd7a4825c, 0xdd7e3b0c, 0xb9e02b86
.long 0xf6076544, 0x10746f3c, 0x18b33a4e
.long 0x98d8d9cb, 0x271d9844, 0xb6dd949b
.long 0x57a3d037, 0x93a5f730, 0x78d9ccb7
.long 0x3771e98f, 0x6b749fb2, 0xbac2fd7b
.long 0xe0ac139e, 0xcec3662e, 0xa60ce07b
.long 0x6f345e45, 0xe6fc4e6a, 0xce7f39f4
.long 0xa2b73df1, 0xb0cd4768, 0x61d82e56
.long 0x86d8e4d2, 0xd7a4825c, 0xd270f1a2
.long 0xa90fd27a, 0x0167d312, 0xc619809d
.long 0xca6ef3ac, 0x26f6a60a, 0x2b3cac5d
.long 0x4597456a, 0x98d8d9cb, 0x65863b64
.long 0xc9c8b782, 0x68bce87a, 0x1b03397f
.long 0x62ec6c6d, 0x6956fc3b, 0xebb883bd
.long 0x2342001e, 0x3771e98f, 0xb3e32c28
.long 0xe8b6368b, 0x2178513a, 0x064f7f26
.long 0x9ef68d35, 0x170076fa, 0xdd7e3b0c
.long 0x0b0bf8ca, 0x6f345e45, 0xf285651c
.long 0x02ee03b2, 0xff0dba97, 0x10746f3c
.long 0x135c83fd, 0xf872e54c, 0xc7a68855
.long 0x00bcf5f6, 0x86d8e4d2, 0x271d9844
.long 0x58ca5f00, 0x5bb8f1bc, 0x8e766a0c
.long 0xded288f8, 0xb3af077a, 0x93a5f730
.long 0x37170390, 0xca6ef3ac, 0x6cb08e5c
.long 0xf48642e9, 0xdd66cbbb, 0x6b749fb2
.long 0xb25b29f2, 0xe9e28eb4, 0x1393e203
.long 0x45cddf4e, 0xc9c8b782, 0xcec3662e
.long 0xdfd94fb2, 0x93e106a4, 0x96c515bb
.long 0x021ac5ef, 0xd813b325, 0xe6fc4e6a
.long 0x8e1450f7, 0x2342001e, 0x8227bb8a
.long 0xe0cdcf86, 0x6d9a4957, 0xb0cd4768
.long 0x613eee91, 0xd2c3ed1a, 0x39c7ff35
.long 0xbedc6ba1, 0x9ef68d35, 0xd7a4825c
.long 0x0cd1526a, 0xf2271e60, 0x0ab3844b
.long 0xd6c3a807, 0x2664fd8b, 0x0167d312
.long 0x1d31175f, 0x02ee03b2, 0xf6076544
.long 0x4be7fd90, 0x363bd6b3, 0x26f6a60a
.long 0x6eeed1c9, 0x5fabe670, 0xa741c1bf
.long 0xb3a6da94, 0x00bcf5f6, 0x98d8d9cb
.long 0x2e7d11a7, 0x17f27698, 0x49c3cc9c
.long 0x889774e1, 0xaa7c7ad5, 0x68bce87a
.long 0x8a074012, 0xded288f8, 0x57a3d037
.long 0xbd0bb25f, 0x6d390dec, 0x6956fc3b
.long 0x3be3c09b, 0x6353c1cc, 0x42d98888
.long 0x465a4eee, 0xf48642e9, 0x3771e98f
.long 0x2e5f3c8c, 0xdd35bc8d, 0xb42ae3d9
.long 0xa52f58ec, 0x9a5ede41, 0x2178513a
.long 0x47972100, 0x45cddf4e, 0xe0ac139e
.long 0x359674f7, 0xa51b6135, 0x170076fa
.L1: .long 0xaf449247, 0x81256527, 0xccaa009e
.long 0x57c54819, 0x1d9513d7, 0x81256527
.long 0x3f41287a, 0x57c54819, 0xaf449247
.long 0xf5e48c85, 0x910eeec1, 0x1d9513d7
.long 0x1f0c2cdd, 0x9026d5b1, 0xae0b5394
.long 0x71d54a59, 0xf5e48c85, 0x57c54819
.long 0x1c63267b, 0xfe807bbd, 0x0cbec0ed
.long 0xd31343ea, 0xe95c1271, 0x910eeec1
.long 0xf9d9c7ee, 0x71d54a59, 0x3f41287a
.long 0x9ee62949, 0xcec97417, 0x9026d5b1
.long 0xa55d1514, 0xf183c71b, 0xd1df2327
.long 0x21aa2b26, 0xd31343ea, 0xf5e48c85
.long 0x9d842b80, 0xeea395c4, 0x3c656ced
.long 0xd8110ff1, 0xcd669a40, 0xfe807bbd
.long 0x3f9e9356, 0x9ee62949, 0x1f0c2cdd
.long 0x1d6708a0, 0x0c30f51d, 0xe95c1271
.long 0xef82aa68, 0xdb3935ea, 0xb918a347
.long 0xd14bcc9b, 0x21aa2b26, 0x71d54a59
.long 0x99cce860, 0x356d209f, 0xff6f2fc2
.long 0xd8af8e46, 0xc352f6de, 0xcec97417
.long 0xf1996890, 0xd8110ff1, 0x1c63267b
.long 0x631bc508, 0xe95c7216, 0xf183c71b
.long 0x8511c306, 0x8e031a19, 0x9b9bdbd0
.long 0xdb3839f3, 0x1d6708a0, 0xd31343ea
.long 0x7a92fffb, 0xf7003835, 0x4470ac44
.long 0x6ce68f2a, 0x00eba0c8, 0xeea395c4
.long 0x4caaa263, 0xd14bcc9b, 0xf9d9c7ee
.long 0xb46f7cff, 0x9a1b53c8, 0xcd669a40
.long 0x60290934, 0x81b6f443, 0x6d40f445
.long 0x8e976a7d, 0xd8af8e46, 0x9ee62949
.long 0xdcf5088a, 0x9dbdc100, 0x145575d5
.long 0x1753ab84, 0xbbf2f6d6, 0x0c30f51d
.long 0x255b139e, 0x631bc508, 0xa55d1514
.long 0xd784eaa8, 0xce26786c, 0xdb3935ea
.long 0x6d2c864a, 0x8068c345, 0x2586d334
.long 0x02072e24, 0xdb3839f3, 0x21aa2b26
.long 0x06689b0a, 0x5efd72f5, 0xe0575528
.long 0x1e52f5ea, 0x4117915b, 0x356d209f
.long 0x1d3d1db6, 0x6ce68f2a, 0x9d842b80
.long 0x3796455c, 0xb8e0e4a8, 0xc352f6de
.long 0xdf3a4eb3, 0xc55a2330, 0xb84ffa9c
.long 0x28ae0976, 0xb46f7cff, 0xd8110ff1
.long 0x9764bc8d, 0xd7e7a22c, 0x712510f0
.long 0x13a13e18, 0x3e9a43cd, 0xe95c7216
.long 0xb8ee242e, 0x8e976a7d, 0x3f9e9356
.long 0x0c540e7b, 0x753c81ff, 0x8e031a19
.long 0x9924c781, 0xb9220208, 0x3edcde65
.long 0x3954de39, 0x1753ab84, 0x1d6708a0
.long 0xf32238b5, 0xbec81497, 0x9e70b943
.long 0xbbd2cd2c, 0x0925d861, 0xf7003835
.long 0xcc401304, 0xd784eaa8, 0xef82aa68
.long 0x4987e684, 0x6044fbb0, 0x00eba0c8
.long 0x3aa11427, 0x18fe3b4a, 0x87441142
.long 0x297aad60, 0x02072e24, 0xd14bcc9b
.long 0xf60c5e51, 0x6ef6f487, 0x5b7fdd0a
.long 0x632d78c5, 0x3fc33de4, 0x9a1b53c8
.long 0x25b8822a, 0x1e52f5ea, 0x99cce860
.long 0xd4fc84bc, 0x1af62fb8, 0x81b6f443
.long 0x5690aa32, 0xa91fdefb, 0x688a110e
.long 0x1357a093, 0x3796455c, 0xd8af8e46
.long 0x798fdd33, 0xaaa18a37, 0x357b9517
.long 0xc2815395, 0x54d42691, 0x9dbdc100
.long 0x21cfc0f7, 0x28ae0976, 0xf1996890
.long 0xa0decef3, 0x7b4aa8b7, 0xbbf2f6d6


@ -57,7 +57,7 @@
The loop tail is handled by always copying 64 bytes from the end.
*/
SYM_FUNC_START(__pi_memcpy)
SYM_FUNC_START_LOCAL(__pi_memcpy_generic)
add srcend, src, count
add dstend, dstin, count
cmp count, 128
@ -238,7 +238,24 @@ L(copy64_from_start):
stp B_l, B_h, [dstin, 16]
stp C_l, C_h, [dstin]
ret
SYM_FUNC_END(__pi_memcpy_generic)
#ifdef CONFIG_AS_HAS_MOPS
.arch_extension mops
SYM_FUNC_START(__pi_memcpy)
alternative_if_not ARM64_HAS_MOPS
b __pi_memcpy_generic
alternative_else_nop_endif
mov dst, dstin
cpyp [dst]!, [src]!, count!
cpym [dst]!, [src]!, count!
cpye [dst]!, [src]!, count!
ret
SYM_FUNC_END(__pi_memcpy)
#else
SYM_FUNC_ALIAS(__pi_memcpy, __pi_memcpy_generic)
#endif
SYM_FUNC_ALIAS(__memcpy, __pi_memcpy)
EXPORT_SYMBOL(__memcpy)


@ -26,6 +26,7 @@
*/
dstin .req x0
val_x .req x1
val .req w1
count .req x2
tmp1 .req x3
@ -42,7 +43,7 @@ dst .req x8
tmp3w .req w9
tmp3 .req x9
SYM_FUNC_START(__pi_memset)
SYM_FUNC_START_LOCAL(__pi_memset_generic)
mov dst, dstin /* Preserve return value. */
and A_lw, val, #255
orr A_lw, A_lw, A_lw, lsl #8
@ -201,7 +202,24 @@ SYM_FUNC_START(__pi_memset)
ands count, count, zva_bits_x
b.ne .Ltail_maybe_long
ret
SYM_FUNC_END(__pi_memset_generic)
#ifdef CONFIG_AS_HAS_MOPS
.arch_extension mops
SYM_FUNC_START(__pi_memset)
alternative_if_not ARM64_HAS_MOPS
b __pi_memset_generic
alternative_else_nop_endif
mov dst, dstin
setp [dst]!, count!, val_x
setm [dst]!, count!, val_x
sete [dst]!, count!, val_x
ret
SYM_FUNC_END(__pi_memset)
#else
SYM_FUNC_ALIAS(__pi_memset, __pi_memset_generic)
#endif
SYM_FUNC_ALIAS(__memset, __pi_memset)
EXPORT_SYMBOL(__memset)


@ -11,6 +11,7 @@ obj-$(CONFIG_TRANS_TABLE) += trans_pgd.o
obj-$(CONFIG_TRANS_TABLE) += trans_pgd-asm.o
obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
obj-$(CONFIG_ARM64_MTE) += mteswap.o
obj-$(CONFIG_ARM64_GCS) += gcs.o
KASAN_SANITIZE_physaddr.o += n
obj-$(CONFIG_KASAN) += kasan_init.o


@ -18,15 +18,40 @@ void copy_highpage(struct page *to, struct page *from)
{
void *kto = page_address(to);
void *kfrom = page_address(from);
struct folio *src = page_folio(from);
struct folio *dst = page_folio(to);
unsigned int i, nr_pages;
copy_page(kto, kfrom);
if (kasan_hw_tags_enabled())
page_kasan_tag_reset(to);
if (system_supports_mte() && page_mte_tagged(from)) {
if (!system_supports_mte())
return;
if (folio_test_hugetlb(src) &&
folio_test_hugetlb_mte_tagged(src)) {
if (!folio_try_hugetlb_mte_tagging(dst))
return;
/*
* Populate tags for all subpages.
*
* Don't assume the first page is head page since
* huge page copy may start from any subpage.
*/
nr_pages = folio_nr_pages(src);
for (i = 0; i < nr_pages; i++) {
kfrom = page_address(folio_page(src, i));
kto = page_address(folio_page(dst, i));
mte_copy_page_tags(kto, kfrom);
}
folio_set_hugetlb_mte_tagged(dst);
} else if (page_mte_tagged(from)) {
/* It's a new page, shouldn't have been tagged yet */
WARN_ON_ONCE(!try_page_mte_tagging(to));
mte_copy_page_tags(kto, kfrom);
set_page_mte_tagged(to);
}


@ -504,6 +504,14 @@ static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
false);
}
static bool is_gcs_fault(unsigned long esr)
{
if (!esr_is_data_abort(esr))
return false;
return ESR_ELx_ISS2(esr) & ESR_ELx_GCS;
}
static bool is_el0_instruction_abort(unsigned long esr)
{
return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@ -518,6 +526,23 @@ static bool is_write_abort(unsigned long esr)
return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
}
static bool is_invalid_gcs_access(struct vm_area_struct *vma, u64 esr)
{
if (!system_supports_gcs())
return false;
if (unlikely(is_gcs_fault(esr))) {
/* GCS accesses must be performed on a GCS page */
if (!(vma->vm_flags & VM_SHADOW_STACK))
return true;
} else if (unlikely(vma->vm_flags & VM_SHADOW_STACK)) {
/* Only GCS operations can write to a GCS page */
return esr_is_data_abort(esr) && is_write_abort(esr);
}
return false;
}
static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
struct pt_regs *regs)
{
@ -554,6 +579,14 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
/* It was exec fault */
vm_flags = VM_EXEC;
mm_flags |= FAULT_FLAG_INSTRUCTION;
} else if (is_gcs_fault(esr)) {
/*
* The GCS permission on a page implies both read and
* write so always handle any GCS fault as a write fault,
* we need to trigger CoW even for GCS reads.
*/
vm_flags = VM_WRITE;
mm_flags |= FAULT_FLAG_WRITE;
} else if (is_write_abort(esr)) {
/* It was write fault */
vm_flags = VM_WRITE;
@ -587,6 +620,13 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
if (!vma)
goto lock_mmap;
if (is_invalid_gcs_access(vma, esr)) {
vma_end_read(vma);
fault = 0;
si_code = SEGV_ACCERR;
goto bad_area;
}
if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
fault = 0;


@ -47,7 +47,8 @@ static void __init early_fixmap_init_pte(pmd_t *pmdp, unsigned long addr)
if (pmd_none(pmd)) {
ptep = bm_pte[BM_PTE_TABLE_IDX(addr)];
__pmd_populate(pmdp, __pa_symbol(ptep), PMD_TYPE_TABLE);
__pmd_populate(pmdp, __pa_symbol(ptep),
PMD_TYPE_TABLE | PMD_TABLE_AF);
}
}
@ -59,7 +60,8 @@ static void __init early_fixmap_init_pmd(pud_t *pudp, unsigned long addr,
pmd_t *pmdp;
if (pud_none(pud))
__pud_populate(pudp, __pa_symbol(bm_pmd), PUD_TYPE_TABLE);
__pud_populate(pudp, __pa_symbol(bm_pmd),
PUD_TYPE_TABLE | PUD_TABLE_AF);
pmdp = pmd_offset_kimg(pudp, addr);
do {
@ -86,7 +88,8 @@ static void __init early_fixmap_init_pud(p4d_t *p4dp, unsigned long addr,
}
if (p4d_none(p4d))
__p4d_populate(p4dp, __pa_symbol(bm_pud), P4D_TYPE_TABLE);
__p4d_populate(p4dp, __pa_symbol(bm_pud),
P4D_TYPE_TABLE | P4D_TABLE_AF);
pudp = pud_offset_kimg(p4dp, addr);
early_fixmap_init_pmd(pudp, addr, end);

arch/arm64/mm/gcs.c (new file, 254 lines)

@ -0,0 +1,254 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/syscalls.h>
#include <linux/types.h>
#include <asm/cmpxchg.h>
#include <asm/cpufeature.h>
#include <asm/gcs.h>
#include <asm/page.h>
static unsigned long alloc_gcs(unsigned long addr, unsigned long size)
{
int flags = MAP_ANONYMOUS | MAP_PRIVATE;
struct mm_struct *mm = current->mm;
unsigned long mapped_addr, unused;
if (addr)
flags |= MAP_FIXED_NOREPLACE;
mmap_write_lock(mm);
mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags,
VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
mmap_write_unlock(mm);
return mapped_addr;
}
static unsigned long gcs_size(unsigned long size)
{
if (size)
return PAGE_ALIGN(size);
/* Allocate RLIMIT_STACK/2 with limits of PAGE_SIZE..2G */
size = PAGE_ALIGN(min_t(unsigned long long,
rlimit(RLIMIT_STACK) / 2, SZ_2G));
return max(PAGE_SIZE, size);
}
unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
const struct kernel_clone_args *args)
{
unsigned long addr, size;
if (!system_supports_gcs())
return 0;
if (!task_gcs_el0_enabled(tsk))
return 0;
if ((args->flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM) {
tsk->thread.gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
return 0;
}
size = args->stack_size / 2;
size = gcs_size(size);
addr = alloc_gcs(0, size);
if (IS_ERR_VALUE(addr))
return addr;
tsk->thread.gcs_base = addr;
tsk->thread.gcs_size = size;
tsk->thread.gcspr_el0 = addr + size - sizeof(u64);
return addr;
}
SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
{
unsigned long alloc_size;
unsigned long __user *cap_ptr;
unsigned long cap_val;
int ret = 0;
int cap_offset;
if (!system_supports_gcs())
return -EOPNOTSUPP;
if (flags & ~(SHADOW_STACK_SET_TOKEN | SHADOW_STACK_SET_MARKER))
return -EINVAL;
if (!PAGE_ALIGNED(addr))
return -EINVAL;
if (size == 8 || !IS_ALIGNED(size, 8))
return -EINVAL;
/*
* An overflow would result in attempting to write the restore token
* to the wrong location. Not catastrophic, but just return the right
* error code and block it.
*/
alloc_size = PAGE_ALIGN(size);
if (alloc_size < size)
return -EOVERFLOW;
addr = alloc_gcs(addr, alloc_size);
if (IS_ERR_VALUE(addr))
return addr;
/*
* Put a cap token at the end of the allocated region so it
* can be switched to.
*/
if (flags & SHADOW_STACK_SET_TOKEN) {
/* Leave an extra empty frame as a top of stack marker? */
if (flags & SHADOW_STACK_SET_MARKER)
cap_offset = 2;
else
cap_offset = 1;
cap_ptr = (unsigned long __user *)(addr + size -
(cap_offset * sizeof(unsigned long)));
cap_val = GCS_CAP(cap_ptr);
put_user_gcs(cap_val, cap_ptr, &ret);
if (ret != 0) {
vm_munmap(addr, size);
return -EFAULT;
}
/*
* Ensure the new cap is ordered before standard
* memory accesses to the same location.
*/
gcsb_dsync();
}
return addr;
}
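From userspace the syscall above looks roughly like the following sketch (not part of the series): map a 64KiB shadow stack carrying both a cap token and an end-of-stack marker, matching the flag handling in map_shadow_stack(). It assumes __NR_map_shadow_stack and the SHADOW_STACK_SET_* flags are available from the installed uapi headers (shown here via <asm/mman.h>).

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/mman.h>		/* SHADOW_STACK_SET_TOKEN/_MARKER (assumed header location) */

int main(void)
{
	size_t size = 64 * 1024;
	long gcs;

	gcs = syscall(__NR_map_shadow_stack, 0UL, size,
		      SHADOW_STACK_SET_TOKEN | SHADOW_STACK_SET_MARKER);
	if (gcs == -1) {
		perror("map_shadow_stack");
		return 1;
	}

	/*
	 * With both flags set the cap token is the second entry from the
	 * top, sitting below the empty end-of-stack marker frame.
	 */
	printf("GCS base %#lx, cap token at %#lx\n",
	       (unsigned long)gcs,
	       (unsigned long)gcs + size - 2 * sizeof(unsigned long));
	return 0;
}

The cap token is what allows the new stack to be switched to later, for example by a newly created thread.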
/*
* Apply the GCS mode configured for the specified task to the
* hardware.
*/
void gcs_set_el0_mode(struct task_struct *task)
{
u64 gcscre0_el1 = GCSCRE0_EL1_nTR;
if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE)
gcscre0_el1 |= GCSCRE0_EL1_RVCHKEN | GCSCRE0_EL1_PCRSEL;
if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_WRITE)
gcscre0_el1 |= GCSCRE0_EL1_STREn;
if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_PUSH)
gcscre0_el1 |= GCSCRE0_EL1_PUSHMEn;
write_sysreg_s(gcscre0_el1, SYS_GCSCRE0_EL1);
}
void gcs_free(struct task_struct *task)
{
if (!system_supports_gcs())
return;
/*
* When fork() with CLONE_VM fails, the child (tsk) already
* has a GCS allocated, and exit_thread() calls this function
* to free it. In this case the parent (current) and the
* child share the same mm struct.
*/
if (!task->mm || task->mm != current->mm)
return;
if (task->thread.gcs_base)
vm_munmap(task->thread.gcs_base, task->thread.gcs_size);
task->thread.gcspr_el0 = 0;
task->thread.gcs_base = 0;
task->thread.gcs_size = 0;
}
int arch_set_shadow_stack_status(struct task_struct *task, unsigned long arg)
{
unsigned long gcs, size;
int ret;
if (!system_supports_gcs())
return -EINVAL;
if (is_compat_thread(task_thread_info(task)))
return -EINVAL;
/* Reject unknown flags */
if (arg & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
return -EINVAL;
ret = gcs_check_locked(task, arg);
if (ret != 0)
return ret;
/* If we are enabling GCS then make sure we have a stack */
if (arg & PR_SHADOW_STACK_ENABLE &&
!task_gcs_el0_enabled(task)) {
/* Do not allow GCS to be reenabled */
if (task->thread.gcs_base || task->thread.gcspr_el0)
return -EINVAL;
if (task != current)
return -EBUSY;
size = gcs_size(0);
gcs = alloc_gcs(0, size);
if (!gcs)
return -ENOMEM;
task->thread.gcspr_el0 = gcs + size - sizeof(u64);
task->thread.gcs_base = gcs;
task->thread.gcs_size = size;
if (task == current)
write_sysreg_s(task->thread.gcspr_el0,
SYS_GCSPR_EL0);
}
task->thread.gcs_el0_mode = arg;
if (task == current)
gcs_set_el0_mode(task);
return 0;
}
int arch_get_shadow_stack_status(struct task_struct *task,
unsigned long __user *arg)
{
if (!system_supports_gcs())
return -EINVAL;
if (is_compat_thread(task_thread_info(task)))
return -EINVAL;
return put_user(task->thread.gcs_el0_mode, arg);
}
int arch_lock_shadow_stack_status(struct task_struct *task,
unsigned long arg)
{
if (!system_supports_gcs())
return -EINVAL;
if (is_compat_thread(task_thread_info(task)))
return -EINVAL;
/*
* We support locking unknown bits so applications can prevent
* any changes in a future proof manner.
*/
task->thread.gcs_el0_locked |= arg;
return 0;
}
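These three hooks back the generic PR_GET/PR_SET/PR_LOCK_SHADOW_STACK_STATUS prctl()s. A hedged userspace sketch of querying the state follows; note that actually enabling GCS via PR_SET_SHADOW_STACK_STATUS is normally done very early (by the C library or a careful assembly wrapper), because every procedure return after enabling is checked against the fresh, empty stack. The PR_* names are assumed to come from <linux/prctl.h> of a kernel carrying this series.

#include <stdio.h>
#include <sys/prctl.h>
#include <linux/prctl.h>	/* PR_*_SHADOW_STACK_* (assumed from this series) */

int main(void)
{
	unsigned long status;

	if (prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0, 0, 0)) {
		perror("PR_GET_SHADOW_STACK_STATUS");
		return 1;
	}

	printf("GCS %sabled, writes %s, push/pop %s\n",
	       (status & PR_SHADOW_STACK_ENABLE) ? "en" : "dis",
	       (status & PR_SHADOW_STACK_WRITE) ? "allowed" : "blocked",
	       (status & PR_SHADOW_STACK_PUSH) ? "allowed" : "blocked");
	return 0;
}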

Some files were not shown because too many files have changed in this diff.