Commit graph

3552 commits

Author SHA1 Message Date
Soby Mathew
ca3f2eee11 Merge "feat(rmmd): verify FEAT_MEC present before calling plat hoook" into integration 2025-03-26 17:39:57 +01:00
Juan Pablo Conde
609ada9691 feat(rmmd): verify FEAT_MEC present before calling plat hoook
Some platforms do not support FEAT_MEC. Hence, they do not provide
an interface to update the update of the key corresponding to a
MECID.

This patch adds a condition in order to verify FEAT_MEC is present
before calling the corresponding platform hook, thus preventing it
from being called when the platform does not support the feature.

Change-Id: Ib1eb9e42f475e27ec31529569e888b93b207148c
Signed-off-by: Juan Pablo Conde <juanpablo.conde@arm.com>
2025-03-26 15:46:38 +01:00
Soby Mathew
90f9c9bef5 Merge "feat(rme): add SMMU and PCIe information to Boot manifest" into integration 2025-03-25 12:35:47 +01:00
AlexeiFedorov
90552c612e feat(rme): add SMMU and PCIe information to Boot manifest
- Define information structures for SMMU, root complex,
  root port and BDF mappings.
- Add entries for SMMU and PCIe root complexes to Boot manifest.
- Update RMMD_MANIFEST_VERSION_MINOR from 4 to 5.

Change-Id: I0a76dc18edbaaff40116f376aeb56c750d57c7c1
Signed-off-by: AlexeiFedorov <Alexei.Fedorov@arm.com>
2025-03-25 10:26:18 +00:00
Manish Pandey
518b278bed Merge changes from topic "hm/handoff-aarch32" into integration
* changes:
  refactor(arm): simplify early platform setup functions
  feat(bl32): enable r3 usage for boot args
  feat(handoff): add lib to sp-min sources
  feat(handoff): add 32-bit variant of SRAM layout
  feat(handoff): add 32-bit variant of ep info
  fix(aarch32): avoid using r12 to store boot params
  fix(arm): reinit secure and non-secure tls
  refactor(handoff): downgrade error messages
2025-03-24 17:29:57 +01:00
Manish V Badarkhe
4c7fa977b7 Merge "chore(cm): add MDCR_EL3.RLTE to context management" into integration 2025-03-21 12:25:42 +01:00
Madhukar Pappireddy
38b5f93a2b Merge "feat(lib): implement strnlen secure and strcpy secure function" into integration 2025-03-20 15:44:44 +01:00
Harrison Mutai
8921349894 refactor(arm): simplify early platform setup functions
Refactor `arm_sp_min_early_platform_setup` to accept generic
`u_register_r` values to support receiving firmware handoff boot
arguments in common code. This has the added benefit of simplifying the
interface into common early platform setup.

Change-Id: Idfc3d41f94f2bf3a3a0c7ca39f6b9b0013836e3a
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
2025-03-20 13:57:14 +00:00
Manish V Badarkhe
7e84854015 Merge changes from topic "dtpm_poc" into integration
* changes:
  feat(docs): update mboot threat model with dTPM
  docs(tpm): add design documentation for dTPM
  fix(rpi3):  expose BL1_RW to BL2 map for mboot
  feat(rpi3): add dTPM backed measured boot
  feat(tpm): add Infineon SLB9670 GPIO SPI config
  feat(tpm): add tpm drivers and framework
  feat(io): add generic gpio spi bit-bang driver
  feat(rpi3): implement eventlog handoff to BL33
  feat(rpi3): implement mboot for rpi3
2025-03-20 12:57:14 +01:00
Soby Mathew
4848824548 Merge changes from topic "mec" into integration
* changes:
  feat(qemu): add plat_rmmd_mecid_key_update()
  feat(rmmd): add RMM_MECID_KEY_UPDATE call
2025-03-20 10:26:23 +01:00
Boyan Karatotev
c1b0a97b7a chore(cm): add MDCR_EL3.RLTE to context management
The bit is already implicitly zero so no functional change. Adding it
helps fully describe how we expect FEAT_TRF to behave.

Change-Id: If7a7881e2b50188222ce46265b432d658a664c75
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-03-20 07:26:15 +00:00
Jit Loon Lim
eb088894dc feat(lib): implement strnlen secure and strcpy secure function
Implement safer version of 'strnlen' function
to handle NULL terminated strings with additional
bound checking and secure version of string copy function
to support better security and avoid destination
buffer overflow.

Change-Id: I93916f003b192c1c6da6a4f78a627c8885db11d9
Signed-off-by: Jit Loon Lim <jit.loon.lim@altera.com>
Signed-off-by: Girisha Dengi <girisha.dengi@intel.com>
2025-03-19 12:57:35 +08:00
Tushar Khandelwal
f801fdc22e feat(rmmd): add RMM_MECID_KEY_UPDATE call
With this addition, TF-A now has an SMC call to handle the
update of MEC keys associated to MECIDs.

The behavior of this newly added call is empty for now until an
implementation for the MPE (Memory Protection Engine) driver is
available. Only parameter sanitization has been implemented.

Signed-off-by: Tushar Khandelwal <tushar.khandelwal@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Juan Pablo Conde <juanpablo.conde@arm.com>
Change-Id: I2a969310b47e8c6da1817a79be0cd56158c6efc3
2025-03-18 17:17:06 -05:00
Abhi Singh
6fa56e9367 feat(tpm): add Infineon SLB9670 GPIO SPI config
add the Infineon Optiga SLB9670 TPM2.0 GPIO SPI
configuration data, as well as chip reset and the
GPIO SPI bitbang driver initialization. This code
supports use with the rpi3 platform, with availibility
to add configuration parameters for other platforms

Change-Id: Ibdffb28fa0b3b5a18dff2ba5d4ea305633740763
Signed-off-by: Abhi Singh <abhi.singh@arm.com>
2025-03-18 19:57:56 +01:00
Abhi.Singh
36e3d877cd feat(tpm): add tpm drivers and framework
Add tpm2 drivers to tf-a with adequate framework
-implement a fifo spi interface that works
 with discrete tpm chip.
-implement tpm command layer interfaces that are used
 to initialize, start and make measurements and
 close the interface.
-tpm drivers are built using their own make file
 to allow for ease in porting across platforms,
 and across different interfaces.

Signed-off-by: Tushar Khandelwal <tushar.khandelwal@arm.com>
Signed-off-by: Abhi Singh <abhi.singh@arm.com>
Change-Id: Ie1a189f45c80f26f4dea16c3bd71b1503709e0ea
2025-03-18 19:57:22 +01:00
Abhi Singh
3c54570afc feat(io): add generic gpio spi bit-bang driver
When using a tpm breakout board with rpi3, we elected to bit-bang
gpio pins to emulate a spi interface, this implementation required a
driver to interface with the platform specific pins and emulate spi
functionality. The generic driver provides the ability to pass in a
gpio_spi_data structure that contains the necessary gpio pins in
order to simulate spi operations (get_access, start, stop, xfer).

Change-Id: I88919e8a294c05e0cabb8224e35ae5c1ba5f2413
Signed-off-by: Tushar Khandelwal <tushar.khandelwal@arm.com>
Signed-off-by: Abhi Singh <abhi.singh@arm.com>
2025-03-18 19:56:16 +01:00
John Powell
f2bd352820 fix(errata): workaround for Cortex-A510 erratum 2971420
Cortex-A510 erratum 2971420 applies to revisions r0p1, r0p2, r0p3,
r1p0, r1p1, r1p2 and r1p3, and is still open.

Under some conditions, data might be corrupted if Trace Buffer
Extension (TRBE) is enabled. The workaround is to disable trace
collection via TRBE by programming MDCR_EL3.NSTB[1] to the opposite
value of SCR_EL3.NS on a security state switch. Since we only enable
TRBE for non-secure world, the workaround is to disable TRBE by
setting the NSTB field to 00 so accesses are trapped to EL3 and
secure state owns the buffer.

SDEN: https://developer.arm.com/documentation/SDEN-1873361/latest/

Signed-off-by: John Powell <john.powell@arm.com>
Change-Id: Ia77051f6b64c726a8c50596c78f220d323ab7d97
2025-03-17 19:04:54 +01:00
John Powell
fcf2ab71ac fix(cpus): workaround for Cortex-A715 erratum 2804830
Cortex-A715 erratum 2804830 applies to r0p0, r1p0, r1p1 and r1p2,
and is fixed in r1p3.

Under some conditions, writes of a 64B-aligned, 64B granule of
memory might cause data corruption without this workaround. See SDEN
for details.

Since this workaround disables write streaming, it is expected to
have a significant performance impact for code that is heavily
reliant on write streaming, such as memcpy or memset.

SDEN: https://developer.arm.com/documentation/SDEN-2148827/latest/

Change-Id: Ia12f6c7de7c92f6ea4aec3057b228b828d48724c
Signed-off-by: John Powell <john.powell@arm.com>
2025-03-17 18:17:48 +01:00
Harrison Mutai
8001247ce2 feat(handoff): add 32-bit variant of SRAM layout
Introduce the 32-bit variant of the SRAM layout used by BL1 to
communicate available free SRAM to BL2. This layout was added to the
specification in:
https://github.com/FirmwareHandoff/firmware_handoff/pull/54.

Change-Id: I559fb8a00725eaedf01856af42d73029802aa095
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
2025-03-17 16:58:51 +00:00
Harrison Mutai
7ffc1d6cf3 feat(handoff): add 32-bit variant of ep info
Add the 32-bit version of the entry_point_info structure used to pass
the boot arguments for future executables, added to the spec under the
PR: https://github.com/FirmwareHandoff/firmware_handoff/pull/54.

Change-Id: Id98e0f98db6ffd4790193e201f24e62101450e20
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
2025-03-17 16:58:49 +00:00
Govindraj Raja
8762735bea Merge changes from topic "mb/drtm" into integration
* changes:
  feat(drtm): validate launch features in DRTM parameters
  feat(lib): add EXTRACT_FIELD macro for field extraction
2025-03-12 16:11:17 +01:00
Soby Mathew
c5ea3faca1 Merge "feat(rmmd): add FEAT_MEC support" into integration 2025-03-12 11:19:04 +01:00
Tushar Khandelwal
7e84f3cf90 feat(rmmd): add FEAT_MEC support
This patch provides architectural support for further use of
Memory Encryption Contexts (MEC) by declaring the necessary
registers, bits, masks, helpers and values and modifying the
necessary registers to enable FEAT_MEC.

Signed-off-by: Tushar Khandelwal <tushar.khandelwal@arm.com>
Signed-off-by: Juan Pablo Conde <juanpablo.conde@arm.com>
Change-Id: I670dbfcef46e131dcbf3a0b927467ebf6f438fa4
2025-03-11 14:46:00 -05:00
Manish V Badarkhe
8666bcfa75 feat(drtm): validate launch features in DRTM parameters
Perform sanity checks on the launch features received via DRTM parameters.
Return INVALID_PARAMETERS if they are incorrect.

Change-Id: I7e8068154028d1c8f6b6b45449616bb5711ea76e
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
2025-03-09 11:59:14 +00:00
Manish V Badarkhe
af1dd6e1a5 feat(lib): add EXTRACT_FIELD macro for field extraction
Introduce a new EXTRACT_FIELD macro to simplify the extraction
of specific fields from a value by shifting the value right
and applying the mask.

Change-Id: Iae9573d6d23067bbde13253e264e4f6f18b806c2
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
2025-03-09 11:57:38 +00:00
Arvind Ram Prakash
8656bdab57 fix(cpufeat): include FEAT_MOPS declaration in aarch32 header
This patch adds the missing is_feat_mops_supported() declaration
in aarch32 header.

Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: I875f65defe23912351f9ef18555a5b0a0e53717d
2025-03-07 12:34:27 -06:00
Madhukar Pappireddy
7aa73612d7 Merge "fix(cpufeat): avoid using mrrs/msrr for tspd" into integration 2025-03-07 18:20:01 +01:00
Govindraj Raja
f3e2b49970 fix(cpufeat): avoid using mrrs/msrr for tspd
tspd compiles with `arch_helpers.h` and when FEAT_D128 is enabled
read/writes to D128 impacted registers will provide 128-bit
mrrs/msrr read/write implementation.

However FEAT_D128 implementation with SCR_EL3.D128en is set only
for lower-EL Non-Secure world. When tspd is chosen as the SPD target,
it builds tsp as well. This S-EL1 payload, used for testing,
inadvertently uses mrrs/msrr read/write implementation in
`modify_el1_common_regs` helper function. This eventually leads
to a panic.

Group all D128 impacted registers and avoid using mrrs/msrr read/write
implementation for tspd builds.

Change-Id: Ic0ed3a901ffa65f9447cae08951defbadee3e02a
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
2025-03-07 18:12:12 +01:00
Arvind Ram Prakash
bbff267b6f fix(errata-abi): add support for handling split workarounds
Certain erratum workarounds like Neoverse N1 1542419, need a part
of their mitigation done in EL3 and the rest in lower EL. But currently
such workarounds return HIGHER_EL_MITIGATION which indicates that the
erratum has already been mitigated by a higher EL(EL3 in this case)
which causes the lower EL to not apply it's part of the mitigation.

This patch fixes this issue by adding support for split workarounds
so that on certain errata we return AFFECTED even though EL3 has
applied it's workaround. This is done by reusing the chosen field of
erratum_entry structure into a bitfield that has two bitfields -
Bit 0 indicates that the erratum has been enabled in build,
Bit 1 indicates that the erratum is a split workaround and should
return AFFECTED instead of HIGHER_EL_MITIGATION.

SDEN documentation:
https://developer.arm.com/documentation/SDEN885747/latest

Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: Iec94d665b5f55609507a219a7d1771eb75e7f4a7
2025-03-07 17:02:25 +01:00
Boyan Karatotev
2bec665f46 fix(smccc): register PMUv3p5 and PMUv3p7 bits with the FEATURE_AVAILABILITY call
These bits were missed with the original implementation. They are set if
supported, so we need to ignore them.

Change-Id: I3a94017bacdc54bfc14f0add972240148da3b41d
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-03-07 15:28:35 +01:00
Manish Pandey
d153bcf427 Merge "feat(spm_mm): move mm_communication header define to general header" into integration 2025-03-06 23:36:19 +01:00
Vinoj Soundararajan
ec6f49c26b feat(ras): add eabort get helper function
Add EABORT get field helper function to obtain SET, AET (UET) values
from esr_el3/disr_el1 based on PE error state recording in the exception
syndrome refer to RAS PE architecture in
https://developer.arm.com/documentation/ddi0487/latest/

Change-Id: I0011f041a3089c9bbf670275687ad7c3362a07f9
Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
2025-03-06 13:45:08 +00:00
Vinoj Soundararajan
daeae49511 feat(ras): add asynchronous error type corrected
Add asynchronous error type Corrected (CE) to error status
AET based on PE error state recording in the exception syndrome
Refer to https://developer.arm.com/documentation/ddi0487/latest/
RAS PE architecture.

Change-Id: I9f2525411b94c8fd397b4a0b8cf5dc47457a2771
Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
2025-03-06 13:34:23 +00:00
Vinoj Soundararajan
e5cd3e81d1 fix(ras): fix typo in uncorrectable error type UEO
Fix spelling for UEO from restable to restartable
based on PE error state recording in the exception syndrome
Refer to https://developer.arm.com/documentation/ddi0487/latest/
RAS PE architecture.

Change-Id: I4da419f2120a7385853d4da78b409c675cdfe1c8
Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
2025-03-06 13:30:19 +00:00
Vinoj Soundararajan
9c17687aab fix(ras): fix status synchronous error type fields
Based on SET bits of ISS encoding for an exception from Data or
Instruction Abort. (Refer to ESR_EL3)
1. Fix Synchronous error type restartable value from 1 to 3
2. Remove corrected CE field which is not applicable to SET

Change-Id: If357da9881bee962825bc3b9423ba7fc107f9b1d
Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
2025-03-06 13:14:02 +00:00
Manish V Badarkhe
7990cc80d6 Merge "feat(handoff): add transfer entry printer" into integration 2025-02-28 18:15:31 +01:00
Manish Pandey
c72200357a fix(el3-runtime): replace CTX_ESR_EL3 with CTX_DOUBLE_FAULT_ESR
ESR_EL3 value is updated when an exception is taken to EL3 and its value
does not change until a new exception is taken to EL3. We need to save
ESR in context memory only when we expect nested exception in EL3.

The scenarios where we would expect nested EL3 execution are related
with FFH_SUPPORT, namely
  1.Handling pending async EAs at EL3 boundry
    - It uses CTX_SAVED_ESR_EL3 to preserve origins esr_el3
  2.Double fault handling
    - Introduce an explicit storage (CTX_DOUBLE_FAULT_ESR) for esr_el3
      to take care of DobuleFault.

As the ESR context has been removed, read the register directly instead
of its context value in RD platform.

Signed-off-by: Manish Pandey <manish.pandey2@arm.com>
Change-Id: I7720c5f03903f894a77413a235e3cc05c86f9c17
2025-02-28 11:48:37 +00:00
Govindraj Raja
70b5967ebc Merge changes from topic "mb/drtm" into integration
* changes:
  feat(drtm): retrieve DLME image authentication features
  feat(drtm): log No-Action Event in Event Log for DRTM measurements
  feat(fvp): add stub function to retrieve DLME image auth features
  feat(drtm): introduce plat API for DLME authentication features
  feat(drtm): ensure event types aligns with DRTM specification v1.1
  fix(drtm): add missing DLME data regions for min size requirement
  feat(fvp): add stub platform function to get ACPI table region size
  feat(drtm): add platform API to retrieve ACPI tables region size
2025-02-27 19:14:11 +01:00
Govindraj Raja
98c6516520 chore: rename arcadia to Cortex-A320
Cortex-A320 has been announced, rename arcadia to Cortex-A320.

Ref:
https://newsroom.arm.com/blog/introducing-arm-cortex-a320-cpu
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a320

Change-Id: Ifb3743d43dca3d8caaf1e7416715ccca4fdf195f
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
2025-02-26 11:00:41 -06:00
Manish V Badarkhe
94127ae299 feat(drtm): retrieve DLME image authentication features
Retrieve DLME image authentication features and report them
back to the DCE preamble. Currently, this value is always set
to 0, as no platform supports DLME authentication.

Additionally, the default schema is always used instead of
the DLME PCR schema since DLME authentication is not currently
supported.

This change primarily upgrades the DRTM parameters version to V2,
aligning with DRTM spec v1.1 [1].

[1]: https://developer.arm.com/documentation/den0113/c/?lang=en

Change-Id: Ie2ceb0d2ff49465643597e8725710a93d89e74a2
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
2025-02-26 12:56:30 +00:00
Manish V Badarkhe
0f7ebef73e feat(drtm): introduce plat API for DLME authentication features
This patch introduces a platform-specific function to provide DLME
authentication features. While no platforms currently support DLME
authentication, this change offers a structured way for platforms
to define and expose their DLME authentication features, with the
flexibility to extend support in the future if needed.

Change-Id: Ia708914477c4d8cfee4809a9daade9a3e91ed073
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
2025-02-26 12:52:22 +00:00
Manish V Badarkhe
7792bdbdf9 feat(drtm): add platform API to retrieve ACPI tables region size
Introduces a platform-specific API to retrieve the ACPI table
region size. This will be used in a subsequent patch to specify
the minimum DLME size requirement for the DCE preamble.

Change-Id: I44ce9241733b22fea3cbce9d42f1c2cc5ef20852
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
2025-02-26 12:52:22 +00:00
Harrison Mutai
937c513d5e feat(handoff): add transfer entry printer
Change-Id: Ib7d370b023f92f2fffbd341bcf874914fcc1bac2
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
2025-02-25 09:32:42 +00:00
Boyan Karatotev
0a580b5128 perf(cm): drop ZCR_EL3 saving and some ISBs and replace them with root context
SVE and SME aren't enabled symmetrically for all worlds, but EL3 needs
to context switch them nonetheless. Previously, this had to happen by
writing the enable bits just before reading/writing the relevant
context. But since the introduction of root context, this need not be
the case. We can have these enables always be present for EL3 and save
on some work (and ISBs!) on every context switch.

We can also hoist ZCR_EL3 to a never changing register, as we set its
value to be identical for every world, which happens to be the one we
want for EL3 too.

Change-Id: I3d950e72049a298008205ba32f230d5a5c02f8b0
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:52:06 +00:00
Boyan Karatotev
83ec7e452c perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and
has worked quite well so far. However, recent implementations expose a
weakness in that this is rather slow. A large part of it is written in
assembly, making it opaque to the compiler for optimisations. The
future proofness requires reading registers that are effectively
`volatile`, making it even harder for the compiler, as well as adding
lots of implicit barriers, making it hard for the microarchitecutre to
optimise as well.

We can make a few assumptions, checked by a few well placed asserts, and
remove a lot of this burden. For a start, at the moment there are 4
group 0 counters with static assignments. Contexting them is a trivial
affair that doesn't need a loop. Similarly, there can only be up to 16
group 1 counters. Contexting them is a bit harder, but we can do with a
single branch with a falling through switch. If/when both of these
change, we have a pair of asserts and the feature detection mechanism to
guard us against pretending that we support something we don't.

We can drop contexting of the offset registers. They are fully
accessible by EL2 and as such are its responsibility to preserve on
powerdown.

Another small thing we can do, is pass the core_pos into the hook.
The caller already knows which core we're running on, we don't need to
call this non-trivial function again.

Finally, knowing this, we don't really need the auxiliary AMUs to be
described by the device tree. Linux doesn't care at the moment, and any
information we need for EL3 can be neatly placed in a simple array.

All of this, combined with lifting the actual saving out of assembly,
reduces the instructions to save the context from 180 to 40, including a
lot fewer branches. The code is also much shorter and easier to read.

Also propagate to aarch32 so that the two don't diverge too much.

Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:50:46 +00:00
Boyan Karatotev
2590e819eb perf(mpmm): greatly simplify MPMM enablement
MPMM is a core-specific microarchitectural feature. It has been present
in every Arm core since the Cortex-A510 and has been implemented in
exactly the same way. Despite that, it is enabled more like an
architectural feature with a top level enable flag. This utilised the
identical implementation.

This duality has left MPMM in an awkward place, where its enablement
should be generic, like an architectural feature, but since it is not,
it should also be core-specific if it ever changes. One choice to do
this has been through the device tree.

This has worked just fine so far, however, recent implementations expose
a weakness in that this is rather slow - the device tree has to be read,
there's a long call stack of functions with many branches, and system
registers are read. In the hot path of PSCI CPU powerdown, this has a
significant and measurable impact. Besides it being a rather large
amount of code that is difficult to understand.

Since MPMM is a microarchitectural feature, its correct placement is in
the reset function. The essence of the current enablement is to write
CPUPPMCR_EL3.MPMM_EN if CPUPPMCR_EL3.MPMMPINCTL == 0. Replacing the C
enablement with an assembly macro in each CPU's reset function achieves
the same effect with just a single close branch and a grand total of 6
instructions (versus the old 2 branches and 32 instructions).

Having done this, the device tree entry becomes redundant. Should a core
that doesn't support MPMM arise, this can cleanly be handled in the
reset function. As such, the whole ENABLE_MPMM_FCONF and platform hooks
mechanisms become obsolete and are removed.

Change-Id: I1d0475b21a1625bb3519f513ba109284f973ffdf
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:50:45 +00:00
Manish V Badarkhe
a8a5d39d6e Merge changes from topic "bk/errata_speed" into integration
* changes:
  refactor(cpus): declare runtime errata correctly
  perf(cpus): make reset errata do fewer branches
  perf(cpus): inline the init_cpu_data_ptr function
  perf(cpus): inline the reset function
  perf(cpus): inline the cpu_get_rev_var call
  perf(cpus): inline cpu_rev_var checks
  refactor(cpus): register DSU errata with the errata framework's wrappers
  refactor(cpus): convert checker functions to standard helpers
  refactor(cpus): convert the Cortex-A65 to use the errata framework
  fix(cpus): declare reset errata correctly
2025-02-24 17:24:53 +01:00
Boyan Karatotev
89dba82dfa perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done
when the core has just come out of reset, which means branch predictors
and caches will be empty so a branch to a workaround function must be
fetched from memory and that round trip is very slow. Then it also runs
with the I-cache off, which means that the loop to iterate over the
workarounds must also be fetched from memory on each iteration.

We can remove both branches. First, we can simply apply every erratum
directly instead of defining a workaround function and jumping to it.
Currently, no errata that need to be applied at both reset and runtime,
with the same workaround function, exist. If the need arose in future,
this should be achievable with a reset + runtime wrapper combo.

Then, we can construct a function that applies each erratum linearly
instead of looping over the list. If this function is part of the reset
function, then the only "far" branches at reset will be for the checker
functions. Importantly, this mitigates the slowdown even when an erratum
is disabled.

The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup
from PSCI calls that end in powerdown. This is roughly back to the
baseline of v2.9, before the errata framework regressed on performance
(or a little better). It is important to note that there are other
slowdowns since then that remain unknown.

Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-24 09:36:11 +00:00
Boyan Karatotev
b07c317f67 perf(cpus): inline the init_cpu_data_ptr function
Similar to the reset function inline, inline this too to not do a costly
branch with no extra cost.

Change-Id: I54cc399e570e9d0f373ae13c7224d32dbdfae1e5
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-24 09:36:11 +00:00
Boyan Karatotev
0d020822ae perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the
call_reset_handler handler. This way we skip the costly branch at no
extra cost as this is the only place where this is called.

While we're at it, drop the options for CPU_NO_RESET_FUNC. The only cpus
that need that are virtual cpus which can spare the tiny bit of
performance lost. The rest are real cores which can save on the check
for zero.

Now is a good time to put the assert for a missing cpu in the
get_cpu_ops_ptr function so that it's a bit better encapsulated.

Change-Id: Ia7c3dcd13b75e5d7c8bafad4698994ea65f42406
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-24 09:36:10 +00:00