Commit graph

5 commits

Author SHA1 Message Date
Boyan Karatotev
89dba82dfa perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done
when the core has just come out of reset, which means branch predictors
and caches will be empty so a branch to a workaround function must be
fetched from memory and that round trip is very slow. Then it also runs
with the I-cache off, which means that the loop to iterate over the
workarounds must also be fetched from memory on each iteration.

We can remove both branches. First, we can simply apply every erratum
directly instead of defining a workaround function and jumping to it.
Currently, no errata that need to be applied at both reset and runtime,
with the same workaround function, exist. If the need arose in future,
this should be achievable with a reset + runtime wrapper combo.

Then, we can construct a function that applies each erratum linearly
instead of looping over the list. If this function is part of the reset
function, then the only "far" branches at reset will be for the checker
functions. Importantly, this mitigates the slowdown even when an erratum
is disabled.

The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup
from PSCI calls that end in powerdown. This is roughly back to the
baseline of v2.9, before the errata framework regressed on performance
(or a little better). It is important to note that there are other
slowdowns since then that remain unknown.

Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-24 09:36:11 +00:00
Boyan Karatotev
b62673c645 refactor(cpus): register DSU errata with the errata framework's wrappers
The existing DSU errata workarounds hijack the errata framework's inner
workings to register with it. However, that is undesirable as any change
to the framework may end up missing these workarounds. So convert the
checks and workarounds to macros and have them included with the
standard wrappers.

The only problem with this is the is_scu_present_in_dsu weak function.
Fortunately, it is only needed for 2 of the errata and only on 3 cores.
So drop it, assuming the default behaviour and have the callers handle
the exception.

Change-Id: Iefa36325804ea093e938f867b9a6f49a6984b8ae
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-20 17:28:17 +00:00
Ryan Everett
3fb52e41fd refactor(cpus): remove cpu specific errata funcs
Errata printing is done directly via generic_errata_report.
This commit removes the unused \_cpu\()_errata_report
functions for all cores, and removes errata_func from cpu_ops.

Change-Id: I04fefbde5f0ff63b1f1cd17c864557a14070d68c
Signed-off-by: Ryan Everett <ryan.everett@arm.com>
2024-07-26 11:19:52 +01:00
Jayanth Dodderi Chidanand
38f762a5ee refactor(cpus): convert the Cortex-A65AE to use the errata framework
This involves replacing:
 * the reset_func with the standard cpu_reset_func_{start,end} to apply
   errata automatically
 * the <cpu>_errata_report with the errata_report_shim to report errata
   automatically
...and for each erratum:
 * the prologue with the workaround_<type>_start to do the checks and
   framework registration automatically
 * the epilogue with the workaround_<type>_end
 * the checker function with the check_erratum_<type> to make it more
   descriptive
 * This core has only errata related to DSU, which is defined under
   another file dsu_helpers.s but gets applied to A65AE as well.
   Hence symbolic names have been added to get them registered under
   errata framework.

It is important to note that the errata workaround sequences remain
unchanged and preserve their git blame.

Testing was conducted by:
 * Building for release with all errata flags enabled and running
   script in change 19136 to compare output of objdump for each errata.

 * Testing via script was not complete, as it directed to verify the
   check and the workaround functions of few erratas manually.

 * Manual comparison of disassembly of converted functions with non-
   converted functions

   aarch64-none-elf-objdump -D <trusted-firmware-a with conversion>/build/../release/bl31/bl31.elf
     vs
   aarch64-none-elf-objdump -D <trusted-firmware-a clean repo>/build/fvp/release/bl31/bl31.elf

 * Manual comparison of disassembly of both both files(bl31.elf)
   ensured, the ported changes were identical and hence verified.

 * Build for release with all errata flags enabled and run default
   tftf tests.

   CROSS_COMPILE=aarch64-none-elf- \
   make PLAT=fvp \
   ARCH=aarch64 \
   DEBUG=0 \
   HW_ASSISTED_COHERENCY=1 \
   USE_COHERENT_MEM=0 \
   CTX_INCLUDE_AARCH32_REGS=0 \
   ERRATA_DSU_936184=1 \
   BL33=/home/jaychi01/tf_a/tf-a-tests/build/fvp/release/tftf.bin \
   fip all -j12

 * Build for debug with all errata enabled and step through ArmDS
   at reset to ensure that if Errata are applicable then the workaround
   functions are entered precisely. In this case, errata is not
   applied as DSU does not has the ACP interface and hence the
   check_errata_dsu_936184 returns 0.

 * In summary, porting work for this CPU, does not adds any new changes
   as we are just creating macros via .equ, henceforth code remains
   identical.

Change-Id: Iab37295319b5ccd69428185b2d22af0ca9c07a5e
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
2023-07-27 09:35:12 +01:00
Imre Kis
78f02ae296 Introducing support for Cortex-A65AE
Change-Id: I1ea2bf088f1e001cdbd377cbfb7c6a2866af0422
Signed-off-by: Imre Kis <imre.kis@arm.com>
2019-10-03 15:38:31 +02:00