Errata application is painful for performance. For a start, it's done
when the core has just come out of reset, which means branch predictors
and caches will be empty so a branch to a workaround function must be
fetched from memory and that round trip is very slow. Then it also runs
with the I-cache off, which means that the loop to iterate over the
workarounds must also be fetched from memory on each iteration.
We can remove both branches. First, we can simply apply every erratum
directly instead of defining a workaround function and jumping to it.
Currently, no errata that need to be applied at both reset and runtime,
with the same workaround function, exist. If the need arose in future,
this should be achievable with a reset + runtime wrapper combo.
Then, we can construct a function that applies each erratum linearly
instead of looping over the list. If this function is part of the reset
function, then the only "far" branches at reset will be for the checker
functions. Importantly, this mitigates the slowdown even when an erratum
is disabled.
The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup
from PSCI calls that end in powerdown. This is roughly back to the
baseline of v2.9, before the errata framework regressed on performance
(or a little better). It is important to note that there are other
slowdowns since then that remain unknown.
Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Similar to the reset function inline, inline this too to not do a costly
branch with no extra cost.
Change-Id: I54cc399e570e9d0f373ae13c7224d32dbdfae1e5
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the
call_reset_handler handler. This way we skip the costly branch at no
extra cost as this is the only place where this is called.
While we're at it, drop the options for CPU_NO_RESET_FUNC. The only cpus
that need that are virtual cpus which can spare the tiny bit of
performance lost. The rest are real cores which can save on the check
for zero.
Now is a good time to put the assert for a missing cpu in the
get_cpu_ops_ptr function so that it's a bit better encapsulated.
Change-Id: Ia7c3dcd13b75e5d7c8bafad4698994ea65f42406
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Similar to the cpu_rev_var_xy functions, branching far away so early in
the reset sequence incurs significant slowdowns. Inline the function.
Change-Id: Ifc349015902cd803e11a1946208141bfe7606b89
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
We strive to apply errata as close to reset as possible with as few
things enabled as possible. Importantly, the I-cache will not be
enabled. This means that repeated branches to these tiny functions must
be re-fetched all the way from memory each time which has glacial speed.
Cores are allowed to fetch things ahead of time though as long as
execution is fairly linear. So we can trade a little bit of space (3 to
7 instructions per erratum) to keep things linear and not have to go to
memory.
While we're at it, optimise the the cpu_rev_var_{ls, hs, range}
functions to take up less space. Dropping the moves allows for a bit of
assembly magic that produces the same result in 2 and 3 instructions
respectively.
Change-Id: I51608352f23b2244ea7a99e76c10892d257f12bf
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
The existing DSU errata workarounds hijack the errata framework's inner
workings to register with it. However, that is undesirable as any change
to the framework may end up missing these workarounds. So convert the
checks and workarounds to macros and have them included with the
standard wrappers.
The only problem with this is the is_scu_present_in_dsu weak function.
Fortunately, it is only needed for 2 of the errata and only on 3 cores.
So drop it, assuming the default behaviour and have the callers handle
the exception.
Change-Id: Iefa36325804ea093e938f867b9a6f49a6984b8ae
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
The library check_erratum_ls already incorporates the check. The return
of ERRATA_MISSING is handled in the errata_report.c functions.
Change-Id: Ic1dff2bc5235195f7cfce1709cd42467f88b3e4c
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Result was verified by manually stepping through the reset function with
a debugger.
Change-Id: I91cd6111ccf95d6b7ee2364ac2126cb98ee4bb15
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
The errata in this patch are declared as runtime, but are never called
explicitly. This means that they are never called! Convert them to reset
errata so that they are called at reset. Their SDENs entries have been
checked and confirm that this is how they should be implemented.
Also, drop the the MIDR check on the a57 erratum as it's not needed -
the erratum is already called from a cpu-specific function.
Change-Id: I22c3043ab454ce94b3c122c856e5804bc2ebb18b
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Commit@af5ae9a73f67dc8c9ed493846d031b052b0f22a0
Adding a Cortex-A720-AE erratum 3699562 has a typo in CPU name
for the errata, it is for Cortex-A720-AE but had incorrectly
mentioned as Cortex-A715_AE.
Change-Id: I2332a3fcaf56a7aaab5a04e3d40428cc746d2d46
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
Enable FEAT_FGT2 and FEAT_Debugv8p9 in Realm state as well.
Change-Id: Ib9cdde3af328ffdd8718b1ba404265757f2e542b
Signed-off-by: Sona Mathew <sonarebecca.mathew@arm.com>
* changes:
feat(s32g274a): enable sdhc clock
feat(nxp-clk): add clock modules for uSDHC
feat(nxp-clk): get MC_CGM divider's parent
feat(nxp-clk): get MC_CGM divider's rate
feat(nxp-clk): set MC_CGM divider's rate
feat(nxp-clk): enable MC_CGM dividers
feat(nxp-clk): get parent for the fixed dividers
feat(nxp-clk): set the rate for partition objects
feat(nxp-clk): add clock objects for CGM dividers
feat(nxp-clk): add base address for PERIPH_DFS
The uSDHC module clock must be enabled to use the SD/eMMC storage from
where the BL2 is expected to load images for the next boot stages.
Change-Id: Ib1cc7d5dda7a4283a29716f5b3d776048bd5b7ba
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
One of the uSDHC module's clock lines is attached to the CGM_MUX 14
divider, which is connected to PERIPH_DFS3. The other one is attached
to XBAR_DIV3.
Change-Id: I23f468a3e5f7daa832c0841b55211a048284a7f0
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
The parent of the MC_CGM divider will always be the MC_CGM mux
identified based on s32cc_cgm_div.parent.
Change-Id: Ie13b16e0ee56f35d61374efbe158f166b99960b7
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
The MC_CGM divider's frequency is obtained based on the state of the
settings found in its registers. If the divider is disabled, the
intended rate (s32cc_cgm_div.freq) will be returned.
Change-Id: I41698990952b530021de26eb51f74aca50176575
Co-developed-by: Florin Buica <florin.buica@nxp.com>
Signed-off-by: Florin Buica <florin.buica@nxp.com>
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
The MC_CGM divider's frequency is saved as part of the object metadata.
No checks are performed on the requested frequency. It will be validated
during the enablement process.
Change-Id: Ide9c8c64be16a66b66f129735cebfc4d1f1772c5
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
Add the enablement mechanism for the MC_CGM dividers. The division
factor is established by dividing the parent's rate by the rate of the
divider's output.
Change-Id: Iadb84f4f47531a67b0b1509b94e1f2b962631a77
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
Fixed dividers contribute to the Linflex and QSPI clocks.
Change-Id: Idb4e6fe883e117b2bb9260b6eeb6e15d75ce887e
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
Only the partition block link can set the frequency, while the other two
should not be able to because none of them participate in the clock
generation. In the first case, the request will be propagated to the
parent object of the partition link.
Change-Id: Ic237972008eb51c62e92f03f657698a8a1ca4b0e
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
The CGM dividers are controllable dividers attached to a CGM mux. Its
divison factor can be controlled through the MC_CGM's registers.
Change-Id: Id2786a46c5a1d389ca32a4839c7158949aec3b0a
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
The PERIPH_DFS module is used to clock the SD and QSPI modules.
Change-Id: I440fd806d71acab641f0003a7f2a5ce720b469c6
Signed-off-by: Ghennadi Procopciuc <ghennadi.procopciuc@nxp.com>
* changes:
fix(versal2): pass tl address to bl32
fix(xilinx): runtime console to handle dt failure
refactor(xilinx): refactor console to support transfer list
chore(xilinx): propagate error code
feat(versal2): retrieve DT address from transfer list
chore(versal2): move xfer-list file paths
fix(versal2): update transfer list as optional
Pass transfer list address to BL32 as an argument during boot time.
Change-Id: Ic63649b9c41cfae2365ec5911dcab63a7dd005ff
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
If the Device Tree is missing or parsing fails in the runtime
console, the console still gets registered with zeroed DT values,
leading to a panic due to the absence of a console type.
Added fallback option and check for zero base address.
Change-Id: I5f5e0222685ba015ab7db2ecbd46d906f5ab9116
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
Refactor console to support DTB console in case of transfer list.
Simplify logic where SOC specific macros are moved to platform headers
or makefile where XLNX_DT_CFG macro describe if system is DT driven or not.
Change-Id: Id45c03a950b62e83e91a50e0485eacdb233ba745
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
Propagate error instead of making own error code.
Change-Id: I9300ad342e98ca0e730b091510d9d62747b81a5f
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
On versal2 platform, unlike current static DT address passing
mechanism, DT address is retrieved from transfer list dynamically.
Change-Id: I44b9a0753809652f26bc1b7e061f5364229ba352
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
Only Versal Gen 2 platform supports transfer list.
Move transfer list files to versal2 common path.
Change-Id: I2795270a77e2af5e012c82c7b5916fa1f90f0497
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
Updated transfer list feature as optional and user should explicitly
provide build time argument to enable transfer list.
In TL optional case TL address range is utilized as default dtb
address range. Updated default DT address to transfer list address.
Change-Id: Ieeaacb3e6fda4ad1da9330708e19d776bffb06c1
Signed-off-by: Maheedhar Bollapalli <maheedharsai.bollapalli@amd.com>
Clang's nullability completeness checks were triggered after adding
the _Nonnull specifier to one function. Removing it prevents Clang
from flagging atexit() for missing a nullability specifier.
include/lib/libc/stdlib.h:25:25: error: pointer is missing a
nullability type specifier (_Nonnull, _Nullable, or _Null_unspecified)
[-Werror,-Wnullability-completeness]
25 | extern int atexit(void (*func)(void));
This change ensures compliance with the C standard while preventing
unexpected build errors.
Change-Id: I2f881c55b36b692d22c3db22149c6402c32e8c3e
Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
* changes:
refactor(rse)!: remove rse_comms_init
refactor(arm): switch to rse_mbx_init
refactor(rse): put MHU code in a dedicated file
refactor(tc): add plat_rse_comms_init
refactor(arm)!: rename PLAT_MHU_VERSION flag
When RESET_TO_BL31 was enabled, CNTFRQ_EL0 was left uninitialized,
leading to incorrect system counter frequency settings. This
impacted timer-dependent components, such as SMMUv3, causing
initialization failures and unpredictable behavior.
To fix this, CNTFRQ_EL0 is now explicitly set using
plat_get_syscnt_freq2(), ensuring the correct system timer
frequency and proper initialization of dependent components.
Signed-off-by: Lokesh B V <Lokesh.BV@Arm.com>
Change-Id: I808b17d25c87c4dce1bc2c8171a800b69b5c2908
We no longer maintain the device equipped with ES chip. Remove SPM
support for ES ship.
Signed-off-by: Wenzhen Yu <wenzhen.yu@mediatek.com>
Change-Id: I5b2d035ec384a9861239f33dbe6df54c17f1285c
When SPMD_SPM_AT_SEL2 is enabled, saving and restoring the SIMD context
is not needed because the SPMC handles it. The function
spmd_secure_interrupt_handler incorrectly restores the SWD SIMD context
before entering the SPMC without saving the NWD SIMD context, leading to
its loss. Furthermore, the SWD SIMD context is saved after returning
from the SPMC which is unnecessary.
This commit prevents the restoration of the SWD SIMD context before SPMC
entry and the saving of the SWD SIMD context after returning from the
SPMC when SPMD_SPM_AT_SEL2 is enabled. This ensures the preservation of
the NWD SIMD context.
Change-Id: I16a3e698e61da7019b3a670475e542d1690a5dd9
Signed-off-by: Rakshit Goyal <rakshit.goyal@arm.com>
Cortex-X925 erratum 2963999 that applies to r0p0 and is fixed in
r0p1.
In EL3, reads of MPIDR_EL1 and MIDR_EL1 might incorrectly virtualize
which register to return when reading the value of
MPIDR_EL1/VMPIDR_EL2 and MIDR_EL1/VPIDR_EL2, respectively.
The workaround is to do an ISB prior to an MRS read to either
MPIDR_EL1 and MIDR_EL1.
SDEN documentation:
https://developer.arm.com/documentation/109180/latest/
Change-Id: I447fd359ea32e1d274e1245886e1de57d14f082c
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
Neoverse V3 erratum 2970647 that applies to r0p0 and is fixed in r0p1.
In EL3, reads of MPIDR_EL1 and MIDR_EL1 might incorrectly virtualize
which register to return when reading the value of
MPIDR_EL1/VMPIDR_EL2 and MIDR_EL1/VPIDR_EL2, respectively.
The workaround is to do an ISB prior to an MRS read to either
MPIDR_EL1 and MIDR_EL1.
SDEN documentation:
https://developer.arm.com/documentation/SDEN-2891958/latest/
Change-Id: Iedf7d799451f0be58a5da1f93f7f5b6940f2bb35
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>