When ICH_VMCR_EL2.VBPR1 is written in Secure state (SCR_EL3.NS==0)
and then subsequently read in Non-secure state (SCR_EL3.NS==1), a
wrong value might be returned. The same issue exists in the opposite way.
Adding workaround in EL3 software that performs context save/restore
on a change of Security state to use a value of SCR_EL3.NS when
accessing ICH_VMCR_EL2 that reflects the Security state that owns the
data being saved or restored. For example, EL3 software should set
SCR_EL3.NS to 1 when saving or restoring the value ICH_VMCR_EL2 for
Non-secure(or Realm) state. EL3 software should clear
SCR_EL3.NS to 0 when saving or restoring the value ICH_VMCR_EL2 for
Secure state.
SDEN documentation:
https://developer.arm.com/documentation/SDEN-1775101/latest/
Change-Id: I9f0403601c6346276e925f02eab55908b009d957
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
- errata.h is using incorrect header macro ERRATA_REPORT_H fix this.
- Group errata function utilities.
Change-Id: I6a4a8ec6546adb41e24d8885cb445fa8be830148
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
To allow for generic handling of a wakeup, this hook is no longer
expected to call wfi itself. Update the name everywhere to reflect this
expectation so that future platform implementers don't get misled.
Change-Id: Ic33f0b6da74592ad6778fd802c2f0b85223af614
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Travis' and Gelas' TRMs tell us to disable SME (set PSTATE.{ZA, SM} to
0) when we're attempting to power down. What they don't tell us is that
if this isn't done, the powerdown request will be rejected. On the
CPU_OFF path that's not a problem - we can force SVCR to 0 and be
certain the core will power off.
On the suspend to powerdown path, however, we cannot do this. The TRM
also tells us that the sequence could also be aborted on eg. GIC
interrupts. If this were to happen when we have overwritten SVCR to 0,
upon a return to the caller they would experience a loss of context. We
know that at least Linux may call into PSCI with SVCR != 0. One option
is to save the entire SME context which would be quite expensive just to
work around. Another option is to downgrade the request to a normal
suspend when SME was left on. This option is better as this is expected
to happen rarely enough to ignore the wasted power and we don't want to
burden the generic (correct) path with needless context management.
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Change-Id: I698fa8490ebf51461f6aa8bba84f9827c5c46ad4
The simplistic view of a core's powerdown sequence is that power is
atomically cut upon calling `wfi`. However, it turns out that it has
lots to do - it has to talk to the interconnect to exit coherency, clean
caches, check for RAS errors, etc. These take significant amounts of
time and are certainly not atomic. As such there is a significant window
of opportunity for external events to happen. Many of these steps are
not destructive to context, so theoretically, the core can just "give
up" half way (or roll certain actions back) and carry on running. The
point in this sequence after which roll back is not possible is called
the point of no return.
One of these actions is the checking for RAS errors. It is possible for
one to happen during this lengthy sequence, or at least remain
undiscovered until that point. If the core were to continue powerdown
when that happens, there would be no (easy) way to inform anyone about
it. Rejecting the powerdown and letting software handle the error is the
best way to implement this.
Arm cores since at least the a510 have included this exact feature. So
far it hasn't been deemed necessary to account for it in firmware due to
the low likelihood of this happening. However, events like GIC wakeup
requests are much more probable. Older cores will powerdown and
immediately power back up when this happens. Travis and Gelas include a
feature similar to the RAS case above, called powerdown abandon. The
idea is that this will improve the latency to service the interrupt by
saving on work which the core and software need to do.
So far firmware has relied on the `wfi` being the point of no return and
if it doesn't explicitly detect a pending interrupt quite early on, it
will embark onto a sequence that it expects to end with shutdown. To
accommodate for it not being a point of no return, we must undo all of
the system management we did, just like in the warm boot entrypoint.
To achieve that, the pwr_domain_pwr_down_wfi hook must not be terminal.
Most recent platforms do some platform management and finish on the
standard `wfi`, followed by a panic or an endless loop as this is
expected to not return. To make this generic, any platform that wishes
to support wakeups must instead let common code call
`psci_power_down_wfi()` right after. Besides wakeups, this lets common
code handle powerdown errata better as well.
Then, the CPU_OFF case is simple - PSCI does not allow it to return. So
the best that can be done is to attempt the `wfi` a few times (the
choice of 32 is arbitrary) in the hope that the wakeup is transient. If
it isn't, the only choice is to panic, as the system is likely to be in
a bad state, eg. interrupts weren't routed away. The same applies for
SYSTEM_OFF, SYSTEM_RESET, and SYSTEM_RESET2. There the panic won't
matter as the system is going offline one way or another. The RAS case
will be considered in a separate patch.
Now, the CPU_SUSPEND case is more involved. First, to powerdown it must
wipe its context as it is not written on warm boot. But it cannot be
overwritten in case of a wakeup. To avoid the catch 22, save a copy that
will only be used if powerdown fails. That is about 500 bytes on the
stack so it hopefully doesn't tip anyone over any limits. In future that
can be avoided by having a core manage its own context.
Second, when the core wakes up, it must undo anything it did to prepare
for poweroff, which for the cores we care about, is writing
CPUPWRCTLR_EL1.CORE_PWRDN_EN. The least intrusive for the cpu library
way of doing this is to simply call the power off hook again and have
the hook toggle the bit. If in the future there need to be more complex
sequences, their direction can be advised on the value of this bit.
Third, do the actual "resume". Most of the logic is already there for
the retention suspend, so that only needs a small touch up to apply to
the powerdown case as well. The missing bit is the powerdown specific
state management. Luckily, the warmboot entrypoint does exactly that
already too, so steal that and we're done.
All of this is hidden behind a FEAT_PABANDON flag since it has a large
memory and runtime cost that we don't want to burden non pabandon cores
with.
Finally, do some function renaming to better reflect their purpose and
make names a little bit more consistent.
Change-Id: I2405b59300c2e24ce02e266f91b7c51474c1145f
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
This function doesn't return and its callers that don't return either
rely on this. Drop the dead attribute and add a panic() after it to make
this expectation explicit. Calling `wfi` in the powerdown sequence is
terminal so even if the function was made to return, there would be no
functional change.
This is useful for a following patch that makes psci_power_down_wfi()
return.
Change-Id: I62ca1ee058b1eaeb046966c795081e01bf45a2eb
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
The workarounds introduced in the three patches starting at
888eafa00b assumed that any powerdown
request will be (forced to be) terminal. This assumption can no longer
be the case for new CPUs so there is a need to revisit these older
cores. Since we may wake up, we now need to respect the workaround's
recommendation that the workaround needs to be reverted on wakeup. So do
exactly that.
Introduce a new helper to toggle bits in assembly. This allows us to
call the workaround twice, with the first call setting the workaround
and second undoing it. This is also used for gelas' an travis' powerdown
routines. This is so the same function can be called again
Also fix the condition in the cpu helper macro as it was subtly wrong
Change-Id: Iff9e5251dc9d8670d085d88c070f78991955e7c3
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Introduce a new helper to toggle bits in assembly. This allows us to
call the workaround twice, with the first call setting the workaround
and second undoing it. This allows the (errata) workaround functions to
be used to both apply and undo the mitigation.
This is applied to functions where the undo part will be required in
follow-up patches.
Change-Id: I058bad58f5949b2d5fe058101410e33b6be1b8ba
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
This patch implements SMCCC_ARCH_WORKAROUND_4 and
allows discovery through SMCCC_ARCH_FEATURES.
This mechanism is enabled if CVE_2024_7881 [1] is enabled
by the platform. If CVE_2024_7881 mitigation
is implemented, the discovery call returns 0,
if not -1 (SMC_ARCH_CALL_NOT_SUPPORTED).
For more information about SMCCC_ARCH_WORKAROUND_4 [2], please
refer to the SMCCC Specification reference provided below.
[1]: https://developer.arm.com/Arm%20Security%20Center/Arm%20CPU%20Vulnerability%20CVE-2024-7881
[2]: https://developer.arm.com/documentation/den0028/latest
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: I1b1ffaa1f806f07472fd79d5525f81764d99bc79
This patch adds new cpu ops function extra4 and a new macro
for CVE-2024-7881 [1]. This new macro declare_cpu_ops_wa_4 allows
support for new CVE check function.
[1]: https://developer.arm.com/Arm%20Security%20Center/Arm%20CPU%20Vulnerability%20CVE-2024-7881
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: I417389f040c6ead7f96f9b720d29061833f43d37
EXTLLC bit in CPUECTLR_EL1(for non-gelas cpus) and in CPUECTLR2_EL1
register for gelas cpu enables external Last-level cache in the system,
External LLC is present on TC4 systems in MCN but it is not enabled in
CPU registers so enable it.
On TC4, Gelas vs Non-Gelas CPUs have different bits to enable EXTLLC
so take care of that as well.
Change-Id: Ic6a74b4af110a3c34d19131676e51901ea2bf6e3
Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Icen.Zeyada <Icen.Zeyada2@arm.com>
On some platforms plat_my_core_pos is a nontrivial function that takes a
bit of time and the compiler really doesn't like to inline. In the PSCI
library, at least, we have no need to keep repeatedly calling it and we
can instead pass it around as an argument. This saves on a lot of
redundant calls, speeding the library up a bit.
Change-Id: I137f69bea80d7cac90d7a20ffe98e1ba8d77246f
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
The target_pwrlvl field in the psci cpu data struct only stores the
highest power domain that a CPU_SUSPEND call affected, and is used to
resume those same domains on warm reset. If the cpu is otherwise OFF
(never turned on or CPU_OFF), then this needs to be the highest power
level because we don't know the highest level that will be off.
So skip the invalidation and always keep the field to the maximum value.
During suspend the field will be lowered to the appropriate value and
then put back after wakeup.
Also, do that in the suspend to standby path as well as it will have
been written before the sleep and it might end up incorrect.
Change-Id: I614272ec387e1d83023c94700780a0f538a9a6b6
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Remove XFERLIST_TB_FW_CONFIG as the corresponding patch to add it to the
specification [1] has been abandoned and there are no plans for it to be
merged, with the information it contains being moved to a transfer list
instead.
[1] https://github.com/FirmwareHandoff/firmware_handoff/pull/37
Change-Id: If4a21d56b87bafc2f4894beefd73ac51e36e6571
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
Add a function to check whether a transfer list has been initialized
at the input address. If not, initialize a transfer list at the
specified location with the given size. This is to help ensure that we
don't accidently overwrite a transfer list that's been passed from a
previous stage.
Change-Id: Ic5906626df09d3801435488e258490765e8f81eb
Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
We never directly reference the event handlers so they look like fair
game to be garbage collected when building with LTO.
Tell the compiler that we definitely need them and to leave them alone.
Change-Id: Iac672ce85e20328d25acbc3f5e544ad157eebf48
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
This patch fixes a bug which was introduced in commit
3065513 related to improper saving of EL1 context in the
context management library code when using 128-bit
system registers.
Bug explanation:
The function el1_sysregs_context_save still used the normal
macros that read all the system registers related to the EL1
context, which then involved casting them to uint64_t and
eventually writing them to a memory structure. This means that
the context management library was saving EL1-related SYSREG128
registers with the upper 64 bits zeroed out.
Alternative macros had previously been introduced for the EL2
context in the aforementioned commit, but not for EL1.
Some refactoring has also been done as part of this patch:
- Re-added "common" back to write_el2_ctx_common_sysreg128
- Added dummy SYSREG128 macros for cases when some features
are disabled
- Removed some newlines
Change-Id: I15aa2190794ac099a493e5f430220b1c81e1b558
Signed-off-by: Igor Podgainõi <igor.podgainoi@arm.com>
In the chapter about FEAT_SPE (D16.4 specifically) it is stated that
"Sampling is always disabled at EL3". That means that disabling sampling
(writing PMBLIMITR_EL1.E to 0) is redundant and can be removed. The only
reason we save/restore SPE context is because of that disable, so those
can be removed too.
There's the issue of draining the profiling buffer though. No new
samples will have been generated since entering EL3. However, old
samples might still be in-flight. Unless synchronised by a psb csync,
those might be affected by our extensive context mutation. Adding a psb
in prepare_el3_entry should cater for that. Note that prior to the
introduction of root context this was not a problem as context remained
unchanged and the hooks took care of the rest.
Then, the only time we care about the buffer actually making it to
memory is when we exit coherency. On HW_ASSISTED_COHERENCY systems we
don't have to do anything, it should be handled for us. Systems without
it need a dsb to wait for them to complete. There should be one already
in each cpu's powerdown hook which should work.
While on the topic of barriers, the esb barrier is no longer used.
Remove it.
Change-Id: I9736fc7d109702c63e7d403dc9e2a4272828afb2
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
This patch enables support of FEAT_FPMR by enabling access
to FPMR register. It achieves it by setting the EnFPM bit of
SCR_EL3. This feature is currently enabled for NS world only.
Reference:
https://developer.arm.com/documentation/109697/2024_09/
Feature-descriptions/The-Armv9-5-architecture-extension?lang=en
Change-Id: I580c409b9b22f8ead0737502280fb9093a3d5dd2
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
According to Platform Initialization (PI) Specification [1] and
discussion on edk2 mailing list [2],
StandaloneMm shouldn't create Hob but it should be passed from TF-A.
IOW, TF-A should pass boot information via HOB list to initialise
StandaloneMm properly.
And this HOB lists could be delivered via
- SPM_MM: Transfer List according to the firmware handoff spec[3]
- FF-A v1.1 >= : FF-A boot protocol.
This patch introduces a TF-A HOB creation library and
some of definitions which StandaloneMm requires to boot.
Link: https://uefi.org/sites/default/files/resources/PI_Spec_1_6.pdf [1]
Link: https://edk2.groups.io/g/devel/topic/103675962#114283 [2]
Link: https://github.com/FirmwareHandoff/firmware_handoff [3]
Signed-off-by: Levi Yun <yeoreum.yun@arm.com>
Change-Id: I5e0838adce487110206998a8b79bc3adca922cec
Add basic CPU library code to support the Alto CPU.
Change-Id: I45958be99c4a350a32a9e511d3705fb568b97236
Signed-off-by: Igor Podgainõi <igor.podgainoi@arm.com>
Armv8.6 introduced the FEAT_LS64 extension, which provides a 64 *byte*
store instruction. A related instruction is ST64BV0, which will replace
the lowest 32 bits of the data with a value taken from the ACCDATA_EL1
system register (so that EL0 cannot alter them).
Using that ST64BV0 instruction and accessing the ACCDATA_EL1 system
register is guarded by two SCR_EL3 bits, which we should set to avoid a
trap into EL3, when lower ELs use one of those.
Add the required bits and pieces to make this feature usable:
- Add the ENABLE_FEAT_LS64_ACCDATA build option (defaulting to 0).
- Add the CPUID and SCR_EL3 bit definitions associated with FEAT_LS64.
- Add a feature check to check for the existing four variants of the
LS64 feature and detect future extensions.
- Add code to save and restore the ACCDATA_EL1 register on
secure/non-secure context switches.
- Enable the feature with runtime detection for FVP and Arm FPGA.
Please note that the *basic* FEAT_LS64 feature does not feature any trap
bits, it's only the addition of the ACCDATA_EL1 system register that
adds these traps and the SCR_EL3 bits.
Change-Id: Ie3e2ca2d9c4fbbd45c0cc6089accbb825579138a
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
This patch disables trapping to EL3 when the FEAT_D128
specific registers are accessed by setting the SCR_EL3.D128En bit.
If FEAT_D128 is implemented, then FEAT_SYSREG128 is implemented.
With FEAT_SYSREG128 certain system registers are treated as 128-bit,
so we should be context saving and restoring 128-bits instead of 64-bit
when FEAT_D128 is enabled.
FEAT_SYSREG128 adds support for MRRS and MSRR instruction which
helps us to read write to 128-bit system register.
Refer to Arm Architecture Manual for further details.
Change the FVP platform to default to handling this as a dynamic option
so the right decision can be made by the code at runtime.
Change-Id: I1a53db5eac29e56c8fbdcd4961ede3abfcb2411a
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
Add the basic CPU library code to support Cortex-A720AE.
The overall library code is adapted based on Cortex-A720 code.
Signed-off-by: David Hu <david.hu2@arm.com>
Signed-off-by: Ahmed Azeem <ahmed.azeem@arm.com>
Change-Id: I3d64dc5a3098cc823e656a5ad3ea05cd71598dc6
Add basic CPU library code to support the Arcadia CPU.
Change-Id: Iecb0634dc6dcb34e9b5fda4902335530d237cc43
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
The errata framework has a helper to invoke workarounds, complete with a
cpu rev_var check. We can use that directly instead of the
apply_cpu_pwr_dwn_errata to save on some code, as well as an extra
branch. It's also more readable.
Also, apply_erratum invocation in cpu files don't need to check the
rev_var as that was already done by the cpu_ops dispatcher for us to end
up in the file.
Finally, X2 erratum 2768515 only applies in the powerdown sequence, i.e.
at runtime. It doesn't achieve anything at reset, so we can label it
accordingly.
Change-Id: I02f9dd7d0619feb54c870938ea186be5e3a6ca7b
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Build breaks when EL3_EXCEPTION_HANDLING option is enabled. The
CPU_DATA_SIZE macro does not consider the size required to save the
ehf_data field of cpu_data structure.
include/lib/el3_runtime/cpu_data.h:163:17: error: size of array
'assert_cpu_data_size_mismatch' is negative
assert_cpu_data_size_mismatch);
This patch adds support to consider ehf_data field to calculate the
CPU_DATA_SIZE macro. Also adds relevant checks and asserts if the
ehf_data field is not considered.
Signed-off-by: Omkar Anand Kulkarni <omkar.kulkarni@arm.com>
Change-Id: I3c11b2982f4a612ce28e46848b5c5035a8f8efc2
Arm v8.9 introduces FEAT_SCTLR2, adding SCTLR2_ELx registers.
Support this, context switching the registers and disabling
traps so lower ELs can access the new registers.
Change the FVP platform to default to handling this as a dynamic option
so the right decision can be made by the code at runtime.
Change-Id: I0c4cba86917b6b065a7e8dd6af7daf64ee18dcda
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
Arm v8.9 introduces FEAT_THE, adding Translation Hardening Extension
Read-Check-Write mask registers, RCWMASK_EL1 and RCWSMASK_EL1.
Support this, context switching the registers and disabling
traps so lower ELs can access the new registers.
Change the FVP platform to default to handling this as a dynamic option
so the right decision can be made by the code at runtime.
Change-Id: I8775787f523639b39faf61d046ef482f73b2a562
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
Apply the mitigation only for the revision and variant
mentioned in the SDEN.
SDEN Documentation:
https://developer.arm.com/documentation/SDEN859515/latest
Change-Id: Ifda1f4cb32bdec9a9af29397ddc03bf22a7a87fc
Signed-off-by: Sona Mathew <sonarebecca.mathew@arm.com>
Cortex-X4 erratum 3076789 is a Cat B erratum that is present
in revisions r0p0, r0p1 and is fixed in r0p2.
The workaround is to set chicken bits CPUACTLR3_EL1[14:13]=0b11
and CPUACTLR_EL1[52] = 1.
Expected performance degradation is < 0.5%, but isolated
benchmark components might see higher impact.
SDEN documentation:
https://developer.arm.com/documentation/SDEN2432808/latest
Change-Id: Ib100bfab91efdb6330fdcdac127bcc5732d59196
Signed-off-by: Ryan Everett <ryan.everett@arm.com>
Cortex-X4 erratum 2897503 is a Cat B erratum that applies
to all revisions <= r0p1 and is fixed in r0p2.
The workaround is to set CPUACTLR4_EL1[8] to 1.
SDEN documentation:
https://developer.arm.com/documentation/SDEN-2432808/latest
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: I3178a890b6f1307b310e817af75f8fdfb8668cc9
With introduction of FEAT_STATE_CHECK_ASYMMETRIC, the asymmetry of cores
can be handled. FEAT_TCR2 is one of the features which can be
asymmetric across cores and the respective support is added here.
Adding a function to handle this asymmetry by re-visting the
feature presence on running core.
There are two possible cases:
- If the primary core has the feature and secondary does not have it
then the feature is disabled.
- If the primary does not have the feature and secondary has it then,
the feature need to be enabled in secondary cores.
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
Change-Id: I73a70891d52268ddfa4effe40edf04115f5821ca
* Currently, EL1 context is included in cpu_context_t by default
for all the build configurations.
As part of the cpu context structure, we hold a copy of EL1, EL2
system registers, per world per PE. This context structure is
enormous and will continue to grow bigger with the addition of
new features incorporating new registers.
* Ideally, EL3 should save and restore the system registers at its next
lower exception level, which is EL2 in majority of the configurations.
* This patch aims at optimising the memory allocation in cases, when
the members from the context structure are unused. So el1 system
register context must be omitted when lower EL is always x-EL2.
* "CTX_INCLUDE_EL2_REGS" is the internal build flag which gets set,
when SPD=spmd and SPMD_SPM_AT_SEL2=1 or ENABLE_RME=1.
It indicates, the system registers at EL2 are context switched for
the respective build configuration. Here, there is no need to save
and restore EL1 system registers, while x-EL2 is enabled.
Henceforth, this patch addresses this issue, by taking out the EL1
context at all possible places, while EL2 (CTX_INCLUDE_EL2_REGS) is
enabled, there by saving memory.
Change-Id: Ifddc497d3c810e22a15b1c227a731bcc133c2f4a
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
SCTLR_EL1 and TCR_EL1 regs are included either as part of errata
"ERRATA_SPECULATIVE_AT" or under el1_sysregs_t context structure.
The code to write and read into these context entries, looks
repetitive and is invoked at most places.
This section is refactored to bring them under a static procedure,
keeping the code neat and easier to maintain.
Change-Id: Ib0d8c51bee09e1600c5baaa7f9745083dca9fee1
Signed-off-by: Jayanth Dodderi Chidanand <jayanthdodderi.chidanand@arm.com>
* changes:
feat(fvp): allow SIMD context to be put in TZC DRAM
docs(simd): introduce CTX_INCLUDE_SVE_REGS build flag
feat(fvp): add Cactus partition manifest for EL3 SPMC
chore(simd): remove unused macros and utilities for FP
feat(el3-spmc): support simd context management upon world switch
feat(trusty): switch to simd_ctx_save/restore apis
feat(pncd): switch to simd_ctx_save/restore apis
feat(spm-mm): switch to simd_ctx_save/restore APIs
feat(simd): add rules to rationalize simd ctxt mgmt
feat(simd): introduce simd context helper APIs
feat(simd): add routines to save, restore sve state
feat(simd): add sve state to simd ctxt struct
feat(simd): add data struct for simd ctxt management