docs: document the errata framework

Also add a recommended Procedure Call Standard (PCS) to use inside CPU files and split everything into sections to make it easier to follow. Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com> Change-Id: Idd876d8e598b5dfe1193aa3e7375c52f6edf5671
2025-04-16 17:44:19 +00:00 · 2023-02-07 15:46:50 +00:00 · 2023-02-07 15:46:50 +00:00 · 6a0e8e80fb
commit 6a0e8e80fb
parent c161772f40
2 changed files with 114 additions and 55 deletions
--- a/docs/design/cpu-specific-build-macros.rst
+++ b/docs/design/cpu-specific-build-macros.rst
@ -53,7 +53,7 @@ errata notice document. The format of the define used to enable/disable the
 errata workaround is ``ERRATA_<Processor name>_<ID>``, where the ``Processor name``
 is for example ``A57`` for the ``Cortex_A57`` CPU.

-Refer to :ref:`firmware_design_cpu_errata_reporting` for information on how to
+Refer to :ref:`firmware_design_cpu_errata_implementation` for information on how to
 write errata workaround functions.

 All workarounds are disabled by default. The platform is responsible for
--- a/docs/design/firmware-design.rst
+++ b/docs/design/firmware-design.rst
@ -1245,8 +1245,9 @@ the BL1 and BL31 images. It in turn calls the platform and CPU specific reset
 handling functions.

 Details for implementing a CPU specific reset handler can be found in
-Section 8. Details for implementing a platform specific reset handler can be
-found in the :ref:`Porting Guide` (see the ``plat_reset_handler()`` function).
+:ref:`firmware_design_cpu_specific_reset_handling`. Details for implementing a
+platform specific reset handler can be found in the :ref:`Porting Guide` (see
+the``plat_reset_handler()`` function).

 When adding functionality to a reset handler, keep in mind that if a different
 reset handling behavior is required between the first and the subsequent
@ -1340,12 +1341,38 @@ configuration, these CPU specific files must be included in the build by
 the platform makefile. The generic CPU specific operations framework code exists
 in ``lib/cpus/aarch64/cpu_helpers.S``.

+CPU PCS
+~~~~~~~
+
+All assembly functions in CPU files are asked to follow a modified version of
+the Procedure Call Standard (PCS) in their internals. This is done to ensure
+calling these functions from outside the file doesn't unexpectedly corrupt
+registers in the very early environment and to help the internals to be easier
+to understand. Please see the :ref:`firmware_design_cpu_errata_implementation`
+for any function specific restrictions.
+
+--------------+---------------------------------+
+|   register   | use                             |
+==============+=================================+
+|   x0 - x15   | scratch                         |
+--------------+---------------------------------+
+|   x16, x17   | do not use (used by the linker) |
+--------------+---------------------------------+
+|     x18      | do not use (platform register)  |
+--------------+---------------------------------+
+|   x19 - x28  | callee saved                    |
+--------------+---------------------------------+
+|   x29, x30   | FP, LR                          |
+--------------+---------------------------------+
+
+.. _firmware_design_cpu_specific_reset_handling:
+
 CPU specific Reset Handling
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

 After a reset, the state of the CPU when it calls generic reset handler is:
-MMU turned off, both instruction and data caches turned off and not part
-of any coherency domain.
+MMU turned off, both instruction and data caches turned off, not part
+of any coherency domain and no stack.

 The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow
 the platform to perform any system initialization required and any system
@ -1355,10 +1382,9 @@ array and returns it. Note that only the part number and implementer fields
 in midr are used to find the matching ``cpu_ops`` entry. The ``reset_func()`` in
 the returned ``cpu_ops`` is then invoked which executes the required reset
 handling for that CPU and also any errata workarounds enabled by the platform.
-This function must preserve the values of general purpose registers x20 to x29.

-Refer to Section "Guidelines for Reset Handlers" for general guidelines
-regarding placement of code in a reset handler.
+It should be defined using the ``cpu_reset_func_{start,end}`` macros and its
+body may only clobber x0 to x14 with x14 being the cpu_rev parameter.

 CPU specific power down sequence
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -1391,10 +1417,10 @@ reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching
 be reported and a pointer to the ASCII list of register names in a format
 expected by the crash reporting framework.

-.. _firmware_design_cpu_errata_reporting:
+.. _firmware_design_cpu_errata_implementation:

-CPU errata status reporting
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+CPU errata implementation
+~~~~~~~~~~~~~~~~~~~~~~~~~

 Errata workarounds for CPUs supported in TF-A are applied during both cold and
 warm boots, shortly after reset. Individual Errata workarounds are enabled as
@ -1402,59 +1428,92 @@ build options. Some errata workarounds have potential run-time implications;
 therefore some are enabled by default, others not. Platform ports shall
 override build options to enable or disable errata as appropriate. The CPU
 drivers take care of applying errata workarounds that are enabled and applicable
-to a given CPU. Refer to :ref:`arm_cpu_macros_errata_workarounds` for more
-information.
+to a given CPU.

-Functions in CPU drivers that apply errata workaround must follow the
-conventions listed below.
+Each erratum has a build flag in ``lib/cpus/cpu-ops.mk`` of the form:
+``ERRATA_<cpu_num>_<erratum_id>``. It also has a short description in
+:ref:`arm_cpu_macros_errata_workarounds` on when it should apply.

-The errata workaround must be authored as two separate functions:
+Errata framework
+^^^^^^^^^^^^^^^^

-  One that checks for errata. This function must determine whether that errata
-   applies to the current CPU. Typically this involves matching the current
-   CPUs revision and variant against a value that's known to be affected by the
-   errata. If the function determines that the errata applies to this CPU, it
-   must return ``ERRATA_APPLIES``; otherwise, it must return
-   ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and
-   ``cpu_rev_var_ls`` functions may come in handy for this purpose.
+The errata framework is a convention and a small library to allow errata to be
+automatically discovered. It enables compliant errata to be automatically
+applied and reported at runtime (either by status reporting or the errata ABI).

-For an errata identified as ``E``, the check function must be named
-``check_errata_E``.
+To write a compliant mitigation for erratum number ``erratum_id`` on a cpu that
+declared itself (with ``declare_cpu_ops``) as ``cpu_name`` one needs 3 things:

-This function will be invoked at different times, both from assembly and from
-C run time. Therefore it must follow AAPCS, and must not use stack.
+#. A CPU revision checker function: ``check_erratum_<cpu_name>_<erratum_id>``

-  Another one that applies the errata workaround. This function would call the
-   check function described above, and applies errata workaround if required.
+   It should check whether this erratum applies on this revision of this CPU.
+   It will be called with the CPU revision as its first parameter (x0) and
+   should return one of ``ERRATA_APPLIES`` or ``ERRATA_NOT_APPLIES``.

-CPU drivers that apply errata workaround can optionally implement an assembly
-function that report the status of errata workarounds pertaining to that CPU.
-For a driver that registers the CPU, for example, ``cpux`` via ``declare_cpu_ops``
-macro, the errata reporting function, if it exists, must be named
-``cpux_errata_report``. This function will always be called with MMU enabled; it
-must follow AAPCS and may use stack.
+   It may only clobber x0 to x4. The rest should be treated as callee-saved.
+
+#. A workaround function: ``erratum_<cpu_name>_<erratum_id>_wa``
+
+   It should obtain the cpu revision (with ``cpu_get_rev_var``), call its
+   revision checker, and perform the mitigation, should the erratum apply.
+
+   It may only clobber x0 to x8. The rest should be treated as callee-saved.
+
+#. Register itself to the framework
+
+   Do this with
+   ``add_erratum_entry <cpu_name>, ERRATUM(<erratum_id>), <errata_flag>``
+   where the ``errata_flag`` is the enable flag in ``cpu-ops.mk`` described
+   above.
+
+See the next section on how to do this easily.
+
+.. note::
+
+ CVEs have the format ``CVE_<year>_<number>``. To fit them in the framework, the
+ ``erratum_id`` for the checker and the workaround functions become the
+ ``number`` part of its name and the ``ERRATUM(<number>)`` part of the
+ registration should instead be ``CVE(<year>, <number>)``. In the extremely
+ unlikely scenario where a CVE and an erratum numbers clash, the CVE number
+ should be prefixed with a zero.
+
+ Also, their build flag should be ``WORKAROUND_CVE_<year>_<number>``.
+
+.. note::
+
+ AArch32 uses the legacy convention. The checker function has the format
+ ``check_errata_<erratum_id>`` and the workaround has the format
+ ``errata_<cpu_number>_<erratum_id>_wa`` where ``cpu_number`` is the shortform
+ letter and number name of the CPU.
+
+ For CVEs the ``erratum_id`` also becomes ``cve_<year>_<number>``.
+
+Errata framework helpers
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Writing these errata involves lots of boilerplate and repetitive code. On
+AArch64 there are helpers to omit most of this. They are located in
+``include/lib/cpus/aarch64/cpu_macros.S`` and the preferred way to implement
+errata. Please see their comments on how to use them.
+
+The most common type of erratum workaround, one that just sets a "chicken" bit
+in some arbitrary register, would have an implementation for the Cortex-A77,
+erratum #1925769 like::
+
+    workaround_reset_start cortex_a77, ERRATUM(1925769), ERRATA_A77_1925769
+        sysreg_bit_set CORTEX_A77_CPUECTLR_EL1, CORTEX_A77_CPUECTLR_EL1_BIT_8
+    workaround_reset_end cortex_a77, ERRATUM(1925769)
+
+    check_erratum_ls cortex_a77, ERRATUM(1925769), CPU_REV(1, 1)
+
+Status reporting
+^^^^^^^^^^^^^^^^

 In a debug build of TF-A, on a CPU that comes out of reset, both BL1 and the
-runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke errata
-status reporting function, if one exists, for that type of CPU.
-
-To report the status of each errata workaround, the function shall use the
-assembler macro ``report_errata``, passing it:
-
-  The build option that enables the errata;
-
-  The name of the CPU: this must be the same identifier that CPU driver
-   registered itself with, using ``declare_cpu_ops``;
-
-  And the errata identifier: the identifier must match what's used in the
-   errata's check function described above.
-
-The errata status reporting function will be called once per CPU type/errata
-combination during the software's active life time.
-
-It's expected that whenever an errata workaround is submitted to TF-A, the
-errata reporting function is appropriately extended to report its status as
-well.
+runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke a generic
+errata status reporting function. It will read the ``errata_entries`` list of
+that cpu and will report whether each known erratum was applied and, if not,
+whether it should have been.

 Reporting the status of errata workaround is for informational purpose only; it
 has no functional significance.