mirror of
https://github.com/ARM-software/arm-trusted-firmware.git
synced 2025-04-08 05:43:53 +00:00
docs(psci): expound runtime instrumentation docs
Change-Id: I3c30b44d4196c30fd07373282150e543959fce1a Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
This commit is contained in:
parent
a63de43661
commit
b1af2676f2
3 changed files with 175 additions and 1 deletions
@ -5,10 +5,12 @@ Performance & Testing
    :maxdepth: 1
    :caption: Contents

+   psci-performance-instr
    psci-performance-juno
+   psci-performance-methodology
    tsp
    performance-monitoring-unit

 --------------

-*Copyright (c) 2019-2020, Arm Limited. All rights reserved.*
+*Copyright (c) 2019-2023, Arm Limited. All rights reserved.*

117
docs/perf/psci-performance-instr.rst
Normal file
@ -0,0 +1,117 @@
PSCI Performance Measurement
============================

TF-A provides two instrumentation tools for analysing the PSCI
implementation:

* PSCI STAT
* Runtime Instrumentation

This page explains how to enable each tool and how to use it to analyse the
PSCI implementation.

Performance Measurement Framework
---------------------------------

The Performance Measurement Framework (`PMF`_) provides mechanisms for
collecting and retrieving timestamps at runtime from the Performance
Monitoring Unit (`PMU`_). The PMU is a generalized abstraction for accessing
CPU hardware registers used to measure hardware events. This means, for
instance, that the PMU might be used to place instrumentation points at
logical locations in code for tracing purposes.

TF-A utilises the PMF as a backend for the two instrumentation services it
provides: PSCI Statistics and Runtime Instrumentation. These services use the
PMF to collect and retrieve timestamps. For instance, the PSCI Statistics
service registers the PMF service ``psci_svc`` to track its residency
statistics.

The PMF reserves a unique ID, a name, and space in memory for this service.
The framework provides a convenient interface for PSCI Statistics to retrieve
values from ``psci_svc`` at runtime. Alternatively, the service may be
configured such that the PMF dumps those values to the console. A platform may
choose to expose SMCs that allow retrieval of these timestamps from the
service.

This feature is enabled with the Boolean flag ``ENABLE_PMF``.

PSCI Statistics
---------------

PSCI Statistics is a runtime service that provides residency statistics for
power states used by the platform. The service tracks residency time and
entry count. Residency time is the total time spent in a particular power
state by a PE. The entry count is the number of times the PE has entered
the power state. PSCI Statistics implements the optional functions
``PSCI_STAT_RESIDENCY`` and ``PSCI_STAT_COUNT`` from the `PSCI`_
specification.


.. c:macro:: PSCI_STAT_RESIDENCY

    :param target_cpu: Contains a copy of the affinity fields in the MPIDR
        register for identifying the target core (see section 5.1.4 of the
        `PSCI`_ specification for more details).
    :param power_state: Identifier for a specific local
        state. Generally, this parameter takes the same form as the power_state
        parameter described for CPU_SUSPEND in section 5.4.2.

    :returns: Time spent in ``power_state``, in microseconds, by ``target_cpu``
        and the highest level expressed in ``power_state``.


.. c:macro:: PSCI_STAT_COUNT

    :param target_cpu: Follows the same format as ``PSCI_STAT_RESIDENCY``.
    :param power_state: Follows the same format as ``PSCI_STAT_RESIDENCY``.

    :returns: Number of times the state expressed in ``power_state`` has been
        used by ``target_cpu`` and the highest level expressed in
        ``power_state``.
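
As an illustration of the ``target_cpu`` argument, the following minimal
sketch packs MPIDR affinity fields into the layout the PSCI specification
describes for this parameter. The function IDs are the values listed in the
PSCI specification; the helper ``make_target_cpu`` is a hypothetical name
used only for this example and is not part of TF-A.

```python
# Function IDs as listed in the PSCI specification (DEN0022).
PSCI_STAT_RESIDENCY_SMC32 = 0x84000010
PSCI_STAT_RESIDENCY_SMC64 = 0xC4000010
PSCI_STAT_COUNT_SMC32 = 0x84000011
PSCI_STAT_COUNT_SMC64 = 0xC4000011


def make_target_cpu(aff0, aff1=0, aff2=0, aff3=0, smc64=True):
    """Pack MPIDR affinity fields into a target_cpu value.

    Hypothetical helper for illustration only.
    """
    fields = (("aff0", aff0), ("aff1", aff1), ("aff2", aff2), ("aff3", aff3))
    for name, value in fields:
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    if not smc64 and aff3 != 0:
        raise ValueError("an SMC32 call cannot express Aff3")
    # Aff0..Aff2 occupy bits [23:0]; Aff3 occupies bits [39:32] (SMC64 only).
    # All other bits stay zero.
    return (aff3 << 32) | (aff2 << 16) | (aff1 << 8) | aff0


# Core 1 of cluster 1 (Aff1 = 1, Aff0 = 1).
target_cpu = make_target_cpu(aff0=1, aff1=1)
print(hex(PSCI_STAT_COUNT_SMC64), hex(target_cpu))
```

A caller would pass the chosen function ID, ``target_cpu`` and ``power_state``
as the SMC arguments; issuing the SMC itself is outside the scope of this
sketch.
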

The implementation provides residency statistics only for low power states,
and does this regardless of the entry mechanism into those states. The
statistics it collects are set to 0 during shutdown or reset.

PSCI Statistics is enabled with the Boolean build flag
``ENABLE_PSCI_STAT``. All Arm platforms utilise the PMF unless another
collection backend is provided (``ENABLE_PMF`` is implicitly enabled).
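
The accounting described in this section can be pictured with a small
host-side model. This is only a sketch of the bookkeeping (entry count and
accumulated residency per CPU and power state, cleared on reset); the class
and method names are hypothetical and do not correspond to TF-A code.

```python
from collections import defaultdict


class PsciStatModel:
    """Toy model of per-(CPU, power state) residency and entry counts."""

    def __init__(self):
        self._count = defaultdict(int)
        self._residency_us = defaultdict(int)

    def record(self, target_cpu, power_state, enter_us, exit_us):
        """Account for one stay in a low power state."""
        if exit_us < enter_us:
            raise ValueError("exit time precedes entry time")
        key = (target_cpu, power_state)
        self._count[key] += 1
        self._residency_us[key] += exit_us - enter_us

    def stat_count(self, target_cpu, power_state):
        """Number of times the state was entered (models PSCI_STAT_COUNT)."""
        return self._count.get((target_cpu, power_state), 0)

    def stat_residency(self, target_cpu, power_state):
        """Total microseconds in the state (models PSCI_STAT_RESIDENCY)."""
        return self._residency_us.get((target_cpu, power_state), 0)

    def reset(self):
        """Statistics return to 0 on shutdown or reset."""
        self._count.clear()
        self._residency_us.clear()


stats = PsciStatModel()
stats.record(target_cpu=0x100, power_state=0x2, enter_us=10, exit_us=250)
stats.record(target_cpu=0x100, power_state=0x2, enter_us=400, exit_us=460)
print(stats.stat_count(0x100, 0x2), stats.stat_residency(0x100, 0x2))
```

The ``power_state`` values here are arbitrary placeholders; real values follow
the platform's ``CPU_SUSPEND`` encoding.
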

Runtime Instrumentation
-----------------------

The Runtime Instrumentation Service is an instrumentation tool that wraps
around the PMF to provide timestamp data. Although the service is not
restricted to PSCI and can be used to instrument other components in TF-A,
it is used primarily to quantify the total time spent in the PSCI
implementation. It is enabled with the Boolean flag
``ENABLE_RUNTIME_INSTRUMENTATION`` and, as with PSCI STAT, requires PMF to
be enabled.

In PSCI, this service provides instrumentation points in the
following code paths:

* Entry into the PSCI SMC handler
* Exit from the PSCI SMC handler
* Entry to low power state
* Exit from low power state
* Entry into cache maintenance operations in PSCI
* Exit from cache maintenance operations in PSCI

The service captures the cycle count at each point, which allows the time
spent in the implementation to be calculated, given the counter frequency.
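
The conversion from two captured counter values to elapsed time is a single
division by the counter frequency. A minimal sketch, assuming the timestamps
are raw tick counts from a counter running at a known rate (the function name
is hypothetical):

```python
def ticks_to_us(start_ticks, end_ticks, counter_hz):
    """Convert the difference between two counter readings to microseconds."""
    if counter_hz <= 0:
        raise ValueError("counter frequency must be positive")
    if end_ticks < start_ticks:
        raise ValueError("end reading precedes start reading")
    return (end_ticks - start_ticks) * 1_000_000 / counter_hz


# 500 ticks on a 50 MHz counter: each tick is 20 ns.
print(ticks_to_us(1_000, 1_500, 50_000_000))
```

The 50 MHz figure is the generic counter rate quoted for Juno in the
methodology document; other platforms will use a different frequency.
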

PSCI SMC Handler Instrumentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The timestamp during entry into the handler is captured as early as possible
during the runtime exception, prior to entry into the handler itself. All
timestamps are stored in memory for later retrieval. The exit timestamp is
captured after normal return from the PSCI SMC handler, or, if a low power state
was requested, it is captured in the warm boot path.

*Copyright (c) 2023, Arm Limited. All rights reserved.*

.. _PMF: ../design/firmware-design.html#performance-measurement-framework
.. _PMU: performance-monitoring-unit.html
.. _PSCI: https://developer.arm.com/documentation/den0022/latest/

55
docs/perf/psci-performance-methodology.rst
Normal file
@ -0,0 +1,55 @@
Runtime Instrumentation Methodology
===================================

This document outlines steps for undertaking performance measurements of key
operations in the Trusted Firmware-A Power State Coordination Interface (PSCI)
implementation, using the in-built Performance Measurement Framework (PMF) and
runtime instrumentation timestamps.

Framework
~~~~~~~~~

The tests are based on the ``runtime-instrumentation`` test suite provided by
the Trusted Firmware Test Framework (TFTF). The release build of this framework
was used because the results in the debug build became skewed; the console
output prevented some of the tests from executing in parallel.

The tests consist of both parallel and sequential tests, which are broadly
described as follows:

- **Parallel Tests** This type of test powers on all the non-lead CPUs and
  brings them and the lead CPU to a common synchronization point. The lead CPU
  then initiates the test on all CPUs in parallel.

- **Sequential Tests** This type of test powers on each non-lead CPU in
  sequence. The lead CPU initiates the test on a non-lead CPU then waits for the
  test to complete before proceeding to the next non-lead CPU. The lead CPU then
  executes the test on itself.
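
The difference between the two test shapes can be pictured with a host-side
analogy in which threads stand in for CPUs. This is only a sketch of the
control flow described above, not TFTF code; ``run_parallel``,
``run_sequential`` and the ``test`` callback are hypothetical names.

```python
import threading


def run_parallel(num_cpus, test):
    """All CPUs meet at a common synchronization point, then run together."""
    barrier = threading.Barrier(num_cpus)
    results = [None] * num_cpus

    def cpu_entry(cpu_id):
        barrier.wait()  # common synchronization point
        results[cpu_id] = test(cpu_id)

    # "Power on" the non-lead CPUs (IDs 1..n-1).
    threads = [threading.Thread(target=cpu_entry, args=(cpu,))
               for cpu in range(1, num_cpus)]
    for thread in threads:
        thread.start()
    cpu_entry(0)  # the lead CPU joins the barrier and runs the test too
    for thread in threads:
        thread.join()
    return results


def run_sequential(num_cpus, test):
    """Each non-lead CPU runs alone; the lead CPU goes last."""
    results = [None] * num_cpus

    def cpu_entry(cpu_id):
        results[cpu_id] = test(cpu_id)

    for cpu in range(1, num_cpus):
        thread = threading.Thread(target=cpu_entry, args=(cpu,))
        thread.start()
        thread.join()  # wait for completion before the next CPU
    cpu_entry(0)
    return results


print(run_parallel(4, lambda cpu: cpu * 2))
print(run_sequential(4, lambda cpu: cpu * 2))
```
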

Note that there is very little variance observed in the values given (~1us),
although the values for each CPU are sometimes interchanged, depending on the
order in which locks are acquired. There is also very little variance observed
between executing the tests sequentially in a single boot and rebooting between
tests.

Given that runtime instrumentation using PMF is invasive, there is a small
(unquantified) overhead on the results. PMF uses the generic counter for
timestamps, which runs at 50MHz on Juno.

Metrics
~~~~~~~

.. glossary::

   Powerdown Latency
        Time taken from entering the TF PSCI implementation to the point the hardware
        enters the low power state (WFI). Referring to the TF runtime instrumentation points, this
        corresponds to: ``(RT_INSTR_ENTER_HW_LOW_PWR - RT_INSTR_ENTER_PSCI)``.

   Wakeup Latency
        Time taken from the point the hardware exits the low power state to exiting
        the TF PSCI implementation. This corresponds to: ``(RT_INSTR_EXIT_PSCI -
        RT_INSTR_EXIT_HW_LOW_PWR)``.

   Cache Flush Latency
        Time taken to flush the caches during powerdown. This corresponds to:
        ``(RT_INSTR_EXIT_CFLUSH - RT_INSTR_ENTER_CFLUSH)``.
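
The three metrics above are plain differences between pairs of timestamps. A
minimal sketch of the arithmetic, assuming the timestamps have already been
retrieved as raw counter ticks keyed by instrumentation point name; the sample
values are invented for illustration and are not measurements.

```python
JUNO_COUNTER_HZ = 50_000_000  # generic counter rate quoted for Juno


def latencies_us(timestamps, counter_hz=JUNO_COUNTER_HZ):
    """Derive the three latency metrics, in microseconds, from raw ticks."""

    def delta_us(end, start):
        return (timestamps[end] - timestamps[start]) * 1_000_000 / counter_hz

    return {
        "powerdown": delta_us("RT_INSTR_ENTER_HW_LOW_PWR",
                              "RT_INSTR_ENTER_PSCI"),
        "wakeup": delta_us("RT_INSTR_EXIT_PSCI",
                           "RT_INSTR_EXIT_HW_LOW_PWR"),
        "cache_flush": delta_us("RT_INSTR_EXIT_CFLUSH",
                                "RT_INSTR_ENTER_CFLUSH"),
    }


# Invented tick values, for illustration only.
sample = {
    "RT_INSTR_ENTER_PSCI": 1_000,
    "RT_INSTR_ENTER_CFLUSH": 1_200,
    "RT_INSTR_EXIT_CFLUSH": 1_450,
    "RT_INSTR_ENTER_HW_LOW_PWR": 1_500,
    "RT_INSTR_EXIT_HW_LOW_PWR": 9_000,
    "RT_INSTR_EXIT_PSCI": 9_250,
}
print(latencies_us(sample))
```

At 50 MHz one tick is 20 ns, so the timer resolution is well below the ~1us
variance noted earlier.
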