diff --git a/docs/design_documents/index.rst b/docs/design_documents/index.rst index 3d82e69cb..d20fc5809 100644 --- a/docs/design_documents/index.rst +++ b/docs/design_documents/index.rst @@ -10,6 +10,7 @@ Design Documents measured_boot_poc drtm_poc rss + psci_osi_mode -------------- diff --git a/docs/design_documents/psci_osi_mode.rst b/docs/design_documents/psci_osi_mode.rst new file mode 100644 index 000000000..3296e270c --- /dev/null +++ b/docs/design_documents/psci_osi_mode.rst @@ -0,0 +1,716 @@ +PSCI OS-initiated mode +====================== + +:Author: Maulik Shah & Wing Li +:Organization: Qualcomm Innovation Center, Inc. & Google LLC +:Contact: Maulik Shah & Wing Li +:Status: RFC + +.. contents:: Table of Contents + +Introduction +------------ + +Power state coordination +^^^^^^^^^^^^^^^^^^^^^^^^ + +A power domain topology is a logical hierarchy of power domains in a system that +arises from the physical dependencies between power domains. + +Local power states describe power states for an individual node, and composite +power states describe the combined power states for an individual node and its +parent node(s). + +Entry into low-power states for a topology node above the core level requires +coordinating its children nodes. For example, in a system with a power domain +that encompasses a shared cache, and a separate power domain for each core that +uses the shared cache, the core power domains must be powered down before the +shared cache power domain can be powered down. + +PSCI supports two modes of power state coordination: platform-coordinated and +OS-initiated. + +Platform-coordinated +~~~~~~~~~~~~~~~~~~~~ + +Platform-coordinated mode is the default mode of power state coordination, and +is currently the only supported mode in TF-A. + +In platform-coordinated mode, the platform is responsible for coordinating power +states, and chooses the deepest power state for a topology node that can be +tolerated by its children. + +OS-initiated +~~~~~~~~~~~~ + +OS-initiated mode is optional. + +In OS-initiated mode, the calling OS is responsible for coordinating power +states, and may request for a topology node to enter a low-power state when +its last child enters the low-power state. + +Motivation +---------- + +There are two reasons why OS-initiated mode might be a more suitable option than +platform-coordinated mode for a platform. + +Scalability +^^^^^^^^^^^ + +In platform-coordinated mode, each core independently selects their own local +power states, and doesn't account for composite power states that are shared +between cores. + +In OS-initiated mode, the OS has knowledge of the next wakeup event for each +core, and can have more precise control over the entry, exit, and wakeup +latencies when deciding if a composite power state (e.g. for a cluster) is +appropriate. This is especially important for multi-cluster SMP systems and +heterogeneous systems like big.LITTLE, where different processor types can have +different power efficiencies. + +Simplicity +^^^^^^^^^^ + +In platform-coordinated mode, the OS doesn't have visibility when the last core +at a power level enters a low-power state. If the OS wants to perform last man +activity (e.g. powering off a shared resource when it is no longer needed), it +would have to communicate with an API side channel to know when it can do so. +This could result in a design smell where the platform is using +platform-coordinated mode when it should be using OS-initiated mode instead. + +In OS-initiated mode, the OS can perform last man activity if it selects a +composite power state when the last core enters a low-power state. This +eliminates the need for a side channel, and uses the well documented API between +the OS and the platform. + +Current vendor implementations and workarounds +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +* STMicroelectronics + + * For their ARM32 platforms, they're using OS-initiated mode implemented in + OP-TEE. + * For their future ARM64 platforms, they are interested in using OS-initiated + mode in TF-A. + +* Qualcomm + + * For their mobile platforms, they're using OS-initiated mode implemented in + their own custom secure monitor firmware. + * For their Chrome OS platforms, they're using platform-coordinated mode in + TF-A with custom driver logic to perform last man activity. + +* Google + + * They're using platform-coordinated mode in TF-A with custom driver logic to + perform last man activity. + +Both Qualcomm and Google would like to be able to use OS-initiated mode in TF-A +in order to simplify custom driver logic. + +Requirements +------------ + +PSCI_FEATURES +^^^^^^^^^^^^^ + +PSCI_FEATURES is for checking whether or not a PSCI function is implemented and +what its properties are. + +.. c:macro:: PSCI_FEATURES + + :param func_id: 0x8400_000A. + :param psci_func_id: the function ID of a PSCI function. + :retval NOT_SUPPORTED: if the function is not implemented. + :retval feature flags associated with the function: if the function is + implemented. + +CPU_SUSPEND feature flags +~~~~~~~~~~~~~~~~~~~~~~~~~ + +* Reserved, bits[31:2] +* Power state parameter format, bit[1] + + * A value of 0 indicates the original format is used. + * A value of 1 indicates the extended format is used. + +* OS-initiated mode, bit[0] + + * A value of 0 indicates OS-initiated mode is not supported. + * A value of 1 indicates OS-initiated mode is supported. + +See sections 5.1.14 and 5.15 of the PSCI spec (DEN0022D.b) for more details. + +PSCI_SET_SUSPEND_MODE +^^^^^^^^^^^^^^^^^^^^^ + +PSCI_SET_SUSPEND_MODE is for switching between the two different modes of power +state coordination. + +.. c:macro:: PSCI_SET_SUSPEND_MODE + + :param func_id: 0x8400_000F. + :param mode: 0 indicates platform-coordinated mode, 1 indicates OS-initiated + mode. + :retval SUCCESS: if the request is successful. + :retval NOT_SUPPORTED: if OS-initiated mode is not supported. + :retval INVALID_PARAMETERS: if the requested mode is not a valid value (0 or + 1). + :retval DENIED: if the cores are not in the correct state. + +Switching from platform-coordinated to OS-initiated is only allowed if the +following conditions are met: + +* All cores are in one of the following states: + + * Running. + * Off, through a call to CPU_OFF or not yet booted. + * Suspended, through a call to CPU_DEFAULT_SUSPEND. + +* None of the cores has called CPU_SUSPEND since the last change of mode or + boot. + +Switching from OS-initiated to platform-coordinated is only allowed if all cores +other than the calling core are off, either through a call to CPU_OFF or not yet +booted. + +If these conditions are not met, the PSCI implementation must return DENIED. + +See sections 5.1.19 and 5.20 of the PSCI spec (DEN0022D.b) for more details. + +CPU_SUSPEND +^^^^^^^^^^^ + +CPU_SUSPEND is for moving a topology node into a low-power state. + +.. c:macro:: CPU_SUSPEND + + :param func_id: 0xC400_0001. + :param power_state: the requested low-power state to enter. + :param entry_point_address: the address at which the core must resume + execution following wakeup from a powerdown state. + :param context_id: this field specifies a pointer to the saved context that + must be restored on a core following wakeup from a powerdown state. + :retval SUCCESS: if the request is successful. + :retval INVALID_PARAMETERS: in OS-initiated mode, this error is returned when + a low-power state is requested for a topology node above the core level, + and at least one of the node's children is in a local low-power state + that is incompatible with the request. + :retval INVALID_ADDRESS: if the entry_point_address argument is invalid. + :retval DENIED: only in OS-initiated mode; this error is returned when a + low-power state is requested for a topology node above the core level, + and at least one of the node's children is running, i.e. not in a + low-power state. + +In platform-coordinated mode, the PSCI implementation coordinates requests from +all cores to determine the deepest power state to enter. + +In OS-initiated mode, the calling OS is making an explicit request for a +specific power state, as opposed to expressing a vote. The PSCI implementation +must comply with the request, unless the request is not consistent with the +implementation's view of the system's state, in which case, the implementation +must return INVALID_PARAMETERS or DENIED. + +See sections 5.1.2 and 5.4 of the PSCI spec (DEN0022D.b) for more details. + +Power state formats +~~~~~~~~~~~~~~~~~~~ + +Original format + +* Power Level, bits[25:24] + + * The requested level in the power domain topology to enter a low-power + state. + +* State Type, bit[16] + + * A value of 0 indicates a standby or retention state. + * A value of 1 indicates a powerdown state. + +* State ID, bits[15:0] + + * Field to specify the requested composite power state. + * The state ID encodings must uniquely describe every possible composite + power state. + * In OS-initiated mode, the state ID encoding must allow expressing the + power level at which the calling core is the last to enter a powerdown + state. + +Extended format + +* State Type, bit[30] +* State ID, bits[27:0] + +Races in OS-initiated mode +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In OS-initiated mode, there are race windows where the OS's view and +implementation's view of the system's state differ. It is possible for the OS to +make requests that are invalid given the implementation's view of the system's +state. For example, the OS might request a powerdown state for a node from one +core, while at the same time, the implementation observes that another core in +that node is powering up. + +To address potential race conditions in power state requests: + +* The calling OS must specify in each CPU_SUSPEND request the deepest power + level for which it sees the calling core as the last running core (last man). + This is required even if the OS doesn't want the node at that power level to + enter a low-power state. +* The implementation must validate that the requested power states in the + CPU_SUSPEND request are consistent with the system's state, and that the + calling core is the last core running at the requested power level, or deny + the request otherwise. + +See sections 4.2.3.2, 6.2, and 6.3 of the PSCI spec (DEN0022D.b) for more +details. + +Caveats +------- + +CPU_OFF +^^^^^^^ + +CPU_OFF is always platform-coordinated, regardless of whether the power state +coordination mode for suspend is platform-coordinated or OS-initiated. If all +cores in a topology node call CPU_OFF, the last core will power down the node. + +In OS-initiated mode, if a subset of the cores in a topology node has called +CPU_OFF, the last running core may call CPU_SUSPEND to request a powerdown state +at or above that node's power level. + +See section 5.5.2 of the PSCI spec (DEN0022D.b) for more details. + +Implementation +-------------- + +Current implementation of platform-coordinated mode +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Platform-coordinated is currently the only supported power state coordination +mode in TF-A. + +The functions of interest in the ``psci_cpu_suspend`` call stack are as follows: + +* ``psci_validate_power_state`` + + * This function calls a platform specific ``validate_power_state`` handler, + which takes the ``power_state`` parameter, and updates the ``state_info`` + object with the requested states for each power level. + +* ``psci_find_target_suspend_lvl`` + + * This function takes the ``state_info`` object containing the requested power + states for each power level, and returns the deepest power level that was + requested to enter a low power state, i.e. the target power level. + +* ``psci_do_state_coordination`` + + * This function takes the target power level and the ``state_info`` object + containing the requested power states for each power level, and updates the + ``state_info`` object with the coordinated target power state for each + level. + +* ``pwr_domain_suspend`` + + * This is a platform specific handler that takes the ``state_info`` object + containing the target power states for each power level, and transitions + each power level to the specified power state. + +Proposed implementation of OS-initiated mode +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To add support for OS-initiated mode, the following changes are proposed: + +* Add a boolean build option ``PSCI_OS_INIT_MODE`` for a platform to enable + optional support for PSCI OS-initiated mode. This build option defaults to 0. + +.. note:: + + If ``PSCI_OS_INIT_MODE=0``, the following changes will not be compiled into + the build. + +* Update ``psci_features`` to return 1 in bit[0] to indicate support for + OS-initiated mode for CPU_SUSPEND. +* Define a ``suspend_mode`` enum: ``PLAT_COORD`` and ``OS_INIT``. +* Define a ``psci_suspend_mode`` global variable with a default value of + ``PLAT_COORD``. +* Implement a new function handler ``psci_set_suspend_mode`` for + PSCI_SET_SUSPEND_MODE. +* Since ``psci_validate_power_state`` calls a platform specific + ``validate_power_state`` handler, the platform implementation should populate + the ``state_info`` object based on the state ID from the given ``power_state`` + parameter. +* ``psci_find_target_suspend_lvl`` remains unchanged. +* Implement a new function ``psci_validate_state_coordination`` that ensures the + request satisfies the following conditions, and denies any requests + that don't: + + * The requested power states for each power level are consistent with the + system's state + * The calling core is the last core running at the requested power level + + This function differs from ``psci_do_state_coordination`` in that: + + * The ``psci_req_local_pwr_states`` map is not modified if the request were to + be denied + * The ``state_info`` argument is never modified since it contains the power + states requested by the calling OS + +* Update ``psci_cpu_suspend_start`` to do the following: + + * If ``PSCI_SUSPEND_MODE`` is ``PLAT_COORD``, call + ``psci_do_state_coordination``. + * If ``PSCI_SUSPEND_MODE`` is ``OS_INIT``, call + ``psci_validate_state_coordination``. If validation fails, propagate the + error up the call stack. + +* Update the return type of the platform specific ``pwr_domain_suspend`` + handler from ``void`` to ``int``, to allow the platform to optionally perform + validations based on hardware states. + +.. image:: ../resources/diagrams/psci-osi-mode.png + +Testing +------- + +The proposed patches can be found at +https://review.trustedfirmware.org/q/topic:psci-osi. + +Testing on FVP and Google platforms +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The proposed patches add a new CPU Suspend in OSI mode test suite to TF-A Tests. +This has been enabled and verified on the FVP_Base_RevC-2xAEMvA platform and +Google platforms, and excluded from all other platforms via the build option +``PLAT_TESTS_SKIP_LIST``. + +Testing on STM32MP15 +^^^^^^^^^^^^^^^^^^^^ + +The proposed patches have been tested and verified on the STM32MP15 platform, +which has a single cluster with 2 CPUs, by Gabriel Fernandez + from STMicroelectronics with this device tree +configuration: + +.. code-block:: devicetree + + cpus { + #address-cells = <1>; + #size-cells = <0>; + + cpu0: cpu@0 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <0>; + enable-method = "psci"; + power-domains = <&CPU_PD0>; + power-domain-names = "psci"; + }; + cpu1: cpu@1 { + device_type = "cpu"; + compatible = "arm,cortex-a7"; + reg = <1>; + enable-method = "psci"; + power-domains = <&CPU_PD1>; + power-domain-names = "psci"; + }; + + idle-states { + cpu_retention: cpu-retention { + compatible = "arm,idle-state"; + arm,psci-suspend-param = <0x00000001>; + entry-latency-us = <130>; + exit-latency-us = <620>; + min-residency-us = <700>; + local-timer-stop; + }; + }; + + domain-idle-states { + CLUSTER_STOP: core-power-domain { + compatible = "domain-idle-state"; + arm,psci-suspend-param = <0x01000001>; + entry-latency-us = <230>; + exit-latency-us = <720>; + min-residency-us = <2000>; + local-timer-stop; + }; + }; + }; + + psci { + compatible = "arm,psci-1.0"; + method = "smc"; + + CPU_PD0: power-domain-cpu0 { + #power-domain-cells = <0>; + power-domains = <&pd_core>; + domain-idle-states = <&cpu_retention>; + }; + + CPU_PD1: power-domain-cpu1 { + #power-domain-cells = <0>; + power-domains = <&pd_core>; + domain-idle-states = <&cpu_retention>; + }; + + pd_core: power-domain-cluster { + #power-domain-cells = <0>; + domain-idle-states = <&CLUSTER_STOP>; + }; + }; + +Testing on Qualcomm SC7280 +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The proposed patches have been tested and verified on the SC7280 platform by +Maulik Shah from Qualcomm with this device tree +configuration: + +.. code-block:: devicetree + + cpus { + #address-cells = <2>; + #size-cells = <0>; + + CPU0: cpu@0 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x0>; + enable-method = "psci"; + power-domains = <&CPU_PD0>; + power-domain-names = "psci"; + }; + + CPU1: cpu@100 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x100>; + enable-method = "psci"; + power-domains = <&CPU_PD1>; + power-domain-names = "psci"; + }; + + CPU2: cpu@200 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x200>; + enable-method = "psci"; + power-domains = <&CPU_PD2>; + power-domain-names = "psci"; + }; + + CPU3: cpu@300 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x300>; + enable-method = "psci"; + power-domains = <&CPU_PD3>; + power-domain-names = "psci"; + } + + CPU4: cpu@400 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x400>; + enable-method = "psci"; + power-domains = <&CPU_PD4>; + power-domain-names = "psci"; + }; + + CPU5: cpu@500 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x500>; + enable-method = "psci"; + power-domains = <&CPU_PD5>; + power-domain-names = "psci"; + }; + + CPU6: cpu@600 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x600>; + enable-method = "psci"; + power-domains = <&CPU_PD6>; + power-domain-names = "psci"; + }; + + CPU7: cpu@700 { + device_type = "cpu"; + compatible = "arm,kryo"; + reg = <0x0 0x700>; + enable-method = "psci"; + power-domains = <&CPU_PD7>; + power-domain-names = "psci"; + }; + + idle-states { + entry-method = "psci"; + + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 { + compatible = "arm,idle-state"; + idle-state-name = "little-power-down"; + arm,psci-suspend-param = <0x40000003>; + entry-latency-us = <549>; + exit-latency-us = <901>; + min-residency-us = <1774>; + local-timer-stop; + }; + + LITTLE_CPU_SLEEP_1: cpu-sleep-0-1 { + compatible = "arm,idle-state"; + idle-state-name = "little-rail-power-down"; + arm,psci-suspend-param = <0x40000004>; + entry-latency-us = <702>; + exit-latency-us = <915>; + min-residency-us = <4001>; + local-timer-stop; + }; + + BIG_CPU_SLEEP_0: cpu-sleep-1-0 { + compatible = "arm,idle-state"; + idle-state-name = "big-power-down"; + arm,psci-suspend-param = <0x40000003>; + entry-latency-us = <523>; + exit-latency-us = <1244>; + min-residency-us = <2207>; + local-timer-stop; + }; + + BIG_CPU_SLEEP_1: cpu-sleep-1-1 { + compatible = "arm,idle-state"; + idle-state-name = "big-rail-power-down"; + arm,psci-suspend-param = <0x40000004>; + entry-latency-us = <526>; + exit-latency-us = <1854>; + min-residency-us = <5555>; + local-timer-stop; + }; + }; + + domain-idle-states { + CLUSTER_SLEEP_0: cluster-sleep-0 { + compatible = "arm,idle-state"; + idle-state-name = "cluster-power-down"; + arm,psci-suspend-param = <0x40003444>; + entry-latency-us = <3263>; + exit-latency-us = <6562>; + min-residency-us = <9926>; + local-timer-stop; + }; + }; + }; + + psci { + compatible = "arm,psci-1.0"; + method = "smc"; + + CPU_PD0: cpu0 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>; + }; + + CPU_PD1: cpu1 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>; + }; + + CPU_PD2: cpu2 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>; + }; + + CPU_PD3: cpu3 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>; + }; + + CPU_PD4: cpu4 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>; + }; + + CPU_PD5: cpu5 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>; + }; + + CPU_PD6: cpu6 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>; + }; + + CPU_PD7: cpu7 { + #power-domain-cells = <0>; + power-domains = <&CLUSTER_PD>; + domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>; + }; + + CLUSTER_PD: cpu-cluster0 { + #power-domain-cells = <0>; + domain-idle-states = <&CLUSTER_SLEEP_0>; + }; + }; + +Comparisons on Qualcomm SC7280 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +CPUIdle states +~~~~~~~~~~~~~~ + +* 8 CPUs, 1 L3 cache +* Platform-coordinated mode + + * CPUIdle states + + * State0 - WFI + * State1 - Core collapse + * State2 - Rail collapse + * State3 - L3 cache off and system resources voted off + +* OS-initiated mode + + * CPUIdle states + + * State0 - WFI + * State1 - Core collapse + * State2 - Rail collapse + + * Cluster domain idle state + + * State3 - L3 cache off and system resources voted off + +.. image:: ../resources/diagrams/psci-flattened-vs-hierarchical-idle-states.png + +Results +~~~~~~~ + +* The following stats have been captured with fixed CPU frequencies from the use + case of 10 seconds of device idle with the display turned on and Wi-Fi and + modem turned off. +* Count refers to the number of times a CPU or cluster entered power collapse. +* Residency refers to the time in seconds a CPU or cluster stayed in power + collapse. +* The results are an average of 3 iterations of actual counts and residencies. + +.. image:: ../resources/diagrams/psci-pc-mode-vs-osi-mode.png + +OS-initiated mode was able to scale better than platform-coordinated mode for +multiple CPUs. The count and residency results for state3 (i.e. a cluster domain +idle state) in OS-initiated mode for multiple CPUs were much closer to the +results for a single CPU than in platform-coordinated mode. + +-------------- + +*Copyright (c) 2023, Arm Limited and Contributors. All rights reserved.* diff --git a/docs/resources/diagrams/psci-flattened-vs-hierarchical-idle-states.png b/docs/resources/diagrams/psci-flattened-vs-hierarchical-idle-states.png new file mode 100644 index 000000000..7c41f75dd Binary files /dev/null and b/docs/resources/diagrams/psci-flattened-vs-hierarchical-idle-states.png differ diff --git a/docs/resources/diagrams/psci-osi-mode.png b/docs/resources/diagrams/psci-osi-mode.png new file mode 100644 index 000000000..d3229539d Binary files /dev/null and b/docs/resources/diagrams/psci-osi-mode.png differ diff --git a/docs/resources/diagrams/psci-pc-mode-vs-osi-mode.png b/docs/resources/diagrams/psci-pc-mode-vs-osi-mode.png new file mode 100644 index 000000000..7270a3dc4 Binary files /dev/null and b/docs/resources/diagrams/psci-pc-mode-vs-osi-mode.png differ