mirror of
https://github.com/ARM-software/arm-trusted-firmware.git
synced 2025-04-26 14:55:16 +00:00

The simplistic view of a core's powerdown sequence is that power is atomically cut upon calling `wfi`. However, it turns out that the core has lots to do: it has to talk to the interconnect to exit coherency, clean caches, check for RAS errors, etc. These steps take significant amounts of time and are certainly not atomic. As such, there is a significant window of opportunity for external events to happen. Many of these steps are not destructive to context, so theoretically the core can just "give up" halfway (or roll certain actions back) and carry on running. The point in this sequence after which rollback is not possible is called the point of no return.

One of these actions is the check for RAS errors. It is possible for an error to happen during this lengthy sequence, or at least to remain undiscovered until that point. If the core were to continue powerdown when that happens, there would be no (easy) way to inform anyone about it. Rejecting the powerdown and letting software handle the error is the best way to implement this.

Arm cores since at least the Cortex-A510 have included this exact feature. So far it hasn't been deemed necessary to account for it in firmware due to the low likelihood of this happening. However, events like GIC wakeup requests are much more probable. Older cores will power down and immediately power back up when this happens. Travis and Gelas include a feature similar to the RAS case above, called powerdown abandon. The idea is that this will improve the latency to service the interrupt by saving on work which the core and software need to do.

So far firmware has relied on the `wfi` being the point of no return, and if it doesn't explicitly detect a pending interrupt quite early on, it will embark on a sequence that it expects to end with shutdown. To accommodate the `wfi` not being a point of no return, we must undo all of the system management we did, just like in the warm boot entrypoint.

To achieve that, the pwr_domain_pwr_down_wfi hook must not be terminal. Most recent platforms do some platform management and finish on the standard `wfi`, followed by a panic or an endless loop, as this is expected to not return. To make this generic, any platform that wishes to support wakeups must instead let common code call `psci_power_down_wfi()` right after. Besides wakeups, this lets common code handle powerdown errata better as well.

Then, the CPU_OFF case is simple: PSCI does not allow it to return. So the best that can be done is to attempt the `wfi` a few times (the choice of 32 is arbitrary) in the hope that the wakeup is transient. If it isn't, the only choice is to panic, as the system is likely to be in a bad state, e.g. interrupts weren't routed away. The same applies for SYSTEM_OFF, SYSTEM_RESET, and SYSTEM_RESET2. There the panic won't matter as the system is going offline one way or another. The RAS case will be considered in a separate patch.

Now, the CPU_SUSPEND case is more involved. First, to power down, the core must wipe its context as it is not written on warm boot. But the context cannot be overwritten in case of a wakeup. To avoid the catch-22, save a copy that will only be used if powerdown fails. That is about 500 bytes on the stack, so it hopefully doesn't tip anyone over any limits. In future that can be avoided by having a core manage its own context.

Second, when the core wakes up, it must undo anything it did to prepare for poweroff, which for the cores we care about is writing CPUPWRCTLR_EL1.CORE_PWRDN_EN. The least intrusive way of doing this for the cpu library is to simply call the power off hook again and have the hook toggle the bit. If in the future there need to be more complex sequences, their direction can be determined by the value of this bit.

Third, do the actual "resume". Most of the logic is already there for the retention suspend, so that only needs a small touch-up to apply to the powerdown case as well. The missing bit is the powerdown-specific state management. Luckily, the warm boot entrypoint does exactly that already too, so steal that and we're done.

All of this is hidden behind a FEAT_PABANDON flag since it has a large memory and runtime cost that we don't want to burden non-pabandon cores with.

Finally, do some function renaming to better reflect their purpose and make names a little bit more consistent.

Change-Id: I2405b59300c2e24ce02e266f91b7c51474c1145f
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
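The CPU_OFF handling described in the commit message (retry the `wfi` a bounded number of times, then panic) can be sketched in portable C as below. This is only an illustration under stated assumptions, not the firmware's implementation: the names `terminal_powerdown`, `wfi_fn_t`, `wfi_transient`, `wfi_always_wakes` and `WFI_RETRY_LIMIT` are hypothetical, the `wfi` instruction is modelled as a callback that reports whether the core actually went down, and the panic is modelled as a return value so the logic can be exercised on a host machine.

```c
#include <stdbool.h>
#include <stddef.h>

/* Retry budget; 32 mirrors the (arbitrary) value named in the commit message. */
#define WFI_RETRY_LIMIT 32U

/*
 * Hypothetical stand-in for the `wfi` instruction. Returns true if the core
 * really powered down, false if the powerdown was abandoned by a wakeup.
 */
typedef bool (*wfi_fn_t)(void *ctx);

typedef enum {
	PWRDOWN_DONE,	/* the core went down; real firmware never gets here */
	PWRDOWN_PANIC	/* every attempt was abandoned; firmware would panic */
} pwrdown_result_t;

/*
 * Terminal powerdown: the caller is not allowed to return to its own caller,
 * so keep retrying in the hope that the wakeup is transient.
 * The number of attempts made is reported through attempts_out.
 */
static pwrdown_result_t terminal_powerdown(wfi_fn_t wfi, void *ctx,
					   unsigned int *attempts_out)
{
	for (unsigned int i = 0U; i < WFI_RETRY_LIMIT; i++) {
		*attempts_out = i + 1U;
		if (wfi(ctx)) {
			return PWRDOWN_DONE;
		}
	}

	return PWRDOWN_PANIC;
}

/* Model of a transient wakeup: abandon *ctx attempts, then power down. */
static bool wfi_transient(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0U) {
		(*remaining)--;
		return false;
	}

	return true;
}

/* Model of a persistent wakeup source: the powerdown is always abandoned. */
static bool wfi_always_wakes(void *ctx)
{
	(void)ctx;
	return false;
}
```

With three pending wakeups, `terminal_powerdown(wfi_transient, &wakeups, &attempts)` succeeds on the fourth attempt; with `wfi_always_wakes` it exhausts the budget and reports that a panic is due. In the source file below, the corresponding terminal step is the call to `psci_pwrdown_cpu_end_terminal()`, whose body is not shown here.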
175 lines
5.9 KiB
C
/*
 * Copyright (c) 2013-2024, Arm Limited and Contributors. All rights reserved.
 * Copyright (c) 2023, NVIDIA Corporation. All rights reserved.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

#include <assert.h>
#include <string.h>

#include <arch.h>
#include <arch_helpers.h>
#include <common/debug.h>
#include <lib/pmf/pmf.h>
#include <lib/runtime_instr.h>
#include <plat/common/platform.h>

#include "psci_private.h"

/******************************************************************************
 * Construct the psci_power_state to request power OFF at all power levels.
 ******************************************************************************/
static void psci_set_power_off_state(psci_power_state_t *state_info)
{
	unsigned int lvl;

	for (lvl = PSCI_CPU_PWR_LVL; lvl <= PLAT_MAX_PWR_LVL; lvl++)
		state_info->pwr_domain_state[lvl] = PLAT_MAX_OFF_STATE;
}

/******************************************************************************
 * Top level handler which is called when a cpu wants to power itself down.
 * It's assumed that along with turning the cpu power domain off, power
 * domains at higher levels will be turned off as far as possible. It finds
 * the highest level where a domain has to be powered off by traversing the
 * node information and then performs generic, architectural, platform setup
 * and state management required to turn OFF that power domain and domains
 * below it. e.g. For a cpu that's to be powered OFF, it could mean programming
 * the power controller whereas for a cluster that's to be powered off, it will
 * call the platform specific code which will disable coherency at the
 * interconnect level if the cpu is the last in the cluster and also
 * program the power controller.
 ******************************************************************************/
int psci_do_cpu_off(unsigned int end_pwrlvl)
{
	int rc = PSCI_E_SUCCESS;
	unsigned int idx = plat_my_core_pos();
	psci_power_state_t state_info;
	unsigned int parent_nodes[PLAT_MAX_PWR_LVL] = {0};

	/*
	 * This function must only be called on platforms where the
	 * CPU_OFF platform hooks have been implemented.
	 */
	assert(psci_plat_pm_ops->pwr_domain_off != NULL);

	/* Construct the psci_power_state for CPU_OFF */
	psci_set_power_off_state(&state_info);

	/*
	 * Call the platform provided early CPU_OFF handler to allow
	 * platforms to perform any housekeeping activities before
	 * actually powering the CPU off. PSCI_E_DENIED indicates that
	 * the CPU off sequence should be aborted at this time.
	 */
	if (psci_plat_pm_ops->pwr_domain_off_early) {
		rc = psci_plat_pm_ops->pwr_domain_off_early(&state_info);
		if (rc == PSCI_E_DENIED) {
			return rc;
		}
	}

	/*
	 * Get the parent nodes here. This is important to do before we
	 * initiate the power down sequence, as after that point the core may
	 * have exited coherency and its cache may be disabled. Any access to
	 * shared memory after that (such as the parent node lookup in
	 * psci_cpu_pd_nodes) can cause coherency issues on some platforms.
	 */
	psci_get_parent_pwr_domain_nodes(idx, end_pwrlvl, parent_nodes);

	/*
	 * This function acquires the lock corresponding to each power
	 * level so that by the time all locks are taken, the system topology
	 * is snapshotted and state management can be done safely.
	 */
	psci_acquire_pwr_domain_locks(end_pwrlvl, parent_nodes);

	/*
	 * Call the cpu off handler registered by the Secure Payload Dispatcher
	 * to let it do any bookkeeping. Assume that the SPD always reports an
	 * E_DENIED error if the SP refuses to power down.
	 */
	if ((psci_spd_pm != NULL) && (psci_spd_pm->svc_off != NULL)) {
		rc = psci_spd_pm->svc_off(0);
		if (rc != 0)
			goto exit;
	}

	/*
	 * This function is passed the requested state info and
	 * it returns the negotiated state info for each power level up to
	 * the end level specified.
	 */
	psci_do_state_coordination(idx, end_pwrlvl, &state_info);

	/* Update the target state in the power domain nodes */
	psci_set_target_local_pwr_states(idx, end_pwrlvl, &state_info);

#if ENABLE_PSCI_STAT
	/* Update the last cpu for each level till end_pwrlvl */
	psci_stats_update_pwr_down(idx, end_pwrlvl, &state_info);
#endif

	/*
	 * Arch. management. Initiate power down sequence.
	 */
	psci_pwrdown_cpu_start(psci_find_max_off_lvl(&state_info));

	/*
	 * Plat. management: Perform platform specific actions to turn this
	 * cpu off e.g. exit cpu coherency, program the power controller etc.
	 */
	psci_plat_pm_ops->pwr_domain_off(&state_info);

#if ENABLE_PSCI_STAT
	plat_psci_stat_accounting_start(&state_info);
#endif

exit:
	/*
	 * Release the locks corresponding to each power level in the
	 * reverse order to which they were acquired.
	 */
	psci_release_pwr_domain_locks(end_pwrlvl, parent_nodes);

	/*
	 * Check if all actions needed to safely power down this cpu have
	 * successfully completed.
	 */
	if (rc == PSCI_E_SUCCESS) {
		/*
		 * Set the affinity info state to OFF. When caches are disabled,
		 * this writes directly to main memory, so cache maintenance is
		 * required to ensure that later cached reads of aff_info_state
		 * return AFF_STATE_OFF. A dsbish() ensures ordering of the
		 * update to the affinity info state prior to cache line
		 * invalidation.
		 */
		psci_flush_cpu_data(psci_svc_cpu_data.aff_info_state);
		psci_set_aff_info_state(AFF_STATE_OFF);
		psci_dsbish();
		psci_inv_cpu_data(psci_svc_cpu_data.aff_info_state);

#if ENABLE_RUNTIME_INSTRUMENTATION
		/*
		 * Update the timestamp with cache off. We assume this
		 * timestamp can only be read from the current CPU and the
		 * timestamp cache line will be flushed before return to
		 * normal world on wakeup.
		 */
		PMF_CAPTURE_TIMESTAMP(rt_instr_svc,
			RT_INSTR_ENTER_HW_LOW_PWR,
			PMF_NO_CACHE_MAINT);
#endif

		if (psci_plat_pm_ops->pwr_domain_pwr_down_wfi != NULL) {
			/* This function may not return */
			psci_plat_pm_ops->pwr_domain_pwr_down_wfi(&state_info);
		}

		psci_pwrdown_cpu_end_terminal();
	}

	return rc;
}