Commit graph

4 commits

Author SHA1 Message Date
Boyan Karatotev
83ec7e452c perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and
has worked quite well so far. However, recent implementations expose a
weakness in that this is rather slow. A large part of it is written in
assembly, making it opaque to the compiler for optimisations. The
future proofness requires reading registers that are effectively
`volatile`, making it even harder for the compiler, as well as adding
lots of implicit barriers, making it hard for the microarchitecutre to
optimise as well.

We can make a few assumptions, checked by a few well placed asserts, and
remove a lot of this burden. For a start, at the moment there are 4
group 0 counters with static assignments. Contexting them is a trivial
affair that doesn't need a loop. Similarly, there can only be up to 16
group 1 counters. Contexting them is a bit harder, but we can do with a
single branch with a falling through switch. If/when both of these
change, we have a pair of asserts and the feature detection mechanism to
guard us against pretending that we support something we don't.

We can drop contexting of the offset registers. They are fully
accessible by EL2 and as such are its responsibility to preserve on
powerdown.

Another small thing we can do, is pass the core_pos into the hook.
The caller already knows which core we're running on, we don't need to
call this non-trivial function again.

Finally, knowing this, we don't really need the auxiliary AMUs to be
described by the device tree. Linux doesn't care at the moment, and any
information we need for EL3 can be neatly placed in a simple array.

All of this, combined with lifting the actual saving out of assembly,
reduces the instructions to save the context from 180 to 40, including a
lot fewer branches. The code is also much shorter and easier to read.

Also propagate to aarch32 so that the two don't diverge too much.

Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:50:46 +00:00
Andre Przywara
d23acc9e4f refactor(amu): unify ENABLE_AMU and ENABLE_FEAT_AMUv1
So far we have the ENABLE_AMU build option to include AMU register
handling code for enabling and context switch. There is also an
ENABLE_FEAT_AMUv1 option, solely to protect the HAFGRTR_EL2 system
register handling. The latter needs some alignment with the new feature
scheme, but it conceptually overlaps with the ENABLE_AMU option.

Since there is no real need for two separate options, unify both into a
new ENABLE_FEAT_AMU name in a first step. This is mostly just renaming at
this point, a subsequent patch will make use of the new feature handling
scheme.

Change-Id: I97d8a55bdee2ed1e1509fa9f2b09fd0bdd82736e
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
2023-03-27 19:36:00 +01:00
Chris Kay
742ca2307f feat(amu): enable per-core AMU auxiliary counters
This change makes AMU auxiliary counters configurable on a per-core
basis, controlled by `ENABLE_AMU_AUXILIARY_COUNTERS`.

Auxiliary counters can be described via the `HW_CONFIG` device tree if
the `ENABLE_AMU_FCONF` build option is enabled, or the platform must
otherwise implement the `plat_amu_topology` function.

A new phandle property for `cpu` nodes (`amu`) has been introduced to
the `HW_CONFIG` specification to allow CPUs to describe the view of
their own AMU:

```
cpu0: cpu@0 {
    ...

    amu = <&cpu0_amu>;
};
```

Multiple cores may share an `amu` handle if they implement the
same set of auxiliary counters.

AMU counters are described for one or more AMUs through the use of a new
`amus` node:

```
amus {
    cpu0_amu: amu-0 {
        #address-cells = <1>;
        #size-cells = <0>;

        counter@0 {
            reg = <0>;

            enable-at-el3;
        };

        counter@n {
            reg = <n>;

            ...
        };
    };
};
```

This structure describes the **auxiliary** (group 1) AMU counters.
Architected counters have architecturally-defined behaviour, and as
such do not require DTB entries.

These `counter` nodes support two properties:

- The `reg` property represents the counter register index.
- The presence of the `enable-at-el3` property determines whether
  the firmware should enable the counter prior to exiting EL3.

Change-Id: Ie43aee010518c5725a3b338a4899b0857caf4c28
Signed-off-by: Chris Kay <chris.kay@arm.com>
2021-10-26 12:15:33 +01:00
Chris Kay
9b43d098d8 build(amu): introduce amu.mk
This change introduces the `amu.mk` Makefile, used to remove the need
to manually include AMU sources into the various build images.
Makefiles requiring the list of AMU sources are expected to include
this file and use `${AMU_SOURCES}` to retrieve them.

Change-Id: I3d174033ecdce6439a110d776f0c064c67abcfe0
Signed-off-by: Chris Kay <chris.kay@arm.com>
2021-10-26 12:14:30 +01:00