Commit graph

27 commits

Author SHA1 Message Date
Boyan Karatotev
83ec7e452c perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and
has worked quite well so far. However, recent implementations expose a
weakness in that this is rather slow. A large part of it is written in
assembly, making it opaque to the compiler for optimisations. The
future proofness requires reading registers that are effectively
`volatile`, making it even harder for the compiler, as well as adding
lots of implicit barriers, making it hard for the microarchitecutre to
optimise as well.

We can make a few assumptions, checked by a few well placed asserts, and
remove a lot of this burden. For a start, at the moment there are 4
group 0 counters with static assignments. Contexting them is a trivial
affair that doesn't need a loop. Similarly, there can only be up to 16
group 1 counters. Contexting them is a bit harder, but we can do with a
single branch with a falling through switch. If/when both of these
change, we have a pair of asserts and the feature detection mechanism to
guard us against pretending that we support something we don't.

We can drop contexting of the offset registers. They are fully
accessible by EL2 and as such are its responsibility to preserve on
powerdown.

Another small thing we can do, is pass the core_pos into the hook.
The caller already knows which core we're running on, we don't need to
call this non-trivial function again.

Finally, knowing this, we don't really need the auxiliary AMUs to be
described by the device tree. Linux doesn't care at the moment, and any
information we need for EL3 can be neatly placed in a simple array.

All of this, combined with lifting the actual saving out of assembly,
reduces the instructions to save the context from 180 to 40, including a
lot fewer branches. The code is also much shorter and easier to read.

Also propagate to aarch32 so that the two don't diverge too much.

Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:50:46 +00:00
Boyan Karatotev
2590e819eb perf(mpmm): greatly simplify MPMM enablement
MPMM is a core-specific microarchitectural feature. It has been present
in every Arm core since the Cortex-A510 and has been implemented in
exactly the same way. Despite that, it is enabled more like an
architectural feature with a top level enable flag. This utilised the
identical implementation.

This duality has left MPMM in an awkward place, where its enablement
should be generic, like an architectural feature, but since it is not,
it should also be core-specific if it ever changes. One choice to do
this has been through the device tree.

This has worked just fine so far, however, recent implementations expose
a weakness in that this is rather slow - the device tree has to be read,
there's a long call stack of functions with many branches, and system
registers are read. In the hot path of PSCI CPU powerdown, this has a
significant and measurable impact. Besides it being a rather large
amount of code that is difficult to understand.

Since MPMM is a microarchitectural feature, its correct placement is in
the reset function. The essence of the current enablement is to write
CPUPPMCR_EL3.MPMM_EN if CPUPPMCR_EL3.MPMMPINCTL == 0. Replacing the C
enablement with an assembly macro in each CPU's reset function achieves
the same effect with just a single close branch and a grand total of 6
instructions (versus the old 2 branches and 32 instructions).

Having done this, the device tree entry becomes redundant. Should a core
that doesn't support MPMM arise, this can cleanly be handled in the
reset function. As such, the whole ENABLE_MPMM_FCONF and platform hooks
mechanisms become obsolete and are removed.

Change-Id: I1d0475b21a1625bb3519f513ba109284f973ffdf
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
2025-02-25 08:50:45 +00:00
Manish V Badarkhe
697290a916 Merge changes from topic "us_tc_trng" into integration
* changes:
  feat(tc): get entropy with PSA Crypto API
  feat(psa): add interface with RSE for retrieving entropy
  fix(psa): guard Crypto APIs with CRYPTO_SUPPORT
  feat(tc): enable trng
  feat(tc): initialize the RSE communication in earlier phase
2025-02-04 13:19:10 +01:00
Leo Yan
2ae197acd6 feat(tc): enable trng
Enable the trng on the platform, which can be used by other features.
`rng-seed` has been removed and enabled `FEAT_RNG_TRAP` to trap to EL3
when accessing system registers RNDR and RNDRRS

Change-Id: Ibde39115f285e67d31b14863c75beaf37493deca
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
2025-02-04 10:26:00 +00:00
Jackson Cooper-Driver
967999d0d9 refactor(tc): clarify msc0 DT node
This node specifies the location of the MPAM registers for the DSU.
Rename the node to clarify this.

Signed-off-by: Jackson Cooper-Driver <jackson.cooper-driver@arm.com>
Signed-off-by: Icen.Zeyada <Icen.Zeyada2@arm.com>
Change-Id: Ie870a7f31acbc44dd943e76896219b9bbdd7d5b4
2025-01-29 08:13:35 +00:00
Jagdish Gediya
25264e292c refactor(tc): remove redundant macro UARTCLK_FREQ
remove redundant macro UARTCLK_FREQ and replace it with TC_UARTCLK
in dts.

Change-Id: Id463a9ddc1588278e552ffca3dfb738676229ce7
Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Icen.Zeyada <Icen.Zeyada2@arm.com>
2025-01-07 09:28:16 +00:00
Jagdish Gediya
1d2d96dd5c fix(tc): replace vencoder with simple panel for kernel > 6.6
The component-aware simple encoder has become outdated with the latest
upstream DRM subsystem changes since Linux kernel commit 4cfe5cc02e3f
("drm/arm/komeda: Remove component framework and add a simple encoder")

To address this we introduce a new compilation flag
`TC_DPU_USE_SIMPLE_PANEL` for control panel vs. encoder enablement.
This flag is set when the kernel version is >= 6.6 and 0 when the kernel
version is < 6.6.

We also rename the `vencoder_in` node to `lcd_in` to avoid unnecessary
conditional code for vencoder vs. simple panel enablement.

Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
Change-Id: Ibb14a56911cfb406b2181a22cc40db58d8ceaa8d
2024-12-05 15:47:33 +00:00
Davidson K
25a2fe3b74 feat(tc): remove static memory used for fwu
With the updated firmware update implementation in the Trusted Services,
it is no longer needed to carve out static memory. Memory will be
allocated dynamically in U-Boot and shared with the firmware update
secure partition of Trusted Services.

Change-Id: I0fb128a458773236ee10526edfa1116b229e4d6e
Signed-off-by: Davidson K <davidson.kumaresan@arm.com>
Signed-off-by: Jackson Cooper-Driver <jackson.cooper-driver@arm.com>
Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
2024-10-04 14:08:09 +00:00
Leo Yan
b3a4f8cfcf feat(tc): update DT for Drage GPU
This patch incorporates the changes for Drage GPU to uses new access
window interface "IRQ_AW". As the interrupt properties are different
between TC4 and other TC platforms, this patch appends the interrupt
properties in platform specific DT binding file.

Change-Id: I2ca505846f03ce64b8e5f02fd202962dbfe39f25
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-08-29 14:39:21 +01:00
Jackson Cooper-Driver
e9e83e96bb feat(tc): add new TC4 RoS definitions
The TC4 uses a new RoS (Virtual Peripherals) and places them at
different address to that in TC3. Add these addresses to the DTS.

Change-Id: Ia62a670e47cdc98b3c113a670a21edc65905cafe
Signed-off-by: Jackson Cooper-Driver <jackson.cooper-driver@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-08-29 14:39:21 +01:00
Jagdish Gediya
7aca660c4e fix(tc): correct CPU PMU binding
CPU PMU types are not same for all CPUs on TC platforms, so define the
PMU nodes per micro architectures.

Change-Id: I4e940976cdda9a6eab3e15936c6c41a2bb668c9d
Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-08-05 16:25:59 +01:00
Jagdish Gediya
77080f6aaf feat(tc): add device tree binding for SPE
Add node for Statistical Profiling Extension, which provides
periodic sampling of operations in the CPU pipeline and reports
this via the perf AUX interface.

Change-Id: Ic7a9d9ce927edbce02c7c09470a009dc56247240
Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-08-05 16:25:59 +01:00
Jagdish Gediya
1300bbce15 feat(tc): change GIC DT property 'interrupt-cells' to 4
Change the GIC's DT property 'interrupt-cells' to 4, so the 4th cell is
a phandle to a node describing a set of CPUs this interrupt is affine
to.

If an interrupt is a PPI, and the node pointed in the 4th cell must be a
subnode of the "ppi-partitions" in the GIC node. For interrupt types
other than PPI, this cell must be zero. This is a preparison for
sequential changes for interrupt partitions, as the first step, it sets
all zeros for the interrupt affinity.

Change-Id: I66490a86a27aad5db6b1a42c2d8e0d042eee46a9
Signed-off-by: Jagdish Gediya <jagdish.gediya@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-08-05 16:25:59 +01:00
Ben Horgan
4c6960ca40 feat(tc): bind SMMU-600 with the DPU on TC3 FPGA
The SMMU 600 is used on TC3 FPGA board with the display device, add the
device tree binding for it.

Change-Id: Iadf85873720ca47bbbda999aa7b18a9db98ae945
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-06-04 15:13:51 +01:00
Jackson Cooper-Driver
0458d3acae feat(tc): bind SMMU-700 with DPU on TC3
TC3 adds a new SMMU-700 specifically for the DPU. This is used as the
DPU SMMU instead of the existing SMMU used for the DPU. Update the
device tree to reflect this.

Change-Id: I865140f8f53bceaa8849f6583190b240eeee0539
Signed-off-by: Jackson Cooper-Driver <jackson.cooper-driver@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-06-04 15:12:22 +01:00
Leo Yan
2458b38772 refactor(tc): append binding for SMMU-700
The usage for SMMU-700 is not consistent across TC platforms:

  SMMU-700 on TC2:

          | FVP   | FPGA
  --------+-------+------
  Display | Used  | Used
  GPU     | Used  | Used

  SMMU-700 on TC3:

          | FVP   | FPGA
  --------+-------+------
  Display | No    | No
  GPU     | Used  | No

This commit changes to use append mode for SMMU-700 to bind it on TC2
and TC3 separately. As a result, the TC_IOMMU_EN configuration is not
used, remove it.

Change-Id: Ic4152eb4c8ef97bf27b8a97c3c6cb86e32a2e8eb
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-06-04 13:31:27 +01:00
Manish V Badarkhe
6d5048f025 Merge "feat(tc): add default SLC policy for the gpu" into integration 2024-06-03 09:39:10 +02:00
Angel Rodriguez Garcia
bebefe0f33 feat(tc): add default SLC policy for the gpu
As per the GPU integration guide, adding the PBHA INT overrides to
influence the GPU allocation policy for the System Level Cache (SLC).

This commit uses SLC policy #23, which is the Arm SLC cache policy
number for GPUs. The cache policy #23 may not be optimal for all
workloads, although it outperforms other policies on the tested data
sets.

Change-Id: I19ddbcf52a2f01af0ab6dfd7cc25b2e438b9014a
Signed-off-by: Angel Rodriguez Garcia <angel.rodriguezgarcia@arm.com>
Signed-off-by: Kshitij Sisodia <kshitij.sisodia@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-05-30 14:23:19 +01:00
Boyan Karatotev
f2596ff1a8 feat(tc): bind SCMI over MHUv3 for TC3
TC2 and TC3 have different the scmi shared memory regions and MHU
parameters, this patch appends the properties in scmi node for TC2 and
TC3 respectively.

Change-Id: Ifd001f780b575987877b4be36eb755a9dbe57e60
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-05-22 15:58:57 +01:00
Boyan Karatotev
6c069e7168 feat(tc): add MHUv3 DT binding for TC3
MHUv3's device tree is different from MHUv2's. Add support MHUv3 DT
binding for TC3 while keeping TC2 as-is.

Change-Id: Ib2f55d3a64a4cfe2ea9e62fe39d27ed54a2ca007
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-05-22 15:58:57 +01:00
Boyan Karatotev
d42987c34a refactor(tc): move SCMI nodes into the 'firmware' node
As Linux 6.1 and later kernels require the SCMI nodes must be placed in
a firmware node, this patch adds the 'firmware' node and puts SCMI nodes
under it.

Change-Id: I37855095b8b0e5051c5de6e8db30e43f6220f9de
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Boyan Karatotev
c33a393675 refactor(tc): move MHUv2 property to tc2.dts
As only TC2 uses MHUv2, move the protocol property to tc2.dts.

Change-Id: I39dd57311e1058a6aabd4cbd5028511f704dd234
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Boyan Karatotev
75400dd5de refactor(tc): drop the 'mhu-protocol' property in DT binding
As the 'mhu-protocol' property is not used in mhu node, drop it.

Change-Id: I2f7320f668451ce44601dfa48bf47103334c39ed
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Leo Yan
e6ef3ef0f6 refactor(tc): append properties in DT bindings
This patch appends properties in DT bindings to differentiate between
FVP and FPGA. The related macros are no longer used, so they are
removed.

This patch contains minor improvement for adding labels in device nodes.

Change-Id: I8d708bb7a8a9a0ed32b806abcb4e7651daadf5e6
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Leo Yan
79c6ede09a refactor(tc): move SCMI clock DT binding into tc-base.dtsi
As SCMI clock DT bindings are common for TC platforms, move them into
'tc-base.dtsi'.

As a result, the file 'tc_vers.dtsi' is empty, so removes it.

Change-Id: Iaa7219bbbde8458dcfe01de7ad6c277a960357c5
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Leo Yan
f9565b2af1 refactor(tc): move out platform specific DT binding from tc-base.dtsi
The main purpose of 'tc-base.dtsi' is for common DT bindings, however,
it contains bindings for platform specific.

This patch moves out these plaform specific bindings to 'tc2.dts' and
'tc3.dts' respectively.

Change-Id: I9355eeff539a3f2940190aef399b4fb4828cbbac
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Leo Yan
b3a9737ce0 refactor(tc): add platform specific DT files
Currently, the DT binding uses the file 'tc.dts' as a central place for
all TC platforms. And the variables (for different platforms, or FVP vs
FPGA, etc.) are maintained in 'tc_vers.dtsi'.

This patch renames 'tc.dts' to 'tc-base.dtsi' and creates an individual
.dts file for every platform. The purpose is to use 'tc-base.dtsi' for
maintaining common DT binding and every platform's specific definitions
will be moved into its own .dts file. This is a preparation for
sequential refactoring.

It changes to include the header files in platform DTS files but not in
the 'tc-base.dtsi'. This can allow 'tc-base.dtsi' is general enough and
platform DTS files covers platform specific defintions.

Change-Id: I034fb3f8836bcea36e8ad8ae01de41127693b0c6
Signed-off-by: Leo Yan <leo.yan@arm.com>
2024-04-30 14:20:18 +01:00
Renamed from fdts/tc.dts (Browse further)