.. SPDX-License-Identifier: BSD-3-Clause
.. SPDX-FileCopyrightText: Copyright TF-RMM Contributors.

MMU setup and memory management design in RMM
=============================================

This document describes how the MMU is set up and how memory is managed
by the |RMM| implementation.

Physical Address Space
----------------------

The Realm Management Extension (``FEAT_RME``) defines four Physical Address
Spaces (PAS):

- Non-secure
- Secure
- Realm
- Root

|RMM| code and |RMM| data are in Realm PAS memory, loaded and allocated to
Realm PAS at boot time by the EL3 Firmware. This is a static carveout and it
is never changed during the lifetime of the system.

The size of the |RMM| data is fixed at build time. The majority of this is the
granules array (see `Granule state tracking`_ below), whose size is
configurable and proportional to the maximum amount of delegable DRAM supported
by the system.

Realm data and metadata are in Realm PAS memory, which is delegated to the
Realm PAS by the Host at runtime. The |RMM| ABI ensures that this memory cannot
be returned to Non-secure PAS ("undelegated") while it is in use by the
|RMM| or by a Realm.

NS data is in Non-secure PAS memory. The Host is able to change the PAS
of this memory while it is being accessed by the |RMM|. Consequently, the
|RMM| must be able to handle a Granule Protection Fault (GPF) while accessing
NS data as part of RMI handling.

Granule state tracking
----------------------

The |RMM| manages a data structure called the `granules` array, which is
stored in |RMM| data memory.

The `granules` array contains one entry for every Granule of physical
memory which was in Non-secure PAS at |RMM| boot and can be delegated.

Each entry in the `granules` array contains a field `granule_state` which
records the *state* of the Granule. The state is one of the following:

- NS: Not Realm PAS (i.e. Non-secure PAS, Root PAS or Secure PAS)
- Delegated: Realm PAS, but not yet assigned a purpose as either Realm
  data or Realm metadata
- RD: Realm Descriptor
- REC: Realm Execution Context
- REC aux: Auxiliary storage for REC
- Data: Realm data
- RTT: Realm Stage 2 translation tables

As part of RMI SMC handling, a particular granule state can be a pre-condition
for a command, and the granule can transition to a new state as a result. For
more details on the various granule states and their transitions, please refer
to the `Realm Management Monitor (RMM) Specification`_.

For further details, see:

- ``enum granule_state``
- ``struct granule``

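As a rough illustration of the lookup described above, the following C sketch
models the `granules` array as a flat table indexed by granule number. Only the
names ``struct granule``, ``enum granule_state`` and `granule_state` come from
this document. The enumerator names, ``DRAM_BASE``, ``NR_GRANULES``,
``find_granule()`` and ``granule_transition()`` are assumptions made for the
example and do not reflect the actual |RMM| definitions.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   #define GRANULE_SIZE 4096UL       /* assumed 4KB granule */
   #define DRAM_BASE    0x80000000UL /* hypothetical base of delegable DRAM */
   #define NR_GRANULES  1024UL       /* hypothetical build-time limit */

   /* One enumerator per state listed above (names are illustrative). */
   enum granule_state {
       GRANULE_STATE_NS,
       GRANULE_STATE_DELEGATED,
       GRANULE_STATE_RD,
       GRANULE_STATE_REC,
       GRANULE_STATE_REC_AUX,
       GRANULE_STATE_DATA,
       GRANULE_STATE_RTT
   };

   struct granule {
       enum granule_state granule_state;
   };

   /* Zero-initialised, so every granule starts in the NS state. */
   static struct granule granules[NR_GRANULES];

   /* Return the entry tracking 'pa', or NULL if 'pa' is unaligned or untracked. */
   static struct granule *find_granule(uint64_t pa)
   {
       if (((pa % GRANULE_SIZE) != 0UL) || (pa < DRAM_BASE)) {
           return NULL;
       }

       uint64_t idx = (pa - DRAM_BASE) / GRANULE_SIZE;

       return (idx < NR_GRANULES) ? &granules[idx] : NULL;
   }

   /* Move a granule to 'next' only if it is currently in state 'expected'. */
   static int granule_transition(uint64_t pa, enum granule_state expected,
                                 enum granule_state next)
   {
       struct granule *g = find_granule(pa);

       if ((g == NULL) || (g->granule_state != expected)) {
           return -1;
       }

       g->granule_state = next;
       return 0;
   }

In this model, delegating a granule corresponds to
``granule_transition(pa, GRANULE_STATE_NS, GRANULE_STATE_DELEGATED)``, which
fails if the granule is not currently in the NS state. The sketch leaves out
the synchronization that concurrent RMI handlers would require.
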
RMM stage 1 translation regime
------------------------------

|RMM| uses the ``FEAT_VHE`` extension to split the 64-bit VA space into two
address spaces as shown in the figure below:

|full va space|

- The Low VA range: It extends from VA 0x0 up to the maximum VA size
  configured for the region (with a maximum VA size of 48 bits, or 52 bits
  if ``FEAT_LPA2`` is supported). This range is used to map the |RMM| Runtime
  (code, data, shared memory with EL3-FW and any other platform mappings).
- The High VA range: It extends from VA 0xFFFF_FFFF_FFFF_FFFF all the way down
  to an address corresponding to the maximum VA size configured for the region.
  This region is used by the `Stage 1 High VA - Slot Buffer mechanism`_
  as well as the `Per-CPU stack mapping`_.

Between the two ranges there is a range of invalid addresses that is not mapped
to either of them, as shown in the figure above. The ``TCR_EL2.TxSZ`` fields
control the maximum VA size of each region, and |RMM| configures these fields
to fit the mappings used for each region.

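The relationship between the configured VA size and the bounds of each range
can be sketched in C as follows. The arithmetic follows the architectural
definition of ``TxSZ`` (a region spans ``2^(64 - TxSZ)`` bytes). The helper
names are made up for this example and are not part of |RMM|.

.. code-block:: c

   #include <stdint.h>

   /* TxSZ value for a region of 'va_bits' bits (va_bits must be below 64). */
   static inline unsigned int va_bits_to_txsz(unsigned int va_bits)
   {
       return 64U - va_bits;
   }

   /* Highest address of the Low VA range, which starts at 0x0. */
   static inline uint64_t low_va_end(unsigned int va_bits)
   {
       return (UINT64_C(1) << va_bits) - 1U;
   }

   /* Lowest address of the High VA range, which ends at 0xFFFF_FFFF_FFFF_FFFF. */
   static inline uint64_t high_va_base(unsigned int va_bits)
   {
       return ~UINT64_C(0) - ((UINT64_C(1) << va_bits) - 1U);
   }

With a 48-bit region, for instance, the Low VA range ends at
0x0000_FFFF_FFFF_FFFF and the High VA range starts at 0xFFFF_0000_0000_0000.
Every address in between is invalid.
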
The two VA ranges are used for different purposes in RMM, as described below.

Stage 1 Low VA range
^^^^^^^^^^^^^^^^^^^^

The Low VA range is used to create static mappings which are shared across all
the CPUs. It encompasses the RMM executable binary memory and the EL3 Shared
memory region.

The RMM executable binary memory consists of code, RO data and RW data. Note
that the stage 1 translation tables for the Low Region are kept in RO data, so
that once the MMU is enabled, the table mappings are protected from further
modification.

The EL3 shared memory, which is allocated by the EL3 Firmware, is used by the
`RMM-EL3 communications interface`_. A pointer to the beginning of this area
is received by |RMM| during initialization. |RMM| will then map the region in
the .rw area.

The Low VA range is set up by the platform layer as part of platform
initialization.

The following mappings belong to the Low VA Range:

- RMM_CODE
- RMM_RO
- RMM_RW
- RMM_SHARED

Per-platform mappings can also be added if needed, such as the UART for the
FVP platform.

Stage 1 High VA range
^^^^^^^^^^^^^^^^^^^^^

The High VA range is used to create dynamic per-CPU mappings. The tables used
for this are private to each CPU and hence it is possible for every CPU to map
a different PA at a specific VA. This property is used by the `slot-buffer`
mechanism as described later.

In order to allow the mappings for this region to be dynamic, its translation
tables are stored in the RW section of |RMM|, allowing them to be
modified as needed.

For more details see the ``xlat_high_va.c`` file of the xlat library.

The diagram below shows the memory layout for the High VA region.

|high va region|

Stage 1 High VA - Slot Buffer mechanism
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The |RMM| provides a dynamic mapping mechanism called `slot-buffer` in the
high VA region. The assigned VA space for `slot-buffer` is divided into `slots`
of GRANULE_SIZE each.

The |RMM| has a fixed number of `slots` per CPU. Each `slot` is used to map
memory of a particular category. The |RMM| validates that the target physical
granule to be mapped is of the expected `granule_state` by looking up the
corresponding entry in the `granules` array.

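Since every `slot` is GRANULE_SIZE wide, the VA of a slot follows directly from
its index. The C sketch below illustrates the idea. The enumerators of
``enum buffer_slot``, the value of ``SLOT_VIRT_BASE`` and the helper
``slot_to_va()`` are hypothetical and only serve this example.

.. code-block:: c

   #include <stdint.h>

   #define GRANULE_SIZE   4096UL                        /* assumed 4KB granule */
   #define SLOT_VIRT_BASE UINT64_C(0xFFFFFFFFFFE00000)  /* hypothetical base VA */

   /* One slot per category of memory that can be mapped (illustrative). */
   enum buffer_slot {
       SLOT_NS,
       SLOT_RD,
       SLOT_REC,
       SLOT_RTT,
       SLOT_REALM_DATA,
       NR_CPU_SLOTS
   };

   /* Each slot owns exactly one granule-sized window of the High VA range. */
   static inline uint64_t slot_to_va(enum buffer_slot slot)
   {
       return SLOT_VIRT_BASE + ((uint64_t)slot * GRANULE_SIZE);
   }

Because the slot index alone determines the VA, no general-purpose VA allocator
is needed, and the same slot resolves to the same VA on every CPU.
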
The `slot-buffer` mechanism has `slots` for mapping memory of the following
types:

  - Realm metadata: These correspond to the specific Realm and Realm
    Execution Context scheduled on the PE. These mappings are usually only
    valid during the execution of an RMI or RSI handler and are removed
    afterwards. These include Realm Descriptors (RDs), Realm Execution
    Contexts (RECs) and Realm Translation Tables (RTTs).

  - NS data: RMM needs to map NS memory as part of RMIs to access parameters
    passed by the Host or to return arguments to the Host. RMM also needs
    to copy Data provided by the Host as part of populating the Realm
    data memory.

  - Realm data: RMM sometimes needs to temporarily map Realm data memory
    during Realm creation in order to load the Realm image or access buffers
    specified by the Realm as part of RSI commands.

The `slot-buffer` design avoids the need for generic allocation of VA space.
The rationalization of all mappings ever needed for managing a realm via
`slots` is only possible due to the simple nature of the |RMM| design - in
particular, the fact that it is possible to statically determine the types
of objects which need to be mapped into the |RMM|'s address space, and the
maximum number of objects of a given type which need to be mapped at any point
in time.

During Realm entry and Realm exit, the RD is mapped in the "RD" buffer
slot. Once Realm entry or Realm exit is complete, this mapping is
removed. The RD is not mapped during Realm execution.

The REC and the `rmi_rec_run` data structures are both mapped during Realm
execution.

As the `slots` are mapped on the High VA region, each CPU
has its own private translation tables for such mappings, which means
that a particular slot has a fixed VA on every CPU. Since the translation
tables are private to a CPU, the mapping to the slot is private to the CPU.
This allows the interruption and migration of a REC (vCPU) to another CPU with
live memory allocations in RMM. An example of this scenario is when the Realm
attestation token is being created in RMM: a pending IRQ can cause RMM to yield
to the NS Host with live memory allocations in the MbedTLS heap. The NS Host
can schedule the REC on another CPU and, since the mapping for the memory
allocations remains at the same VA, the interrupted Realm token creation can
continue.

The `slot-buffer` implementation in RMM also has some performance optimizations
like caching of TTEs to avoid walking the Stage 1 translation tables for every
map and unmap operation.

As an alternative to using dynamic mappings as required for the RMI command,
the approach of maintaining static mappings for all physical memory was
considered, but rejected on the grounds that this could permit arbitrary
memory access for an attacker who is able to subvert |RMM| execution.

The xlat lib APIs are used by the `slot-buffer` to create dynamic mappings.
These dynamic mappings are stored in the high VA region's ``xlat_ctx``
structure and marked by the xlat library as *TRANSIENT*. This helps xlat lib to
distinguish the Translation Table Entries (TTEs) reserved for dynamic mappings
from invalid ones, as otherwise an unmapped dynamic TTE would be identical to
an INVALID one.

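The need for such a marker can be illustrated with a small C sketch. Bit 0 is
the architectural valid bit of a descriptor. Everything else here, including
the ``DESC_TRANSIENT`` bit position and the helper names, is a hypothetical
encoding chosen for the example and is not taken from the xlat library.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define DESC_VALID     (UINT64_C(1) << 0)  /* architectural valid bit */
   #define DESC_TRANSIENT (UINT64_C(1) << 55) /* hypothetical software marker */
   #define DESC_INVALID   UINT64_C(0)

   /* Build a mapped transient entry for 'pa' (memory attributes omitted). */
   static inline uint64_t tte_map_transient(uint64_t pa)
   {
       return pa | DESC_TRANSIENT | DESC_VALID;
   }

   /* Unmapping drops the address and the valid bit but keeps the marker. */
   static inline uint64_t tte_unmap_transient(uint64_t tte)
   {
       return tte & DESC_TRANSIENT;
   }

   /* True if the hardware would use this entry for translation. */
   static inline bool tte_is_valid(uint64_t tte)
   {
       return (tte & DESC_VALID) != 0U;
   }

   /* True if the entry is reserved for dynamic mappings, mapped or not. */
   static inline bool tte_is_transient(uint64_t tte)
   {
       return (tte & DESC_TRANSIENT) != 0U;
   }

With this encoding an unmapped dynamic entry is not valid, yet it still differs
from ``DESC_INVALID``, so the library can tell the two apart.
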
For further details, see:

- ``enum buffer_slot``
- ``lib/realm/src/buffer.c``

Per-CPU stack mapping
~~~~~~~~~~~~~~~~~~~~~

Each CPU maps its stack to the High VA region, which means that the stack has
the same VA on all the CPUs and it is private to the CPU. At boot time, each
CPU calculates the PA for the start of the stack and maps it to the designated
High VA address space.

The per-CPU VA mapping also includes a gap at the end of the stack VA to detect
any stack underflows. The gap is one page in size.

|RMM| also uses a separate per-CPU stack to handle exceptions and faults.
This stack is allocated below the general one, and it allows |RMM| to
handle a stack overflow fault. There is another page gap of unmapped
memory between both stacks to harden security.

The rest of the VA space available below the exception stack is unused and
therefore left unmapped. The stage 1 translation library will not allow
anything to be mapped there.

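The ordering of stacks and gaps described above can be sketched in C as
follows. The stack sizes, the ``region_top`` parameter and all names are
assumptions made for illustration. The actual sizes and addresses are defined
by the |RMM| sources and the diagram above.

.. code-block:: c

   #include <stdint.h>

   #define PAGE_SIZE      4096UL
   #define STACK_PAGES    5UL /* hypothetical general stack size, in pages */
   #define EH_STACK_PAGES 1UL /* hypothetical exception stack size, in pages */

   struct stack_va_layout {
       uint64_t stack_top;     /* initial SP of the general stack */
       uint64_t stack_base;    /* lowest mapped address of the general stack */
       uint64_t eh_stack_top;  /* initial SP of the exception stack */
       uint64_t eh_stack_base; /* lowest mapped address of the exception stack */
   };

   /* Lay out both stacks downwards from 'region_top', one gap page each. */
   static struct stack_va_layout compute_stack_va_layout(uint64_t region_top)
   {
       struct stack_va_layout l;

       /* Unmapped page above the general stack catches underflows. */
       l.stack_top = region_top - PAGE_SIZE;
       l.stack_base = l.stack_top - (STACK_PAGES * PAGE_SIZE);

       /* Unmapped page between the stacks catches overflows. */
       l.eh_stack_top = l.stack_base - PAGE_SIZE;
       l.eh_stack_base = l.eh_stack_top - (EH_STACK_PAGES * PAGE_SIZE);

       return l;
   }

An overflow of the general stack faults on the gap page instead of silently
corrupting the exception stack, which is then still usable to handle the fault.
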
Stage 1 translation library (xlat library)
------------------------------------------

The |RMM| stage 1 translation management is taken care of by the xlat library.
This library is able to support up to 52-bit addresses and 5 levels of
translation (when ``FEAT_LPA2`` is enabled).

The xlat library is designed to be stateless and it uses the abstraction of
`translation context`, modelled through the ``struct xlat_ctx``. A translation
context stores all the information related to a given VA space, such as the
translation tables, the VA configuration used to initialize the context and any
internal status related to such VA. Once a context has been initialized, its
VA configuration cannot be modified.

At the moment, although the xlat library supports creation of multiple
contexts, it assumes that the caller will only use a single context per
CPU for a given VA region. The library does not offer support to switch
contexts on a CPU at run time. A context can be shared by several CPUs if they
share the same VA configuration and mappings, as in the low VA region.

Dynamic mappings can be created by specifying the ``TRANSIENT`` flag. The
high VA region creates dynamic mappings using this flag.

For further details, see ``lib/xlat``.

RMM executable bootstrap
------------------------

The |RMM| is loaded as a .bin file by the EL3 loader. The size of the sections
in the |RMM| binary as well as the placing of |RMM| code and data into
appropriate sections is controlled by the linker script in the source tree.

Platform initialization code takes care of importing the linker symbols
that define the boundaries of the different sections and creates static
memory mappings that are then used to initialize an ``xlat_ctx`` structure
for the low VA region. The RMM binary sections are flat-mapped and are shared
across all the CPUs on the system. In addition, as |RMM| is compiled as a
Position Independent Executable (PIE) at address 0x0, the Global Offset
Table (GOT) and other relocations in the binary are fixed up with the right
offsets as part of boot. This allows RMM to be run at any physical address as
a PIE regardless of the compile time address.

For further details, see:

- ``runtime/linker.lds``
- ``plat/common/src/plat_common_init.c``
- ``plat/fvp/src/fvp_setup.c``

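The relocation fix-up mentioned above can be pictured with the following C
sketch, which applies ``R_AARCH64_RELATIVE`` entries to an image linked at
address 0x0. The entry layout and the relocation type value come from the ELF
and AArch64 ABI definitions. The function, its parameters and the structure
name are assumptions for illustration and do not describe the actual |RMM| boot
code.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   #define R_AARCH64_RELATIVE 1027U /* relocation type from the AArch64 ELF ABI */

   /* Same layout as an ELF64 Elf64_Rela entry. */
   struct rela_entry {
       uint64_t r_offset; /* location to patch, relative to link address 0x0 */
       uint64_t r_info;   /* relocation type in the low 32 bits */
       int64_t  r_addend; /* link-time address of the target */
   };

   /*
    * Patch every R_AARCH64_RELATIVE entry of an image linked at 0x0.
    * 'image' is where the image bytes can be written and 'load_base' is the
    * address the image runs from. Returns the number of entries patched.
    */
   static size_t fixup_relative_relocs(uint8_t *image, uint64_t load_base,
                                       const struct rela_entry *rela,
                                       size_t count)
   {
       size_t patched = 0U;

       for (size_t i = 0U; i < count; i++) {
           if ((rela[i].r_info & 0xffffffffUL) != R_AARCH64_RELATIVE) {
               continue; /* other relocation types are ignored here */
           }

           uint64_t value = load_base + (uint64_t)rela[i].r_addend;

           memcpy(image + rela[i].r_offset, &value, sizeof(value));
           patched++;
       }

       return patched;
   }

Because the image is linked at 0x0, each addend is already the offset of the
target within the image, so adding the load address yields the run-time
address.
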
_______________________________________________________________________________

.. |full va space| image:: ./diagrams/full_va_space_diagram.drawio.png
   :height: 500
.. |high va region| image:: ./diagrams/high_va_memory_map.drawio.png
   :height: 600
.. _Realm Management Monitor (RMM) Specification: https://developer.arm.com/documentation/den0137/1-0eac5/?lang=en
.. _`RMM-EL3 communications interface`: https://trustedfirmware-a.readthedocs.io/en/latest/components/rmm-el3-comms-spec.html