.. _core:

####
Core
####

.. _interrupt_handling:

Interrupt handling
******************
This section describes how :ref:`optee_os` handles switches of world execution
context based on :ref:`SMC` exceptions and interrupt notifications. Interrupt
notifications are IRQ/FIQ exceptions which may also imply switching of world
execution context: normal world to secure world, or secure world to normal
world.
16
Use cases of world context switch
=================================
This section lists all the cases where optee_os is involved in world context
switches. optee_os executes in the secure world. The world switch is done by the
core's secure monitor level/mode, referred to below as the Monitor.
22
When the normal world invokes the secure world, the normal world executes an SMC
instruction. The SMC exception is always trapped by the Monitor. If the related
service targets the trusted OS, the Monitor will switch to optee_os world
execution. When the secure world returns to the normal world, optee_os executes
an SMC that is caught by the Monitor, which switches back to the normal world.
28
When a secure interrupt is signaled by the Arm GIC, it shall reach the optee_os
interrupt exception vector. If the secure world is executing, optee_os will
handle the interrupt directly from its exception vector. If the normal world is
executing when the secure interrupt is raised, the Monitor vector must handle the
exception and invoke optee_os to serve the interrupt.

When a non-secure interrupt is signaled by the Arm GIC, it shall reach the
normal world interrupt exception vector. If the normal world is executing, it
will handle the exception directly from its exception vector. If the secure
world is executing when the non-secure interrupt is raised, optee_os will
temporarily return to normal world via the Monitor to let normal world
serve the interrupt.
41
42Core exception vectors
43======================
44Monitor vector is ``VBAR_EL3`` in AArch64 and ``MVBAR`` in Armv7-A/AArch32.
45Monitor can be reached while normal world or secure world is executing. The
46executing secure state is known to the Monitor through the ``SCR_NS``.
47
48Monitor can be reached from a SMC exception, an IRQ or FIQ exception (so-called
49interrupts) and from asynchronous aborts. Obviously monitor aborts (data,
50prefetch, undef) are local to the Monitor execution.
51
The Monitor can be external to optee_os (case ``CFG_WITH_ARM_TRUSTED_FW=y``).
If not, optee_os provides a local secure monitor in ``core/arch/arm/sm``.
Armv7-A platforms should use the optee_os secure monitor. Armv8-A platforms are
likely to rely on `Trusted Firmware A`_.
56
57When executing outside the Monitor, the system is executing either in the
58normal world (``SCR_NS=1``) or in the secure world (``SCR_NS=0``). Each world
59owns its own exception vector table (state vector):
60
61 - ``VBAR_EL2`` or ``VBAR_EL1`` non-secure or ``VBAR_EL1`` secure for
62 AArch64.
63 - ``HVBAR`` or ``VBAR`` non-secure or ``VBAR`` secure for Armv7-A and
64 AArch32.
65
66All SMC exceptions are trapped in the Monitor vector. IRQ/FIQ exceptions can be
67trapped either in the Monitor vector or in the state vector of the executing
68world.
69
70When the normal world is executing, the system is configured to route:
71
72 - secure interrupts to the Monitor that will forward to optee_os
73 - non-secure interrupts to the executing world exception vector.
74
75When the secure world is executing, the system is configured to route:
76
77 - secure and non-secure interrupts to the executing optee_os exception
78 vector. optee_os shall forward the non-secure interrupts to the normal
79 world.
80
From the optee_os point of view, non-secure interrupts are always trapped in
the state vector of the executing world. This is reflected by a static value of
``SCR_(IRQ|FIQ)``.
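
The routing rules above can be summarized in a small sketch. It assumes the
GICv2 convention (FIQ carries secure interrupts) and the Armv7-A ``SCR`` bit
layout (``NS`` is bit 0, ``IRQ`` is bit 1, ``FIQ`` is bit 2). The helper name
``scr_for_world()`` is hypothetical and is not part of optee_os.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define SCR_NS  (UINT32_C(1) << 0)  /* Core is in the non-secure state */
    #define SCR_IRQ (UINT32_C(1) << 1)  /* IRQ is taken to the Monitor */
    #define SCR_FIQ (UINT32_C(1) << 2)  /* FIQ is taken to the Monitor */

    /*
     * Hypothetical helper: SCR routing bits to apply when entering a world.
     * SCR_IRQ is never set (static value), SCR_FIQ follows SCR_NS.
     */
    uint32_t scr_for_world(bool normal_world)
    {
        uint32_t scr = 0;

        if (normal_world) {
            /* Secure interrupts (FIQ) must reach the Monitor */
            scr |= SCR_NS | SCR_FIQ;
        }
        /* Secure world: IRQ and FIQ both go to the optee_os state vector */
        return scr;
    }

With these values, an IRQ raised in either world is taken by the state vector
of the executing world, while a FIQ reaches the Monitor only when the normal
world is executing.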
83
84.. _native_foreign_irqs:
85
86Native and foreign interrupts
87=============================
88Two types of interrupt are defined in optee_os:
89
90 - **Native interrupt** - The interrupt handled by optee_os (for example:
91 secure interrupt)
92 - **Foreign interrupt** - The interrupt not handled by optee_os (for
93 example: non-secure interrupt which is handled by normal world)
94
For Arm **GICv2** mode, a native interrupt is sent as FIQ and a foreign
interrupt is sent as IRQ. For Arm **GICv3** mode, a foreign interrupt is sent as
FIQ, which can be handled by either the secure world (AArch32 Monitor mode or
AArch64 EL3) or the normal world. Arm GICv3 mode can be enabled by setting
``CFG_ARM_GICV3=y``. For clarity, this document mainly uses the GICv2 convention
and refers to IRQ as optee_os foreign interrupts and to FIQ as optee_os native
interrupts. Native interrupts must be securely routed to optee_os. Foreign
interrupts, when trapped during secure world execution, might need to be
efficiently routed to the normal world.
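
As an illustration of the two conventions, the sketch below classifies an
exception line as native or foreign. The type and function names are
hypothetical. The GICv3 branch assumes that IRQ and FIQ simply swap roles, as
stated elsewhere in this document.

.. code-block:: c

    #include <stdbool.h>

    enum exception_line { LINE_IRQ, LINE_FIQ };
    enum itr_kind { ITR_NATIVE, ITR_FOREIGN };

    /* Hypothetical helper: who should handle an interrupt seen on 'line'? */
    enum itr_kind classify_interrupt(enum exception_line line, bool gicv3)
    {
        /* GICv2: FIQ is native. GICv3: FIQ is foreign (roles are swapped). */
        bool fiq_is_native = !gicv3;

        if (line == LINE_FIQ)
            return fiq_is_native ? ITR_NATIVE : ITR_FOREIGN;

        return fiq_is_native ? ITR_FOREIGN : ITR_NATIVE;
    }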
104
105Normal World invokes optee_os using SMC
106=======================================
107
108**Entering the Secure Monitor**
109
110The monitor manages all entries and exits of secure world. To enter secure
111world from normal world the monitor saves the state of normal world (general
112purpose registers and system registers which are not banked) and restores the
113previous state of secure world. Then a return from exception is performed and
114the restored secure state is resumed. Exit from secure world to normal world is
115the reverse.
116
117Some general purpose registers are not saved and restored on entry and exit,
118those are used to pass parameters between secure and normal world (see
119ARM_DEN0028A_SMC_Calling_Convention_ for details).
120
121**Entry and exit of Trusted OS**
122
On entry and exit of the Trusted OS, each CPU uses a separate entry stack and
runs with IRQ and FIQ blocked. SMCs are categorised in two flavors: **fast** and
**standard**.

  - For **fast** SMCs, optee_os will execute on the entry stack with IRQ/FIQ
    blocked until the execution returns to normal world.

  - For **standard** SMCs, optee_os will at some point execute the requested
    service with interrupts unblocked. In order to handle interrupts, mainly
    forwarding of foreign interrupts, optee_os assigns a trusted thread
    (`core/arch/arm/kernel/thread.c`_) to the SMC request. The trusted thread
    stores the execution context of the requested service. This context can be
    suspended and resumed as the requested service executes and is
    interrupted. The trusted thread is released only once the service
    execution returns with a completion status.

    For **standard** SMCs, optee_os allocates or resumes a trusted thread and
    then unblocks the IRQ/FIQ lines. When optee_os needs to invoke the normal
    world from a foreign interrupt or a remote service call, optee_os blocks
    IRQ/FIQ and suspends the trusted thread. When suspending, optee_os gets
    back to the entry stack.

  - **Both** fast and standard SMCs end on the entry stack with IRQ/FIQ blocked,
    and optee_os invokes the Monitor through an SMC to return to the normal
    world.
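
In the SMC Calling Convention referenced above, bit 31 of the function
identifier distinguishes the two flavors: the bit is set for a fast call and
clear for a yielding (standard) call. The following sketch shows how an entry
path could be selected from that bit. The enum and function names are
hypothetical and do not reflect the actual optee_os entry code.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* SMC Calling Convention: bit 31 set means fast call */
    #define SMC_TYPE_FAST (UINT32_C(1) << 31)

    enum entry_path {
        ENTRY_FAST_ON_ENTRY_STACK,  /* IRQ/FIQ stay blocked */
        ENTRY_STD_ON_TRUSTED_THREAD /* thread assigned, IRQ/FIQ unblocked */
    };

    bool smc_is_fast(uint32_t func_id)
    {
        return (func_id & SMC_TYPE_FAST) != 0;
    }

    /* Hypothetical dispatcher: choose how the request will be executed */
    enum entry_path select_entry_path(uint32_t func_id)
    {
        return smc_is_fast(func_id) ? ENTRY_FAST_ON_ENTRY_STACK
                                    : ENTRY_STD_ON_TRUSTED_THREAD;
    }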
148
149.. figure:: ../images/core/interrupt_handling/tee_invoke.png
150 :figclass: align-center
151
152 SMC entry to secure world
153
154Deliver non-secure interrupts to Normal World
155=============================================
156This section uses the Arm GICv1/v2 conventions: IRQ signals non-secure
157interrupts while FIQ signals secure interrupts. On a GICv3 configuration, one
158should exchange IRQ and FIQ in this section.
159
160**Forward a Foreign Interrupt from Secure World to Normal World**
161
When an IRQ is received in secure world as an IRQ exception, secure world:

  1. Saves the trusted thread context (entire state of all processor modes for
     Armv7-A)

  2. Blocks (masks) all interrupts (IRQ and FIQ)

  3. Switches to the entry stack

  4. Issues an SMC with a value that indicates to normal world that an IRQ has
     been delivered and that the last SMC call should be continued
173
The monitor restores normal world context with a return code indicating that an
IRQ is about to be delivered. Once the IRQ has been served, normal world issues
a new SMC indicating that the last SMC should be continued.

The monitor restores the secure world context. Secure world locates the
previously saved thread context, checks that a return from IRQ is what is being
requested, restores the context, and lets the secure world IRQ handler return
from exception to the point where execution is resumed.
182
Note that the monitor itself does not know/care that it has just forwarded an
IRQ to normal world. The bookkeeping is done in the trusted thread handling in
the Trusted OS. Normal world is responsible for deciding when the secure world
thread should resume execution (for details, see :ref:`thread_handling`).
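
The bookkeeping described above can be sketched as a small state machine. This
is only an illustration: the structure, the function names and the numeric
codes are hypothetical and are not the real OP-TEE SMC ABI values.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    enum thread_state { THREAD_FREE, THREAD_ACTIVE, THREAD_SUSPENDED };

    /* Hypothetical codes exchanged with normal world through the Monitor */
    #define RET_FOREIGN_INTR   UINT32_C(1) /* "an IRQ has been delivered" */
    #define CALL_RESUME_THREAD UINT32_C(2) /* "continue the last SMC" */

    struct trusted_thread {
        enum thread_state state;
        bool waiting_irq_return; /* suspended by a foreign interrupt */
    };

    /* Secure world side: suspend the thread and pick the SMC return value */
    uint32_t thread_foreign_intr_exit(struct trusted_thread *t)
    {
        t->state = THREAD_SUSPENDED;  /* thread context is saved */
        t->waiting_irq_return = true; /* remember why it was suspended */
        return RET_FOREIGN_INTR;      /* value passed with the exit SMC */
    }

    /* Secure world side: normal world asks to continue the last SMC */
    bool thread_resume_from_intr(struct trusted_thread *t, uint32_t call)
    {
        if (call != CALL_RESUME_THREAD || t->state != THREAD_SUSPENDED ||
            !t->waiting_irq_return)
            return false; /* not a valid return from IRQ */

        t->waiting_irq_return = false;
        t->state = THREAD_ACTIVE; /* context restored, handler returns */
        return true;
    }

The Monitor does not appear in this sketch because, as noted above, it only
swaps world contexts and keeps no record of the forwarded interrupt.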
187
188.. figure:: ../images/core/interrupt_handling/irq.png
189 :figclass: align-center
190
191 IRQ received in secure world and forwarded to normal world
192
**Deliver a non-secure interrupt to normal world when SCR_NS is set**

Since ``SCR_IRQ`` is cleared, an IRQ will be delivered using the state vector
(``VBAR``) in the normal world. The IRQ is received as any other exception by
normal world; the monitor and the Trusted OS are not involved at all.
198
199Deliver secure interrupts to Secure World
200=========================================
201This section uses the Arm GICv1/v2 conventions: FIQ signals secure interrupts
202while IRQ signals non-secure interrupts. On a GICv3 configuration, one should
203exchange IRQ and FIQ in this section. A FIQ can be received during two different
204states, either in normal world (``SCR_NS`` is set) or in secure world
205(``SCR_NS`` is cleared). When the secure monitor is active (Armv8-A EL3 or
206Armv7-A Monitor mode) FIQ is masked. FIQ reception in the two different states
207is described below.
208
209**Deliver FIQ to secure world when SCR_NS is set**
210
When the monitor gets an FIQ exception it:

  1. Saves normal world context and restores secure world context from last
     secure world exit (which will have IRQ and FIQ blocked)
  2. Clears ``SCR_FIQ`` when clearing ``SCR_NS``
  3. Sets FIQ as parameter to secure world entry
  4. Does a return from exception into secure context
  5. Secure world unmasks FIQs because of the FIQ parameter
  6. FIQ is received as an exception using the state vector
  7. The state vector handler returns from exception in secure world
  8. Secure world issues an SMC to return to normal world
  9. Monitor saves secure world context and restores normal world context
  10. Does a return from exception into restored context
224
225.. figure:: ../images/core/interrupt_handling/fiq.png
226 :figclass: align-center
227
228 FIQ received when SCR_NS is set
229
230.. figure:: ../images/core/interrupt_handling/irq_fiq.png
231 :figclass: align-center
232
233 FIQ received while processing an IRQ forwarded from secure world
234
235**Deliver FIQ to secure world when SCR_NS is cleared**
236
Since ``SCR_FIQ`` is cleared when ``SCR_NS`` is cleared, a FIQ will be delivered
using the state vector (``VBAR``) in secure world. The FIQ is received as any
other exception by the Trusted OS; the monitor is not involved at all.
240
241Trusted thread scheduling
242=========================
243**Trusted thread for standard services**
244
OP-TEE standard services are carried through standard SMCs. Execution of these
services can be interrupted by foreign interrupts. To suspend and restore the
service execution, optee_os assigns a trusted thread at standard SMC entry.

The trusted thread terminates when optee_os returns to the normal world with a
service completion status.

A trusted thread execution can be interrupted by a native interrupt. In this
case the native interrupt is handled by the interrupt exception handlers and,
once served, optee_os returns to the execution of the trusted thread.

A trusted thread execution can be interrupted by a foreign interrupt. In this
case, optee_os suspends the trusted thread and invokes the normal world through
the Monitor (the so-called optee_os RPC services). The trusted thread resumes
only once normal world invokes optee_os with the RPC service status.
260
261A trusted thread execution can lead optee_os to invoke a service in normal
262world: access a file, get the REE current time, etc. The trusted thread is
263suspended/resumed during remote service execution.
264
265**Scheduling considerations**
266
When a trusted thread is interrupted by a foreign interrupt, or when optee_os
invokes a normal world service, the normal world gets the opportunity to
reschedule the running applications. The trusted thread will resume only once
the client application is scheduled back. Thus, a trusted thread execution
follows the scheduling of the normal world caller context.
272
273Optee_os does not implement any thread scheduling. Each trusted thread is
274expected to track a service that is invoked from the normal world and should
275return to it with an execution status.
276
277The OP-TEE Linux driver (as implemented in `drivers/tee/optee`_ since Linux
278kernel 4.12) is designed so that the Linux thread invoking OP-TEE gets assigned
279a trusted thread on TEE side. The execution of the trusted thread is tied to the
280execution of the caller Linux thread which is under the Linux kernel scheduling
281decision. This means trusted threads are scheduled by the Linux kernel.
282
283**Trusted thread constraints**
284
285TEE core handles a static number of trusted threads, see ``CFG_NUM_THREADS``.
286
Trusted threads are only expensive on memory constrained systems, mainly
regarding the execution stack size.
289
290On SMP systems, optee_os can execute several trusted threads in parallel if the
291normal world supports scheduling of processes. Even on UP systems, supporting
292several trusted threads in optee_os helps normal world scheduler to be
293efficient.
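
A minimal sketch of a static pool of trusted threads is shown below. It only
illustrates the idea of a fixed number of slots assigned at standard SMC entry:
``NUM_THREADS`` stands in for ``CFG_NUM_THREADS``, the function names are
hypothetical, and the locking that an SMP system would need is left out.

.. code-block:: c

    #include <stdbool.h>

    #define NUM_THREADS 2 /* stands in for CFG_NUM_THREADS */

    static bool thread_in_use[NUM_THREADS];

    /* Returns a free thread slot, or -1 when all trusted threads are busy */
    int thread_alloc(void)
    {
        for (int i = 0; i < NUM_THREADS; i++) {
            if (!thread_in_use[i]) {
                thread_in_use[i] = true;
                return i;
            }
        }
        return -1; /* the caller has to report "busy" to normal world */
    }

    /* Called when the service returns with a completion status */
    void thread_free(int id)
    {
        if (id >= 0 && id < NUM_THREADS)
            thread_in_use[id] = false;
    }

When no slot is free, the request cannot be served immediately, which is why
the number of trusted threads bounds the number of standard calls that can be
in progress at the same time.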
294
295----
296
297.. _memory_objects:
298
299Memory objects
300**************
301A memory object, **MOBJ**, describes a piece of memory. The interface provided
302is mostly abstract when it comes to using the MOBJ to populate translation
303tables etc. There are different kinds of MOBJs describing:
304
  - Physically contiguous memory

      - created with ``mobj_phys_alloc(...)``.

  - Virtual memory

      - one instance with the name ``mobj_virt`` available.
      - spans the entire virtual address space.

  - Physically contiguous memory allocated from a ``tee_mm_pool_t *``

      - created with ``mobj_mm_alloc(...)``.

  - Paged memory

      - created with ``mobj_paged_alloc(...)``.
      - only contains the supplied size and makes ``mobj_is_paged(...)``
        return true if supplied as argument.

  - Secure copy paged shared memory

      - created with ``mobj_seccpy_shm_alloc(...)``.
      - makes ``mobj_is_paged(...)`` and ``mobj_is_secure(...)`` return true
        if supplied as argument.
324
325----
326
327.. _mmu:
328
329MMU
330***
Translation tables
==================
OP-TEE uses several L1 translation tables, one large spanning 4 GiB and two or
more small tables spanning 32 MiB. The large translation table handles kernel
mode mapping and matches all addresses not covered by the small translation
tables. The small translation tables are assigned per thread and cover the
mapping of the virtual memory space for one TA context.

The split of the memory space between the small and large translation tables is
configured by TTBCR. TTBR1 always points to the large translation table. TTBR0
points to a small translation table when user mapping is active and to the large
translation table when no user mapping is currently active. For details about
registers etc., please refer to a Technical Reference Manual for your
architecture, for example `Cortex-A53 TRM`_.

The translation tables have certain alignment constraints: the alignment (of the
physical address) has to be the same as the size of the translation table. The
translation tables are statically allocated to avoid fragmentation of memory due
to the alignment constraints.
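
The sizes above can be checked with a little arithmetic. The sketch below
assumes the Armv7-A short-descriptor format, where each L1 entry is 4 bytes and
maps 1 MiB, and assumes ``TTBCR.N = 7`` so that the lowest 2^(32-7) bytes
(32 MiB) are translated through TTBR0. The function names are hypothetical.

.. code-block:: c

    #include <stdint.h>

    #define TTBCR_N       7u  /* assumed: 2^(32 - 7) = 32 MiB through TTBR0 */
    #define L1_ENTRY_SIZE 4u  /* bytes per short-descriptor L1 entry */
    #define SECTION_SHIFT 20u /* each L1 entry maps 1 MiB */

    /* Returns 0 if 'va' is translated through TTBR0, 1 if through TTBR1 */
    int ttbr_for_va(uint32_t va)
    {
        /* TTBR0 is used only when the top N bits of the address are zero */
        return (va >> (32u - TTBCR_N)) == 0 ? 0 : 1;
    }

    /* Size in bytes, and thus required alignment, of an L1 table */
    uint32_t l1_table_size(uint64_t span)
    {
        return (uint32_t)(span >> SECTION_SHIFT) * L1_ENTRY_SIZE;
    }

Under these assumptions a small table occupies 128 bytes and the large table
16 KiB, and each must be aligned to its own size.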
350
351Each thread has one small L1 translation table of its own. Each TA context has a
352compact representation of its L1 translation table. The compact representation
353is used to initialize the thread specific L1 translation table when the TA
354context is activated.
355
356.. graphviz::
357
358 digraph xlat_table {
359 graph [
360 rankdir = "LR"
361 ];
362 node [
363 fontsize = "16"
364 shape = "ellipse"
365 ];
366 edge [
367 ];
368 "node_ttb" [
369 label = "<f0> TTBR0 | <f1> TTBR1"
370 shape = "record"
371 ];
372 "node_large_l1" [
373 label = "<f0> Large L1\nSpans 4 GiB"
374 shape = "record"
375 ];
376 "node_small_l1" [
377 label = "Small L1\nSpans 32 MiB\nper entry | <f0> 0 | <f1> 1 | ... | <fn> n"
378 shape = "record"
379 ];
380
381 "node_ttb":f0 -> "node_small_l1":f0 [ label = "Thread 0 ctx active" ];
382 "node_ttb":f0 -> "node_small_l1":f1 [ label = "Thread 1 ctx active" ];
383 "node_ttb":f0 -> "node_small_l1":fn [ label = "Thread n ctx active" ];
384 "node_ttb":f0 -> "node_large_l1" [ label="No active ctx" ];
385 "node_ttb":f1 -> "node_large_l1";
386 }
387
388
Switching to user mode
======================
This section only applies with the following configuration flags:

  - ``CFG_WITH_LPAE=n``
  - ``CFG_CORE_UNMAP_CORE_AT_EL0=y``

When switching to user mode only a minimal kernel mode mapping is kept. This is
achieved by selecting a zeroed out big L1 translation table in TTBR1 when
transitioning to user mode. When returning to kernel mode the original L1
translation table is restored in TTBR1.
400
Switching to normal world
=========================
When switching to normal world, either via a foreign interrupt (see
:ref:`native_foreign_irqs`) or an RPC, there is a chance that secure world will
resume execution on a different CPU. This means that the new CPU needs to be
configured with the context of the currently active TA. This is solved by always
setting the TA context in the CPU when resuming execution. There is room for
improvement here, since it is more likely than not that the same CPU resumes
execution in secure world.
410
411.. todo::
412
413 Joakim: Jens? Didn't you do some tweaks here already? I.e., "room for
414 improvements" above?
415
416----
417
418.. _pager:
419
420Pager
421*****
422OP-TEE currently requires >256 KiB RAM for OP-TEE kernel memory. This is not a
423problem if OP-TEE uses TrustZone protected DDR, but for security reasons OP-TEE
424may need to use TrustZone protected SRAM instead. The amount of available SRAM
425varies between platforms, from just a few KiB up to over 512 KiB. Platforms with
426just a few KiB of SRAM cannot be expected to be able to run a complete TEE
427solution in SRAM. But those with 128 to 256 KiB of SRAM can be expected to have
428a capable TEE solution in SRAM. The pager provides a solution to this by demand
429paging parts of OP-TEE using virtual memory.
430
Secure memory
=============
TrustZone protected SRAM is generally considered more secure than TrustZone
protected DRAM as there are usually more attack vectors on DRAM. The attack
vectors are hardware dependent and can be different for different platforms.
436
Backing store
=============
TrustZone protected DRAM or, in some cases, non-secure DRAM is used as backing
store. The data in the backing store is integrity protected with one hash
(SHA-256) per page (4 KiB). Read-only pages are not encrypted since the OP-TEE
binary itself is not encrypted.
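
As a rough illustration of the cost of this integrity protection, the sketch
below computes how many bytes of hashes are needed for a pageable region of a
given size. The constant and function names are hypothetical.

.. code-block:: c

    #include <stddef.h>

    #define PAGER_PAGE_SIZE   4096u /* one hash per 4 KiB page */
    #define PAGER_SHA256_SIZE 32u   /* a SHA-256 digest is 32 bytes */

    /* Bytes of hash data needed to cover 'paged_size' bytes of backing store */
    size_t backing_store_hash_bytes(size_t paged_size)
    {
        /* Round up so that a partially filled last page still gets a hash */
        size_t npages = (paged_size + PAGER_PAGE_SIZE - 1) / PAGER_PAGE_SIZE;

        return npages * PAGER_SHA256_SIZE;
    }

For example, 256 KiB of pageable data corresponds to 64 pages and therefore
2 KiB of hashes.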
443
444Partitioning of memory
445======================
446The code that handles demand paging must always be available as it would
447otherwise lead to deadlock. The virtual memory is partitioned as:
448
+--------------+-------------------+
| Type         | Sections          |
+==============+===================+
| unpaged      | | text            |
|              | | rodata          |
|              | | data            |
|              | | bss             |
|              | | heap1           |
|              | | nozi            |
|              | | heap2           |
+--------------+-------------------+
| init / paged | | text_init       |
|              | | rodata_init     |
+--------------+-------------------+
| paged        | | text_pageable   |
|              | | rodata_pageable |
+--------------+-------------------+
| demand alloc |                   |
+--------------+-------------------+
468
Here ``nozi`` stands for "not zero initialized". This section contains entry
stacks (thread stack when TEE pager is not enabled) and translation tables (TEE
pager cached translation table when the pager is enabled and LPAE MMU is used).

The ``init`` area is available when OP-TEE is initializing and contains
everything that is needed to initialize the pager. After the pager has been
initialized this area will be used for demand paging instead.

The ``demand alloc`` area is a special area where the pages are allocated and
removed from the pager on demand. Those pages are returned when OP-TEE does not
need them any longer. The thread stacks currently belong to this area. This
means that when a stack is not used the physical pages can be used by the pager
for better performance.

The technique to gather code in the different areas is based on compiling all
functions and data into separate sections. The unpaged text and rodata are then
gathered by linking all object files with ``--gc-sections`` to eliminate
sections that are outside the dependency graph of the entry functions for
unpaged functions. A script analyzes this ELF file and generates the bits of the
final link script. The process is repeated for init text and rodata. What is
not "unpaged" or "init" becomes "paged".
490
491Partitioning of the binary
492==========================
493.. note::
494 The struct definitions provided in this section are explicitly covered by
495 the following dual license:
496
497 .. code-block:: none
498
499 SPDX-License-Identifier: (BSD-2-Clause OR GPL-2.0)
500
501The binary is partitioned into four parts as:
502
503
+----------+
| Binary   |
+==========+
| Header   |
+----------+
| Init     |
+----------+
| Hashes   |
+----------+
| Pageable |
+----------+
515
516The header is defined as:
517
.. code-block:: c

    #define OPTEE_MAGIC             0x4554504f
    #define OPTEE_VERSION           1
    #define OPTEE_ARCH_ARM32        0
    #define OPTEE_ARCH_ARM64        1

    struct optee_header {
        uint32_t magic;
        uint8_t version;
        uint8_t arch;
        uint16_t flags;
        uint32_t init_size;
        uint32_t init_load_addr_hi;
        uint32_t init_load_addr_lo;
        uint32_t init_mem_usage;
        uint32_t paged_size;
    };
536
The header is only used by the loader of OP-TEE, not OP-TEE itself. To
initialize OP-TEE the loader loads the complete binary into memory and copies
the ``init_size`` bytes that follow the header to
``(init_load_addr_hi << 32 | init_load_addr_lo)``. ``init_mem_usage`` is used by
the loader to check that there is enough physical memory available for OP-TEE
to be able to initialize at all. The loader supplies in ``r0/x0`` the address
of the first byte that was not copied (the data following the copied init part)
and jumps to the load address to start OP-TEE.
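
A loader could validate the header and compute the load address along the
lines of the sketch below. The two helper functions are hypothetical and only
illustrate the use of the fields. Note that the high word has to be widened to
64 bits before shifting, since shifting a 32-bit value by 32 is undefined in C.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define OPTEE_MAGIC   0x4554504f
    #define OPTEE_VERSION 1

    struct optee_header {
        uint32_t magic;
        uint8_t version;
        uint8_t arch;
        uint16_t flags;
        uint32_t init_size;
        uint32_t init_load_addr_hi;
        uint32_t init_load_addr_lo;
        uint32_t init_mem_usage;
        uint32_t paged_size;
    };

    /* Hypothetical check: is this a version 1 OP-TEE header? */
    bool optee_header_is_valid(const struct optee_header *hdr)
    {
        return hdr->magic == OPTEE_MAGIC && hdr->version == OPTEE_VERSION;
    }

    /* Hypothetical helper: 64-bit address the init part is copied to */
    uint64_t optee_init_load_addr(const struct optee_header *hdr)
    {
        /* Widen first, then shift */
        return ((uint64_t)hdr->init_load_addr_hi << 32) |
               hdr->init_load_addr_lo;
    }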
545
In addition to the overall binary with the partitions described above, three
extra binaries are generated during the build process for loaders that support
loading separate binaries:
549
+-----------+
| v2 binary |
+===========+
| Header    |
+-----------+

+-----------+
| v2 binary |
+===========+
| Init      |
+-----------+
| Hashes    |
+-----------+

+-----------+
| v2 binary |
+===========+
| Pageable  |
+-----------+
569
In this case, loaders load the header binary first to get the image list and
the information about each image, and then load each image to the load address
assigned to it in the structure. These binaries are named with a ``v2`` suffix
to distinguish them from the existing binaries. The header format is updated to
help loaders load binaries efficiently:
575
.. code-block:: c

    #define OPTEE_IMAGE_ID_PAGER    0
    #define OPTEE_IMAGE_ID_PAGED    1

    struct optee_image {
        uint32_t load_addr_hi;
        uint32_t load_addr_lo;
        uint32_t image_id;
        uint32_t size;
    };

    struct optee_header_v2 {
        uint32_t magic;
        uint8_t version;
        uint8_t arch;
        uint16_t flags;
        uint32_t nb_images;
        struct optee_image optee_image[];
    };
596
The magic number and architecture are identical to the original. The version is
increased to two. ``load_addr_hi`` and ``load_addr_lo`` may be ``0xFFFFFFFF``
for the pageable binary, since the pageable part may get loaded by the loader at
a dynamically available position. ``image_id`` indicates how the loader handles
the current binary. Loaders that don't support separate loading just ignore all
v2 binaries.
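
The sketch below shows how a loader might interpret one ``struct optee_image``
entry. It assumes that a dynamically placed image has both address words set to
``0xFFFFFFFF``. The macro ``LOAD_ADDR_UNSPECIFIED`` and the helper functions
are hypothetical names introduced for this example.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define OPTEE_IMAGE_ID_PAGER 0
    #define OPTEE_IMAGE_ID_PAGED 1

    /* Hypothetical name for the "no fixed address" marker */
    #define LOAD_ADDR_UNSPECIFIED UINT32_C(0xFFFFFFFF)

    struct optee_image {
        uint32_t load_addr_hi;
        uint32_t load_addr_lo;
        uint32_t image_id;
        uint32_t size;
    };

    /* True if the loader is free to choose where to place this image */
    bool optee_image_addr_is_dynamic(const struct optee_image *img)
    {
        return img->load_addr_hi == LOAD_ADDR_UNSPECIFIED &&
               img->load_addr_lo == LOAD_ADDR_UNSPECIFIED;
    }

    /* Fixed 64-bit load address; only meaningful if not dynamic */
    uint64_t optee_image_load_addr(const struct optee_image *img)
    {
        return ((uint64_t)img->load_addr_hi << 32) | img->load_addr_lo;
    }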
602
603Initializing the pager
604======================
605The pager is initialized as early as possible during boot in order to minimize
606the "init" area. The global variable ``tee_mm_vcore`` describes the virtual
607memory range that is covered by the level 2 translation table supplied to
608``tee_pager_init(...)``.
609
610Assign pageable areas
611---------------------
612A virtual memory range to be handled by the pager is registered with a call to
613``tee_pager_add_core_area()``.
614
.. code-block:: c

    bool tee_pager_add_area(tee_mm_entry_t *mm,
                            uint32_t flags,
                            const void *store,
                            const void *hashes);
621
622which takes a pointer to ``tee_mm_entry_t`` to tell the range, flags to tell how
623memory should be mapped (readonly, execute etc), and pointers to backing store
624and hashes of the pages.
625
626Assign physical pages
627---------------------
Physical SRAM pages are supplied by calling ``tee_pager_add_pages(...)``:

.. code-block:: c

    void tee_pager_add_pages(tee_vaddr_t vaddr,
                             size_t npages,
                             bool unmap);

``tee_pager_add_pages(...)`` takes the physical address stored in the entry
mapping the virtual address ``vaddr`` and ``npages`` entries after that and uses
them to map new pages when needed. The ``unmap`` parameter tells whether the
pages should be unmapped immediately since they do not contain initialized data
or be kept mapped until they need to be recycled. The pages in the "init" area
are supplied with ``unmap == false`` since those pages have valid content and
are in use.
643
Invocation
==========
The pager is invoked as part of the abort handler. A pool of physical pages is
used to map different virtual addresses. When a new virtual address needs to be
mapped, a free physical page is mapped at the new address; if a free physical
page cannot be found, the oldest physical page is selected instead. When the
page is mapped, new data is copied from the backing store and the hash of the
page is verified. If it is OK, the pager returns from the exception to resume
the execution.
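
The following toy model sketches that sequence for read-only pages: pick a free
physical page or else the oldest one, copy the data from the backing store, and
verify the hash. It is not the optee_os implementation. All names are
hypothetical, the page size is shrunk to keep the example small, and a simple
FNV-1a checksum stands in for SHA-256.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define SK_PAGE_SIZE  64u /* toy page size, not 4 KiB */
    #define SK_NUM_VPAGES 4u  /* pages in the backing store */
    #define SK_NUM_PPAGES 2u  /* physical pages in the pool */
    #define SK_UNMAPPED   UINT32_MAX

    static uint8_t sk_backing_store[SK_NUM_VPAGES][SK_PAGE_SIZE];
    static uint32_t sk_page_hash[SK_NUM_VPAGES];
    static uint8_t sk_phys_pages[SK_NUM_PPAGES][SK_PAGE_SIZE];
    static uint32_t sk_mapped_vpage[SK_NUM_PPAGES] = { SK_UNMAPPED, SK_UNMAPPED };
    /* Round-robin cursor: always points at a free or at the oldest page */
    static uint32_t sk_next_victim;

    /* Placeholder checksum (FNV-1a); the real pager uses SHA-256 */
    uint32_t sk_hash(const uint8_t *data, size_t len)
    {
        uint32_t h = 2166136261u;

        for (size_t i = 0; i < len; i++) {
            h ^= data[i];
            h *= 16777619u;
        }
        return h;
    }

    /* Fill a backing store page and record its reference hash */
    void sk_store_page(uint32_t vpage, uint8_t fill)
    {
        memset(sk_backing_store[vpage], fill, SK_PAGE_SIZE);
        sk_page_hash[vpage] = sk_hash(sk_backing_store[vpage], SK_PAGE_SIZE);
    }

    /* Returns the physical page now holding 'vpage', or -1 on failure */
    int sk_pager_handle_fault(uint32_t vpage)
    {
        uint32_t slot = sk_next_victim;

        if (vpage >= SK_NUM_VPAGES)
            return -1;

        /* Evict whatever is mapped; read-only data needs no write back */
        sk_mapped_vpage[slot] = SK_UNMAPPED;
        memcpy(sk_phys_pages[slot], sk_backing_store[vpage], SK_PAGE_SIZE);

        /* Refuse to map data that does not match the recorded hash */
        if (sk_hash(sk_phys_pages[slot], SK_PAGE_SIZE) != sk_page_hash[vpage])
            return -1;

        sk_mapped_vpage[slot] = vpage;
        sk_next_victim = (slot + 1) % SK_NUM_PPAGES;
        return (int)slot;
    }

Because the pool is filled in order, a single round-robin cursor is enough here
to express "use a free page first, otherwise the oldest one".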
653
654Paging of user TA
655=================
656Paging of user TAs can optionally be enabled with ``CFG_PAGED_USER_TA=y``.
657Paging of user TAs is analogous to paging of OP-TEE kernel parts but with a few
658differences:
659
660 - Read/write pages are paged in addition to read-only pages
661 - Page tables are managed dynamically
662
``tee_pager_add_uta_area(...)`` is used to set up the initial read/write mapping
needed when populating the TA. When the TA is fully populated and relocated,
``tee_pager_set_uta_area_attr(...)`` changes the mapping of the area to the
strict permissions used when the TA is running.
667
668----
669
670.. _stacks:
671
672Stacks
673******
674Different stacks are used during different stages. The stacks are:
675
  - **Secure monitor stack** (128 bytes), bound to the CPU. Only available if
    OP-TEE is compiled with a secure monitor, which is always the case if the
    target is Armv7-A but never for Armv8-A.

  - **Temp stack** (small, ~1KB), bound to the CPU. Used when transitioning
    from one state to another. Interrupts are always disabled when using this
    stack, and aborts are fatal when using the temp stack.

  - **Abort stack** (medium, ~2KB), bound to the CPU. Used when trapping a data
    or prefetch abort. Aborts from user space are never fatal; only the TA is
    killed. Aborts from kernel mode are used by the pager to do the demand
    paging; if the pager is disabled, all kernel mode aborts are fatal.

  - **Thread stack** (large, ~8KB), not bound to the CPU but instead used by the
    current thread/task. Interrupts are usually enabled when using this stack.
691
Notes for Armv7-A/AArch32
    .. list-table::
        :header-rows: 1
        :widths: 1 5

        * - Stack
          - Comment

        * - Temp
          - Assigned to ``SP_SVC`` during entry/exit, always assigned to
            ``SP_IRQ`` and ``SP_FIQ``

        * - Abort
          - Always assigned to ``SP_ABT``

        * - Thread
          - Assigned to ``SP_SVC`` while a thread is active

Notes for AArch64
    There are only two stack pointers, ``SP_EL1`` and ``SP_EL0``, available for
    OP-TEE in AArch64. When an exception is received the stack pointer is always
    ``SP_EL1``, which is used temporarily while assigning an appropriate stack
    pointer for ``SP_EL0``. ``SP_EL1`` is always assigned the value of
    ``thread_core_local[cpu_id]``. This structure has some spare space for
    temporary storage of registers and also keeps the relevant stack pointers.
    In general when we talk about assigning a stack pointer to the CPU below we
    mean ``SP_EL0``.
719
720Boot
721====
722During early boot the CPU is configured with the temp stack which is used until
723OP-TEE exits to normal world the first time.
724
725Notes for AArch64
726 ``SPSEL`` is always ``0`` on entry/exit to have ``SP_EL0`` acting as stack
727 pointer.
728
729Normal entry
730============
731Each time OP-TEE is entered from normal world the temp stack is used as the
732initial stack. For fast calls, this is the only stack used. For normal calls an
733empty thread slot is selected and the CPU switches to that stack.
734
735Normal exit
736===========
Normal exit occurs when a thread has finished its task and the thread is freed.
When the main thread function, ``tee_entry_std(...)``, returns, interrupts are
disabled and the CPU switches to the temp stack instead. The thread is freed and
OP-TEE exits to normal world.
741
RPC exit
========
RPC exit occurs when OP-TEE needs some service from normal world. RPC can
currently only be performed when a thread is in running state. RPC is initiated
with a call to ``thread_rpc(...)``, which saves the state in a way that when the
thread is restored it will continue at the next instruction as if this function
did a normal return. The CPU switches to the temp stack before returning to
normal world.
750
Foreign interrupt exit
======================
Foreign interrupt exit occurs when OP-TEE receives a foreign interrupt. For Arm
GICv2 mode, a foreign interrupt is sent as IRQ, which is always handled in
normal world. Foreign interrupt exit is similar to RPC exit, but it is
``thread_irq_handler(...)`` and ``elx_irq(...)`` (respectively for
Armv7-A/AArch32 and for AArch64) that save the thread state instead. The thread
is resumed in the same way though. For Arm GICv3 mode, a foreign interrupt is
sent as FIQ, which could be handled by either secure world (EL3 in AArch64) or
normal world. This mode is not supported yet.

Notes for Armv7-A/AArch32
    ``SP_IRQ`` is initialized to the temp stack instead of a separate stack.
    Prior to exiting to normal world the CPU state is changed to SVC and the
    temp stack is selected.

Notes for AArch64
    ``SP_EL0`` is assigned the temp stack and is selected during IRQ processing.
    The original ``SP_EL0`` is saved in the thread context to be restored when
    resuming.
771
772Resume entry
773============
OP-TEE is entered using the temp stack in the same way as for normal entry. The
thread to resume is looked up and its state is restored to resume execution. The
procedure to resume from an RPC exit or a foreign interrupt exit is exactly the
same.
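
The entry and exit paths described above can be pictured as a small state
machine over thread slots. The following is a minimal sketch and not OP-TEE's
implementation: the names ``sketch_thread``, ``sketch_std_entry``,
``sketch_suspend``, ``sketch_resume`` and ``sketch_exit`` and the slot count
are hypothetical, and the real code additionally saves registers and switches
stacks.

.. code-block:: c

    #include <stddef.h>

    #define SKETCH_NUM_THREADS 2 /* assumed slot count, illustration only */

    enum sketch_state {
        SKETCH_FREE,      /* slot is available for a new normal call */
        SKETCH_ACTIVE,    /* thread is running on its own stack */
        SKETCH_SUSPENDED, /* thread left via RPC or foreign interrupt exit */
    };

    struct sketch_thread {
        enum sketch_state state;
    };

    static struct sketch_thread sketch_threads[SKETCH_NUM_THREADS];

    /* Return true if id names an existing slot in the expected state. */
    static int sketch_in_state(int id, enum sketch_state state)
    {
        return id >= 0 && id < SKETCH_NUM_THREADS &&
               sketch_threads[id].state == state;
    }

    /* Normal entry for a normal call: select an empty thread slot. */
    int sketch_std_entry(void)
    {
        for (int i = 0; i < SKETCH_NUM_THREADS; i++) {
            if (sketch_threads[i].state == SKETCH_FREE) {
                sketch_threads[i].state = SKETCH_ACTIVE;
                return i;
            }
        }
        return -1; /* all slots are busy */
    }

    /* RPC exit and foreign interrupt exit both suspend the running thread. */
    int sketch_suspend(int id)
    {
        if (!sketch_in_state(id, SKETCH_ACTIVE))
            return -1;
        sketch_threads[id].state = SKETCH_SUSPENDED;
        return 0;
    }

    /* Resume entry: look up the suspended thread and make it active again. */
    int sketch_resume(int id)
    {
        if (!sketch_in_state(id, SKETCH_SUSPENDED))
            return -1;
        sketch_threads[id].state = SKETCH_ACTIVE;
        return 0;
    }

    /* Normal exit: the thread has finished its task and the slot is freed. */
    int sketch_exit(int id)
    {
        if (!sketch_in_state(id, SKETCH_ACTIVE))
            return -1;
        sketch_threads[id].state = SKETCH_FREE;
        return 0;
    }

In this model the same ``sketch_resume`` call serves both kinds of suspension,
which mirrors the statement that resuming from an RPC exit and from a foreign
interrupt exit follows the same procedure.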
778
779Syscall
780=======
Syscalls are executed using the thread stack.

Notes for Armv7-A/AArch32
    Nothing special: ``SP_SVC`` is already set to the thread stack.

Notes for AArch64
    Early in the exception processing the original ``SP_EL0`` is saved in
    ``struct thread_svc_regs`` in case the TA is executed in AArch64. The
    current thread stack is assigned to ``SP_EL0``, which is then selected.
    When returning, ``SP_EL0`` is assigned the value saved in ``struct
    thread_svc_regs``. This allows ``tee_svc_sys_return_helper(...)`` to have
    the syscall exception handler return directly to
    ``thread_unwind_user_mode(...)``.
793
794----
795
796.. _shared_memory:
797
798Shared Memory
799*************
800Shared Memory is a block of memory that is shared between the non-secure and the
801secure world. It is used to transfer data between both worlds.
802
803Shared Memory Allocation
804========================
The shared memory is allocated by the Linux driver from a pool described by
``struct shm_pool``. The pool contains:

    - The physical address of the start of the pool
    - The size of the pool
    - Whether or not the memory is cached
    - A list of the allocated memory chunks
812
813.. note::
814 - The shared memory pool is physically contiguous.
815 - The shared memory area is **not secure** as it is used by both non-secure
816 and secure world.
817
818Shared Memory Configuration
819===========================
It is the Linux kernel driver for OP-TEE that is responsible for initializing
the shared memory pool, given information provided by the OP-TEE core. The
Linux driver issues the SMC call ``OPTEE_SMC_GET_SHM_CONFIG`` to retrieve the
following information:
824
825 - Physical address of the start of the pool
826 - Size of the pool
827 - Whether or not the memory is cached
828
829The shared memory pool configuration is platform specific. The memory mapping,
830including the area ``MEM_AREA_NSEC_SHM`` (shared memory with non-secure world),
831is retrieved by calling the platform-specific function ``bootcfg_get_memory()``.
832Please refer to this function and the area type ``MEM_AREA_NSEC_SHM`` to see the
833configuration for the platform of interest. The Linux driver will then
834initialize the shared memory pool accordingly.
835
836.. todo::
837
838 Joakim: bootcfg_get_memory(...) is no longer in our code. Text needs update.
839
840Shared Memory Chunk Allocation
841==============================
It is the Linux kernel driver for OP-TEE that is responsible for allocating
chunks of shared memory. The OP-TEE Linux kernel driver relies on the Linux
kernel generic allocator support (``CONFIG_GENERIC_ALLOCATOR``) to allocate and
release physical chunks of shared memory, and on the Linux kernel dma-buf
support (``CONFIG_DMA_SHARED_BUFFER``) to track references to shared memory
buffers.
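
To make the pool and chunk bookkeeping more concrete, the following is a
minimal sketch of a physically contiguous pool with a first-fit chunk
allocator. It is not the Linux driver's code, which relies on the kernel's
generic allocator instead. The names ``sketch_shm_pool``, ``sketch_chunk``,
``sketch_shm_alloc`` and ``sketch_shm_free`` are hypothetical, the fixed chunk
table is an assumption made to keep the example small, and a return value of
``0`` is used to signal failure on the assumption that the pool does not start
at physical address ``0``.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define SKETCH_MAX_CHUNKS 8 /* assumed bookkeeping limit */

    struct sketch_chunk {
        uint64_t paddr; /* physical start address of the chunk */
        size_t size;    /* size of the chunk in bytes */
        bool used;      /* true while the chunk is allocated */
    };

    struct sketch_shm_pool {
        uint64_t paddr; /* physical address of the start of the pool */
        size_t size;    /* size of the pool in bytes */
        bool cached;    /* whether or not the memory is cached */
        struct sketch_chunk chunks[SKETCH_MAX_CHUNKS]; /* allocated chunks */
    };

    /* First-fit allocation. Returns the chunk address, or 0 on failure. */
    uint64_t sketch_shm_alloc(struct sketch_shm_pool *p, size_t size)
    {
        uint64_t cand = p->paddr;
        bool moved = true;
        int slot = -1;

        if (!size || size > p->size)
            return 0;

        /* Find a free bookkeeping entry for the new chunk. */
        for (int i = 0; i < SKETCH_MAX_CHUNKS && slot < 0; i++)
            if (!p->chunks[i].used)
                slot = i;
        if (slot < 0)
            return 0;

        /* Move the candidate past every chunk it overlaps until it fits. */
        while (moved) {
            moved = false;
            if (cand + size > p->paddr + p->size)
                return 0; /* ran off the end of the pool */
            for (int i = 0; i < SKETCH_MAX_CHUNKS; i++) {
                const struct sketch_chunk *c = &p->chunks[i];

                if (c->used && cand < c->paddr + c->size &&
                    c->paddr < cand + size) {
                    cand = c->paddr + c->size;
                    moved = true;
                    break;
                }
            }
        }

        p->chunks[slot] = (struct sketch_chunk){ cand, size, true };
        return cand;
    }

    /* Release the chunk starting at paddr. Returns false if not found. */
    bool sketch_shm_free(struct sketch_shm_pool *p, uint64_t paddr)
    {
        for (int i = 0; i < SKETCH_MAX_CHUNKS; i++) {
            if (p->chunks[i].used && p->chunks[i].paddr == paddr) {
                p->chunks[i].used = false;
                return true;
            }
        }
        return false;
    }

A pool would be set up from the values reported for the platform, for example
``struct sketch_shm_pool pool = { .paddr = 0x1000, .size = 0x300, .cached =
true };`` with made-up numbers, after which chunks are handed out from the
start of the pool.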
848
849Using shared memory
850===================
851From the Client Application
    The client application can ask for shared memory allocation using the
    GlobalPlatform Client API function ``TEEC_AllocateSharedMemory(...)``. The
    client application can also provide shared memory through the GlobalPlatform
    Client API function ``TEEC_RegisterSharedMemory(...)``. In that case, the
    provided memory must be physically contiguous, because OP-TEE core does not
    handle scatter-gather memory and must be able to use the provided range of
    memory addresses. Note that the reference count of a shared memory chunk is
    initialized to 1 on allocation and incremented when shared memory is
    registered.
861
862From the Linux Driver
863 Occasionally the Linux kernel driver needs to allocate shared memory for the
864 communication with secure world, for example when using buffers of type
865 ``TEEC_TempMemoryReference``.
866
867From OP-TEE core
    In case OP-TEE core needs information from TEE supplicant (dynamic TA
    loading, REE time request, ...), shared memory must be allocated. Allocation
    depends on the use case. OP-TEE core asks for the following shared memory
    allocations:

        - ``optee_msg_arg`` structure, used to pass the arguments to the
          non-secure world, where the allocation will be done by sending a
          ``OPTEE_SMC_RPC_FUNC_ALLOC`` message.

        - In some cases, a payload might be needed for storing the result from
          TEE supplicant, for example when loading a Trusted Application. This
          type of allocation will be done by sending the message
          ``OPTEE_MSG_RPC_CMD_SHM_ALLOC(OPTEE_MSG_RPC_SHM_TYPE_APPL,...)``,
          which then will return:

            - the physical address of the shared memory
            - a handle to the memory, which is used later when freeing this
              memory.
886
887From TEE Supplicant
888 TEE supplicant is also working with shared memory, used to exchange data
889 between normal and secure worlds. TEE supplicant receives a memory address
890 from the OP-TEE core, used to store the data. This is for example the case
891 when a Trusted Application is loaded. In this case, TEE supplicant must
892 register the provided shared memory in the same way a client application
893 would do, involving the Linux driver.
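
The reference counting rule stated for client applications above can be
sketched as follows. This is an illustration only: ``struct sketch_shm`` and
the ``sketch_shm_*`` helpers are hypothetical names, and releasing the chunk
when the count drops to zero is an assumption that is not stated in the text.

.. code-block:: c

    #include <stdbool.h>

    struct sketch_shm {
        unsigned int refcount; /* references held on this chunk */
    };

    /* Allocation initializes the reference count to 1. */
    void sketch_shm_on_alloc(struct sketch_shm *shm)
    {
        shm->refcount = 1;
    }

    /* Registration increments the reference count. */
    void sketch_shm_on_register(struct sketch_shm *shm)
    {
        shm->refcount++;
    }

    /*
     * Drop one reference. Returns true when the last reference is gone and
     * the chunk could be released (assumed behaviour), false otherwise.
     */
    bool sketch_shm_put(struct sketch_shm *shm)
    {
        if (shm->refcount == 0)
            return false;
        return --shm->refcount == 0;
    }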
894
895----
896
897.. _smc:
898
899SMC
900***
901SMC Interface
902=============
903OP-TEE's SMC interface is defined in two levels using optee_smc.h_ and
904optee_msg.h_. The former file defines SMC identifiers and what is passed in the
905registers for each SMC. The latter file defines the OP-TEE Message protocol
906which is not restricted to only SMC even if that currently is the only option
907available.
908
909SMC communication
910=================
The main structure used for the SMC communication is ``struct optee_msg_arg``
(defined in optee_msg.h_). In the source code, communication is mainly achieved
using ``optee_msg_arg`` and ``thread_smc_args`` (in thread.h_), where
``optee_msg_arg`` can be seen as the main structure. The :ref:`linux_kernel`
driver gets the parameters either from :ref:`optee_client` or directly from an
internal service in the Linux kernel. The TEE driver populates ``struct
optee_msg_arg`` with the parameters plus some additional bookkeeping
information. Parameters for the SMC are passed in registers 1 to 7; register 0
holds the SMC ID, which among other things tells whether it is a standard or a
fast call.
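
The register layout can be illustrated with a short sketch. It assumes the
function identifier encoding of the Arm SMC Calling Convention, where bit 31 of
register 0 distinguishes a fast call (``1``) from a standard call (``0``) and
bits 29:24 hold the owning entity number. ``struct sketch_smc_args`` and the
``SKETCH_*`` macros are hypothetical names chosen for this example and are not
the definitions found in optee_smc.h_ or thread.h_.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Bit 31 of the SMC ID: set for a fast call, clear for a standard call. */
    #define SKETCH_SMC_FAST_CALL   UINT32_C(0x80000000)
    /* Bits 29:24 of the SMC ID: owning entity number. */
    #define SKETCH_SMC_OWNER_SHIFT 24
    #define SKETCH_SMC_OWNER_MASK  UINT32_C(0x3f)

    struct sketch_smc_args {
        uint32_t a0; /* SMC ID */
        uint32_t a1; /* a1..a7 carry the parameters of the call */
        uint32_t a2;
        uint32_t a3;
        uint32_t a4;
        uint32_t a5;
        uint32_t a6;
        uint32_t a7;
    };

    /* Tell whether register 0 describes a fast call. */
    bool sketch_smc_is_fast_call(const struct sketch_smc_args *args)
    {
        return (args->a0 & SKETCH_SMC_FAST_CALL) != 0;
    }

    /* Extract the owning entity number from register 0. */
    uint32_t sketch_smc_owner(const struct sketch_smc_args *args)
    {
        return (args->a0 >> SKETCH_SMC_OWNER_SHIFT) & SKETCH_SMC_OWNER_MASK;
    }

A dispatcher would typically inspect register 0 first with helpers like these
and only then interpret registers 1 to 7 according to the selected function.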
922
923----
924
925.. _thread_handling:
926
927Thread handling
928***************
OP-TEE core uses a couple of threads to be able to support running jobs in
parallel (not fully enabled!). There are handlers for different purposes. In
thread.c_ you will find a function called ``thread_init_primary(...)`` which
assigns ``init_handlers`` (functions) that should be called when OP-TEE core
receives standard or fast calls, FIQs and PSCI calls. There are default handlers
for these services, but a platform can implement its own platform-specific
handlers instead.
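
The idea of default handlers that a platform may override can be sketched with
a table of function pointers. This is an illustration under assumptions, not
the layout used in thread.c_: ``struct sketch_handlers``,
``sketch_init_handlers`` and the other ``sketch_*`` names are hypothetical, and
falling back to a default for every entry the platform leaves as ``NULL`` is an
assumed policy.

.. code-block:: c

    #include <stddef.h>

    enum { SKETCH_DEFAULT = 0, SKETCH_PLATFORM = 1 };

    typedef int (*sketch_handler_t)(void);

    struct sketch_handlers {
        sketch_handler_t std_smc;  /* standard calls */
        sketch_handler_t fast_smc; /* fast calls */
        sketch_handler_t fiq;      /* FIQ */
        sketch_handler_t psci;     /* PSCI calls */
    };

    /* Default used for every service the platform does not override. */
    static int sketch_default_handler(void)
    {
        return SKETCH_DEFAULT;
    }

    /* Example of a platform-specific FIQ handler. */
    int sketch_platform_fiq(void)
    {
        return SKETCH_PLATFORM;
    }

    static struct sketch_handlers sketch_active;

    static sketch_handler_t sketch_pick(sketch_handler_t h)
    {
        return h ? h : sketch_default_handler;
    }

    /* Install platform handlers, falling back to defaults where unset. */
    void sketch_init_handlers(const struct sketch_handlers *platform)
    {
        static const struct sketch_handlers none; /* all entries NULL */

        if (!platform)
            platform = &none;
        sketch_active.std_smc = sketch_pick(platform->std_smc);
        sketch_active.fast_smc = sketch_pick(platform->fast_smc);
        sketch_active.fiq = sketch_pick(platform->fiq);
        sketch_active.psci = sketch_pick(platform->psci);
    }

    /* Dispatch helpers; sketch_init_handlers() must have been called. */
    int sketch_dispatch_std_smc(void)
    {
        return sketch_active.std_smc();
    }

    int sketch_dispatch_fiq(void)
    {
        return sketch_active.fiq();
    }

A platform that only cares about FIQ handling would pass ``{ .fiq =
sketch_platform_fiq }`` and keep the defaults for everything else.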
936
937Synchronization primitives
938==========================
939OP-TEE has three primitives for synchronization of threads and CPUs:
940*spin-lock*, *mutex*, and *condvar*.
941
942Spin-lock
943 A spin-lock is represented as an ``unsigned int``. This is the most
944 primitive lock. Interrupts should be disabled before attempting to take a
945 spin-lock and should remain disabled until the lock is released. A spin-lock
946 is initialized with ``SPINLOCK_UNLOCK``.
947
948 .. list-table:: Spin lock functions
949 :header-rows: 1
950 :widths: 1 5
951
952 * - Function
953 - Purpose
954
955 * - ``cpu_spin_lock(...)``
956 - Locks a spin-lock
957
958 * - ``cpu_spin_trylock(...)``
          - Locks a spin-lock if it is unlocked and returns ``0``; otherwise
            the spin-lock is unchanged and the function returns ``!0``
961
962 * - ``cpu_spin_unlock(...)``
963 - Unlocks a spin-lock
964
965Mutex
966 A mutex is represented by ``struct mutex``. A mutex can be locked and
967 unlocked with interrupts enabled or disabled, but only from a normal thread.
968 A mutex cannot be used in an interrupt handler, abort handler or before a
969 thread has been selected for the CPU. A mutex is initialized with either
970 ``MUTEX_INITIALIZER`` or ``mutex_init(...)``.
971
972 .. list-table:: Mutex functions
973 :header-rows: 1
974 :widths: 1 5
975
976 * - Function
977 - Purpose
978
979 * - ``mutex_lock(...)``
980 - Locks a mutex. If the mutex is unlocked this is a fast operation,
981 else the function issues an RPC to wait in normal world.
982
983 * - ``mutex_unlock(...)``
          - Unlocks a mutex. If there are no waiters this is a fast operation,
            else the function issues an RPC to wake up a waiter in normal world.

        * - ``mutex_trylock(...)``
          - Locks a mutex if it is unlocked and returns ``true``; otherwise the
            mutex is unchanged and the function returns ``false``.

        * - ``mutex_destroy(...)``
          - Asserts that the mutex is unlocked and that there are no waiters.
            After this the memory used by the mutex can be freed.
994
    When a mutex is locked it is owned by the thread calling ``mutex_lock(...)``
    or ``mutex_trylock(...)``. The mutex may only be unlocked by the thread
    owning the mutex. A thread should not exit to TA user space when holding a
    mutex.
999
1000Condvar
    A condvar is represented by ``struct condvar``. A condvar is similar to a
    ``pthread_cond_t`` in the pthreads standard, only less advanced. Condition
    variables are used to wait for some condition to be fulfilled and are
    always used together with a mutex. Once a condition variable has been used
    together with a certain mutex, it must only be used with that mutex until
    destroyed. A condvar is initialized with ``CONDVAR_INITIALIZER`` or
    ``condvar_init(...)``.
1008
1009 .. list-table:: Condvar functions
1010 :header-rows: 1
1011 :widths: 1 5
1012
1013 * - Function
1014 - Purpose
1015
1016 * - ``condvar_wait(...)``
1017 - Atomically unlocks the supplied mutex and waits in normal world via
1018 an RPC for the condition variable to be signaled, when the function
1019 returns the mutex is locked again.
1020
1021 * - ``condvar_signal(...)``
1022 - Wakes up one waiter of the condition variable (waiting in
1023 ``condvar_wait(...)``).
1024
1025 * - ``condvar_broadcast(...)``
          - Wakes up all waiters of the condition variable.
1027
1028 The caller of ``condvar_signal(...)`` or ``condvar_broadcast(...)`` should
1029 hold the mutex associated with the condition variable to guarantee that a
1030 waiter does not miss the signal.
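
The return convention of the spin-lock functions listed above can be
illustrated with a user-space sketch built on C11 atomics. It is not OP-TEE's
implementation: the ``sketch_spin_*`` functions and ``SKETCH_SPINLOCK_*``
values are hypothetical stand-ins, the lock is an ``atomic_uint`` rather than a
plain ``unsigned int``, and the requirement to keep interrupts disabled while
holding the lock cannot be shown outside the kernel.

.. code-block:: c

    #include <stdatomic.h>

    #define SKETCH_SPINLOCK_UNLOCK 0u
    #define SKETCH_SPINLOCK_LOCK   1u

    /*
     * Take the lock if it is unlocked and return 0. Otherwise leave the
     * lock unchanged and return a non-zero value.
     */
    unsigned int sketch_spin_trylock(atomic_uint *lock)
    {
        unsigned int expected = SKETCH_SPINLOCK_UNLOCK;

        return !atomic_compare_exchange_strong_explicit(
            lock, &expected, SKETCH_SPINLOCK_LOCK,
            memory_order_acquire, memory_order_relaxed);
    }

    /* Spin until the lock has been taken. */
    void sketch_spin_lock(atomic_uint *lock)
    {
        while (sketch_spin_trylock(lock))
            ; /* busy-wait */
    }

    /* Release the lock so that another CPU can take it. */
    void sketch_spin_unlock(atomic_uint *lock)
    {
        atomic_store_explicit(lock, SKETCH_SPINLOCK_UNLOCK,
                              memory_order_release);
    }

The acquire ordering on a successful lock and the release ordering on unlock
ensure that accesses made inside the critical section stay inside it.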
1031
1032.. _core/arch/arm/kernel/thread.c: https://github.com/OP-TEE/optee_os/blob/master/core/arch/arm/kernel/thread.c
1033.. _optee_msg.h: https://github.com/OP-TEE/optee_os/blob/master/core/include/optee_msg.h
1034.. _optee_smc.h: https://github.com/OP-TEE/optee_os/blob/master/core/arch/arm/include/sm/optee_smc.h
1035.. _thread.c: https://github.com/OP-TEE/optee_os/blob/master/core/arch/arm/kernel/thread.c
1036.. _thread.h: https://github.com/OP-TEE/optee_os/blob/master/core/arch/arm/include/kernel/thread.h
1037
1038.. _ARM_DEN0028A_SMC_Calling_Convention: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf
1039.. _Cortex-A53 TRM: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0500j/DDI0500J_cortex_a53_trm.pdf
1040.. _drivers/tee/optee: https://github.com/torvalds/linux/tree/master/drivers/tee/optee
1041.. _Trusted Firmware A: https://github.com/ARM-software/arm-trusted-firmware