author | Paul Beesley <paul.beesley@arm.com> | 2019-02-11 17:54:45 +0000 |
---|---|---|
committer | Paul Beesley <paul.beesley@arm.com> | 2019-05-21 15:05:56 +0100 |
commit | 40d553cfde38d4f68449c62967cd1ce0d6478750 (patch) | |
tree | 1fafd4701066cdf0e5fb15aee2d842279a67b611 /docs/firmware-design.rst | |
parent | 12b67439e93a78a4b756e987e1bd1b6e22cc4bf8 (diff) | |
doc: Move documents into subdirectories
This change creates the following directories under docs/
in order to provide a grouping for the content:
- components
- design
- getting_started
- perf
- process
In each of these directories an index.rst file is created
and this serves as an index / landing page for each of the
groups when the pages are compiled. Proper layout of the
top-level table of contents relies on this directory/index
structure.
Without this patch it is possible to build the documents
correctly with Sphinx but the output looks messy because
there is no overall hierarchy.
Change-Id: I3c9f4443ec98571a56a6edf775f2c8d74d7f429f
Signed-off-by: Paul Beesley <paul.beesley@arm.com>
Diffstat (limited to 'docs/firmware-design.rst')
-rw-r--r-- | docs/firmware-design.rst | 2688 |
1 files changed, 0 insertions, 2688 deletions
diff --git a/docs/firmware-design.rst b/docs/firmware-design.rst deleted file mode 100644 index 8384c9c005..0000000000 --- a/docs/firmware-design.rst +++ /dev/null @@ -1,2688 +0,0 @@ -Trusted Firmware-A design -========================= - - -.. section-numbering:: - :suffix: . - -.. contents:: - -Trusted Firmware-A (TF-A) implements a subset of the Trusted Board Boot -Requirements (TBBR) Platform Design Document (PDD) [1]_ for Arm reference -platforms. The TBB sequence starts when the platform is powered on and runs up -to the stage where it hands-off control to firmware running in the normal -world in DRAM. This is the cold boot path. - -TF-A also implements the Power State Coordination Interface PDD [2]_ as a -runtime service. PSCI is the interface from normal world software to firmware -implementing power management use-cases (for example, secondary CPU boot, -hotplug and idle). Normal world software can access TF-A runtime services via -the Arm SMC (Secure Monitor Call) instruction. The SMC instruction must be -used as mandated by the SMC Calling Convention [3]_. - -TF-A implements a framework for configuring and managing interrupts generated -in either security state. The details of the interrupt management framework -and its design can be found in TF-A Interrupt Management Design guide [4]_. - -TF-A also implements a library for setting up and managing the translation -tables. The details of this library can be found in `Xlat_tables design`_. - -TF-A can be built to support either AArch64 or AArch32 execution state. - -Cold boot ---------- - -The cold boot path starts when the platform is physically turned on. If -``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the -primary CPU, and the remaining CPUs are considered secondary CPUs. The primary -CPU is chosen through platform-specific means. The cold boot path is mainly -executed by the primary CPU, other than essential CPU initialization executed by -all CPUs. 
The secondary CPUs are kept in a safe platform-specific state until -the primary CPU has performed enough initialization to boot them. - -Refer to the `Reset Design`_ for more information on the effect of the -``COLD_BOOT_SINGLE_CPU`` platform build option. - -The cold boot path in this implementation of TF-A depends on the execution -state. For AArch64, it is divided into five steps (in order of execution): - -- Boot Loader stage 1 (BL1) *AP Trusted ROM* -- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* -- Boot Loader stage 3-1 (BL31) *EL3 Runtime Software* -- Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional) -- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* - -For AArch32, it is divided into four steps (in order of execution): - -- Boot Loader stage 1 (BL1) *AP Trusted ROM* -- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* -- Boot Loader stage 3-2 (BL32) *EL3 Runtime Software* -- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* - -Arm development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a -combination of the following types of memory regions. Each bootloader stage uses -one or more of these memory regions. - -- Regions accessible from both non-secure and secure states. For example, - non-trusted SRAM, ROM and DRAM. -- Regions accessible from only the secure state. For example, trusted SRAM and - ROM. The FVPs also implement the trusted DRAM which is statically - configured. Additionally, the Base FVPs and Juno development platform - configure the TrustZone Controller (TZC) to create a region in the DRAM - which is accessible only from the secure state. 
- -The sections below provide the following details: - -- dynamic configuration of Boot Loader stages -- initialization and execution of the first three stages during cold boot -- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for - AArch32) entrypoint requirements for use by alternative Trusted Boot - Firmware in place of the provided BL1 and BL2 - -Dynamic Configuration during cold boot -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Each of the Boot Loader stages may be dynamically configured if required by the -platform. The Boot Loader stage may optionally specify a firmware -configuration file and/or hardware configuration file as listed below: - -- HW_CONFIG - The hardware configuration file. Can be shared by all Boot Loader - stages and also by the Normal World Rich OS. -- TB_FW_CONFIG - Trusted Boot Firmware configuration file. Shared between BL1 - and BL2. -- SOC_FW_CONFIG - SoC Firmware configuration file. Used by BL31. -- TOS_FW_CONFIG - Trusted OS Firmware configuration file. Used by Trusted OS - (BL32). -- NT_FW_CONFIG - Non Trusted Firmware configuration file. Used by Non-trusted - firmware (BL33). - -The Arm development platforms use the Flattened Device Tree format for the -dynamic configuration files. - -Each Boot Loader stage can pass up to 4 arguments via registers to the next -stage. BL2 passes the list of the next images to execute to the *EL3 Runtime -Software* (BL31 for AArch64 and BL32 for AArch32) via `arg0`. All the other -arguments are platform defined. The Arm development platforms use the following -convention: - -- BL1 passes the address of a meminfo_t structure to BL2 via ``arg1``. This - structure contains the memory layout available to BL2. -- When dynamic configuration files are present, the firmware configuration for - the next Boot Loader stage is populated in the first available argument and - the generic hardware configuration is passed the next available argument. 
- For example, - - - If TB_FW_CONFIG is loaded by BL1, then its address is passed in ``arg0`` - to BL2. - - If HW_CONFIG is loaded by BL1, then its address is passed in ``arg2`` to - BL2. Note, ``arg1`` is already used for meminfo_t. - - If SOC_FW_CONFIG is loaded by BL2, then its address is passed in ``arg1`` - to BL31. Note, ``arg0`` is used to pass the list of executable images. - - Similarly, if HW_CONFIG is loaded by BL1 or BL2, then its address is - passed in ``arg2`` to BL31. - - For other BL3x images, if the firmware configuration file is loaded by - BL2, then its address is passed in ``arg0`` and if HW_CONFIG is loaded - then its address is passed in ``arg1``. - -BL1 -~~~ - -This stage begins execution from the platform's reset vector at EL3. The reset -address is platform dependent but it is usually located in a Trusted ROM area. -The BL1 data section is copied to trusted SRAM at runtime. - -On the Arm development platforms, BL1 code starts execution from the reset -vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied -to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``. - -The functionality implemented by this stage is as follows. - -Determination of boot path -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Whenever a CPU is released from reset, BL1 needs to distinguish between a warm -boot and a cold boot. This is done using platform-specific mechanisms (see the -``plat_get_my_entrypoint()`` function in the `Porting Guide`_). In the case of a -warm boot, a CPU is expected to continue execution from a separate -entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe -platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in -the `Porting Guide`_) while the primary CPU executes the remaining cold boot path -as described in the following sections. - -This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. 
Refer to the -`Reset Design`_ for more information on the effect of the -``PROGRAMMABLE_RESET_ADDRESS`` platform build option. - -Architectural initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL1 performs minimal architectural initialization as follows. - -- Exception vectors - - BL1 sets up simple exception vectors for both synchronous and asynchronous - exceptions. The default behavior upon receiving an exception is to populate - a status code in the general purpose register ``X0/R0`` and call the - ``plat_report_exception()`` function (see the `Porting Guide`_). The status - code is one of: - - For AArch64: - - :: - - 0x0 : Synchronous exception from Current EL with SP_EL0 - 0x1 : IRQ exception from Current EL with SP_EL0 - 0x2 : FIQ exception from Current EL with SP_EL0 - 0x3 : System Error exception from Current EL with SP_EL0 - 0x4 : Synchronous exception from Current EL with SP_ELx - 0x5 : IRQ exception from Current EL with SP_ELx - 0x6 : FIQ exception from Current EL with SP_ELx - 0x7 : System Error exception from Current EL with SP_ELx - 0x8 : Synchronous exception from Lower EL using aarch64 - 0x9 : IRQ exception from Lower EL using aarch64 - 0xa : FIQ exception from Lower EL using aarch64 - 0xb : System Error exception from Lower EL using aarch64 - 0xc : Synchronous exception from Lower EL using aarch32 - 0xd : IRQ exception from Lower EL using aarch32 - 0xe : FIQ exception from Lower EL using aarch32 - 0xf : System Error exception from Lower EL using aarch32 - - For AArch32: - - :: - - 0x10 : User mode - 0x11 : FIQ mode - 0x12 : IRQ mode - 0x13 : SVC mode - 0x16 : Monitor mode - 0x17 : Abort mode - 0x1a : Hypervisor mode - 0x1b : Undefined mode - 0x1f : System mode - - The ``plat_report_exception()`` implementation on the Arm FVP port programs - the Versatile Express System LED register in the following format to - indicate the occurrence of an unexpected exception: - - :: - - SYS_LED[0] - Security state (Secure=0/Non-Secure=1) - SYS_LED[2:1] - 
Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0) - For AArch32 it is always 0x0 - SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value - of the status code - - A write to the LED register reflects in the System LEDs (S6LED0..7) in the - CLCD window of the FVP. - - BL1 does not expect to receive any exceptions other than the SMC exception. - For the latter, BL1 installs a simple stub. The stub expects to receive a - limited set of SMC types (determined by their function IDs in the general - purpose register ``X0/R0``): - - - ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control - to EL3 Runtime Software. - - All SMCs listed in section "BL1 SMC Interface" in the `Firmware Update`_ - Design Guide are supported for AArch64 only. These SMCs are currently - not supported when BL1 is built for AArch32. - - Any other SMC leads to an assertion failure. - -- CPU initialization - - BL1 calls the ``reset_handler()`` function which in turn calls the CPU - specific reset handler function (see the section: "CPU specific operations - framework"). - -- Control register setup (for AArch64) - - - ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I`` - bit. Alignment and stack alignment checking is enabled by setting the - ``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to - little-endian by clearing the ``SCTLR_EL3.EE`` bit. - - - ``SCR_EL3``. The register width of the next lower exception level is set - to AArch64 by setting the ``SCR.RW`` bit. The ``SCR.EA`` bit is set to trap - both External Aborts and SError Interrupts in EL3. The ``SCR.SIF`` bit is - also set to disable instruction fetches from Non-secure memory when in - secure state. - - - ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the - ``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by - clearing the ``CPTR_EL3.TCPAC`` bit. 
Access to the trace functionality is - configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit. - Instructions that access the registers associated with Floating Point - and Advanced SIMD execution are configured to not trap to EL3 by - clearing the ``CPTR_EL3.TFP`` bit. - - - ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt - mask bit. - - - ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and - ``MDCR_EL3.TPM``, are set so that accesses to the registers they control - do not trap to EL3. AArch64 Secure self-hosted debug is disabled by - setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to - disable AArch32 Secure self-hosted privileged debug from S-EL1. - -- Control register setup (for AArch32) - - - ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit. - Alignment checking is enabled by setting the ``SCTLR.A`` bit. - Exception endianness is set to little-endian by clearing the - ``SCTLR.EE`` bit. - - - ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from - Non-secure memory when in secure state. - - - ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1, - by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality - is configured not to trap to undefined mode by clearing the - ``CPACR.TRCDIS`` bit. - - - ``NSACR``. Enable non-secure access to Advanced SIMD functionality and - system register access to implemented trace registers. - - - ``FPEXC``. Enable access to the Advanced SIMD and floating-point - functionality from all Exception levels. - - - ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing - the Asynchronous data abort interrupt mask bit. - - - ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure - self-hosted privileged debug. 
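The AArch64 ``SCTLR_EL3`` settings listed above can be sketched as a pure bit manipulation that compiles and runs on a host machine. This is only an illustration under stated assumptions: the helper name ``demo_bl1_sctlr_el3()`` and the macro names are hypothetical and are not taken from the TF-A sources; the bit positions are the architectural ones for ``SCTLR_EL3``. Real firmware would read and write the system register with ``mrs``/``msr`` rather than pass the value around.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Architectural bit positions in SCTLR_EL3 (macro names are illustrative). */
#define DEMO_SCTLR_A_BIT   (UINT64_C(1) << 1)   /* alignment check enable */
#define DEMO_SCTLR_SA_BIT  (UINT64_C(1) << 3)   /* stack alignment check enable */
#define DEMO_SCTLR_I_BIT   (UINT64_C(1) << 12)  /* instruction cache enable */
#define DEMO_SCTLR_EE_BIT  (UINT64_C(1) << 25)  /* exception endianness */

/*
 * Hypothetical helper: given the current register value, return the value
 * with the settings described in the text applied:
 *   - set I, A and SA
 *   - clear EE (little-endian exceptions)
 * All other bits are left unchanged.
 */
static inline uint64_t demo_bl1_sctlr_el3(uint64_t current)
{
    uint64_t value = current;

    value |= DEMO_SCTLR_I_BIT | DEMO_SCTLR_A_BIT | DEMO_SCTLR_SA_BIT;
    value &= ~DEMO_SCTLR_EE_BIT;

    return value;
}
```

For instance, ``demo_bl1_sctlr_el3(0)`` evaluates to ``0x100a``, and an input with only the ``EE`` bit set gives the same result because ``EE`` is cleared.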
- -Platform initialization -^^^^^^^^^^^^^^^^^^^^^^^ - -On Arm platforms, BL1 performs the following platform initializations: - -- Enable the Trusted Watchdog. -- Initialize the console. -- Configure the Interconnect to enable hardware coherency. -- Enable the MMU and map the memory it needs to access. -- Configure any required platform storage to load the next bootloader image - (BL2). -- If the BL1 dynamic configuration file, ``TB_FW_CONFIG``, is available, then - load it to the platform defined address and make it available to BL2 via - ``arg0``. -- Configure the system timer and program the `CNTFRQ_EL0` for use by NS-BL1U - and NS-BL2U firmware update images. - -Firmware Update detection and execution -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -After performing platform setup, BL1 common code calls -``bl1_plat_get_next_image_id()`` to determine if `Firmware Update`_ is required or -to proceed with the normal boot process. If the platform code returns -``BL2_IMAGE_ID`` then the normal boot sequence is executed as described in the -next section, else BL1 assumes that `Firmware Update`_ is required and execution -passes to the first image in the `Firmware Update`_ process. In either case, BL1 -retrieves a descriptor of the next image by calling ``bl1_plat_get_image_desc()``. -The image descriptor contains an ``entry_point_info_t`` structure, which BL1 -uses to initialize the execution state of the next image. - -BL2 image load and execution -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -In the normal boot flow, BL1 execution continues as follows: - -#. BL1 prints the following string from the primary CPU to indicate successful - execution of the BL1 stage: - - :: - - "Booting Trusted Firmware" - -#. BL1 loads a BL2 raw binary image from platform storage, at a - platform-specific base address. Prior to the load, BL1 invokes - ``bl1_plat_handle_pre_image_load()`` which allows the platform to update or - use the image information. 
If the BL2 image file is not present or if - there is not enough free trusted SRAM the following error message is - printed: - - :: - - "Failed to load BL2 firmware." - -#. BL1 invokes ``bl1_plat_handle_post_image_load()`` which again is intended - for platforms to take further action after image load. This function must - populate the necessary arguments for BL2, which may also include the memory - layout. Further description of the memory layout can be found later - in this document. - -#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at - Secure SVC mode (for AArch32), starting from its load address. - -BL2 -~~~ - -BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure -SVC mode (for AArch32) . BL2 is linked against and loaded at a platform-specific -base address (more information can be found later in this document). -The functionality implemented by BL2 is as follows. - -Architectural initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For AArch64, BL2 performs the minimal architectural initialization required -for subsequent stages of TF-A and normal world software. EL1 and EL0 are given -access to Floating Point and Advanced SIMD registers by clearing the -``CPACR.FPEN`` bits. - -For AArch32, the minimal architectural initialization required for subsequent -stages of TF-A and normal world software is taken care of in BL1 as both BL1 -and BL2 execute at PL1. - -Platform initialization -^^^^^^^^^^^^^^^^^^^^^^^ - -On Arm platforms, BL2 performs the following platform initializations: - -- Initialize the console. -- Configure any required platform storage to allow loading further bootloader - images. -- Enable the MMU and map the memory it needs to access. -- Perform platform security setup to allow access to controlled components. -- Reserve some memory for passing information to the next bootloader image - EL3 Runtime Software and populate it. 
-- Define the extents of memory available for loading each subsequent - bootloader image. -- If BL1 has passed TB_FW_CONFIG dynamic configuration file in ``arg0``, - then parse it. - -Image loading in BL2 -^^^^^^^^^^^^^^^^^^^^ - -BL2 generic code loads the images based on the list of loadable images -provided by the platform. BL2 passes the list of executable images -provided by the platform to the next handover BL image. - -The list of loadable images provided by the platform may also contain -dynamic configuration files. The files are loaded and can be parsed as -needed in the ``bl2_plat_handle_post_image_load()`` function. These -configuration files can be passed to next Boot Loader stages as arguments -by updating the corresponding entrypoint information in this function. - -SCP_BL2 (System Control Processor Firmware) image load -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Some systems have a separate System Control Processor (SCP) for power, clock, -reset and system control. BL2 loads the optional SCP_BL2 image from platform -storage into a platform-specific region of secure memory. The subsequent -handling of SCP_BL2 is platform specific. For example, on the Juno Arm -development platform port the image is transferred into SCP's internal memory -using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM -memory. The SCP executes SCP_BL2 and signals to the Application Processor (AP) -for BL2 execution to continue. - -EL3 Runtime Software image load -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL2 loads the EL3 Runtime Software image from platform storage into a platform- -specific address in trusted SRAM. If there is not enough memory to load the -image or image is missing it leads to an assertion failure. - -AArch64 BL32 (Secure-EL1 Payload) image load -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL2 loads the optional BL32 image from platform storage into a platform- -specific region of secure memory. 
The image executes in the secure world. BL2 -relies on BL31 to pass control to the BL32 image, if present. Hence, BL2 -populates a platform-specific area of memory with the entrypoint/load-address -of the BL32 image. The value of the Saved Processor Status Register (``SPSR``) -for entry into BL32 is not determined by BL2, it is initialized by the -Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for -managing interaction with BL32. This information is passed to BL31. - -BL33 (Non-trusted Firmware) image load -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from -platform storage into non-secure memory as defined by the platform. - -BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state -initialization is complete. Hence, BL2 populates a platform-specific area of -memory with the entrypoint and Saved Program Status Register (``SPSR``) of the -normal world software image. The entrypoint is the load address of the BL33 -image. The ``SPSR`` is determined as specified in Section 5.13 of the -`PSCI PDD`_. This information is passed to the EL3 Runtime Software. - -AArch64 BL31 (EL3 Runtime Software) execution -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL2 execution continues as follows: - -#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the - BL31 entrypoint. The exception is handled by the SMC exception handler - installed by BL1. - -#. BL1 turns off the MMU and flushes the caches. It clears the - ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency - and invalidates the TLBs. - -#. BL1 passes control to BL31 at the specified entrypoint at EL3. - -Running BL2 at EL3 execution level -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Some platforms have a non-TF-A Boot ROM that expects the next boot stage -to execute at EL3. 
On these platforms, TF-A BL1 is a waste of memory -as its only purpose is to ensure TF-A BL2 is entered at S-EL1. To avoid -this waste, a special mode enables BL2 to execute at EL3, which allows -a non-TF-A Boot ROM to load and jump directly to BL2. This mode is selected -when the build flag BL2_AT_EL3 is enabled. The main differences in this -mode are: - -#. BL2 includes the reset code and the mailbox mechanism to differentiate - cold boot and warm boot. It runs at EL3 doing the arch - initialization required for EL3. - -#. BL2 does not receive the meminfo information from BL1 anymore. This - information can be passed by the Boot ROM or be internal to the - BL2 image. - -#. Since BL2 executes at EL3, BL2 jumps directly to the next image, - instead of invoking the RUN_IMAGE SMC call. - - -We assume 3 different types of BootROM support on the platform: - -#. The Boot ROM always jumps to the same address, for both cold - and warm boot. In this case, we will need to keep a resident part - of BL2 whose memory cannot be reclaimed by any other image. The - linker script defines the symbols __TEXT_RESIDENT_START__ and - __TEXT_RESIDENT_END__ that allows the platform to configure - correctly the memory map. -#. The platform has some mechanism to indicate the jump address to the - Boot ROM. Platform code can then program the jump address with - psci_warmboot_entrypoint during cold boot. -#. The platform has some mechanism to program the reset address using - the PROGRAMMABLE_RESET_ADDRESS feature. Platform code can then - program the reset address with psci_warmboot_entrypoint during - cold boot, bypassing the boot ROM for warm boot. - -In the last 2 cases, no part of BL2 needs to remain resident at -runtime. In the first 2 cases, we expect the Boot ROM to be able to -differentiate between warm and cold boot, to avoid loading BL2 again -during warm boot. 
- -This functionality can be tested with FVP loading the image directly -in memory and changing the address where the system jumps at reset. -For example: - - -C cluster0.cpu0.RVBAR=0x4022000 - --data cluster0.cpu0=bl2.bin@0x4022000 - -With this configuration, FVP is like a platform of the first case, -where the Boot ROM jumps always to the same address. For simplification, -BL32 is loaded in DRAM in this case, to avoid other images reclaiming -BL2 memory. - - -AArch64 BL31 -~~~~~~~~~~~~ - -The image for this stage is loaded by BL2 and BL1 passes control to BL31 at -EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and -loaded at a platform-specific base address (more information can be found later -in this document). The functionality implemented by BL31 is as follows. - -Architectural initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Currently, BL31 performs a similar architectural initialization to BL1 as -far as system register settings are concerned. Since BL1 code resides in ROM, -architectural initialization in BL31 allows override of any previous -initialization done by BL1. - -BL31 initializes the per-CPU data framework, which provides a cache of -frequently accessed per-CPU data optimised for fast, concurrent manipulation -on different CPUs. This buffer includes pointers to per-CPU contexts, crash -buffer, CPU reset and power down operations, PSCI data, platform data and so on. - -It then replaces the exception vectors populated by BL1 with its own. BL31 -exception vectors implement more elaborate support for handling SMCs since this -is the only mechanism to access the runtime services implemented by BL31 (PSCI -for example). BL31 checks each SMC for validity as specified by the -`SMC calling convention PDD`_ before passing control to the required SMC -handler routine. - -BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system -counter, which is provided by the platform. 
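To make the SMC handling mentioned above more concrete, the following sketch decodes the fields of an SMC function identifier as laid out by the SMC Calling Convention (call type in bit 31, calling convention in bit 30, owning entity number in bits 29:24, function number in bits 15:0). It is not the validity check that BL31 performs; the type ``demo_smc_fid_t`` and the function ``demo_decode_smc_fid()`` are hypothetical names introduced for illustration only.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an SMC function identifier (illustrative type). */
typedef struct demo_smc_fid {
    bool     fast_call;  /* bit 31: 1 = fast call, 0 = yielding call */
    bool     smc64;      /* bit 30: 1 = SMC64, 0 = SMC32 */
    uint8_t  oen;        /* bits 29:24: owning entity number */
    uint16_t func_num;   /* bits 15:0: function number */
} demo_smc_fid_t;

/* Hypothetical helper: split a 32-bit function identifier into its fields. */
static inline demo_smc_fid_t demo_decode_smc_fid(uint32_t fid)
{
    demo_smc_fid_t decoded;

    decoded.fast_call = ((fid >> 31) & 0x1U) != 0U;
    decoded.smc64     = ((fid >> 30) & 0x1U) != 0U;
    decoded.oen       = (uint8_t)((fid >> 24) & 0x3FU);
    decoded.func_num  = (uint16_t)(fid & 0xFFFFU);

    return decoded;
}
```

As a usage example, the PSCI ``PSCI_VERSION`` identifier ``0x84000000`` decodes to a fast SMC32 call with owning entity number 4 (standard secure services) and function number 0.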
- -Platform initialization -^^^^^^^^^^^^^^^^^^^^^^^ - -BL31 performs detailed platform initialization, which enables normal world -software to function correctly. - -On Arm platforms, this consists of the following: - -- Initialize the console. -- Configure the Interconnect to enable hardware coherency. -- Enable the MMU and map the memory it needs to access. -- Initialize the generic interrupt controller. -- Initialize the power controller device. -- Detect the system topology. - -Runtime services initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -BL31 is responsible for initializing the runtime services. One of them is PSCI. - -As part of the PSCI initializations, BL31 detects the system topology. It also -initializes the data structures that implement the state machine used to track -the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or -``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster -that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also -initializes the locks that protect them. BL31 accesses the state of a CPU or -cluster immediately after reset and before the data cache is enabled in the -warm boot path. It is not currently possible to use 'exclusive' based spinlocks, -therefore BL31 uses locks based on Lamport's Bakery algorithm instead. - -The runtime service framework and its initialization is described in more -detail in the "EL3 runtime services framework" section below. - -Details about the status of the PSCI implementation are provided in the -"Power State Coordination Interface" section below. - -AArch64 BL32 (Secure-EL1 Payload) image initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If a BL32 image is present then there must be a matching Secure-EL1 Payload -Dispatcher (SPD) service (see later for details). 
During initialization -that service must register a function to carry out initialization of BL32 -once the runtime services are fully initialized. BL31 invokes such a -registered function to initialize BL32 before running BL33. This initialization -is not necessary for AArch32 SPs. - -Details on BL32 initialization and the SPD's role are described in the -"Secure-EL1 Payloads and Dispatchers" section below. - -BL33 (Non-trusted Firmware) execution -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -EL3 Runtime Software initializes the EL2 or EL1 processor context for normal- -world cold boot, ensuring that no secure state information finds its way into -the non-secure execution state. EL3 Runtime Software uses the entrypoint -information provided by BL2 to jump to the Non-trusted firmware image (BL33) -at the highest available Exception Level (EL2 if available, otherwise EL1). - -Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only) -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Some platforms have existing implementations of Trusted Boot Firmware that -would like to use TF-A BL31 for the EL3 Runtime Software. To enable this -firmware architecture it is important to provide a fully documented and stable -interface between the Trusted Boot Firmware and BL31. - -Future changes to the BL31 interface will be done in a backwards compatible -way, and this enables these firmware components to be independently enhanced/ -updated to develop and exploit new functionality. - -Required CPU state when calling ``bl31_entrypoint()`` during cold boot -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This function must only be called by the primary CPU. 
- -On entry to this function the calling primary CPU must be executing in AArch64 -EL3, little-endian data access, and all interrupt sources masked: - -:: - - PSTATE.EL = 3 - PSTATE.RW = 1 - PSTATE.DAIF = 0xf - SCTLR_EL3.EE = 0 - -X0 and X1 can be used to pass information from the Trusted Boot Firmware to the -platform code in BL31: - -:: - - X0 : Reserved for common TF-A information - X1 : Platform specific information - -BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry, -these will be zero filled prior to invoking platform setup code. - -Use of the X0 and X1 parameters -''''''''''''''''''''''''''''''' - -The parameters are platform specific and passed from ``bl31_entrypoint()`` to -``bl31_early_platform_setup()``. The value of these parameters is never directly -used by the common BL31 code. - -The convention is that ``X0`` conveys information regarding the BL31, BL32 and -BL33 images from the Trusted Boot firmware and ``X1`` can be used for other -platform specific purpose. This convention allows platforms which use TF-A's -BL1 and BL2 images to transfer additional platform specific information from -Secure Boot without conflicting with future evolution of TF-A using ``X0`` to -pass a ``bl31_params`` structure. - -BL31 common and SPD initialization code depends on image and entrypoint -information about BL33 and BL32, which is provided via BL31 platform APIs. -This information is required until the start of execution of BL33. This -information can be provided in a platform defined manner, e.g. compiled into -the platform code in BL31, or provided in a platform defined memory location -by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the -Cold boot Initialization parameters. This data may need to be cleaned out of -the CPU caches if it is provided by an earlier boot stage and then accessed by -BL31 platform code before the caches are enabled. 
- -TF-A's BL2 implementation passes a ``bl31_params`` structure in -``X0`` and the Arm development platforms interpret this in the BL31 platform -code. - -MMU, Data caches & Coherency -'''''''''''''''''''''''''''' - -BL31 does not depend on the enabled state of the MMU, data caches or -interconnect coherency on entry to ``bl31_entrypoint()``. If they are disabled -on entry, they should be enabled during ``bl31_plat_arch_setup()``. - -Data structures used in the BL31 cold boot interface -'''''''''''''''''''''''''''''''''''''''''''''''''''' - -These structures are designed to support compatibility and independent -evolution of the structures and the firmware images. For example, a version of -BL31 that can interpret the BL3x image information from different versions of -BL2, a platform that uses an extended entry_point_info structure to convey -additional register information to BL31, or an ELF image loader that can convey -more details about the firmware images. - -To support these scenarios the structures are versioned and sized, which enables -BL31 to detect which information is present and respond appropriately. The -``param_header`` is defined to capture this information: - -.. code:: c - - typedef struct param_header { - uint8_t type; /* type of the structure */ - uint8_t version; /* version of this structure */ - uint16_t size; /* size of this structure in bytes */ - uint32_t attr; /* attributes: unused bits SBZ */ - } param_header_t; - -The structures using this format are ``entry_point_info``, ``image_info`` and -``bl31_params``. The code that allocates and populates these structures must set -the header fields appropriately, and the ``SET_PARAM_HEAD()`` macro is defined -to simplify this action. 
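The versioned-and-sized header scheme described above can be sketched as follows. The ``param_header_t`` layout is copied from the text; everything prefixed with ``demo_`` or ``DEMO_`` (the example structure, the type and version constants, the setter macro and the consumer-side check) is a hypothetical stand-in introduced for illustration, and is not claimed to match the definitions in the TF-A sources.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stddef.h>
#include <stdint.h>

/* Header layout as given in the text. */
typedef struct param_header {
    uint8_t  type;     /* type of the structure */
    uint8_t  version;  /* version of this structure */
    uint16_t size;     /* size of this structure in bytes */
    uint32_t attr;     /* attributes: unused bits SBZ */
} param_header_t;

/* Hypothetical structure that embeds the header as its first member. */
typedef struct demo_image_info {
    param_header_t h;
    uint64_t       image_base;
    uint32_t       image_size;
} demo_image_info_t;

/* Illustrative type and version values, not the TF-A constants. */
#define DEMO_PARAM_IMAGE_INFO  0x1U
#define DEMO_VERSION_1         0x1U

/* Producer side: fill in the header, taking the size from the structure. */
#define DEMO_SET_PARAM_HEAD(p, _type, _ver, _attr) do {  \
        (p)->h.type    = (uint8_t)(_type);               \
        (p)->h.version = (uint8_t)(_ver);                \
        (p)->h.size    = (uint16_t)sizeof(*(p));         \
        (p)->h.attr    = (uint32_t)(_attr);              \
    } while (0)

/*
 * Consumer side: accept a structure only if it has the expected type, is at
 * least the minimum version, and is large enough to hold the fields the
 * consumer intends to read. Returns 1 if acceptable, 0 otherwise.
 */
static inline int demo_header_is_valid(const param_header_t *h,
                                       uint8_t type,
                                       uint8_t min_version,
                                       size_t min_size)
{
    return (h->type == type) &&
           (h->version >= min_version) &&
           ((size_t)h->size >= min_size);
}
```

A producer would call ``DEMO_SET_PARAM_HEAD(&info, DEMO_PARAM_IMAGE_INFO, DEMO_VERSION_1, 0)`` after allocating ``info``, and a consumer would call ``demo_header_is_valid(&info.h, DEMO_PARAM_IMAGE_INFO, DEMO_VERSION_1, sizeof(demo_image_info_t))`` before reading any field beyond the header. Checking the size as well as the version is what lets an older consumer ignore fields appended by a newer producer.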
- -Required CPU state for BL31 Warm boot initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -When requesting a CPU power-on, or suspending a running CPU, TF-A provides -the platform power management code with a Warm boot initialization -entry-point, to be invoked by the CPU immediately after the reset handler. -On entry to the Warm boot initialization function the calling CPU must be in -AArch64 EL3, little-endian data access and all interrupt sources masked: - -:: - - PSTATE.EL = 3 - PSTATE.RW = 1 - PSTATE.DAIF = 0xf - SCTLR_EL3.EE = 0 - -The PSCI implementation will initialize the processor state and ensure that the -platform power management code is then invoked as required to initialize all -necessary system, cluster and CPU resources. - -AArch32 EL3 Runtime Software entrypoint interface -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -To enable this firmware architecture it is important to provide a fully -documented and stable interface between the Trusted Boot Firmware and the -AArch32 EL3 Runtime Software. - -Future changes to the entrypoint interface will be done in a backwards -compatible way, and this enables these firmware components to be independently -enhanced/updated to develop and exploit new functionality. - -Required CPU state when entering during cold boot -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This function must only be called by the primary CPU. 
- -On entry to this function the calling primary CPU must be executing in AArch32 -EL3, little-endian data access, and all interrupt sources masked: - -:: - - PSTATE.AIF = 0x7 - SCTLR.EE = 0 - -R0 and R1 are used to pass information from the Trusted Boot Firmware to the -platform code in AArch32 EL3 Runtime Software: - -:: - - R0 : Reserved for common TF-A information - R1 : Platform specific information - -Use of the R0 and R1 parameters -''''''''''''''''''''''''''''''' - -The parameters are platform specific and the convention is that ``R0`` conveys -information regarding the BL3x images from the Trusted Boot firmware and ``R1`` -can be used for other platform specific purpose. This convention allows -platforms which use TF-A's BL1 and BL2 images to transfer additional platform -specific information from Secure Boot without conflicting with future -evolution of TF-A using ``R0`` to pass a ``bl_params`` structure. - -The AArch32 EL3 Runtime Software is responsible for entry into BL33. This -information can be obtained in a platform defined manner, e.g. compiled into -the AArch32 EL3 Runtime Software, or provided in a platform defined memory -location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware -via the Cold boot Initialization parameters. This data may need to be cleaned -out of the CPU caches if it is provided by an earlier boot stage and then -accessed by AArch32 EL3 Runtime Software before the caches are enabled. - -When using AArch32 EL3 Runtime Software, the Arm development platforms pass a -``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime -Software platform code. - -MMU, Data caches & Coherency -'''''''''''''''''''''''''''' - -AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU, -data caches or interconnect coherency in its entrypoint. They must be explicitly -enabled if required. 
- -Data structures used in cold boot interface -''''''''''''''''''''''''''''''''''''''''''' - -The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead -of ``bl31_params``. The ``bl_params`` structure is based on the convention -described in AArch64 BL31 cold boot interface section. - -Required CPU state for warm boot initialization -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3 -Runtime Software must ensure execution of a warm boot initialization entrypoint. -If TF-A BL1 is used and the PROGRAMMABLE_RESET_ADDRESS build flag is false, -then AArch32 EL3 Runtime Software must ensure that BL1 branches to the warm -boot entrypoint by arranging for the BL1 platform function, -plat_get_my_entrypoint(), to return a non-zero value. - -In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian -data access and all interrupt sources masked: - -:: - - PSTATE.AIF = 0x7 - SCTLR.EE = 0 - -The warm boot entrypoint may be implemented by using TF-A -``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil -the pre-requisites mentioned in the `PSCI Library integration guide`_. - -EL3 runtime services framework ------------------------------- - -Software executing in the non-secure state and in the secure state at exception -levels lower than EL3 will request runtime services using the Secure Monitor -Call (SMC) instruction. These requests will follow the convention described in -the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function -identifiers to each SMC request and describes how arguments are passed and -returned. - -The EL3 runtime services framework enables the development of services by -different providers that can be easily integrated into final product firmware. 
-The following sections describe the framework which facilitates the -registration, initialization and use of runtime services in EL3 Runtime -Software (BL31). - -The design of the runtime services depends heavily on the concepts and -definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning -Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling -conventions. Please refer to that document for more detailed explanation of -these terms. - -The following runtime services are expected to be implemented first. They have -not all been instantiated in the current implementation. - -#. Standard service calls - - This service is for management of the entire system. The Power State - Coordination Interface (`PSCI`_) is the first set of standard service calls - defined by Arm (see PSCI section later). - -#. Secure-EL1 Payload Dispatcher service - - If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then - it also requires a *Secure Monitor* at EL3 to switch the EL1 processor - context between the normal world (EL1/EL2) and trusted world (Secure-EL1). - The Secure Monitor will make these world switches in response to SMCs. The - `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted - Application Call OEN ranges. - - The interface between the EL3 Runtime Software and the Secure-EL1 Payload is - not defined by the `SMCCC`_ or any other standard. As a result, each - Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime - service - within TF-A this service is referred to as the Secure-EL1 Payload - Dispatcher (SPD). - - TF-A provides a Test Secure-EL1 Payload (TSP) and its associated Dispatcher - (TSPD). Details of SPD design and TSP/TSPD operation are described in the - "Secure-EL1 Payloads and Dispatchers" section below. - -#. CPU implementation service - - This service will provide an interface to CPU implementation specific - services for a given platform e.g. 
access to processor errata workarounds. - This service is currently unimplemented. - -Additional services for Arm Architecture, SiP and OEM calls can be implemented. -Each implemented service handles a range of SMC function identifiers as -described in the `SMCCC`_. - -Registration -~~~~~~~~~~~~ - -A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying -the name of the service, the range of OENs covered, the type of service and -initialization and call handler functions. This macro instantiates a ``const struct rt_svc_desc`` for the service with these details (see ``runtime_svc.h``). -This structure is allocated in a special ELF section ``rt_svc_descs``, enabling -the framework to find all service descriptors included into BL31. - -The specific service for a SMC Function is selected based on the OEN and call -type of the Function ID, and the framework uses that information in the service -descriptor to identify the handler for the SMC Call. - -The service descriptors do not include information to identify the precise set -of SMC function identifiers supported by this service implementation, the -security state from which such calls are valid nor the capability to support -64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately -to these aspects of a SMC call is the responsibility of the service -implementation, the framework is focused on integration of services from -different providers and minimizing the time taken by the framework before the -service handler is invoked. - -Details of the parameters, requirements and behavior of the initialization and -call handling functions are provided in the following sections. - -Initialization -~~~~~~~~~~~~~~ - -``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services -framework running on the primary CPU during cold boot as part of the BL31 -initialization. 
This happens prior to initializing a Trusted OS and running -Normal world boot firmware that might in turn use these services. -Initialization involves validating each of the declared runtime service -descriptors, calling the service initialization function and populating the -index used for runtime lookup of the service. - -The BL31 linker script collects all of the declared service descriptors into a -single array and defines symbols that allow the framework to locate and traverse -the array, and determine its size. - -The framework does basic validation of each descriptor to halt firmware -initialization if service declaration errors are detected. The framework does -not check descriptors for the following error conditions, and may behave in an -unpredictable manner under such scenarios: - -#. Overlapping OEN ranges -#. Multiple descriptors for the same range of OENs and ``call_type`` -#. Incorrect range of owning entity numbers for a given ``call_type`` - -Once validated, the service ``init()`` callback is invoked. This function carries -out any essential EL3 initialization before servicing requests. The ``init()`` -function is only invoked on the primary CPU during cold boot. If the service -uses per-CPU data this must either be initialized for all CPUs during this call, -or be done lazily when a CPU first issues an SMC call to that service. If -``init()`` returns anything other than ``0``, this is treated as an initialization -error and the service is ignored: this does not cause the firmware to halt. - -The OEN and call type fields present in the SMC Function ID cover a total of -128 distinct services, but in practice a single descriptor can cover a range of -OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a -service handler, the framework uses an array of 128 indices that map every -distinct OEN/call-type combination either to one of the declared services or to -indicate the service is not handled. 
This ``rt_svc_descs_indices[]`` array is -populated for all of the OENs covered by a service after the service ``init()`` -function has reported success. So a service that fails to initialize will never -have its ``handle()`` function invoked. - -The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC -Function ID call type and OEN onto a specific service handler in the -``rt_svc_descs[]`` array. - -|Image 1| - -Handling an SMC -~~~~~~~~~~~~~~~ - -When the EL3 runtime services framework receives a Secure Monitor Call, the SMC -Function ID is passed in W0 from the lower exception level (as per the -`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an -SMC Function which indicates the SMC64 calling convention: such calls are -ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF`` -in R0/X0. - -Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC -Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The -resulting value might indicate a service that has no handler; in this case the -framework will also report an Unknown SMC Function ID. Otherwise, the value is -used as a further index into the ``rt_svc_descs[]`` array to locate the required -service and handler. - -The service's ``handle()`` callback is provided with five of the SMC parameters -directly; the others are saved into memory for retrieval (if needed) by the -handler. The handler is also provided with an opaque ``handle`` for use with the -supporting library for parameter retrieval, setting return values and context -manipulation; and with ``flags`` indicating the security state of the caller. The -framework finally sets up the execution stack for the handler, and invokes the -service's ``handle()`` function. - -On return from the handler the result registers are populated in X0-X3 before -restoring the stack and CPU state and returning from the original SMC.
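The two-level lookup described above can be modelled in a few lines of C. The sketch below is a simplified illustration under stated assumptions: every ``demo_`` name, the ``0xff`` "not handled" marker and the reduced handler signature are hypothetical, and the service ``init()`` step is omitted. Only the bit positions (bit[31] for the call type, bits[29:24] for the OEN), the 128-entry index and the ``0xFFFFFFFF`` result code come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_INDICES  128U          /* 2 call types x 64 OENs */
#define DEMO_NO_SERVICE   0xffU         /* hypothetical "not handled" marker */
#define DEMO_SMC_UNK      0xffffffffU   /* Unknown SMC Function Identifier */

/* Reduced handler signature, for illustration only. */
typedef uint32_t (*demo_handler_t)(uint32_t fid);

typedef struct demo_svc_desc {
	uint8_t start_oen;       /* first OEN covered (0..63) */
	uint8_t end_oen;         /* last OEN covered (0..63) */
	uint8_t call_type;       /* 1 = fast, 0 = yielding */
	demo_handler_t handle;
} demo_svc_desc_t;

/* Bit[31] becomes the top bit of a 7-bit index, bits[29:24] the low six. */
uint32_t demo_index_from_fid(uint32_t fid)
{
	uint32_t call_type = (fid >> 31) & 0x1U;
	uint32_t oen = (fid >> 24) & 0x3fU;

	return (call_type << 6) | oen;
}

/* Mark every slot unhandled, then fill the slots each descriptor covers. */
void demo_populate_indices(uint8_t *indices, const demo_svc_desc_t *descs,
			   size_t count)
{
	assert(indices != NULL && count < DEMO_NO_SERVICE);
	memset(indices, DEMO_NO_SERVICE, DEMO_MAX_INDICES);
	for (size_t i = 0; i < count; i++) {
		for (uint32_t oen = descs[i].start_oen;
		     oen <= descs[i].end_oen; oen++) {
			uint32_t slot = (((uint32_t)descs[i].call_type & 0x1U) << 6) |
					(oen & 0x3fU);
			indices[slot] = (uint8_t)i;
		}
	}
}

/* First lookup finds the descriptor index, second one the handler. */
uint32_t demo_dispatch(const uint8_t *indices, const demo_svc_desc_t *descs,
		       uint32_t fid)
{
	uint8_t idx = indices[demo_index_from_fid(fid)];

	if (idx == DEMO_NO_SERVICE)
		return DEMO_SMC_UNK;
	return descs[idx].handle(fid);
}

/* Example handler: returns the low 16 bits of the Function ID. */
uint32_t demo_std_svc_handler(uint32_t fid)
{
	return fid & 0xffffU;
}
```

With a single descriptor covering fast calls to OEN 4, a Function ID such as ``0x84000003`` reaches ``demo_std_svc_handler()``, while a yielding call to an unregistered OEN yields the Unknown SMC result. Keeping the first-level table as a flat byte array makes the lookup a single indexed load.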
- -Exception Handling Framework ----------------------------- - -Please refer to the `Exception Handling Framework`_ document. - -Power State Coordination Interface ----------------------------------- - -TODO: Provide design walkthrough of PSCI implementation. - -The PSCI v1.1 specification categorizes APIs as optional and mandatory. All the -mandatory APIs in PSCI v1.1, PSCI v1.0 and in PSCI v0.2 draft specification -`Power State Coordination Interface PDD`_ are implemented. The table lists -the PSCI v1.1 APIs and their support in generic code. - -An API implementation might have a dependency on platform code e.g. CPU_SUSPEND -requires the platform to export a part of the implementation. Hence the level -of support of the mandatory APIs depends upon the support exported by the -platform port as well. The Juno and FVP (all variants) platforms export all the -required support. - -+-----------------------------+-------------+-------------------------------+ -| PSCI v1.1 API | Supported | Comments | -+=============================+=============+===============================+ -| ``PSCI_VERSION`` | Yes | The version returned is 1.1 | -+-----------------------------+-------------+-------------------------------+ -| ``CPU_SUSPEND`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``CPU_OFF`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``CPU_ON`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``AFFINITY_INFO`` | Yes | | -+-----------------------------+-------------+-------------------------------+ -| ``MIGRATE`` | Yes\*\* | | -+-----------------------------+-------------+-------------------------------+ -| ``MIGRATE_INFO_TYPE`` | Yes\*\* | | -+-----------------------------+-------------+-------------------------------+ -| ``MIGRATE_INFO_CPU`` | Yes\*\* | | 
-+-----------------------------+-------------+-------------------------------+ -| ``SYSTEM_OFF`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``SYSTEM_RESET`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``PSCI_FEATURES`` | Yes | | -+-----------------------------+-------------+-------------------------------+ -| ``CPU_FREEZE`` | No | | -+-----------------------------+-------------+-------------------------------+ -| ``CPU_DEFAULT_SUSPEND`` | No | | -+-----------------------------+-------------+-------------------------------+ -| ``NODE_HW_STATE`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``SYSTEM_SUSPEND`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``PSCI_SET_SUSPEND_MODE`` | No | | -+-----------------------------+-------------+-------------------------------+ -| ``PSCI_STAT_RESIDENCY`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``PSCI_STAT_COUNT`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``SYSTEM_RESET2`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``MEM_PROTECT`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ -| ``MEM_PROTECT_CHECK_RANGE`` | Yes\* | | -+-----------------------------+-------------+-------------------------------+ - -\*Note : These PSCI APIs require platform power management hooks to be -registered with the generic PSCI code to be supported. - -\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher -hooks to be registered with the generic PSCI code to be supported. - -The PSCI implementation in TF-A is a library which can be integrated with -AArch64 or AArch32 EL3 Runtime Software for Armv8-A systems. 
A guide to -integrating PSCI library with AArch32 EL3 Runtime Software can be found -`here`_. - -Secure-EL1 Payloads and Dispatchers ------------------------------------ - -On a production system that includes a Trusted OS running in Secure-EL1/EL0, -the Trusted OS is coupled with a companion runtime service in the BL31 -firmware. This service is responsible for the initialisation of the Trusted -OS and all communications with it. The Trusted OS is the BL32 stage of the -boot flow in TF-A. The firmware will attempt to locate, load and execute a -BL32 image. - -TF-A uses a more general term for the BL32 software that runs at Secure-EL1 - -the *Secure-EL1 Payload* - as it is not always a Trusted OS. - -TF-A provides a Test Secure-EL1 Payload (TSP) and a Test Secure-EL1 Payload -Dispatcher (TSPD) service as an example of how a Trusted OS is supported on a -production system using the Runtime Services Framework. On such a system, the -Test BL32 image and service are replaced by the Trusted OS and its dispatcher -service. The TF-A build system expects that the dispatcher will define the -build flag ``NEED_BL32`` to enable it to include the BL32 in the build either -as a binary or to compile from source depending on whether the ``BL32`` build -option is specified or not. - -The TSP runs in Secure-EL1. It is designed to demonstrate synchronous -communication with the normal-world software running in EL1/EL2. Communication -is initiated by the normal-world software - -- either directly through a Fast SMC (as defined in the `SMCCC`_) - -- or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn - informs the TSPD about the requested power management operation. This allows - the TSP to prepare for or respond to the power state change - -The TSPD service is responsible for. 
- -- Initializing the TSP - -- Routing requests and responses between the secure and the non-secure - states during the two types of communications just described - -Initializing a BL32 Image -~~~~~~~~~~~~~~~~~~~~~~~~~ - -The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing -the BL32 image. It needs access to the information passed by BL2 to BL31 to do -so. This is provided by: - -.. code:: c - - entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t); - -which returns a reference to the ``entry_point_info`` structure corresponding to -the image which will be run in the specified security state. The SPD uses this -API to get entry point information for the SECURE image, BL32. - -In the absence of a BL32 image, BL31 passes control to the normal world -bootloader image (BL33). When the BL32 image is present, it is typical -that the SPD wants control to be passed to BL32 first and then later to BL33. - -To do this the SPD has to register a BL32 initialization function during -initialization of the SPD service. The BL32 initialization function has this -prototype: - -.. code:: c - - int32_t init(void); - -and is registered using the ``bl31_register_bl32_init()`` function. - -TF-A supports two approaches for the SPD to pass control to BL32 before -returning through EL3 and running the non-trusted firmware (BL33): - -#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to - request that the exit from ``bl31_main()`` is to the BL32 entrypoint in - Secure-EL1. BL31 will exit to BL32 using the asynchronous method by - calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``. - - When the BL32 has completed initialization at Secure-EL1, it returns to - BL31 by issuing an SMC, using a Function ID allocated to the SPD. 
On - receipt of this SMC, the SPD service handler should switch the CPU context - from trusted to normal world and use the ``bl31_set_next_image_type()`` and - ``bl31_prepare_next_image_entry()`` functions to set up the initial return to - the normal world firmware BL33. On return from the handler the framework - will exit to EL2 and run BL33. - -#. The BL32 setup function registers an initialization function using - ``bl31_register_bl32_init()`` which provides a SPD-defined mechanism to - invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32 - entrypoint. - NOTE: The Test SPD service included with TF-A provides one implementation - of such a mechanism. - - On completion BL32 returns control to BL31 via a SMC, and on receipt the - SPD service handler invokes the synchronous call return mechanism to return - to the BL32 initialization function. On return from this function, - ``bl31_main()`` will set up the return to the normal world firmware BL33 and - continue the boot process in the normal world. - -Crash Reporting in BL31 ------------------------ - -BL31 implements a scheme for reporting the processor state when an unhandled -exception is encountered. The reporting mechanism attempts to preserve all the -register contents and report it via a dedicated UART (PL011 console). BL31 -reports the general purpose, EL3, Secure EL1 and some EL2 state registers. - -A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via -the per-CPU pointer cache. The implementation attempts to minimise the memory -required for this feature. The file ``crash_reporting.S`` contains the -implementation for crash reporting. - -The sample crash output is shown below. 
- -:: - - x0 :0x000000004F00007C - x1 :0x0000000007FFFFFF - x2 :0x0000000004014D50 - x3 :0x0000000000000000 - x4 :0x0000000088007998 - x5 :0x00000000001343AC - x6 :0x0000000000000016 - x7 :0x00000000000B8A38 - x8 :0x00000000001343AC - x9 :0x00000000000101A8 - x10 :0x0000000000000002 - x11 :0x000000000000011C - x12 :0x00000000FEFDC644 - x13 :0x00000000FED93FFC - x14 :0x0000000000247950 - x15 :0x00000000000007A2 - x16 :0x00000000000007A4 - x17 :0x0000000000247950 - x18 :0x0000000000000000 - x19 :0x00000000FFFFFFFF - x20 :0x0000000004014D50 - x21 :0x000000000400A38C - x22 :0x0000000000247950 - x23 :0x0000000000000010 - x24 :0x0000000000000024 - x25 :0x00000000FEFDC868 - x26 :0x00000000FEFDC86A - x27 :0x00000000019EDEDC - x28 :0x000000000A7CFDAA - x29 :0x0000000004010780 - x30 :0x000000000400F004 - scr_el3 :0x0000000000000D3D - sctlr_el3 :0x0000000000C8181F - cptr_el3 :0x0000000000000000 - tcr_el3 :0x0000000080803520 - daif :0x00000000000003C0 - mair_el3 :0x00000000000004FF - spsr_el3 :0x00000000800003CC - elr_el3 :0x000000000400C0CC - ttbr0_el3 :0x00000000040172A0 - esr_el3 :0x0000000096000210 - sp_el3 :0x0000000004014D50 - far_el3 :0x000000004F00007C - spsr_el1 :0x0000000000000000 - elr_el1 :0x0000000000000000 - spsr_abt :0x0000000000000000 - spsr_und :0x0000000000000000 - spsr_irq :0x0000000000000000 - spsr_fiq :0x0000000000000000 - sctlr_el1 :0x0000000030C81807 - actlr_el1 :0x0000000000000000 - cpacr_el1 :0x0000000000300000 - csselr_el1 :0x0000000000000002 - sp_el1 :0x0000000004028800 - esr_el1 :0x0000000000000000 - ttbr0_el1 :0x000000000402C200 - ttbr1_el1 :0x0000000000000000 - mair_el1 :0x00000000000004FF - amair_el1 :0x0000000000000000 - tcr_el1 :0x0000000000003520 - tpidr_el1 :0x0000000000000000 - tpidr_el0 :0x0000000000000000 - tpidrro_el0 :0x0000000000000000 - dacr32_el2 :0x0000000000000000 - ifsr32_el2 :0x0000000000000000 - par_el1 :0x0000000000000000 - far_el1 :0x0000000000000000 - afsr0_el1 :0x0000000000000000 - afsr1_el1 :0x0000000000000000 - 
contextidr_el1 :0x0000000000000000 - vbar_el1 :0x0000000004027000 - cntp_ctl_el0 :0x0000000000000000 - cntp_cval_el0 :0x0000000000000000 - cntv_ctl_el0 :0x0000000000000000 - cntv_cval_el0 :0x0000000000000000 - cntkctl_el1 :0x0000000000000000 - sp_el0 :0x0000000004010780 - -Guidelines for Reset Handlers ------------------------------ - -TF-A implements a framework that allows CPU and platform ports to perform -actions very early after a CPU is released from reset in both the cold and warm -boot paths. This is done by calling the ``reset_handler()`` function in both -the BL1 and BL31 images. It in turn calls the platform and CPU specific reset -handling functions. - -Details for implementing a CPU specific reset handler can be found in -Section 8. Details for implementing a platform specific reset handler can be -found in the `Porting Guide`_ (see the ``plat_reset_handler()`` function). - -When adding functionality to a reset handler, keep in mind that if a different -reset handling behavior is required between the first and the subsequent -invocations of the reset handling code, this should be detected at runtime. -In other words, the reset handler should be able to detect whether an action has -already been performed and act as appropriate. Possible courses of action include -skipping the action the second time, or undoing and redoing it. - -Configuring secure interrupts ------------------------------ - -The GIC driver is responsible for performing initial configuration of secure -interrupts on the platform. To this end, the platform is expected to provide the -GIC driver (either GICv2 or GICv3, as selected by the platform) with the -interrupt configuration during the driver initialisation. - -Secure interrupt configurations are specified in an array of secure interrupt -properties. In this scheme, in both GICv2 and GICv3 driver data structures, the -``interrupt_props`` member points to an array of interrupt properties.
Each -element of the array specifies the interrupt number and its attributes -(priority, group, configuration). Each element of the array shall be populated -by the macro ``INTR_PROP_DESC()``. The macro takes the following arguments: - -- 10-bit interrupt number, - -- 8-bit interrupt priority, - -- Interrupt type (one of ``INTR_TYPE_EL3``, ``INTR_TYPE_S_EL1``, - ``INTR_TYPE_NS``), - -- Interrupt configuration (either ``GIC_INTR_CFG_LEVEL`` or - ``GIC_INTR_CFG_EDGE``). - -CPU specific operations framework ---------------------------------- - -Certain aspects of the Armv8-A architecture are implementation defined, -that is, certain behaviours are not architecturally defined, but must be -defined and documented by individual processor implementations. TF-A -implements a framework which categorises the common implementation defined -behaviours and allows a processor to export its implementation of that -behaviour. The categories are: - -#. Processor specific reset sequence. - -#. Processor specific power down sequences. - -#. Processor specific register dumping as a part of crash reporting. - -#. Errata status reporting. - -Each of the above categories fulfils a different requirement. - -#. allows any processor specific initialization before the caches and MMU - are turned on, like implementation of errata workarounds, entry into - the intra-cluster coherency domain etc. - -#. allows each processor to implement the power down sequence mandated in - its Technical Reference Manual (TRM). - -#. allows a processor to provide additional information to the developer - in the event of a crash, for example Cortex-A53 has registers which - can expose the data cache contents. - -#. allows a processor to define a function that inspects and reports the status - of all errata workarounds on that processor. - -Please note that only 2. is mandated by the TRM. 
- -The CPU specific operations framework scales to accommodate a large number of -different CPUs during power down and reset handling. The platform can specify -any CPU optimization it wants to enable for each CPU. It can also specify -the CPU errata workarounds to be applied for each CPU type during reset -handling by defining CPU errata compile time macros. Details on these macros -can be found in the `cpu-specific-build-macros.rst`_ file. - -The CPU specific operations framework depends on the ``cpu_ops`` structure which -needs to be exported for each type of CPU in the platform. It is defined in -``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields: ``midr``, -``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and -``cpu_reg_dump()``. - -The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with -suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S`` -exports the ``cpu_ops`` for the Cortex-A53 CPU. According to the platform -configuration, these CPU specific files must be included in the build by -the platform makefile. The generic CPU specific operations framework code exists -in ``lib/cpus/aarch64/cpu_helpers.S``. - -CPU specific Reset Handling -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -After a reset, the state of the CPU when it calls the generic reset handler is: -MMU turned off, both instruction and data caches turned off and not part -of any coherency domain. - -The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow -the platform to perform any system initialization required and any system -errata workarounds that need to be applied. The ``get_cpu_ops_ptr()`` reads -the current CPU midr, finds the matching ``cpu_ops`` entry in the ``cpu_ops`` -array and returns it. Note that only the part number and implementer fields -in midr are used to find the matching ``cpu_ops`` entry.
The ``reset_func()`` in -the returned ``cpu_ops`` is then invoked which executes the required reset -handling for that CPU and also any errata workarounds enabled by the platform. -This function must preserve the values of general purpose registers x20 to x29. - -Refer to Section "Guidelines for Reset Handlers" for general guidelines -regarding placement of code in a reset handler. - -CPU specific power down sequence -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -During the BL31 initialization sequence, the pointer to the matching ``cpu_ops`` -entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly -retrieved during power down sequences. - -Various CPU drivers register handlers to perform power down at certain power -levels for that specific CPU. The PSCI service, upon receiving a power down -request, determines the highest power level at which to execute power down -sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to -pick the right power down handler for the requested level. The function -retrieves ``cpu_ops`` pointer member of per-CPU data, and from that, further -retrieves ``cpu_pwr_down_ops`` array, and indexes into the required level. If the -requested power level is higher than what a CPU driver supports, the handler -registered for highest level is invoked. - -At runtime the platform hooks for power down are invoked by the PSCI service to -perform platform specific operations during a power down sequence, for example -turning off CCI coherency during a cluster power down. - -CPU specific register reporting during crash -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If the crash reporting is enabled in BL31, when a crash occurs, the crash -reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching -``cpu_ops`` using ``get_cpu_ops_ptr()`` function. 
The ``cpu_reg_dump()`` in -``cpu_ops`` is invoked, which then returns the CPU specific register values to -be reported and a pointer to the ASCII list of register names in a format -expected by the crash reporting framework. - -CPU errata status reporting -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Errata workarounds for CPUs supported in TF-A are applied during both cold and -warm boots, shortly after reset. Individual errata workarounds are enabled as -build options. Some errata workarounds have potential run-time implications; -therefore some are enabled by default, others not. Platform ports shall -override build options to enable or disable errata as appropriate. The CPU -drivers take care of applying errata workarounds that are enabled and applicable -to a given CPU. Refer to the section titled *CPU Errata Workarounds* in `CPUBM`_ -for more information. - -Functions in CPU drivers that apply errata workarounds must follow the -conventions listed below. - -The errata workaround must be authored as two separate functions: - -- One that checks for errata. This function must determine whether that errata - applies to the current CPU. Typically this involves matching the current - CPU's revision and variant against a value that's known to be affected by the - errata. If the function determines that the errata applies to this CPU, it - must return ``ERRATA_APPLIES``; otherwise, it must return - ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and - ``cpu_rev_var_ls`` may come in handy for this purpose. - -For an errata identified as ``E``, the check function must be named -``check_errata_E``. - -This function will be invoked at different times, both from assembly and from -C run time. Therefore it must follow AAPCS, and must not use the stack. - -- Another one that applies the errata workaround. This function calls the - check function described above, and applies the errata workaround if required.
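The check/apply split described above can be illustrated with a small C model. The real functions live in the assembly CPU drivers and the check function must not use the stack, so the code below is only a sketch of the logic. All ``demo_`` names, the return code values, the erratum affecting revisions up to r0p2 and the control bit are hypothetical. The MIDR field positions used (variant in bits[23:20], revision in bits[3:0]) are architectural.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder return codes; the real values are defined by TF-A headers. */
#define DEMO_ERRATA_NOT_APPLIES  0
#define DEMO_ERRATA_APPLIES      1

/*
 * Pack the MIDR variant (bits[23:20]) and revision (bits[3:0]) into one
 * byte, so that rXpY becomes 0xXY and revisions compare numerically.
 */
uint32_t demo_get_rev_var(uint32_t midr)
{
	uint32_t variant = (midr >> 20) & 0xfU;
	uint32_t revision = midr & 0xfU;

	return (variant << 4) | revision;
}

/* "Lower or same": applies when the CPU is at or below the last affected rXpY. */
int demo_rev_var_ls(uint32_t rev_var, uint32_t last_affected)
{
	return (rev_var <= last_affected) ? DEMO_ERRATA_APPLIES
					  : DEMO_ERRATA_NOT_APPLIES;
}

/* Check function for a hypothetical erratum "E" affecting up to r0p2. */
int demo_check_errata_e(uint32_t midr)
{
	return demo_rev_var_ls(demo_get_rev_var(midr), 0x02U);
}

/*
 * Apply function: consult the check function first, then perform the
 * (stubbed) workaround. Returns 1 if the workaround was applied.
 */
int demo_apply_errata_e(uint32_t midr, uint32_t *ctrl_reg)
{
	assert(ctrl_reg != NULL);
	if (demo_check_errata_e(midr) != DEMO_ERRATA_APPLIES)
		return 0;
	*ctrl_reg |= 1U;   /* hypothetical control bit set by the workaround */
	return 1;
}
```

Keeping the check separate from the apply step means the same check can be reused later by status reporting without re-running the workaround itself.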
- -CPU drivers that apply errata workarounds can optionally implement an assembly -function that reports the status of errata workarounds pertaining to that CPU. -For a driver that registers the CPU, for example, ``cpux`` via the ``declare_cpu_ops`` -macro, the errata reporting function, if it exists, must be named -``cpux_errata_report``. This function will always be called with the MMU enabled; it -must follow AAPCS and may use the stack. - -In a debug build of TF-A, on a CPU that comes out of reset, both BL1 and the -runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke the errata -status reporting function, if one exists, for that type of CPU. - -To report the status of each errata workaround, the function shall use the -assembler macro ``report_errata``, passing it: - -- The build option that enables the errata; - -- The name of the CPU: this must be the same identifier that the CPU driver - registered itself with, using ``declare_cpu_ops``; - -- And the errata identifier: the identifier must match what's used in the - errata's check function described above. - -The errata status reporting function will be called once per CPU type/errata -combination during the software's active lifetime. - -It's expected that whenever an errata workaround is submitted to TF-A, the -errata reporting function is appropriately extended to report its status as -well. - -Reporting the status of errata workarounds is for informational purposes only; it -has no functional significance. - -Memory layout of BL images --------------------------- - -Each bootloader image can be divided into two parts: - -- the static contents of the image. These are data actually stored in the - binary on the disk. In the ELF terminology, they are called ``PROGBITS`` - sections; - -- the run-time contents of the image. These are data that don't occupy any - space in the binary on the disk.
The ELF binary just contains some - metadata indicating where these data will be stored at run-time and the - corresponding sections need to be allocated and initialized at run-time. - In the ELF terminology, they are called ``NOBITS`` sections. - -All PROGBITS sections are grouped together at the beginning of the image, -followed by all NOBITS sections. This is true for all TF-A images and it is -governed by the linker scripts. This ensures that the raw binary images are -as small as possible. If a NOBITS section was inserted in between PROGBITS -sections then the resulting binary file would contain zero bytes in place of -this NOBITS section, making the image unnecessarily bigger. Smaller images -allow faster loading from the FIP to the main memory. - -Linker scripts and symbols -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Each bootloader stage image layout is described by its own linker script. The -linker scripts export some symbols into the program symbol table. Their values -correspond to particular addresses. TF-A code can refer to these symbols to -figure out the image memory layout. - -Linker symbols follow the following naming convention in TF-A. - -- ``__<SECTION>_START__`` - - Start address of a given section named ``<SECTION>``. - -- ``__<SECTION>_END__`` - - End address of a given section named ``<SECTION>``. If there is an alignment - constraint on the section's end address then ``__<SECTION>_END__`` corresponds - to the end address of the section's actual contents, rounded up to the right - boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the - actual end address of the section's contents. - -- ``__<SECTION>_UNALIGNED_END__`` - - End address of a given section named ``<SECTION>`` without any padding or - rounding up due to some alignment constraint. - -- ``__<SECTION>_SIZE__`` - - Size (in bytes) of a given section named ``<SECTION>``. 
If there is an - alignment constraint on the section's end address then ``__<SECTION>_SIZE__`` - corresponds to the size of the section's actual contents, rounded up to the - right boundary. In other words, ``__<SECTION>_SIZE__ = __<SECTION>_END__ - __<SECTION>_START__``. Refer to the value of ``__<SECTION>_UNALIGNED_SIZE__`` - to know the actual size of the section's contents. - -- ``__<SECTION>_UNALIGNED_SIZE__`` - - Size (in bytes) of a given section named ``<SECTION>`` without any padding or - rounding up due to some alignment constraint. In other words, - ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``. - -Some of the linker symbols are mandatory as TF-A code relies on them to be -defined. They are listed in the following subsections. Some of them must be -provided for each bootloader stage and some are specific to a given bootloader -stage. - -The linker scripts define some extra, optional symbols. They are not actually -used by any code but they help in understanding the bootloader images' memory -layout as they are easy to spot in the link map files. - -Common linker symbols -^^^^^^^^^^^^^^^^^^^^^ - -All BL images share the following requirements: - -- The BSS section must be zero-initialised before executing any C code. -- The coherent memory section (if enabled) must be zero-initialised as well. -- The MMU setup code needs to know the extents of the coherent and read-only - memory regions to set the right memory attributes. When - ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the - read-only memory region is divided between code and data. - -The following linker symbols are defined for this purpose: - -- ``__BSS_START__`` -- ``__BSS_SIZE__`` -- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary. -- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary.
-- ``__COHERENT_RAM_UNALIGNED_SIZE__`` -- ``__RO_START__`` -- ``__RO_END__`` -- ``__TEXT_START__`` -- ``__TEXT_END__`` -- ``__RODATA_START__`` -- ``__RODATA_END__`` - -BL1's linker symbols -^^^^^^^^^^^^^^^^^^^^ - -BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and -it is entirely executed in place but it needs some read-write memory for its -mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be -relocated from ROM to RAM before executing any C code. - -The following additional linker symbols are defined for BL1: - -- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code - and ``.data`` section in ROM. -- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be - aligned on a 16-byte boundary. -- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be - copied over. Must be aligned on a 16-byte boundary. -- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM). -- ``__BL1_RAM_START__`` Start address of BL1 read-write data. -- ``__BL1_RAM_END__`` End address of BL1 read-write data. - -How to choose the right base addresses for each bootloader stage image -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -There is currently no support for dynamic image loading in TF-A. This means -that all bootloader images need to be linked against their ultimate runtime -locations and the base addresses of each image must be chosen carefully such -that images don't overlap each other in an undesired way. As the code grows, -the base addresses might need adjustments to cope with the new memory layout. - -The memory layout is completely specific to the platform and so there is no -general recipe for choosing the right base addresses for each bootloader image. -However, there are tools to aid in understanding the memory layout. 
These are -the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>`` -being the stage bootloader. They provide a detailed view of the memory usage of -each image. Among other useful information, they provide the end address of -each image. - -- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address. -- ``bl2.map`` link map file provides ``__BL2_END__`` address. -- ``bl31.map`` link map file provides ``__BL31_END__`` address. -- ``bl32.map`` link map file provides ``__BL32_END__`` address. - -For each bootloader image, the platform code must provide its start address -as well as a limit address that it must not overstep. The latter is used in the -linker scripts to check that the image doesn't grow past that address. If that -happens, the linker will issue a message similar to the following: - -:: - - aarch64-none-elf-ld: BLx has exceeded its limit. - -Additionally, if the platform memory layout implies some image overlaying like -on FVP, BL31 and TSP need to know the limit address that their PROGBITS -sections must not overstep. The platform code must provide those. - -TF-A does not provide any mechanism to verify at boot time that the memory -to load a new image is free to prevent overwriting a previously loaded image. -The platform must specify the memory available in the system for all the -relevant BL images to be loaded. - -For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will -return the region defined by the platform where BL1 intends to load BL2. The -``load_image()`` function performs bounds check for the image size based on the -base and maximum image size provided by the platforms. Platforms must take -this behaviour into account when defining the base/size for each of the images. 
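The size check and the overlap concern described above can be sketched in C as follows. This is a simplified model and not the TF-A ``load_image()`` implementation: the ``image_region`` structure and both helper names are hypothetical, and the example addresses used with them are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of where an image may be loaded. */
struct image_region {
	uintptr_t base;      /* platform-provided load address */
	size_t    max_size;  /* platform-provided maximum image size */
};

/*
 * Return 0 if an image of 'image_size' bytes fits in the region,
 * -1 otherwise. Regions that would wrap the address space are rejected.
 */
static int image_fits(const struct image_region *r, size_t image_size)
{
	if (image_size > r->max_size)
		return -1;
	if (r->max_size > UINTPTR_MAX - r->base)
		return -1;	/* base + max_size would overflow */
	return 0;
}

/*
 * Return 1 if two (already validated) regions overlap. Nothing checks this
 * at boot time, so a platform author has to rule out unintended overlaps
 * when choosing base addresses and sizes.
 */
static int regions_overlap(const struct image_region *a,
			   const struct image_region *b)
{
	return (a->base < b->base + b->max_size) &&
	       (b->base < a->base + a->max_size);
}
```

A check like ``regions_overlap()`` is most useful at design time, for example in a host-side script or a build-time assertion, because deliberate overlays (such as the EL3 runtime NOBITS sections reusing BL1 and BL2 memory on Arm platforms) have to be allowed explicitly.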
- -Memory layout on Arm development platforms -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The following list describes the memory layout on the Arm development platforms: - -- A 4 KB page of shared memory is used for communication between Trusted - Firmware and the platform's power controller. This is located at the base of - Trusted SRAM. The amount of Trusted SRAM available to load the bootloader - images is reduced by the size of the shared memory. - - The shared memory is used to store the CPUs' entrypoint mailbox. On Juno, - this is also used for the MHU payload when passing messages to and from the - SCP. - -- Another 4 KB page is reserved for passing the memory layout between BL1 and BL2 - and also the dynamic firmware configurations. - -- On FVP, BL1 initially resides in the Trusted ROM at address ``0x0``. On - Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write - data are relocated to the top of Trusted SRAM at runtime. - -- BL2 is loaded below the BL1 read-write data. - -- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP_MIN), - is loaded at the top of the Trusted SRAM, such that its NOBITS sections will - overwrite BL1 R/W data and BL2. This implies that BL1 global variables - remain valid only until execution reaches the EL3 Runtime Software entry - point during a cold boot. - -- On Juno, SCP_BL2 is loaded temporarily into the EL3 Runtime Software memory - region and transferred to the SCP before being overwritten by EL3 Runtime - Software. - -- BL32 (for AArch64) can be loaded in one of the following locations: - - - Trusted SRAM - - Trusted DRAM (FVP only) - - Secure region of DRAM (top 16 MB of DRAM configured by the TrustZone - controller) - - When BL32 (for AArch64) is loaded into Trusted SRAM, it is loaded below - BL31. - -The location of the BL32 image will result in different memory maps. This is -illustrated for both FVP and Juno in the following diagrams, using the TSP as -an example.
- -Note: Loading the BL32 image in TZC secured DRAM doesn't change the memory -layout of the other images in Trusted SRAM. - -The CONFIG section in the memory layouts shown below contains: - -:: - - +--------------------+ - |bl2_mem_params_descs| - |--------------------| - | fw_configs | - +--------------------+ - -``bl2_mem_params_descs`` contains parameters passed from BL2 to the next -BL image during boot. - -``fw_configs`` includes soc_fw_config, tos_fw_config and tb_fw_config. - -**FVP with TSP in Trusted SRAM with firmware configs:** -(These diagrams only cover the AArch64 case) - -:: - - DRAM - 0xffffffff +----------+ - : : - |----------| - |HW_CONFIG | - 0x83000000 |----------| (non-secure) - | | - 0x80000000 +----------+ - - Trusted SRAM - 0x04040000 +----------+ loaded by BL2 +----------------+ - | BL1 (rw) | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< | BL31 NOBITS | - | BL2 | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< |----------------| - | | <<<<<<<<<<<<< | BL31 PROGBITS | - | | <<<<<<<<<<<<< |----------------| - | | <<<<<<<<<<<<< | BL32 | - 0x04002000 +----------+ +----------------+ - | CONFIG | - 0x04001000 +----------+ - | Shared | - 0x04000000 +----------+ - - Trusted ROM - 0x04000000 +----------+ - | BL1 (ro) | - 0x00000000 +----------+ - -**FVP with TSP in Trusted DRAM with firmware configs (default option):** - -:: - - DRAM - 0xffffffff +--------------+ - : : - |--------------| - | HW_CONFIG | - 0x83000000 |--------------| (non-secure) - | | - 0x80000000 +--------------+ - - Trusted DRAM - 0x08000000 +--------------+ - | BL32 | - 0x06000000 +--------------+ - - Trusted SRAM - 0x04040000 +--------------+ loaded by BL2 +----------------+ - | BL1 (rw) | <<<<<<<<<<<<< | | - |--------------| <<<<<<<<<<<<< | BL31 NOBITS | - | BL2 | <<<<<<<<<<<<< | | - |--------------| <<<<<<<<<<<<< |----------------| - | | <<<<<<<<<<<<< | BL31 PROGBITS | - | | +----------------+ - +--------------+ - | CONFIG | - 0x04001000 +--------------+ - | Shared | - 0x04000000
+--------------+ - - Trusted ROM - 0x04000000 +--------------+ - | BL1 (ro) | - 0x00000000 +--------------+ - -**FVP with TSP in TZC-Secured DRAM with firmware configs :** - -:: - - DRAM - 0xffffffff +----------+ - | BL32 | (secure) - 0xff000000 +----------+ - | | - |----------| - |HW_CONFIG | - 0x83000000 |----------| (non-secure) - | | - 0x80000000 +----------+ - - Trusted SRAM - 0x04040000 +----------+ loaded by BL2 +----------------+ - | BL1 (rw) | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< | BL31 NOBITS | - | BL2 | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< |----------------| - | | <<<<<<<<<<<<< | BL31 PROGBITS | - | | +----------------+ - 0x04002000 +----------+ - | CONFIG | - 0x04001000 +----------+ - | Shared | - 0x04000000 +----------+ - - Trusted ROM - 0x04000000 +----------+ - | BL1 (ro) | - 0x00000000 +----------+ - -**Juno with BL32 in Trusted SRAM :** - -:: - - Flash0 - 0x0C000000 +----------+ - : : - 0x0BED0000 |----------| - | BL1 (ro) | - 0x0BEC0000 |----------| - : : - 0x08000000 +----------+ BL31 is loaded - after SCP_BL2 has - Trusted SRAM been sent to SCP - 0x04040000 +----------+ loaded by BL2 +----------------+ - | BL1 (rw) | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< | BL31 NOBITS | - | BL2 | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< |----------------| - | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | - |----------| <<<<<<<<<<<<< |----------------| - | | <<<<<<<<<<<<< | BL32 | - | | +----------------+ - | | - 0x04001000 +----------+ - | MHU | - 0x04000000 +----------+ - -**Juno with BL32 in TZC-secured DRAM :** - -:: - - DRAM - 0xFFE00000 +----------+ - | BL32 | (secure) - 0xFF000000 |----------| - | | - : : (non-secure) - | | - 0x80000000 +----------+ - - Flash0 - 0x0C000000 +----------+ - : : - 0x0BED0000 |----------| - | BL1 (ro) | - 0x0BEC0000 |----------| - : : - 0x08000000 +----------+ BL31 is loaded - after SCP_BL2 has - Trusted SRAM been sent to SCP - 0x04040000 +----------+ loaded by BL2 +----------------+ - | BL1 (rw) 
| <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< | BL31 NOBITS | - | BL2 | <<<<<<<<<<<<< | | - |----------| <<<<<<<<<<<<< |----------------| - | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | - |----------| +----------------+ - 0x04001000 +----------+ - | MHU | - 0x04000000 +----------+ - -Library at ROM ---------------- - -Please refer to the `ROMLIB Design`_ document. - -Firmware Image Package (FIP) ----------------------------- - -Using a Firmware Image Package (FIP) allows for packing bootloader images (and -potentially other payloads) into a single archive that can be loaded by TF-A -from non-volatile platform storage. A driver to load images from a FIP has -been added to the storage layer and allows a package to be read from supported -platform storage. A tool to create Firmware Image Packages is also provided -and described below. - -Firmware Image Package layout -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The FIP layout consists of a table of contents (ToC) followed by payload data. -The ToC itself has a header followed by one or more table entries. The ToC is -terminated by an end marker entry, and since the size of the ToC is 0 bytes, -the offset equals the total size of the FIP file. All ToC entries describe some -payload data that has been appended to the end of the binary package. With the -information provided in the ToC entry the corresponding payload data can be -retrieved. - -:: - - ------------------ - | ToC Header | - |----------------| - | ToC Entry 0 | - |----------------| - | ToC Entry 1 | - |----------------| - | ToC End Marker | - |----------------| - | | - | Data 0 | - | | - |----------------| - | | - | Data 1 | - | | - ------------------ - -The ToC header and entry formats are described in the header file -``include/tools_share/firmware_image_package.h``. This file is used by both the -tool and TF-A. - -The ToC header has the following fields: - -:: - - `name`: The name of the ToC. This is currently used to validate the header. 
- `serial_number`: A non-zero number provided by the creation tool - `flags`: Flags associated with this data. - Bits 0-31: Reserved - Bits 32-47: Platform defined - Bits 48-63: Reserved - -A ToC entry has the following fields: - -:: - - `uuid`: All files are referred to by a pre-defined Universally Unique - IDentifier [UUID] . The UUIDs are defined in - `include/tools_share/firmware_image_package.h`. The platform translates - the requested image name into the corresponding UUID when accessing the - package. - `offset_address`: The offset address at which the corresponding payload data - can be found. The offset is calculated from the ToC base address. - `size`: The size of the corresponding payload data in bytes. - `flags`: Flags associated with this entry. None are yet defined. - -Firmware Image Package creation tool -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The FIP creation tool can be used to pack specified images into a binary -package that can be loaded by TF-A from platform storage. The tool currently -only supports packing bootloader images. Additional image definitions can be -added to the tool as required. - -The tool can be found in ``tools/fiptool``. - -Loading from a Firmware Image Package (FIP) -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The Firmware Image Package (FIP) driver can load images from a binary package on -non-volatile platform storage. For the Arm development platforms, this is -currently NOR FLASH. - -Bootloader images are loaded according to the platform policy as specified by -the function ``plat_get_image_source()``. For the Arm development platforms, this -means the platform will attempt to load images from a Firmware Image Package -located at the start of NOR FLASH0. - -The Arm development platforms' policy is to only allow loading of a known set of -images. The platform policy can be modified to allow additional images. 
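The ToC layout described earlier in this section can be walked with a few lines of C. The following is a minimal sketch and not the TF-A FIP driver: the structure and function names are hypothetical, the UUID is treated as 16 raw bytes, the expected ToC name is passed in by the caller, and the end marker is assumed to be an entry whose UUID is all zeroes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of the ToC header fields listed above. */
struct toc_header {
	uint32_t name;
	uint32_t serial_number;
	uint64_t flags;
};

/* Simplified model of a ToC entry; the UUID is treated as 16 raw bytes. */
struct toc_entry {
	uint8_t  uuid[16];
	uint64_t offset_address;  /* measured from the ToC base address */
	uint64_t size;            /* payload size in bytes */
	uint64_t flags;
};

/*
 * Look up 'uuid' in a FIP held in memory at 'fip' ('fip_size' bytes).
 * Returns a pointer to the payload and stores its size in '*size', or
 * returns NULL if the entry is missing or the ToC is malformed.
 * Assumption: the end marker is an entry whose UUID is all zeroes.
 */
static const uint8_t *fip_find(const uint8_t *fip, size_t fip_size,
			       uint32_t expected_name,
			       const uint8_t uuid[16], uint64_t *size)
{
	static const uint8_t end_uuid[16];
	struct toc_header hdr;
	struct toc_entry e;
	size_t pos = sizeof(hdr);

	if (fip_size < sizeof(hdr))
		return NULL;
	memcpy(&hdr, fip, sizeof(hdr));
	if (hdr.name != expected_name || hdr.serial_number == 0)
		return NULL;

	while (fip_size - pos >= sizeof(e)) {
		memcpy(&e, fip + pos, sizeof(e));
		if (memcmp(e.uuid, end_uuid, sizeof(end_uuid)) == 0)
			return NULL;		/* end marker reached */
		if (memcmp(e.uuid, uuid, sizeof(e.uuid)) == 0) {
			if (e.offset_address > fip_size ||
			    e.size > fip_size - e.offset_address)
				return NULL;	/* payload out of bounds */
			*size = e.size;
			return fip + e.offset_address;
		}
		pos += sizeof(e);
	}
	return NULL;
}
```

Copying each header and entry out with ``memcpy`` avoids unaligned accesses into the raw buffer, and the bounds check on ``offset_address`` and ``size`` keeps a corrupt ToC from pointing outside the package.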
- -Use of coherent memory in TF-A ------------------------------- - -There might be loss of coherency when physical memory with mismatched -shareability, cacheability and memory attributes is accessed by multiple CPUs -(refer to section B2.9 of `Arm ARM`_ for more details). This possibility occurs -in TF-A during power up/down sequences when coherency, MMU and caches are -turned on/off incrementally. - -TF-A defines coherent memory as a region of memory with Device nGnRE attributes -in the translation tables. The translation granule size in TF-A is 4KB. This -is the smallest possible size of the coherent memory region. - -By default, all data structures which are susceptible to accesses with -mismatched attributes from various CPUs are allocated in a coherent memory -region (refer to section 2.1 of `Porting Guide`_). The coherent memory region -accesses are Outer Shareable, non-cacheable and they can be accessed -with the Device nGnRE attributes when the MMU is turned on. Hence, at the -expense of at least an extra page of memory, TF-A is able to work around -coherency issues due to mismatched memory attributes. - -The alternative to the above approach is to allocate the susceptible data -structures in Normal WriteBack WriteAllocate Inner shareable memory. This -approach requires the data structures to be designed so that it is possible to -work around the issue of mismatched memory attributes by performing software -cache maintenance on them. - -Disabling the use of coherent memory in TF-A -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -It might be desirable to avoid the cost of allocating coherent memory on -platforms which are memory constrained. TF-A enables inclusion of coherent -memory in firmware images through the build flag ``USE_COHERENT_MEM``. -This flag is enabled by default. It can be disabled to choose the second -approach described above. 
- -The below sections analyze the data structures allocated in the coherent memory -region and the changes required to allocate them in normal memory. - -Coherent memory usage in PSCI implementation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain -tree information for state management of power domains. By default, this data -structure is allocated in the coherent memory region in TF-A because it can be -accessed by multiple CPUs, either with caches enabled or disabled. - -.. code:: c - - typedef struct non_cpu_pwr_domain_node { - /* - * Index of the first CPU power domain node level 0 which has this node - * as its parent. - */ - unsigned int cpu_start_idx; - - /* - * Number of CPU power domains which are siblings of the domain indexed - * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx - * -> cpu_start_idx + ncpus' have this node as their parent. - */ - unsigned int ncpus; - - /* - * Index of the parent power domain node. - */ - unsigned int parent_node; - - plat_local_state_t local_state; - - unsigned char level; - - /* For indexing the psci_lock array*/ - unsigned char lock_index; - } non_cpu_pd_node_t; - -In order to move this data structure to normal memory, the use of each of its -fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``, ``parent_node`` -``level`` and ``lock_index`` are only written once during cold boot. Hence removing -them from coherent memory involves only doing a clean and invalidate of the -cache lines after these fields are written. - -The field ``local_state`` can be concurrently accessed by multiple CPUs in -different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure -mutual exclusion to this field and a clean and invalidate is needed after it -is written. 
- -Bakery lock data -~~~~~~~~~~~~~~~~ - -The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory -and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is -defined as follows: - -.. code:: c - - typedef struct bakery_lock { - /* - * The lock_data is a bit-field of 2 members: - * Bit[0] : choosing. This field is set when the CPU is - * choosing its bakery number. - * Bits[1 - 15] : number. This is the bakery number allocated. - */ - volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS]; - } bakery_lock_t; - -It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU -fields can be read by all CPUs but only written to by the owning CPU. - -Depending upon the data cache line size, the per-CPU fields of the -``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line. -These per-CPU fields can be read and written during lock contention by multiple -CPUs with mismatched memory attributes. Since these fields are a part of the -lock implementation, they do not have access to any other locking primitive to -safeguard against the resulting coherency issues. As a result, simple software -cache maintenance is not enough to allow them to be allocated in normal memory. Consider -the following example. - -CPU0 updates its per-CPU field with data cache enabled. This write updates a -local cache line which contains a copy of the fields for other CPUs as well. Now -CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache -disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of -its field in any other cache line in the system. This operation will invalidate -the update made by CPU0 as well. - -To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure -has been redesigned. The changes utilise the characteristic of Lamport's Bakery -algorithm mentioned earlier. The bakery_lock structure only allocates the memory -for a single CPU.
The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery locks -needed for a CPU into a section ``bakery_lock``. The linker allocates the memory -for other cores by using the total size allocated for the bakery_lock section -and multiplying it with (PLATFORM_CORE_COUNT - 1). This enables software to -perform software cache maintenance on the lock data structure without running -into coherency issues associated with mismatched attributes. - -The bakery lock data structure ``bakery_info_t`` is defined for use when -``USE_COHERENT_MEM`` is disabled as follows: - -.. code:: c - - typedef struct bakery_info { - /* - * The lock_data is a bit-field of 2 members: - * Bit[0] : choosing. This field is set when the CPU is - * choosing its bakery number. - * Bits[1 - 15] : number. This is the bakery number allocated. - */ - volatile uint16_t lock_data; - } bakery_info_t; - -The ``bakery_info_t`` represents a single per-CPU field of one lock and -the combination of corresponding ``bakery_info_t`` structures for all CPUs in the -system represents the complete bakery lock. The view in memory for a system -with n bakery locks are: - -:: - - bakery_lock section start - |----------------| - | `bakery_info_t`| <-- Lock_0 per-CPU field - | Lock_0 | for CPU0 - |----------------| - | `bakery_info_t`| <-- Lock_1 per-CPU field - | Lock_1 | for CPU0 - |----------------| - | .... | - |----------------| - | `bakery_info_t`| <-- Lock_N per-CPU field - | Lock_N | for CPU0 - ------------------ - | XXXXX | - | Padding to | - | next Cache WB | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate - | Granule | continuous memory for remaining CPUs. - ------------------ - | `bakery_info_t`| <-- Lock_0 per-CPU field - | Lock_0 | for CPU1 - |----------------| - | `bakery_info_t`| <-- Lock_1 per-CPU field - | Lock_1 | for CPU1 - |----------------| - | .... 
| - |----------------| - | `bakery_info_t`| <-- Lock_N per-CPU field - | Lock_N | for CPU1 - ------------------ - | XXXXX | - | Padding to | - | next Cache WB | - | Granule | - ------------------ - -Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an -operation on Lock_N, the corresponding ``bakery_info_t`` in both the CPU0 and CPU1 -``bakery_lock`` sections needs to be fetched and appropriate cache operations need -to be performed for each access. - -On Arm Platforms, bakery locks are used in PSCI (``psci_locks``) and the power controller -driver (``arm_lock``). - -Non Functional Impact of removing coherent memory -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Removal of the coherent memory region leads to the additional software overhead -of performing cache maintenance for the affected data structures. However, since -the memory where the data structures are allocated is cacheable, the overhead is -mostly mitigated by an increase in performance. - -There is however a performance impact for bakery locks, due to: - -- Additional cache maintenance operations, and -- Multiple cache line reads for each lock operation, since the bakery locks - for each CPU are distributed across different cache lines. - -The implementation has been optimized to minimize this additional overhead. -Measurements indicate that when bakery locks are allocated in Normal memory, the -minimum latency of acquiring a lock is on average 3-4 microseconds whereas -in Device memory the same is 2 microseconds. The measurements were done on the -Juno Arm development platform. - -As mentioned earlier, almost a page of memory can be saved by disabling -``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide -whether coherent memory should be used. If a platform disables -``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can -optionally define the macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the -`Porting Guide`_).
Refer to the reference platform code for examples. - -Isolating code and read-only data on separate memory pages ----------------------------------------------------------- - -In the Armv8-A VMSA, translation table entries include fields that define the -properties of the target memory region, such as its access permissions. The -smallest unit of memory that can be addressed by a translation table entry is -a memory page. Therefore, if software needs to set different permissions on two -memory regions then it needs to map them using different memory pages. - -The default memory layout for each BL image is as follows: - -:: - - | ... | - +-------------------+ - | Read-write data | - +-------------------+ Page boundary - | <Padding> | - +-------------------+ - | Exception vectors | - +-------------------+ 2 KB boundary - | <Padding> | - +-------------------+ - | Read-only data | - +-------------------+ - | Code | - +-------------------+ BLx_BASE - -Note: The 2 KB alignment for the exception vectors is an architectural -requirement. - -The read-write data start on a new memory page so that they can be mapped with -read-write permissions, whereas the code and read-only data below are configured -as read-only. - -However, the read-only data are not aligned on a page boundary. They are -contiguous with the code. Therefore, the end of the code section and the beginning -of the read-only data might share a memory page. This forces both to be -mapped with the same memory attributes. As the code needs to be executable, this -means that the read-only data stored on the same memory page as the code are -executable as well. This could potentially be exploited as part of a security -attack. - -TF-A provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and -read-only data on separate memory pages. This in turn allows independent control -of the access permissions for the code and read-only data.
In this case, -platform code gets a finer-grained view of the image layout and can -appropriately map the code region as executable and the read-only data as -execute-never. - -This has an impact on memory footprint, as padding bytes need to be introduced -between the code and read-only data to ensure the segregation of the two. To -limit the memory cost, this flag also changes the memory layout such that the -code and exception vectors are now contiguous, like so: - -:: - - | ... | - +-------------------+ - | Read-write data | - +-------------------+ Page boundary - | <Padding> | - +-------------------+ - | Read-only data | - +-------------------+ Page boundary - | <Padding> | - +-------------------+ - | Exception vectors | - +-------------------+ 2 KB boundary - | <Padding> | - +-------------------+ - | Code | - +-------------------+ BLx_BASE - -With this more condensed memory layout, the separation of read-only data will -add zero or one page to the memory footprint of each BL image. Each platform -should consider the trade-off between memory footprint and security. - -This build flag is disabled by default, minimising memory footprint. On Arm -platforms, it is enabled. - -Publish and Subscribe Framework -------------------------------- - -The Publish and Subscribe Framework allows EL3 components to define and publish -events, to which other EL3 components can subscribe. - -The following macros are provided by the framework: - -- ``REGISTER_PUBSUB_EVENT(event)``: Defines an event, and takes one argument, - the event name, which must be a valid C identifier. All calls to - ``REGISTER_PUBSUB_EVENT`` macro must be placed in the file - ``pubsub_events.h``. - -- ``PUBLISH_EVENT_ARG(event, arg)``: Publishes a defined event, by iterating - subscribed handlers and calling them in turn. The handlers will be passed the - parameter ``arg``. The expected use-case is to broadcast an event. 
- -- ``PUBLISH_EVENT(event)``: Like ``PUBLISH_EVENT_ARG``, except that the value - ``NULL`` is passed to subscribed handlers. - -- ``SUBSCRIBE_TO_EVENT(event, handler)``: Registers the ``handler`` to - subscribe to ``event``. The handler will be executed whenever the ``event`` - is published. - -- ``for_each_subscriber(event, subscriber)``: Iterates through all handlers - subscribed for ``event``. ``subscriber`` must be a local variable of type - ``pubsub_cb_t *``, and will point to each subscribed handler in turn during - iteration. This macro can be used for those patterns that none of the - ``PUBLISH_EVENT_*()`` macros cover. - -Publishing an event that wasn't defined using ``REGISTER_PUBSUB_EVENT`` will -result in a build error. Subscribing to an undefined event, however, won't. - -Subscribed handlers must be of type ``pubsub_cb_t``, with the following function -signature: - -:: - - typedef void* (*pubsub_cb_t)(const void *arg); - -There may be an arbitrary number of handlers registered to the same event. The -order in which subscribed handlers are notified when that event is published is -not defined. Subscribed handlers may be executed in any order; handlers should -not assume any relative ordering amongst them. - -Publishing an event on a PE will result in subscribed handlers executing on that -PE only; it won't cause handlers to execute on a different PE. - -Note that publishing an event on a PE blocks until all the subscribed handlers -finish executing on the PE. - -TF-A generic code publishes and subscribes to some events internally. Platform -ports are discouraged from subscribing to them. These events may be withdrawn, -renamed, or have their semantics altered in the future. Platforms may however -register, publish, and subscribe to platform-specific events. - -Publish and Subscribe Example -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -A publisher that wants to publish event ``foo`` would: - -- Define the event ``foo`` in ``pubsub_events.h``.
- - :: - - REGISTER_PUBSUB_EVENT(foo); - -- Depending on the nature of the event, use one of the ``PUBLISH_EVENT_*()`` macros to - publish the event at the appropriate path and time of execution. - -A subscriber that wants to subscribe to event ``foo`` published above would -implement: - -.. code:: c - - void *foo_handler(const void *arg) - { - void *result = NULL; - - /* Do handling ... */ - - return result; - } - - SUBSCRIBE_TO_EVENT(foo, foo_handler); - - -Reclaiming the BL31 initialization code -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -A significant amount of the code used for the initialization of BL31 is never -needed again after boot time. In order to reduce the runtime memory -footprint, the memory used for this code can be reclaimed after initialization -has finished and be used for runtime data. - -The build option ``RECLAIM_INIT_CODE`` can be set to mark this boot time code -with a ``.text.init.*`` attribute which can be filtered and placed suitably -within the BL image for later reclamation by the platform. The platform can -specify the filter and the memory region for this init section in BL31 via the -``plat.ld.S`` linker script. For example, on the FVP, this section is placed -overlapping the secondary CPU stacks so that after the cold boot is done, this -memory can be reclaimed for the stacks. The init memory section is initially -mapped with ``RO``, ``EXECUTE`` attributes. After BL31 initialization has -completed, the FVP changes the attributes of this section to ``RW``, -``EXECUTE_NEVER`` allowing it to be used for runtime data. The memory attributes -are changed within the ``bl31_plat_runtime_setup`` platform hook. The init -section can be reclaimed for any data which is accessed after cold -boot initialization and it is up to the platform to make the decision.
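The attribute transition described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: ``struct init_region``, the ``ATTR_*`` bits and both functions are hypothetical names invented for illustration and are not TF-A definitions. On a real platform the remapping is done by the translation table library from within the ``bl31_plat_runtime_setup`` hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute bits for this model; not TF-A definitions. */
#define ATTR_WRITABLE   (1u << 0)
#define ATTR_EXECUTABLE (1u << 1)

/* Hypothetical descriptor of the init section placed by the linker script. */
struct init_region {
	uintptr_t base;      /* start address of the init section */
	size_t    size;      /* size of the init section in bytes */
	uint32_t  attr;      /* current mapping attributes */
	int       reclaimed; /* non-zero once handed over to runtime data */
};

/* During cold boot the section is mapped read-only and executable. */
void init_region_map_for_boot(struct init_region *r, uintptr_t base,
			      size_t size)
{
	r->base = base;
	r->size = size;
	r->attr = ATTR_EXECUTABLE; /* RO, EXECUTE */
	r->reclaimed = 0;
}

/*
 * Models the step performed once initialization has completed: the section
 * becomes read-write and execute-never so it can hold runtime data.
 * Returns 0 on success, -1 if the region is empty or already reclaimed.
 */
int init_region_reclaim(struct init_region *r)
{
	if (r->reclaimed || r->size == 0u)
		return -1;

	r->attr = ATTR_WRITABLE; /* RW, EXECUTE_NEVER */
	r->reclaimed = 1;
	return 0;
}
```

The model makes the ordering constraint explicit: the region must not be written while it is still executable, and no init code may be called once it has been reclaimed.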
- -Performance Measurement Framework ---------------------------------- - -The Performance Measurement Framework (PMF) facilitates collection of -timestamps by registered services and provides interfaces to retrieve them -from within TF-A. A platform can choose to expose appropriate SMCs to -retrieve these collected timestamps. - -By default, the global physical counter is used for the timestamp -value and is read via ``CNTPCT_EL0``. The framework also allows timestamps -captured by other CPUs to be retrieved. - -Timestamp identifier format -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -A PMF timestamp is uniquely identified across the system via the -timestamp ID or ``tid``. The ``tid`` is composed as follows: - -:: - - Bits 0-7: The local timestamp identifier. - Bits 8-9: Reserved. - Bits 10-15: The service identifier. - Bits 16-31: Reserved. - -#. The service identifier. Each PMF service is identified by a - service name and a service identifier. Both the service name and - identifier are unique within the system as a whole. - -#. The local timestamp identifier. This identifier is unique within a given - service. - -Registering a PMF service -~~~~~~~~~~~~~~~~~~~~~~~~~ - -To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h`` -is used. The arguments required are the service name, the service ID, -the total number of local timestamps to be captured and a set of flags. - -The ``flags`` field can be specified as a bitwise-OR of the following values: - -:: - - PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval. - PMF_DUMP_ENABLE: The timestamp is dumped on the serial console. - -The ``PMF_REGISTER_SERVICE()`` macro reserves memory to store captured -timestamps in a PMF specific linker section at build time. -Additionally, it defines necessary functions to capture and -retrieve a particular timestamp for the given service at runtime. - -The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF timestamps -from within TF-A.
In order to retrieve timestamps from outside of TF-A, the -``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro -accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()`` -macro but additionally supports retrieving timestamps using SMCs. - -Capturing a timestamp -~~~~~~~~~~~~~~~~~~~~~ - -PMF timestamps are stored in a per-service timestamp region. On a -system with multiple CPUs, each timestamp is captured and stored -in a per-CPU cache line aligned memory region. - -Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be -used to capture a timestamp at the location where it is used. The macro -takes the service name, a local timestamp identifier and a flag as arguments. - -The ``flags`` field argument can be zero, or ``PMF_CACHE_MAINT`` which -instructs PMF to do cache maintenance following the capture. Cache -maintenance is required if any of the service's timestamps are captured -with data cache disabled. - -To capture a timestamp in assembly code, the caller should use -``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to -calculate the address of where the timestamp would be stored. The -caller should then read ``CNTPCT_EL0`` register to obtain the timestamp -and store it at the determined address for later retrieval. - -Retrieving a timestamp -~~~~~~~~~~~~~~~~~~~~~~ - -From within TF-A, timestamps for individual CPUs can be retrieved using either -``PMF_GET_TIMESTAMP_BY_MPIDR()`` or ``PMF_GET_TIMESTAMP_BY_INDEX()`` macros. -These macros accept the CPU's MPIDR value, or its ordinal position -respectively. - -From outside TF-A, timestamps for individual CPUs can be retrieved by calling -into ``pmf_smc_handler()``. - -.. 
code:: c - - Interface : pmf_smc_handler() - Argument : unsigned int smc_fid, u_register_t x1, - u_register_t x2, u_register_t x3, - u_register_t x4, void *cookie, - void *handle, u_register_t flags - Return : uintptr_t - - smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32` - when the caller of the SMC is running in AArch32 mode - or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode. - x1: Timestamp identifier. - x2: The `mpidr` of the CPU for which the timestamp has to be retrieved. - This can be the `mpidr` of a different core to the one initiating - the SMC. In that case, service specific cache maintenance may be - required to ensure the updated copy of the timestamp is returned. - x3: A flags value that is either 0 or `PMF_CACHE_MAINT`. If - `PMF_CACHE_MAINT` is passed, then the PMF code will perform a - cache invalidate before reading the timestamp. This ensures - an updated copy is returned. - -The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused -in this implementation. - -PMF code structure -~~~~~~~~~~~~~~~~~~ - -#. ``pmf_main.c`` consists of core functions that implement service registration, - initialization, storing, dumping and retrieving timestamps. - -#. ``pmf_smc.c`` contains the SMC handling for registered PMF services. - -#. ``pmf.h`` contains the public interface to Performance Measurement Framework. - -#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in - assembly code. - -#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``. - -Armv8-A Architecture Extensions -------------------------------- - -TF-A makes use of Armv8-A Architecture Extensions where applicable. This -section lists the usage of Architecture Extensions, and build flags -controlling them. - -In general, and unless individually mentioned, the build options -``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` select the Architecture Extension to -target when building TF-A. 
Subsequent Arm Architecture Extensions are backward -compatible with previous versions. - -The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a -valid numeric value. These build options only control whether or not -Architecture Extension-specific code is included in the build. Otherwise, TF-A -targets the base Armv8.0-A architecture; i.e. as if ``ARM_ARCH_MAJOR`` == 8 -and ``ARM_ARCH_MINOR`` == 0, which are also their respective default values. - -See also the *Summary of build options* in `User Guide`_. - -For details on the Architecture Extension and available features, please refer -to the respective Architecture Extension Supplement. - -Armv8.1-A -~~~~~~~~~ - -This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` > 8, or when -``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1. - -- The Compare and Swap instruction is used to implement spinlocks. Otherwise, - the load-/store-exclusive instruction pair is used. - -Armv8.2-A -~~~~~~~~~ - -- The presence of ARMv8.2-TTCNP is detected at runtime. When it is present, the - Common not Private (TTBRn_ELx.CnP) bit is enabled to indicate that multiple - Processing Elements in the same Inner Shareable domain use the same - translation table entries for a given stage of translation for a particular - translation regime. - -Armv8.3-A -~~~~~~~~~ - -- Pointer authentication features of Armv8.3-A are unconditionally enabled in - the Non-secure world so that lower ELs are allowed to use them without - causing a trap to EL3. - - In order to enable the Secure world to use them, ``CTX_INCLUDE_PAUTH_REGS`` - must be set to 1. This will add all pointer authentication system registers - to the context that is saved when doing a world switch. - - TF-A itself has support for pointer authentication at runtime - that can be enabled by setting both options ``ENABLE_PAUTH`` and - ``CTX_INCLUDE_PAUTH_REGS`` to 1. This enables pointer authentication in BL1, - BL2, BL31, and the TSP if it is used.
- - These options are experimental features. - - Note that Pointer Authentication is enabled for the Non-secure world irrespective - of the value of these build flags if the CPU supports it. - - If ``ARM_ARCH_MAJOR == 8`` and ``ARM_ARCH_MINOR >= 3`` the code footprint of - enabling PAuth is lower because the compiler will use the optimized - PAuth instructions rather than the backwards-compatible ones. - -Armv7-A -~~~~~~~ - -This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` == 7. - -There are several Armv7-A extensions available. The TrustZone -extension is mandatory to support the TF-A bootloader and runtime services. - -A platform implementing an Armv7-A system can define its target -Cortex-A architecture through ``ARM_CORTEX_A<x> = yes`` in its -``platform.mk`` script. For example, ``ARM_CORTEX_A15=yes`` for a -Cortex-A15 target. - -A platform can also set ``ARM_WITH_NEON=yes`` to enable NEON support. -Note that using NEON at runtime places constraints on the non-secure world context. -TF-A does not yet provide VFP context management. - -The ``ARM_CORTEX_A<x>`` and ``ARM_WITH_NEON`` directives are used to set -the toolchain target architecture directive. - -A platform may instead set the toolchain target architecture directive -directly by defining ``MARCH32_DIRECTIVE``. -For example: - -:: - - MARCH32_DIRECTIVE := -march=armv7-a - -Code Structure --------------- - -TF-A code is logically divided between the three boot loader stages mentioned -in the previous sections. The code is also divided into the following -categories (present as directories in the source code): - -- **Platform specific.** Choice of architecture specific code depends upon - the platform. -- **Common code.** This is platform and architecture agnostic code. -- **Library code.** This code comprises functionality commonly used by all - other code. The PSCI implementation and other EL3 runtime frameworks reside - as Library components.
-- **Stage specific.** Code specific to a boot stage. -- **Drivers.** -- **Services.** EL3 runtime services (e.g. SPD). Specific SPD services - reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``). - -Each boot loader stage uses code from one or more of the above mentioned -categories. Based upon the above, the code layout looks like this: - -:: - - Directory Used by BL1? Used by BL2? Used by BL31? - bl1 Yes No No - bl2 No Yes No - bl31 No No Yes - plat Yes Yes Yes - drivers Yes No Yes - common Yes Yes Yes - lib Yes Yes Yes - services No No Yes - -The build system provides a non-configurable build option ``IMAGE_BLx`` for each -boot loader stage (where x = BL stage). For example, for BL1, ``IMAGE_BL1`` will be -defined by the build system. This enables TF-A to compile certain code only -for specific boot loader stages. - -All assembler files have the ``.S`` extension. The linker source files for each -boot stage have the extension ``.ld.S``. These are processed by GCC to create the -linker scripts which have the extension ``.ld``. - -FDTs provide a description of the hardware platform and are used by the Linux -kernel at boot time. These can be found in the ``fdts`` directory. - -References ----------- - -.. [#] `Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D)`_ -.. [#] `Power State Coordination Interface PDD`_ -.. [#] `SMC Calling Convention PDD`_ -.. [#] `TF-A Interrupt Management Design guide`_. - --------------- - -*Copyright (c) 2013-2019, Arm Limited and Contributors. All rights reserved.* - -.. _Reset Design: ./reset-design.rst -.. _Porting Guide: ./porting-guide.rst -.. _Firmware Update: ./firmware-update.rst -.. _PSCI PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf -.. _SMC calling convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf -.. 
_PSCI Library integration guide: ./psci-lib-integration-guide.rst -.. _SMCCC: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf -.. _PSCI: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf -.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf -.. _here: ./psci-lib-integration-guide.rst -.. _cpu-specific-build-macros.rst: ./cpu-specific-build-macros.rst -.. _CPUBM: ./cpu-specific-build-macros.rst -.. _Arm ARM: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html -.. _User Guide: ./user-guide.rst -.. _SMC Calling Convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf -.. _TF-A Interrupt Management Design guide: ./interrupt-framework-design.rst -.. _Xlat_tables design: xlat-tables-lib-v2-design.rst -.. _Exception Handling Framework: exception-handling.rst -.. _ROMLIB Design: romlib-design.rst -.. _Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D): https://developer.arm.com/docs/den0006/latest/trusted-board-boot-requirements-client-tbbr-client-armv8-a - -.. |Image 1| image:: diagrams/rt-svc-descs-layout.png?raw=true |