#################################################
Physical attack mitigation in Trusted Firmware-M
#################################################

:Authors: Tamas Ban; David Hu
:Organization: Arm Limited
:Contact: tamas.ban@arm.com; david.hu@arm.com

************
Requirements
************
PSA Certified Level 3 Lightweight Protection Profile [1]_ requires protection
against physical attacks. This includes protection against manipulation of the
hardware and of any data, undetected manipulation of memory contents, and
physical probing on the chip's surface. The RoT detects or prevents its
operation outside the normal operating conditions (such as voltage, clock
frequency, temperature, or external energy fields) where reliability and
secure operation have not been proven or tested.

.. note::

    Mitigation against a certain level of physical attacks is a mandatory
    requirement for PSA Level 3 certification.
    The :ref:`tf-m-against-physical-attacks` discussed below do not provide
    mitigation against all the physical attacks considered in scope for PSA L3
    certification. Please check the Protection Profile document for an
    exhaustive list of requirements.

****************
Physical attacks
****************
The goal of physical attacks is to alter the expected behavior of a circuit.
This can be achieved by changing the device's normal operating conditions to
untested operating conditions. As a result, a hazard might be triggered at the
circuit level, whose impact cannot be predicted in advance but whose effect
can be observed. With frequent attempts, a weak point of the system could be
identified, and the attacker could gain access to the entire device. There is
a wide variety of physical attacks; the following is not a comprehensive list
but rather gives a taste of the possibilities:

- Inject a glitch into the device power supply or clock line.
- Operate the device outside its temperature range: cool it down or warm it
  up.
- Shoot the chip with an electromagnetic field. This can be done by passing
  current through a small coil close to the chip surface; no physical contact
  or modification of the PCB (soldering) is necessary.
- Point a laser beam at the chip surface. It could flip bits in memory or a
  register, but precise knowledge of the chip layout and design is necessary.

The required equipment and cost of these attacks vary. There are commercial
products to perform such attacks. Furthermore, they are shipped with a
scripting environment, good documentation, and a lot of examples. In general,
there are plenty of videos, research papers and blogs about fault injection
attacks. As a result, the threshold for successfully performing such attacks,
even by non-proficient attackers, gets lower over time.

*****************************************************************
Effects of physical attacks in hardware and in software execution
*****************************************************************
The change in the behavior of the hardware and software cannot be foreseen
when performing a physical attack. At the circuit level the faults manifest as
bit faults. These bit faults can cause varied effects in the behavior of the
device micro-architecture:

- The instruction decoding pipeline is flushed.
- Instructions are altered during decoding.
- Data is altered during fetch or store.
- Register contents and the program counter are altered.
- Bits are flipped in registers or memory.

These phenomena happen at random and cannot be observed directly, but their
effect can be traced in software execution. At the software level the
following can happen:

- A few instructions are skipped. This can lead to taking a different branch
  than normal.
- A corrupted CPU register or data fetch could alter the result of a
  comparison instruction, or change the value returned from a function.
- A corrupted data store could alter the configuration of peripherals.
- Very precise attacks with a laser can flip bits in any register or in
  memory.

This is a complex domain. Faults are not well understood. Different fault
models exist, but each of them targets a specific aspect of fault injection.
One of the most common and probably the most easily applicable fault model is
the instruction skip.

***********************************
Mitigation against physical attacks
***********************************
The applicability of these attacks highly depends on the device. Some devices
are more sensitive than others. Protection is possible at both the hardware
and software levels.

On the hardware level, there are chip design principles and system IPs that
are resistant to fault injection attacks. These can make it harder to perform
a successful attack, and as a result the chip might reset or erase sensitive
content. The device maker needs to consider what level of physical attack is
in scope and choose a SoC accordingly.

On top of hardware-level protection, a secondary protection layer can be
implemented in software. This approach is known as "defence in depth".

Neither hardware nor software level protection is perfect because both can be
bypassed. The combination of them provides the maximum level of protection.
However, even when both are in place, it is not certain that they provide 100%
protection against physical attacks. The best that can be achieved is to
harden the system to increase the cost of a successful attack (in terms of
time and equipment), thereby making it unprofitable to perform.

.. _phy-att-countermeasures:

Software countermeasures against physical attacks
=================================================
There are practical coding techniques which can be applied to harden software
against fault injection attacks. They significantly decrease the probability
of a successful attack; an illustrative sketch combining several of these
techniques follows the list:

- Control flow monitor

  To catch malicious modification of the expected control flow. When an
  important portion of a program is executed, a flow monitor counter is
  incremented. The program moves to the next stage only if the accumulated
  flow monitor counter is equal to an expected value.

- Default failure

  The return value variable should always contain a value indicating
  failure. Changing its value to success is done only in one protected
  flow (preferably protected by double checks).

- Complex constant

  It is hard to change a memory region or register to a pre-defined value, but
  the usual boolean values (0 or 1) are easier to manipulate.

- Redundant variables and condition checks

  To make attacks against branch conditions harder, it is recommended to check
  the relevant condition twice (it is better to have a random delay between
  the two comparisons).

- Random delay

  Successful fault injection attacks require very precise timing. Adding
  random delay to the code execution makes the timing of an attack much
  harder.

- Loop integrity check

  To avoid skipping critical loop iterations, which could weaken cryptographic
  algorithms, check the loop counter after the loop has executed to confirm
  that it indeed has the expected value.

- Duplicated execution

  Execute a critical step multiple times to prevent fault injection from
  skipping the step. To mitigate multiple consecutive fault injections, random
  delay can be inserted between duplicated executions.

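
The following sketch is only an illustration of how several of these
techniques (default failure, complex constants, a control flow counter, a loop
integrity check and a redundant check) might be combined when hardening a
sensitive comparison. The constants, the ``random_delay()`` helper and the
function itself are made up for this example and are not part of any TF-M API.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative values only: unlikely bit patterns serve as "complex
     * constants" instead of the easily flipped plain 0/1. */
    #define CHECK_SUCCESS  (0x5A5A5A5AU)
    #define CHECK_FAILURE  (0xA5A5A5A5U)

    /* Hypothetical helper, e.g. backed by a TRNG, providing a random delay. */
    extern void random_delay(void);

    uint32_t hardened_compare(const volatile uint8_t *a,
                              const volatile uint8_t *b,
                              size_t len)
    {
        /* Default failure: assume the check fails until proven otherwise. */
        volatile uint32_t result = CHECK_FAILURE;
        volatile size_t i;
        volatile size_t flow_counter = 0;

        for (i = 0; i < len; i++) {
            if (a[i] != b[i]) {
                return CHECK_FAILURE;
            }
            flow_counter++;
        }

        /* Loop integrity check: the loop must have run exactly 'len' times. */
        if ((i != len) || (flow_counter != len)) {
            return CHECK_FAILURE;
        }

        random_delay();

        /* Redundant check: repeat the decision after a random delay. */
        for (i = 0; i < len; i++) {
            if (a[i] != b[i]) {
                return CHECK_FAILURE;
            }
        }

        result = CHECK_SUCCESS;
        return result;
    }
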
These techniques should be applied in a thoughtful way. If they are applied
everywhere, they can result in messy code that makes maintenance harder. The
code must be analysed, and the sensitive parts and critical call paths must be
identified. Furthermore, these techniques increase the overall code size,
which might be an issue on constrained devices.

Currently, compilers do not provide any support for implementing these
countermeasures automatically. On the contrary, they can eliminate the
protection code during optimization. As a result, C-level protection does not
add any guarantee about the final behavior of the system. The effectiveness of
these protections highly depends on the actual compiler and the optimization
level. The compiled assembly code must be visually inspected and tested to
make sure that the proper countermeasures are in place and perform as
expected.

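
As an illustration of why this inspection matters, the sketch below relies on
the ``volatile`` qualifier to keep a redundant check alive: without it, an
optimizing compiler may legally merge the two identical comparisons into one,
silently removing the redundancy. The flag and its expected value are
hypothetical, and even with ``volatile`` the generated assembly still has to
be verified.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical security state set elsewhere during boot. */
    extern volatile uint32_t isolation_done_flag;
    #define ISOLATION_DONE  (0x3CA5C35AU)

    int is_isolation_configured(void)
    {
        if (isolation_done_flag != ISOLATION_DONE) {
            return 0;
        }

        /* Redundant check: 'volatile' forces a second, independent load,
         * so the optimizer cannot fold the two comparisons into one. */
        if (isolation_done_flag != ISOLATION_DONE) {
            return 0;
        }

        return 1;
    }
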
.. _phy-att-threat-model:

******************************************
TF-M Threat Model against physical attacks
******************************************

Physical attack target
======================
A malicious actor performs physical attacks against TF-M to retrieve assets
from the device. These assets can be sensitive data, credentials or crypto
keys. These assets are protected in TF-M by proper isolation.

For example, a malicious actor can perform the following attacks:

- Reopen the debug port or hinder its closure, then connect to the device
  with a debugger and dump memory.
- Bypass secure boot to replace the authentic firmware with a malicious image.
  Then arbitrary memory can be read.
- Assuming that secure boot cannot be bypassed, an attacker can try to
  hinder the setup of the memory isolation hardware by the TF-M
  :term:`Secure Partition Manager` (SPM) and manage to execute the non-secure
  image in secure state. Even if this is achieved, an exploitable
  vulnerability is still needed in the non-secure code which can be used to
  inject and execute arbitrary code to read the assets.
- The device might contain an unsigned binary blob next to the official
  firmware. This can be any data, not necessarily code. If an attacker manages
  to replace this data with arbitrary content (e.g. a NOP slide leading to
  malicious code), then they can try to manipulate the program counter to jump
  to this area before the memory isolation is set up.

.. _attacker-capability:

Assumptions on attacker capability
==================================
It is assumed that the attacker has the following capabilities to perform
physical attacks against devices protected by TF-M.

- Has physical access to the device.
- Is able to access the external memory, read it and possibly tamper with it.
- Is able to load arbitrary candidate images for firmware upgrade.
- Is able to make the bootloader try to upgrade the arbitrary image from the
  staging area.
- Is able to inject faults at the hardware level (voltage or power glitch, EM
  pulse, etc.) into the system.
- Precise timing of fault injection is possible once or a few times, but in
  general the more interventions are required for a successful attack, the
  harder it will be to succeed.

It is out of the scope of TF-M mitigation if an attacker is able to directly
tamper with or disclose the assets. It is assumed that an attacker has the
following technical limitations.

- No knowledge of the image signing key. Not able to sign an arbitrary image.
- Not able to directly access the chip through the debug port.
- Not able to directly access the internal memory.
- No knowledge of the layout of the die or the memory arrangement of the
  secure code, so precise attacks against specific registers or memory
  addresses are out of scope.

Physical attack scenarios against TF-M
======================================
Based on the analysis above, a malicious actor may perform physical attacks
against critical operations in the :term:`SPE` workflow and critical modules
in TF-M, to indirectly gain unauthenticated access to assets.

Those critical operations and modules either directly access the assets or
protect the assets from disclosure. They include:

- Image validation in the bootloader
- Isolation management in TF-M, including platform specific configuration
- Cryptographic operations
- TF-M Secure Storage operations
- PSA client permission checks in TF-M

The detailed scenarios are discussed in the following sections.

Physical attacks against bootloader
-----------------------------------
Physical attacks may bypass secure image validation in the bootloader so that
a malicious image can be installed.

The countermeasures are bootloader specific and out of the scope of this
document. TF-M relies on MCUboot by default. MCUboot has already implemented
countermeasures against fault injection attacks [3]_.

.. _physical-attacks-spm:

Physical attacks against TF-M SPM
---------------------------------
TF-M SPM initializes and manages the isolation configuration. It also performs
permission checks against secure service requests from PSA clients.

Static isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is TF-M SPM's responsibility to build up isolation during the
initialization phase. If this is missed or not done correctly, then it might
be possible for non-secure code to access some secure memory area, or an
external device may access assets in the device through a debug port.

Therefore, hindering the setup of the memory or peripheral isolation hardware
is an obvious candidate for physical attacks. The initialization phase has a
constant execution time (like the preceding boot-up state), therefore the
timing of the attack is simpler, compared to cases when the secure and
non-secure runtime firmware has been up-and-running for a while and IRQs make
the timing unpredictable.

Some examples of attacking the isolation configuration are shown in the list
below.

- Hinder the setting of security regions. Try to execute non-secure code as
  secure.
- Manipulate the setting of secure regions, try to extend the non-secure
  regions to cover a memory area which otherwise is intended to be a secure
  area.
- Hinder the setting of the isolation boundary. In this case vulnerable ARoT
  code has access to all memory.
- Manipulate the peripheral configuration to give non-secure code access to a
  peripheral which is intended to be secure.

PSA client permission checks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM performs several permission checks against secure service requests
from a PSA client, such as:

- Check whether the PSA client is a non-secure client or a secure client

  An NS client's PSA client ID is negative. An NS client is not allowed to
  directly access secure areas. A malicious actor can inject faults when TF-M
  SPM authenticates an NS client. It may manipulate TF-M into accepting it as
  a secure client and allow the NS client to access assets (a hedged sketch of
  a hardened client ID check is shown at the end of this subsection).

- Memory access checks

  TF-M SPM checks whether the request has the correct permission to access a
  secure memory area. A malicious actor can inject faults when TF-M SPM checks
  the memory access permission. It may skip critical check steps or corrupt
  the check result. Thereby a malicious service request may pass the TF-M
  memory access check and access assets which it is not allowed to.

The physical attacks mentioned above rely on a malicious NS application or a
vulnerable RoT service to start a malicious secure service request to access
the assets. The malicious actor has to be aware of the accurate timing of
dealing with the malicious request in TF-M SPM. The timing can be affected by
other clients and interrupts. It should be more difficult than a pure fault
injection.

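
The sketch below shows one way such a client ID check could be hardened with a
default-failure result and a redundant comparison. It is purely illustrative:
``client_is_nonsecure()``, the constants and the wrapper function are
hypothetical and do not correspond to the actual TF-M SPM implementation.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define ACCESS_DENIED   (0xA5A5A5A5U)
    #define ACCESS_GRANTED  (0x5A5A5A5AU)

    /* Hypothetical helper: negative PSA client IDs identify NS clients. */
    static inline bool client_is_nonsecure(int32_t client_id)
    {
        return client_id < 0;
    }

    uint32_t check_client_is_secure(int32_t client_id)
    {
        /* Default failure: deny access unless every check passes. */
        volatile uint32_t result = ACCESS_DENIED;

        if (client_is_nonsecure(client_id)) {
            return ACCESS_DENIED;
        }

        /* Redundant check: a single skipped branch is not enough to pass. */
        if (!client_is_nonsecure(client_id)) {
            result = ACCESS_GRANTED;
        }

        return result;
    }
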
Dynamic isolation boundary configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Physical attacks may affect the isolation boundary settings during TF-M
context switches, especially in Isolation Level 3. For example:

- A fault injection may cause TF-M SPM to skip clearing the privileged state
  before switching in an ARoT service.
- A fault injection may cause TF-M SPM to skip updating the MPU regions and
  therefore the next RoT service may access assets belonging to a previous
  one.

However, it is much more difficult to find out the accurate timing of a TF-M
context switch, compared to other scenarios in TF-M SPM. It also requires a
vulnerable RoT service to access assets after the fault injection.

Physical attacks against TF-M Crypto service
--------------------------------------------
Since crypto operations are performed by the mbedTLS library or by a custom
crypto accelerator engine and its related software driver stack, the analysis
of physical attacks against crypto operations is out of scope for this
document. However, in general the same requirements apply to the crypto
implementation in order to be compliant with PSA Level 3 certification. That
is, it must be resistant against physical attacks, so crypto software and
hardware must be hardened against side-channel and physical attacks.

Physical attacks against Secure Storage
---------------------------------------
Physical attacks against Internal Trusted Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumptions in :ref:`attacker-capability`, a malicious actor is
unable to directly retrieve assets via physical attacks against the
:term:`Internal Trusted Storage` (ITS).

Instead, a malicious actor can inject faults into the isolation configuration
of the ITS area in TF-M SPM to gain access to assets stored in ITS. Refer to
:ref:`physical-attacks-spm` for details.

Physical attacks against Protected Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumptions in :ref:`attacker-capability`, a malicious actor may
be able to directly access the external storage device.
Therefore :term:`Protected Storage` (PS) shall enable encryption and
authentication by default to detect tampering with the content in the external
storage device.

A malicious actor can also inject faults into the isolation configuration of
PS and the external storage device peripherals in TF-M SPM to gain access to
assets stored in PS. Refer to :ref:`physical-attacks-spm` for details.

It is out of the scope of TF-M to fully prevent malicious actors from directly
tampering with or retrieving content stored in external storage devices.

Physical attacks against platform specific implementation
----------------------------------------------------------
Platform specific implementation includes critical TF-M HAL implementations.
A malicious actor can perform physical attacks against those platform specific
implementations to bypass the countermeasures in TF-M common code.

Platform early initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M provides a HAL API for platforms to perform hardware initialization
before the SPM initialization starts.
The system integrator is responsible for implementing this API on a particular
SoC and hardening it against physical attacks:

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_platform_init(void);

The API can perform several initializations of different modules. The system
integrator can choose to harden some of these initialization functions within
this platform init API. One example is the debug access setting.

Debug access setting
********************
TF-M configures debug access according to the device lifecycle and the
accessible debug certificates. In general, TF-M locks down the debug port if
the device is in a secure production state.
The system integrator can put these settings into an API and harden it against
physical attacks.

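
A minimal sketch of what such hardening could look like inside a platform's
``tfm_hal_platform_init()`` implementation is shown below. The debug control
register, its lock value and the use of ``tfm_core_panic()`` as the failure
policy are assumptions made for this example and must be mapped to the actual
platform mechanism; ``tfm_hal_defs.h`` is assumed to declare
``enum tfm_hal_status_t``.

.. code-block:: c

    #include <stdint.h>
    #include "tfm_hal_defs.h"   /* assumed to provide enum tfm_hal_status_t */

    /* Hypothetical platform register and lock value for debug access. */
    #define DEBUG_CTRL_REG  (*(volatile uint32_t *)0x50000F00UL)
    #define DEBUG_LOCKED    (0x000000A5UL)

    extern void tfm_core_panic(void);   /* system policy on a failed check */

    enum tfm_hal_status_t tfm_hal_platform_init(void)
    {
        /* Duplicated execution: write the lock value twice so that a single
         * skipped store does not leave the debug port open. */
        DEBUG_CTRL_REG = DEBUG_LOCKED;
        DEBUG_CTRL_REG = DEBUG_LOCKED;

        /* Redundant check: read back and verify the setting twice. */
        if (DEBUG_CTRL_REG != DEBUG_LOCKED) {
            tfm_core_panic();
        }
        if (DEBUG_CTRL_REG != DEBUG_LOCKED) {
            tfm_core_panic();
        }

        return TFM_HAL_SUCCESS;
    }

A real implementation would apply the same pattern to every security-critical
setting performed in this early phase, not only to the debug control.
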
Platform specific isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for static and dynamic isolation configuration. The
system integrator is responsible for implementing these APIs on a particular
SoC and hardening them against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_set_up_static_boundaries(void);
    enum tfm_hal_status_t tfm_hal_bind_boundary(const struct partition_load_info_t *p_ldinf,
                                                uintptr_t *p_boundary);

Memory access check
^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for platform specific memory access checks. The
system integrator is responsible for implementing this API on a particular SoC
and hardening it against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_memory_check(uintptr_t boundary,
                                               uintptr_t base,
                                               size_t size,
                                               uint32_t access_type);

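
A platform implementation of this check can itself apply the techniques from
:ref:`phy-att-countermeasures`, for example by evaluating the range comparison
twice and failing closed. The fragment below is only an illustration; the
region bounds and function names are hypothetical and not taken from any real
platform port.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical bounds of one non-secure memory region. */
    #define NS_DATA_START  (0x28000000UL)
    #define NS_DATA_LIMIT  (0x2803FFFFUL)

    static bool range_in_ns_data(uintptr_t base, size_t size)
    {
        return (base >= NS_DATA_START) &&
               (base <= NS_DATA_LIMIT) &&
               (size <= (NS_DATA_LIMIT - base + 1U));
    }

    /* Illustrative fragment of a hardened check: both evaluations must
     * agree before access is granted, so a single glitch cannot flip the
     * verdict. */
    bool hardened_ns_range_check(uintptr_t base, size_t size)
    {
        volatile bool first  = range_in_ns_data(base, size);
        volatile bool second = range_in_ns_data(base, size);

        if (first && second) {
            return true;
        }

        return false;   /* default failure */
    }
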
.. _tf-m-against-physical-attacks:

*********************************************
TF-M countermeasures against physical attacks
*********************************************
This section proposes a design of software countermeasures against physical
attacks.

Fault injection hardening library
=================================
There is no open-source library which implements the generic mitigation
techniques listed in :ref:`phy-att-countermeasures`.
The TF-M project implements a portion of these techniques. The TF-M software
countermeasures are implemented as a small library, Fault Injection Hardening
(FIH), in the TF-M code base. A similar library was first introduced and
tested in the MCUboot project (version 1.7.0) [2]_ which TF-M relies on.

The FIH library is located under TF-M ``lib/fih/``.

The implementation of the different techniques is assigned to fault injection
protection profiles. Four profiles (OFF, LOW, MEDIUM, HIGH) were introduced to
better fit the device capability (memory size, TRNG availability) and the
protection requirements mandated by the device threat model. The fault
injection protection profile is configurable at compile time; the default
value is OFF.

Countermeasure profiles and the corresponding techniques are listed in the
table below.

+--------------------------------+-------------+----------------+--------------+------------------+
| Countermeasure                 | Profile LOW | Profile MEDIUM | Profile HIGH | Comments         |
+================================+=============+================+==============+==================+
| Control flow monitor           | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Failure loop hardening         | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Complex constant               |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Redundant variables and checks |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Random delay                   |             |                | Y            | Implemented, but |
|                                |             |                |              | depends on HW    |
|                                |             |                |              | capability       |
+--------------------------------+-------------+----------------+--------------+------------------+

Similar to MCUboot, four profiles are supported. The profile can be configured
at build time by setting (the default is OFF):

``-DTFM_FIH_PROFILE=<OFF, LOW, MEDIUM, HIGH>``

How to use FIH library
======================
As analyzed in :ref:`phy-att-threat-model`, this section focuses on
integrating the FIH library in TF-M SPM to mitigate physical attacks.

- Identify the critical function call paths which are mandatory for
  configuring isolation or debug access. Change their return types to
  ``FIH_RET_TYPE`` and make them return with ``FIH_RET``. Then call them with
  ``FIH_CALL``. These macros provide the extra checking functionality (control
  flow monitor, redundant checks and variables, random delay, complex
  constant) according to the profile settings. More details about their usage
  can be found in ``trusted-firmware-m/lib/fih/inc/fih.h``. A hedged usage
  sketch is shown at the end of this section.

  Take the simplified TF-M SPM initialization flow as an example:

  .. code-block:: c

      main()
       |
       |--> tfm_core_init()
       |     |
       |     |--> tfm_hal_set_up_static_boundaries()
       |     |     |
       |     |     |--> platform specific isolation impl.
       |     |
       |     |--> tfm_hal_platform_init()
       |           |
       |           |--> platform specific init
       |
       |--> During each partition initialization
             |
             |--> tfm_hal_bind_boundary()
                   |
                   |--> platform specific peripheral isolation impl.

- The important settings of peripheral configuration registers might be made
  redundant and verified to match expectations before continuing.

- Implement an extra verification function which checks the critical hardware
  configuration before the secure code switches to non-secure. The proposed
  API for this purpose:

  .. code-block:: c

      fih_int tfm_hal_verify_static_boundaries(void);

  This function is intended to be called just after the static boundaries are
  set up and is responsible for checking all critical hardware configurations.
  The goal is to catch if something was missed and to act according to the
  system policy. The introduction of one more checking point requires one more
  intervention with precise timing. The system integrator is responsible for
  implementing this API on a particular SoC and hardening it against physical
  attacks, making sure that all platform dependent security features are
  properly configured.

- The most powerful mitigation technique is to add random delay to the code
  execution. This makes the timing of the attack much harder. However, it
  requires an entropy source. It is recommended to use the ``HIGH`` profile
  when hardware support is available. There is a porting API layer to fetch
  random numbers in the FIH library:

  .. code-block:: c

      void fih_delay_init(void);
      uint8_t fih_delay_random(void);

- Similar countermeasures can be implemented in critical steps of the platform
  specific implementation.

  Take the memory isolation settings on the AN521 platform as an example.
  The following hardware components are responsible for memory isolation in a
  SoC which is based on the SSE-200 subsystem.
  System integrators must examine the chip specific memory isolation solution,
  identify the key components and harden their configuration.
  This list just serves as an example here for easier understanding:

  - Implementation Defined Attribution Unit (IDAU): Implementation defined,
    it can be a static config or dynamic.
    Contains the default security access permissions of the memory map.
  - SAU: The main module in the CPU to determine the security settings of
    the memory.
  - :term:`MPC`: External module from the CPU point of view. It protects the
    non security aware memories from unauthenticated access. Having a
    properly configured MPC significantly increases the security of the
    system.
  - :term:`PPC`: External module from the CPU
    point of view. Protects the non security aware peripherals from
    unauthenticated access.
  - MPU: Protects memory from unprivileged access. ARoT code has only
    restricted access in the secure domain. This mitigates the risk that a
    vulnerable or malicious ARoT partition accesses device assets.

  The following AN521 specific isolation configuration functions shall be
  hardened against physical attacks.

  .. code-block:: c

      sau_and_idau_cfg()
      mpc_init_cfg()
      ppc_init_cfg()

  Some platform specific implementations rely on the platform's standard
  device driver libraries. It can become much more difficult to maintain the
  drivers if the standard libraries are modified with the FIH library.
  Instead, the platform specific implementation can apply duplicated execution
  and redundant variables/condition checks when calling the platform standard
  device driver libraries, according to the usage scenarios.

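
As a rough illustration of the macro usage described above, the sketch below
shows how a hardened wrapper around platform isolation configuration might
look. It is not taken from the TF-M code base: the wrapper, the conversion of
``sau_and_idau_cfg()``/``mpc_init_cfg()`` to the FIH calling convention and
the helper names (``fih_not_eq``, ``FIH_SUCCESS``, ``FIH_FAILURE``) are
assumptions that should be verified against ``lib/fih/inc/fih.h``.

.. code-block:: c

    #include "fih.h"   /* fih_int, FIH_SUCCESS, FIH_FAILURE, FIH_CALL, FIH_RET */

    /* Assumed to have been converted to the FIH calling convention. */
    extern fih_int sau_and_idau_cfg(void);
    extern fih_int mpc_init_cfg(void);

    /* Hypothetical hardened wrapper around platform isolation configuration. */
    fih_int setup_isolation_hw(void)
    {
        fih_int fih_rc = FIH_FAILURE;   /* default failure */

        FIH_CALL(sau_and_idau_cfg, fih_rc);
        if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
            FIH_RET(FIH_FAILURE);
        }

        FIH_CALL(mpc_init_cfg, fih_rc);
        if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
            FIH_RET(FIH_FAILURE);
        }

        FIH_RET(FIH_SUCCESS);
    }

Depending on the selected profile, ``FIH_CALL`` is expected to add a random
delay before the call, maintain the control flow counter and validate the
redundant return value, so a wrapper written this way inherits the configured
level of hardening.
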
Impact on memory footprint
==========================
The addition of protection code against physical attacks increases the memory
footprint. The actual increase depends on the selected profile and on where
the mitigation code is added.

Attack experiment with SPM
==========================
The goal is to bypass the setting of the memory isolation hardware with
simulated instruction skips in fast model execution (FVP_MPS2_AEMv8M), in
order to execute the regular non-secure test code in secure state. This is
done by identifying the configuration steps which must be bypassed to make
this happen. The instruction skip simulation is achieved by breakpoints and
manual manipulation of the program counter. The following steps are done on
the AN521 target, but this can be different on another target:

- Bypass the configuration of the isolation HW: SAU, MPC.
- Bypass the setting of the PSP limit register. Otherwise, a stack overflow
  exception will happen, because the secure PSP will be overwritten by the
  address of the non-secure stack, and on this particular target the
  non-secure stack is at a lower address than the value in the secure
  PSP_LIMIT register.
- Avoid the clearing of the least significant bit of the non-secure entry
  point, where BLXNS/BXNS is jumping to non-secure code. Having the least
  significant bit cleared indicates to the hardware to switch security state.

The previous steps are enough to execute the non-secure Reset_Handler() in
secure state. Usually, an RTOS is executing on the non-secure side. In order
to properly boot it up, further steps are needed:

- Set the S_VTOR system register to point to the address of the NS vector
  table. Code is executed in secure state, therefore when an IRQ hits, the
  handler address is fetched from the table pointed to by the S_VTOR register.
  An RTOS usually does an SVC call at start-up. If S_VTOR is not modified,
  then SPM's SVC handler will be executed.
- TBC: RTX osKernelStart still failing.

The bottom line is that in order to execute the regular non-secure code in
secure state, the attacker needs to interfere with the execution flow at many
places. A successful attack can be made even harder by adding the described
mitigation techniques and some random delays.


*********
Reference
*********

.. [1] `PSA Certified Level 3 Lightweight Protection Profile <https://www.psacertified.org/app/uploads/2020/11/JSADEN009-PSA_Certified_Level_3_LW_PP-1.0-ALP02.pdf>`_

.. [2] `MCUboot project - fault injection hardening <https://github.com/mcu-tools/mcuboot/blob/master/boot/bootutil/include/bootutil/fault_injection_hardening.h>`_

.. [3] `MCUboot fault injection mitigation <https://www.trustedfirmware.org/docs/TF-M_fault_injection_mitigation.pdf>`_

--------------------------------

*Copyright (c) 2021-2022, Arm Limited. All rights reserved.*