#################################################
Physical attack mitigation in Trusted Firmware-M
#################################################

:Authors: Tamas Ban; David Hu
:Organization: Arm Limited
:Contact: tamas.ban@arm.com; david.hu@arm.com

************
Requirements
************
PSA Certified Level 3 Lightweight Protection Profile [1]_ requires protection
against physical attacks. This includes protection against manipulation of the
hardware and of any data, undetected manipulation of memory contents, and
physical probing on the chip's surface. The RoT detects or prevents its
operation outside the normal operating conditions (such as voltage, clock
frequency, temperature, or external energy fields) where reliability and
secure operation have not been proven or tested.

.. note::

    Mitigation against a certain level of physical attacks is a mandatory
    requirement for PSA Level 3 certification.
    The :ref:`tf-m-against-physical-attacks` discussed below do not provide
    mitigation against all the physical attacks considered in scope for PSA L3
    certification. Please check the Protection Profile document for an
    exhaustive list of requirements.

****************
Physical attacks
****************
The goal of physical attacks is to alter the expected behavior of a circuit.
This can be achieved by changing the device's normal operating conditions to
untested operating conditions. As a result, a hazard might be triggered at the
circuit level, whose impact cannot be predicted in advance but whose effect
can be observed. With frequent attempts, a weak point of the system could be
identified, and the attacker could gain access to the entire device. There is
a wide variety of physical attacks; the following is not a comprehensive list
but rather gives a taste of the possibilities:

- Inject a glitch into the device power supply or clock line.
- Operate the device outside its temperature range: cool it down or warm it
  up.
- Shoot the chip with an electromagnetic field. This can be done by passing
  current through a small coil close to the chip surface; no physical contact
  or modification of the PCB (soldering) is necessary.
- Point a laser beam at the chip surface. It could flip bits in memory or a
  register, but precise knowledge of the chip layout and design is necessary.

The required equipment and cost of these attacks vary. There are commercial
products to perform such attacks. Furthermore, they are shipped with a
scripting environment, good documentation, and a lot of examples. In general,
there are plenty of videos, research papers and blogs about fault injection
attacks. As a result, the threshold for successfully performing such attacks,
even by non-proficient attackers, gets lower over time.

*****************************************************************
Effects of physical attacks in hardware and in software execution
*****************************************************************
The change in the behavior of the hardware and software cannot be foreseen
when performing a physical attack. At the circuit level the faults manifest as
bit faults. These bit faults can cause varied effects in the behavior of the
device micro-architecture:

- The instruction decoding pipeline is flushed.
- Instructions are altered during decoding.
- Data is altered during fetch or store.
- Register contents and the program counter are altered.
- Bits are flipped in registers or memory.

These phenomena happen at random and cannot be observed directly, but their
effect can be traced in software execution. At the software level the
following can happen:

- A few instructions are skipped. This can lead to taking a different branch
  than normal.
- A corrupted CPU register or data fetch could alter the result of a
  comparison instruction, or change the value returned from a function.
- A corrupted data store could alter the configuration of peripherals.
- Very precise attacks with a laser can flip bits in any register or in
  memory.

This is a complex domain. Faults are not well understood. Different fault
models exist, but each of them targets a specific aspect of fault injection.
One of the most common and probably the most easily applicable fault model is
the instruction skip.

***********************************
Mitigation against physical attacks
***********************************
The applicability of these attacks highly depends on the device. Some devices
are more sensitive than others. Protection is possible at both the hardware
and software levels.

On the hardware level, there are chip design principles and system IPs that
are resistant to fault injection attacks. These can make it harder to perform
a successful attack, and as a result the chip might reset or erase sensitive
content. The device maker needs to consider what level of physical attack is
in scope and choose a SoC accordingly.

On top of hardware-level protection, a secondary protection layer can be
implemented in software. This approach is known as "defence in depth".

Neither hardware nor software level protection is perfect because both can be
bypassed. The combination of them provides the maximum level of protection.
However, even when both are in place, it is not certain that they provide 100%
protection against physical attacks. The best that can be achieved is to
harden the system to increase the cost of a successful attack (in terms of
time and equipment), thereby making it unprofitable to perform.

.. _phy-att-countermeasures:

Software countermeasures against physical attacks
=================================================
There are practical coding techniques which can be applied to harden software
against fault injection attacks. They significantly decrease the probability
of a successful attack; an illustrative sketch combining several of these
techniques follows the list:

- Control flow monitor

  To catch malicious modification of the expected control flow. When an
  important portion of a program is executed, a flow monitor counter is
  incremented. The program moves to the next stage only if the accumulated
  flow monitor counter is equal to an expected value.

- Default failure

  The return value variable should always contain a value indicating
  failure. Changing its value to success is done only in one protected
  flow (preferably protected by double checks).

- Complex constant

  It is hard to change a memory region or register to a pre-defined value, but
  the usual boolean values (0 or 1) are easier to manipulate.

- Redundant variables and condition checks

  To make attacks against branch conditions harder, it is recommended to check
  the relevant condition twice (it is better to have a random delay between
  the two comparisons).

- Random delay

  Successful fault injection attacks require very precise timing. Adding
  random delay to the code execution makes the timing of an attack much
  harder.

- Loop integrity check

  To avoid skipping critical loop iterations, which could weaken cryptographic
  algorithms, check the loop counter after the loop has executed to confirm
  that it indeed has the expected value.

- Duplicated execution

  Execute a critical step multiple times to prevent fault injection from
  skipping the step. To mitigate multiple consecutive fault injections, random
  delay can be inserted between duplicated executions.

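
The following sketch is only an illustration of how several of these
techniques (default failure, complex constants, a control flow counter, a loop
integrity check and a redundant check) might be combined when hardening a
sensitive comparison. The constants, the ``random_delay()`` helper and the
function itself are made up for this example and are not part of any TF-M API.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative values only: unlikely bit patterns serve as "complex
     * constants" instead of the easily flipped plain 0/1. */
    #define CHECK_SUCCESS  (0x5A5A5A5AU)
    #define CHECK_FAILURE  (0xA5A5A5A5U)

    /* Hypothetical helper, e.g. backed by a TRNG, providing a random delay. */
    extern void random_delay(void);

    uint32_t hardened_compare(const volatile uint8_t *a,
                              const volatile uint8_t *b,
                              size_t len)
    {
        /* Default failure: assume the check fails until proven otherwise. */
        volatile uint32_t result = CHECK_FAILURE;
        volatile size_t i;
        volatile size_t flow_counter = 0;

        for (i = 0; i < len; i++) {
            if (a[i] != b[i]) {
                return CHECK_FAILURE;
            }
            flow_counter++;
        }

        /* Loop integrity check: the loop must have run exactly 'len' times. */
        if ((i != len) || (flow_counter != len)) {
            return CHECK_FAILURE;
        }

        random_delay();

        /* Redundant check: repeat the decision after a random delay. */
        for (i = 0; i < len; i++) {
            if (a[i] != b[i]) {
                return CHECK_FAILURE;
            }
        }

        result = CHECK_SUCCESS;
        return result;
    }
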
These techniques should be applied in a thoughtful way. If they are applied
everywhere, they can result in messy code that makes maintenance harder. The
code must be analysed, and the sensitive parts and critical call paths must be
identified. Furthermore, these techniques increase the overall code size,
which might be an issue on constrained devices.

Currently, compilers do not provide any support for implementing these
countermeasures automatically. On the contrary, they can eliminate the
protection code during optimization. As a result, C-level protection does not
add any guarantee about the final behavior of the system. The effectiveness of
these protections highly depends on the actual compiler and the optimization
level. The compiled assembly code must be visually inspected and tested to
make sure that the proper countermeasures are in place and perform as
expected.

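
As an illustration of why this inspection matters, the sketch below relies on
the ``volatile`` qualifier to keep a redundant check alive: without it, an
optimizing compiler may legally merge the two identical comparisons into one,
silently removing the redundancy. The flag and its expected value are
hypothetical, and even with ``volatile`` the generated assembly still has to
be verified.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical security state set elsewhere during boot. */
    extern volatile uint32_t isolation_done_flag;
    #define ISOLATION_DONE  (0x3CA5C35AU)

    int is_isolation_configured(void)
    {
        if (isolation_done_flag != ISOLATION_DONE) {
            return 0;
        }

        /* Redundant check: 'volatile' forces a second, independent load,
         * so the optimizer cannot fold the two comparisons into one. */
        if (isolation_done_flag != ISOLATION_DONE) {
            return 0;
        }

        return 1;
    }
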
.. _phy-att-threat-model:

******************************************
TF-M Threat Model against physical attacks
******************************************

Physical attack target
======================
A malicious actor performs physical attacks against TF-M to retrieve assets
from the device. These assets can be sensitive data, credentials or crypto
keys. These assets are protected in TF-M by proper isolation.

For example, a malicious actor can perform the following attacks:

- Reopen the debug port or hinder its closure, then connect to the device
  with a debugger and dump memory.
- Bypass secure boot to replace the authentic firmware with a malicious image.
  Then arbitrary memory can be read.
- Assuming that secure boot cannot be bypassed, an attacker can try to
  hinder the setup of the memory isolation hardware by the TF-M
  :term:`Secure Partition Manager` (SPM) and manage to execute the non-secure
  image in secure state. Even if this is achieved, an exploitable
  vulnerability is still needed in the non-secure code which can be used to
  inject and execute arbitrary code to read the assets.
- The device might contain an unsigned binary blob next to the official
  firmware. This can be any data, not necessarily code. If an attacker manages
  to replace this data with arbitrary content (e.g. a NOP slide leading to
  malicious code), then they can try to manipulate the program counter to jump
  to this area before the memory isolation is set up.

.. _attacker-capability:

Assumptions on attacker capability
==================================
It is assumed that the attacker has the following capabilities to perform
physical attacks against devices protected by TF-M.

- Has physical access to the device.
- Is able to access the external memory, read it and possibly tamper with it.
- Is able to load arbitrary candidate images for firmware upgrade.
- Is able to make the bootloader try to upgrade the arbitrary image from the
  staging area.
- Is able to inject faults at the hardware level (voltage or power glitch, EM
  pulse, etc.) into the system.
- Precise timing of fault injection is possible once or a few times, but in
  general the more interventions are required for a successful attack, the
  harder it will be to succeed.

It is out of the scope of TF-M mitigation if an attacker is able to directly
tamper with or disclose the assets. It is assumed that an attacker has the
following technical limitations.

- No knowledge of the image signing key. Not able to sign an arbitrary image.
- Not able to directly access the chip through the debug port.
- Not able to directly access the internal memory.
- No knowledge of the layout of the die or the memory arrangement of the
  secure code, so precise attacks against specific registers or memory
  addresses are out of scope.

Physical attack scenarios against TF-M
======================================
Based on the analysis above, a malicious actor may perform physical attacks
against critical operations in the :term:`SPE` workflow and critical modules
in TF-M, to indirectly gain unauthenticated access to assets.

Those critical operations and modules either directly access the assets or
protect the assets from disclosure. They include:

- Image validation in the bootloader
- Isolation management in TF-M, including platform specific configuration
- Cryptographic operations
- TF-M Secure Storage operations
- PSA client permission checks in TF-M

The detailed scenarios are discussed in the following sections.

Physical attacks against bootloader
-----------------------------------
Physical attacks may bypass secure image validation in the bootloader so that
a malicious image can be installed.

The countermeasures are bootloader specific and out of the scope of this
document. TF-M relies on MCUboot by default. MCUboot has already implemented
countermeasures against fault injection attacks [3]_.

.. _physical-attacks-spm:

Physical attacks against TF-M SPM
---------------------------------
TF-M SPM initializes and manages the isolation configuration. It also performs
permission checks against secure service requests from PSA clients.

Static isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is TF-M SPM's responsibility to build up isolation during the
initialization phase. If this is missed or not done correctly, then it might
be possible for non-secure code to access some secure memory area, or an
external device may access assets in the device through a debug port.

Therefore, hindering the setup of the memory or peripheral isolation hardware
is an obvious candidate for physical attacks. The initialization phase has a
constant execution time (like the preceding boot-up state), therefore the
timing of the attack is simpler, compared to cases when the secure and
non-secure runtime firmware has been up-and-running for a while and IRQs make
the timing unpredictable.

Some examples of attacking the isolation configuration are shown in the list
below.

- Hinder the setting of security regions. Try to execute non-secure code as
  secure.
- Manipulate the setting of secure regions, try to extend the non-secure
  regions to cover a memory area which otherwise is intended to be a secure
  area.
- Hinder the setting of the isolation boundary. In this case vulnerable ARoT
  code has access to all memory.
- Manipulate the peripheral configuration to give non-secure code access to a
  peripheral which is intended to be secure.

PSA client permission checks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM performs several permission checks against secure service requests
from a PSA client, such as:

- Check whether the PSA client is a non-secure client or a secure client

  An NS client's PSA client ID is negative. An NS client is not allowed to
  directly access secure areas. A malicious actor can inject faults when TF-M
  SPM authenticates an NS client. It may manipulate TF-M into accepting it as
  a secure client and allow the NS client to access assets (a hedged sketch of
  a hardened client ID check is shown at the end of this subsection).

- Memory access checks

  TF-M SPM checks whether the request has the correct permission to access a
  secure memory area. A malicious actor can inject faults when TF-M SPM checks
  the memory access permission. It may skip critical check steps or corrupt
  the check result. Thereby a malicious service request may pass the TF-M
  memory access check and access assets which it is not allowed to.

The physical attacks mentioned above rely on a malicious NS application or a
vulnerable RoT service to start a malicious secure service request to access
the assets. The malicious actor has to be aware of the accurate timing of
dealing with the malicious request in TF-M SPM. The timing can be affected by
other clients and interrupts. It should be more difficult than a pure fault
injection.

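
The sketch below shows one way such a client ID check could be hardened with a
default-failure result and a redundant comparison. It is purely illustrative:
``client_is_nonsecure()``, the constants and the wrapper function are
hypothetical and do not correspond to the actual TF-M SPM implementation.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define ACCESS_DENIED   (0xA5A5A5A5U)
    #define ACCESS_GRANTED  (0x5A5A5A5AU)

    /* Hypothetical helper: negative PSA client IDs identify NS clients. */
    static inline bool client_is_nonsecure(int32_t client_id)
    {
        return client_id < 0;
    }

    uint32_t check_client_is_secure(int32_t client_id)
    {
        /* Default failure: deny access unless every check passes. */
        volatile uint32_t result = ACCESS_DENIED;

        if (client_is_nonsecure(client_id)) {
            return ACCESS_DENIED;
        }

        /* Redundant check: a single skipped branch is not enough to pass. */
        if (!client_is_nonsecure(client_id)) {
            result = ACCESS_GRANTED;
        }

        return result;
    }
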
Dynamic isolation boundary configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Physical attacks may affect the isolation boundary settings during TF-M
context switches, especially in Isolation Level 3. For example:

- A fault injection may cause TF-M SPM to skip clearing the privileged state
  before switching in an ARoT service.
- A fault injection may cause TF-M SPM to skip updating the MPU regions and
  therefore the next RoT service may access assets belonging to a previous
  one.

However, it is much more difficult to find out the accurate timing of a TF-M
context switch, compared to other scenarios in TF-M SPM. It also requires a
vulnerable RoT service to access assets after the fault injection.

Physical attacks against TF-M Crypto service
--------------------------------------------
Since crypto operations are performed by the mbedTLS library or by a custom
crypto accelerator engine and its related software driver stack, the analysis
of physical attacks against crypto operations is out of scope for this
document. However, in general the same requirements apply to the crypto
implementation in order to be compliant with PSA Level 3 certification. That
is, it must be resistant against physical attacks, so crypto software and
hardware must be hardened against side-channel and physical attacks.

Physical attacks against Secure Storage
---------------------------------------
Physical attacks against Internal Trusted Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumptions in :ref:`attacker-capability`, a malicious actor is
unable to directly retrieve assets via physical attacks against the
:term:`Internal Trusted Storage` (ITS).

Instead, a malicious actor can inject faults into the isolation configuration
of the ITS area in TF-M SPM to gain access to assets stored in ITS. Refer to
:ref:`physical-attacks-spm` for details.

Physical attacks against Protected Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumptions in :ref:`attacker-capability`, a malicious actor may
be able to directly access the external storage device.
Therefore :term:`Protected Storage` (PS) shall enable encryption and
authentication by default to detect tampering with the content in the external
storage device.

A malicious actor can also inject faults into the isolation configuration of
PS and the external storage device peripherals in TF-M SPM to gain access to
assets stored in PS. Refer to :ref:`physical-attacks-spm` for details.

It is out of the scope of TF-M to fully prevent malicious actors from directly
tampering with or retrieving content stored in external storage devices.

Physical attacks against platform specific implementation
----------------------------------------------------------
Platform specific implementation includes critical TF-M HAL implementations.
A malicious actor can perform physical attacks against those platform specific
implementations to bypass the countermeasures in TF-M common code.

Platform early initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M provides a HAL API for platforms to perform hardware initialization
before the SPM initialization starts.
The system integrator is responsible for implementing this API on a particular
SoC and hardening it against physical attacks:

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_platform_init(void);

The API can perform several initializations of different modules. The system
integrator can choose to harden some of these initialization functions within
this platform init API. One example is the debug access setting.

Debug access setting
********************
TF-M configures debug access according to the device lifecycle and the
accessible debug certificates. In general, TF-M locks down the debug port if
the device is in a secure production state.
The system integrator can put these settings into an API and harden it against
physical attacks.

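
A minimal sketch of what such hardening could look like inside a platform's
``tfm_hal_platform_init()`` implementation is shown below. The debug control
register, its lock value and the use of ``tfm_core_panic()`` as the failure
policy are assumptions made for this example and must be mapped to the actual
platform mechanism; ``tfm_hal_defs.h`` is assumed to declare
``enum tfm_hal_status_t``.

.. code-block:: c

    #include <stdint.h>
    #include "tfm_hal_defs.h"   /* assumed to provide enum tfm_hal_status_t */

    /* Hypothetical platform register and lock value for debug access. */
    #define DEBUG_CTRL_REG  (*(volatile uint32_t *)0x50000F00UL)
    #define DEBUG_LOCKED    (0x000000A5UL)

    extern void tfm_core_panic(void);   /* system policy on a failed check */

    enum tfm_hal_status_t tfm_hal_platform_init(void)
    {
        /* Duplicated execution: write the lock value twice so that a single
         * skipped store does not leave the debug port open. */
        DEBUG_CTRL_REG = DEBUG_LOCKED;
        DEBUG_CTRL_REG = DEBUG_LOCKED;

        /* Redundant check: read back and verify the setting twice. */
        if (DEBUG_CTRL_REG != DEBUG_LOCKED) {
            tfm_core_panic();
        }
        if (DEBUG_CTRL_REG != DEBUG_LOCKED) {
            tfm_core_panic();
        }

        return TFM_HAL_SUCCESS;
    }

A real implementation would apply the same pattern to every security-critical
setting performed in this early phase, not only to the debug control.
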
Platform specific isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for static and dynamic isolation configuration. The
system integrator is responsible for implementing these APIs on a particular
SoC and hardening them against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_set_up_static_boundaries(void);
    enum tfm_hal_status_t tfm_hal_bind_boundary(const struct partition_load_info_t *p_ldinf,
                                                uintptr_t *p_boundary);

Memory access check
^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for platform specific memory access checks. The
system integrator is responsible for implementing this API on a particular SoC
and hardening it against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_memory_check(uintptr_t boundary,
                                               uintptr_t base,
                                               size_t size,
                                               uint32_t access_type);

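
A platform implementation of this check can itself apply the techniques from
:ref:`phy-att-countermeasures`, for example by evaluating the range comparison
twice and failing closed. The fragment below is only an illustration; the
region bounds and function names are hypothetical and not taken from any real
platform port.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical bounds of one non-secure memory region. */
    #define NS_DATA_START  (0x28000000UL)
    #define NS_DATA_LIMIT  (0x2803FFFFUL)

    static bool range_in_ns_data(uintptr_t base, size_t size)
    {
        return (base >= NS_DATA_START) &&
               (base <= NS_DATA_LIMIT) &&
               (size <= (NS_DATA_LIMIT - base + 1U));
    }

    /* Illustrative fragment of a hardened check: both evaluations must
     * agree before access is granted, so a single glitch cannot flip the
     * verdict. */
    bool hardened_ns_range_check(uintptr_t base, size_t size)
    {
        volatile bool first  = range_in_ns_data(base, size);
        volatile bool second = range_in_ns_data(base, size);

        if (first && second) {
            return true;
        }

        return false;   /* default failure */
    }
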
.. _tf-m-against-physical-attacks:

*********************************************
TF-M countermeasures against physical attacks
*********************************************
This section proposes a design of software countermeasures against physical
attacks.

Fault injection hardening library
=================================
There is no open-source library which implements the generic mitigation
techniques listed in :ref:`phy-att-countermeasures`.
The TF-M project implements a portion of these techniques. The TF-M software
countermeasures are implemented as a small library, Fault Injection Hardening
(FIH), in the TF-M code base. A similar library was first introduced and
tested in the MCUboot project (version 1.7.0) [2]_ which TF-M relies on.

The FIH library is located under TF-M ``lib/fih/``.

The implementation of the different techniques is assigned to fault injection
protection profiles. Four profiles (OFF, LOW, MEDIUM, HIGH) were introduced to
better fit the device capability (memory size, TRNG availability) and the
protection requirements mandated by the device threat model. The fault
injection protection profile is configurable at compile time; the default
value is OFF.

Countermeasure profiles and the corresponding techniques are listed in the
table below.

+--------------------------------+-------------+----------------+--------------+------------------+
| Countermeasure                 | Profile LOW | Profile MEDIUM | Profile HIGH | Comments         |
+================================+=============+================+==============+==================+
| Control flow monitor           | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Failure loop hardening         | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Complex constant               |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Redundant variables and checks |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Random delay                   |             |                | Y            | Implemented, but |
|                                |             |                |              | depends on HW    |
|                                |             |                |              | capability       |
+--------------------------------+-------------+----------------+--------------+------------------+

Similar to MCUboot, four profiles are supported. The profile can be configured
at build time by setting (the default is OFF):

``-DTFM_FIH_PROFILE=<OFF, LOW, MEDIUM, HIGH>``

How to use FIH library
======================
As analyzed in :ref:`phy-att-threat-model`, this section focuses on
integrating the FIH library in TF-M SPM to mitigate physical attacks.

- Identify the critical function call paths which are mandatory for
  configuring isolation or debug access. Change their return types to
  ``FIH_RET_TYPE`` and make them return with ``FIH_RET``. Then call them with
  ``FIH_CALL``. These macros provide the extra checking functionality (control
  flow monitor, redundant checks and variables, random delay, complex
  constant) according to the profile settings. More details about their usage
  can be found in ``trusted-firmware-m/lib/fih/inc/fih.h``. A hedged usage
  sketch is shown at the end of this section.

  Take the simplified TF-M SPM initialization flow as an example:

  .. code-block:: c

      main()
       |
       |--> tfm_core_init()
       |     |
       |     |--> tfm_hal_set_up_static_boundaries()
       |     |     |
       |     |     |--> platform specific isolation impl.
       |     |
       |     |--> tfm_hal_platform_init()
       |           |
       |           |--> platform specific init
       |
       |--> During each partition initialization
             |
             |--> tfm_hal_bind_boundary()
                   |
                   |--> platform specific peripheral isolation impl.

- The important settings of peripheral configuration registers might be made
  redundant and verified to match expectations before continuing.

- Implement an extra verification function which checks the critical hardware
  configuration before the secure code switches to non-secure. The proposed
  API for this purpose:

  .. code-block:: c

      fih_int tfm_hal_verify_static_boundaries(void);

  This function is intended to be called just after the static boundaries are
  set up and is responsible for checking all critical hardware configurations.
  The goal is to catch if something was missed and to act according to the
  system policy. The introduction of one more checking point requires one more
  intervention with precise timing. The system integrator is responsible for
  implementing this API on a particular SoC and hardening it against physical
  attacks, making sure that all platform dependent security features are
  properly configured.

- The most powerful mitigation technique is to add random delay to the code
  execution. This makes the timing of the attack much harder. However, it
  requires an entropy source. It is recommended to use the ``HIGH`` profile
  when hardware support is available. There is a porting API layer to fetch
  random numbers in the FIH library:

  .. code-block:: c

      void fih_delay_init(void);
      uint8_t fih_delay_random(void);

- Similar countermeasures can be implemented in critical steps of the platform
  specific implementation.

  Take the memory isolation settings on the AN521 platform as an example.
  The following hardware components are responsible for memory isolation in a
  SoC which is based on the SSE-200 subsystem.
  System integrators must examine the chip specific memory isolation solution,
  identify the key components and harden their configuration.
  This list just serves as an example here for easier understanding:

  - Implementation Defined Attribution Unit (IDAU): Implementation defined,
    it can be a static config or dynamic.
    Contains the default security access permissions of the memory map.
  - SAU: The main module in the CPU to determine the security settings of
    the memory.
  - :term:`MPC`: External module from the CPU point of view. It protects the
    non security aware memories from unauthenticated access. Having a
    properly configured MPC significantly increases the security of the
    system.
  - :term:`PPC`: External module from the CPU
    point of view. Protects the non security aware peripherals from
    unauthenticated access.
  - MPU: Protects memory from unprivileged access. ARoT code has only
    restricted access in the secure domain. This mitigates the risk that a
    vulnerable or malicious ARoT partition accesses device assets.

  The following AN521 specific isolation configuration functions shall be
  hardened against physical attacks.

  .. code-block:: c

      sau_and_idau_cfg()
      mpc_init_cfg()
      ppc_init_cfg()

  Some platform specific implementations rely on the platform's standard
  device driver libraries. It can become much more difficult to maintain the
  drivers if the standard libraries are modified with the FIH library.
  Instead, the platform specific implementation can apply duplicated execution
  and redundant variables/condition checks when calling the platform standard
  device driver libraries, according to the usage scenarios.

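
As a rough illustration of the macro usage described above, the sketch below
shows how a hardened wrapper around platform isolation configuration might
look. It is not taken from the TF-M code base: the wrapper, the conversion of
``sau_and_idau_cfg()``/``mpc_init_cfg()`` to the FIH calling convention and
the helper names (``fih_not_eq``, ``FIH_SUCCESS``, ``FIH_FAILURE``) are
assumptions that should be verified against ``lib/fih/inc/fih.h``.

.. code-block:: c

    #include "fih.h"   /* fih_int, FIH_SUCCESS, FIH_FAILURE, FIH_CALL, FIH_RET */

    /* Assumed to have been converted to the FIH calling convention. */
    extern fih_int sau_and_idau_cfg(void);
    extern fih_int mpc_init_cfg(void);

    /* Hypothetical hardened wrapper around platform isolation configuration. */
    fih_int setup_isolation_hw(void)
    {
        fih_int fih_rc = FIH_FAILURE;   /* default failure */

        FIH_CALL(sau_and_idau_cfg, fih_rc);
        if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
            FIH_RET(FIH_FAILURE);
        }

        FIH_CALL(mpc_init_cfg, fih_rc);
        if (fih_not_eq(fih_rc, FIH_SUCCESS)) {
            FIH_RET(FIH_FAILURE);
        }

        FIH_RET(FIH_SUCCESS);
    }

Depending on the selected profile, ``FIH_CALL`` is expected to add a random
delay before the call, maintain the control flow counter and validate the
redundant return value, so a wrapper written this way inherits the configured
level of hardening.
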
Impact on memory footprint
==========================
The addition of protection code against physical attacks increases the memory
footprint. The actual increase depends on the selected profile and on where
the mitigation code is added.

Attack experiment with SPM
==========================
The goal is to bypass the setting of the memory isolation hardware with
simulated instruction skips in fast model execution (FVP_MPS2_AEMv8M), in
order to execute the regular non-secure test code in secure state. This is
done by identifying the configuration steps which must be bypassed to make
this happen. The instruction skip simulation is achieved by breakpoints and
manual manipulation of the program counter. The following steps are done on
the AN521 target, but this can be different on another target:

- Bypass the configuration of the isolation HW: SAU, MPC.
- Bypass the setting of the PSP limit register. Otherwise, a stack overflow
  exception will happen, because the secure PSP will be overwritten by the
  address of the non-secure stack, and on this particular target the
  non-secure stack is at a lower address than the value in the secure
  PSP_LIMIT register.
- Avoid the clearing of the least significant bit of the non-secure entry
  point, where BLXNS/BXNS is jumping to non-secure code. Having the least
  significant bit cleared indicates to the hardware to switch security state.

The previous steps are enough to execute the non-secure Reset_Handler() in
secure state. Usually, an RTOS is executing on the non-secure side. In order
to properly boot it up, further steps are needed:

- Set the S_VTOR system register to point to the address of the NS vector
  table. Code is executed in secure state, therefore when an IRQ hits, the
  handler address is fetched from the table pointed to by the S_VTOR register.
  An RTOS usually does an SVC call at start-up. If S_VTOR is not modified,
  then SPM's SVC handler will be executed.
- TBC: RTX osKernelStart still failing.

The bottom line is that in order to execute the regular non-secure code in
secure state, the attacker needs to interfere with the execution flow at many
places. A successful attack can be made even harder by adding the described
mitigation techniques and some random delays.


*********
Reference
*********

.. [1] `PSA Certified Level 3 Lightweight Protection Profile <https://www.psacertified.org/app/uploads/2020/11/JSADEN009-PSA_Certified_Level_3_LW_PP-1.0-ALP02.pdf>`_

.. [2] `MCUboot project - fault injection hardening <https://github.com/mcu-tools/mcuboot/blob/master/boot/bootutil/include/bootutil/fault_injection_hardening.h>`_

.. [3] `MCUboot fault injection mitigation <https://www.trustedfirmware.org/docs/TF-M_fault_injection_mitigation.pdf>`_

--------------------------------

*Copyright (c) 2021-2022, Arm Limited. All rights reserved.*