#################################################
Physical attack mitigation in Trusted Firmware-M
#################################################

:Authors: Tamas Ban; David Hu
:Organization: Arm Limited
:Contact: tamas.ban@arm.com; david.hu@arm.com
:Status: Draft

************
Requirements
************
The PSA Certified Level 3 Lightweight Protection Profile [1]_ requires
protection against physical attacks. This includes protection against
manipulation of the hardware and any data, undetected manipulation of memory
contents, and physical probing on the chip's surface. The RoT detects or
prevents its operation outside the normal operating conditions (such as
voltage, clock frequency, temperature, or external energy fields) where
reliability and secure operation have not been proven or tested.

.. note::

    Mitigation against a certain level of physical attacks is a mandatory
    requirement for PSA Level 3 certification.
    The :ref:`tf-m-against-physical-attacks` discussed below
    doesn't provide mitigation against all the physical attacks considered in
    scope for PSA L3 certification. Please check the Protection Profile
    document for an exhaustive list of requirements.

****************
Physical attacks
****************
The goal of physical attacks is to alter the expected behavior of a circuit.
This can be achieved by changing the device's normal operating conditions to
untested operating conditions. As a result, a hazard might be triggered at the
circuit level, whose impact is unpredictable in advance but whose effect can
be observed. With frequent attempts, a weak point of the system could be
identified and the attacker could gain access to the entire device. There is a
wide variety of physical attacks; the following is not a comprehensive list,
but rather gives a taste of the possibilities:

- Inject a glitch into the device power supply or clock line.
- Operate the device outside its temperature range: cool it down or warm it
  up.
- Shoot the chip with an electromagnetic field. This can be done by passing
  current through a small coil close to the chip surface; no physical contact
  or modification of the PCB (soldering) is necessary.
- Point a laser beam at the chip surface. It could flip bits in memory or a
  register, but precise knowledge of the chip layout and design is necessary.

The required equipment and cost of these attacks vary. There are commercial
products to perform such attacks. Furthermore, they are shipped with a
scripting environment, good documentation, and a lot of examples. In general,
there are tons of videos, research papers and blogs about fault injection
attacks. As a result, the threshold for even a non-proficient attacker to
successfully perform such attacks gets lower over time.

*****************************************************************
Effects of physical attacks in hardware and in software execution
*****************************************************************
The change in the behavior of the hardware and software cannot be foreseen
when performing a physical attack. At the circuit level, attacks manifest as
bit faults. These bit faults can cause varied effects in the behavior of the
device micro-architecture:

- The instruction decoding pipeline is flushed.
- Instructions are altered during decoding.
- Data is altered during fetch or store.
- Register contents, including the program counter, are altered.
- Bits are flipped in registers or memory.

These phenomena occur at random and cannot be observed directly, but their
effect can be traced in software execution. On the software level the
following can happen:

- A few instructions are skipped. This can lead to taking a different branch
  than normal.
- A corrupted CPU register or data fetch could alter the result of a
  comparison instruction, or change the value returned from a function.
- A corrupted data store could alter the configuration of peripherals.
- Very precise attacks with a laser can flip bits in any register or in
  memory.

This is a complex domain. Faults are not well understood. Different fault
models exist, but each of them targets a specific aspect of fault injection.
One of the most common and probably the most easily applicable fault model is
the instruction skip.

***********************************
Mitigation against physical attacks
***********************************
The applicability of these attacks highly depends on the device. Some devices
are more sensitive than others. Protection is possible at both the hardware
and the software level.

On the hardware level, there are chip design principles and system IPs that
are resistant to fault injection attacks. These can make it harder to perform
a successful attack, and as a result the chip might reset or erase sensitive
content. The device maker needs to consider what level of physical attack is
in scope and choose a SoC accordingly.

On top of hardware-level protection, a secondary protection layer can be
implemented in software. This approach is known as "defence in depth".

Neither hardware nor software level protection is perfect, because both can be
bypassed. The combination of them provides the maximum level of protection.
However, even when both are in place, it is not certain that they provide 100%
protection against physical attacks. The best that can be achieved is to
harden the system to increase the cost of a successful attack (in terms of
time and equipment), thereby making it unprofitable to perform.

.. _phy-att-countermeasures:

Software countermeasures against physical attacks
=================================================
There are practical coding techniques which can be applied to harden software
against fault injection attacks. They significantly decrease the probability
of a successful attack:

- Control flow monitor

  To catch malicious modification of the expected control flow. When an
  important portion of a program is executed, a flow monitor counter is
  incremented. The program moves to the next stage only if the accumulated
  flow monitor counter is equal to an expected value.

- Default failure

  The return value variable should always contain a value indicating
  failure. Changing its value to success is done only in one protected
  flow (preferably protected by double checks).

- Complex constant

  It is hard to change a memory region or register to a pre-defined value,
  but the usual boolean values (0 or 1) are easier to manipulate.

- Redundant variables and condition checks

  To make attacks against branch conditions harder, it is recommended to
  check the relevant condition twice (it is better to have a random delay
  between the two comparisons).

- Random delay

  Successful fault injection attacks require very precise timing. Adding
  random delay to the code execution makes the timing of an attack much
  harder.

- Loop integrity check

  To avoid skipping critical loop iterations, which can weaken cryptographic
  algorithms, check after a loop has executed whether the loop counter
  indeed has the expected value.

- Duplicated execution

  Execute a critical step multiple times to prevent fault injection from
  skipping the step. To mitigate multiple consecutive fault injections, a
  random delay can be inserted between duplicated executions.

These techniques should be applied in a thoughtful way. If they are applied
everywhere, they can result in messy code that makes maintenance harder. Code
must be analysed, and sensitive parts and critical call paths must be
identified. Furthermore, these techniques increase the overall code size,
which might be an issue on constrained devices.

Currently, compilers do not provide any support to implement these
countermeasures automatically. On the contrary, they can eliminate the
protection code during optimization. As a result, the C level protection does
not add any guarantee about the final behavior of the system. The
effectiveness of these protections highly depends on the actual compiler and
the optimization level. The compiled assembly code must be visually inspected
and tested to make sure that proper countermeasures are in place and perform
as expected.

.. _phy-att-threat-model:

******************************************
TF-M Threat Model against physical attacks
******************************************

Physical attack target
======================
A malicious actor performs physical attacks against TF-M to retrieve assets
from the device. These assets can be sensitive data, credentials or crypto
keys. These assets are protected in TF-M by proper isolation.

For example, a malicious actor can perform the following attacks:

- Reopen the debug port or hinder its closure, then connect to the device
  with a debugger and dump memory.
- Bypass secure boot to replace the authentic firmware with a malicious
  image. Then arbitrary memory can be read.
- Assuming that secure boot cannot be bypassed, an attacker can try to
  hinder the setup of the memory isolation hardware by the TF-M
  :term:`Secure Partition Manager` (SPM) and manage to execute the non-secure
  image in the secure state. Even if this is achieved, an exploitable
  vulnerability is still needed in the non-secure code which can be used to
  inject and execute arbitrary code to read the assets.
- The device might contain an unsigned binary blob next to the official
  firmware. This can be any data, not necessarily code. If an attacker
  manages to replace this data with arbitrary content (e.g. a NOP slide
  leading to malicious code), then they can try to manipulate the program
  counter to jump to this area before the memory isolation is set up.

.. _attacker-capability:

Assumptions on attacker capability
==================================
It is assumed that the attacker has the following capabilities to perform
physical attacks against devices protected by TF-M.

- Has physical access to the device.
- Is able to access external memory, read and possibly tamper with it.
- Is able to load arbitrary candidate images for firmware upgrade.
- Is able to make the bootloader try to upgrade the arbitrary image from the
  staging area.
- Is able to inject faults at the hardware level (voltage or power glitch, EM
  pulse, etc.) into the system.
- Precise timing of fault injection is possible once or a few times, but in
  general the more interventions are required for a successful attack, the
  harder it will be to succeed.

It is out of the scope of TF-M mitigation if an attacker is able to directly
tamper with or disclose the assets. It is assumed that an attacker has the
following technical limitations.

- No knowledge of the image signing key. Not able to sign an arbitrary image.
- Not able to directly access the chip through the debug port.
- Not able to directly access internal memory.
- No knowledge of the layout of the die or the memory arrangement of the
  secure code, so precise attacks against specific registers or memory
  addresses are out of scope.

Physical attack scenarios against TF-M
======================================
Based on the analysis above, a malicious actor may perform physical attacks
against critical operations in the :term:`SPE` workflow and critical modules
in TF-M, to indirectly gain unauthenticated access to assets.

Those critical operations and modules either directly access the assets or
protect the assets from disclosure. They include:

- Image validation in the bootloader
- Isolation management in TF-M, including platform specific configuration
- Cryptographic operations
- TF-M Secure Storage operations
- PSA client permission checks in TF-M

The detailed scenarios are discussed in the following sections.

Physical attacks against bootloader
-----------------------------------
Physical attacks may bypass secure image validation in the bootloader so that
a malicious image can be installed.

The countermeasures are bootloader specific and out of the scope of this
document. TF-M relies on MCUboot by default. MCUboot has already implemented
countermeasures against fault injection attacks [3]_.

.. _physical-attacks-spm:

Physical attacks against TF-M SPM
---------------------------------
TF-M SPM initializes and manages the isolation configuration. It also
performs permission checks against secure service requests from PSA clients.

Static isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is TF-M SPM's responsibility to build up the isolation during the
initialization phase. If this is missed or not done correctly, then it might
be possible for non-secure code to access some secure memory area, or an
external device can access assets in the device through a debug port.

Therefore, hindering the setup of the memory or peripheral isolation hardware
is an obvious candidate for physical attacks. The initialization phase has a
constant execution time (like the preceding boot-up state), therefore the
timing of the attack is simpler, compared to cases when the secure and
non-secure runtime firmware has been up-and-running for a while and IRQs make
timing unpredictable.

Some examples of attacking the isolation configuration are shown in the list
below.

- Hinder the setting of security regions. Try to execute non-secure code as
  secure.
- Manipulate the setting of secure regions; try to extend the non-secure
  regions to cover a memory area which is otherwise intended to be a secure
  area.
- Hinder the setting of the isolation boundary. In this case vulnerable ARoT
  code has access to all memory.
- Manipulate the peripheral configuration to give non-secure code access to a
  peripheral which is intended to be secure.

PSA client permission checks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM performs several permission checks against secure service requests
from a PSA client, such as:

- Check whether the PSA client is a non-secure client or a secure client

  An NS client's PSA client ID is negative. An NS client is not allowed to
  directly access secure areas. A malicious actor can inject faults when TF-M
  SPM authenticates an NS client. It may manipulate TF-M into accepting it as
  a secure client and allow the NS client to access assets.

- Memory access checks

  TF-M SPM checks whether the request has the correct permission to access a
  secure memory area. A malicious actor can inject faults when TF-M SPM
  checks the memory access permission. It may skip critical check steps or
  corrupt the check result. Thereby a malicious service request may pass the
  TF-M memory access check and access assets which it is not allowed to.

The physical attacks mentioned above rely on a malicious NS application or a
vulnerable RoT service to start a malicious secure service request to access
the assets. The malicious actor has to be aware of the accurate timing of
dealing with the malicious request in TF-M SPM. The timing can be affected by
other clients and interrupts, so this should be more difficult than pure
fault injection.

Dynamic isolation boundary configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A physical attack may affect the isolation boundary setting during a TF-M
context switch, especially in Isolation Level 3. For example:

- A fault injection may cause TF-M SPM to skip clearing the privileged state
  before switching in an ARoT service.
- A fault injection may cause TF-M SPM to skip updating the MPU regions, and
  therefore the next RoT service may access assets belonging to a previous
  one.

However, it is much more difficult to find out the accurate timing of a TF-M
context switch, compared to other scenarios in TF-M SPM. It also requires a
vulnerable RoT service to access assets after the fault injection.

Physical attacks against TF-M Crypto service
--------------------------------------------
Since crypto operations are done by the mbedTLS library or by a custom crypto
accelerator engine and its related software driver stack, the analysis of
physical attacks against crypto operations is out of scope for this document.
However, in general the same requirements are applicable to the crypto
service, to be compliant with PSA Level 3 certification. That is, it must be
resistant against physical attacks. So crypto software and hardware must be
hardened against side-channel and physical attacks.

Physical attacks against Secure Storage
---------------------------------------
Physical attacks against Internal Trusted Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumption in :ref:`attacker-capability`, a malicious actor is
unable to directly retrieve assets via physical attacks against
:term:`Internal Trusted Storage` (ITS).

Instead, a malicious actor can inject faults into the isolation configuration
of the ITS area in TF-M SPM to gain access to assets stored in ITS. Refer to
:ref:`physical-attacks-spm` for details.

Physical attacks against Protected Storage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Based on the assumption in :ref:`attacker-capability`, a malicious actor may
be able to directly access the external storage device. Therefore
:term:`Protected Storage` (PS) shall enable encryption and authentication by
default to detect tampering with the content in the external storage device.

A malicious actor can also inject faults into the isolation configuration of
PS and the external storage device peripherals in TF-M SPM to gain access to
assets stored in PS. Refer to :ref:`physical-attacks-spm` for details.

It is out of the scope of TF-M to fully prevent malicious actors from
directly tampering with or retrieving content stored in external storage
devices.

Physical attacks against platform specific implementation
---------------------------------------------------------
The platform specific implementation includes critical TF-M HAL
implementations. A malicious actor can perform physical attacks against those
platform specific implementations to bypass the countermeasures in the TF-M
common code.

Debug access setting
^^^^^^^^^^^^^^^^^^^^
TF-M configures debug access according to the device lifecycle and the
accessible debug certificates. In general, TF-M locks down the debug port if
the device is in a secure production state. TF-M exposes a HAL API for this
purpose. The system integrator is responsible for implementing this API on a
particular SoC and hardening it against physical attacks:

.. code-block:: c

    enum tfm_plat_err_t tfm_spm_hal_init_debug(void);

Platform specific isolation configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for static and dynamic isolation configuration.
The system integrator is responsible for implementing these APIs on a
particular SoC and hardening them against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_set_up_static_boundaries(void);
    enum tfm_plat_err_t tfm_spm_hal_configure_default_isolation(
                     uint32_t partition_idx,
                     const struct platform_data_t *platform_data);
    enum tfm_hal_status_t tfm_hal_mpu_update_partition_boundary(uintptr_t start,
                                                                uintptr_t end);

Memory access check
^^^^^^^^^^^^^^^^^^^
TF-M SPM exposes a HAL API for the platform specific memory access check. The
system integrator is responsible for implementing this API on a particular
SoC and hardening it against physical attacks.

.. code-block:: c

    enum tfm_hal_status_t tfm_hal_memory_has_access(const uintptr_t base,
                                                    size_t size,
                                                    uint32_t attr);

.. _tf-m-against-physical-attacks:

*********************************************
TF-M countermeasures against physical attacks
*********************************************
This section proposes a design of software countermeasures against physical
attacks.

Fault injection hardening library
=================================
There is no open-source library which implements the generic mitigation
techniques listed in :ref:`phy-att-countermeasures`.
The TF-M project implements a portion of these techniques. The TF-M software
countermeasures are implemented as a small library, Fault Injection Hardening
(FIH), in the TF-M code base. A similar library was first introduced and
tested in the MCUboot project (version 1.7.0) [2]_, on which TF-M relies.

The FIH library is located under ``lib/fih/`` in the TF-M source tree.

The implementation of the different techniques is assigned to fault injection
protection profiles. Four profiles (OFF, LOW, MEDIUM, HIGH) were introduced
to better fit the device capabilities (memory size, TRNG availability) and
the protection requirements mandated by the device threat model. The fault
injection protection profile is configurable at compile time; the default
value is OFF.

Countermeasure profiles and the corresponding techniques are listed in the
table below.

+--------------------------------+-------------+----------------+--------------+------------------+
| Countermeasure                 | Profile LOW | Profile MEDIUM | Profile HIGH | Comments         |
+================================+=============+================+==============+==================+
| Control flow monitor           | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Failure loop hardening         | Y           | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Complex constant               |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Redundant variables and checks |             | Y              | Y            |                  |
+--------------------------------+-------------+----------------+--------------+------------------+
| Random delay                   |             |                | Y            | Implemented, but |
|                                |             |                |              | depends on HW    |
|                                |             |                |              | capability       |
+--------------------------------+-------------+----------------+--------------+------------------+

Similar to MCUboot, four profiles are supported. The profile can be selected
at build time by setting (the default is OFF):

``-DTFM_FIH_PROFILE=<OFF, LOW, MEDIUM, HIGH>``

How to use FIH library
======================
As analyzed in :ref:`phy-att-threat-model`, this section focuses on
integrating the FIH library in TF-M SPM to mitigate physical attacks.

- Identify the critical function call paths which are mandatory for
  configuring isolation or debug access. Turn them into ``fih_int`` functions
  using the ``FIH_CALL`` and ``FIH_RET`` macros. These provide the extra
  checking functionality (control flow monitor, redundant checks and
  variables, random delay, complex constant) according to the profile
  settings. More details about the usage can be found in
  ``tf-m/lib/fih/inc/fault_injection_hardening.h``.

  Take the simplified TF-M SPM initialization flow as an example:

  .. code-block:: text

      main()
       |
       |--> tfm_core_init()
       |      |
       |      |--> tfm_spm_hal_init_debug()
       |      |      |
       |      |      |--> platform specific debug init
       |      |
       |      |--> tfm_hal_set_up_static_boundaries()
       |             |
       |             |--> platform specific isolation impl.
       |
       |--> During each partition initialization
              |
              |--> tfm_spm_hal_configure_default_isolation()
                     |
                     |--> platform specific peripheral
                          isolation impl.

- Make the important settings of peripheral configuration registers redundant
  and verify that they match expectations before continuing.

- Implement an extra verification function which checks the critical hardware
  configuration before the secure code switches to non-secure. The proposed
  API for this purpose:

  .. code-block:: c

      fih_int tfm_spm_hal_verify_isolation_hw(void);

  This function is intended to be called just before the security state
  transition and is responsible for checking all critical hardware
  configuration. The goal is to catch if something was missed and act
  according to the system policy. The introduction of one more checking point
  requires one more intervention with precise timing. The system integrator
  is responsible for implementing this API on a particular SoC and hardening
  it against physical attacks, making sure that all platform dependent
  security features are properly configured.

- The most powerful mitigation technique is to add random delay to the code
  execution. This makes the timing of an attack much harder. However, it
  requires an entropy source. It is recommended to use the ``HIGH`` profile
  when hardware support is available. There is a porting API layer to fetch
  random numbers in the FIH library:

  .. code-block:: c

      int fih_delay_init(void);
      unsigned char fih_delay_random_uchar(void);

- Similar countermeasures can be implemented in critical steps in the
  platform specific implementation.

  Take the memory isolation settings on the AN521 and Musca-B1 platforms as
  an example.
  The following hardware components are responsible for memory isolation in a
  SoC which is based on the SSE-200 subsystem.
  System integrators must examine the chip specific memory isolation
  solution, identify the key components and harden their configuration.
  This list just serves as an example here for easier understanding:

  - Implementation Defined Attribution Unit (IDAU): Implementation defined;
    its configuration can be static or dynamic. It contains the default
    security access permissions of the memory map.
  - SAU: The main module in the CPU to determine the security settings of
    the memory.
  - :term:`MPC`: An external module from the CPU point of view. It protects
    the non security aware memories from unauthorized access. Having a
    properly configured MPC significantly increases the security of the
    system.
  - :term:`PPC`: An external module from the CPU point of view. It protects
    the non security aware peripherals from unauthorized access.
  - MPU: Protects memory from unprivileged access. ARoT code has only
    restricted access in the secure domain. This mitigates the risk that a
    vulnerable or malicious ARoT partition can access device assets.

  The following AN521/Musca-B1 specific isolation configuration functions
  shall be hardened against physical attacks:

  .. code-block:: c

      sau_and_idau_cfg()
      mpc_init_cfg()
      ppc_init_cfg()

  Some platform specific implementations rely on platform standard device
  driver libraries. It can become much more difficult to maintain the drivers
  if the standard libraries are modified with the FIH library. Instead, the
  platform specific implementation can apply duplicated execution and
  redundant variables/condition checks when calling the platform standard
  device driver libraries, according to the usage scenario.

Impact on code size
===================
The addition of protection code against physical attacks increases the code
size. The actual increase depends on the selected profile and on where the
mitigation code is added.

Attack experiment with SPM
==========================
The goal is to bypass the setting of the memory isolation hardware with
simulated instruction skips in a fast model execution (FVP_MPS2_AEMv8M), in
order to execute the regular non-secure test code in the secure state. This
is done by identifying the configuration steps which must be bypassed to make
this happen. The instruction skip simulation is achieved with breakpoints and
manual manipulation of the program counter. The following steps are done on
the AN521 target, but this can be different on another target:

- Bypass the configuration of the isolation HW: SAU, MPC.
- Bypass ``tfm_spm_hal_nvic_interrupt_enable()``: the state of the MPC is
  checked here, whether it is initialized or not.
- Bypass the setting of the PSP limit register. Otherwise, a stack overflow
  exception will happen, because the secure PSP will be overwritten by the
  address of the non-secure stack and, on this particular target, the
  non-secure stack is at a lower address than the value in the secure
  PSP_LIMIT register.
- Avoid the clearing of the least significant bit in the non-secure entry
  point, where BLXNS/BXNS jumps to non-secure code. Having the least
  significant bit cleared indicates to the hardware to switch security state.

The previous steps are enough to execute the non-secure Reset_Handler() in
the secure state. Usually, an RTOS is executing on the non-secure side. In
order to properly boot it up, further steps are needed:

- Set the S_VTOR system register to point to the address of the NS vector
  table. Code is executed in the secure state, therefore when an IRQ hits,
  the handler address is fetched from the table pointed to by the S_VTOR
  register. An RTOS usually does an SVC call at start-up. If S_VTOR is not
  modified, then SPM's SVC handler will be executed.
- TBC: RTX osKernelStart still failing.

The bottom line is that in order to execute the regular non-secure code in
the secure state, the attacker needs to interfere with the execution flow at
many places. A successful attack can be made even harder by adding the
described mitigation techniques and some random delays.


*********
Reference
*********

.. [1] `PSA Certified Level 3 Lightweight Protection Profile <https://www.psacertified.org/app/uploads/2020/11/JSADEN009-PSA_Certified_Level_3_LW_PP-1.0-ALP02.pdf>`_

.. [2] `MCUboot project <https://github.com/mcu-tools/mcuboot/blob/master/boot/bootutil/include/bootutil/fault_injection_hardening.h>`_

.. [3] `MCUboot fault injection mitigation <https://www.trustedfirmware.org/docs/TF-M_fault_injection_mitigation.pdf>`_