Manuel Pégourié-Gonnard | b89fd95 | 2021-09-30 11:52:04 +0200

This document explains the strategy that was used so far in starting the
migration to PSA Crypto and mentions future perspectives and open questions.

Goals
=====

Several benefits are expected from migrating to PSA Crypto:

G1. Use PSA Crypto drivers when available.
G2. Allow isolation of long-term secrets (for example, private keys).
G3. Allow isolation of short-term secrets (for example, TLS session keys).
G4. Have a clean, unified API for Crypto (retire the legacy API).
G5. Code size: compile out our implementation when a driver is available.

Currently, some parts of (G1) and (G2) are implemented when
`MBEDTLS_USE_PSA_CRYPTO` is enabled. For (G2) to take effect, the application
needs to be changed to use new APIs.

Generally speaking, the numbering above doesn't mean that each goal requires
the preceding ones to be completed: for example, G2-G5 could be done in any
order. However, they all either depend on G1 or are much more convenient if G1
is done first. (This is not a dependency on G1 being complete; rather, each
bit of G2-G5 is helped by some specific bit of G1.)

So, a solid intermediate goal would be to complete (G1) when
`MBEDTLS_USE_PSA_CRYPTO` is enabled - that is, all crypto operations in X.509
and TLS would be done via the PSA Crypto API.

Compile-time options
====================

We currently have two compile-time options that are relevant to the migration:

- `MBEDTLS_PSA_CRYPTO_C` - enabled by default, controls the presence of the PSA
  Crypto APIs.
- `MBEDTLS_USE_PSA_CRYPTO` - disabled by default (enabled in "full" config),
  controls usage of PSA Crypto APIs to perform operations in X.509 and TLS
  (G1 above), as well as the availability of some new APIs (G2 above).

The reasons why `MBEDTLS_USE_PSA_CRYPTO` is optional and disabled by default
are:
- it's incompatible with `MBEDTLS_ECP_RESTARTABLE`,
  `MBEDTLS_PSA_CRYPTO_CONFIG` and `MBEDTLS_PSA_CRYPTO_KEY_ID_ENCODES_OWNER`;
- to avoid a hard/default dependency of X.509 and TLS on
  `MBEDTLS_PSA_CRYPTO_C`, mostly for reasons of code size, and historically
  because of concerns about the maturity of the PSA code (which we might want
  to re-evaluate).

The downside of this approach is that until we feel ready to make
`MBEDTLS_USE_PSA_CRYPTO` non-optional (always enabled), we have to maintain
two versions of some parts of the code: one using PSA, the other using the
legacy APIs. However, see the next section for strategies that can lower that
cost. The rest of this section explains the reasons for the
incompatibilities mentioned above.
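
As a rough sketch, the options discussed in this section might appear as
follows in a build configuration. The `#error` guard is only illustrative: it
is an assumption about how such an incompatibility could be enforced, not a
copy of the library's actual checks.

```c
/* Build configuration fragment (illustrative sketch). */
#define MBEDTLS_PSA_CRYPTO_C      /* the PSA Crypto APIs are present */
#define MBEDTLS_USE_PSA_CRYPTO    /* X.509 and TLS use them (G1, parts of G2) */

/* Hypothetical guard: reject one of the incompatible combinations. */
#if defined(MBEDTLS_USE_PSA_CRYPTO) && defined(MBEDTLS_ECP_RESTARTABLE)
#error "MBEDTLS_USE_PSA_CRYPTO is incompatible with MBEDTLS_ECP_RESTARTABLE"
#endif
```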

### `MBEDTLS_ECP_RESTARTABLE`

Currently this option controls not only the presence of restartable APIs in
the crypto library, but also their use in the TLS and X.509 layers. Since PSA
Crypto does not support restartable operations, there's a clear conflict: the
TLS and X.509 layers can't both use only PSA APIs and get restartable
behaviour.

Supporting this in PSA is on our roadmap (it's been requested). But it's way
below generalizing support for `MBEDTLS_USE_PSA_CRYPTO` for “mainstream” use
cases on our priority list. So in the medium term `MBEDTLS_ECP_RESTARTABLE` is
incompatible with `MBEDTLS_USE_PSA_CRYPTO`.

Note: it is possible to make the options compatible at build time simply by
deciding that when `USE_PSA_CRYPTO` is enabled, `MBEDTLS_ECP_RESTARTABLE`
ceases to have any effect on X.509 and TLS: it simply controls the presence of
the APIs in libmbedcrypto. (Or we could split `ECP_RESTARTABLE` into several
options to achieve a similar effect.) This would allow people to use
restartable ECC in non-TLS, non-X.509 code (for example firmware verification)
with a build that also uses PSA for TLS and X.509, if there is interest in
that.

### `MBEDTLS_PSA_CRYPTO_CONFIG`

X.509 and TLS code use `MBEDTLS_xxx` macros to decide whether an algorithm is
supported. This doesn't make `MBEDTLS_USE_PSA_CRYPTO` incompatible with
`MBEDTLS_PSA_CRYPTO_CONFIG` per se, but it makes it incompatible with most
useful uses of `MBEDTLS_PSA_CRYPTO_CONFIG`. The point of
`MBEDTLS_PSA_CRYPTO_CONFIG` is to be able to build a library with support for
an algorithm through a PSA driver only, without building the software
implementation of that algorithm. But then the TLS code would consider the
algorithm unavailable.
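
The mismatch can be modelled with a small self-contained sketch. It assumes
that `PSA_WANT_ALG_SHA_256` and `MBEDTLS_SHA256_C` are the relevant pair of
macros for SHA-256; the `demo_` functions are hypothetical and exist only to
make the two views observable.

```c
/* Driver-only build: the PSA configuration requests SHA-256, but the
 * software module is not built, so MBEDTLS_SHA256_C stays undefined. */
#define PSA_WANT_ALG_SHA_256 1

/* What legacy-style feature detection (as in TLS code) would conclude. */
int demo_tls_thinks_sha256_available(void)
{
#if defined(MBEDTLS_SHA256_C)
    return 1;
#else
    return 0;
#endif
}

/* What the PSA configuration actually offers. */
int demo_psa_offers_sha256(void)
{
#if defined(PSA_WANT_ALG_SHA_256)
    return 1;
#else
    return 0;
#endif
}
```

In this model the algorithm is usable through PSA, yet code that tests the
legacy macro treats it as missing.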

This is tracked in https://github.com/ARMmbed/mbedtls/issues/3674 and
https://github.com/ARMmbed/mbedtls/issues/3677. But now that I look at it with
fresh eyes, I don't think the approach we were planning to use would actually
work. This needs more design effort.

This is something we need to support eventually, and several partners want it.
I don't know what the priority is for `MBEDTLS_USE_PSA_CRYPTO` between
improving driver support and covering more of the protocol. It seems to me
that it'll be less work overall to first implement a good architecture for
`MBEDTLS_USE_PSA_CRYPTO + MBEDTLS_PSA_CRYPTO_CONFIG` and then extend to more
protocol features, because implementing that architecture will require changes
to the existing code and the less code there is at this point the better,
whereas extending to more protocol features will require the same amount of
work either way.

### `MBEDTLS_PSA_CRYPTO_KEY_ID_ENCODES_OWNER`

When `MBEDTLS_PSA_CRYPTO_KEY_ID_ENCODES_OWNER` is enabled, the library is
built for use with an RPC server that dispatches PSA crypto function calls
from multiple clients. In such a build, all the `psa_xxx` functions that
would normally take a `psa_key_id_t` as argument instead take a structure
containing both the key id and the client id. And so if e.g. a TLS function
calls `psa_import_key`, it would have to pass this structure, not just the
`psa_key_id_t` key id.

A solution is to use `mbedtls_svc_key_id_t` throughout instead of
`psa_key_id_t`, and use similar abstractions to define values. That's what we
do in unit tests of PSA crypto itself to support both cases. That abstraction
is more confusing to readers, so the less we use it the better.
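
The shape of such an abstraction can be sketched as follows. All `demo_` and
`DEMO_` names are hypothetical stand-ins chosen for illustration; they model
the idea behind `mbedtls_svc_key_id_t` and are not the library's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the plain key id and the owner (client) id. */
typedef uint32_t demo_key_id_t;
typedef int32_t demo_owner_id_t;

#if defined(DEMO_KEY_ID_ENCODES_OWNER)
/* RPC-server build: a key is identified by the pair (owner, key id). */
typedef struct {
    demo_key_id_t key_id;
    demo_owner_id_t owner;
} demo_svc_key_id_t;

demo_svc_key_id_t demo_svc_key_id_make(demo_owner_id_t owner, demo_key_id_t id)
{
    demo_svc_key_id_t key = { id, owner };
    return key;
}

demo_key_id_t demo_svc_key_id_get_id(demo_svc_key_id_t key)
{
    return key.key_id;
}
#else
/* Regular build: the service key id is just the key id. */
typedef demo_key_id_t demo_svc_key_id_t;

demo_svc_key_id_t demo_svc_key_id_make(demo_owner_id_t owner, demo_key_id_t id)
{
    (void) owner; /* no owner is encoded in this build */
    return id;
}

demo_key_id_t demo_svc_key_id_get_id(demo_svc_key_id_t key)
{
    return key;
}
#endif

/* Caller written against the abstraction: compiles unchanged in both builds. */
demo_key_id_t demo_roundtrip(demo_key_id_t id)
{
    demo_svc_key_id_t key = demo_svc_key_id_make(0, id);
    return demo_svc_key_id_get_id(key);
}
```

The cost mentioned above shows up in the caller: every key id has to go
through the constructor and accessor instead of being used directly.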

I don't think supporting TLS and an RPC interface in the same build is an
important use case (I don't remember anyone requesting it). So I propose to
ignore it in the design: we just don't intend to support it.

Taking advantage of the existing abstraction layers - or not
============================================================

The Crypto library in Mbed TLS currently has 3 abstraction layers that offer
algorithm-agnostic APIs for a class of algorithms:

- MD for message digests aka hashes (including HMAC)
- Cipher for symmetric ciphers (including AEAD)
- PK for asymmetric (aka public-key) cryptography (excluding key exchange)

Note: key exchange (FFDH, ECDH) is not covered by an abstraction layer.

These abstraction layers typically provide, in addition to the API for crypto
operations, types and numerical identifiers for algorithms (for example
`mbedtls_cipher_mode_t` and its values). The current strategy is to keep using
those identifiers in most of the code, in particular in existing structures
and public APIs, even when `MBEDTLS_USE_PSA_CRYPTO` is enabled. (This is not
an issue for G1, G2, G3 above, and is only potentially relevant for G4.)

There are multiple strategies that can be used regarding the place of those
layers in the migration to PSA.

Silently call to PSA from the abstraction layer
-----------------------------------------------

- Provide a new definition (conditionally on `USE_PSA_CRYPTO`) of wrapper
  functions in the abstraction layer, that call PSA instead of the legacy
  crypto API.
- Upside: changes contained to a single place, no need to change TLS or X.509
  code anywhere.
- Downside: tricky to implement if the PSA implementation is currently done on
  top of that layer (dependency loop).

This strategy is currently used for ECDSA signature verification in the PK
layer, and could be extended to all operations in the PK layer.

This strategy is not very well suited to the Cipher and MD layers, as the PSA
implementation is currently done on top of those layers.

Replace calls for each operation
--------------------------------

- For every operation that's done through this layer in TLS or X.509, just
  replace the function call with calls to PSA (conditionally on
  `USE_PSA_CRYPTO`).
- Upside: conceptually simple, and if the PSA implementation is currently done
  on top of that layer, avoids concerns about dependency loops.
- Downside: TLS/X.509 code has to be changed for each operation.

This strategy is currently used for the MD layer. (Currently only a subset of
calling places, but could be extended to all of them.)
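
The difference between this strategy and the "silent" one is mostly a matter
of where the conditional lives. The following self-contained sketch contrasts
the two; every `demo_` function is a hypothetical stand-in (not a library
API), and each back end returns a tag (1 for legacy, 2 for PSA) so that the
path taken is observable.

```c
#include <stddef.h>

#define DEMO_USE_PSA_CRYPTO /* stand-in for MBEDTLS_USE_PSA_CRYPTO */

/* Hypothetical back ends: 1 = legacy implementation, 2 = PSA. */
int demo_legacy_verify(const unsigned char *sig, size_t len)
{
    (void) sig; (void) len;
    return 1;
}

int demo_psa_verify(const unsigned char *sig, size_t len)
{
    (void) sig; (void) len;
    return 2;
}

int demo_legacy_hash(const unsigned char *msg, size_t len)
{
    (void) msg; (void) len;
    return 1;
}

int demo_psa_hash(const unsigned char *msg, size_t len)
{
    (void) msg; (void) len;
    return 2;
}

/* "Silent" strategy: the conditional is inside the abstraction layer. */
int demo_pk_verify(const unsigned char *sig, size_t len)
{
#if defined(DEMO_USE_PSA_CRYPTO)
    return demo_psa_verify(sig, len);
#else
    return demo_legacy_verify(sig, len);
#endif
}

/* TLS-like caller for the "silent" strategy: no conditional needed. */
int demo_tls_check_signature(const unsigned char *sig, size_t len)
{
    return demo_pk_verify(sig, len);
}

/* "Replace calls" strategy: the conditional is repeated at each call site
 * in TLS/X.509-like code, bypassing the abstraction layer when PSA is used. */
int demo_tls_hash_handshake(const unsigned char *msg, size_t len)
{
#if defined(DEMO_USE_PSA_CRYPTO)
    return demo_psa_hash(msg, len);
#else
    return demo_legacy_hash(msg, len);
#endif
}
```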

Opt-in use of PSA from the abstraction layer
--------------------------------------------

- Provide a new way to set up a context that causes operations on that context
  to be done via PSA.
- Upside: changes mostly contained in one place, TLS/X.509 code only needs to
  be changed when setting up the context, but not when using it. In
  particular, no changes to/duplication of existing public APIs that expect a
  key to be passed as a context of this layer (for example,
  `mbedtls_pk_context`).
- Upside: avoids dependency loop when PSA is implemented on top of that layer.
- Downside: when the context is typically set up by the application, requires
  changes in application code.

There are two variants of this strategy: one where using the new setup
function also allows for key isolation (the key is only held by PSA,
supporting both G1 and G2 in that area), and one without isolation (the key is
still stored outside of PSA most of the time, supporting only G1).

This strategy, with support for key isolation, is currently used for ECDSA
signature generation in the PK layer - see `mbedtls_pk_setup_opaque()`. This
allows use of PSA-held private ECDSA keys in TLS and X.509 with no change to
the TLS/X.509 code, but a contained change in the application. It could be
extended to other private key operations in the PK layer.

This strategy, without key isolation, is also currently used in the Cipher
layer - see `mbedtls_cipher_setup_psa()`. This allows use of PSA for cipher
operations in TLS with no change to the application code, and a
contained change in TLS code. (It currently only supports a subset of ciphers,
but could easily be extended to all of them.)

Note: for private key operations in the PK layer, both the "silent" and the
"opt-in" strategy can apply, and can complement each other, as one provides
support for key isolation, but at the (unavoidable) cost of changes in
application code, while the other requires no application change to get
support for drivers, but fails to provide isolation support.
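
The opt-in strategy can be modelled with a context that records, at setup
time, whether operations should go through PSA. This is a minimal sketch under
stated assumptions: `demo_pk_context` and the `demo_pk_*` functions are
hypothetical and only written in the spirit of `mbedtls_pk_context` and
`mbedtls_pk_setup_opaque()`; the return values are tags (1 for legacy, 2 for
PSA), not real signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical abstraction-layer context. */
typedef struct {
    int uses_psa;                 /* set only by the opt-in setup function */
    uint32_t psa_key_id;          /* valid when uses_psa != 0 */
    const unsigned char *raw_key; /* valid when uses_psa == 0 */
    size_t raw_key_len;
} demo_pk_context;

/* Legacy setup: the key material lives in application-visible memory. */
void demo_pk_setup(demo_pk_context *ctx, const unsigned char *key, size_t len)
{
    ctx->uses_psa = 0;
    ctx->psa_key_id = 0;
    ctx->raw_key = key;
    ctx->raw_key_len = len;
}

/* Opt-in setup: only an identifier is stored, so the key material can stay
 * isolated behind PSA (the variant with key isolation). */
void demo_pk_setup_opaque(demo_pk_context *ctx, uint32_t key_id)
{
    ctx->uses_psa = 1;
    ctx->psa_key_id = key_id;
    ctx->raw_key = NULL;
    ctx->raw_key_len = 0;
}

/* Operation: callers never look at how the context was set up.
 * Returns 2 when the PSA path is taken, 1 for the legacy path. */
int demo_pk_sign(const demo_pk_context *ctx)
{
    return ctx->uses_psa ? 2 : 1;
}
```

Only the setup call differs between the two paths, which is why the change
lands in whichever code creates the context (often the application).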

Migrating away from the legacy API
==================================

This section briefly introduces questions and possible plans towards G4,
mainly as they relate to choices in previous stages.

The role of the PK/Cipher/MD APIs in user migration
---------------------------------------------------

We're currently taking advantage of the existing PK and Cipher layers in order
to reduce the number of places where library code needs to be changed. It's
only natural to consider using the same strategy (with the PK, MD and Cipher
layers) for facilitating migration of application code.

Note: a necessary first step for that would be to make sure PSA is no longer
implemented on top of the concerned layers.

### Zero-cost compatibility layer?

The most favourable case is if we can have a zero-cost abstraction (no
runtime, RAM usage or code size penalty), for example just a bunch of
`#define`s, essentially mapping `mbedtls_` APIs to their `psa_` equivalent.

Unfortunately that's unlikely to fully work. For example, the MD layer uses
the same context type for hashes and HMACs, while the PSA API (rightfully) has
distinct operation types. Similarly, the Cipher layer uses the same context
type for unauthenticated and AEAD ciphers, which again the PSA API
distinguishes.

It is unclear how much value, if any, a zero-cost compatibility layer that's
incomplete (for example, for MD covering only hashes, or for Cipher covering
only AEAD) or differs significantly from the existing API (for example,
introducing new context types) would provide to users.

### Low-cost compatibility layers?

Another possibility is to keep most or all of the existing API for the PK, MD
and Cipher layers, implemented on top of PSA, aiming for the lowest possible
cost. For example, `mbedtls_md_context_t` would be defined as a (tagged) union
of `psa_hash_operation_t` and `psa_mac_operation_t`, then `mbedtls_md_setup()`
would initialize the correct part, and the rest of the functions would be
simple wrappers around PSA functions. This would vastly reduce the complexity
of the layers compared to the existing ones (no need to dispatch through
function pointers, just call the corresponding PSA API).
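
A minimal sketch of the tagged-union idea follows. The operation types here
are hypothetical stand-ins for `psa_hash_operation_t` and
`psa_mac_operation_t`, and the `demo_md_*` functions only count bytes where a
real wrapper would forward to the matching PSA call.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two PSA operation types. */
typedef struct { unsigned int bytes_fed; } demo_hash_operation_t;
typedef struct { unsigned int bytes_fed; unsigned int key_id; } demo_mac_operation_t;

typedef enum { DEMO_MD_NONE = 0, DEMO_MD_HASH, DEMO_MD_HMAC } demo_md_kind_t;

/* One context type for both uses, as in the existing MD API. */
typedef struct {
    demo_md_kind_t kind; /* the tag: which union member is active */
    union {
        demo_hash_operation_t hash;
        demo_mac_operation_t mac;
    } op;
} demo_md_context_t;

/* Setup initializes the part selected by the hmac flag. */
int demo_md_setup(demo_md_context_t *ctx, int hmac)
{
    if (hmac) {
        ctx->kind = DEMO_MD_HMAC;
        ctx->op.mac.bytes_fed = 0;
        ctx->op.mac.key_id = 0;
    } else {
        ctx->kind = DEMO_MD_HASH;
        ctx->op.hash.bytes_fed = 0;
    }
    return 0;
}

/* Update dispatches on the tag; no function-pointer table is involved. */
int demo_md_update(demo_md_context_t *ctx, size_t len)
{
    switch (ctx->kind) {
        case DEMO_MD_HASH: /* a real wrapper would call the PSA hash update */
            ctx->op.hash.bytes_fed += (unsigned int) len;
            return 0;
        case DEMO_MD_HMAC: /* a real wrapper would call the PSA MAC update */
            ctx->op.mac.bytes_fed += (unsigned int) len;
            return 0;
        default:
            return -1; /* context was not set up */
    }
}
```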

Since this would still represent a non-zero cost, not only in terms of code
size, but also in terms of maintenance (testing, etc.) this would probably
be a temporary solution: for example keep the compatibility layers in 4.0 (and
make them optional), but remove them in 5.0.

Again, this provides the most value to users if we can manage to keep the
existing API unchanged. There might be conflicts between this goal and that of
reducing the cost, and judgment calls may need to be made.

Note: when it comes to holding public keys in the PK layer, depending on how
the rest of the code is structured, it may be worth holding the key data in
memory controlled by the PK layer as opposed to a PSA key slot, moving it to a
slot only when needed (see current `ecdsa_verify_wrap` when
`MBEDTLS_USE_PSA_CRYPTO` is defined). For example, when parsing a large
number, N, of X.509 certificates (for example the list of trusted roots), it
might be undesirable to use N PSA key slots for their public keys as long as
the certs are loaded. On the other hand, this could also be addressed by
merging the "X.509 parsing on-demand" (#2478), and then the public key data
would be held as bytes in the X.509 CRT structure, and only moved to a PK
context / PSA slot when it's actually used.

Note: the PK layer actually consists of two relatively distinct parts: crypto
operations, which will be covered by PSA, and parsing/writing (exporting)
from/to various formats, which is currently not fully covered by the PSA
Crypto API.

### Algorithm identifiers and other identifiers

It should be easy to provide the user with a bunch of `#define`s for algorithm
identifiers, for example `#define MBEDTLS_MD_SHA256 PSA_ALG_SHA_256`; most of
those would be in the MD, Cipher and PK compatibility layers mentioned above,
but there might be some in other modules that may be worth considering, for
example identifiers for elliptic curves.
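
Extending the example above, a compatibility header might look like the
following fragment. This is only a sketch of the idea: the header itself is
hypothetical, and which identifiers would actually be covered is an open
choice.

```c
/* Hypothetical compatibility header fragment: legacy MD identifiers become
 * aliases for the corresponding PSA algorithm identifiers. */
#include <psa/crypto.h>

#define MBEDTLS_MD_SHA256 PSA_ALG_SHA_256
#define MBEDTLS_MD_SHA384 PSA_ALG_SHA_384
#define MBEDTLS_MD_SHA512 PSA_ALG_SHA_512
```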

### Lower layers

Generally speaking, we would retire all of the low-level, non-generic modules,
such as AES, SHA-256, RSA, DHM, ECDH, ECP, bignum, etc, without providing
compatibility APIs for them. People would be encouraged to switch to the PSA
API. (The compatibility implementation of the existing PK, MD, Cipher APIs
would mostly benefit people who already used those generic APIs rather than
the low-level, alg-specific ones.)

### APIs in TLS and X.509

Public APIs in TLS and X.509 may be affected by the migration in at least two
ways:

1. APIs that rely on a legacy `mbedtls_` crypto type: for example
   `mbedtls_ssl_conf_own_cert()` to configure a (certificate and the
   associated) private key. Currently the private key is passed as a
   `mbedtls_pk_context` object, which would probably change to a
   `psa_key_id_t`. Since some users would probably still be using the
   compatibility PK layer, it would need a way to easily extract the PSA key
   ID from the PK context.

2. APIs that accept a list of identifiers: for example
   `mbedtls_ssl_conf_curves()` taking a list of `mbedtls_ecp_group_id`s. This
   could be changed to accept a list of pairs (`psa_ecc_family_t`, size) but
   we should probably take this opportunity to move to an identifier
   independent from the underlying crypto implementation and use TLS-specific
   identifiers instead (based on IANA values or custom enums), as is currently
   done in the new `mbedtls_ssl_conf_groups()` API (see #4859).
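
To illustrate the second point, TLS-specific identifiers can be kept at the
API boundary and translated to a (family, size) pair only where the crypto
implementation needs it. In this sketch the numeric values are the IANA TLS
Supported Groups code points; the `demo_` types, the table and the lookup
function are hypothetical and do not describe how the library does it.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA "TLS Supported Groups" code points. */
#define DEMO_TLS_GROUP_SECP256R1 0x0017
#define DEMO_TLS_GROUP_SECP384R1 0x0018
#define DEMO_TLS_GROUP_SECP521R1 0x0019
#define DEMO_TLS_GROUP_X25519    0x001D

/* Hypothetical stand-in for a crypto-level curve family identifier. */
typedef enum { DEMO_FAMILY_SECP_R1, DEMO_FAMILY_MONTGOMERY } demo_ecc_family_t;

typedef struct {
    uint16_t tls_id;          /* identifier used in the public TLS API */
    demo_ecc_family_t family; /* crypto-level family */
    uint16_t bits;            /* crypto-level key size in bits */
} demo_group_info_t;

static const demo_group_info_t demo_groups[] = {
    { DEMO_TLS_GROUP_SECP256R1, DEMO_FAMILY_SECP_R1,    256 },
    { DEMO_TLS_GROUP_SECP384R1, DEMO_FAMILY_SECP_R1,    384 },
    { DEMO_TLS_GROUP_SECP521R1, DEMO_FAMILY_SECP_R1,    521 },
    { DEMO_TLS_GROUP_X25519,    DEMO_FAMILY_MONTGOMERY, 255 },
};

/* Translate a TLS group identifier; returns NULL for unknown groups. */
const demo_group_info_t *demo_group_lookup(uint16_t tls_id)
{
    for (size_t i = 0; i < sizeof(demo_groups) / sizeof(demo_groups[0]); i++) {
        if (demo_groups[i].tls_id == tls_id) {
            return &demo_groups[i];
        }
    }
    return NULL;
}
```

With this split, the application-facing list of groups does not change if the
underlying crypto identifiers do.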

Testing
-------

A question that needs careful consideration when we come around to removing
the low-level crypto APIs and making PK, MD and Cipher optional compatibility
layers is how to preserve testing quality. A lot of the existing test
cases use the low-level crypto APIs; we would need to either keep using that
API for tests, or manually migrate tests to the PSA Crypto API. Perhaps a
combination of both, perhaps evolving gradually over time.