PSA migration strategy for hashes and ciphers
=============================================

## Introduction

This document discusses a migration strategy for code that is not subject to `MBEDTLS_USE_PSA_CRYPTO`, is currently using legacy cryptography APIs, and should transition to PSA, without a major version change.

### Relationship with the main strategy document

This is complementary to the main [strategy document](strategy.html) and is intended as a refinement. However, at this stage, there may be contradictions between the strategy proposed here and some of the earlier strategy.

A difference between the original strategy and the current one is that in this work, we are not treating PSA as a black box. We can change experimental features, and we can call internal interfaces.

## Requirements

### User stories

#### Backward compatibility user story

As a developer of an application that uses Mbed TLS's interfaces (including legacy crypto),
I want Mbed TLS to preserve backward compatibility,
so that my code keeps working in new minor versions of Mbed TLS.

#### Interface design user story

As a developer of library code that uses Mbed TLS to perform cryptographic operations,
I want to know which functions to call and which feature macros to check,
so that my code works in all Mbed TLS configurations.

Note: this is the same problem we face in X.509 and TLS.

#### Hardware accelerator vendor user stories

As a vendor of a platform with hardware acceleration for some crypto,
I want to build Mbed TLS in a way that uses my hardware wherever relevant,
so that my customers maximally benefit from my hardware.

As a vendor of a platform with hardware acceleration for some crypto,
I want to build Mbed TLS without software that replicates what my hardware does,
to minimize the code size.

#### Maintainer user stories

As a maintainer of Mbed TLS,
I want to have clear rules for when to use which interface,
to avoid bugs in “unusual” configurations.

As a maintainer of Mbed TLS,
I want to avoid duplicating code,
because this is inefficient and error-prone.
52### Use PSA more
53
54In the long term, all code using cryptography should use PSA interfaces, to benefit from PSA drivers, allow eliminating legacy interfaces (less code size, less maintenance). However, this can't be done without breaking [backward compatibility](#backward-compatibility).
55
56The goal of this work is to arrange for more non-PSA interfaces to use PSA interfaces under the hood, without breaking code in the cases where this doesn't work. Using PSA interfaces has two benefits:
57
58* Where a PSA driver is available, it likely has better performance, and sometimes better security, than the built-in software implementation.
59* In many scenarios, where a PSA driver is available, this allows removing the software implementation altogether.
60* We may be able to get rid of some redundancies, for example the duplication between the implementations of HMAC in `md.c` and in `psa_crypto_mac.c`, and HKDF in `hkdf.c` and `psa_crypto.c`.
61
62### Correct dependencies
63
64Traditionally, to determine whether a cryptographic mechanism was available, you had to check whether the corresponding Mbed TLS module or submodule was present: `MBEDTLS_SHA256_C` for SHA256, `MBEDTLS_AES_C && MBEDTLS_CIPHER_MODE_CBC` for AES-CBC, etc. In code that uses the PSA interfaces, this needs to change to `PSA_WANT_xxx` symbols.
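
As a rough illustration of this change of guards, the sketch below writes the same availability check once against the legacy symbol and once against the PSA symbol. The two `#define` lines stand in for a build configuration, and `sha256_guard_in_use()` is a hypothetical helper invented for this example, not a library function.

```c
/* Stand-ins for a build configuration. A real build gets these symbols
 * from the library configuration, not from the source file itself. */
#define MBEDTLS_SHA256_C
#define PSA_WANT_ALG_SHA_256 1

/* Hypothetical helper: names the symbol that a piece of code should
 * check, depending on which interface it uses to compute SHA-256. */
static const char *sha256_guard_in_use(int code_uses_psa_interface)
{
    if (code_uses_psa_interface) {
#if defined(PSA_WANT_ALG_SHA_256)
        return "PSA_WANT_ALG_SHA_256";
#else
        return "unavailable";
#endif
    } else {
#if defined(MBEDTLS_SHA256_C)
        return "MBEDTLS_SHA256_C";
#else
        return "unavailable";
#endif
    }
}
```

Removing one of the two stand-in definitions makes the corresponding branch report `"unavailable"`, which is the situation the rest of this document has to deal with.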

### Backward compatibility

All documented behavior must be preserved, except for interfaces currently described as experimental or unstable. Those interfaces can change, but we should minimize disruption by providing a transition path for reasonable use cases.

#### Changeable configuration options

The following configuration options are described as experimental, and are likely to change at least marginally:

* `MBEDTLS_PSA_CRYPTO_CLIENT`: “This interface is experimental and may change or be removed without notice.” In practice we don't want to remove this, but we may constrain how it's used.
* `MBEDTLS_PSA_CRYPTO_DRIVERS`: “This interface is experimental. We intend to maintain backward compatibility with application code that relies on drivers, but the driver interfaces may change without notice.” In practice, this may mean constraints not only on how to write drivers, but also on how to integrate drivers into code that is platform code more than application code.
* `MBEDTLS_PSA_CRYPTO_CONFIG`: “This feature is still experimental and is not ready for production since it is not completed.” We may want to change this, for example, to automatically enable more mechanisms (although this wouldn't be considered a backward compatibility break anyway, since we don't promise that you will not get a feature if you don't enable its `PSA_WANT_xxx`).

### Non-goals

It is not a goal at this stage to make more code directly call `psa_xxx` functions. Rather, the goal is to make more code call PSA drivers where available. How dispatch is done is secondary.

## Problem analysis

### Scope analysis

#### Limitations of `MBEDTLS_USE_PSA_CRYPTO`

The option `MBEDTLS_USE_PSA_CRYPTO` causes parts of the library to call the PSA API instead of legacy APIs for cryptographic calculations. `MBEDTLS_USE_PSA_CRYPTO` only applies to `pk.h`, X.509 and TLS. When this option is enabled, applications must call `psa_crypto_init()` before calling any of the functions in these modules.

In this work, we want two things:

* Make non-covered modules call PSA, but only [when this will actually work](#why-psa-is-not-always-possible). This effectively brings those modules to a partial use-PSA behavior (benefiting from PSA accelerators when they're usable) regardless of whether the option is enabled.
* Call PSA when a covered module calls a non-covered module which calls another module, for example X.509 calling pk for PSS verification which calls RSA which calculates a hash ([see issue \#6497](https://github.com/Mbed-TLS/mbedtls/issues/6497)). This effectively extends the option to modules that aren't directly covered.

#### Classification of callers

We can classify code that implements or uses cryptographic mechanisms into several groups:

* Software implementations of primitive cryptographic mechanisms. These are not expected to change.
* Software implementations of constructed cryptographic mechanisms (e.g. HMAC, CTR_DRBG, RSA (calling a hash for PSS/OAEP, and needing to know the hash length in PKCS1v1.5 sign/verify), …). These need to keep working whenever a legacy implementation of the auxiliary mechanism is available, regardless of whether a PSA implementation is also available.
* Code implementing the PSA crypto interface. This is not expected to change, except perhaps to expose some internal functionality to overhauled glue code.
* Code that's subject to `MBEDTLS_USE_PSA_CRYPTO`: `pk.h`, X.509, TLS (excluding TLS 1.3).
* Code that always uses PSA for crypto: TLS 1.3, LMS.

For the purposes of this work, three domains emerge:

* **Legacy domain**: does not interact with PSA. Implementations of hashes, of cipher primitives, of arithmetic.
* **Mixed domain**: does not currently use PSA, but should [when possible](#why-psa-is-not-always-possible). This consists of the constructed cryptographic primitives (except LMS), as well as pk, X.509 and TLS when `MBEDTLS_USE_PSA_CRYPTO` is disabled.
* **PSA domain**: includes pk, X.509 and TLS when `MBEDTLS_USE_PSA_CRYPTO` is enabled. Also TLS 1.3, LMS.

#### Non-use-PSA modules

The following modules in Mbed TLS call another module to perform cryptographic operations which, in the long term, will be provided through a PSA interface, but cannot make any PSA-related assumption.

Hashes and HMAC (after the work on driver-only hashes):

* entropy (hashes via MD-light)
* ECDSA (HMAC\_DRBG; `md.h` exposed through API)
* ECJPAKE (hashes via MD-light; `md.h` exposed through API)
* MD (hashes and HMAC)
* HKDF (HMAC via `md.h`; `md.h` exposed through API)
* HMAC\_DRBG (hashes and HMAC via `md.h`; `md.h` exposed through API)
* PKCS12 (hashes via MD-light)
* PKCS5 (HMAC via `md.h`; `md.h` exposed through API)
* PKCS7 (hashes via MD)
* RSA (hash via MD-light for PSS and OAEP; `md.h` exposed through API)
* PEM (MD5 hash via MD-light)

Symmetric ciphers and AEADs (before work on driver-only cipher):

* PEM:
    * AES, DES or 3DES in CBC mode without padding, decrypt only (!).
    * Currently using low-level non-generic APIs.
    * No hard dependency, features guarded by `AES_C` resp. `DES_C`.
    * Functions called: `setkey_dec()` + `crypt_cbc()`.
* PKCS12:
    * In practice: 2DES or 3DES in CBC mode with PKCS7 padding, decrypt only
      (when called from pkparse).
    * In principle: any cipher-mode (default padding), passed an
      `mbedtls_cipher_type_t` as an argument, no documented restriction.
    * Cipher, generically, selected from ASN.1 or function parameters;
      no documented restriction but in practice TODO (inc. padding and
      en/decrypt, look at standards and tests)
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Note: `cipher.h` exposed through API.
    * Functions called: `setup`, `setkey`, `set_iv`, `reset`, `update`, `finish` (in sequence, once).
* PKCS5 (PBES2, `mbedtls_pkcs5_pbes2()`):
    * 3DES or DES in CBC mode with PKCS7 padding, both encrypt and decrypt.
    * Note: could also be AES in the future, see #7038.
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Functions called: `setup`, `setkey`, `crypt`.
* CTR\_DRBG:
    * AES in ECB mode, encrypt only.
    * Currently using low-level non-generic API (`aes.h`).
    * Unconditional dependency on `AES_C` in `check_config.h`.
    * Functions called: `setkey_enc`, `crypt_ecb`.
* CCM:
    * AES, Camellia or Aria in ECB mode, encrypt only.
    * Unconditional dependency on `AES_C || CAMELLIA_C || ARIA_C` in `check_config.h`.
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Note: also called by `cipher.c` if enabled.
    * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish)
* CMAC:
    * AES or DES in ECB mode, encrypt only.
    * Unconditional dependency on `AES_C || DES_C` in `check_config.h`.
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Note: also called by `cipher.c` if enabled.
    * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish)
* GCM:
    * AES, Camellia or Aria in ECB mode, encrypt only.
    * Unconditional dependency on `AES_C || CAMELLIA_C || ARIA_C` in `check_config.h`.
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Note: also called by `cipher.c` if enabled.
    * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish)
* NIST\_KW:
    * AES in ECB mode, both encrypt and decrypt.
    * Unconditional dependency on `AES_C || DES_C` in `check_config.h`.
    * Unconditional dependency on `CIPHER_C` in `check_config.h`.
    * Note: also called by `cipher.c` if enabled.
    * Note: `cipher.h` exposed through API.
    * Functions called: `info`, `setup`, `setkey`, `update` (several times) - (never finish)
* Cipher:
    * potentially any cipher/AEAD in any mode and any direction

Note: PSA cipher is built on Cipher, but PSA AEAD directly calls the underlying AEAD modules (GCM, CCM, ChachaPoly).

### Difficulties

#### Why PSA is not always possible

Here are some reasons why calling `psa_xxx()` to perform a hash or cipher calculation might not be desirable in some circumstances, explaining why the application would arrange to call the legacy software implementation instead.

* `MBEDTLS_PSA_CRYPTO_C` is disabled.
* There is a PSA driver which has not been initialized (this happens in `psa_crypto_init()`).
* For ciphers, the keystore is not initialized yet, and Mbed TLS uses a custom implementation of PSA ITS where the file system is not accessible yet (because something else needs to happen first, and the application takes care that it happens before it calls `psa_crypto_init()`). A possible workaround may be to dispatch to the internal functions that are called after the keystore lookup, rather than to the PSA API functions (but this is incompatible with `MBEDTLS_PSA_CRYPTO_CLIENT`).
* The requested mechanism is enabled in the legacy interface but not in the PSA interface. This was not really intended, but is possible, for example, if you enable `MBEDTLS_MD5_C` for PEM decoding with PBKDF1 but don't want `PSA_WANT_ALG_MD5` because it isn't supported for `PSA_ALG_RSA_PSS` and `PSA_ALG_DETERMINISTIC_ECDSA`.
* `MBEDTLS_PSA_CRYPTO_CLIENT` is enabled, and the client has not yet activated the connection to the server (this happens in `psa_crypto_init()`).
* `MBEDTLS_PSA_CRYPTO_CLIENT` is enabled, but the operation is part of the implementation of an encrypted communication with the crypto service, or the local implementation is faster because it avoids a costly remote procedure call.

#### Indirect knowledge

Consider for example the code in `rsa.c` to perform an RSA-PSS signature. It needs to calculate a hash. If `mbedtls_rsa_rsassa_pss_sign()` is called directly by application code, it is supposed to call the built-in implementation: calling a PSA accelerator would be a behavior change, acceptable only if this does not add a risk of failure or performance degradation ([PSA is impossible or undesirable in some circumstances](#why-psa-is-not-always-possible)). Note that this holds regardless of the state of `MBEDTLS_USE_PSA_CRYPTO`, since `rsa.h` is outside the scope of `MBEDTLS_USE_PSA_CRYPTO`. On the other hand, if `mbedtls_rsa_rsassa_pss_sign()` is called from X.509 code, it should use PSA to calculate hashes. It doesn't, currently, which is [bug \#6497](https://github.com/Mbed-TLS/mbedtls/issues/6497).

Generally speaking, modules in the mixed domain:

* must call PSA if called by a module in the PSA domain;
* must not call PSA (or must have a fallback) if their caller is not in the PSA domain and the PSA call is not guaranteed to work.
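
The two rules above can be sketched as a small decision function. This is only an illustration: `hash_engine_t` and `pick_hash_engine()` are hypothetical names, and the caller's domain is passed as a parameter purely for the sake of the example, since a real mixed-domain module does not have direct access to that information.

```c
#include <stdbool.h>

typedef enum {
    HASH_ENGINE_LEGACY,   /* built-in software implementation */
    HASH_ENGINE_PSA       /* dispatch through PSA, possibly to a driver */
} hash_engine_t;

/* Hypothetical decision rule for a mixed-domain module.
 * caller_in_psa_domain: the ultimate caller is PSA-domain code.
 * psa_known_to_work:    the PSA call is guaranteed to work here. */
static hash_engine_t pick_hash_engine(bool caller_in_psa_domain,
                                      bool psa_known_to_work)
{
    if (caller_in_psa_domain) {
        return HASH_ENGINE_PSA;      /* first rule: must call PSA */
    }
    if (!psa_known_to_work) {
        return HASH_ENGINE_LEGACY;   /* second rule: must not rely on PSA */
    }
    return HASH_ENGINE_PSA;          /* PSA is safe, so it may be used */
}
```

The last case (a non-PSA caller with PSA guaranteed to work) is where this work hopes to gain accelerator support; the sketch chooses PSA there, which is an assumption about the desired policy rather than something the rules mandate.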

#### Non-support guarantees: requirements

Generally speaking, just because some feature is not enabled in `mbedtls_config.h` or `psa_config.h` doesn't guarantee that it won't be enabled in the build. We can enable additional features through `build_info.h`.

If `PSA_WANT_xxx` is disabled, this should guarantee that attempting xxx through the PSA API will fail. This is generally guaranteed by the test suite `test_suite_psa_crypto_not_supported` with automatically enumerated test cases, so it would be inconvenient to carve out an exception.

### Technical requirements

Based on the preceding analysis, the core of the problem is: for code in the mixed domain (see [“Classification of callers”](#classification-of-callers)), how do we handle a cryptographic mechanism? This has several related subproblems:

* How the mechanism is encoded (e.g. `mbedtls_md_type_t` vs `const mbedtls_md_info_t *` vs `psa_algorithm_t` for hashes).
* How to decide whether a specific algorithm or key type is supported (eventually based on `MBEDTLS_xxx_C` vs `PSA_WANT_xxx`).
* How to obtain metadata about algorithms (e.g. hash/MAC/tag size, key size).
* How to perform the operation (context type, which functions to call).

We need a way to decide this based on the available information:

* Who's the ultimate caller (see [indirect knowledge](#indirect-knowledge)), which is not actually available.
* Some parameter indicating which algorithm to use.
* The available cryptographic implementations, based on preprocessor symbols (`MBEDTLS_xxx_C`, `PSA_WANT_xxx`, `MBEDTLS_PSA_ACCEL_xxx`, etc.).
* Possibly additional runtime state (for example, we might check whether `psa_crypto_init` has been called).

And we need to take care of [the cases where PSA is not possible](#why-psa-is-not-always-possible): either make sure the current behavior is preserved, or (where allowed by backward compatibility) document a behavior change and, preferably, a workaround.

### Working through an example: RSA-PSS

Let us work through the example of RSA-PSS which calculates a hash, as in [issue \#6497](https://github.com/Mbed-TLS/mbedtls/issues/6497).

RSA is in the [mixed domain](#classification-of-callers). So:

* When called from `psa_sign_hash` and other PSA functions, it must call the PSA hash accelerator if there is one.
* When called from user code, it must call the built-in hash implementation if PSA is not available (regardless of whether this is because `MBEDTLS_PSA_CRYPTO_C` is disabled, or because `PSA_WANT_ALG_xxx` is disabled for this hash, or because there is an accelerator driver which has not been initialized yet).

RSA knows which hash algorithm to use based on a parameter of type `mbedtls_md_type_t`. (More generally, all mixed-domain modules that take an algorithm specification as a parameter take it via a numerical type, except HMAC\_DRBG and HKDF which take a `const mbedtls_md_info_t *` instead, and CMAC which takes a `const mbedtls_cipher_info_t *`.)

#### Double encoding solution

A natural solution is to double up the encoding of hashes in `mbedtls_md_type_t`. Pass `MBEDTLS_MD_SHA256` and `md` will dispatch to the legacy code, pass a new constant `MBEDTLS_MD_SHA256_USE_PSA` and `md` will dispatch through PSA.

This maximally preserves backward compatibility, but then no non-PSA code benefits from PSA accelerators, and there's little potential for removing the software implementation.
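
To make the double encoding idea concrete, here is a minimal sketch using stand-in names. The `demo_` types and `demo_md_route()` are hypothetical and only mimic the roles of `mbedtls_md_type_t` and the dispatch inside `md`; they are not the library's definitions.

```c
/* Stand-in for mbedtls_md_type_t with a doubled encoding of SHA-256. */
typedef enum {
    DEMO_MD_NONE = 0,
    DEMO_MD_SHA256,           /* plays the role of MBEDTLS_MD_SHA256 */
    DEMO_MD_SHA256_USE_PSA    /* plays the role of the proposed new constant */
} demo_md_type_t;

typedef enum {
    DEMO_ROUTE_NONE,          /* unknown algorithm */
    DEMO_ROUTE_LEGACY,        /* built-in software implementation */
    DEMO_ROUTE_PSA            /* PSA API, possibly an accelerator */
} demo_route_t;

/* The caller's choice of constant alone decides where the call goes. */
static demo_route_t demo_md_route(demo_md_type_t type)
{
    switch (type) {
        case DEMO_MD_SHA256:
            return DEMO_ROUTE_LEGACY;
        case DEMO_MD_SHA256_USE_PSA:
            return DEMO_ROUTE_PSA;
        default:
            return DEMO_ROUTE_NONE;
    }
}
```

The sketch also shows the drawback noted above: existing callers keep passing the old constant, so they keep reaching the legacy route and never see an accelerator.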

#### Availability of hashes in RSA-PSS

Here we try to answer the question: As a caller of RSA-PSS via `rsa.h`, how do I know whether it can use a certain hash?

* For a caller in the legacy domain: if e.g. `MBEDTLS_SHA256_C` is enabled, then I want RSA-PSS to support SHA-256. I don't care about negative support. So `MBEDTLS_SHA256_C` must imply support for RSA-PSS-SHA-256. It must work at all times, regardless of the state of PSA (e.g. drivers not initialized).
* For a caller in the PSA domain: if e.g. `PSA_WANT_ALG_SHA_256` is enabled, then I want RSA-PSS to support SHA-256, provided that `psa_crypto_init()` has been called. In some limited cases, such as `test_suite_psa_crypto_not_supported` when PSA implements RSA-PSS in software, we care about negative support: if `PSA_WANT_ALG_SHA_256` is disabled then `psa_verify_hash` must reject `PSA_ALG_SHA_256`. This can be done at the level of PSA before it calls the RSA module, though, so it doesn't have any implication on the RSA module. As far as `rsa.c` is concerned, what matters is that `PSA_WANT_ALG_SHA_256` implies that SHA-256 is supported after `psa_crypto_init()` has been called.
* For a caller in the mixed domain: requirements depend on the caller. Whatever solution RSA has to determine the availability of algorithms will apply to its caller as well.

Conclusion so far: RSA must be able to do SHA-256 if either `MBEDTLS_SHA256_C` or `PSA_WANT_ALG_SHA_256` is enabled. If only `PSA_WANT_ALG_SHA_256` and not `MBEDTLS_SHA256_C` is enabled (which implies that PSA's SHA-256 comes from an accelerator driver), then SHA-256 only needs to work if `psa_crypto_init()` has been called.
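
This conclusion can be restated as a truth function. The sketch below is an illustration only: `rsa_pss_sha256_usable()` is a hypothetical name, and the compile-time symbols are modelled as boolean parameters so that every combination can be examined at run time.

```c
#include <stdbool.h>

/* Hypothetical model of the requirement on RSA-PSS with SHA-256.
 * legacy_sha256_built_in: models MBEDTLS_SHA256_C being enabled.
 * psa_want_sha256:        models PSA_WANT_ALG_SHA_256 being enabled.
 * psa_crypto_initialized: psa_crypto_init() has been called. */
static bool rsa_pss_sha256_usable(bool legacy_sha256_built_in,
                                  bool psa_want_sha256,
                                  bool psa_crypto_initialized)
{
    if (legacy_sha256_built_in) {
        return true;   /* must work at all times, whatever the PSA state */
    }
    /* PSA-only SHA-256: only required to work after initialization. */
    return psa_want_sha256 && psa_crypto_initialized;
}
```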

#### More in-depth discussion of compile-time availability determination

The following combinations of compile-time support are possible:

* `MBEDTLS_PSA_CRYPTO_CLIENT`. Then calling PSA may or may not be desirable for performance. There are plausible use cases where only the server has access to an accelerator so it's best to call the server, and plausible use cases where calling the server has overhead that negates the savings from using acceleration, if there are savings at all. In any case, calling PSA only works if the connection to the server has been established, meaning `psa_crypto_init` has been called successfully. In the rest of this case enumeration, assume `MBEDTLS_PSA_CRYPTO_CLIENT` is disabled.
* No PSA accelerator. Then just call `mbedtls_sha256`, it's all there is, and it doesn't matter (from an API perspective) exactly what call chain leads to it.
* PSA accelerator, no software implementation. Then we might as well call the accelerator, unless it's important that the call fails. At the time of writing, I can't think of a case where we would want to guarantee that if `MBEDTLS_xxx_C` is not enabled, but xxx is enabled through PSA, then a request to use algorithm xxx through some legacy interface must fail.
* Both PSA acceleration and the built-in implementation. In this case, we would prefer PSA for the acceleration, but we can only do this if the accelerator driver is working. For hashes, it's enough to assume the driver is initialized; we've [considered requiring hash drivers to work without initialization](https://github.com/Mbed-TLS/mbedtls/pull/6470). For ciphers, this is more complicated because the cipher functions require the keystore, and plausibly a cipher accelerator might want entropy (for side channel countermeasures) which might not be available at boot time.

Note that it's a bit tricky to determine which algorithms are available. In the case where there is a PSA accelerator but no software implementation, we don't want the preprocessor symbols to indicate that the algorithm is available through the legacy domain, only through the PSA domain. What does this mean for the interfaces in the mixed domain? They can't guarantee the availability of the algorithm, but they must try if requested.

### Designing an interface for hashes

In this section, we specify a hash metadata and calculation for the [mixed domain](#classification-of-callers), i.e. code that can be called both from legacy code and from PSA code.

#### Availability of hashes

Generalizing the analysis in [“Availability of hashes in RSA-PSS”](#availability-of-hashes-in-rsa-pss):

A hash is available through the mixed-domain interface iff either of the following conditions is true:

* A legacy hash interface is available and the hash algorithm is implemented in software.
* PSA crypto is enabled and the hash algorithm is implemented via PSA.

We could go further and make PSA accelerators available to legacy callers that call any legacy hash interface, e.g. `md.h` or `shaX.h`. There is little point in doing this, however: callers should just use the mixed-domain interface.

#### Implications between legacy availability and PSA availability

* When `MBEDTLS_PSA_CRYPTO_CONFIG` is disabled, all legacy mechanisms are automatically enabled through PSA. Users can manually enable PSA mechanisms that are available through accelerators but not through legacy, but this is not officially supported (users are not supposed to manually define PSA configuration symbols when `MBEDTLS_PSA_CRYPTO_CONFIG` is disabled).
* When `MBEDTLS_PSA_CRYPTO_CONFIG` is enabled, there is no mandatory relationship between PSA support and legacy support for a mechanism. Users can configure legacy support and PSA support independently. Legacy support is automatically enabled if PSA support is requested, but only if there is no accelerator.

It is strongly desirable to allow mechanisms available through PSA but not legacy: this allows saving code size when an accelerator is present.

There is no strong reason to allow mechanisms available through legacy but not PSA when `MBEDTLS_PSA_CRYPTO_C` is enabled. This would only save at best a very small amount of code size in the PSA dispatch code. This may be more desirable when `MBEDTLS_PSA_CRYPTO_CLIENT` is enabled (having a mechanism available only locally and not in the crypto service), but we do not have an explicit request for this and it would be entirely reasonable to forbid it.

In this analysis, we have not found a compelling reason to require all legacy mechanisms to also be available through PSA. However, this can simplify both the implementation and the use of dispatch code thanks to some simplifying properties:

* Mixed-domain code can call PSA code if it knows that `psa_crypto_init()` has been called, without having to inspect the specifics of algorithm support.
* Mixed-domain code can assume that PSA buffer calculations work correctly for all algorithms that it supports.

#### Shape of the mixed-domain hash interface

We now need to create an abstraction for mixed-domain hash calculation. (We could not create an abstraction, but that would require every piece of mixed-domain code to replicate the logic here. We went that route in Mbed TLS 3.3, but it made it effectively impossible to get something that works correctly.)

Requirements: given a hash algorithm,

* Obtain some metadata about it (size, block size).
* Calculate the hash.
* Set up a multipart operation to calculate the hash. The operation must support update, finish, reset, abort, clone.

The existing interface in `md.h` is close to what we want, but not perfect. What's wrong with it?

* It has an extra step of converting from `mbedtls_md_type_t` to `const mbedtls_md_info_t *`.
* It includes extra fluff such as names and HMAC. This costs code size.
* The md module has some legacy baggage dating from when it was more open, which we don't care about anymore. This may cost code size.

These problems are easily solvable.

* `mbedtls_md_info_t` can become a very thin type. We can't remove the extra function call from the source code of callers, but we can make it a very thin abstraction that compilers can often optimize.
* We can make names and HMAC optional. The mixed-domain hash interface won't be the full `MBEDTLS_MD_C` but a subset.
* We can optimize `md.c` without making API changes to `md.h`.

## Specification

### MD light

https://github.com/Mbed-TLS/mbedtls/pull/6474 implements part of this specification, but it's based on Mbed TLS 3.2, so it needs to be rewritten for 3.3.

#### Definition of MD light

MD light is a subset of `md.h` that implements the hash calculation interface described in “[Designing an interface for hashes](#designing-an-interface-for-hashes)”. It is activated by `MBEDTLS_MD_LIGHT` in `mbedtls_config.h`.

The following things enable MD light automatically in `build_info.h`:

* A [mixed-domain](#classification-of-callers) module that needs to calculate hashes is enabled.
* `MBEDTLS_MD_C` is enabled.

MD light includes the following types:

* `mbedtls_md_type_t`
* `mbedtls_md_info_t`
* `mbedtls_md_context_t`

MD light includes the following functions:

* `mbedtls_md_info_from_type`
* `mbedtls_md_init`
* `mbedtls_md_free`
* `mbedtls_md_setup` but `hmac` must be 0 if `MBEDTLS_MD_C` is disabled.
* `mbedtls_md_clone`
* `mbedtls_md_get_size`
* `mbedtls_md_get_type`
* `mbedtls_md_starts`
* `mbedtls_md_update`
* `mbedtls_md_finish`
* `mbedtls_md`

Unlike the full MD, MD light does not support null pointers as `mbedtls_md_context_t *`. At least some functions still need to support null pointers as `const mbedtls_md_info_t *` because this arises when you try to use an unsupported algorithm (`mbedtls_md_info_from_type` returns `NULL`).

#### MD algorithm support macros

For each hash algorithm, `md.h` defines a macro `MBEDTLS_MD_CAN_xxx` whenever the corresponding hash is available through MD light. These macros are only defined when `MBEDTLS_MD_LIGHT` is enabled. Per “[Availability of hashes](#availability-of-hashes)”, `MBEDTLS_MD_CAN_xxx` is enabled if:

* the corresponding `MBEDTLS_xxx_C` is defined; or
* one of `MBEDTLS_PSA_CRYPTO_C` or `MBEDTLS_PSA_CRYPTO_CLIENT` is enabled, and the corresponding `PSA_WANT_ALG_xxx` is enabled.

Note that some algorithms have different spellings in legacy and PSA. Since MD is a legacy interface, we'll use the legacy names. Thus, for example:

```
#if defined(MBEDTLS_MD_LIGHT)
#if defined(MBEDTLS_SHA256_C) || \
    (defined(MBEDTLS_PSA_CRYPTO_C) && PSA_WANT_ALG_SHA_256)
#define MBEDTLS_MD_CAN_SHA256
#endif
#endif
```

Note: in the future, we may want to replace `defined(MBEDTLS_PSA_CRYPTO_C)`
with `defined(MBEDTLS_PSA_CRYPTO_C) || defined(MBEDTLS_PSA_CRYPTO_CLIENT)` but
for now this is out of scope.
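
To illustrate how calling code might consume such a macro, here is a minimal self-contained sketch. All `DEMO_` names are hypothetical stand-ins for the real configuration symbols, so that the block compiles without Mbed TLS. It follows the same structure as the SHA-256 rule above, except that it tests `defined(...)` on the `PSA_WANT` stand-in so that undefined symbols are handled explicitly.

```c
/* Hypothetical stand-ins for the real configuration symbols.
 * This demo configuration enables SHA-256 through PSA only;
 * DEMO_SHA256_C (the legacy module) is deliberately left undefined,
 * and nothing at all is enabled for SHA-512. */
#define DEMO_MD_LIGHT
#define DEMO_PSA_CRYPTO_C
#define DEMO_PSA_WANT_ALG_SHA_256 1

/* Derived availability macros, one per hash. */
#if defined(DEMO_MD_LIGHT)
#if defined(DEMO_SHA256_C) || \
    (defined(DEMO_PSA_CRYPTO_C) && defined(DEMO_PSA_WANT_ALG_SHA_256))
#define DEMO_MD_CAN_SHA256
#endif
#if defined(DEMO_SHA512_C) || \
    (defined(DEMO_PSA_CRYPTO_C) && defined(DEMO_PSA_WANT_ALG_SHA_512))
#define DEMO_MD_CAN_SHA512
#endif
#endif

/* Calling code tests only the CAN macro, never the legacy module symbol
 * or the PSA symbol directly. */
static int demo_sha256_available(void)
{
#if defined(DEMO_MD_CAN_SHA256)
    return 1;
#else
    return 0;
#endif
}

static int demo_sha512_available(void)
{
#if defined(DEMO_MD_CAN_SHA512)
    return 1;
#else
    return 0;
#endif
}
```

With this demo configuration, SHA-256 is reported as available even though no legacy module is enabled, and SHA-512 is not.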

#### MD light internal support macros

* If at least one hash has a PSA driver, define `MBEDTLS_MD_SOME_PSA`.
* If at least one hash has a legacy implementation, define `MBEDTLS_MD_SOME_LEGACY`.

#### Support for PSA in the MD context

An MD context needs to contain either a legacy module's context (or a pointer to one, as is the case now), or a PSA context (or a pointer to one).

I am inclined to remove the pointer indirection, but this means that an MD context would always be as large as the largest supported hash context. So for the time being, this specification keeps a pointer. For uniformity, PSA will also have a pointer (we may simplify this later).

```
typedef enum {
    MBEDTLS_MD_ENGINE_LEGACY,
    MBEDTLS_MD_ENGINE_PSA,
} mbedtls_md_engine_t; // private type

typedef struct mbedtls_md_context_t {
    mbedtls_md_type_t type;
#if defined(MBEDTLS_MD_SOME_PSA)
    mbedtls_md_engine_t engine;
#endif
    void *md_ctx; // mbedtls_xxx_context or psa_hash_operation
#if defined(MBEDTLS_MD_C)
    void *hmac_ctx;
#endif
} mbedtls_md_context_t;
```

All fields are private.

The `engine` field is almost redundant with knowledge about `type`. However, when an algorithm is available both via a legacy module and a PSA accelerator, we will choose based on the runtime availability of the accelerator when the context is set up. This choice needs to be recorded in the context structure.

#### Inclusion of MD info structures

MD light needs to support hashes that are only enabled through PSA. Therefore the `mbedtls_md_info_t` structures must be included based on `MBEDTLS_MD_CAN_xxx` instead of just the legacy module.

The same criterion applies in `mbedtls_md_info_from_type`.

#### Conversion to PSA encoding

The implementation needs to convert from a legacy type encoding to a PSA encoding.

```
static inline psa_algorithm_t psa_alg_of_md_info(
    const mbedtls_md_info_t *md_info );
```
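
As a rough illustration, one straightforward way to implement such a conversion is a `switch` over the legacy type. The sketch below uses hypothetical `demo_` stand-in types so that it compiles without Mbed TLS; the `DEMO_PSA_ALG_*` values follow the PSA hash algorithm encoding, while the legacy enum values are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for psa_algorithm_t and a few PSA hash algorithm identifiers. */
typedef uint32_t demo_psa_algorithm_t;
#define DEMO_PSA_ALG_NONE    ((demo_psa_algorithm_t) 0)
#define DEMO_PSA_ALG_SHA_1   ((demo_psa_algorithm_t) 0x02000005)
#define DEMO_PSA_ALG_SHA_256 ((demo_psa_algorithm_t) 0x02000009)
#define DEMO_PSA_ALG_SHA_512 ((demo_psa_algorithm_t) 0x0200000b)

/* Stand-in for mbedtls_md_type_t with arbitrary numbering. */
typedef enum {
    DEMO_MD_NONE = 0,
    DEMO_MD_SHA1,
    DEMO_MD_SHA256,
    DEMO_MD_SHA512,
} demo_md_type_t;

/* Stand-in for mbedtls_md_info_t, reduced to the field used here. */
typedef struct {
    demo_md_type_t type;
} demo_md_info_t;

/* Table-free conversion: a NULL info (unsupported algorithm) and any
 * unknown type both map to "no algorithm". */
static inline demo_psa_algorithm_t demo_psa_alg_of_md_info(
    const demo_md_info_t *md_info)
{
    if (md_info == NULL) {
        return DEMO_PSA_ALG_NONE;
    }
    switch (md_info->type) {
        case DEMO_MD_SHA1:
            return DEMO_PSA_ALG_SHA_1;
        case DEMO_MD_SHA256:
            return DEMO_PSA_ALG_SHA_256;
        case DEMO_MD_SHA512:
            return DEMO_PSA_ALG_SHA_512;
        default:
            return DEMO_PSA_ALG_NONE;
    }
}
```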

#### Determination of PSA support at runtime

```
int psa_can_do_hash(psa_algorithm_t hash_alg);
```

The job of this private function is to return 1 if `hash_alg` can be performed through PSA now, and 0 otherwise. It is only defined on algorithms that are enabled via PSA.

As a starting point, return 1 if PSA crypto has been initialized. This will be refined later (to return 1 if the [accelerator subsystem](https://github.com/Mbed-TLS/mbedtls/issues/6007) has been initialized).

Usage note: for algorithms that are not enabled via PSA, calling `psa_can_do_hash` is generally safe: whether it returns 0 or 1, you can call a PSA hash function on the algorithm and it will return `PSA_ERROR_NOT_SUPPORTED`.
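
The starting-point behaviour can be sketched as follows. This is only an illustration: the `demo_` functions and the global flag are hypothetical stand-ins for the PSA core's initialization state, not the library's actual internals.

```c
#include <stdint.h>

typedef uint32_t demo_psa_algorithm_t;

/* Stand-in for "psa_crypto_init() has succeeded and PSA has not been
 * freed since". */
static int demo_psa_initialized = 0;

static void demo_psa_crypto_init(void)
{
    demo_psa_initialized = 1;
}

static void demo_psa_crypto_free(void)
{
    demo_psa_initialized = 0;
}

/* Starting point: the answer does not depend on the algorithm, only on
 * whether PSA is currently initialized. */
static int demo_psa_can_do_hash(demo_psa_algorithm_t hash_alg)
{
    (void) hash_alg;
    return demo_psa_initialized;
}
```

Because the result can change between calls, callers should query it at the point where they choose an engine rather than caching it.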

#### Support for PSA dispatch in hash operations

Each function that performs some hash operation or context management needs to know whether to dispatch via PSA or legacy.

If given an established context, use its `engine` field.

If given an algorithm as an `mbedtls_md_type_t type` (possibly being the `type` field of a `const mbedtls_md_info_t *`):

* If there is a PSA accelerator for this hash and `psa_can_do_hash(alg)`, call the corresponding PSA function, and if applicable set the engine to `MBEDTLS_MD_ENGINE_PSA`. (Skip this if `MBEDTLS_MD_SOME_PSA` is not defined.)
* Otherwise dispatch to the legacy module based on the type as currently done. (Skip this if `MBEDTLS_MD_SOME_LEGACY` is not defined.)
* If no dispatch is possible, return `MBEDTLS_ERR_MD_FEATURE_UNAVAILABLE`.

Note that this assumes that an operation that has been started via PSA can be completed. This implies that `mbedtls_psa_crypto_free` must not be called while an operation using PSA is in progress. Document this.
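
The engine selection rules above can be sketched as follows. This is a minimal model, not the real `md.c`: all `demo_` names are hypothetical, the build is assumed to have an accelerator and a legacy module for SHA-256 but only a legacy module for SHA-512, and the error value mirrors `MBEDTLS_ERR_MD_FEATURE_UNAVAILABLE`.

```c
#include <stddef.h>

typedef enum { DEMO_MD_NONE = 0, DEMO_MD_SHA256, DEMO_MD_SHA512 } demo_md_type_t;
typedef enum { DEMO_MD_ENGINE_LEGACY, DEMO_MD_ENGINE_PSA } demo_md_engine_t;

#define DEMO_ERR_MD_FEATURE_UNAVAILABLE (-0x5080)

typedef struct {
    demo_md_type_t type;
    demo_md_engine_t engine; /* recorded once, at setup time */
} demo_md_context_t;

/* Runtime state: whether PSA can currently perform hashes. */
static int demo_psa_ready = 0;

/* Assumed build configuration (see the lead-in). */
static int demo_has_psa_accel(demo_md_type_t type)
{
    return type == DEMO_MD_SHA256;
}

static int demo_has_legacy(demo_md_type_t type)
{
    return type == DEMO_MD_SHA256 || type == DEMO_MD_SHA512;
}

static int demo_psa_can_do_hash(demo_md_type_t type)
{
    (void) type;
    return demo_psa_ready;
}

/* Choose an engine for a new context: PSA first, then legacy, else fail. */
static int demo_md_setup(demo_md_context_t *ctx, demo_md_type_t type)
{
    if (demo_has_psa_accel(type) && demo_psa_can_do_hash(type)) {
        ctx->type = type;
        ctx->engine = DEMO_MD_ENGINE_PSA;
        return 0;
    }
    if (demo_has_legacy(type)) {
        ctx->type = type;
        ctx->engine = DEMO_MD_ENGINE_LEGACY;
        return 0;
    }
    return DEMO_ERR_MD_FEATURE_UNAVAILABLE;
}
```

Later calls on the same context would read `ctx->engine` instead of repeating the decision, which is why the choice has to be stored.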

#### Error code conversion

After calling a PSA function, call `mbedtls_md_error_from_psa` to convert its status code. This function is currently defined in `hash_info.c`.
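
For illustration, a status conversion of this kind could look like the sketch below. The `DEMO_` constants mirror the numeric values of the corresponding PSA status codes and Mbed TLS error codes, but the choice of which statuses to map, and the fallback, are assumptions made for this example and may differ from the actual `mbedtls_md_error_from_psa`.

```c
#include <stdint.h>

typedef int32_t demo_psa_status_t;

/* Stand-ins mirroring PSA status codes. */
#define DEMO_PSA_SUCCESS                   ((demo_psa_status_t) 0)
#define DEMO_PSA_ERROR_NOT_SUPPORTED       ((demo_psa_status_t) -134)
#define DEMO_PSA_ERROR_INVALID_ARGUMENT    ((demo_psa_status_t) -135)
#define DEMO_PSA_ERROR_INSUFFICIENT_MEMORY ((demo_psa_status_t) -141)

/* Stand-ins mirroring legacy error codes. */
#define DEMO_ERR_MD_FEATURE_UNAVAILABLE    (-0x5080)
#define DEMO_ERR_MD_BAD_INPUT_DATA         (-0x5100)
#define DEMO_ERR_MD_ALLOC_FAILED           (-0x5180)
#define DEMO_ERR_PLATFORM_HW_ACCEL_FAILED  (-0x0070)

/* Map a PSA status to a legacy MD error code. Anything not listed
 * explicitly is reported as a generic accelerator failure (assumption). */
static int demo_md_error_from_psa(demo_psa_status_t status)
{
    switch (status) {
        case DEMO_PSA_SUCCESS:
            return 0;
        case DEMO_PSA_ERROR_NOT_SUPPORTED:
            return DEMO_ERR_MD_FEATURE_UNAVAILABLE;
        case DEMO_PSA_ERROR_INVALID_ARGUMENT:
            return DEMO_ERR_MD_BAD_INPUT_DATA;
        case DEMO_PSA_ERROR_INSUFFICIENT_MEMORY:
            return DEMO_ERR_MD_ALLOC_FAILED;
        default:
            return DEMO_ERR_PLATFORM_HW_ACCEL_FAILED;
    }
}
```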

### Migration to MD light

#### Migration of modules that used to call MD and now do the legacy-or-PSA dance

Get rid of the case where `MBEDTLS_MD_C` is undefined. Enable `MBEDTLS_MD_LIGHT` in `build_info.h`.

#### Migration of modules that used to call a low-level hash module and now do the legacy-or-PSA dance

Switch to calling MD (light) unconditionally. Enable `MBEDTLS_MD_LIGHT` in `build_info.h`.

#### Migration of modules that call a low-level hash module

Switch to calling MD (light). Enable `MBEDTLS_MD_LIGHT` in `build_info.h`.

#### Migration of use-PSA mixed code

Instead of calling `hash_info.h` functions to obtain metadata, get it from `md.h`.

Optionally, code that currently tests on `MBEDTLS_USE_PSA_CRYPTO` just to determine whether to call MD or PSA to calculate hashes can switch to just having the MD variant.

#### Remove `legacy_or_psa.h`

It's no longer used.

### Support all legacy algorithms in PSA

As discussed in [“Implications between legacy availability and PSA availability”](#implications-between-legacy-availability-and-psa-availability), we require the following property:

> If an algorithm has a legacy implementation, it is also available through PSA.

When `MBEDTLS_PSA_CRYPTO_CONFIG` is disabled, this is already the case. When it is enabled, we will now make it so as well. Change `include/mbedtls/config_psa.h` accordingly.

### MD light optimizations

This section is not necessary to implement MD light, but will cut down its code size.

#### Split names out of MD light

Remove hash names from `mbedtls_md_info_t`. Use a simple switch-case or a separate list to implement `mbedtls_md_info_from_string` and `mbedtls_md_get_name`.
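
A switch-case version of the name lookup might look like the following sketch. The `demo_` names are hypothetical, and for brevity the string lookup returns a type rather than an info structure as the real `mbedtls_md_info_from_string` does.

```c
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_MD_NONE = 0, DEMO_MD_SHA256, DEMO_MD_SHA512 } demo_md_type_t;

/* Type to name: the names live in code, not in the info structure. */
static const char *demo_md_get_name(demo_md_type_t type)
{
    switch (type) {
        case DEMO_MD_SHA256:
            return "SHA256";
        case DEMO_MD_SHA512:
            return "SHA512";
        default:
            return NULL;
    }
}

/* Name to type: a short chain of comparisons is enough for a handful of
 * algorithms. Unknown or NULL names map to DEMO_MD_NONE. */
static demo_md_type_t demo_md_type_from_string(const char *name)
{
    if (name == NULL) {
        return DEMO_MD_NONE;
    }
    if (strcmp(name, "SHA256") == 0) {
        return DEMO_MD_SHA256;
    }
    if (strcmp(name, "SHA512") == 0) {
        return DEMO_MD_SHA512;
    }
    return DEMO_MD_NONE;
}
```

Keeping these two functions in a separate compilation unit would let builds that never look up hashes by name drop the strings entirely.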

#### Remove metadata from the info structure

In `mbedtls_md_get_size` and in modules that want a hash's block size, instead of looking up hash metadata in the info structure, call the PSA macros.

#### Optimize type conversions

To allow optimizing conversions between `mbedtls_md_type_t` and `psa_algorithm_t`, renumber the `mbedtls_md_type_t` enum so that the values are the 8 lower bits of the PSA encoding.

With this optimization, the conversion reduces to:

```
static inline psa_algorithm_t psa_alg_of_md_info(
    const mbedtls_md_info_t *md_info )
{
    if( md_info == NULL )
        return( PSA_ALG_NONE );
    return( PSA_ALG_CATEGORY_HASH | md_info->type );
}
```

Work in progress on this conversion is at https://github.com/gilles-peskine-arm/mbedtls/tree/hash-unify-ids-wip-1
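
To make the renumbering concrete, the sketch below shows what an enum aligned with the low 8 bits of the PSA hash encoding could look like, together with conversions in both directions. The `demo_` names are hypothetical; the numeric values follow the PSA hash algorithm identifiers (for example, `0x02000009` for SHA-256).

```c
#include <stdint.h>

typedef uint32_t demo_psa_algorithm_t;
#define DEMO_PSA_ALG_NONE          ((demo_psa_algorithm_t) 0)
#define DEMO_PSA_ALG_CATEGORY_HASH ((demo_psa_algorithm_t) 0x02000000)
#define DEMO_PSA_ALG_HASH_MASK     ((demo_psa_algorithm_t) 0x000000ff)

/* Each value is the low byte of the matching PSA hash algorithm. */
typedef enum {
    DEMO_MD_NONE      = 0x00,
    DEMO_MD_MD5       = 0x03,
    DEMO_MD_RIPEMD160 = 0x04,
    DEMO_MD_SHA1      = 0x05,
    DEMO_MD_SHA224    = 0x08,
    DEMO_MD_SHA256    = 0x09,
    DEMO_MD_SHA384    = 0x0a,
    DEMO_MD_SHA512    = 0x0b,
} demo_md_type_t;

/* Legacy type to PSA algorithm: a single OR, no table or switch. */
static inline demo_psa_algorithm_t demo_psa_alg_of_md_type(demo_md_type_t type)
{
    if (type == DEMO_MD_NONE) {
        return DEMO_PSA_ALG_NONE;
    }
    return DEMO_PSA_ALG_CATEGORY_HASH | (demo_psa_algorithm_t) type;
}

/* PSA hash algorithm to legacy type: keep only the low 8 bits. */
static inline demo_md_type_t demo_md_type_of_psa_alg(demo_psa_algorithm_t alg)
{
    return (demo_md_type_t) (alg & DEMO_PSA_ALG_HASH_MASK);
}
```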

#### Get rid of the hash_info module

The hash_info module is redundant with MD light. Move `mbedtls_md_error_from_psa` to `md.c`, defined only when `MBEDTLS_MD_SOME_PSA` is defined. The rest is no longer used.

#### Unify HMAC with PSA

PSA has its own HMAC implementation. In builds with both `MBEDTLS_MD_C` and `PSA_WANT_ALG_HMAC` not fully provided by drivers, we should have a single implementation. Replace the one in `md.h` with calls to the PSA driver interface. This will also give mixed-domain modules access to HMAC accelerated directly by a PSA driver (eliminating the need for an HMAC interface in software if all supported hashes have an accelerator that includes HMAC support).

### Improving support for `MBEDTLS_PSA_CRYPTO_CLIENT`

So far, MD light only dispatches to PSA if an algorithm is available via `MBEDTLS_PSA_CRYPTO_C`, not if it's available via `MBEDTLS_PSA_CRYPTO_CLIENT`. This is acceptable because `MBEDTLS_USE_PSA_CRYPTO` requires `MBEDTLS_PSA_CRYPTO_C`, hence mixed-domain code never invokes PSA.

The architecture can be extended to support `MBEDTLS_PSA_CRYPTO_CLIENT` with a little extra work. Here is an overview of the task breakdown, which should be fleshed out after we've done the first [migration](#migration-to-md-light):

* Compile-time dependencies: instead of checking `defined(MBEDTLS_PSA_CRYPTO_C)`, check `defined(MBEDTLS_PSA_CRYPTO_C) || defined(MBEDTLS_PSA_CRYPTO_CLIENT)`.
* Implementers of `MBEDTLS_PSA_CRYPTO_CLIENT` will need to provide `psa_can_do_hash()` (or a more general function `psa_can_do`) alongside `psa_crypto_init()`. Note that at this point, it will become a public interface, hence we won't be able to change it on a whim.

### Cipher light

#### Definition

**Note:** this definition is tentative and may be refined when implementing and
testing, based on what's needed by internal users of Cipher light. The new
config symbol will not be considered public so its definition may change.

Cipher light will be automatically enabled in `build_info.h` by modules that
need it, namely: CTR\_DRBG, CCM, GCM. Note: CCM and GCM currently depend on
the full `CIPHER_C` (enforced by `check_config.h`); this hard dependency would
be replaced by the above auto-enablement.

Cipher light includes:
- info functions;
- support for block ciphers in ECB mode, encrypt only (note: in Cipher, "ECB"
  means just one block, contrary to PSA);
- the one-shot API as well as (part of) the streaming API;
- only AES, Aria and Camellia.

This excludes:
- the AEAD/KW API (both one-shot and streaming);
- support for stream ciphers;
- support for other modes of block ciphers (CBC, CTR, CFB, etc.);
- DES and variants (3DES).

The following API functions, and supporting types, are candidates for
inclusion in the Cipher light API, with limited features as above:
```
mbedtls_cipher_info_from_type
mbedtls_cipher_info_get_block_size

mbedtls_cipher_init
mbedtls_cipher_setup
mbedtls_cipher_setkey
mbedtls_cipher_crypt
mbedtls_cipher_free

mbedtls_cipher_update
(mbedtls_cipher_finish)
```

Note: `mbedtls_cipher_info_get_block_size()` can be hard-coded to return 16,
as all three supported block ciphers have the same block size (DES was
excluded).

Note: `mbedtls_cipher_finish()` is not required by any of the modules using
Cipher light, but it might be convenient to include it anyway as it's used in
the implementation of `mbedtls_cipher_crypt()`.
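
As a small illustration of the block size note, a hard-coded getter could be sketched as below. The `demo_` types are hypothetical stand-ins for the cipher info structure, and returning 0 for a null pointer is an assumption made for this example.

```c
#include <stddef.h>

/* Stand-ins for the three block ciphers kept in Cipher light. */
typedef enum {
    DEMO_CIPHER_ID_AES,
    DEMO_CIPHER_ID_ARIA,
    DEMO_CIPHER_ID_CAMELLIA,
} demo_cipher_id_t;

typedef struct {
    demo_cipher_id_t id;
} demo_cipher_info_t;

/* With DES excluded, every supported cipher has a 16-byte block, so no
 * per-cipher metadata lookup is needed. */
static size_t demo_cipher_info_get_block_size(const demo_cipher_info_t *info)
{
    if (info == NULL) {
        return 0;
    }
    return 16;
}
```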

#### Cipher light dual dispatch

This is likely to come in the future, but has not been defined yet.