Thread safety of the PSA subsystem
==================================

## Requirements

### Backward compatibility requirement

Code that is currently working must keep working. There can be an exception for code that uses features that are advertised as experimental; for example, it would be annoying but ok to add extra requirements for drivers.

(In this section, “currently” means Mbed TLS releases without proper concurrency management: 3.0.0, 3.1.0, and any subsequent 3.x version.)

In particular, if you either protect all PSA calls with a mutex, or only ever call PSA functions from a single thread, your application currently works and must keep working. If your application currently builds and works with `MBEDTLS_PSA_CRYPTO_C` and `MBEDTLS_THREADING_C` enabled, it must keep building and working.

As a consequence, we must not add a new platform requirement beyond mutexes for the base case. It would be ok to add new platform requirements if they're only needed for PSA drivers, or if they're only performance improvements.

Tempting platform requirements that we cannot add to the default `MBEDTLS_THREADING_C` include:

* Releasing a mutex from a different thread than the one that acquired it. This isn't even guaranteed to work with pthreads.
* New primitives such as semaphores or condition variables.

### Correctness out of the box

If you build with `MBEDTLS_PSA_CRYPTO_C` and `MBEDTLS_THREADING_C`, the code must be functionally correct: no race conditions, deadlocks or livelocks.

The [PSA Crypto API specification](https://armmbed.github.io/mbed-crypto/html/overview/conventions.html#concurrent-calls) defines minimum expectations for concurrent calls. They must work as if they had been executed one at a time, except that the following cases have undefined behavior:

* Destroying a key while it's in use.
* Concurrent calls using the same operation object. (An operation object may not be used by more than one thread at a time. But it can move from one thread to another between calls.)
* Overlap of an output buffer with an input or output of a concurrent call.
* Modification of an input buffer during a call.

Note that while the specification does not define the behavior in such cases, Mbed TLS can be used as a crypto service. It's acceptable if an application can mess itself up, but it is not acceptable if an application can mess up the crypto service. As a consequence, destroying a key while it's in use may violate the security property that all key material is erased as soon as `psa_destroy_key` returns, but it must not cause data corruption or read-after-free inside the key store.

### No spinning

The code must not spin on a potentially non-blocking task. For example, this is proscribed:
```
lock(m);
while (!its_my_turn) {
    unlock(m);
    lock(m);
}
```

Rationale: this can cause battery drain, and can even be a livelock (spinning forever), e.g. if the thread that might unblock this one has a lower priority.

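For contrast, here is a minimal sketch of a wait that sleeps instead of spinning. It assumes a POSIX platform with condition variables, which the base `MBEDTLS_THREADING_C` configuration cannot require. The type `example_turn_t` and both functions are hypothetical names for illustration, not part of Mbed TLS.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical turn-taking state; not an Mbed TLS type. */
typedef struct {
    pthread_mutex_t m;
    pthread_cond_t cond;
    bool its_my_turn;
} example_turn_t;

#define EXAMPLE_TURN_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Block until another thread grants the turn. The waiting thread sleeps
 * inside pthread_cond_wait() instead of repeatedly unlocking and
 * relocking the mutex. Returns 0 on success, a pthread error otherwise. */
int example_wait_for_turn(example_turn_t *t)
{
    int ret;
    assert(t != NULL);
    ret = pthread_mutex_lock(&t->m);
    if (ret != 0) {
        return ret;
    }
    while (ret == 0 && !t->its_my_turn) {
        /* Atomically releases t->m while asleep, reacquires it on wakeup. */
        ret = pthread_cond_wait(&t->cond, &t->m);
    }
    (void) pthread_mutex_unlock(&t->m);
    return ret;
}

/* Grant the turn and wake up one waiting thread. */
int example_grant_turn(example_turn_t *t)
{
    int ret;
    assert(t != NULL);
    ret = pthread_mutex_lock(&t->m);
    if (ret != 0) {
        return ret;
    }
    t->its_my_turn = true;
    (void) pthread_cond_signal(&t->cond);
    return pthread_mutex_unlock(&t->m);
}
```

The predicate is rechecked in a loop because `pthread_cond_wait` may wake up spuriously; unlike the proscribed loop, the thread consumes no CPU time between wakeups.
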
### Driver requirements

At the time of writing, the driver interface specification does not consider multithreaded environments.

We need to define clear policies so that driver implementers know what to expect. Here are two possible policies at two ends of the spectrum; what is desirable is probably somewhere in between.

* Driver entry points may be called concurrently from multiple threads, even if they're using the same key, and even including destroying a key while an operation is in progress on it.
* At most one driver entry point is active at any given time.

A more reasonable policy could be:

* By default, each driver only has at most one entry point active at any given time. In other words, each driver has its own exclusive lock.
* Drivers have an optional `"thread_safe"` boolean property. If true, it allows concurrent calls to this driver.
* Even with a thread-safe driver, the core never starts the destruction of a key while there are operations in progress on it, and never performs concurrent calls on the same multipart operation.

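The first two points of this policy could be sketched as follows. This is only an illustration under assumptions: `example_driver_t` and the enter/leave helpers are hypothetical, pthreads stands in for whatever mutex abstraction the core would use, and the third point (no key destruction during operations) is not modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-driver bookkeeping; not part of the driver interface. */
typedef struct {
    bool thread_safe;     /* mirrors the proposed "thread_safe" property */
    pthread_mutex_t lock; /* only used when thread_safe is false */
} example_driver_t;

/* Call before invoking any entry point of the driver. */
int example_driver_enter(example_driver_t *drv)
{
    assert(drv != NULL);
    if (drv->thread_safe) {
        return 0; /* concurrent calls are allowed: nothing to lock */
    }
    return pthread_mutex_lock(&drv->lock);
}

/* Call after the entry point returns. */
int example_driver_leave(example_driver_t *drv)
{
    assert(drv != NULL);
    if (drv->thread_safe) {
        return 0;
    }
    return pthread_mutex_unlock(&drv->lock);
}
```
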
### Long-term performance requirements

In the short term, correctness is the important thing. We can start with a global lock.

In the medium to long term, performing a slow or blocking operation (for example, a driver call, or an RSA decryption) should not block other threads, even if they're calling the same driver or using the same key object.

We may want to go directly to a more sophisticated approach because when a system works with a global lock, it's typically hard to get rid of it to get more fine-grained concurrency.

### Key destruction short-term requirements

#### Summary of guarantees in the short term

When `psa_destroy_key` returns:

1. The key identifier doesn't exist. Rationale: this is a functional requirement for persistent keys: the caller can immediately create a new key with the same identifier.
2. The resources from the key have been freed. Rationale: in a low-resource condition, this may be necessary for the caller to re-create a similar key, which should be possible.
3. The call must not block indefinitely, and in particular cannot wait for an event that is triggered by application code such as calling an abort function. Rationale: this may not strictly be a functional requirement, but it is expected that `psa_destroy_key` does not block forever due to another thread, which could potentially be another process on a multi-process system. In particular, it is only acceptable for `psa_destroy_key` to block when waiting for another thread to complete a PSA Cryptography API call that it had already started.

When `psa_destroy_key` is called on a key that is in use, guarantee 2 might be violated. (This is consistent with the requirement [“Correctness out of the box”](#correctness-out-of-the-box), as destroying a key while it's in use is undefined behavior.)

### Key destruction long-term requirements

The [PSA Crypto API specification](https://armmbed.github.io/mbed-crypto/html/api/keys/management.html#key-destruction) mandates that implementations make a best effort to ensure that the key material cannot be recovered. In the long term, it would be good to guarantee that `psa_destroy_key` wipes all copies of the key material.

#### Summary of guarantees in the long term

When `psa_destroy_key` returns:

1. The key identifier doesn't exist. Rationale: this is a functional requirement for persistent keys: the caller can immediately create a new key with the same identifier.
2. The resources from the key have been freed. Rationale: in a low-resource condition, this may be necessary for the caller to re-create a similar key, which should be possible.
3. The call must not block indefinitely, and in particular cannot wait for an event that is triggered by application code such as calling an abort function. Rationale: this may not strictly be a functional requirement, but it is expected that `psa_destroy_key` does not block forever due to another thread, which could potentially be another process on a multi-process system. In particular, it is only acceptable for `psa_destroy_key` to block when waiting for another thread to complete a PSA Cryptography API call that it had already started.
4. No copy of the key material exists. Rationale: this is a security requirement. We do not have this requirement yet, but we need to document this as a security weakness, and we would like to satisfy this security requirement in the future.

As opposed to the short-term requirements, all the above guarantees hold even if `psa_destroy_key` is called on a key that is in use.

## Resources to protect

Analysis of the behavior of the PSA key store as of Mbed TLS 9202ba37b19d3ea25c8451fd8597fce69eaa6867.

### Global variables

* `psa_crypto_slot_management::global_data.key_slots[i]`: see [“Key slots”](#key-slots).

* `psa_crypto_slot_management::global_data.key_slots_initialized`:
  * `psa_initialize_key_slots`: modification.
  * `psa_wipe_all_key_slots`: modification.
  * `psa_get_empty_key_slot`: read.
  * `psa_get_and_lock_key_slot`: read.

* `psa_crypto::global_data.rng`: depends on the RNG implementation. See [“Random generator”](#random-generator).
  * `psa_generate_random`: query.
  * `mbedtls_psa_crypto_configure_entropy_sources` (only if `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is enabled): setup. Only called from `psa_crypto_init` via `mbedtls_psa_random_init`, or from test code.
  * `mbedtls_psa_crypto_free`: deinit.
  * `psa_crypto_init`: seed (via `mbedtls_psa_random_seed`); setup via `mbedtls_psa_crypto_configure_entropy_sources`.

* `psa_crypto::global_data.{initialized,rng_state}`: these are bit-fields and cannot be modified independently so they must be protected by the same mutex. The following functions access these fields:
  * `mbedtls_psa_crypto_configure_entropy_sources` [`rng_state`] (only if `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is enabled): read. Only called from `psa_crypto_init` via `mbedtls_psa_random_init`, or from test code.
  * `mbedtls_psa_crypto_free`: modification.
  * `psa_crypto_init`: modification.
  * Many functions via `GUARD_MODULE_INITIALIZED`: read.

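To illustrate why the two bit-fields cannot be protected by separate mutexes, here is a small sketch. The struct is a hypothetical stand-in, and the field widths and the packing into a single `unsigned` are assumptions (typical, but implementation-defined).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the two bit-fields; widths are assumptions. */
typedef struct {
    unsigned initialized : 1;
    unsigned rng_state : 2;
} example_global_flags_t;

/* Returns 1 if writing rng_state modifies the storage unit that also
 * holds initialized. When that is the case, a write to one field is a
 * read-modify-write of the shared unit, so one mutex must cover both. */
int example_fields_share_storage(void)
{
    example_global_flags_t flags;
    unsigned before, after;

    /* Assumption: the compiler packs both fields into one unsigned int. */
    assert(sizeof(flags) == sizeof(unsigned));

    memset(&flags, 0, sizeof(flags));
    flags.initialized = 1;
    memcpy(&before, &flags, sizeof(before));
    flags.rng_state = 2;
    memcpy(&after, &flags, sizeof(after));
    return before != after;
}
```
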
### Key slots

#### Key slot array traversal

“Occupied key slot” is determined by `psa_is_key_slot_occupied` based on `slot->attr.type`.

The following functions traverse the key slot array:

* `psa_get_and_lock_key_slot_in_memory`: reads `slot->attr.id`.
* `psa_get_and_lock_key_slot_in_memory`: calls `psa_lock_key_slot` on one occupied slot.
* `psa_get_empty_key_slot`: calls `psa_is_key_slot_occupied`.
* `psa_get_empty_key_slot`: calls `psa_wipe_key_slot` and more modifications on one occupied slot with no active user.
* `psa_get_empty_key_slot`: calls `psa_lock_key_slot` and more modification on one unoccupied slot.
* `psa_wipe_all_key_slots`: writes to all slots.
* `mbedtls_psa_get_stats`: reads from all slots.

#### Key slot state

The following functions modify a slot's usage state:

* `psa_lock_key_slot`: writes to `slot->lock_count`.
* `psa_unlock_key_slot`: writes to `slot->lock_count`.
* `psa_wipe_key_slot`: writes to `slot->lock_count`.
* `psa_destroy_key`: reads `slot->lock_count`, calls `psa_lock_key_slot`.
* `psa_wipe_all_key_slots`: writes to all slots.
* `psa_get_empty_key_slot`: writes to `slot->lock_count` and calls `psa_wipe_key_slot` and `psa_lock_key_slot` on one occupied slot with no active user; calls `psa_lock_key_slot` on one unoccupied slot.
* `psa_close_key`: reads `slot->lock_count`; calls `psa_get_and_lock_key_slot_in_memory`, `psa_wipe_key_slot` and `psa_unlock_key_slot`.
* `psa_purge_key`: reads `slot->lock_count`; calls `psa_get_and_lock_key_slot_in_memory`, `psa_wipe_key_slot` and `psa_unlock_key_slot`.

**slot->attr access:**
`psa_crypto_core.h`:
* `psa_key_slot_set_flags` - writes to attr.flags
* `psa_key_slot_set_bits_in_flags` - writes to attr.flags
* `psa_key_slot_clear_bits` - writes to attr.flags
* `psa_is_key_slot_occupied` - reads attr.type (but see “[Determining whether a key slot is occupied](#determining-whether-a-key-slot-is-occupied)”)
* `psa_key_slot_get_flags` - reads attr.flags

`psa_crypto_slot_management.c`:
* `psa_get_and_lock_key_slot_in_memory` - reads attr.id
* `psa_get_empty_key_slot` - reads attr.lifetime
* `psa_load_persistent_key_into_slot` - passes attr pointer to psa_load_persistent_key
* `psa_load_persistent_key` - reads attr.id and passes pointer to psa_parse_key_data_from_storage
* `psa_parse_key_data_from_storage` - writes to many attributes
* `psa_get_and_lock_key_slot` - writes to attr.id, attr.lifetime, and attr.policy.usage
* `psa_purge_key` - reads attr.lifetime, calls psa_wipe_key_slot
* `mbedtls_psa_get_stats` - reads attr.lifetime, attr.id

`psa_crypto.c`:
* `psa_get_and_lock_key_slot_with_policy` - reads attr.type, attr.policy.
* `psa_get_and_lock_transparent_key_slot_with_policy` - reads attr.lifetime
* `psa_destroy_key` - reads attr.lifetime, attr.id
* `psa_get_key_attributes` - copies all publicly available attributes of a key
* `psa_export_key` - copies attributes
* `psa_export_public_key` - reads attr.type, copies attributes
* `psa_start_key_creation` - writes to the whole attr structure
* `psa_validate_optional_attributes` - reads attr.type, attr.bits
* `psa_import_key` - reads attr.bits
* `psa_copy_key` - reads attr.bits, attr.type, attr.lifetime, attr.policy
* `psa_mac_setup` - copies whole attr structure
* `psa_mac_compute_internal` - copies whole attr structure
* `psa_verify_internal` - copies whole attr structure
* `psa_sign_internal` - copies whole attr structure, reads attr.type
* `psa_asymmetric_encrypt` - reads attr.type
* `psa_asymmetric_decrypt` - reads attr.type
* `psa_cipher_setup` - copies whole attr structure, reads attr.type
* `psa_cipher_encrypt` - copies whole attr structure, reads attr.type
* `psa_cipher_decrypt` - copies whole attr structure, reads attr.type
* `psa_aead_encrypt` - copies whole attr structure
* `psa_aead_decrypt` - copies whole attr structure
* `psa_aead_setup` - copies whole attr structure
* `psa_generate_derived_key_internal` - reads attr.type, writes to and reads from attr.bits, copies whole attr structure
* `psa_key_derivation_input_key` - reads attr.type
* `psa_key_agreement_raw_internal` - reads attr.type and attr.bits

#### Determining whether a key slot is occupied

`psa_is_key_slot_occupied` currently uses the `attr.type` field to determine whether a key slot is occupied. This works because we maintain the invariant that an occupied slot contains key material. With concurrency, it is desirable to allow a key slot to be reserved, but not yet contain key material or even metadata. When creating a key, determining the key type can be costly, for example when loading a persistent key from storage or (not yet implemented) when importing or unwrapping a key using an interface that determines the key type from the data that it parses. So we should not need to hold the global key store lock while the key type is undetermined.

Instead, `psa_is_key_slot_occupied` should use the key identifier to decide whether a slot is occupied. The key identifier is always readily available: when allocating a slot for a persistent key, it's an input of the function that allocates the key slot; when allocating a slot for a volatile key, the identifier is calculated from the choice of slot.

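The difference between the two checks can be sketched as follows. The slot type here is a simplified, hypothetical stand-in for the real key slot structure; it assumes that identifier 0 means “no key” (like `PSA_KEY_ID_NULL`) and that type 0 means “type not yet determined” (like `PSA_KEY_TYPE_NONE`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical slot; the real structure has more fields. */
typedef struct {
    uint32_t id;   /* assumed: 0 means "no key" */
    uint16_t type; /* assumed: 0 means "type not yet determined" */
} example_id_slot_t;

/* Current approach: a reserved slot whose type is still unknown
 * looks free. */
int example_is_occupied_by_type(const example_id_slot_t *slot)
{
    assert(slot != NULL);
    return slot->type != 0;
}

/* Proposed approach: the identifier is known as soon as the slot is
 * reserved, so a reserved slot is reported as occupied. */
int example_is_occupied_by_id(const example_id_slot_t *slot)
{
    assert(slot != NULL);
    return slot->id != 0;
}
```

With a slot that has been reserved for identifier `0x1234` but whose key data has not been parsed yet, the type-based check reports the slot as free while the identifier-based check reports it as occupied.
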
#### Key slot content

Other than what is used to determine the [“key slot state”](#key-slot-state), the contents of a key slot are only accessed as follows:

* Modification during key creation (between `psa_start_key_creation` and `psa_finish_key_creation` or `psa_fail_key_creation`).
* Destruction in `psa_wipe_key_slot`.
* Read in many functions, between calls to `psa_lock_key_slot` and `psa_unlock_key_slot`.

**slot->key access:**
* `psa_allocate_buffer_to_slot` - allocates key.data, sets key.bytes;
* `psa_copy_key_material_into_slot` - writes to key.data
* `psa_remove_key_data_from_memory` - writes and reads to/from key data
* `psa_get_key_attributes` - reads from key data
* `psa_export_key` - passes key data to psa_driver_wrapper_export_key
* `psa_export_public_key` - passes key data to psa_driver_wrapper_export_public_key
* `psa_finish_key_creation` - passes key data to psa_save_persistent_key
* `psa_validate_optional_attributes` - passes key data and bytes to mbedtls_psa_rsa_load_representation
* `psa_import_key` - passes key data to psa_driver_wrapper_import_key
* `psa_copy_key` - passes key data to psa_driver_wrapper_copy_key, psa_copy_key_material_into_slot
* `psa_mac_setup` - passes key data to psa_driver_wrapper_mac_sign_setup, psa_driver_wrapper_mac_verify_setup
* `psa_mac_compute_internal` - passes key data to psa_driver_wrapper_mac_compute
* `psa_sign_internal` - passes key data to psa_driver_wrapper_sign_message, psa_driver_wrapper_sign_hash
* `psa_verify_internal` - passes key data to psa_driver_wrapper_verify_message, psa_driver_wrapper_verify_hash
* `psa_asymmetric_encrypt` - passes key data to mbedtls_psa_rsa_load_representation
* `psa_asymmetric_decrypt` - passes key data to mbedtls_psa_rsa_load_representation
* `psa_cipher_setup` - passes key data to psa_driver_wrapper_cipher_encrypt_setup and psa_driver_wrapper_cipher_decrypt_setup
* `psa_cipher_encrypt` - passes key data to psa_driver_wrapper_cipher_encrypt
* `psa_cipher_decrypt` - passes key data to psa_driver_wrapper_cipher_decrypt
* `psa_aead_encrypt` - passes key data to psa_driver_wrapper_aead_encrypt
* `psa_aead_decrypt` - passes key data to psa_driver_wrapper_aead_decrypt
* `psa_aead_setup` - passes key data to psa_driver_wrapper_aead_encrypt_setup and psa_driver_wrapper_aead_decrypt_setup
* `psa_generate_derived_key_internal` - passes key data to psa_driver_wrapper_import_key
* `psa_key_derivation_input_key` - passes key data to psa_key_derivation_input_internal
* `psa_key_agreement_raw_internal` - passes key data to mbedtls_psa_ecp_load_representation
* `psa_generate_key` - passes key data to psa_driver_wrapper_generate_key

### Random generator

The PSA RNG can be accessed both from various PSA functions, and from application code via `mbedtls_psa_get_random`.

With the built-in RNG implementations using `mbedtls_ctr_drbg_context` or `mbedtls_hmac_drbg_context`, querying the RNG with `mbedtls_xxx_drbg_random()` is thread-safe (protected by a mutex inside the RNG implementation), but other operations (init, free, seed) are not.

When `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is enabled, thread safety depends on the implementation.

### Driver resources

Depends on the driver. The PSA driver interface specification does not discuss whether drivers must support concurrent calls.

## Simple global lock strategy

Have a single mutex protecting all accesses to the key store and other global variables. In practice, this means every PSA API function needs to take the lock on entry and release on exit, except for:

* Hash functions.
* Accessors for key attributes and other local structures.

Note that operation functions do need to take the lock, since they need to prevent the destruction of the key.

Note that this does not protect access to the RNG via `mbedtls_psa_get_random`, which is guaranteed to be thread-safe when `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is disabled.

This approach is conceptually simple, but requires extra instrumentation to every function and has bad performance in a multithreaded environment since a slow operation in one thread blocks unrelated operations on other threads.

## Global lock excluding slot content

Have a single mutex protecting all accesses to the key store and other global variables, except that it's ok to access the content of a key slot without taking the lock if one of the following conditions holds:

* The key slot is in a state that guarantees that the thread has exclusive access.
* The key slot is in a state that guarantees that no other thread can modify the slot content, and the accessing thread is only reading the slot.

Note that a thread must hold the global mutex when it reads or changes a slot's state.

### Slot states

For concurrency purposes, a slot can be in one of three states:

* UNUSED: no thread is currently accessing the slot. It may be occupied by a volatile key or a cached key.
* WRITING: a thread has exclusive access to the slot. This can only happen in specific circumstances as detailed below.
* READING: any thread may read from the slot.

A high-level view of state transitions:

* `psa_get_empty_key_slot`: UNUSED → WRITING.
* `psa_get_and_lock_key_slot_in_memory`: UNUSED or READING → READING. This function only accepts slots in the UNUSED or READING state. A slot with the correct id but in the WRITING state is considered free.
* `psa_unlock_key_slot`: READING → UNUSED or READING.
* `psa_finish_key_creation`: WRITING → READING.
* `psa_fail_key_creation`: WRITING → UNUSED.
* `psa_wipe_key_slot`: any → UNUSED. If the slot is READING or WRITING on entry, this function must wait until the writer or all readers have finished. (By the way, the WRITING state is possible if `mbedtls_psa_crypto_free` is called while a key creation is in progress.) See [“Destruction of a key in use”](#destruction-of-a-key-in-use).

The current `slot->lock_count` corresponds to the difference between UNUSED and READING: a slot is in use iff its lock count is nonzero, so `lock_count == 0` corresponds to UNUSED and `lock_count != 0` corresponds to READING.

There is currently no indication of when a slot is in the WRITING state. This only happens between a call to `psa_start_key_creation` and a call to one of `psa_finish_key_creation` or `psa_fail_key_creation`. This new state can be conveyed by a new boolean flag, or by setting `lock_count` to `~0`.

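The state encoding and the transitions listed above can be sketched as follows, using the boolean-flag option for WRITING. All names are hypothetical, every function assumes that the caller holds the global mutex, and the sketch omits `psa_wipe_key_slot` as well as slot eviction. Treating the creator as the first reader when creation finishes is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    EXAMPLE_UNUSED,
    EXAMPLE_WRITING,
    EXAMPLE_READING
} example_state_t;

/* Hypothetical slot bookkeeping. Callers must hold the global mutex. */
typedef struct {
    size_t lock_count; /* number of active readers */
    bool writing;      /* assumed new flag conveying the WRITING state */
} example_state_slot_t;

example_state_t example_slot_state(const example_state_slot_t *slot)
{
    assert(slot != NULL);
    if (slot->writing) {
        return EXAMPLE_WRITING;
    }
    return slot->lock_count == 0 ? EXAMPLE_UNUSED : EXAMPLE_READING;
}

/* UNUSED -> WRITING (as in psa_get_empty_key_slot). */
int example_start_writing(example_state_slot_t *slot)
{
    if (example_slot_state(slot) != EXAMPLE_UNUSED) {
        return -1;
    }
    slot->writing = true;
    return 0;
}

/* WRITING -> READING; assumption: the creator becomes the first reader. */
int example_finish_creation(example_state_slot_t *slot)
{
    if (example_slot_state(slot) != EXAMPLE_WRITING) {
        return -1;
    }
    slot->writing = false;
    slot->lock_count = 1;
    return 0;
}

/* WRITING -> UNUSED. */
int example_fail_creation(example_state_slot_t *slot)
{
    if (example_slot_state(slot) != EXAMPLE_WRITING) {
        return -1;
    }
    slot->writing = false;
    slot->lock_count = 0;
    return 0;
}

/* UNUSED or READING -> READING; a slot in WRITING is rejected. */
int example_lock_for_reading(example_state_slot_t *slot)
{
    if (example_slot_state(slot) == EXAMPLE_WRITING) {
        return -1;
    }
    slot->lock_count++;
    return 0;
}

/* READING -> READING or UNUSED. */
int example_unlock(example_state_slot_t *slot)
{
    if (example_slot_state(slot) != EXAMPLE_READING) {
        return -1;
    }
    slot->lock_count--;
    return 0;
}
```
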
### Destruction of a key in use

Problem: a key slot is destroyed (by `psa_wipe_key_slot`) while it's in use (READING or WRITING).

TODO: how do we ensure that? This needs something more sophisticated than mutexes (concurrency number >2)! Even a per-slot mutex isn't enough (we'd need a reader-writer lock).

Solution: after some team discussion, we've decided to rely on a new threading abstraction which mimics C11 (i.e. `mbedtls_fff` where `fff` is the C11 function name, having the same parameters and return type, with default implementations for C11, pthreads and Windows). We'll likely use condition variables in addition to mutexes.