| |
| .. note:: |
| The feature described in this section depends on a Linux |
| `kernel patch`_ that is not available upstream and has been maintained in |
| the `linaro-swg kernel`_ repository up to OP-TEE version 3.15.0. The latest |
| kernel source with this patch can be found in the `optee-3.15.0`_ branch |
| based on Linux 5.14. |
| |
| The benchmark framework should still work as described here with OP-TEE |
| 3.16.0 or later provided that either: (a) a Linux kernel built from branch |
| `optee-3.15.0`_ is used, or (b) the benchmark `kernel patch`_ is forward |
| ported. |
| |
| If the kernel patch is missing, the following errors are printed: |
| |
| .. code:: |
| |
| $ benchmark optee_example_hello_world |
| [Benchmark] INFO: 1. Opening Benchmark Static TA... |
| [Benchmark] INFO: 2. Allocating per-core buffers, cores detected = 2 |
| [Benchmark] ERROR: TEEC_InvokeCommand: 0xffff000c |
| |
| E/TC:? 0 alloc_benchmark_buffer:72 Benchmark: can't create mobj for timestamp buffer |
| |
| .. _kernel patch: https://github.com/linaro-swg/linux/commit/d9b0331b46540fa67c0f16e391940f12fde1288b |
| .. _linaro-swg kernel: https://github.com/linaro-swg/Linux |
| .. _optee-3.15.0: https://github.com/linaro-swg/linux/commits/optee-3.15.0 |
| |
| .. _benchmark_framework: |
| |
| Benchmark framework |
| ################### |
| OP-TEE is, by its nature, a solution that spans several architectural |
| layers, each of which has its own complex parts. Optimizing performance |
| further requires a tool that provides detailed and precise profiling |
| information for each layer. |
| |
| Latency values are needed for: |
| |
| * The roundtrip time for going from a client application in the normal world, |
| down to a Trusted Application and back again. |
| |
| * A detailed breakdown of the time taken to go through each layer: |
| |
| * libTEEC -> Linux OP-TEE kernel driver |
| * Linux OP-TEE kernel driver -> OP-TEE OS Core |
| * OP-TEE OS Core -> TA entry point (**not supported yet**) |
| * The same way back |
| |
| Implementation details |
| ********************** |
| |
| Design overview |
| =============== |
| The benchmark framework consists of the following components: |
| |
| 1. **Benchmark Client Application (CA)**: a dedicated client application |
| that is responsible for allocating the timestamp circular buffers, |
| registering these buffers in the Benchmark PTA and consuming all |
| timestamp data generated by all OP-TEE layers. Finally, it writes the |
| timestamp data to a file with the ``.ts`` extension. Additional build |
| details can be found at :ref:`optee_benchmark`. |
| |
| 2. **Benchmark Pseudo Trusted Application (PTA)**: owns all per-CPU |
| circular non-secure buffers allocated from shared memory. The Benchmark |
| PTA must be invoked (by a CA) to register the timestamp circular buffers. |
| In turn, the Benchmark PTA invokes the OP-TEE Linux driver (through an |
| RPC mechanism) to register these circular buffers in the Linux kernel layer. |
| |
| 3. **libTEEC** and **Linux kernel OP-TEE driver** include functionality for |
| handling timestamp buffer registration requests from the Benchmark |
| PTA. |
| |
| When the benchmark is enabled, all OP-TEE layers (libTEEC, Linux kernel OP-TEE |
| driver, OP-TEE OS core) fill the registered timestamp circular buffer with |
| timestamp data for all invocation requests, provided that the circular buffer |
| has been allocated and registered. |
| |
| .. image:: ../images/benchmark/benchmark_design.png |
| |
| .. To edit benchmark_design diagram use http://draw.io and benchmark_design.xml |
| source file |
| |
| Timestamp source |
| ================ |
| Arm Performance Monitor Units (PMUs) are used as the main source of timestamp |
| values. This technology was chosen because it is supported on all |
| Armv7-A/Armv8-A cores. In addition to providing precise per-CPU cycle counter |
| values, the PMU can be configured to allow EL0 access to all events, so |
| user-mode applications can read CPU counter values directly from coprocessor |
| registers, achieving minimal latency by avoiding additional syscalls into EL1. |
| |
| Besides the CPU cycle counter value, each timestamp also contains information |
| about: |
| |
| * The index of the executing CPU core |
| |
| * The ID of the OP-TEE layer in which the timestamp was obtained |
| |
| * The program counter value at the time the timestamp was logged, which can |
| be used to look up a symbol name (file name and line number) |
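The record layout and per-CPU circular buffer described above can be sketched roughly as follows. This is an illustration only, not the layout used by OP-TEE: every name here (``demo_stamp``, ``demo_cpu_buf``, ``demo_push``, ``demo_pop``, the ``DEMO_*`` constants) is hypothetical, the layer ID values are made up, and the sketch assumes that the CPU core index is implied by which per-CPU buffer holds the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer IDs; the real values are defined by OP-TEE. */
enum demo_layer {
    DEMO_LAYER_CLIENT = 1, /* libTEEC */
    DEMO_LAYER_KMOD = 2,   /* Linux kernel OP-TEE driver */
    DEMO_LAYER_CORE = 3,   /* OP-TEE OS core */
};

#define DEMO_MAX_STAMPS 4 /* kept tiny so wrap-around is easy to follow */

/* One timestamp record: cycle counter, program counter, source layer. */
struct demo_stamp {
    uint64_t cycles;
    uint64_t pc;
    uint64_t layer;
};

/* One circular buffer per CPU core (assumption: core index = buffer index). */
struct demo_cpu_buf {
    uint64_t head; /* total number of records written */
    uint64_t tail; /* total number of records consumed or dropped */
    struct demo_stamp stamps[DEMO_MAX_STAMPS];
};

/* Producer side: store a record, overwriting the oldest one when full. */
static void demo_push(struct demo_cpu_buf *buf, uint64_t cycles, uint64_t pc,
                      enum demo_layer layer)
{
    struct demo_stamp *s = NULL;

    assert(buf);
    s = &buf->stamps[buf->head % DEMO_MAX_STAMPS];
    s->cycles = cycles;
    s->pc = pc;
    s->layer = layer;
    buf->head++;

    /* If the producer lapped the consumer, drop the oldest unread records. */
    if (buf->head - buf->tail > DEMO_MAX_STAMPS)
        buf->tail = buf->head - DEMO_MAX_STAMPS;
}

/* Consumer side: returns 1 and fills *out if a record was available. */
static int demo_pop(struct demo_cpu_buf *buf, struct demo_stamp *out)
{
    assert(buf && out);
    if (buf->tail == buf->head)
        return 0;

    *out = buf->stamps[buf->tail % DEMO_MAX_STAMPS];
    buf->tail++;
    return 1;
}
```

In this sketch a slow consumer loses the oldest records instead of blocking the producer, which is one plausible policy for a profiling buffer; whether the framework behaves this way is not stated here.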
| |
| Call sequence diagram |
| ===================== |
| .. image:: ../images/benchmark/benchmark_sequence.png |
| |
| .. For benchmark call sequence diagram use http://mscgen.js.org and |
| benchmark_sequence.msc source file |
| |
| Adding custom timestamps |
| ************************ |
| |
| Currently, timestamping is done only for ``InvokeCommand`` calls, but it is also |
| possible to add timestamps at custom places in the supported OP-TEE layers. To |
| add a timestamp-storing call to a custom C source file: |
| |
| 1. Include the appropriate header: |
| |
| * OP-TEE OS Core: ``bench.h`` |
| |
| * Linux kernel OP-TEE module: ``optee_bench.h`` |
| |
| * libTEEC: ``teec_benchmark.h`` |
| |
| 2. Invoke ``bm_timestamp()`` (for the Linux kernel module use |
| ``optee_bm_timestamp()``) in the function where you want to record a |
| timestamp. |
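The two steps above might look roughly like the sketch below. To keep it buildable on its own, it defines a local stand-in for ``bm_timestamp()`` that only counts calls; in a real build you would include the layer's header instead and drop the stub. The no-argument signature, the ``demo_stamp_count`` counter and the ``demo_checksum()`` function are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for the real bm_timestamp() so this sketch builds on its own.
 * In a real build, include the header for your layer (for example bench.h
 * in the OP-TEE OS core) instead of defining this stub.
 * Assumption: the function takes no arguments and returns nothing.
 */
static unsigned int demo_stamp_count;

static void bm_timestamp(void)
{
    demo_stamp_count++;
}

/* Hypothetical function whose execution we want to bracket with timestamps. */
static uint32_t demo_checksum(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    uint32_t i = 0;

    assert(data || !len);

    bm_timestamp(); /* timestamp on entry */
    for (i = 0; i < len; i++)
        sum += data[i];
    bm_timestamp(); /* timestamp on exit */

    return sum;
}
```

Placing one call on entry and one on exit means the difference between the two recorded cycle counts approximates the time spent in the function body.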
| |
| .. todo:: |
| |
| Joakim: Igor's planned tool should go here. |
| |
| Analyzing results |
| ================= |
| Will be added soon. |
| |
| Build and run benchmark |
| *********************** |
| Please see the instructions available at :ref:`optee_benchmark`. |
| |
| |
| Limitations and further steps |
| ***************************** |
| |
| * Implement an application that analyzes timestamp data and provides |
| avg/min/max statistics (in both CPU cycles and time values) for different |
| types of calls. |
| |
| * Add support for all platforms on which OP-TEE is supported. |
| |
| * Add support for S-EL0 timestamping. |
| |
| * Attach additional payload information to each timestamp, for example the |
| session. |
| |
| * Support timestamping within interrupt context in the OP-TEE OS Core. |
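As a rough illustration of the avg/min/max statistics mentioned in the first item, the sketch below computes them over an array of latency samples expressed in CPU cycles and converts a cycle count to nanoseconds. It is not part of the framework: ``demo_stats``, ``demo_cycle_stats()`` and ``demo_cycles_to_ns()`` are hypothetical names, and the counter frequency is assumed to be known to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary statistics over a set of latency samples, in CPU cycles. */
struct demo_stats {
    uint64_t min;
    uint64_t max;
    uint64_t avg; /* integer average, truncated towards zero */
};

/* Compute min/max/avg over n samples; n must be greater than zero. */
static struct demo_stats demo_cycle_stats(const uint64_t *samples, size_t n)
{
    struct demo_stats st = { 0 };
    uint64_t sum = 0;
    size_t i = 0;

    assert(samples && n > 0);

    st.min = samples[0];
    st.max = samples[0];
    for (i = 0; i < n; i++) {
        if (samples[i] < st.min)
            st.min = samples[i];
        if (samples[i] > st.max)
            st.max = samples[i];
        sum += samples[i];
    }
    st.avg = sum / n;

    return st;
}

/*
 * Convert a cycle count to nanoseconds for a counter running at freq_hz.
 * Whole seconds and the remainder are converted separately so that the
 * multiplication does not overflow for frequencies below about 18 GHz.
 */
static uint64_t demo_cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
    const uint64_t ns_per_s = 1000000000ULL;

    assert(freq_hz > 0);

    return (cycles / freq_hz) * ns_per_s +
           (cycles % freq_hz) * ns_per_s / freq_hz;
}
```

A sample here would typically be the difference between two cycle counter values recorded on the same CPU core, for example on entry to and exit from one layer.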