.. _benchmark_framework:

.. note::
    The feature described in this section depends on a Linux
    `kernel patch`_ that is not available upstream and has been maintained in
    the `linaro-swg kernel`_ repository up to OP-TEE version 3.15.0. The latest
    kernel source with this patch can be found in the `optee-3.15.0`_ branch
    based on Linux 5.14.

    The benchmark framework should still work as described here with OP-TEE
    3.16.0 or later provided that either: (a) a Linux kernel built from branch
    `optee-3.15.0`_ is used, or (b) the benchmark `kernel patch`_ is forward
    ported.

    If the kernel patch is missing the following errors are printed:

    .. code::

        $ benchmark optee_example_hello_world
        [Benchmark] INFO: 1. Opening Benchmark Static TA...
        [Benchmark] INFO: 2. Allocating per-core buffers, cores detected = 2
        [Benchmark] ERROR: TEEC_InvokeCommand: 0xffff000c

        E/TC:? 0 alloc_benchmark_buffer:72 Benchmark: can't create mobj for timestamp buffer

.. _kernel patch: https://github.com/linaro-swg/linux/commit/d9b0331b46540fa67c0f16e391940f12fde1288b
.. _linaro-swg kernel: https://github.com/linaro-swg/Linux
.. _optee-3.15.0: https://github.com/linaro-swg/linux/commits/optee-3.15.0

Benchmark framework
###################
OP-TEE spans several architectural layers, each with its own complex parts.
Optimizing performance further requires a tool that provides detailed and
precise profiling information for each layer.

Latency values are needed for:

    * The roundtrip time for going from a client application in normal world,
      down to a Trusted Application and back again.

    * The time taken to go through each layer:

      * libTEEC -> Linux OP-TEE kernel driver
      * Linux OP-TEE kernel driver -> OP-TEE OS Core
      * OP-TEE OS Core -> TA entry point (**not supported yet**)
      * The same way back

Implementation details
**********************

Design overview
===============
The benchmark framework consists of the following components:

    1. **Benchmark Client Application (CA)**: a dedicated client application,
       which is responsible for allocating timestamp circular buffers,
       registering these buffers in the Benchmark PTA and consuming all
       timestamp data generated by all OP-TEE layers. Finally, it writes the
       timestamp data to a file with the ``.ts`` extension. Additional build
       details can be found at :ref:`optee_benchmark`.

    2. **Benchmark Pseudo Trusted Application (PTA)**: owns all per-CPU
       circular non-secure buffers allocated from shared memory. The Benchmark
       PTA must be invoked (by a CA) to register the timestamp circular
       buffers. In turn, the Benchmark PTA invokes the OP-TEE Linux driver
       (through an RPC mechanism) to register these circular buffers in the
       Linux kernel layer.

    3. **libTEEC** and the **Linux kernel OP-TEE driver** include
       functionality for handling timestamp buffer registration requests from
       the Benchmark PTA.

When the benchmark is enabled, all OP-TEE layers (libTEEC, Linux kernel OP-TEE
driver, OP-TEE OS core) fill the registered timestamp circular buffer with
timestamp data for all invocation requests, provided that the circular buffer
has been allocated and registered.

.. image:: ../images/benchmark/benchmark_design.png

.. To edit benchmark_design diagram use http://draw.io and benchmark_design.xml
   source file

Timestamp source
================
Arm Performance Monitor Units are used as the main source of timestamp values.
This technology was chosen because it is supported on all Armv7-A/Armv8-A
cores. Besides providing precise per-CPU cycle counter values, it allows EL0
access to all events to be enabled, so user-mode applications can read CPU
counter values directly from coprocessor registers, achieving minimal latency
by avoiding additional syscalls to EL1.

Besides the CPU cycle counter value, each timestamp also contains:

    * The index of the executing CPU core

    * The id of the OP-TEE layer where the timestamp was obtained

    * The program counter value at the time the timestamp was logged, which
      can be used to get a symbol name (a filename and line number)

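
As an illustration of the fields listed above, a timestamp record and a
per-CPU circular buffer could be laid out roughly as follows. This is a minimal
sketch, not the layout used by OP-TEE: the names ``demo_timestamp``,
``demo_cpu_buf``, ``DEMO_MAX_STAMPS``, ``demo_push()`` and ``demo_pop()`` are
hypothetical, and the CPU core index is assumed to be implied by which per-CPU
buffer a record is stored in.

.. code-block:: c

    #include <stdint.h>

    #define DEMO_MAX_STAMPS 4 /* hypothetical capacity, kept small for the demo */

    /* One timestamp record (hypothetical layout). */
    struct demo_timestamp {
        uint64_t cycles;   /* CPU cycle counter value */
        uint64_t pc;       /* program counter where the stamp was taken */
        uint32_t layer_id; /* id of the OP-TEE layer that logged it */
    };

    /* One circular buffer per CPU core; the core index selects the buffer. */
    struct demo_cpu_buf {
        uint64_t head; /* total number of records written */
        uint64_t tail; /* total number of records consumed or dropped */
        struct demo_timestamp stamps[DEMO_MAX_STAMPS];
    };

    /* Producer side: store a record, overwriting the oldest one when full. */
    static void demo_push(struct demo_cpu_buf *buf, struct demo_timestamp ts)
    {
        buf->stamps[buf->head % DEMO_MAX_STAMPS] = ts;
        buf->head++;
        if (buf->head - buf->tail > DEMO_MAX_STAMPS)
            buf->tail = buf->head - DEMO_MAX_STAMPS; /* drop the oldest */
    }

    /* Consumer side: returns 1 and fills *out if a record was available. */
    static int demo_pop(struct demo_cpu_buf *buf, struct demo_timestamp *out)
    {
        if (buf->tail == buf->head)
            return 0; /* buffer is empty */
        *out = buf->stamps[buf->tail % DEMO_MAX_STAMPS];
        buf->tail++;
        return 1;
    }

In this sketch the producer never blocks: when the consumer falls behind, the
oldest records are overwritten, which keeps the cost of logging a timestamp
small and constant.
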
Call sequence diagram
=====================
.. image:: ../images/benchmark/benchmark_sequence.png

.. For benchmark call sequence diagram use http://mscgen.js.org and
   benchmark_sequence.msc source file

Adding custom timestamps
************************

Currently, timestamping is done only for ``InvokeCommand`` calls, but it's also
possible to add timestamps at custom places in the supported OP-TEE layers. To
add a timestamp-storing call to a custom C source file:

    1. Include the appropriate header:

       * OP-TEE OS Core: ``bench.h``

       * Linux kernel OP-TEE module: ``optee_bench.h``

       * libTEEC: ``teec_benchmark.h``

    2. Invoke ``bm_timestamp()`` (for the Linux kernel module use
       ``optee_bm_timestamp()``) in the function where you want to record a
       timestamp.

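
The two steps above might look as follows in practice. This is only a sketch
under stated assumptions: ``bm_timestamp()`` is assumed to take no arguments
and return nothing, and a local stand-in replaces the real header so that the
example builds outside the OP-TEE tree. ``demo_process_request()`` and
``demo_stamp_count`` are hypothetical names.

.. code-block:: c

    /*
     * Stand-in so that the sketch builds outside the OP-TEE tree. In real
     * code this definition is replaced by including the header of the layer
     * being instrumented, for example bench.h in the OP-TEE OS core.
     * ASSUMPTION: bm_timestamp() takes no arguments and returns nothing.
     */
    static unsigned int demo_stamp_count;

    static void bm_timestamp(void)
    {
        demo_stamp_count++; /* the real function stores a timestamp instead */
    }

    /* Hypothetical function whose latency we want to measure. */
    static int demo_process_request(int value)
    {
        int result;

        bm_timestamp();     /* timestamp on entry */
        result = value * 2; /* the work being measured */
        bm_timestamp();     /* timestamp on exit */

        return result;
    }

Placing one call on entry and one on exit yields a pair of timestamps whose
cycle counter difference approximates the time spent in the function.
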
.. todo::

    Joakim: Igor's planned tool should go here.
    Analyzing results
    =================
    Will be added soon.

Build and run benchmark
***********************
Please see the instructions available at :ref:`optee_benchmark`.


Limitations and further steps
*****************************

    * Implement an application that analyzes timestamp data and provides
      statistics for different types of calls, reporting avg/min/max values
      (both CPU cycles and time values).

    * Add support for all platforms where OP-TEE is supported.

    * Add support for S-EL0 timestamping.

    * Attach additional payload information to each timestamp, for example,
      the session.

    * Support timestamping within interrupt context in the OP-TEE OS Core.
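
The statistics mentioned in the first item could be computed along the
following lines. This is a hypothetical sketch, not part of any existing
OP-TEE tool: ``demo_stats``, ``demo_cycle_stats()`` and
``demo_cycles_to_ns()`` are made-up names, the input is assumed to be an array
of latencies already expressed in CPU cycles, and the conversion to time
assumes a fixed, known CPU frequency.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Summary of a set of latency samples, in CPU cycles. */
    struct demo_stats {
        uint64_t min;
        uint64_t max;
        double avg;
    };

    /* Fill *out from n samples; returns 0 on success, -1 if n is 0. */
    static int demo_cycle_stats(const uint64_t *samples, size_t n,
                                struct demo_stats *out)
    {
        double sum = 0.0;

        if (n == 0)
            return -1;

        out->min = samples[0];
        out->max = samples[0];
        for (size_t i = 0; i < n; i++) {
            if (samples[i] < out->min)
                out->min = samples[i];
            if (samples[i] > out->max)
                out->max = samples[i];
            sum += (double)samples[i];
        }
        out->avg = sum / (double)n;

        return 0;
    }

    /* Convert cycles to nanoseconds for an assumed fixed frequency in Hz. */
    static double demo_cycles_to_ns(double cycles, double cpu_hz)
    {
        return cycles * 1e9 / cpu_hz;
    }

On cores with dynamic frequency scaling a single frequency value is only an
approximation, so cycle counts are the more reliable of the two units.
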