.. note::

    The feature described in this section depends on a Linux
    `kernel patch`_ that is not available upstream and has been maintained in
    the `linaro-swg kernel`_ repository up to OP-TEE version 3.15.0. The latest
    kernel source with this patch can be found in the `optee-3.15.0`_ branch
    based on Linux 5.14.

    The benchmark framework should still work as described here with OP-TEE
    3.16.0 or later provided that either: (a) a Linux kernel built from branch
    `optee-3.15.0`_ is used, or (b) the benchmark `kernel patch`_ is forward
    ported.

    If the kernel patch is missing the following errors are printed:

    .. code::

        $ benchmark optee_example_hello_world
        [Benchmark] INFO: 1. Opening Benchmark Static TA...
        [Benchmark] INFO: 2. Allocating per-core buffers, cores detected = 2
        [Benchmark] ERROR: TEEC_InvokeCommand: 0xffff000c
        E/TC:? 0 alloc_benchmark_buffer:72 Benchmark: can't create mobj for timestamp buffer

.. _kernel patch: https://github.com/linaro-swg/linux/commit/d9b0331b46540fa67c0f16e391940f12fde1288b
.. _linaro-swg kernel: https://github.com/linaro-swg/Linux
.. _optee-3.15.0: https://github.com/linaro-swg/linux/commits/optee-3.15.0
.. _benchmark_framework:
Benchmark framework
###################
OP-TEE is, by nature, a solution that spans several architectural layers,
each with its own complex parts. To optimize performance further, a tool is
needed that provides detailed and precise profiling information for each
layer.

Latency values are needed for:

* The round-trip time from a client application in the normal world down to
  a Trusted Application and back again.
* The time taken to pass through each layer:

  * libTEEC -> Linux OP-TEE kernel driver
  * Linux OP-TEE kernel driver -> OP-TEE OS Core
  * OP-TEE OS Core -> TA entry point (**not supported yet**)
  * The same path in the reverse direction

Implementation details
**********************
Design overview
===============
The benchmark framework consists of the following components:

1. **Benchmark Client Application (CA)**: a dedicated client application
   that is responsible for allocating the timestamp circular buffers,
   registering these buffers in the Benchmark PTA and consuming all
   timestamp data generated by all OP-TEE layers. Finally, it writes the
   timestamp data to a file with the ``.ts`` extension. Additional build
   details can be found at :ref:`optee_benchmark`.
2. **Benchmark Pseudo Trusted Application (PTA)**: owns all per-CPU
   circular non-secure buffers allocated from shared memory. The Benchmark
   PTA must be invoked (by a CA) to register the timestamp circular buffers.
   In turn, the Benchmark PTA invokes the OP-TEE Linux driver (through an RPC
   mechanism) to register these circular buffers in the Linux kernel layer.
3. **libTEEC** and the **Linux kernel OP-TEE driver**: include functionality
   for handling timestamp buffer registration requests from the Benchmark
   PTA.

When the benchmark is enabled, all OP-TEE layers (libTEEC, Linux kernel OP-TEE
driver, OP-TEE OS core) fill the registered timestamp circular buffer with
timestamp data for all invocation requests, provided that the circular buffer
has been allocated and registered.

.. image:: ../images/benchmark/benchmark_design.png
.. To edit benchmark_design diagram use http://draw.io and benchmark_design.xml
   source file
Timestamp source
================
Arm Performance Monitor Units (PMUs) are used as the main source of timestamp
values. This technology was chosen because it is supported on all
Armv7-A/Armv8-A cores. In addition to providing precise per-CPU cycle counter
values, the PMU can be configured to allow EL0 access to all events, so
user-mode applications can read CPU counter values directly from coprocessor
registers, achieving minimal latency by avoiding additional syscalls into EL1.
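
As an illustration of a direct user-mode read, the following is a minimal
sketch and not the framework's actual code. It assumes that EL0 access to the
PMU has already been enabled by privileged software (otherwise the register
read traps). The function names ``read_cycle_counter()`` and
``cycles_elapsed32()`` are hypothetical, and the non-Arm branch is only a
stand-in so that the sketch builds on other hosts.

```c
#include <stdint.h>
#include <time.h>

/*
 * Read the PMU cycle counter from user space.
 * Assumption: EL0 access to the PMU is already enabled; if it is not,
 * the register access below traps instead of returning a value.
 */
static inline uint64_t read_cycle_counter(void)
{
#if defined(__aarch64__)
	uint64_t val;

	/* AArch64: PMCCNTR_EL0 holds the cycle count */
	__asm__ volatile("mrs %0, pmccntr_el0" : "=r"(val));
	return val;
#elif defined(__arm__)
	uint32_t val;

	/* AArch32: PMCCNTR is reached through CP15 (c9, c13, 0) */
	__asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(val));
	return val;
#else
	/* Not an Arm target: placeholder value so the sketch still builds */
	return (uint64_t)clock();
#endif
}

/* Cycles between two 32-bit counter samples, tolerating a single wrap */
static inline uint32_t cycles_elapsed32(uint32_t start, uint32_t end)
{
	/* Unsigned subtraction is modulo 2^32, so a wrap cancels out */
	return (uint32_t)(end - start);
}
```

The helper for 32-bit samples is included because a counter read as a 32-bit
value can wrap between two samples; unsigned subtraction still gives the right
difference as long as the counter wrapped at most once.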
Besides the CPU cycle counter value, each timestamp also contains:

* The index of the executing CPU core
* The id of the OP-TEE layer in which the timestamp was obtained
* The program counter value at the point where the timestamp was logged,
  which can be used to look up a symbol name (a filename and line number)
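
To make the record contents and the circular buffer idea more concrete, here
is a small single-threaded sketch. Every name in it (``struct ts_record``,
``struct ts_ring``, ``TS_RING_SIZE``, the ``ts_layer`` values) is hypothetical
and chosen for illustration only; the actual layout used by OP-TEE may differ,
and a real per-CPU buffer shared between worlds would also need appropriate
synchronization.

```c
#include <stdint.h>

#define TS_RING_SIZE 4u /* hypothetical capacity; must be a power of two */

/* Hypothetical layer identifiers, one per instrumented OP-TEE layer */
enum ts_layer {
	TS_LAYER_LIBTEEC = 1,
	TS_LAYER_KMOD,
	TS_LAYER_CORE,
};

/* One timestamp record carrying the fields listed above */
struct ts_record {
	uint64_t cycles; /* CPU cycle counter value */
	uint64_t pc;     /* program counter where the stamp was taken */
	uint32_t core;   /* index of the executing CPU core */
	uint32_t layer;  /* enum ts_layer value */
};

/* Circular buffer for one CPU; head and tail only ever increase */
struct ts_ring {
	uint64_t head; /* next slot to write (producer side) */
	uint64_t tail; /* next slot to read (consumer side) */
	struct ts_record slots[TS_RING_SIZE];
};

/* Store a record, overwriting the oldest entry when the ring is full */
static void ts_ring_push(struct ts_ring *ring, const struct ts_record *rec)
{
	ring->slots[ring->head & (TS_RING_SIZE - 1)] = *rec;
	ring->head++;
	if (ring->head - ring->tail > TS_RING_SIZE)
		ring->tail = ring->head - TS_RING_SIZE;
}

/* Fetch the oldest unread record; returns 0 when the ring is empty */
static int ts_ring_pop(struct ts_ring *ring, struct ts_record *out)
{
	if (ring->tail == ring->head)
		return 0;
	*out = ring->slots[ring->tail & (TS_RING_SIZE - 1)];
	ring->tail++;
	return 1;
}
```

In this sketch a slow consumer loses the oldest records rather than blocking
the producer, which is one plausible policy for a profiling buffer; whether
the framework behaves the same way is not stated here.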
Call sequence diagram
=====================
.. image:: ../images/benchmark/benchmark_sequence.png
.. For benchmark call sequence diagram use http://mscgen.js.org and
   benchmark_sequence.msc source file
Adding custom timestamps
************************
Currently, timestamps are recorded only for ``InvokeCommand`` calls, but
custom timestamp points can also be added in any of the supported OP-TEE
layers. To add a timestamp call to a custom C source file:

1. Include the appropriate header:

   * OP-TEE OS Core: ``bench.h``
   * Linux kernel OP-TEE module: ``optee_bench.h``
   * libTEEC: ``teec_benchmark.h``

2. Call ``bm_timestamp()`` (``optee_bm_timestamp()`` in the Linux kernel
   module) in the function where the timestamp should be recorded.
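
The two steps above might look as follows for a libTEEC-side source file.
This is a sketch under stated assumptions: ``bm_timestamp()`` is assumed to
take no arguments and return nothing, ``HAVE_TEEC_BENCHMARK_H`` is a
hypothetical build switch, ``checksum()`` is a made-up function to be
measured, and the stub only counts calls so that the example builds without
the OP-TEE headers.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_TEEC_BENCHMARK_H /* hypothetical build switch */
#include <teec_benchmark.h>  /* step 1: header for the libTEEC layer */
#else
/* Stand-in so the sketch builds without OP-TEE; it only counts calls */
static unsigned int stub_stamp_count;

static void bm_timestamp(void) /* assumed signature */
{
	stub_stamp_count++;
}
#endif

/* Hypothetical function whose cost we want to measure */
static uint32_t checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;

	bm_timestamp(); /* step 2: stamp on entry */
	for (size_t i = 0; i < len; i++)
		sum += data[i];
	bm_timestamp(); /* second stamp on exit */

	return sum;
}
```

Placing one call at entry and one at exit lets the difference between the two
recorded cycle counts approximate the cost of the enclosed code.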
.. todo::
Joakim: Igor's planned tool should go here.
Analyzing results
=================
Will be added soon.
Build and run benchmark
***********************
Please see the instructions available at :ref:`optee_benchmark`.
Limitations and further steps
*****************************
* Implement an application that analyzes timestamp data and provides
  avg/min/max statistics (in both CPU cycles and time values) for the
  different types of calls.
* Add support for all platforms on which OP-TEE is supported.
* Add support for S-EL0 timestamping.
* Attach additional payload information to each timestamp, for example, the
  session.
* Support timestamping within interrupt context in the OP-TEE OS Core.