.. _gprof:

Gprof
#####
This section describes how to profile user Trusted Applications with ``gprof``.

The configuration option ``CFG_TA_GPROF_SUPPORT=y`` enables OP-TEE to collect
profiling information from Trusted Applications running in user mode and
compiled with ``-pg``. Once collected, the profiling data are formatted in the
``gmon.out`` format and sent to ``tee-supplicant`` via RPC, so they can be saved
to disk and later processed and displayed by the standard ``gprof`` tool.

Usage
*****

    - Build OP-TEE OS with ``CFG_TA_GPROF_SUPPORT=y``. You may also set
      ``CFG_ULIBS_MCOUNT=y`` to instrument the user TA libraries contained in
      ``optee_os`` (such as ``libutee`` and ``libutils``).

    - Build user TAs with ``-pg``, for instance by setting ``CFG_TA_MCOUNT=y``
      to instrument the whole user TA. Note that instrumented TAs have a larger
      ``.bss`` section. The memory overhead is 1.36 times the ``.text`` size for
      32-bit TAs, and 1.77 times for 64-bit ones (refer to the TA linker script
      for details: ``ta/arch/arm/ta.ld.S``).

    - Run the application normally. When the last session exits,
      ``tee-supplicant`` will write profiling data to
      ``/tmp/gmon-<ta_uuid>.out``. If the file already exists, a number is
      appended, such as: ``gmon-<ta_uuid>.1.out``.

    - Run gprof on the TA ELF file and profiling output:
      ``gprof <ta_uuid>.elf gmon-<ta_uuid>.out``

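The figures and file names quoted in the steps above can be illustrated with a
short C sketch. It is not taken from OP-TEE or ``tee-supplicant``: the helper
names ``gprof_bss_overhead()`` and ``gmon_file_name()`` are hypothetical, the
overhead is simply the quoted factor applied to the ``.text`` size, and the
numbering of already existing files is assumed to start at 1, as the example
file name suggests.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    /*
     * Hypothetical helper: estimated extra .bss bytes for an instrumented TA,
     * using the 1.36x (32-bit) and 1.77x (64-bit) factors quoted above.
     */
    size_t gprof_bss_overhead(size_t text_size, int is_64bit)
    {
        size_t percent = is_64bit ? 177 : 136;

        return text_size * percent / 100;
    }

    /*
     * Hypothetical helper: build the output file name for a TA UUID string.
     * index 0 is the first file; index n > 0 yields "gmon-<uuid>.n.out"
     * (assumption: numbering starts at 1). Returns the snprintf() result.
     */
    int gmon_file_name(char *buf, size_t len, const char *uuid,
                       unsigned int index)
    {
        assert(buf != NULL && uuid != NULL);

        if (index == 0)
            return snprintf(buf, len, "/tmp/gmon-%s.out", uuid);

        return snprintf(buf, len, "/tmp/gmon-%s.%u.out", uuid, index);
    }

Under this estimate, a 32-bit TA with 100000 bytes of ``.text`` would need about
136000 additional bytes of ``.bss``, and a second profiling run for the same TA
would be saved as ``/tmp/gmon-<ta_uuid>.1.out``.
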
Implementation
**************
Part of the profiling is implemented in libutee. Another part is done in the TEE
core by a pseudo-TA (``core/arch/arm/sta/gprof.c``). Two types of data are
collected:

    1. Call graph information
        - When TA source files are compiled with the ``-pg`` switch, the
          compiler generates extra code into each function prologue to call the
          instrumentation entry point (``__gnu_mcount_nc`` or ``_mcount``
          depending on the architecture). Each time an instrumented function is
          called, libutee records a pair of program counters (one is the caller
          and the other one is the callee) as well as the number of times this
          specific arc of the call graph has been invoked.

    2. PC distribution over time
        - When an instrumented TA starts, libutee calls the pseudo-TA to start
          PC sampling for the current session. Sampling data are written into
          the user-space buffer directly by the TEE core.

        - Whenever the TA execution is interrupted, the TEE core records the
          current program counter value and builds a histogram of program
          locations (i.e., relative amount of time spent for each value of the
          PC). This is later used by the gprof tool to derive the time spent in
          each function. The sampling rate, which is assumed to be roughly
          constant, is computed by keeping track of the time spent executing
          user TA code and dividing the number of interrupts by the total time.

        - The profiling buffer into which call graph and sampling data are
          recorded is allocated in the TA's ``.bss`` section. Some space is
          reserved by the linker script, only when the TA is instrumented.
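
The two kinds of data described above can be sketched in a few lines of C. This
is an illustration only, not the code found in libutee or in the pseudo-TA: the
names ``struct prof_buf``, ``record_arc()``, ``record_pc_sample()`` and
``sampling_rate_hz()``, the fixed table sizes, the linear arc search and the
bucket granularity are all assumptions made for the example.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define MAX_ARCS     64   /* assumed capacity, for illustration only */
    #define NUM_BUCKETS  256  /* assumed histogram size */
    #define BUCKET_SHIFT 2    /* assumed: one bucket per 4 bytes of code */

    struct arc {
        uintptr_t caller_pc;
        uintptr_t callee_pc;
        unsigned long count;      /* times this arc has been invoked */
    };

    struct prof_buf {
        uintptr_t text_start;     /* lowest profiled address */
        struct arc arcs[MAX_ARCS];
        size_t num_arcs;
        uint16_t hist[NUM_BUCKETS];
        unsigned long num_samples;
    };

    /*
     * What an instrumentation entry point could do: count one traversal of
     * the caller -> callee arc. Returns 0 on success, -1 if the table is full.
     */
    int record_arc(struct prof_buf *p, uintptr_t caller_pc, uintptr_t callee_pc)
    {
        size_t i;

        assert(p != NULL);
        for (i = 0; i < p->num_arcs; i++) {
            if (p->arcs[i].caller_pc == caller_pc &&
                p->arcs[i].callee_pc == callee_pc) {
                p->arcs[i].count++;
                return 0;
            }
        }
        if (p->num_arcs == MAX_ARCS)
            return -1;
        p->arcs[p->num_arcs].caller_pc = caller_pc;
        p->arcs[p->num_arcs].callee_pc = callee_pc;
        p->arcs[p->num_arcs].count = 1;
        p->num_arcs++;
        return 0;
    }

    /*
     * What could happen when the TA is interrupted: increment the histogram
     * bucket covering pc. Samples outside the profiled range are ignored.
     */
    void record_pc_sample(struct prof_buf *p, uintptr_t pc)
    {
        size_t bucket;

        assert(p != NULL);
        if (pc < p->text_start)
            return;
        bucket = (pc - p->text_start) >> BUCKET_SHIFT;
        if (bucket >= NUM_BUCKETS)
            return;
        if (p->hist[bucket] < UINT16_MAX)
            p->hist[bucket]++;    /* saturate instead of wrapping */
        p->num_samples++;
    }

    /* Sampling rate: number of samples divided by the time spent in the TA. */
    unsigned long sampling_rate_hz(unsigned long num_samples,
                                   unsigned long usr_time_us)
    {
        if (usr_time_us == 0)
            return 0;
        return (unsigned long)((uint64_t)num_samples * 1000000 / usr_time_us);
    }

In this sketch both tables live in one structure, mirroring the single
profiling buffer that holds call graph and sampling data in the TA's ``.bss``
section.
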