Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 1 | .. _ftrace: |
| 2 | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 3 | Ftrace (function tracing) |
| 4 | ######################### |
| 5 | This section describes how to generate a function call graph for user Trusted |
| 6 | Applications using ``ftrace``. The name comes from the Linux framework which |
| 7 | has a similar purpose, but the OP-TEE ``ftrace`` is very much specific. |
| 8 | |
| 9 | A call graph logs all the calls to instrumented functions and contains timing |
| 10 | information. It is therefore a valuable tool to troubleshoot performance |
| 11 | problems or to optimize the code in general. |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 12 | |
Sumit Garg | ba868fd | 2019-10-21 18:49:41 +0530 | [diff] [blame] | 13 | The configuration option ``CFG_FTRACE_SUPPORT=y`` enables OP-TEE to collect |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 14 | function graph information from Trusted Applications running in user mode and |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 15 | compiled with ``-pg``. Once collected, the function graph data is sent to |
| 16 | ``tee-supplicant`` via RPC, so they can be saved to disk, later processed |
| 17 | and displayed using helper scripts (``ftrace_format.py`` and ``symbolize.py`` |
| 18 | which can be found in ``optee_os/scripts``). |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 19 | |
Sumit Garg | ba868fd | 2019-10-21 18:49:41 +0530 | [diff] [blame] | 20 | Another configuration option ``CFG_SYSCALL_FTRACE=y`` in addition to |
| 21 | ``CFG_FTRACE_SUPPORT=y`` enables OP-TEE to collect function graph information |
| 22 | for syscalls as well while running in kernel mode on behalf of Trusted |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 23 | Applications. Note that a small number of kernel functions cannot be traced; |
| 24 | they have the ``__noprof`` attribute in the source code. |
| 25 | |
| 26 | A third configuration option ``CFG_ULIBS_MCOUNT=y`` enables tracing of user |
| 27 | space libraries contained in ``optee_os`` and used by TAs (such as ``libutee`` |
| 28 | and ``libutils``). |
Sumit Garg | ba868fd | 2019-10-21 18:49:41 +0530 | [diff] [blame] | 29 | |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 30 | Usage |
| 31 | ***** |
| 32 | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 33 | - Build OP-TEE OS with ``CFG_FTRACE_SUPPORT=y`` and optionally |
| 34 | ``CFG_ULIBS_MCOUNT=y`` and ``CFG_SYSCALL_FTRACE=y``. |
Sumit Garg | ba868fd | 2019-10-21 18:49:41 +0530 | [diff] [blame] | 35 | |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 36 | - Build user TAs with ``-pg``, for instance enable ``CFG_TA_MCOUNT=y`` to |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 37 | instrument the whole TA. Also, in case user wants to set ``-pg`` for a |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 38 | particular file, following should go in corresponding sub.mk: |
| 39 | ``cflags-<file-name>-y+=-pg``. Note that instrumented TAs have a larger |
| 40 | ``.bss`` section. The memory overhead depends on ``CFG_FTRACE_BUF_SIZE`` |
| 41 | macro which can be configured specific to user TAs using config: |
| 42 | ``CFG_FTRACE_BUF_SIZE=4096`` (default value: 2048, refer to the TA linker |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 43 | script for details: ``ta/arch/$(ARCH)/ta.ld.S``). |
Sumit Garg | 444bd73 | 2019-04-09 18:40:02 +0530 | [diff] [blame] | 44 | |
| 45 | - Run the application normally. When the current session exits or there is |
| 46 | any abort during TA execution, ``tee-supplicant`` will write function |
| 47 | graph data to ``/tmp/ftrace-<ta_uuid>.out``. If the file already exists, |
| 48 | a number is appended, such as: ``ftrace-<ta_uuid>.1.out``. |
| 49 | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 50 | - Run helper scripts called ``ftrace_format.py`` to translate the function |
| 51 | graph binary data into human readable text and ``symbolize.py`` to |
| 52 | convert function addresses into function names: |
Stefan Gloor | 2e3a5da | 2023-11-08 12:46:19 +0100 | [diff] [blame] | 53 | ``optee_os/scripts/ftrace_format.py ftrace-<ta_uuid>.out | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 54 | optee_os/scripts/symbolize.py -d <ta_uuid>.elf -d tee.elf`` |
| 55 | |
| 56 | - Refer to `commit 5c2c0fb31efb`_ for a full usage example on QEMU. |
Sumit Garg | 5a34726 | 2019-09-05 12:25:59 +0530 | [diff] [blame] | 57 | |
| 58 | Typical output |
| 59 | ************** |
| 60 | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 61 | .. code-block:: none |
Sumit Garg | 5a34726 | 2019-09-05 12:25:59 +0530 | [diff] [blame] | 62 | |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 63 | TEE load address @ 0x5ab04000 |
| 64 | Function graph for TA: cb3e5ba0-adf1-11e0-998b-0002a5d5c51b @ 80085000 |
| 65 | | 1 | __ta_entry() { |
| 66 | | 2 | __utee_entry() { |
| 67 | 43.840 us | 3 | ta_header_get_session() |
| 68 | 7.216 us | 3 | tahead_get_trace_level() |
| 69 | 14.480 us | 3 | trace_set_level() |
| 70 | | 3 | malloc_add_pool() { |
| 71 | | 4 | raw_malloc_add_pool() { |
| 72 | 46.032 us | 5 | bpool() |
| 73 | | 5 | raw_realloc() { |
| 74 | 166.256 us | 6 | bget() |
| 75 | 23.056 us | 6 | raw_malloc_return_hook() |
| 76 | 267.952 us | 5 | } |
| 77 | 398.720 us | 4 | } |
| 78 | 426.992 us | 3 | } |
| 79 | | 3 | TEE_GetPropertyAsU32() { |
| 80 | 23.600 us | 4 | is_propset_pseudo_handle() |
| 81 | | 4 | __utee_check_instring_annotation() { |
| 82 | 26.416 us | 5 | strlen() |
| 83 | | 5 | check_access() { |
| 84 | | 6 | TEE_CheckMemoryAccessRights() { |
| 85 | | 7 | _utee_check_access_rights() { |
| 86 | | 8 | syscall_check_access_rights() { |
| 87 | | 9 | ts_get_current_session() { |
| 88 | 4.304 us | 10 | ts_get_current_session_may_fail() |
| 89 | 10.976 us | 9 | } |
| 90 | | 9 | to_user_ta_ctx() { |
| 91 | 2.496 us | 10 | is_user_ta_ctx() |
| 92 | 8.096 us | 9 | } |
| 93 | | 9 | vm_check_access_rights() { |
| 94 | | 10 | vm_buf_is_inside_um_private() { |
| 95 | | 11 | core_is_buffer_inside() { |
| 96 | ... |
Sumit Garg | 5a34726 | 2019-09-05 12:25:59 +0530 | [diff] [blame] | 97 | |
| 98 | The duration (function's time of execution) is displayed on the closing bracket |
| 99 | line of a function or on the same line in case the function is the leaf one. |
| 100 | In other words, duration is displayed whenever an instrumented function returns. |
| 101 | It comprises the time spent executing the function and any of its callees. The |
| 102 | Counter-timer Physical Count register (CNTPCT) and the Counter-timer Frequency |
| 103 | register (CNTFRQ) are used to compute durations. Time spent servicing foreign |
| 104 | interrupts is subtracted. |
Jerome Forissier | be6f978 | 2023-06-29 10:53:20 +0200 | [diff] [blame] | 105 | |
| 106 | The second column is the stack depth for the current function. It helps |
| 107 | visually match function entries and exit. |
| 108 | |
| 109 | .. _commit 5c2c0fb31efb: https://github.com/OP-TEE/optee_os/commit/5c2c0fb31efb |