Debugging OP-TEE

  1. QEMU
    1. Prerequisites
    2. Download gdb for ARM
    3. Scripts
    4. Debug
    5. Use graphical frontends
      1. ddd
      2. GNU Visual Debugger (gvd)
      3. Affinic Debugger
    6. Known issues
  2. Ftrace
  3. Abort dumps

In this document we would like to describe how to debug OP-TEE. Depending on the platform you are using you will have a couple of different options.

1. QEMU

To debug OP-TEE when using QEMU you could use gdb as the main debugger. Using this setup will also make it possible to use some frontends for gdb if you don't feel comfortable using a command line debugging tool.

1.1 Prerequisites

Since there are inter-dependencies between the gits used when building OP-TEE, we recommend that you have been following the guide as described in README.md.

# Root folder for the project
$HOME/devel/optee

1.2 Download gdb for ARM

If you've followed the instructions until here, then you should have the toolchains already in

$HOME/devel/optee/toolchains

1.3 Scripts

A few helper scripts that makes life easier.

Start by creating $HOME/.gdbinit and add:

set print array on
set print pretty on

define optee
	handle SIGTRAP noprint nostop pass
	symbol-file $HOME/devel/optee/optee_os/out/arm/core/tee.elf
	target remote localhost:1234
end
document optee
	Loads and setup the binary (tee.elf) for OP-TEE and also connects to the QEMU
	remote.
end

Now you are good to go for doing debugging using command line.

1.4 Debug

Start QEMU according to the instructions in QEMU ARMv7-A, however, do not start the emulation, i.e. do not type the c command in QEMU. The main reason for not doing so is because you cannot set the breakpoints on secure side when when kernel has booted up (if anyone knows why, please let us now about it, we haven't investigated it) and then in another shell start gdb like this:

$ $HOME/devel/optee/toolchains/aarch32/bin/arm-linux-gnueabihf-gdb -q

To connect to the remote and to load the tee.elf, simply type:

(gdb) optee
SIGTRAP is used by the debugger.
Are you sure you want to change it? (y or n) [answered Y; input not from
terminal]
0x00000000 in ?? ()

Now it is time to set the breakpoints. For example

(gdb) b tee_entry_std
Breakpoint 1 at 0x7df0c7be: file core/arch/arm/tee/entry_std.c, line 268.

and then start the execution by writing the continue command in gdb.

(gdb) c
Continuing.

When the driver has been loaded and you start using OP-TEE the breakpoint will trigger, which will look something like this:

Breakpoint 1, tee_entry_std (smc_args=0x7df6ff98 <stack_thread+8216>)
    at core/arch/arm/tee/entry_std.c:268
268		struct optee_msg_arg *arg = NULL;
(gdb)

1.5. Use graphical frontends

1.5.1 ddd

With the PATH exported to the arm-none-eabi-gdb binary and the optee helper function defined as above in the .gdbinit file, you invoke ddd by typing:

ddd --debugger arm-none-eabi-gdb

Then in the lower pane (which is the gdb command window), just simply type optee and ddd will connect to the remote and load tee.elf, just as described above for the command line version.

1.5.2 GNU Visual Debugger (gvd)

This is a rather old frontend for gdb and share a lot of similarities with ddd, however it seems like it's more stable compared to ddd. To run it, you simply need to tell the path to the arm-none-eabi-gdb binary:

gvd --debugger $HOME/devel/toolchains/gcc-linaro-arm-none-eabi-4.9-2014.09_linux/bin/arm-none-eabi-gdb

Similarly to ddd, just simply run optee in the lower gdb command pane in gvd.

1.5.3 Affinic Debugger / ADG

Affinic Debugger seems to be the most stable graphical front end. It's not free (at this moment it costs roughly $50 USD). If you can afford it and prefer graphical frontends we highly recommend this tool.

1.6 Known issues

  1. Printing the call stack using bt makes gdb go into an endless loop. Temporary workaround, in gdb, instead of simply writing bt, also mention how many frames you would like to see, for example bt 10.
  2. Cannot set breakpoints when the system is up and running. Workaround, set the breakpoints before booting up the system.

2. Ftrace

Ftrace is useful set of tools for debugging both kernel and to some extent user space. Ftrace is really useful if you want to learn how some piece of code interact with other parts of the system. It's nothing special you have to do to make use of ftrace for OP-TEE. But for a reference we list a couple of commands and scenarios that could be handy to have ready to be copy/pasted.

2.1 Enable ftrace in menuconfig

First you will need to enable ftrace in the kernel. Depending on which version you are using it might look a bit different compared to what is shown below (here we were using 4.1.0-rc4)

make ARCH=arm menuconfig
    # Go into "Kernel hacking"
    General setup  --->
    ...
    Kernel hacking  --->

    # Enable and go into Tracers
    ...
    [*] Tracers  --->

    # Below is a good set of features (*) to enable
    --- Tracers
    -*-   Kernel Function Tracer
    [*]     Kernel Function Graph Tracer
    [ ]   Interrupts-off Latency Tracer
    [ ]   Scheduling Latency Tracer
    [*]   Trace syscalls
    [ ]   Create a snapshot trace buffer
          Branch Profiling (No branch profiling)  --->
    [*]   Trace max stack
    [ ]   Support for tracing block IO actions
    [ ]   Enable uprobes-based dynamic events
    [*]   enable/disable function tracing dynamically
    [*]   Kernel function profiler
    [ ]   Perform a startup test on ftrace
    [ ]   Add tracepoint that benchmarks tracepoints
    < >   Ring buffer benchmark stress tester
    [ ]   Ring buffer startup self test
    [ ]   Show enum mappings for trace events

Then simply recompile the kernel.

2.2 Use cases

2.2.1 Filter OP-TEE functions

$ modprobe optee_armtz
$ cd /sys/kernel/debug/tracing
$ echo ':mod:optee' > set_ftrace_filter
$ echo ':mod:optee_armtz' >> set_ftrace_filter

2.2.2 Use the function tracer and function profiling

Using the commands below will enable function profiling for the functions currently mentioned in the set_ftrace_filter

$ echo "function" > current_tracer
$ echo "1" > function_profile_enabled

If you now run xtest for example, then when done you can get profiling data by reading the content of the files in /sys/kernel/debug/tracing/trace_stat

$ cat trace_stat/function0
$ cat trace_stat/function1
...

The result will look something like this:

  Function                               Hit    Time            Avg             s^2
  --------                               ---    ----            ---             ---
  call_tee.isra.13                     13499    55772240 us     4131.583 us     1537657 us
  tee_session_ioctl                    11330    54380860 us     4799.722 us     35403.79 us
  tee_session_invoke_be                11330    54330744 us     4795.299 us     162939.5 us
  tz_invoke                            11330    54014297 us     4767.369 us     573472.7 us
  tee_ioctl                             1139    2893849 us     2540.692 us     2841179 us
  tee_session_create_fd                 1135    2889859 us     2546.131 us     2615175 us
  ...

2.2.2.1 Oneliners

# Print also the core number in the log
$ for core in `seq 0 7`; do echo core: $core; cat trace_stat/function$core; done
# The functions that are called mostly:
$ cat trace_stat/function0  | sort -nk2 -r | less
# The functions taking most time:
$ cat trace_stat/function0  | sort -nk5 -r | less

2.2.3 Using the function_graph

The function_graph will give you the call flow and also tell you the amount of time spent in the functions. There are ways to turn of sleep time and not count time spent when calling other functions. Let us say that your are interested in knowing how much various open, invoke and close and the call_tee command takes, then you can do like this:

$ echo "tz_open" > set_ftrace_filter
$ echo "tz_close" >> set_ftrace_filter
$ echo "tz_invoke" >> set_ftrace_filter
$ echo "call_tee*" >> set_ftrace_filter

# Don't count the time if you are being schduled out
$ echo 0 > options/sleep-time

# Enable the function_graph tracer
$ echo "function_graph" > current_tracer

Now if you run xtest and then done, read the contents of trace, you will see something like this:

# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 2)               |  tz_open [optee_armtz]() {
 2) ! 3145.834 us |    call_tee.isra.13 [optee_armtz]();
 2) ! 3222.500 us |  }
 2)               |  tz_invoke [optee_armtz]() {
 2) ! 125.833 us  |    call_tee.isra.13 [optee_armtz]();
 2) ! 166.667 us  |  }
 2)               |  tz_invoke [optee_armtz]() {
 2) ! 135.833 us  |    call_tee.isra.13 [optee_armtz]();
 2) ! 170.833 us  |  }
 2)               |  tz_invoke [optee_armtz]() {
 2) ! 153.334 us  |    call_tee.isra.13 [optee_armtz]();
 2) ! 186.667 us  |  }
...

2.2.4 Options

If you don't want to count the time when being scheduled out, then run:

$ echo 0 > options/sleep-time

If you only want to measure the time spent in the function, then disable the graph-time.

$ echo 0 > options/graph-time

3. Abort dumps, panics

When OP-TEE encounters a serious error condition, it prints diagnostic information to the secure console. The message contains a call stack if CFG_UNWIND=y (enabled by default). The following errors will trigger a dump:

  • Data or prefetch abort exception in the TEE core (kernel mode) or in a TA (user mode),
  • When a user-mode Trusted Application panics, either by calling TEE_Panic() directly or due to some error detected by the TEE Core Internal API,
  • When the TEE core detects a fatal error and decides to hang the system because there is no way to proceed safely (core panic).

The messages look slightly different depending on:

  • Whether the error is an exception or a panic,
  • The exception/privilege level when the exception occurred (PL0/EL0 if a user mode Trusted Application was running, PL1/EL1 if it was the TEE core),
  • Whether the TEE and TA are 32 or 64 bits,
  • The exact type of exception (data or prefetch abort, translation fault, read or write permission fault).

Here is an example of a data abort with write permission fault in a 32-bit Trusted Application, running on a 32-bit TEE core (QEMU):

ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 (write permission fault)
ERROR:   [0x0] TEE-CORE:  fsr 0x0000080f  ttbr0 0x0e07a06a  ttbr1 0x0e07406a  cidr 0x1
ERROR:   [0x0] TEE-CORE:  cpu #0          cpsr 0x20000030
ERROR:   [0x0] TEE-CORE:  r0 0x0012937f      r4 0x0013a0e4    r8 0x00000000   r12 0x0e07dd80
ERROR:   [0x0] TEE-CORE:  r1 0x00001b00      r5 0x00121e47    r9 0x00000000    sp 0x001026f4
ERROR:   [0x0] TEE-CORE:  r2 0x00131658      r6 0x001026f4   r10 0x00000000    lr 0x00121bc9
ERROR:   [0x0] TEE-CORE:  r3 0x00000002      r7 0x001026f4   r11 0x00000000    pc 0x00105bdc
ERROR:   [0x0] TEE-CORE: Status of TA 5b9e0e40-2636-11e1-ad9e-0002a5d5c51b (0xe073b70) (active)
ERROR:   [0x0] TEE-CORE:  arch: arm  load address: 0x103000  ctx-idr: 1
ERROR:   [0x0] TEE-CORE:  stack: 0x100000 10240
ERROR:   [0x0] TEE-CORE:  region 0: va 0x100000 pa 0xe31e000 size 0x3000 flags rw-
ERROR:   [0x0] TEE-CORE:  region 1: va 0x103000 pa 0xe200000 size 0x2e000 flags r-x
ERROR:   [0x0] TEE-CORE:  region 2: va 0x131000 pa 0xe22e000 size 0xa000 flags r--
ERROR:   [0x0] TEE-CORE:  region 3: va 0x13b000 pa 0xe238000 size 0xe6000 flags rw-
ERROR:   [0x0] TEE-CORE:  region 4: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 5: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 6: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 7: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE: Call stack:
ERROR:   [0x0] TEE-CORE:  0x00105bdc
ERROR:   [0x0] TEE-CORE:  0x00121bc9
ERROR:   [0x0] TEE-CORE:  0x00121d1b
ERROR:   [0x0] TEE-CORE:  0x00121e6f

The above dump was triggered by the TA trying to write to a global const variable. Such variables are stored in the .rodata section of the ELF binary, and loaded into a read-only section of memory.

OP-TEE provides a helper script called symbolize.py to facilitate the analysis of such issues. It is located in the OP-TEE OS source tree, and is also copied to the TA development kit. Whenever you are confronted with an error message reporting a serious error and showing a "Call stack:" line, you may use the symbolize script.

symbolize.py reads its input from stdin and writes extended debug information to stdout. The -d (directories) option tells the script where to look for TA ELF file(s) or for tee.elf (the TEE core). The -s (strip) option is used to shorten the source file paths. Please refer to symbolize.py --help for details.

$ cat dump.txt | ./optee_os/scripts/symbolize.py -d optee_test/out/ta/* -d optee_os/out/arm/core -s `pwd`
# (or run the script, copy and paste the dump, then press Ctrl+D)
ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 (write permission fault)
ERROR:   [0x0] TEE-CORE:  fsr 0x0000080f  ttbr0 0x0e07a06a  ttbr1 0x0e07406a  cidr 0x1
ERROR:   [0x0] TEE-CORE:  cpu #0          cpsr 0x20000030
ERROR:   [0x0] TEE-CORE:  r0 0x0012937f      r4 0x0013a0e4    r8 0x00000000   r12 0x0e07dd80
ERROR:   [0x0] TEE-CORE:  r1 0x00001b00      r5 0x00121e47    r9 0x00000000    sp 0x001026f4
ERROR:   [0x0] TEE-CORE:  r2 0x00131658      r6 0x001026f4   r10 0x00000000    lr 0x00121bc9
ERROR:   [0x0] TEE-CORE:  r3 0x00000002      r7 0x001026f4   r11 0x00000000    pc 0x00105bdc
ERROR:   [0x0] TEE-CORE: Status of TA 5b9e0e40-2636-11e1-ad9e-0002a5d5c51b (0xe073b70) (active)
ERROR:   [0x0] TEE-CORE:  arch: arm  load address: 0x103000  ctx-idr: 1
ERROR:   [0x0] TEE-CORE:  stack: 0x100000 10240
ERROR:   [0x0] TEE-CORE:  region 0: va 0x100000 pa 0xe31e000 size 0x3000 flags rw-
ERROR:   [0x0] TEE-CORE:  region 1: va 0x103000 pa 0xe200000 size 0x2e000 flags r-x .ta_head .text .rodata
ERROR:   [0x0] TEE-CORE:  region 2: va 0x131000 pa 0xe22e000 size 0xa000 flags r-- .rodata .ARM.extab .ARM.extab.text.__aeabi_ldivmod .ARM.extab.text.__aeabi_uldivmod .ARM.exidx .got .dynsym .rel.got .dynamic .dynstr .hash .rel.dyn
ERROR:   [0x0] TEE-CORE:  region 3: va 0x13b000 pa 0xe238000 size 0xe6000 flags rw- .data .bss
ERROR:   [0x0] TEE-CORE:  region 4: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 5: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 6: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE:  region 7: va 0 pa 0 size 0 flags ---
ERROR:   [0x0] TEE-CORE: User TA data-abort at address 0x131658 const_var .rodata+4852 (write permission fault)
ERROR:   [0x0] TEE-CORE: Call stack:
ERROR:   [0x0] TEE-CORE:  0x00105bdc TA_CreateEntryPoint at optee_test/ta/os_test/ta_entry.c:44
ERROR:   [0x0] TEE-CORE:  0x00121bc9 ta_header_add_session at optee_os/lib/libutee/arch/arm/user_ta_entry.c:115
ERROR:   [0x0] TEE-CORE:  0x00121d1b entry_open_session at optee_os/lib/libutee/arch/arm/user_ta_entry.c:159
ERROR:   [0x0] TEE-CORE:  0x00121e6f __utee_entry at optee_os/lib/libutee/arch/arm/user_ta_entry.c:229

The Python script uses several tools from the GNU Binutils package to perform the following tasks:

  1. Translate the call stack addresses into function names, file names and line numbers;
  2. Convert the abort address to a symbol plus some offset (const_var in the example) and/or an ELF section name plus some offset (.rodata+4852 in the example);
  3. Print the names of the ELF sections contained in each memory region of a TA.