################################
Profiler tool and TF-M Profiling
################################

The profiler is a tool for profiling and benchmarking programs. Developers can
use it to collect runtime data of interest.

Initially, the profiler supports only count logging. You can add checkpoints
to the program. The timer count or CPU cycle count at each checkpoint is saved
at runtime and can be analysed later.

*********************************
TF-M Profiling Build Instructions
*********************************

TF-M has integrated some built-in profiling cases. There are two configurations
for profiling:

* ``CONFIG_TFM_ENABLE_PROFILING``: Enable the profiling build in TF-M SPE and NSPE.
  It cannot be enabled together with any regression test configs, for example ``TEST_NS``.
* ``TFM_TOOLS_PATH``: Path of the tf-m-tools repo. The default value is ``DOWNLOAD``,
  which fetches the remote source.

The section `TF-M Profiling Cases`_ introduces the profiling cases in TF-M.
To enable the built-in profiling cases in TF-M, run:

.. code-block:: console

    cd <path to tf-m-tools>/profiling/profiling_cases/tfm_profiling
    mkdir build

    # Build SPE
    # The partition path list is quoted so that the shell does not treat ';'
    # as a command separator.
    cmake -S <path to tf-m> -B build/spe -DTFM_PLATFORM=arm/mps2/an521 \
          -DCONFIG_TFM_ENABLE_PROFILING=ON -DCMAKE_BUILD_TYPE=Release \
          -DTFM_EXTRA_PARTITION_PATHS="${PWD}/../prof_psa_client_api/partitions/prof_server_partition;${PWD}/../prof_psa_client_api/partitions/prof_client_partition" \
          -DTFM_EXTRA_MANIFEST_LIST_FILES=${PWD}/../prof_psa_client_api/partitions/prof_psa_client_api_manifest_list.yaml \
          -DTFM_PARTITION_LOG_LEVEL=TFM_PARTITION_LOG_LEVEL_INFO

    # Another simple way to configure SPE:
    cmake -S <path to tf-m> -B build/spe -DTFM_PLATFORM=arm/mps2/an521 \
          -DTFM_EXTRA_CONFIG_PATH=${PWD}/../prof_psa_client_api/partitions/config_spe.cmake
    cmake --build build/spe -- install -j

    # Build NSPE
    cmake -S . -B build/nspe -DCONFIG_SPE_PATH=${PWD}/build/spe/api_ns \
          -DTFM_TOOLCHAIN_FILE=build/spe/api_ns/cmake/toolchain_ns_GNUARM.cmake
    cmake --build build/nspe -- -j

.. Note::

    The TF-M profiling implementation relies on the physical CPU cycles provided by a
    hardware timer (refer to `Implement the HAL`_). It may not be supported on virtual
    platforms or emulators.

******************************
Profiler Integration Reference
******************************

`profiler/profiler.c` is the main source file to be compiled with the target program.

Initialization
==============

``PROFILING_INIT()`` defined in `profiling/export/prof_intf_s.h` shall be called
on the secure side before calling any other API of the profiler. It initializes the
HAL and the backend database which can be customized by users.

Implement the HAL
-----------------

`export/prof_hal.h` defines the HAL that should be implemented by the platform.

* ``prof_hal_init()``: Initialize the counter hardware.

* ``prof_hal_get_count()``: Get the current counter value.

Users shall implement platform-specific hardware support in ``prof_hal_init()``
and ``prof_hal_get_count()`` under `export/platform`.

For example, `export/platform/tfm_hal_dwt_prof.c` uses the Data Watchpoint
and Trace unit (DWT) to count CPU cycles, which can serve as a reference for
performance.

Setup Database
--------------

The size of the database is determined by ``PROF_DB_MAX`` defined in
`export/prof_common.h`.

The developer can override the size by redefining ``PROF_DB_MAX``.

Add Checkpoints
===============

The developer should identify the places in the source code for adding the
checkpoints. The count value of the timer or CPU cycle will be saved into the
database for the checkpoints. The interface APIs are defined in
`export/prof_intf_s.h` for the secure side.

Checkpoints can also be added on the non-secure side.
Add `export/ns/prof_intf_ns.c` to the source file list of the non-secure side.
The interface APIs for the non-secure side are defined in `export/ns/prof_intf_ns.h`.

The counter logging related APIs are defined in macros to keep the interface
consistent between the secure and non-secure sides.

Users can call the macro ``PROF_TIMING_LOG()`` to log the counter value.

.. code-block:: c

    PROF_TIMING_LOG(topic_id, cp_id);

+------------+--------------------------------------------------------------+
| Parameters | Description                                                  |
+============+==============================================================+
| topic_id   | Topic is used to gather a group of checkpoints.              |
|            | It's useful when you have many checkpoints for different     |
|            | purposes. Topic can help to organize them and filter the     |
|            | related information out. It's an 8-bit unsigned value.       |
+------------+--------------------------------------------------------------+
| cp_id      | Checkpoint ID. Different topics can have the same cp_id.     |
|            | It's a 16-bit unsigned value.                                |
+------------+--------------------------------------------------------------+

Collect Data
============

After the program has run successfully, the data should have been saved into the
database. The developer can dump the data through the interfaces defined in the
header files mentioned above.

For the same consistency reason as counter logging, the same macros are defined as
the interfaces for both secure and non-secure sides.

The data fetching interfaces work as a stream. ``PROF_FETCH_DATA_START`` and
``PROF_FETCH_DATA_BY_TOPIC_START`` search for the data that matches the given pattern
from the beginning of the database. ``PROF_FETCH_DATA_CONTINUE`` and
``PROF_FETCH_DATA_BY_TOPIC_CONTINUE`` search from the data set that follows the
previous result.

.. Note::

    All the APIs advance the internal search index, so be careful when mixing them
    for different checkpoints and topics at the same time.

The match condition of a search is controlled by the tag mask. A record matches
when ``(tag_value & tag_mask) == tag_pattern``. To enumerate the whole database, set
``tag_mask`` and ``tag_pattern`` both to ``0``.

* ``PROF_FETCH_DATA_XXX``: The generic interface for getting data.
* ``PROF_FETCH_DATA_BY_TOPIC_XXX``: Get data for a specific ``topic``.

The APIs return ``false`` if no matching data is found before the end of the database.
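
The match rule and the stream-style search can be sketched in plain C. This is
only an illustration and not the profiler's code: the tag layout (topic in bits
16 to 23, checkpoint ID in bits 0 to 15), the ``DEMO_``/``demo_`` names and the
plain array used as a database are all assumptions made for this example.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical tag layout, for illustration only:
     * bits 23..16 hold the 8-bit topic_id, bits 15..0 hold the 16-bit cp_id. */
    #define DEMO_MAKE_TAG(topic, cp) \
        ((((uint32_t)(topic) & 0xFFu) << 16) | ((uint32_t)(cp) & 0xFFFFu))
    #define DEMO_TOPIC_MASK 0x00FF0000u

    /* A record matches when (tag & tag_mask) == tag_pattern. */
    static bool demo_tag_matches(uint32_t tag, uint32_t tag_mask,
                                 uint32_t tag_pattern)
    {
        return (tag & tag_mask) == tag_pattern;
    }

    /* Scan tags[start..count) and return the index of the first match, or
     * count if nothing matches. Passing "previous result + 1" as start
     * continues the search after the previous hit. */
    static size_t demo_find_next(const uint32_t *tags, size_t count, size_t start,
                                 uint32_t tag_mask, uint32_t tag_pattern)
    {
        for (size_t i = start; i < count; i++) {
            if (demo_tag_matches(tags[i], tag_mask, tag_pattern)) {
                return i;
            }
        }
        return count;
    }

With ``tag_mask`` and ``tag_pattern`` both ``0``, ``(tag & 0) == 0`` holds for
every tag, which is why that setting enumerates the whole database.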

Calibration
===========

The profiler itself costs ticks or cycles. To get more accurate data, an
optional calibration system is provided.

The counter logging APIs can be called from the secure or non-secure side, and the
cost of calling functions from these two worlds is different. So, secure and
non-secure have different calibration data.

The system performance might fluctuate during initialization, for example when
the CPU frequency changes or the cache is enabled. So, it's recommended that the
calibration is done just before the first checkpoint.

* ``PROF_DO_CALIBRATE``: Call this macro to get the calibration value. The more ``rounds``,
  the more accurate the value.
* ``PROF_GET_CALI_VALUE_FROM_TAG``: Get the calibration value from the tag.
  The calibrated counter is ``current_counter - previous_counter - current_cali_value``.
  Here ``current_cali_value`` equals ``PROF_GET_CALI_VALUE_FROM_TAG(current_tag)``.
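
The calibrated counter formula above can be written out as a small helper. This
is a minimal sketch, not the profiler's code: ``demo_calibrated_diff`` is a
hypothetical name, the 32-bit counter width is an assumption, and clamping the
result to ``0`` when the calibration value exceeds the raw difference is a
choice made for this example only.

.. code-block:: c

    #include <stdint.h>

    /* Calibrated cost between two checkpoints:
     *   current_counter - previous_counter - current_cali_value
     * Unsigned subtraction keeps the raw difference correct across a single
     * wrap-around of an up-counting 32-bit counter. */
    static uint32_t demo_calibrated_diff(uint32_t previous_counter,
                                         uint32_t current_counter,
                                         uint32_t current_cali_value)
    {
        uint32_t raw = current_counter - previous_counter;

        /* Assumption for this sketch: never report a negative cost. */
        return (raw > current_cali_value) ? (raw - current_cali_value) : 0u;
    }

For example, with a previous count of 100, a current count of 250 and a
calibration value of 20, the calibrated cost is ``250 - 100 - 20 = 130``.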

Data Analysis
=============

The data analysis interfaces can be used for basic analysis. The data they
return is already calibrated.

``PROF_DATA_DIFF``: Get the counter value difference for the two tags. Returning
``0`` indicates errors.

If the checkpoints are logged multiple times, you can get the following counter
value differences between two tags:

* ``PROF_DATA_DIFF_MIN``: Get the minimum counter value difference for the two tags.
  Returning ``UINT32_MAX`` indicates errors.
* ``PROF_DATA_DIFF_MAX``: Get the maximum counter value difference for the two tags.
  Returning ``0`` indicates errors.
* ``PROF_DATA_DIFF_AVG``: Get the average counter value difference for the two tags.
  Returning ``0`` indicates errors.
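
To make the three statistics and their error values concrete, the sketch below
computes them over an array of already calibrated differences. It does not
reproduce how the profiler computes them: ``demo_diff_stats_t`` and
``demo_diff_stats`` are hypothetical names, and treating an empty input as the
error case is an assumption of this example.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint32_t min; /* UINT32_MAX when there is no data */
        uint32_t max; /* 0 when there is no data */
        uint32_t avg; /* 0 when there is no data */
    } demo_diff_stats_t;

    /* Compute min, max and average over count calibrated differences. */
    static demo_diff_stats_t demo_diff_stats(const uint32_t *diffs, size_t count)
    {
        /* Start from the error values so an empty input returns them as-is. */
        demo_diff_stats_t stats = { UINT32_MAX, 0u, 0u };
        uint64_t sum = 0; /* 64-bit accumulator avoids overflow of the sum */

        for (size_t i = 0; i < count; i++) {
            if (diffs[i] < stats.min) {
                stats.min = diffs[i];
            }
            if (diffs[i] > stats.max) {
                stats.max = diffs[i];
            }
            sum += diffs[i];
        }
        if (count > 0) {
            stats.avg = (uint32_t)(sum / count);
        }
        return stats;
    }

Because ``0`` is both a possible result and an error value for the maximum and
the average, callers of such an interface may want to check that matching
checkpoints exist before relying on a ``0`` result.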

Customized software or tools can be used to generate an analysis report based
on the data.

Profiler Self-test
==================

`profiler_self_test` is a quick test for all the interfaces above. To build and run
it on Linux:

.. code-block:: console

    cd profiler_self_test
    mkdir build && cd build
    cmake .. && make
    ./prof_self_test

********************
TF-M Profiling Cases
********************

The profiler tool has already been integrated into TF-M to analyze the program
performance with the built-in profiling cases. Users can also add a new
profiling case to get a specific profiling report. TF-M profiling provides
example profiling cases in `profiling_cases`.

PSA Client API Profiling
========================

This profiling case analyzes the performance of PSA Client APIs called from SPE
and NSPE, including ``psa_connect()``, ``psa_call()``, ``psa_close()`` and
stateless ``psa_call()``. The main structure is:

::

    prof_psa_client_api/
    ├── cases
    │   ├── non_secure
    │   └── secure
    └── partitions
        ├── prof_server_partition
        └── prof_client_partition

* The `cases` folder contains the basic SPE and NSPE profiling log and analysis code.
* NSPE can use the `prof_log` library to print the analysis result.
* `prof_server_partition` is a dummy secure partition. It returns immediately
  once it receives a PSA client call from a client.
* `prof_client_partition` is the SPE profiling entry to trigger the secure profiling.

To make this profiling report more accurate, it is recommended to disable other
partitions and all irrelevant tests.

Adding New TF-M Profiling Case
==============================

Users can add a source folder `<prof_example>` under the path `profiling_cases` to
customize performance analysis of target processes, such as the APIs of secure
partitions, the functions in the SPM, or the user's interfaces. The
integration requires these steps:

1. Confirm the target process block to create profiling cases.
2. Enable or create the server partition if necessary. Note that the other
   irrelevant partitions shall be disabled.
3. Find ways to output profiling data.
4. Trigger profiling cases in SPE or NSPE.

   a. For SPE, a secure client partition can be created to trigger the secure profiling.
   b. For NSPE, the profiling case entry can be added to the 'tfm_ns' target under the `tfm_profiling` folder.

.. Note::

    If the profiling case requires an extra out-of-tree secure partition build, the
    paths of the extra partitions and the manifest list file shall be appended to
    ``TFM_EXTRA_PARTITION_PATHS`` and ``TFM_EXTRA_MANIFEST_LIST_FILES``. Refer to
    :doc:`Adding Secure Partition<TF-M:integration_guide/services/tfm_secure_partition_addition>`.

--------------

*Copyright (c) 2022-2023, Arm Limited. All rights reserved.*